mirror of
https://github.com/kidwellj/trs_admissions_survey2021.git
synced 2025-01-09 21:22:19 +00:00
# L11-13 Admissions Preferences Survey 2021
R code and data for the analysis of a survey deployed in 2021 to L11-13 pupils in the UK regarding their preferences for university study.

## Why Reproducible Research?
If you're new to GitHub and reproducible research, welcome! It's nice to have you here. GitHub is ordinarily a place where software developers working on open source projects deposit their code as they write software collaboratively. However, in recent years a number of scholarly researchers, especially people working on research that involves a digital component (including me!), have begun to deposit their papers in these same software repositories. The idea is that you can download all of the source code and data used in this paper alongside the actual text, run it yourself and ["reproduce" the results](http://kbroman.org/steps2rr/). This can serve as a useful safeguard, a layer of research transparency, and a cool teaching tool for others interested in doing similar work. In subject areas that are only just starting to get involved in the digital humanities, like religious studies, there is a dearth of work of this nature, so it can be particularly helpful to have examples of practice which can be reused, or at least consulted.

Eschewing proprietary, expensive and unreliable software like Microsoft Word, I write in a combination of two languages: (1) [Markdown](https://en.wikipedia.org/wiki/Markdown), which is intended to be as close as possible to plain text while still allowing for things like boldfaced type, headings and footnotes; and (2) a programming language called [R](https://en.wikipedia.org/wiki/R_(programming_language)) to do all the data analysis. R is an object oriented language that was specifically designed for statistical analysis. It's also great fun to tinker with. As you look through this paper, you'll see that R code is integrated into the text of the document. Each block of code is marked off by lines of three backticks. The formal specification for this, R Markdown, is now at a mature stage of development. You can read a semi-official specification [here](https://bookdown.org/yihui/rmarkdown/pdf-document.html).

To read a bit more on these things and start on your own path towards plain text reproducible research, I highly recommend:
- Karl Broman's guide, "[Initial Steps Toward Reproducible Research](http://kbroman.org/steps2rr/)"
- Kieran Healy's guide, "[The Plain Person’s Guide to Plain Text Social Science](http://kieranhealy.org/files/papers/plain-person-text.pdf)"

The other advantage of putting this paper here is that readers and reviewers can suggest changes and point out errors in the document. To do this, I recommend that you create a GitHub issue by clicking on the green "New issue" button [here](https://github.com/kidwellj/trs_admissions_survey2021/issues). If you must, you can also send me an email. More about the project lead, Jeremy, [can be found here](http://jeremykidwell.info).

Now for...

## The technical version
The code and the paper here are written in R Markdown, for the most part using the conventions outlined by Kieran Healy [here](https://kieranhealy.org/blog/archives/2014/01/23/plain-text/). They are best viewed (I think) in [RStudio](https://www.rstudio.com), though they will be reasonably comprehensible to anyone using a Markdown editor. If I'm not working in RStudio, I'm probably in Sublime Text, FYI. Co-authors and collaborators, take note: I generally follow [Hadley Wickham's venerable R Style Guide](http://adv-r.had.co.nz/Style.html).

I'd be extremely happy if someone found errors, or imagined a more efficient means of analysis, and either reported them as an issue on this GitHub repository or sent me an email.

Paths in this folder are used mostly for R processing. I'm using a "project"-oriented workflow, which you can read more about [in a blog post by Jenny Bryan here](https://www.tidyverse.org/blog/2017/12/workflow-vs-script/). This uses the R package [here](https://cran.r-project.org/web/packages/here/index.html).
Towards this end, folders have the following significance:

- `data` contains datasets used for analysis.
- `derived_data` contains modified forms of the files in `data`.
- `figures` contains images and visualisations (graphic files) which are generated by R for the final form of the document.
- `cache` isn't included in GitHub, but is usually used for working files.

Note: the contents of the folders above are included in the GitHub repository only if they are unavailable from an external repository.
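As a minimal sketch of how paths might be built against this folder layout: the helper `project_path()` and the three file names below are hypothetical and appear nowhere in this repository; only the folder names and the `here` package come from the description above. The sketch falls back to the working directory when `here` is not installed.

```r
# Resolve a path relative to the project root. here::here() locates the root
# by looking for markers such as an .Rproj file or a .git directory; if the
# package is not installed, this sketch falls back to the working directory.
# NOTE: project_path() is a hypothetical helper, not part of this codebase.
project_path <- function(...) {
  if (requireNamespace("here", quietly = TRUE)) {
    here::here(...)
  } else {
    file.path(getwd(), ...)
  }
}

# Folder names come from the list above; the file names are hypothetical.
raw_path     <- project_path("data", "survey_responses.csv")
derived_path <- project_path("derived_data", "survey_clean.rds")
figure_path  <- project_path("figures", "preferences.png")
```

Building every path from the project root in this way means a script should behave the same regardless of which directory R happens to be started in, which is the point of the project-oriented workflow.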

# Prerequisites for reproducing this codebase
We've tried to follow best practices in setting up this script for reproducibility, but some setup is required before it will execute successfully.

These steps are:

1. Acquire a working installation of R (and RStudio). I have produced a Docker container that replicates the environment I used to execute this script; using it is probably the easiest way to complete this step.
2. Install platform appropriate prerequisites for ...
3. Clone or download the code from this repository
4. Set up a proper R/RStudio working environment. I use the `renv` package to manage the working environment; it takes snapshots and stores them in `renv.lock`. If you run `renv::restore()` in R after loading this code, it will install the necessary libraries at the proper versions.
5. Nearly all of the data used in this study is open, with one exception: the Ordnance Survey PointX data product. This is available to most UK academics via the EDINA service, so you will need to download this data manually and place it in the `/data/` directory.
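Step 4 above might look something like the following in practice. This is only a sketch: `restore_environment()` is a hypothetical wrapper, not a function shipped with this repository, and the CRAN mirror URL is an assumption. It assumes the repository has already been cloned and that `renv.lock` sits in its root.

```r
# Hypothetical helper: restore the package library recorded in renv.lock.
restore_environment <- function(project = ".") {
  lockfile <- file.path(project, "renv.lock")

  # Fail early with a clear message if we are not in the repository root.
  if (!file.exists(lockfile)) {
    stop("No renv.lock found in ", normalizePath(project, mustWork = FALSE))
  }

  # Install renv itself if it is missing (mirror URL is an assumption).
  if (!requireNamespace("renv", quietly = TRUE)) {
    install.packages("renv", repos = "https://cloud.r-project.org")
  }

  # Install the package versions recorded in the lockfile without prompting.
  renv::restore(project = project, prompt = FALSE)
}

# Usage, from the repository root:
# restore_environment()
```

Checking for the lockfile first means a call made from the wrong directory stops with an error rather than attempting a restore against the wrong project.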

# Contributing
Please note that this project is released with a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.

# License
The content of any research papers in this repository is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International Public License](https://creativecommons.org/licenses/by-sa/4.0/legalcode), and the underlying source code used to generate the paper is licensed under the [GNU AGPLv3](https://www.gnu.org/licenses/agpl-3.0.en.html) license. Underlying datasets designed as part of this research have their own licenses, which are specified in their respective repositories.