updating readme

Jeremy Kidwell 2021-09-21 09:01:16 +01:00
parent 4e95216aa1
commit 7e5cc61275
2 changed files with 78 additions and 1 deletions

CODE_OF_CONDUCT.md Normal file

@ -0,0 +1,25 @@
# Contributor Code of Conduct
As contributors and maintainers of this project, we pledge to respect all people who
contribute through reporting issues, posting feature requests, updating documentation,
submitting pull requests or patches, and other activities.
We are committed to making participation in this project a harassment-free experience for
everyone, regardless of level of experience, gender, gender identity and expression,
sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.
Examples of unacceptable behavior by participants include the use of sexual language or
imagery, derogatory comments or personal attacks, trolling, public or private harassment,
insults, or other unprofessional conduct.
Project maintainers have the right and responsibility to remove, edit, or reject comments,
commits, code, wiki edits, issues, and other contributions that are not aligned to this
Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed
from the project team.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by
opening an issue or contacting one or more of the project maintainers.
This Code of Conduct is adapted from the Contributor Covenant
(https://www.contributor-covenant.org), version 1.0.0, available at
https://contributor-covenant.org/version/1/0/0/.


@ -1,2 +1,54 @@
# trs_admissions_survey2021
# L11-13 Admissions Preferences Survey 2021
R code and data relating to the analysis of a survey deployed in 2021 to L11-13 pupils in the UK regarding their preferences for university study.
## Why Reproducible Research?
If you're new to GitHub and reproducible research, welcome! It's nice to have you here. GitHub is ordinarily a place where software developers working on open source projects deposit their code as they write software collaboratively. However, in recent years a number of scholarly researchers, especially people whose research involves a digital component (including me!), have begun to deposit their papers in these same software repositories. The idea is that you can download all of the source code and data used in this paper alongside the actual text, run it yourself and ["reproduce" the results](http://kbroman.org/steps2rr/). This can serve as a useful safeguard, a layer of research transparency, and a cool teaching tool for others interested in doing similar work. It matters particularly in subject areas that are only just starting to get involved in the digital humanities, like religious studies, where there is a dearth of work of this nature: it helps to have examples of practice which can be reused, or at least consulted as a model.
Eschewing proprietary, expensive and unreliable software like Microsoft Word, I write in a combination of two languages: (1) [Markdown](https://en.wikipedia.org/wiki/Markdown), which is intended to be as close as possible to plain text while still allowing for things like boldfaced type, headings and footnotes; and (2) a programming language called [R](https://en.wikipedia.org/wiki/R_(programming_language)) to do all the data analysis. R is an object-oriented language that was specifically designed for statistical analysis. It's also great fun to tinker with. As you look through this paper, you'll see that R code is integrated into the text of the document. Each block of code is set off by a line of three backticks (```) before and after it. This combination has a name, R Markdown, and is now at a mature stage of development. You can read a semi-official guide to it [here](https://bookdown.org/yihui/rmarkdown/pdf-document.html).
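To make the backtick convention concrete, here is a minimal sketch that assembles a toy R Markdown file from plain text. It is not taken from the paper itself: the chunk label, file name and numbers are hypothetical, and the fence is built with `strrep()` only so that the example can sit inside a code block of its own.

```r
# Minimal sketch: assemble a tiny R Markdown document as plain text.
# The chunk label, file name and toy data are hypothetical.
fence <- strrep("`", 3)  # three backticks open and close a code chunk

rmd_lines <- c(
  "# A toy analysis",
  "",
  "Ordinary Markdown prose sits outside the fences.",
  "",
  paste0(fence, "{r toy-summary}"),  # opening fence, with the language and a label
  "responses <- c(3, 5, 4, 2, 5)",
  "mean(responses)",
  fence,                             # closing fence
  "",
  "Prose resumes after the chunk."
)

# Write the document to a temporary file so nothing in the project is touched
rmd_path <- file.path(tempdir(), "toy_example.Rmd")
writeLines(rmd_lines, rmd_path)

# If the rmarkdown package and pandoc are installed, the file could then be
# rendered with: rmarkdown::render(rmd_path)
```

Opening the resulting file in RStudio or any text editor shows the same pattern used throughout the paper: prose, then a fenced chunk of R, then prose again.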
To read a bit more on these things and start on your own path towards plain text reproducible research, I highly recommend:
- Karl Broman's guide, "[Initial Steps Toward Reproducible Research](http://kbroman.org/steps2rr/)"
- Kieran Healy's guide, "[The Plain Person's Guide to Plain Text Social Science](http://kieranhealy.org/files/papers/plain-person-text.pdf)"
The other advantage of putting this paper here is that readers and reviewers can suggest changes and point out errors in the document. To do this, I recommend that you create a GitHub issue by clicking on the green "New issue" button [here](https://github.com/kidwellj/trs_admissions_survey2021/issues). If you must, you can also send me emails. More about the project lead, Jeremy, [can be found here](http://jeremykidwell.info).
Now for...
## The technical version
The code and the paper here are written in R Markdown, for the most part using the conventions outlined by Kieran Healy [here](https://kieranhealy.org/blog/archives/2014/01/23/plain-text/). The files are best viewed (I think) in [RStudio](https://www.rstudio.com), though they will be reasonably comprehensible to anyone using a Markdown editor. If I'm not working in RStudio, I'm probably in Sublime Text, FYI. Co-authors and collaborators take note: generally, I use [Hadley Wickham's venerable R Style Guide](http://adv-r.had.co.nz/Style.html).
I'd be extremely happy if someone found errors, or imagined a more efficient means of analysis, and either reported them as an issue on this GitHub repository or sent me an email.
Paths in this folder are used mostly for R processing. I'm using a "project"-oriented workflow, which you can read more about [in a blog post by Jenny Bryan here](https://www.tidyverse.org/blog/2017/12/workflow-vs-script/). This uses the R package [`here`](https://cran.r-project.org/web/packages/here/index.html).
Towards this end folders have the following significance:
- `data` contains datasets used for analysis.
- `derived_data` contains files which represent modified forms of files in the above path.
- `figures` contains images and visualisations (graphic files) which are generated by R for the final form of the document.
- `cache` isn't included in the GitHub repository but is usually used for working files.
Note: none of the contents of the above folders are included in the GitHub repository unless they are unavailable from an external repository.
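The folder layout above can be sketched as project-relative paths. This is an illustration only: the file names are hypothetical placeholders rather than files from this repository, and the fallback to `file.path()` is there just so the sketch runs when the `here` package isn't installed.

```r
# Sketch of project-relative paths for the folders listed above.
# File names are hypothetical placeholders, not files from this repository.
if (requireNamespace("here", quietly = TRUE)) {
  project_path <- here::here  # builds paths from the project root
} else {
  # Fallback for illustration only: treat the working directory as the root
  project_path <- function(...) file.path(getwd(), ...)
}

raw_file     <- project_path("data", "survey_responses.csv")
derived_file <- project_path("derived_data", "survey_responses_clean.csv")
figure_file  <- project_path("figures", "preferences_by_group.png")

# Report which of the expected folders still need to be created locally,
# since most of their contents are not shipped with the repository
folders <- c("data", "derived_data", "figures", "cache")
missing_folders <- folders[!dir.exists(vapply(folders, project_path, character(1)))]
if (length(missing_folders) > 0) {
  message("Folders to create before running the analysis: ",
          paste(missing_folders, collapse = ", "))
}
```

Because every path starts from the project root, the same script should work whether it is run from RStudio, from the command line, or from inside a rendered document.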
# Prerequisites for reproducing this codebase
We've tried to follow best practices in setting up this script for reproducibility, but some setup is required before execution will be successful.
These steps are:
1. Acquire a working installation of R (and RStudio). I have produced a Docker container that replicates the environment I used to execute this script; using it is probably the easiest way to complete this step.
2. Install platform appropriate prerequisites for ...
3. Clone or download the code from this repository
4. Set up a proper R/RStudio working environment. I use the `renv` package to manage the working environment; it takes snapshots of the installed packages and stores them in `renv.lock`. If you run `renv::restore()` in R after loading this code, it will install the necessary libraries at the recorded versions.
5. Nearly all of the data used in this study is open. The one exception is the Ordnance Survey PointX data product, which is available to most UK academics via the EDINA service, so you will need to download it manually and place it in the `/data/` directory.
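Step 4 can be sketched in R as follows. This is a cautious illustration rather than a script shipped with the repository: it assumes R was started in the root of the cloned project, where `renv.lock` is expected to sit, and it does nothing beyond printing a hint if that file or the `renv` package is missing.

```r
# Sketch of step 4, assuming R was started in the root of the cloned repository.
lockfile <- "renv.lock"  # snapshot file named in step 4

if (!file.exists(lockfile)) {
  message("No renv.lock found in ", getwd(), "; check the working directory.")
} else if (!requireNamespace("renv", quietly = TRUE)) {
  message("Install renv first with install.packages(\"renv\").")
} else {
  # Install the package versions recorded in the lockfile without prompting
  renv::restore(lockfile = lockfile, prompt = FALSE)
}
```

Checking for the lockfile first avoids the common mistake of running `renv::restore()` from the wrong directory.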
# Contributing
Please note that this project is released with a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.
# License
The content of any research papers in this repository is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International Public License](https://creativecommons.org/licenses/by-sa/4.0/legalcode), and the underlying source code used to generate the paper is licensed under the [GNU AGPLv3](https://www.gnu.org/licenses/agpl-3.0.en.html) license. Underlying datasets designed as part of this research have their own licenses, which are specified in their respective repositories.