---
title: "RMarkdown Admissions_Survey2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)
```
# Upload Data
```{r Upload Data}
TSR_data <- read.csv("./data/TSR data complete.csv")
subject_data <- read.csv("./data/Subject data.csv")
```
# Basic summary visualisations (RH):
- Q2 (respondent age)
```{r respondent age}
TSR_data$Age <- factor(TSR_data$Age, levels = c(1, 2, 3, 4, 5, 6, 7, 8), labels = c("15 or under", "16", "17", "18", "19", "20", "21 or over", "Prefer not to say"))
# pie() draws the chart and returns NULL, so the result is not assigned
pie(table(TSR_data$Age))
```
- Q3 (year of study)
```{r year of study}
TSR_data$MOST.RECENT.year.of.study <- factor(TSR_data$MOST.RECENT.year.of.study, levels = c(1, 2, 3, 4, 5, 6, 7, 8, 9), labels = c("Year 11/S4/Year 12(NI)", "Year 12/S5/Year 13(NI)", "Year 13/S6/Year 14(NI)", "I am currently on a gap year", "I am currently on an undergraduate/HE college course", "I am in full-time employment", "I am unemployed", "Other", "Prefer not to say"))
pie(table(TSR_data$MOST.RECENT.year.of.study))
```
- Q16 (gender identity)
```{r gender identity}
TSR_data$Gender <- factor(TSR_data$Gender, levels = c(1, 2, 3, 4), labels = c("Male", "Female", "I identify my gender in another way", "Prefer not to say"))
pie(table(TSR_data$Gender))
```
- Q17 (ethnic self-id)
```{r ethnic self-id}
TSR_data$Ethnicity <- factor(TSR_data$Ethnicity, levels = 1:19, labels = c("Arab", "Asian/Asian British - Indian", "Asian/Asian British - Pakistani", "Asian/Asian British - Bangladeshi", "Asian/Asian British - Chinese", "Asian/Asian British - Any other Asian background", "Black/Black British - African", "Black/Black British - Caribbean", "Black/Black British - Any other Black background", "Mixed/Multiple Ethnic Groups - White and Black Caribbean", "Mixed/Multiple Ethnic Groups - White and Black African", "Mixed/Multiple Ethnic Groups - White and Black Asian", "Mixed/Multiple Ethnic Groups - Any other Mixed/Multiple Ethnic background", "White - English/Welsh/Scottish/Northern Irish/British", "White - Irish", "White - Gypsy or Irish Traveller", "White - Any other White background", "Other Ethnic group, please describe", "Prefer not to say"))
Ethnicity_bar <- ggplot(TSR_data, aes(Ethnicity)) + geom_bar() + coord_flip()
Ethnicity_bar
```
- Q18 (religion)
```{r religion}
TSR_data$Religious.Affliation <- factor(TSR_data$Religious.Affliation, levels = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19), labels = c("Agnostic", "Atheist", "Baha'i", "Buddhist", "Christian", "Confucian", "Jain", "Jewish", "Hindu", "Indigenous Traditional Religious", "Muslim", "Pagan", "Shinto", "Sikh", "Spiritual but not religious", "Zoroastrian", "No religion", "Prefer not to say", "Other"))
Religious_affiliation_bar <- ggplot(TSR_data, aes(Religious.Affliation)) + geom_bar() + coord_flip()
Religious_affiliation_bar
```
# Visualisations of LIKERT responses (RH):
- For questions Q6 (subject interest) / Q5 (subject knowledge) / Q7 (employability prospects):
- visualise the LIKERT data for all subjects as a summary stacked bar chart (bar-segment colours running from cool to warm)
```{r Visualization by Subject}
### Each subject is a separate column in the raw data, so the columns need to be combined before they can be plotted in one graph
# Higher scores indicate less agreement, so the scale needs reverse scoring
## 1=5, 2=4, 3=3, 4=2, 5=1, 6=0 --- Not done yet. See what the plots look like without reverse scoring first
# As the code stands, the plots below show mean understanding, interest, and perceived employability for each subject across the whole respondent cohort
#Q5 Subject Knowledge/Understanding
subject_data$Subject <- factor(subject_data$Subject, levels = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), labels = c("Philosophy", "Sociology", "Psychology", "History", "Ethics", "Theology", "Religious Studies", "Politics", "English", "Math", "Computer Science", "Business"))
understanding_mean <- aggregate(Understanding ~ Subject, data = subject_data, mean)
#understanding_bar <- ggplot(subject_data, aes(x = Subject, y = aggregate(Understanding, by = Subject, FUN = mean))) + geom_bar() + labs(x = "Subject") + labs(y = "Understanding")
#understanding_bar
# The means are already aggregated, so geom_col() plots them directly
understanding_bar <- ggplot(understanding_mean, aes(x = Subject, y = Understanding)) + geom_col() + coord_flip()
understanding_bar
#Q6 Subject Interest
interest_mean <- aggregate(Interest ~ Subject, data = subject_data, mean)
interest_bar <- ggplot(interest_mean, aes(x = Subject, y = Interest)) + stat_summary(fun = "mean", geom = "bar") + coord_flip()
interest_bar
#Q7 Employability Prospects
employability_mean <- aggregate(Employability ~ Subject, data = subject_data, mean)
employability_bar <- ggplot(employability_mean, aes(x = Subject, y = Employability)) + stat_summary(fun = "mean", geom = "bar") + coord_flip()
employability_bar
```
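The chunk above plots per-subject means, while the notes call for reverse scoring and a stacked bar chart. A minimal sketch of both steps in base R follows. The `toy` data frame and its values are invented stand-ins for `subject_data`, and the `"Blue-Red"` palette is only one possible cool-to-warm choice.

```r
# Toy long-format data standing in for subject_data (values are invented).
toy <- data.frame(
  Subject = rep(c("Philosophy", "Theology"), each = 6),
  Interest = c(1, 2, 2, 3, 5, 6, 1, 1, 4, 4, 5, 6)
)

# Reverse-score with the mapping from the chunk comments: 1=5, 2=4, 3=3, 4=2, 5=1, 6=0.
reverse_map <- c(5, 4, 3, 2, 1, 0)
toy$Interest_rev <- reverse_map[toy$Interest]

# Proportion of each reversed score within each subject (each column sums to 1).
counts <- table(factor(toy$Interest_rev, levels = 0:5), toy$Subject)
props <- prop.table(counts, margin = 2)

# Stacked horizontal bars, one per subject, coloured from cool (low) to warm (high).
barplot(props, horiz = TRUE, col = hcl.colors(6, "Blue-Red"),
        legend.text = rownames(props), xlab = "Proportion of responses")
```

Plotting proportions rather than raw counts keeps subjects comparable when they have different numbers of respondents.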
- separate visualisation of summary data as pie chart only for 4 key subjects: Philosophy, Ethics, Theology, Religious Studies, but with data represented as aggregated "Positive" / "Negative" responses
```{r Visualization for 4 Key Subjects}
## Subset to the 4 key subjects, then recode responses as "Positive" / "Negative"
keysubjects_data <- subject_data[subject_data$Subject %in% c("Philosophy", "Ethics", "Theology", "Religious Studies"), ]
# Drop the 8 unused subject levels so they do not appear as empty columns in the table
keysubjects_data$Subject <- droplevels(keysubjects_data$Subject)
# Assumption: scores 1-3 count as "Positive" and 4-6 as "Negative" (scale not yet reverse-scored, so low = more agreement)
keysubjects_data$recode_interest <- factor(ifelse(keysubjects_data$Interest >= 1 & keysubjects_data$Interest <= 3, "Positive", "Negative"))
table(keysubjects_data$recode_interest, keysubjects_data$Subject)
```
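The chunk above stops at a cross-tabulation; the pie charts named in the bullet could be drawn from the same table. This is a sketch only: `toy_keys` and its values are invented, and the cut-off (1 to 3 counts as "Positive" on the unreversed scale) is an assumption that should be checked against the questionnaire.

```r
# Toy data standing in for keysubjects_data (values are invented).
toy_keys <- data.frame(
  Subject = rep(c("Philosophy", "Ethics", "Theology", "Religious Studies"), each = 4),
  Interest = c(1, 2, 4, 5,  1, 1, 2, 6,  3, 4, 5, 5,  2, 3, 4, 1)
)

# Assumption: on the unreversed scale, 1-3 is "Positive" and anything higher is "Negative".
toy_keys$recode_interest <- factor(
  ifelse(toy_keys$Interest <= 3, "Positive", "Negative"),
  levels = c("Positive", "Negative")
)

# One row per subject, one column per response group.
key_counts <- table(toy_keys$Subject, toy_keys$recode_interest)

# Draw one pie per subject in a 2 x 2 grid, then restore the plotting layout.
old_par <- par(mfrow = c(2, 2), mar = c(1, 1, 2, 1))
for (subject in rownames(key_counts)) {
  pie(key_counts[subject, ], main = subject, col = c("steelblue", "tomato"))
}
par(old_par)
```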
- subsetted visualisations of responses with separate subsetting by response to Q8-9, Q18, Q17, Q16
- For question Q8 + Q9 (for religious people)
- visualisation summary of responses
- show subsetted visualisations of responses by response to Q18, Q17, Q16, Q13, Q14
- For responses to Q10-12 (what subjects are involved in...):
- represent answer counts as descending bar chart for each Q
- subset answers by Q6 (positive / negative) and Q5 (positive / negative)
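The descending bar chart and the positive / negative subsetting for Q10-12 might look like the sketch below. The vectors `q10_answers` and `q6_group` are hypothetical stand-ins with invented values; the real column names in `TSR_data` are not shown in this file.

```r
# Invented answers standing in for one of the Q10-12 columns.
q10_answers <- c("Ethics", "History", "Ethics", "Philosophy", "Ethics", "History")

# Count each answer and sort from most to least frequent.
q10_counts <- sort(table(q10_answers), decreasing = TRUE)

# Bars appear in the sorted (descending) order.
barplot(q10_counts, las = 2, ylab = "Number of responses")

# Hypothetical positive / negative grouping of the same respondents (e.g. from Q6).
q6_group <- c("Positive", "Positive", "Negative", "Positive", "Positive", "Negative")

# Cross-tabulate, keeping the rows in the same descending order as the chart.
by_group <- table(q10_answers, q6_group)[names(q10_counts), ]
by_group
```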
# Correlation testing:
- For Q6 (subject interest) / Q5 (subject knowledge) / Q7 (employability prospects), test for nature / strength of correlation with responses to:
- Q8-9 responses
- Q18 responses
- Q17
<<<<<<< Updated upstream
- Q18
=======
- Q18
```{r Q6 Correlations - Subject Interest}
#Q8-9 (8 - Theology as subject for religious people; 9 - Religion as study for religious people)
# This would be suitable for correlation
#Q17 (Ethnicity)
# This would be categorical, so ANOVA
#Q18 (Religion)
# This would also be categorical, so ANOVA
```
```{r Q5 Correlations - Subject Knowledge}
#Q8-9 (8 - Theology as subject for religious people; 9 - Religion as study for religious people)
# This would be suitable for correlation
#Q17 (Ethnicity)
# This would be categorical, so ANOVA
#Q18 (Religion)
# This would also be categorical, so ANOVA
```
```{r Q7 Correlations - Employability}
#Q8-9 (8 - Theology as subject for religious people; 9 - Religion as study for religious people)
# This would be suitable for correlation
#Q17 (Ethnicity)
# This would be categorical, so ANOVA
#Q18 (Religion)
# This would also be categorical, so ANOVA
```
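The three placeholder chunks above describe the same plan: a correlation for Q8-9 and an ANOVA for the categorical questions. One way this could be filled in is sketched below on simulated data. The column names `Q8` and `Religion`, and the choice of Spearman's rank correlation and a Kruskal-Wallis check, are assumptions rather than decisions recorded in this file.

```r
set.seed(1)  # reproducible toy data
n <- 60
toy <- data.frame(
  Interest = sample(1:5, n, replace = TRUE),  # Q6-style LIKERT score
  Q8 = sample(1:5, n, replace = TRUE),        # Q8-style LIKERT score
  Religion = factor(sample(c("Christian", "Muslim", "No religion"), n, replace = TRUE))
)

# Ordinal vs ordinal: Spearman rank correlation (exact = FALSE avoids the ties warning).
q8_cor <- cor.test(toy$Interest, toy$Q8, method = "spearman", exact = FALSE)
q8_cor$estimate
q8_cor$p.value

# Score vs categorical grouping: one-way ANOVA, as suggested in the chunk comments.
religion_aov <- aov(Interest ~ Religion, data = toy)
summary(religion_aov)

# Rank-based alternative that does not assume normally distributed scores.
religion_kw <- kruskal.test(Interest ~ Religion, data = toy)
religion_kw$p.value
```

The same three calls would apply to the Q5 and Q7 outcomes by swapping `Interest` for the relevant score column. Treating LIKERT scores as interval data in an ANOVA is a judgment call, which is why the rank-based test is shown alongside it.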