1 What are the highest math classes that students took?

As seen in the graph and table in the tabs below, the most frequent highest math course taken is MTH 231 in both 2019 and 2020. This course accounts for 36.9% and 41.0% of the highest math courses taken in 2019 and 2020, respectively, out of courses that range from MTH 103 to MTH 432.

1.1 Highest Math Course Graph

1.2 Highest Math Course Table

MTH 2019 N 2019 % 2020 N 2020 %
103 0 0.0% 1 0.2%
107 3 0.6% 4 0.8%
111 4 0.8% 2 0.4%
114 3 0.6% 1 0.2%
121 51 10.7% 47 9.6%
131 51 10.7% 42 8.6%
132 1 0.2% 0 0.0%
141 9 1.9% 10 2.0%
231 176 36.9% 201 41.0%
241 30 6.3% 20 4.1%
242 38 8.0% 25 5.1%
305 1 0.2% 0 0.0%
331 2 0.4% 0 0.0%
341 14 2.9% 21 4.3%
405 1 0.2% 1 0.2%
425 0 0.0% 1 0.2%
426 0 0.0% 1 0.2%
432 1 0.2% 1 0.2%
No Record 92 19.3% 112 22.9%
Total 477 100.0% 490 100.0%
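
The headline percentages can be checked directly from the counts in the table. A minimal sketch in R, with the counts copied from the MTH 231 and Total rows (the variable names are illustrative, not taken from the original analysis):

```r
# Counts copied from the table above: MTH 231 row and Total row
n_231   <- c(`2019` = 176, `2020` = 201)
n_total <- c(`2019` = 477, `2020` = 490)

# Share of each cohort whose highest math course is MTH 231, in percent
pct_231 <- round(100 * n_231 / n_total, 1)
pct_231
```

Note that the denominators include the ‘No Record’ students, which is how the table's percentage columns are computed.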

2 When did they take their highest math?

The heatmap displays the most frequent term in which the highest math class was taken, by year. The colors/values correspond to the counts/frequencies that are greater than the median count value (excluding zeros). The median frequency across all year and MTH course combinations for 2019 was 7 and was 6 for 2020 (excluding zeros). Most students took their highest math class within 4 semesters of taking GN 311. For GN 311 2019, the semester where most students took their highest MTH course was the 2018 Spring Term. For GN 311 2020, the 2019 Spring Term contained the highest frequencies of maximum MTH course taken.

NOTE: Students who took their highest math class more than once are counted only once, in the semester in which their grade points were highest (which generally corresponds to the latest enrollment); ties in grade points across semesters are broken by latest enrollment.
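
The counting rule in the note can be sketched as a sort followed by keeping the first row per student. This is only an illustration on hypothetical records; the data frame, its column names, and the numeric `term_index` coding (larger means later) are assumptions, not the original data structure.

```r
# Hypothetical records: one row per student per attempt at their highest MTH course
attempts <- data.frame(
  student      = c("A", "A", "B", "B", "C"),
  term_index   = c(1, 3, 2, 4, 2),          # assumed coding: larger = later term
  grade_points = c(2.0, 3.3, 3.0, 3.0, 4.0)
)

# Sort so each student's preferred attempt comes first:
# highest grade points, then latest term when grade points tie
ordered <- attempts[order(attempts$student,
                          -attempts$grade_points,
                          -attempts$term_index), ]

# Keep only the first (preferred) row for each student
kept <- ordered[!duplicated(ordered$student), ]
kept
```

Student B has two attempts with the same grade points, so the later term (4) is the one retained.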

2.1 Graphs of Highest Math by Year Taken

2.2 2019 Table

107 111 114 121 131 132 141 231 241 242 305 331 341 405 432 Total
2020 Spring Term 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2019 Fall Term 1 0 0 2 4 1 1 14 3 4 0 1 2 0 0 33
2019 Summer Term 2 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 3
2019 Summer Term 1 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 4
2019 Spring Term 0 0 0 6 7 0 0 40 2 9 1 0 1 0 1 67
2018 Fall Term 0 0 2 10 9 0 3 32 9 10 0 0 3 0 0 78
2018 Summer Term 2 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 2
2018 Summer Term 1 0 0 0 0 0 0 0 1 0 2 0 0 1 0 0 4
2018 Spring Term 0 1 0 13 10 0 1 47 7 9 0 0 1 1 0 90
2017 Fall Term 2 0 1 7 11 0 1 21 4 3 0 1 0 0 0 51
2017 Summer Term 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
2017 Spring Term 0 1 0 3 1 0 1 10 3 0 0 0 5 0 0 24
2016 Fall Term 0 0 0 7 4 0 0 5 1 0 0 0 0 0 0 17
2016 Spring Term 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2015 Fall Term 0 0 0 1 2 0 1 0 0 1 0 0 0 0 0 5
2015 Summer Term 2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2014 Fall Term 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1
2014 Spring Term 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1
2013 Fall Term 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2013 Spring Term 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2011 Summer Term 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2011 Spring Term 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1
2010 Fall Term 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1
Total 3 4 3 51 51 1 9 176 30 38 1 2 14 1 1 385

2.3 2020 Table

103 107 111 114 121 131 141 231 241 242 341 405 425 426 432 Total
2020 Spring Term 1 1 1 0 2 7 0 21 1 4 3 0 0 1 0 42
2019 Fall Term 0 0 0 0 3 3 1 43 2 7 1 1 1 0 0 62
2019 Summer Term 2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2019 Summer Term 1 0 0 0 0 0 2 0 1 0 1 0 0 0 0 0 4
2019 Spring Term 0 0 1 0 7 9 1 45 5 3 6 0 0 0 1 78
2018 Fall Term 0 1 0 0 12 5 3 35 2 6 1 0 0 0 0 65
2018 Summer Term 2 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 2
2018 Summer Term 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 2
2018 Spring Term 0 1 0 0 9 5 0 28 3 0 4 0 0 0 0 50
2017 Fall Term 0 0 0 0 10 8 4 12 2 3 3 0 0 0 0 42
2017 Summer Term 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
2017 Spring Term 0 0 0 0 2 0 0 7 2 0 1 0 0 0 0 12
2016 Fall Term 0 0 0 1 1 2 0 5 1 0 0 0 0 0 0 10
2016 Spring Term 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
2015 Fall Term 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1
2015 Summer Term 2 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
2014 Fall Term 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1
2014 Spring Term 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2013 Fall Term 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1
2013 Spring Term 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1
2011 Summer Term 2 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1
2011 Spring Term 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2010 Fall Term 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Total 1 4 2 1 47 42 10 201 20 25 21 1 1 1 1 378

3 How many students have taken ST 311? (prior to enrollment in GN 311)

193 students took ST 311 prior to taking GN 311 in the fall of 2019. 213 students took ST 311 prior to taking GN 311 in the spring of 2020. This table excludes students that took ST 311 during their enrollment in GN 311.

NOTE: Students who took ST 311 more than once are counted only once, in the semester in which their grade points were highest (which generally corresponds to the latest enrollment); ties in grade points across semesters are broken by latest enrollment.

3.1 Table

Term 2019 N 2019 % 2020 N 2020 %
2019 Fall Term 0 0.0% 67 31.5%
2019 Summer Term 2 2 1.0% 0 0.0%
2019 Summer Term 1 3 1.6% 3 1.4%
2019 Spring Term 50 25.9% 51 23.9%
2018 Fall Term 62 32.1% 45 21.1%
2018 Summer Term 1 3 1.6% 2 0.9%
2018 Spring Term 33 17.1% 23 10.8%
2017 Fall Term 20 10.4% 10 4.7%
2017 Summer Term 2 1 0.5% 0 0.0%
2017 Spring Term 12 6.2% 9 4.2%
2016 Fall Term 4 2.1% 2 0.9%
2016 Spring Term 0 0.0% 1 0.5%
2015 Fall Term 1 0.5% 0 0.0%
2014 Fall Term 1 0.5% 0 0.0%
2011 Fall Term 1 0.5% 0 0.0%
Total 193 100.0% 213 100.0%

4 How many people are currently taking ST 311? (while enrolled in GN 311)

36 students were enrolled in ST 311 while taking GN 311 in fall 2019. 28 students are currently enrolled in both ST 311 and GN 311 in spring 2020.

NOTE: A few students may also have taken ST 311 before the ST 311 enrollment that coincides with their GN 311 cohort.

4.1 Table

Year Enrolled N Enrolled % Not Enrolled N Not Enrolled %
2019 36 7.5% 441 92.5%
2020 28 5.7% 462 94.3%

5 What was the average GPA for ST 311?

The average GPA for students who took ST 311 and GN 311 during Fall 2019 (indicated by the ‘Current (2019)’ row) is 3.0. The average GPA for students who did not take (or are not taking) ST 311 concurrently with GN 311, in 2019 and 2020 (middle row), is 3.3. The overall average GPA including all students (see notes below) is 3.1.

5.1 Table

Min Q1 Median Mean Q3 Max N
Current (2019) 1.0 2.7 3.0 3.0 3.3 4.0 36
Prior 1.0 3.0 3.3 3.3 4.0 4.3 404
Overall 0.0 2.7 3.3 3.1 4.0 4.3 468

6 How many students got Homework 1 question 10 correct? How many got Homework 2 question 9 correct?

The table below displays 2020 in blue. The counts and percentages of correct, incorrect, and ‘No Record’ responses are indicated in the columns below.

6.1 Table

Column groups, left to right: 2019 Correct, 2019 Incorrect, 2019 No Record, 2020 Correct, 2020 Incorrect, 2020 No Record
Question N % N % N % N % N % N %
Q9 343 71.9% 114 23.9% 20 4.2% 378 77.1% 95 19.4% 17 3.5%
Q10 321 67.3% 139 29.1% 17 3.6% 320 65.3% 154 31.4% 16 3.3%
Both 246 51.6% 214 44.9% 17 3.6% 268 54.7% 206 42.0% 16 3.3%

7 Does it make a difference whether someone has taken ST 311 or not as to whether they got question 10 or 9 correct?

7.1 Table

Below is a frequency table for questions 9, 10 and both that tabulates correct, incorrect and no record responses by whether or not ST 311 was taken. The first two rows are for 2019 and the second two rows are for 2020. These questions are prior to intervention in the spring 2020 semester so we would roughly expect to see the same trends across 2019 and 2020 assuming similar populations of students.

Column groups, left to right: Q9, Q10, Both
Year ST 311 Correct Incorrect No Record Correct Incorrect No Record Correct Incorrect No Record
2019 Not Taken 175 60 13 169 70 9 125 114 9
2019 Took 311 168 54 7 152 69 8 121 100 8
2020 Not Taken 202 39 10 168 73 10 150 91 10
2020 Took 311 176 56 7 152 81 6 118 115 6

7.2 Significant Differences

A binomial GLM of the form logodds(BothCorrect) ~ HighestMTH + tookST311 + ST311current did not yield any significant factors.

Marginal binomial models were fit for the log odds of answering each question correctly, with whether or not a student has taken ST 311 as the factor of interest. For 2019 they indicate no significant association between taking ST 311 and correct answers on either question or on both combined. The same holds for 2020 except for question 9, where a Wald test gives p = 0.032 (this is the 2020 output shown below). However, the chi-square test does not indicate significance, and after accounting for the multiple tests the association is not likely significant.

‘No Record’ responses were excluded from these models.

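Below is the summary output of the binomial model coefficients for question 9 in 2019, followed by the chi-square association test. The second block shows the same model and test for question 9 in 2020. The intercepts match the log of the incorrect-to-correct counts among students who had not taken ST 311 (for example, log(60/175) ≈ -1.07 for 2019), so the estimates appear to be on the log-odds-of-incorrect scale. The chi-square tests use all cases in the table, including ‘No Record’ (477 and 490). The remaining tests of interest resemble the 2019 result: there is no evidence of a statistically significant association between taking ST 311 and answer status.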

##                      Estimate Std. Error    z value     Pr(>|z|)
## (Intercept)       -1.07044141  0.1496025 -7.1552365 8.352878e-13
## tookST311Took 311 -0.06453852  0.2164527 -0.2981645 7.655776e-01
## Number of cases in table: 477 
## Number of factors: 2 
## Test for independence of all factors:
##  Chisq = 1.5042, df = 2, p-value = 0.4714
##                     Estimate Std. Error   z value     Pr(>|z|)
## (Intercept)       -1.6447061  0.1749043 -9.403462 5.279685e-21
## tookST311Took 311  0.4995737  0.2326595  2.147231 3.177488e-02
## Number of cases in table: 490 
## Number of factors: 2 
## Test for independence of all factors:
##  Chisq = 5.069, df = 2, p-value = 0.0793
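
The 2019 output can be reproduced from the question 9 counts in the frequency table. The sketch below is a reconstruction under assumptions, not the original code: the object names are illustrative, and modeling `incorrect` as the outcome is an inference from the sign and size of the printed intercept.

```r
# Question 9, 2019 cohort: counts copied from the frequency table in 7.1
q9_2019 <- data.frame(
  tookST311 = factor(c("Not Taken", "Took 311"),
                     levels = c("Not Taken", "Took 311")),
  correct   = c(175, 168),
  incorrect = c(60, 54),
  no_record = c(13, 7)
)

# Binomial GLM on grouped counts, excluding 'No Record';
# cbind(incorrect, correct) treats "incorrect" as the modeled outcome (assumed)
fit <- glm(cbind(incorrect, correct) ~ tookST311,
           family = binomial, data = q9_2019)
round(coef(summary(fit)), 4)

# Chi-square test of independence on the full 2 x 3 table (No Record included)
tab <- as.matrix(q9_2019[, c("correct", "incorrect", "no_record")])
chi <- chisq.test(tab)
chi$statistic
```

Under these assumptions the coefficients agree with the printed estimates (about -1.0704 and -0.0645), and the chi-square statistic agrees with the printed 1.5042 on 2 degrees of freedom.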

8 Analysis in regard to HW 6 questions

Intervention occurred between homework 2 and homework 6. We want to look at any possible impact the intervention may have had. There are a number of questions from homework 6 to consider here. There are binary indicators of correct/incorrect for each and one summary average variable that aggregates across the questions.

For every question, ignoring missing values (students who didn’t complete the question or didn’t have a cohort), the first quartile of the binary scores is 1. This indicates that at least 75% of students got each individual question correct. The sample proportion of correct responses for each question is given below, along with the summary stats on the aggregate variable.

library(dplyr); library(knitr); library(kableExtra)
# Proportion correct per cohort: the mean of each binary (0/1) score column
means <- full_data %>% group_by(Cohort) %>% select(Cohort, ends_with("bin")) %>% 
  summarize_if(is.numeric, mean, na.rm = TRUE)
kable(means[1:2, ], col.names = c("Cohort", paste("Q", c(3:10, 15:19)), "Aggregate"), digits = 3) %>% 
  kable_styling(bootstrap_options = "striped", full_width = TRUE)
Cohort Q 3 Q 4 Q 5 Q 6 Q 7 Q 8 Q 9 Q 10 Q 15 Q 16 Q 17 Q 18 Q 19 Aggregate
Fall2019 0.978 0.978 0.876 0.868 0.868 0.863 0.857 0.854 0.881 0.951 0.911 0.867 0.880 0.900
Spring2020 0.969 0.966 0.921 0.906 0.901 0.864 0.841 0.846 0.890 0.959 0.931 0.927 0.921 0.921

Looking closely at the questions themselves, I’m going to remove questions 8, 9, and 10 (Keeping all decimal digits during your calculations and rounding your answer to 3 decimal digits, what is the calculated chi-square value? What is the number for the degrees of freedom? What is the critical value? Round to two decimal places.) as these weren’t focused on in the intervention.

full_data <- select(full_data, -HW6_Q_8_score_bin, -HW6_Q_9_score_bin, -HW6_Q_10_score_bin)

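A quick comparison of these sample proportions by question, using a GLM to get a p-value. Note that glm() is called here without a family argument, so it fits a linear (Gaussian) model rather than a logistic regression: the intercept is the Fall 2019 proportion and the slope is the difference between cohorts. The assumptions are likely not met, but the result is still somewhat useful.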
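
In R, glm() defaults to a Gaussian family when `family` is not given; a logistic regression needs `family = binomial`. The toy sketch below, on hypothetical data (the 43/50 and 46/50 counts are made up for illustration), shows how the two fits relate to the group proportions.

```r
# Hypothetical binary scores: 43 of 50 correct in one cohort, 46 of 50 in the other
toy <- data.frame(
  Cohort = rep(c("Fall2019", "Spring2020"), each = 50),
  score  = c(rep(1, 43), rep(0, 7), rep(1, 46), rep(0, 4))
)

lin_fit <- glm(score ~ Cohort, data = toy)                     # default gaussian family
log_fit <- glm(score ~ Cohort, data = toy, family = binomial)  # logistic regression

# Linear fit: intercept is the first cohort's proportion, slope is the difference
coef(lin_fit)

# Logistic fit: coefficients are on the log-odds scale;
# the inverse logit of the intercept recovers the same first-cohort proportion
plogis(coef(log_fit)[1])
```

Both fits describe the same two proportions, but the standard errors and p-values are computed differently, so the p-values will generally not coincide.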

full_data <- full_data[!is.na(full_data$Cohort),]

#create indicator of math above 100 level or not
full_data$MTHLevel <- ifelse(full_data$MTHcourse > 200, "200's", "100's")

# Fit response ~ Cohort for one binary score column and return a labeled summary
sum_glm <- function(data, response){ 
  # No family is given, so glm() uses its default gaussian family (a linear model)
  fit <- glm(data[[response]] ~ Cohort, data = data)
  means <- aggregate(data[[response]] ~ Cohort, data = data, FUN = mean)
  # Pull the question label (e.g. "Q_3") out of the column name
  q_label <- regmatches(response, regexpr("Q_\\d+", response))
  # c() mixes a string with numbers, so the result is a character vector
  ret <- c(q_label, round(c(fit$coefficients, means[,2], means[2,2]-means[1,2], summary(fit)$coefficients[2,4]), 4))
  names(ret) <- c("Response", "Intercept", "Beta1", "2019 mean", "2020 mean", "Difference", "p-value")
  ret
}

bin_names <- c("HW6_Q_3_score_bin", "HW6_Q_4_score_bin", "HW6_Q_5_score_bin",
               "HW6_Q_6_score_bin", "HW6_Q_7_score_bin",  "HW6_Q_15_score_bin",
               "HW6_Q_16_score_bin", "HW6_Q_17_score_bin", "HW6_Q_18_score_bin",
           "HW6_Q_19_score_bin")
lapply(X = bin_names, FUN = sum_glm, data = full_data)
## [[1]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_3"   "0.9785"  "-0.0097"   "0.9785"   "0.9688"  "-0.0097"   "0.4166" 
## 
## [[2]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_4"   "0.9784"  "-0.0125"   "0.9784"   "0.9659"  "-0.0125"   "0.3053" 
## 
## [[3]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_5"    "0.876"   "0.0453"    "0.876"   "0.9213"   "0.0453"   "0.0461" 
## 
## [[4]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_6"   "0.8679"   "0.0385"   "0.8679"   "0.9064"   "0.0385"   "0.1058" 
## 
## [[5]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_7"   "0.8679"    "0.033"   "0.8679"   "0.9009"    "0.033"   "0.1704" 
## 
## [[6]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_15"   "0.8808"   "0.0091"   "0.8808"   "0.8899"   "0.0091"   "0.7037" 
## 
## [[7]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_16"   "0.9514"   "0.0081"   "0.9514"   "0.9594"   "0.0081"   "0.6027" 
## 
## [[8]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_17"   "0.9111"   "0.0198"   "0.9111"   "0.9308"   "0.0198"   "0.3278" 
## 
## [[9]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_18"   "0.8672"   "0.0599"   "0.8672"   "0.9271"   "0.0599"   "0.0088" 
## 
## [[10]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_19"   "0.8804"   "0.0408"   "0.8804"   "0.9213"   "0.0408"   "0.0696"

Three of these (questions 5, 18, and 19, with p = 0.046, 0.009, and 0.070) are significant or marginally significant and in the direction we’d hope for. That’s a good sign.
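
Since ten questions are tested at once, it may be worth seeing how the p-values hold up under a multiple-testing adjustment of the kind mentioned for the homework 2 models. The sketch below is a supplementary check, not part of the original analysis; it applies a Holm correction to the p-values copied from the output above, and the choice of Holm is an assumption.

```r
# p-values for the Cohort term, copied from the ten per-question outputs above
p_raw <- c(Q3 = 0.4166, Q4 = 0.3053, Q5 = 0.0461, Q6 = 0.1058, Q7 = 0.1704,
           Q15 = 0.7037, Q16 = 0.6027, Q17 = 0.3278, Q18 = 0.0088, Q19 = 0.0696)

# Holm step-down adjustment controls the family-wise error rate
p_holm <- p.adjust(p_raw, method = "holm")
round(sort(p_holm), 3)
```

The smallest adjusted value belongs to question 18 (0.0088 × 10 = 0.088), so the unadjusted results are best read as suggestive rather than conclusive.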

We’ll also consider just the students that missed at least one question from the earlier homework assignment. Perhaps they are the ones that will show improvement.

#filter by missing an earlier question or having no record
missed_data <- filter(full_data, bothQscorrect != 1)
lapply(X = bin_names, FUN = sum_glm, data = missed_data)
## [[1]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_3"   "0.9665"  "-0.0242"   "0.9665"   "0.9423"  "-0.0242"   "0.2873" 
## 
## [[2]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_4"   "0.9719"   "-0.036"   "0.9719"   "0.9359"   "-0.036"   "0.1135" 
## 
## [[3]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_5"   "0.8212"   "0.0359"   "0.8212"   "0.8571"   "0.0359"   "0.3834" 
## 
## [[4]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_6"   "0.8212"    "7e-04"   "0.8212"   "0.8219"    "7e-04"   "0.9872" 
## 
## [[5]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##      "Q_7"   "0.8156"    "7e-04"   "0.8156"   "0.8163"    "7e-04"   "0.9874" 
## 
## [[6]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_15"   "0.8475"   "0.0192"   "0.8475"   "0.8667"   "0.0192"    "0.623" 
## 
## [[7]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_16"   "0.9438"  "-0.0034"   "0.9438"   "0.9404"  "-0.0034"   "0.8949" 
## 
## [[8]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_17"   "0.8939"   "-0.005"   "0.8939"   "0.8889"   "-0.005"   "0.8851" 
## 
## [[9]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_18"   "0.8079"   "0.0847"   "0.8079"   "0.8926"   "0.0847"   "0.0346" 
## 
## [[10]]
##   Response  Intercept      Beta1  2019 mean  2020 mean Difference    p-value 
##     "Q_19"   "0.8239"   "0.0755"   "0.8239"   "0.8993"   "0.0755"    "0.052"

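It looks like only the last two questions (18 and 19) showed improvement that is significant or marginally significant (p = 0.035 and 0.052) in this subset. Questions 3, 4, 16, and 17 moved slightly in the wrong direction, but none of those differences is close to significant.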

9 HW 6 Q’s broken down - all students

To further investigate the effect of the intervention, we’ll break down correct/incorrect question status by cohort and by things like:

  • highest math course taken (>200 or <200) (MTHLevel)
  • ST 311 status (not taken or taken) (tookST311)
  • class level (senior, etc.) (NC_LVL_BOT_DESCR)
  • race (STDNT_RACE_IPEDS)
  • sex (STUDENT_GENDER_IPEDS)
  • transfer status (ORIG_ENROLL_STATUS)
  • rural status (RuralStatus)
  • first gen status (FirstGenStatus)

First just the means across some combinations.
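
The helpers used in the following sections (`q_means`, `nice_it`, `preds`, `qs`) are not shown in this report. As a rough idea of what such a summary involves, the sketch below builds one table of cohort means, counts, and differences by a single grouping variable. It uses hypothetical data and is not the original helper code.

```r
# Hypothetical data: a binary score, the cohort, and one grouping variable
toy <- data.frame(
  Cohort   = rep(c("Fall2019", "Spring2020"), each = 4),
  MTHLevel = rep(c("100's", "100's", "200's", "200's"), times = 2),
  score    = c(1, 0, 1, 1, 1, 1, 1, 0)
)

# Mean score and number of students in each MTHLevel x Cohort cell
avg <- tapply(toy$score, list(toy$MTHLevel, toy$Cohort), mean)
n   <- tapply(toy$score, list(toy$MTHLevel, toy$Cohort), length)

# One row per subgroup, with both cohorts side by side and their difference
out <- data.frame(
  MathLevel = rownames(avg),
  Avg2019   = avg[, "Fall2019"],
  Avg2020   = avg[, "Spring2020"],
  n2019     = n[, "Fall2019"],
  n2020     = n[, "Spring2020"],
  AvgDiff   = avg[, "Spring2020"] - avg[, "Fall2019"],
  row.names = NULL
)
out
```

With small cells, as in several rows of the tables that follow, a single student can move a subgroup mean by a large amount, so the n columns matter when reading AvgDiff.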

9.1 By Cohort, Math Level, Transfer, and First Gen

out2 <- lapply(X = bin_names, FUN = q_means, data = full_data, preds = preds[c(1,2,7,9)])

out2e <- lapply(FUN = nice_it, X= out2)

for(x in 1:length(out2e)){
  print(kable(out2e[[x]], digits = 3, col.names = c("MathLevel", "Transfer", "First Gen", "Avg2019", "Avg2020", "n2019", "n2020", "AvgDiff"), caption = paste0("Question ", qs[x], " Summary")) %>% 
          kable_styling(bootstrap_options = "striped",full_width = T) )
} 
Question 3 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.909 1.000 11 10 0.091
200’s New Student First Gen 1.000 0.976 33 42 -0.024
100’s New Transfer Student First Gen 1.000 1.000 9 5 0.000
200’s New Transfer Student First Gen 1.000 0.818 8 11 -0.182
100’s New Student Not First 0.975 0.984 80 63 0.009
200’s New Student Not First 0.979 0.978 190 180 -0.001
100’s New Transfer Student Not First 1.000 1.000 11 11 0.000
200’s New Transfer Student Not First 0.957 1.000 23 17 0.043
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 1.000 NA 3 NA
200’s New Transfer Student Unknown 1.000 0.500 2 4 -0.500
Question 4 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.909 1.000 11 10 0.091
200’s New Student First Gen 1.000 1.000 33 42 0.000
100’s New Transfer Student First Gen 1.000 0.600 9 5 -0.400
200’s New Transfer Student First Gen 1.000 0.818 8 11 -0.182
100’s New Student Not First 0.962 0.984 79 63 0.022
200’s New Student Not First 0.984 0.983 190 179 -0.001
100’s New Transfer Student Not First 1.000 1.000 11 11 0.000
200’s New Transfer Student Not First 0.957 0.941 23 17 -0.015
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 1.000 NA 3 NA
200’s New Transfer Student Unknown 1.000 0.500 2 4 -0.500
Question 5 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.636 0.667 11 9 0.030
200’s New Student First Gen 0.970 0.927 33 41 -0.043
100’s New Transfer Student First Gen 0.778 1.000 9 5 0.222
200’s New Transfer Student First Gen 0.750 0.818 8 11 0.068
100’s New Student Not First 0.838 0.935 80 62 0.098
200’s New Student Not First 0.905 0.926 190 176 0.021
100’s New Transfer Student Not First 0.909 1.000 11 10 0.091
200’s New Transfer Student Not First 0.818 1.000 22 17 0.182
100’s New Student Unknown 0.667 1.000 3 2 0.333
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 0.667 NA 3 NA
200’s New Transfer Student Unknown 1.000 0.667 2 3 -0.333
Question 6 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.636 0.750 11 8 0.114
200’s New Student First Gen 0.970 0.878 33 41 -0.092
100’s New Transfer Student First Gen 0.778 0.800 9 5 0.022
200’s New Transfer Student First Gen 0.750 0.909 8 11 0.159
100’s New Student Not First 0.838 0.903 80 62 0.066
200’s New Student Not First 0.889 0.920 190 176 0.031
100’s New Transfer Student Not First 0.909 1.000 11 10 0.091
200’s New Transfer Student Not First 0.818 0.941 22 17 0.123
100’s New Student Unknown 0.667 1.000 3 2 0.333
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 0.667 NA 3 NA
200’s New Transfer Student Unknown 1.000 0.667 2 3 -0.333
Question 7 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.636 0.556 11 9 -0.081
200’s New Student First Gen 0.970 0.878 33 41 -0.092
100’s New Transfer Student First Gen 0.778 0.800 9 5 0.022
200’s New Transfer Student First Gen 0.750 0.818 8 11 0.068
100’s New Student Not First 0.838 0.935 80 62 0.098
200’s New Student Not First 0.895 0.915 190 176 0.020
100’s New Transfer Student Not First 0.909 1.000 11 10 0.091
200’s New Transfer Student Not First 0.818 0.941 22 17 0.123
100’s New Student Unknown 0.667 1.000 3 2 0.333
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 0.667 NA 3 NA
200’s New Transfer Student Unknown 0.500 0.667 2 3 0.167
Question 15 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.727 0.889 11 9 0.162
200’s New Student First Gen 0.848 0.929 33 42 0.080
100’s New Transfer Student First Gen 1.000 1.000 9 5 0.000
200’s New Transfer Student First Gen 0.875 1.000 8 10 0.125
100’s New Student Not First 0.886 0.852 79 61 -0.034
200’s New Student Not First 0.884 0.876 189 177 -0.008
100’s New Transfer Student Not First 0.818 1.000 11 11 0.182
200’s New Transfer Student Not First 0.909 0.941 22 17 0.032
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 1.000 NA 3 NA
200’s New Transfer Student Unknown 1.000 0.667 2 3 -0.333
Question 16 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.909 1.000 11 9 0.091
200’s New Student First Gen 0.970 0.952 33 42 -0.017
100’s New Transfer Student First Gen 1.000 0.800 9 5 -0.200
200’s New Transfer Student First Gen 1.000 0.909 8 11 -0.091
100’s New Student Not First 0.975 0.984 80 61 0.009
200’s New Student Not First 0.937 0.966 189 177 0.030
100’s New Transfer Student Not First 1.000 1.000 11 11 0.000
200’s New Transfer Student Not First 0.955 0.941 22 17 -0.013
100’s New Student Unknown 0.667 0.500 3 2 -0.167
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 0.667 NA 3 NA
200’s New Transfer Student Unknown 1.000 1.000 2 3 0.000
Question 17 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.727 0.800 11 10 0.073
200’s New Student First Gen 0.970 0.929 33 42 -0.041
100’s New Transfer Student First Gen 0.778 1.000 9 5 0.222
200’s New Transfer Student First Gen 0.750 0.818 8 11 0.068
100’s New Student Not First 0.938 0.933 81 60 -0.005
200’s New Student Not First 0.910 0.966 189 178 0.056
100’s New Transfer Student Not First 1.000 0.636 11 11 -0.364
200’s New Transfer Student Not First 0.864 0.941 22 17 0.078
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 1.000 2 3 0.000
100’s New Transfer Student Unknown NA 0.667 NA 3 NA
200’s New Transfer Student Unknown 1.000 1.000 2 3 0.000
Question 18 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.818 0.778 11 9 -0.040
200’s New Student First Gen 0.939 0.952 33 42 0.013
100’s New Transfer Student First Gen 0.889 0.800 9 5 -0.089
200’s New Transfer Student First Gen 0.750 0.900 8 10 0.150
100’s New Student Not First 0.888 0.934 80 61 0.047
200’s New Student Not First 0.851 0.932 188 176 0.081
100’s New Transfer Student Not First 0.909 0.909 11 11 0.000
200’s New Transfer Student Not First 0.909 0.941 22 17 0.032
100’s New Student Unknown 1.000 1.000 3 2 0.000
200’s New Student Unknown 0.500 1.000 2 3 0.500
100’s New Transfer Student Unknown NA 1.000 NA 3 NA
200’s New Transfer Student Unknown 0.500 0.667 2 3 0.167
Question 19 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.818 0.778 11 9 -0.040
200’s New Student First Gen 0.939 0.929 33 42 -0.011
100’s New Transfer Student First Gen 0.889 0.800 9 5 -0.089
200’s New Transfer Student First Gen 0.750 0.900 8 10 0.150
100’s New Student Not First 0.888 0.902 80 61 0.014
200’s New Student Not First 0.872 0.943 188 176 0.071
100’s New Transfer Student Not First 0.909 0.909 11 11 0.000
200’s New Transfer Student Not First 0.952 0.882 21 17 -0.070
100’s New Student Unknown 1.000 1.000 3 2 0.000
200’s New Student Unknown 0.500 1.000 2 3 0.500
100’s New Transfer Student Unknown NA 1.000 NA 3 NA
200’s New Transfer Student Unknown 0.500 0.667 2 3 0.167

9.2 By Cohort, Math Level, and Rural Status

out3 <- lapply(X = bin_names, FUN = q_means, data = full_data, preds = preds[c(1,2, 8)])

out3e <- lapply(FUN = nice_it, X= out3)

for(x in 1:length(out3e)){
  print(kable(out3e[[x]], digits = 3, col.names = c("MathLevel", "Rural", "Avg2019", "Avg2020", "n2019", "n2020", "AvgDiff"), caption = paste0("Question ", qs[x], " Summary")) %>% 
          kable_styling(bootstrap_options = "striped",full_width = T) )
} 
Question 3 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 1.000 1.000 19 13 0.000
200’s OutofState_International 1.000 1.000 31 26 0.000
100’s Tier1&2 0.946 0.967 56 30 0.020
200’s Tier1&2 0.987 0.939 79 82 -0.048
100’s Tier3 1.000 0.981 39 52 -0.019
200’s Tier3 0.973 0.979 148 146 0.006
Question 4 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.895 1.000 19 13 0.105
200’s OutofState_International 1.000 1.000 31 26 0.000
100’s Tier1&2 0.964 0.967 55 30 0.003
200’s Tier1&2 1.000 0.963 79 81 -0.037
100’s Tier3 1.000 0.942 39 52 -0.058
200’s Tier3 0.973 0.973 148 146 0.000
Question 5 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.842 1.000 19 13 0.158
200’s OutofState_International 0.935 0.923 31 26 -0.012
100’s Tier1&2 0.800 0.893 55 28 0.093
200’s Tier1&2 0.886 0.924 79 79 0.038
100’s Tier3 0.825 0.900 40 50 0.075
200’s Tier3 0.905 0.930 147 143 0.025
Question 6 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.737 1.000 19 13 0.263
200’s OutofState_International 0.935 0.923 31 26 -0.012
100’s Tier1&2 0.818 0.889 55 27 0.071
200’s Tier1&2 0.873 0.911 79 79 0.038
100’s Tier3 0.850 0.860 40 50 0.010
200’s Tier3 0.891 0.916 147 143 0.025
Question 7 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.737 1.000 19 13 0.263
200’s OutofState_International 0.935 0.923 31 26 -0.012
100’s Tier1&2 0.818 0.857 55 28 0.039
200’s Tier1&2 0.873 0.911 79 79 0.038
100’s Tier3 0.850 0.880 40 50 0.030
200’s Tier3 0.891 0.909 147 143 0.018
Question 15 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.895 1.000 19 13 0.105
200’s OutofState_International 0.871 0.800 31 25 -0.071
100’s Tier1&2 0.889 0.862 54 29 -0.027
200’s Tier1&2 0.873 0.873 79 79 0.000
100’s Tier3 0.850 0.860 40 50 0.010
200’s Tier3 0.890 0.917 146 145 0.027
Question 16 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.895 1.000 19 13 0.105
200’s OutofState_International 0.903 1.000 31 26 0.097
100’s Tier1&2 0.982 0.929 55 28 -0.053
200’s Tier1&2 0.962 0.975 79 80 0.013
100’s Tier3 0.975 0.960 40 50 -0.015
200’s Tier3 0.945 0.944 146 144 -0.001
Question 17 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.947 0.846 19 13 -0.101
200’s OutofState_International 0.935 0.923 31 26 -0.012
100’s Tier1&2 0.893 0.833 56 30 -0.060
200’s Tier1&2 0.899 0.950 79 80 0.051
100’s Tier3 0.925 0.898 40 49 -0.027
200’s Tier3 0.911 0.959 146 145 0.048
Question 18 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.947 0.923 19 13 -0.024
200’s OutofState_International 0.839 0.923 31 26 0.084
100’s Tier1&2 0.873 0.964 55 28 0.092
200’s Tier1&2 0.899 0.937 79 79 0.038
100’s Tier3 0.875 0.880 40 50 0.005
200’s Tier3 0.841 0.930 145 143 0.089
Question 19 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.947 0.923 19 13 -0.024
200’s OutofState_International 0.871 0.962 31 26 0.091
100’s Tier1&2 0.873 0.964 55 28 0.092
200’s Tier1&2 0.910 0.924 78 79 0.014
100’s Tier3 0.875 0.840 40 50 -0.035
200’s Tier3 0.862 0.930 145 143 0.068

10 HW 6 Q’s broken down - incorrect HW 2

10.1 By Cohort, Math Level, Transfer, and First Gen

out2 <- lapply(X = bin_names, FUN = q_means, data = missed_data, preds = preds[c(1,2,7,9)])

out2e <- lapply(FUN = nice_it, X= out2)

for(x in 1:length(out2e)){
  print(kable(out2e[[x]], digits = 3, col.names = c("MathLevel", "Transfer", "First Gen", "Avg2019", "Avg2020", "n2019", "n2020", "AvgDiff"), caption = paste0("Question ", qs[x], " Summary")) %>% 
          kable_styling(bootstrap_options = "striped",full_width = T) )
} 
Question 3 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.800 1.000 5 9 0.200
200’s New Student First Gen 1.000 0.944 19 18 -0.056
100’s New Transfer Student First Gen 1.000 1.000 7 4 0.000
200’s New Transfer Student First Gen 1.000 0.833 3 6 -0.167
100’s New Student Not First 0.935 0.966 31 29 0.030
200’s New Student Not First 0.968 0.957 94 69 -0.012
100’s New Transfer Student Not First 1.000 1.000 5 7 0.000
200’s New Transfer Student Not First 1.000 1.000 9 6 0.000
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 1.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.500 1 4 -0.500
Question 4 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.800 1.000 5 9 0.200
200’s New Student First Gen 1.000 1.000 19 18 0.000
100’s New Transfer Student First Gen 1.000 0.750 7 4 -0.250
200’s New Transfer Student First Gen 1.000 0.833 3 6 -0.167
100’s New Student Not First 0.933 0.966 30 29 0.032
200’s New Student Not First 0.979 0.957 94 69 -0.022
100’s New Transfer Student Not First 1.000 1.000 5 7 0.000
200’s New Transfer Student Not First 1.000 0.833 9 6 -0.167
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 1.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.500 1 4 -0.500
Question 5 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.400 0.625 5 8 0.225
200’s New Student First Gen 0.947 0.882 19 17 -0.065
100’s New Transfer Student First Gen 0.714 1.000 7 4 0.286
200’s New Transfer Student First Gen 0.667 0.833 3 6 0.167
100’s New Student Not First 0.774 0.893 31 28 0.119
200’s New Student Not First 0.851 0.848 94 66 -0.003
100’s New Transfer Student Not First 0.800 1.000 5 6 0.200
200’s New Transfer Student Not First 0.778 1.000 9 6 0.222
100’s New Student Unknown 0.667 1.000 3 2 0.333
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 0.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.667 1 3 -0.333
Question 6 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.400 0.714 5 7 0.314
200’s New Student First Gen 0.947 0.765 19 17 -0.183
100’s New Transfer Student First Gen 0.714 0.750 7 4 0.036
200’s New Transfer Student First Gen 0.667 0.833 3 6 0.167
100’s New Student Not First 0.774 0.857 31 28 0.083
200’s New Student Not First 0.851 0.833 94 66 -0.018
100’s New Transfer Student Not First 0.800 1.000 5 6 0.200
200’s New Transfer Student Not First 0.778 0.833 9 6 0.056
100’s New Student Unknown 0.667 1.000 3 2 0.333
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 0.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.667 1 3 -0.333
Question 7 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.400 0.500 5 8 0.100
200’s New Student First Gen 0.947 0.765 19 17 -0.183
100’s New Transfer Student First Gen 0.714 0.750 7 4 0.036
200’s New Transfer Student First Gen 0.667 0.833 3 6 0.167
100’s New Student Not First 0.774 0.893 31 28 0.119
200’s New Student Not First 0.840 0.833 94 66 -0.007
100’s New Transfer Student Not First 0.800 1.000 5 6 0.200
200’s New Transfer Student Not First 0.778 0.833 9 6 0.056
100’s New Student Unknown 0.667 1.000 3 2 0.333
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 0.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.667 1 3 -0.333
Question 8 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.800 0.875 5 8 0.075
200’s New Student First Gen 0.789 0.944 19 18 0.155
100’s New Transfer Student First Gen 1.000 1.000 7 4 0.000
200’s New Transfer Student First Gen 0.667 1.000 3 5 0.333
100’s New Student Not First 0.867 0.815 30 27 -0.052
200’s New Student Not First 0.828 0.838 93 68 0.010
100’s New Transfer Student Not First 0.800 1.000 5 7 0.200
200’s New Transfer Student Not First 1.000 1.000 9 6 0.000
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 1.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.667 1 3 -0.333
Question 9 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.800 1.000 5 8 0.200
200’s New Student First Gen 0.947 1.000 19 18 0.053
100’s New Transfer Student First Gen 1.000 0.750 7 4 -0.250
200’s New Transfer Student First Gen 1.000 0.833 3 6 -0.167
100’s New Student Not First 1.000 0.964 31 28 -0.036
200’s New Student Not First 0.925 0.956 93 68 0.031
100’s New Transfer Student Not First 1.000 1.000 5 7 0.000
200’s New Transfer Student Not First 1.000 0.833 9 6 -0.167
100’s New Student Unknown 0.667 0.500 3 2 -0.167
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 0.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 1.000 1 3 0.000
Question 10 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.400 0.778 5 9 0.378
200’s New Student First Gen 0.947 0.833 19 18 -0.114
100’s New Transfer Student First Gen 0.857 1.000 7 4 0.143
200’s New Transfer Student First Gen 1.000 0.667 3 6 -0.333
100’s New Student Not First 0.938 0.963 32 27 0.025
200’s New Student Not First 0.882 0.942 93 69 0.060
100’s New Transfer Student Not First 1.000 0.571 5 7 -0.429
200’s New Transfer Student Not First 0.889 0.833 9 6 -0.056
100’s New Student Unknown 1.000 0.667 3 3 -0.333
200’s New Student Unknown 1.000 NA 2 NA NA
100’s New Transfer Student Unknown NA 1.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 1.000 1 3 0.000
Question 15 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.600 0.750 5 8 0.150
200’s New Student First Gen 0.895 1.000 19 18 0.105
100’s New Transfer Student First Gen 0.857 0.750 7 4 -0.107
200’s New Transfer Student First Gen 0.667 1.000 3 5 0.333
100’s New Student Not First 0.774 0.893 31 28 0.119
200’s New Student Not First 0.804 0.896 92 67 0.091
100’s New Transfer Student Not First 0.800 0.857 5 7 0.057
200’s New Transfer Student Not First 0.889 0.833 9 6 -0.056
100’s New Student Unknown 1.000 1.000 3 2 0.000
200’s New Student Unknown 0.500 NA 2 NA NA
100’s New Transfer Student Unknown NA 1.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.667 1 3 -0.333
Question 16 Summary
MathLevel Transfer First Gen Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s New Student First Gen 0.600 0.750 5 8 0.150
200’s New Student First Gen 0.895 0.944 19 18 0.050
100’s New Transfer Student First Gen 0.857 0.750 7 4 -0.107
200’s New Transfer Student First Gen 0.667 1.000 3 5 0.333
100’s New Student Not First 0.742 0.893 31 28 0.151
200’s New Student Not First 0.837 0.925 92 67 0.088
100’s New Transfer Student Not First 0.800 0.857 5 7 0.057
200’s New Transfer Student Not First 1.000 0.833 8 6 -0.167
100’s New Student Unknown 1.000 1.000 3 2 0.000
200’s New Student Unknown 0.500 NA 2 NA NA
100’s New Transfer Student Unknown NA 1.000 NA 1 NA
200’s New Transfer Student Unknown 1.000 0.667 1 3 -0.333

10.2 By Cohort, Math Level, and Rural Status

out3 <- lapply(X = bin_names, FUN = q_means, data = missed_data, preds = preds[c(1, 2, 8)])

out3e <- lapply(X = out3, FUN = nice_it)

# Print one summary table per question
for (x in seq_along(out3e)) {
  print(
    kable(out3e[[x]],
          digits = 3,
          col.names = c("MathLevel", "Rural", "Avg2019", "Avg2020", "n2019", "n2020", "AvgDiff"),
          caption = paste0("Question ", qs[x], " Summary")) %>%
      kable_styling(bootstrap_options = "striped", full_width = TRUE)
  )
}
Question 3 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 1.000 1.000 9 6 0.000
200’s OutofState_International 1.000 1.000 14 12 0.000
100’s Tier1&2 0.889 0.955 27 22 0.066
200’s Tier1&2 0.978 0.900 46 40 -0.078
100’s Tier3 1.000 0.960 15 25 -0.040
200’s Tier3 0.971 0.959 68 49 -0.011
Question 4 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.889 1.000 9 6 0.111
200’s OutofState_International 1.000 1.000 14 12 0.000
100’s Tier1&2 0.923 0.955 26 22 0.031
200’s Tier1&2 1.000 0.925 46 40 -0.075
100’s Tier3 1.000 0.920 15 25 -0.080
200’s Tier3 0.971 0.939 68 49 -0.032
Question 5 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.778 1.000 9 6 0.222
200’s OutofState_International 0.929 0.917 14 12 -0.012
100’s Tier1&2 0.654 0.850 26 20 0.196
200’s Tier1&2 0.848 0.868 46 38 0.021
100’s Tier3 0.812 0.826 16 23 0.014
200’s Tier3 0.853 0.848 68 46 -0.005
Question 6 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.667 1.000 9 6 0.333
200’s OutofState_International 0.929 0.917 14 12 -0.012
100’s Tier1&2 0.692 0.842 26 19 0.150
200’s Tier1&2 0.826 0.842 46 38 0.016
100’s Tier3 0.812 0.783 16 23 -0.030
200’s Tier3 0.868 0.783 68 46 -0.085
Question 7 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.667 1.000 9 6 0.333
200’s OutofState_International 0.929 0.917 14 12 -0.012
100’s Tier1&2 0.692 0.800 26 20 0.108
200’s Tier1&2 0.826 0.842 46 38 0.016
100’s Tier3 0.812 0.783 16 23 -0.030
200’s Tier3 0.853 0.804 68 46 -0.049
Question 8 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 1.000 1.000 9 6 0.000
200’s OutofState_International 0.857 0.750 14 12 -0.107
100’s Tier1&2 0.920 0.857 25 21 -0.063
200’s Tier1&2 0.826 0.868 46 38 0.042
100’s Tier3 0.750 0.826 16 23 0.076
200’s Tier3 0.836 0.896 67 48 0.060
Question 9 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.889 1.000 9 6 0.111
200’s OutofState_International 0.929 1.000 14 12 0.071
100’s Tier1&2 0.962 0.900 26 20 -0.062
200’s Tier1&2 0.957 0.974 46 39 0.018
100’s Tier3 1.000 0.917 16 24 -0.083
200’s Tier3 0.925 0.917 67 48 -0.009
Question 10 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.889 1.000 9 6 0.111
200’s OutofState_International 0.857 0.833 14 12 -0.024
100’s Tier1&2 0.889 0.864 27 22 -0.025
200’s Tier1&2 0.891 0.923 46 39 0.032
100’s Tier3 0.875 0.826 16 23 -0.049
200’s Tier3 0.910 0.898 67 49 -0.012
Question 15 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.889 0.833 9 6 -0.056
200’s OutofState_International 0.714 0.833 14 12 0.119
100’s Tier1&2 0.808 0.950 26 20 0.142
200’s Tier1&2 0.870 0.947 46 38 0.078
100’s Tier3 0.688 0.792 16 24 0.104
200’s Tier3 0.803 0.894 66 47 0.091
Question 16 Summary
MathLevel Rural Avg2019 Avg2020 n2019 n2020 AvgDiff
100’s OutofState_International 0.889 0.833 9 6 -0.056
200’s OutofState_International 0.786 0.917 14 12 0.131
100’s Tier1&2 0.769 0.950 26 20 0.181
200’s Tier1&2 0.889 0.921 45 38 0.032
100’s Tier3 0.688 0.792 16 24 0.104
200’s Tier3 0.833 0.915 66 47 0.082
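
In the tables above, AvgDiff equals Avg2020 minus Avg2019 (for example, 0.915 - 0.833 = 0.082 in the last row). The helpers q_means and nice_it are defined elsewhere in the report, so the following is only a minimal base-R sketch of how a table with this layout could be built. The toy data, the column name score_bin, and the cohort labels "Spring2019" and "Spring2020" are assumptions made for illustration.

```r
# Hypothetical toy data: one row per student; score_bin is 1 if the
# question was answered correctly and 0 otherwise.
toy <- data.frame(
  Cohort    = c("Spring2019", "Spring2019", "Spring2019", "Spring2019",
                "Spring2020", "Spring2020", "Spring2020", "Spring2020"),
  MTHLevel  = c("100's", "100's", "200's", "200's",
                "100's", "100's", "200's", "200's"),
  score_bin = c(1, 0, 1, 1, 1, 1, 1, 0)
)

# Proportion correct and group size for each MTHLevel x Cohort cell
avg <- tapply(toy$score_bin, list(toy$MTHLevel, toy$Cohort), mean)
n   <- tapply(toy$score_bin, list(toy$MTHLevel, toy$Cohort), length)

# One row per math level, with the 2020 minus 2019 difference
summary_tab <- data.frame(
  MathLevel = rownames(avg),
  Avg2019   = avg[, "Spring2019"],
  Avg2020   = avg[, "Spring2020"],
  n2019     = n[, "Spring2019"],
  n2020     = n[, "Spring2020"],
  AvgDiff   = avg[, "Spring2020"] - avg[, "Spring2019"],
  row.names = NULL
)
print(summary_tab)
```

Because each average is a proportion of 0/1 scores, cells with small n (several rows above have n below 10) can swing widely from a single student, so the corresponding AvgDiff values should be read with caution.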

11 Logistic Regression with Variable Selection

On the full data, I’ll run automated variable selection (stepwise AIC via stepAIC) to see which variables emerge as useful. Cohort is the lower bound of the search, so it is always retained.

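# Fit a binomial GLM for one response, then run stepwise AIC selection on it.
# stepAIC() comes from the MASS package.
full_fit <- function(data, response, preds){
  # Starting model: response ~ (pred1*pred2*...)^2
  fit <- glm(as.formula(paste0(response, "~", "(", paste(preds, collapse = "*"), ")^2")), data = data, family = binomial)
  # Cohort is the lower bound of the search, so it is never dropped
  mod <- stepAIC(fit, scope = list(lower =~Cohort, upper = fit), trace = FALSE)
  summary(mod)
}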


# Remove rows with missing FirstGenStatus or RuralStatus before fitting
full_data <- full_data %>% filter(!is.na(FirstGenStatus), !is.na(RuralStatus))
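
The starting formula in full_fit has the form (a*b*c)^2. Since a*b*c already expands to all main effects and interactions, the ^2 does not remove the three-way term. The short check below uses placeholder predictor names a, b, and c (not variables from this report) to list the terms each formula form implies.

```r
# Placeholder names a, b, c stand in for the three predictors.
# terms() expands a formula symbolically, so no data are needed.
star_terms <- attr(terms(y ~ (a * b * c)^2), "term.labels")
plus_terms <- attr(terms(y ~ (a + b + c)^2), "term.labels")

print(star_terms)  # includes the three-way interaction a:b:c
print(plus_terms)  # main effects and two-way interactions only

# Terms present in the first form but not in the second
print(setdiff(star_terms, plus_terms))
```

If only main effects and pairwise interactions were wanted as the upper scope, collapsing the predictors with "+" instead of "*" would give that form. With "*", stepAIC starts from the full three-way model.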

11.1 Cohort, Math Level, and Transfer

lapply(X = bin_names, FUN = full_fit, data = full_data, preds = preds[c(1,2, 7)])
## [[1]]
## 
## Call:
## glm(formula = HW6_Q_3_score_bin ~ Cohort + MTHLevel + ORIG_ENROLL_STATUS + 
##     MTHLevel:ORIG_ENROLL_STATUS, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.8553   0.1850   0.2164   0.2266   0.3903  
## 
## Coefficients:
##                                                       Estimate Std. Error
## (Intercept)                                             3.6496     0.5191
## CohortSpring2020                                       -0.3162     0.4826
## MTHLevel200's                                           0.4097     0.5660
## ORIG_ENROLL_STATUSNew Transfer Student                 15.0798  1042.7900
## MTHLevel200's:ORIG_ENROLL_STATUSNew Transfer Student  -16.2864  1042.7902
##                                                      z value Pr(>|z|)    
## (Intercept)                                            7.030 2.06e-12 ***
## CohortSpring2020                                      -0.655    0.512    
## MTHLevel200's                                          0.724    0.469    
## ORIG_ENROLL_STATUSNew Transfer Student                 0.014    0.988    
## MTHLevel200's:ORIG_ENROLL_STATUSNew Transfer Student  -0.016    0.988    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 168.40  on 720  degrees of freedom
## Residual deviance: 162.67  on 716  degrees of freedom
##   (38 observations deleted due to missingness)
## AIC: 172.67
## 
## Number of Fisher Scoring iterations: 17
## 
## 
## [[2]]
## 
## Call:
## glm(formula = HW6_Q_4_score_bin ~ Cohort + MTHLevel + ORIG_ENROLL_STATUS + 
##     Cohort:ORIG_ENROLL_STATUS + MTHLevel:ORIG_ENROLL_STATUS, 
##     family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.9791   0.1542   0.1736   0.2514   0.5541  
## 
## Coefficients:
##                                                         Estimate Std. Error
## (Intercept)                                               3.2016     0.4775
## CohortSpring2020                                          0.2375     0.5932
## MTHLevel200's                                             0.9866     0.5853
## ORIG_ENROLL_STATUSNew Transfer Student                    1.0912     1.2878
## CohortSpring2020:ORIG_ENROLL_STATUSNew Transfer Student  -2.2297     1.2505
## MTHLevel200's:ORIG_ENROLL_STATUSNew Transfer Student     -1.4910     1.0561
##                                                         z value Pr(>|z|)    
## (Intercept)                                               6.705 2.02e-11 ***
## CohortSpring2020                                          0.400   0.6888    
## MTHLevel200's                                             1.686   0.0919 .  
## ORIG_ENROLL_STATUSNew Transfer Student                    0.847   0.3968    
## CohortSpring2020:ORIG_ENROLL_STATUSNew Transfer Student  -1.783   0.0746 .  
## MTHLevel200's:ORIG_ENROLL_STATUSNew Transfer Student     -1.412   0.1580    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 175.56  on 718  degrees of freedom
## Residual deviance: 161.36  on 713  degrees of freedom
##   (40 observations deleted due to missingness)
## AIC: 173.36
## 
## Number of Fisher Scoring iterations: 7
## 
## 
## [[3]]
## 
## Call:
## glm(formula = HW6_Q_5_score_bin ~ Cohort + MTHLevel, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3367   0.3673   0.4696   0.4696   0.6069  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.5985     0.2211   7.230 4.84e-13 ***
## CohortSpring2020   0.5131     0.2588   1.983   0.0474 *  
## MTHLevel200's      0.5510     0.2571   2.144   0.0321 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 466.00  on 709  degrees of freedom
## Residual deviance: 457.15  on 707  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 463.15
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[4]]
## 
## Call:
## glm(formula = HW6_Q_6_score_bin ~ Cohort + MTHLevel, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2485   0.4079   0.4912   0.4912   0.6176  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.5600     0.2158   7.229 4.87e-13 ***
## CohortSpring2020   0.3906     0.2438   1.602   0.1091    
## MTHLevel200's      0.4942     0.2469   2.001   0.0454 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 499.71  on 708  degrees of freedom
## Residual deviance: 492.92  on 706  degrees of freedom
##   (50 observations deleted due to missingness)
## AIC: 498.92
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[5]]
## 
## Call:
## glm(formula = HW6_Q_7_score_bin ~ Cohort + MTHLevel, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2322   0.4158   0.4935   0.4935   0.6131  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.5760     0.2161   7.294 3.02e-13 ***
## CohortSpring2020   0.3608     0.2416   1.493   0.1354    
## MTHLevel200's      0.4680     0.2460   1.903   0.0571 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 504.06  on 709  degrees of freedom
## Residual deviance: 498.04  on 707  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 504.04
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[6]]
## 
## Call:
## glm(formula = HW6_Q_15_score_bin ~ Cohort + ORIG_ENROLL_STATUS, 
##     family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3187   0.5020   0.5020   0.5208   0.5208  
## 
## Coefficients:
##                                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                             1.92953    0.16522  11.679   <2e-16 ***
## CohortSpring2020                        0.07834    0.23579   0.332    0.740    
## ORIG_ENROLL_STATUSNew Transfer Student  0.60981    0.41103   1.484    0.138    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 508.14  on 709  degrees of freedom
## Residual deviance: 505.50  on 707  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 511.5
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[7]]
## 
## Call:
## glm(formula = HW6_Q_16_score_bin ~ Cohort + ORIG_ENROLL_STATUS + 
##     Cohort:ORIG_ENROLL_STATUS, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.8111   0.2635   0.2635   0.3315   0.4172  
## 
## Coefficients:
##                                                         Estimate Std. Error
## (Intercept)                                               2.8739     0.2493
## CohortSpring2020                                          0.4690     0.4070
## ORIG_ENROLL_STATUSNew Transfer Student                    1.0579     1.0400
## CohortSpring2020:ORIG_ENROLL_STATUSNew Transfer Student  -2.0029     1.2074
##                                                         z value Pr(>|z|)    
## (Intercept)                                              11.528   <2e-16 ***
## CohortSpring2020                                          1.152   0.2493    
## ORIG_ENROLL_STATUSNew Transfer Student                    1.017   0.3090    
## CohortSpring2020:ORIG_ENROLL_STATUSNew Transfer Student  -1.659   0.0972 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 261.00  on 710  degrees of freedom
## Residual deviance: 257.28  on 707  degrees of freedom
##   (48 observations deleted due to missingness)
## AIC: 265.28
## 
## Number of Fisher Scoring iterations: 6
## 
## 
## [[8]]
## 
## Call:
## glm(formula = HW6_Q_17_score_bin ~ Cohort + MTHLevel + ORIG_ENROLL_STATUS + 
##     Cohort:MTHLevel, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.5161   0.2936   0.3983   0.4066   0.6972  
## 
## Coefficients:
##                                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                             2.53751    0.34561   7.342  2.1e-13 ***
## CohortSpring2020                       -0.42858    0.45641  -0.939   0.3477    
## MTHLevel200's                          -0.08611    0.39948  -0.216   0.8293    
## ORIG_ENROLL_STATUSNew Transfer Student -0.81826    0.32633  -2.507   0.0122 *  
## CohortSpring2020:MTHLevel200's          1.09953    0.58694   1.873   0.0610 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 397.50  on 713  degrees of freedom
## Residual deviance: 384.64  on 709  degrees of freedom
##   (45 observations deleted due to missingness)
## AIC: 394.64
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[9]]
## 
## Call:
## glm(formula = HW6_Q_18_score_bin ~ Cohort, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2835   0.3914   0.3914   0.5338   0.5338  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.8765     0.1534  12.232   <2e-16 ***
## CohortSpring2020   0.6540     0.2583   2.532   0.0113 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 474.22  on 707  degrees of freedom
## Residual deviance: 467.51  on 706  degrees of freedom
##   (51 observations deleted due to missingness)
## AIC: 471.51
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[10]]
## 
## Call:
## glm(formula = HW6_Q_19_score_bin ~ Cohort, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2495   0.4074   0.4074   0.5047   0.5047  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.9966     0.1607  12.427   <2e-16 ***
## CohortSpring2020   0.4506     0.2570   1.753   0.0796 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 460.98  on 706  degrees of freedom
## Residual deviance: 457.84  on 705  degrees of freedom
##   (52 observations deleted due to missingness)
## AIC: 461.84
## 
## Number of Fisher Scoring iterations: 5
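
The estimates above are on the log-odds scale. As a rough reading aid, the sketch below converts the Question 5 coefficients ([[3]]) into odds ratios and fitted probabilities. It assumes the reference levels are the 2019 cohort and the 100's math level, as the coefficient names CohortSpring2020 and MTHLevel200's suggest. The vector names are made up for illustration.

```r
# Coefficients copied from the Question 5 fit ([[3]]) above
b <- c(intercept = 1.5985, cohort2020 = 0.5131, mth200 = 0.5510)

# Odds ratios: multiplicative change in the odds of a correct answer
odds_ratios <- exp(b[c("cohort2020", "mth200")])

# Fitted probabilities for the four Cohort x MTHLevel cells;
# plogis() is the inverse logit, 1 / (1 + exp(-x))
p <- c(
  "2019, 100's" = plogis(b[["intercept"]]),
  "2020, 100's" = plogis(b[["intercept"]] + b[["cohort2020"]]),
  "2019, 200's" = plogis(b[["intercept"]] + b[["mth200"]]),
  "2020, 200's" = plogis(sum(b))
)

print(round(odds_ratios, 2))
print(round(p, 3))
```

Because this model has no interaction term, each odds ratio applies equally at both levels of the other predictor. For fits that retain an interaction, the interaction coefficient has to be added before exponentiating.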

11.2 Cohort, Math Level, and Rural Status

lapply(X = bin_names, FUN = full_fit, data = full_data, preds = preds[c(1,2, 8)])
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## [[1]]
## 
## Call:
## glm(formula = HW6_Q_3_score_bin ~ Cohort + RuralStatus, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.8435   0.1882   0.2196   0.2667   0.3108  
## 
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)
## (Intercept)          19.7119  1138.1521   0.017    0.986
## CohortSpring2020     -0.3123     0.4830  -0.647    0.518
## RuralStatusTier1&2  -16.3937  1138.1521  -0.014    0.989
## RuralStatusTier3    -15.6869  1138.1521  -0.014    0.989
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 168.40  on 720  degrees of freedom
## Residual deviance: 161.12  on 717  degrees of freedom
##   (38 observations deleted due to missingness)
## AIC: 169.12
## 
## Number of Fisher Scoring iterations: 18
## 
## 
## [[2]]
## 
## Call:
## glm(formula = HW6_Q_4_score_bin ~ (Cohort * MTHLevel * RuralStatus)^2, 
##     family = binomial, data = data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.68735   0.00005   0.23571   0.27218   0.47165  
## 
## Coefficients:
##                                                    Estimate Std. Error z value
## (Intercept)                                          2.1401     0.7475   2.863
## CohortSpring2020                                    18.4260  4917.5199   0.004
## MTHLevel200's                                       18.4260  3184.4685   0.006
## RuralStatusTier1&2                                   1.1371     1.0381   1.095
## RuralStatusTier3                                    18.4260  2839.1315   0.006
## CohortSpring2020:MTHLevel200's                     -18.4260  6812.7705  -0.003
## CohortSpring2020:RuralStatusTier1&2                -18.3359  4917.5201  -0.004
## CohortSpring2020:RuralStatusTier3                  -36.1989  5678.2629  -0.006
## MTHLevel200's:RuralStatusTier1&2                    -1.1371  3757.6798   0.000
## MTHLevel200's:RuralStatusTier3                     -35.4086  4266.3225  -0.008
## CohortSpring2020:MTHLevel200's:RuralStatusTier1&2    1.0279  7098.8140   0.000
## CohortSpring2020:MTHLevel200's:RuralStatusTier3     36.1849  7380.6850   0.005
##                                                   Pr(>|z|)   
## (Intercept)                                         0.0042 **
## CohortSpring2020                                    0.9970   
## MTHLevel200's                                       0.9954   
## RuralStatusTier1&2                                  0.2734   
## RuralStatusTier3                                    0.9948   
## CohortSpring2020:MTHLevel200's                      0.9978   
## CohortSpring2020:RuralStatusTier1&2                 0.9970   
## CohortSpring2020:RuralStatusTier3                   0.9949   
## MTHLevel200's:RuralStatusTier1&2                    0.9998   
## MTHLevel200's:RuralStatusTier3                      0.9934   
## CohortSpring2020:MTHLevel200's:RuralStatusTier1&2   0.9999   
## CohortSpring2020:MTHLevel200's:RuralStatusTier3     0.9961   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 175.56  on 718  degrees of freedom
## Residual deviance: 160.79  on 707  degrees of freedom
##   (40 observations deleted due to missingness)
## AIC: 184.79
## 
## Number of Fisher Scoring iterations: 19
## 
## 
## [[3]]
## 
## Call:
## glm(formula = HW6_Q_5_score_bin ~ Cohort + MTHLevel, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3367   0.3673   0.4696   0.4696   0.6069  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.5985     0.2211   7.230 4.84e-13 ***
## CohortSpring2020   0.5131     0.2588   1.983   0.0474 *  
## MTHLevel200's      0.5510     0.2571   2.144   0.0321 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 466.00  on 709  degrees of freedom
## Residual deviance: 457.15  on 707  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 463.15
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[4]]
## 
## Call:
## glm(formula = HW6_Q_6_score_bin ~ Cohort + MTHLevel, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2485   0.4079   0.4912   0.4912   0.6176  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.5600     0.2158   7.229 4.87e-13 ***
## CohortSpring2020   0.3906     0.2438   1.602   0.1091    
## MTHLevel200's      0.4942     0.2469   2.001   0.0454 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 499.71  on 708  degrees of freedom
## Residual deviance: 492.92  on 706  degrees of freedom
##   (50 observations deleted due to missingness)
## AIC: 498.92
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[5]]
## 
## Call:
## glm(formula = HW6_Q_7_score_bin ~ Cohort + MTHLevel, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2322   0.4158   0.4935   0.4935   0.6131  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.5760     0.2161   7.294 3.02e-13 ***
## CohortSpring2020   0.3608     0.2416   1.493   0.1354    
## MTHLevel200's      0.4680     0.2460   1.903   0.0571 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 504.06  on 709  degrees of freedom
## Residual deviance: 498.04  on 707  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 504.04
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[6]]
## 
## Call:
## glm(formula = HW6_Q_15_score_bin ~ Cohort, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0949   0.4861   0.4861   0.5039   0.5039  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)       1.99964    0.16064  12.448   <2e-16 ***
## CohortSpring2020  0.07651    0.23541   0.325    0.745    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 508.14  on 709  degrees of freedom
## Residual deviance: 508.04  on 708  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 512.04
## 
## Number of Fisher Scoring iterations: 4
## 
## 
## [[7]]
## 
## Call:
## glm(formula = HW6_Q_16_score_bin ~ Cohort + RuralStatus + Cohort:RuralStatus, 
##     family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.6501   0.2462   0.3150   0.3253   0.4590  
## 
## Coefficients:
##                                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                            2.1972     0.4714   4.661 3.15e-06 ***
## CohortSpring2020                      16.3688  1044.4582   0.016   0.9875    
## RuralStatusTier1&2                     1.2840     0.6928   1.853   0.0638 .  
## RuralStatusTier3                       0.7817     0.5822   1.343   0.1794    
## CohortSpring2020:RuralStatusTier1&2  -16.5920  1044.4584  -0.016   0.9873    
## CohortSpring2020:RuralStatusTier3    -16.4354  1044.4583  -0.016   0.9874    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 261.00  on 710  degrees of freedom
## Residual deviance: 253.55  on 705  degrees of freedom
##   (48 observations deleted due to missingness)
## AIC: 265.55
## 
## Number of Fisher Scoring iterations: 17
## 
## 
## [[8]]
## 
## Call:
## glm(formula = HW6_Q_17_score_bin ~ Cohort + MTHLevel + Cohort:MTHLevel, 
##     family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4660   0.3130   0.4265   0.4339   0.5287  
## 
## Coefficients:
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                     2.35138    0.33094   7.105  1.2e-12 ***
## CohortSpring2020               -0.45426    0.45316  -1.002   0.3161    
## MTHLevel200's                  -0.03583    0.39660  -0.090   0.9280    
## CohortSpring2020:MTHLevel200's  1.13027    0.58364   1.937   0.0528 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 397.5  on 713  degrees of freedom
## Residual deviance: 390.3  on 710  degrees of freedom
##   (45 observations deleted due to missingness)
## AIC: 398.3
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[9]]
## 
## Call:
## glm(formula = HW6_Q_18_score_bin ~ Cohort, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2835   0.3914   0.3914   0.5338   0.5338  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.8765     0.1534  12.232   <2e-16 ***
## CohortSpring2020   0.6540     0.2583   2.532   0.0113 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 474.22  on 707  degrees of freedom
## Residual deviance: 467.51  on 706  degrees of freedom
##   (51 observations deleted due to missingness)
## AIC: 471.51
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[10]]
## 
## Call:
## glm(formula = HW6_Q_19_score_bin ~ Cohort, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2495   0.4074   0.4074   0.5047   0.5047  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        1.9966     0.1607  12.427   <2e-16 ***
## CohortSpring2020   0.4506     0.2570   1.753   0.0796 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 460.98  on 706  degrees of freedom
## Residual deviance: 457.84  on 705  degrees of freedom
##   (52 observations deleted due to missingness)
## AIC: 461.84
## 
## Number of Fisher Scoring iterations: 5
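
The warning "fitted probabilities numerically 0 or 1 occurred", together with estimates of roughly 15 to 36 in absolute value and standard errors in the thousands (as in the Question 3 and Question 4 fits above), usually indicates complete or quasi-complete separation. This happens when some predictor cell contains only correct (or only incorrect) answers, so the maximum-likelihood estimate drifts toward infinity. The sketch below uses made-up toy data to show two simple checks. The variable names and the standard-error threshold of 10 are illustrative assumptions, not part of the original analysis.

```r
# Hypothetical toy data: every student in group "B" answered correctly,
# so the log-odds for that group is unbounded.
toy <- data.frame(
  y = c(1, 0, 1, 1, 0, 1, 1, 1, 1, 1),
  g = factor(rep(c("A", "B"), each = 5))
)

# glm() still returns a fit, but warns about fitted probabilities of 0 or 1
fit <- suppressWarnings(glm(y ~ g, family = binomial, data = toy))

# Check 1: flag coefficients whose standard error is implausibly large
se <- coef(summary(fit))[, "Std. Error"]
flagged <- names(se)[se > 10]  # arbitrary illustrative threshold
print(flagged)

# Check 2: look for cells where the observed proportion is exactly 0 or 1
cell_means <- tapply(toy$y, toy$g, mean)
all_same <- names(cell_means)[cell_means %in% c(0, 1)]
print(all_same)
```

For flagged terms, the Wald z-values and p-values are not reliable. Options to consider include collapsing sparse categories or using a penalized-likelihood fit such as Firth logistic regression.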

11.3 Cohort, Math Level, and First Gen

lapply(X = bin_names, FUN = full_fit, data = full_data, preds = preds[c(1,2, 9)])
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## [[1]]
## 
## Call:
## glm(formula = HW6_Q_3_score_bin ~ Cohort, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7711   0.2085   0.2085   0.2411   0.2411  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        3.8177     0.3574  10.682   <2e-16 ***
## CohortSpring2020  -0.2943     0.4803  -0.613     0.54    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 168.40  on 720  degrees of freedom
## Residual deviance: 168.02  on 719  degrees of freedom
##   (38 observations deleted due to missingness)
## AIC: 172.02
## 
## Number of Fisher Scoring iterations: 6
## 
## 
## [[2]]
## 
## Call:
## glm(formula = HW6_Q_4_score_bin ~ Cohort, family = binomial, 
##     data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7701   0.2088   0.2088   0.2535   0.2535  
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        3.8150     0.3574  10.674   <2e-16 ***
## CohortSpring2020  -0.3928     0.4708  -0.834    0.404    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 175.56  on 718  degrees of freedom
## Residual deviance: 174.86  on 717  degrees of freedom
##   (40 observations deleted due to missingness)
## AIC: 178.86
## 
## Number of Fisher Scoring iterations: 6
## 
## 
## [[3]]
## 
## Call:
## glm(formula = HW6_Q_5_score_bin ~ Cohort + MTHLevel + FirstGenStatus + 
##     MTHLevel:FirstGenStatus, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3345   0.3682   0.4160   0.4727   0.8658  
## 
## Coefficients:
##                                        Estimate Std. Error z value Pr(>|z|)  
## (Intercept)                             0.82245    0.40172   2.047   0.0406 *
## CohortSpring2020                        0.52192    0.26124   1.998   0.0457 *
## MTHLevel200's                           1.26757    0.53966   2.349   0.0188 *
## FirstGenStatusNot First                 1.05914    0.46454   2.280   0.0226 *
## FirstGenStatusUnknown                  -0.03425    0.91170  -0.038   0.9700  
## MTHLevel200's:FirstGenStatusNot First  -1.01399    0.62066  -1.634   0.1023  
## MTHLevel200's:FirstGenStatusUnknown    14.24292  795.33269   0.018   0.9857  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 466.0  on 709  degrees of freedom
## Residual deviance: 449.9  on 703  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 463.9
## 
## Number of Fisher Scoring iterations: 15
## 
## 
## [[4]]
## 
## Call:
## glm(formula = HW6_Q_6_score_bin ~ Cohort + MTHLevel + FirstGenStatus + 
##     MTHLevel:FirstGenStatus, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2419   0.4111   0.4586   0.4950   0.8489  
## 
## Coefficients:
##                                        Estimate Std. Error z value Pr(>|z|)  
## (Intercept)                             0.83524    0.40159   2.080   0.0375 *
## CohortSpring2020                        0.39087    0.24587   1.590   0.1119  
## MTHLevel200's                           1.18496    0.52760   2.246   0.0247 *
## FirstGenStatusNot First                 0.97330    0.45943   2.119   0.0341 *
## FirstGenStatusUnknown                   0.02796    0.91014   0.031   0.9755  
## MTHLevel200's:FirstGenStatusNot First  -0.95581    0.60253  -1.586   0.1127  
## MTHLevel200's:FirstGenStatusUnknown    14.31313  797.19440   0.018   0.9857  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 499.71  on 708  degrees of freedom
## Residual deviance: 486.29  on 702  degrees of freedom
##   (50 observations deleted due to missingness)
## AIC: 500.29
## 
## Number of Fisher Scoring iterations: 15
## 
## 
## [[5]]
## 
## Call:
## glm(formula = HW6_Q_7_score_bin ~ Cohort + MTHLevel + FirstGenStatus + 
##     MTHLevel:FirstGenStatus, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2363   0.4138   0.4362   0.4929   0.9384  
## 
## Coefficients:
##                                       Estimate Std. Error z value Pr(>|z|)   
## (Intercept)                             0.5920     0.3794   1.560  0.11868   
## CohortSpring2020                        0.3680     0.2446   1.505  0.13244   
## MTHLevel200's                           1.4392     0.5094   2.825  0.00472 **
## FirstGenStatusNot First                 1.3447     0.4449   3.022  0.00251 **
## FirstGenStatusUnknown                   0.2845     0.8992   0.316  0.75170   
## MTHLevel200's:FirstGenStatusNot First  -1.3290     0.5913  -2.248  0.02460 * 
## MTHLevel200's:FirstGenStatusUnknown    -0.4276     1.4356  -0.298  0.76581   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 504.06  on 709  degrees of freedom
## Residual deviance: 488.88  on 703  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 502.88
## 
## Number of Fisher Scoring iterations: 5
## 
## 
## [[6]]
## 
## Call:
## glm(formula = HW6_Q_15_score_bin ~ Cohort + FirstGenStatus + 
##     Cohort:FirstGenStatus, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3614   0.4972   0.4972   0.5073   0.6335  
## 
## Coefficients:
##                                          Estimate Std. Error z value Pr(>|z|)
## (Intercept)                                1.7540     0.3610   4.858 1.18e-06
## CohortSpring2020                           0.9706     0.6299   1.541    0.123
## FirstGenStatusNot First                    0.2741     0.4033   0.680    0.497
## FirstGenStatusUnknown                     13.8120   550.0887   0.025    0.980
## CohortSpring2020:FirstGenStatusNot First  -1.0134     0.6816  -1.487    0.137
## CohortSpring2020:FirstGenStatusUnknown   -15.0326   550.0895  -0.027    0.978
##                                             
## (Intercept)                              ***
## CohortSpring2020                            
## FirstGenStatusNot First                     
## FirstGenStatusUnknown                       
## CohortSpring2020:FirstGenStatusNot First    
## CohortSpring2020:FirstGenStatusUnknown      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 508.14  on 709  degrees of freedom
## Residual deviance: 503.18  on 704  degrees of freedom
##   (49 observations deleted due to missingness)
## AIC: 515.18
## 
## Number of Fisher Scoring iterations: 14
## 
## 
## [[7]]
## 
## Call:
## glm(formula = HW6_Q_16_score_bin ~ Cohort + MTHLevel + FirstGenStatus + 
##     MTHLevel:FirstGenStatus, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.8774   0.2029   0.2979   0.3366   1.0304  
## 
## Coefficients:
##                                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                             2.6761     0.7414   3.609 0.000307 ***
## CohortSpring2020                        0.2503     0.3737   0.670 0.502965    
## MTHLevel200's                           0.2934     0.8922   0.329 0.742290    
## FirstGenStatusNot First                 1.1972     0.9335   1.283 0.199666    
## FirstGenStatusUnknown                  -2.3200     1.0372  -2.237 0.025298 *  
## MTHLevel200's:FirstGenStatusNot First  -1.3244     1.0895  -1.216 0.224128    
## MTHLevel200's:FirstGenStatusUnknown    15.7831   798.9864   0.020 0.984240    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 261.00  on 710  degrees of freedom
## Residual deviance: 247.48  on 704  degrees of freedom
##   (48 observations deleted due to missingness)
## AIC: 261.48
## 
## Number of Fisher Scoring iterations: 15
## 
## 
## [[8]]
## 
## Call:
## glm(formula = HW6_Q_17_score_bin ~ (Cohort * MTHLevel * FirstGenStatus)^2, 
##     family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.5776   0.2711   0.3898   0.4463   0.9005  
## 
## Coefficients:
##                                                         Estimate Std. Error
## (Intercept)                                               1.0986     0.5164
## CohortSpring2020                                          0.7732     0.9185
## MTHLevel200's                                             1.4404     0.7914
## FirstGenStatusNot First                                   1.7579     0.6915
## FirstGenStatusUnknown                                    15.4675  1385.3779
## CohortSpring2020:MTHLevel200's                           -1.0715     1.1935
## CohortSpring2020:FirstGenStatusNot First                 -1.5660     1.0936
## CohortSpring2020:FirstGenStatusUnknown                  -16.6461  1385.3784
## MTHLevel200's:FirstGenStatusNot First                    -2.0403     0.9450
## MTHLevel200's:FirstGenStatusUnknown                      -1.4404  1832.6827
## CohortSpring2020:MTHLevel200's:FirstGenStatusNot First    2.8929     1.4072
## CohortSpring2020:MTHLevel200's:FirstGenStatusUnknown     16.9444  2123.7444
##                                                        z value Pr(>|z|)  
## (Intercept)                                              2.127   0.0334 *
## CohortSpring2020                                         0.842   0.3999  
## MTHLevel200's                                            1.820   0.0688 .
## FirstGenStatusNot First                                  2.542   0.0110 *
## FirstGenStatusUnknown                                    0.011   0.9911  
## CohortSpring2020:MTHLevel200's                          -0.898   0.3693  
## CohortSpring2020:FirstGenStatusNot First                -1.432   0.1522  
## CohortSpring2020:FirstGenStatusUnknown                  -0.012   0.9904  
## MTHLevel200's:FirstGenStatusNot First                   -2.159   0.0308 *
## MTHLevel200's:FirstGenStatusUnknown                     -0.001   0.9994  
## CohortSpring2020:MTHLevel200's:FirstGenStatusNot First   2.056   0.0398 *
## CohortSpring2020:MTHLevel200's:FirstGenStatusUnknown     0.008   0.9936  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 397.50  on 713  degrees of freedom
## Residual deviance: 377.68  on 702  degrees of freedom
##   (45 observations deleted due to missingness)
## AIC: 401.68
## 
## Number of Fisher Scoring iterations: 15
## 
## 
## [[9]]
## 
## Call:
## glm(formula = HW6_Q_18_score_bin ~ Cohort + MTHLevel + FirstGenStatus + 
##     MTHLevel:FirstGenStatus, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4036   0.3617   0.3966   0.5399   1.0334  
## 
## Coefficients:
##                                       Estimate Std. Error z value Pr(>|z|)   
## (Intercept)                             1.3053     0.4601   2.837  0.00456 **
## CohortSpring2020                        0.6508     0.2608   2.496  0.01257 * 
## MTHLevel200's                           0.8753     0.6014   1.456  0.14553   
## FirstGenStatusNot First                 0.7380     0.5283   1.397  0.16245   
## FirstGenStatusUnknown                  14.8882   841.0358   0.018  0.98588   
## MTHLevel200's:FirstGenStatusNot First  -1.0663     0.6793  -1.570  0.11651   
## MTHLevel200's:FirstGenStatusUnknown   -16.7201   841.0362  -0.020  0.98414   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 474.22  on 707  degrees of freedom
## Residual deviance: 459.75  on 701  degrees of freedom
##   (51 observations deleted due to missingness)
## AIC: 473.75
## 
## Number of Fisher Scoring iterations: 15
## 
## 
## [[10]]
## 
## Call:
## glm(formula = HW6_Q_19_score_bin ~ Cohort + MTHLevel + FirstGenStatus + 
##     MTHLevel:FirstGenStatus, family = binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.2992   0.3902   0.4134   0.4821   0.9914  
## 
## Coefficients:
##                                       Estimate Std. Error z value Pr(>|z|)   
## (Intercept)                             1.3731     0.4600   2.985  0.00284 **
## CohortSpring2020                        0.4435     0.2595   1.709  0.08744 . 
## MTHLevel200's                           0.7526     0.5849   1.287  0.19822   
## FirstGenStatusNot First                 0.6003     0.5193   1.156  0.24769   
## FirstGenStatusUnknown                  14.9314   844.9684   0.018  0.98590   
## MTHLevel200's:FirstGenStatusNot First  -0.6323     0.6614  -0.956  0.33913   
## MTHLevel200's:FirstGenStatusUnknown   -16.6024   844.9688  -0.020  0.98432   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 460.98  on 706  degrees of freedom
## Residual deviance: 450.36  on 700  degrees of freedom
##   (52 observations deleted due to missingness)
## AIC: 464.36
## 
## Number of Fisher Scoring iterations: 15
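
The warning at the top of this output, the 14 to 15 Fisher scoring iterations, and the very large standard errors on some `FirstGenStatusUnknown` terms (for example, 795.33 in the HW6 Q5 model) are consistent with quasi-complete separation: a small group in which every student has the same outcome. The sketch below reproduces that pattern on hypothetical toy data, not on `full_data`, and shows how a simple cross-tabulation exposes the empty cell. The data frame `toy` and its counts are invented for illustration.

```r
# Hypothetical toy data: every student in group "Unknown" has outcome 1,
# so the outcome is perfectly predicted within that group.
toy <- data.frame(
  group   = factor(rep(c("First", "Not First", "Unknown"), times = c(20, 20, 5))),
  correct = c(rep(c(1, 0), times = c(14, 6)),   # First: 14 of 20 with outcome 1
              rep(c(1, 0), times = c(17, 3)),   # Not First: 17 of 20 with outcome 1
              rep(1, 5))                        # Unknown: 5 of 5 with outcome 1
)

# A zero count in either outcome column for a group signals separation.
cell_counts <- table(toy$group, toy$correct)
print(cell_counts)

# The fit still returns, but the "Unknown" estimate and its standard error blow up.
# Warnings are suppressed here only to keep the example output short.
fit <- suppressWarnings(glm(correct ~ group, family = binomial, data = toy))
print(round(coef(summary(fit)), 3))
```

If the same cross-tabulation on the real data shows an empty cell for the Unknown group, the corresponding coefficients and p-values should not be interpreted; collapsing or dropping that category is one common option.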

12 Logistic rough conclusions

Cohort is a significant or near-significant predictor in several of the models. Because we fit many models and run many tests, unadjusted p-values are not a reliable metric here, but there is some evidence of improvement for the Spring 2020 cohort. Math level is also significant in a good number of the models. A few interactions between cohort and the other variables are significant as well (for example, the three-way interaction in the HW6 Q17 model), which would suggest that the intervention may help certain groups more than others. Again, with this much testing it is hard to say whether these are real effects, but the results are encouraging to some degree.
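
One way to put a number on the multiple-testing concern is to adjust the cohort p-values across the ten questions. The sketch below uses base R's `p.adjust()` on the `CohortSpring2020` p-values copied by hand from the Section 11.3 output; it is an illustration, not part of the original analysis. Note that in the Q15 and Q17 models cohort also appears in interactions, so its main-effect p-value refers to the reference levels only.

```r
# CohortSpring2020 p-values copied from the Section 11.3 output (one per question).
cohort_p <- c(
  Q3 = 0.54, Q4 = 0.404, Q5 = 0.0457, Q6 = 0.1119, Q7 = 0.13244,
  Q15 = 0.123, Q16 = 0.502965, Q17 = 0.3999, Q18 = 0.01257, Q19 = 0.08744
)

# Holm controls the family-wise error rate;
# Benjamini-Hochberg (BH) controls the false discovery rate.
adjusted <- data.frame(
  raw  = cohort_p,
  holm = p.adjust(cohort_p, method = "holm"),
  BH   = p.adjust(cohort_p, method = "BH")
)

print(round(adjusted, 3))
```

Under either adjustment the smallest cohort p-value (Q18) rises from 0.0126 to about 0.126, so none of the ten cohort effects stays below 0.05. This supports treating the cohort results as suggestive rather than confirmed.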