I have always wondered how important it is for students to show up to my class. My aim in teaching is to make my readings, lectures, exams, extra materials, etc. complementary, so that the way to get an A or a good grade is to stay actively engaged in all of the course material. Simply coming to class without reading should not generally get you an A, nor should reading without coming to class.
This year I managed to quietly take attendance, at least for most students, in my Money and Banking class (I do not require students to come, so coming does not give you an explicit grade advantage over those who choose not to come). And I was able to put together an extremely simple regression to show how important coming to class was for the determination of the final course grades. Here are the results:
Final Course Grade (in percentage) = 0.877 + 0.013 Female - 0.045 International - 0.016 Missed Class - 0.102 Missed Class Dummy
Now, this is a ridiculously stupid regression for reasons obvious to anyone who suffered through graduate school (or perhaps even undergrad) econometrics. But let me simplify what it tells you. I had 81 students take the class. The “Final Course Grade” variable is measured from 0 to 1 (a 1 corresponding to 100%). The constant of 0.877 means that a male American student with a good attendance record and no missed classes is predicted to score 87.7% (the real class average was much lower, about 80.0%). I did not include the standard errors here, but all variables except the female one were statistically significant at conventional levels. Ignoring that, the results show that the women in my class do better than the men by 1.3 percentage points. Their raw performance is also better in all of my other classes. But this result is noisy, and we cannot reject the idea that their performance is no different from the men's and that the gap is only a sampling artifact (I bet with a huge sample this result would hold).
All else equal, my international students (all non-Americans grouped into one category; if I disaggregated I might be giving away who particular students are) score 4.5 percentage points lower than their American counterparts, so this is at least a third of a grade lower. You can imagine how that impacts my course ratings. And for the key result, holding everything else constant, it appears that each missed class is correlated (note I did not say “causes”) with a 1.6 percentage point lower grade as compared to a student who did not miss that class. Thus, if you miss 6 classes you would expect to score about 10 percentage points lower in the course than someone who missed zero classes. So, if my typical good student misses zero classes and scores an 87%, then a typical non-attendee would score a 77%, which is the difference between a B and a C. If you decide to mail it in and never come to class, then it seems that you have no statistical chance of passing: at best you would score a 55.2% for the course.
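The arithmetic behind these readings can be checked with a small sketch that simply plugs values into the fitted equation. The coefficients are copied from the regression above; the function name `predicted_grade` and its argument names are made up for illustration.

```python
# Coefficients copied from the regression reported above.
COEF = {
    "intercept": 0.877,
    "female": 0.013,
    "international": -0.045,
    "missed_class": -0.016,        # per class missed
    "missed_class_dummy": -0.102,  # 1 if attendance data are unreliable
}


def predicted_grade(female=0, international=0, missed=0, no_data=0):
    """Fitted final grade on the 0-to-1 scale (hypothetical helper)."""
    return (COEF["intercept"]
            + COEF["female"] * female
            + COEF["international"] * international
            + COEF["missed_class"] * missed
            + COEF["missed_class_dummy"] * no_data)


baseline = predicted_grade()            # male, American, zero missed classes
six_missed = predicted_grade(missed=6)  # same student, six missed classes
print(round(baseline, 3), round(six_missed, 3), round(baseline - six_missed, 3))
```

Six missed classes cost 6 x 1.6 = 9.6 percentage points, which is the "about 10" quoted above. These are fitted correlations only; the sketch says nothing about what would happen if a given student changed their attendance.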
What is that missed class dummy? It flags students for whom I do not have good data, either because I think they missed a ton of classes or because, the few times I actively asked them about attendance, I did not get an answer. And what the results show is that these students have a grade that is 10.2 percentage points lower than the students for whom I do have records.
OK, so it appears that coming to class is a really big deal. But here is the problem with the above specification: you cannot say anything about whether missing classes actually causes lower grades.
I am sorry if I am boring the economists who read me (not many do as I understand it), but for the benefit of others, let’s try to think why. Most important perhaps is that maybe the people who miss class are systematically different from the people who do not miss class. Suppose, for example, that weaker students are generally those who miss, while excellent students are generally those who come. Then the results above are totally meaningless! Because what those missed class variables are picking up is not the “productivity” effects of class, but actually the fact that the people who have missed class would score worse in the course even if they showed up every single time. It is still plausible that missing class matters in this setting, but you would have to play a few tricks to fix the problem. I do not want to go into that here, but one way to deal with the problem is to include a control in the regression that proxies for student ability. If the college would provide me with each student's overall college GPA (if you read yesterday’s post you will learn that I do not put much stock in that) or better yet their SAT score, then I could include that as a control. If that were in the regression then you would likely see a strong positive relationship between SAT scores and Money and Banking grades, and you would read the result on missed class as something like, “controlling for inherent ability and other personal attributes, missing class is correlated with an ___ reduction in course performance per class missed.”
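To see how much this can matter, here is a minimal sketch using simulated data (not the actual class records). Every number in it is an assumption chosen for illustration: weaker students are made to miss more classes, and the true effect of one missed class is set to only half a percentage point. The sketch then fits the regression with and without an ability control, using plain least squares from NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # simulated students, far more than the real class of 81

# Assumption: unobserved ability, standardized.
ability = rng.normal(0.0, 1.0, n)

# Assumption: weaker students miss more classes (never fewer than zero).
missed = np.clip(np.round(4 - 2 * ability + rng.normal(0, 1.5, n)), 0, None)

# Assumption: the true effect of one missed class is -0.005 (0.5 points).
true_effect = -0.005
grade = 0.80 + 0.05 * ability + true_effect * missed + rng.normal(0, 0.03, n)


def ols(y, *regressors):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


naive = ols(grade, missed)[1]                # ability left out
controlled = ols(grade, missed, ability)[1]  # ability included

print(f"true effect per missed class: {true_effect:.4f}")
print(f"without ability control:      {naive:.4f}")
print(f"with ability control:         {controlled:.4f}")
```

In this toy setup the regression that omits ability attributes part of the ability gap to missed classes, so its coefficient comes out much more negative than the true effect, while the controlled regression lands close to it. A real proxy such as SAT score or GPA measures ability imperfectly, so it would typically remove only part of the bias.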
There are of course even fancier tools we can use, which amount to modeling who is missing class and why, and then making appropriate adjustments to the above regression. To understand this methodological problem, perhaps turn the issue on its head. From the results, would you be confident in the observation that, “hey, if you are a female, American student and you simply go to class, that means you should expect an 89% in the course (B+)”? Of course you wouldn’t, so it has to be the case that we need to understand the implications of the above model a little bit better.