Guanglei Hong and Stephen W. Raudenbush
Educational Evaluation and Policy Analysis
Fall 2005
If a kindergarten student is struggling academically, does it make sense to hold the child back? And how does that decision affect the performance of his or her classmates? These are difficult questions to address, but the authors give them the old college try. On the latter question, they find no effect, simply because fewer than 5 percent of students are typically retained. The former question yields more interesting results: the authors find that retained students' academic performance diminished significantly. Had these students instead been promoted to first grade, their math and reading achievement gaps (relative to normally promoted students) would have been halved.
Could this be correct? As with much education research, a key question is whether the analysis sufficiently approximates a randomized experiment. Ideally, one would study a group of children who were all candidates for retention, randomly promoting some and retaining others. Of course, no such experiment exists, so the authors did the next best thing: they compared similar students, some of whom were promoted and some of whom were retained. To ensure similarity, they relied on a robust data set from the National Center for Education Statistics, the Early Childhood Longitudinal Study, Kindergarten Cohort, which offers detailed student-level data including demographics and even indicators of parental involvement and home life. Hong and Raudenbush constructed their model from 207 such indicators.
It's impressive work, but doubts remain. These students clearly were retained for cause: their teachers and/or parents judged them not ready for first grade for some reason. Was that reason some subtle or unmeasurable, but important, difference between the retained students and those promoted? And was it this undefined or undetected difference, and not just the act of retention, that contributed to their slow growth?
(The authors briefly explain their statistical test and dismiss the possibility of "unmeasured confounders," but they give a layperson little information by which to judge that dismissal.) Though written for statisticians, it's an interesting paper that raises some significant questions. We're not persuaded that the authors' answer is as conclusive as they suggest, however, and of course we'd be curious to know whether the findings would hold in higher grades. You can find it online here.