This is the third article in a series that looks at a recent AEI paper by Colin Hitt, Michael Q. McShane, and Patrick J. Wolf, “Do Impacts on Test Scores Even Matter? Lessons from Long-Run Outcomes in School Choice Research.” The first and second essays are respectively titled “How to think about short-term test score changes and long-term student outcomes,” and “When looking only at school choice programs, both short-term test scores and long-term outcomes are overwhelmingly positive.”
Yesterday, I argued that Hitt, McShane, and Wolf erred in including programs in their review of “school choice” studies that were only incidentally related to school choice or that have idiosyncratic designs that would lead one to expect a mismatch between test score gains and long-term impacts (early college high schools, selective enrollment high schools, and career and technical education initiatives). If you take those studies out of the sample, the findings for true school choice programs are overwhelmingly positive for both short- and long-term outcomes.
Today I’ll take up another problem with their study: They set an unreasonably high bar for a study to show a match between test score changes and attainment. Let me quote the authors themselves, in a forthcoming academic version of their AEI article:
Our coding scheme may seem rigid. One can criticize the exactitude that we are imposing on the data, as achievement and attainment results must match regarding both direction and statistical significance. We believe this exactitude is justified by the fact that the conclusions that many policymakers and commentators draw about whether school choice “works” depends on the direction and significance of the effect parameter. As social scientists, we would prefer that practitioners rigidly adhere to the convention of treating all non-significant findings as null or essentially 0, regardless of whether they are positive or negative in their direction, but we also eschew utopianism. Outside of the ivory tower, people treat a negative school choice effect as a bad result and a positive effect as a good result, regardless of whether it reaches scientific standards of statistical significance. Since one of our goals is to evaluate the prudence of those judgments, we adopt the same posture for our main analysis.
And later:
Our resulting four-category classifications might be biasing our findings against the true power of test score results to predict attainment results, as it is mathematically more difficult perfectly to match one of four outcome categories than it is to match one of three. Moreover, the disconnect between test score impacts and attainment impacts might be driven by a bunch of noisy non-significant findings not aligning with each other, which is exactly what we might expect noisy non-significant findings to do.
The approach the authors used in the AEI paper is indeed rigid, and in the forthcoming academic version they appropriately test whether their findings change if they treat non-statistically-significant findings as null, regardless of whether those findings point in a positive or negative direction. Their results are largely the same.
But there’s another reasonable way to look for matches, and that’s seeing whether a given study’s findings point in the same direction for both achievement and attainment, regardless of statistical significance. In other words, treat findings as positive regardless of whether they are statistically significant, and treat findings as negative regardless of whether they are statistically significant.
That’s what I do in the tables below, based on Hitt, McShane, and Wolf’s coding of the twenty-two studies of bona fide school choice programs. In each table, the cells on the diagonal show the number of studies with findings that match—either both negative or both positive.
High School Graduation Impacts versus ELA Impacts:
Number of Estimates by Sign for School Choice Programs Only
| | High School Graduation Negative (5) | High School Graduation Positive (17) |
| --- | --- | --- |
| ELA Negative (7) | 3 | 4 |
| ELA Positive (15) | 2 | 13 |
High School Graduation Impacts versus Math Impacts:
Number of Estimates by Sign for School Choice Programs Only
| | High School Graduation Negative (4) | High School Graduation Positive (16) |
| --- | --- | --- |
| Math Negative (5) | 1 | 4 |
| Math Positive (15) | 3 | 12 |
College Enrollment Impacts versus ELA Impacts:
Number of Estimates by Sign for School Choice Programs Only

| | College Enrollment Negative (1) | College Enrollment Positive (7) |
| --- | --- | --- |
| ELA Negative (1) | 1 | 0 |
| ELA Positive (7) | 0 | 7 |
College Enrollment Impacts versus Math Impacts:
Number of Estimates by Sign for School Choice Programs Only

| | College Enrollment Negative (1) | College Enrollment Positive (6) |
| --- | --- | --- |
| Math Negative (1) | 1 | 0 |
| Math Positive (6) | 0 | 6 |
College Graduation Impacts versus ELA Impacts:
Number of Estimates by Sign for School Choice Programs Only

| | College Graduation Negative (0) | College Graduation Positive (2) |
| --- | --- | --- |
| ELA Negative (0) | 0 | 0 |
| ELA Positive (2) | 0 | 2 |
College Graduation Impacts versus Math Impacts:
Number of Estimates by Sign for School Choice Programs Only

| | College Graduation Negative (0) | College Graduation Positive (2) |
| --- | --- | --- |
| Math Negative (0) | 0 | 0 |
| Math Positive (2) | 0 | 2 |
Analyzed this way, we see that the impacts on ELA achievement and high school graduation point in the same direction in sixteen out of twenty-two studies, or 73 percent of the time. For math, it’s thirteen out of twenty studies, or 65 percent. For college enrollment—and for the two college graduation estimates—the results point in the same direction 100 percent of the time.
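Those match rates are just the diagonal sums of the 2×2 tables above divided by each table’s total. For readers who want to check the arithmetic, here is a minimal Python sketch; the counts are transcribed from the tables, and the dictionary labels are my own shorthand, not the authors’.

```python
# Counts transcribed from the 2x2 tables above.
# Keys: first sign is the achievement result, second is the attainment result.
tables = {
    "HS graduation vs. ELA":  {"neg_neg": 3, "neg_pos": 4, "pos_neg": 2, "pos_pos": 13},
    "HS graduation vs. math": {"neg_neg": 1, "neg_pos": 4, "pos_neg": 3, "pos_pos": 12},
    "College enrollment vs. ELA":  {"neg_neg": 1, "neg_pos": 0, "pos_neg": 0, "pos_pos": 7},
    "College enrollment vs. math": {"neg_neg": 1, "neg_pos": 0, "pos_neg": 0, "pos_pos": 6},
}

for name, t in tables.items():
    total = sum(t.values())
    # A "match" is a diagonal cell: both findings negative, or both positive.
    matches = t["neg_neg"] + t["pos_pos"]
    print(f"{name}: {matches}/{total} match ({matches / total:.0%})")
```

Running this reproduces the figures in the paragraph above: 16/22 and 13/20 matches against high school graduation, and perfect matches for college enrollment.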
Now, perhaps we should be worried about the six estimates in ELA and seven in math where achievement and high school graduation outcomes pointed in opposite directions. So let’s dig into those.
First, studies found that three school choice programs improved ELA and/or math achievement but not high school graduation:
- Boston charter schools
- SEED charter school
- Texas I-STEM School
Keep in mind that for both SEED and the Texas I-STEM School, the results for achievement and for high school graduation were statistically insignificant. So these might actually be matches. Still, for Boston charter schools, there is a clear mismatch, with significantly positive results for ELA and math and significantly negative results for high school graduation. So is this cause for concern? Hardly. As I wrote on Monday,
We know from a 2003 study by Julian Betts and Jeff Grogger that there is a trade-off between higher standards (for what it takes to get a good grade) and graduation rates, at least for children of color. Higher standards boost the achievement of the kids who rise to the challenge, and help those students longer-term, but they also encourage some of the other students to drop out.
There’s good reason to believe, based on everything we know about Boston charter schools and their concentration of “no excuses” models, that they are holding their students to very high standards. This is raising achievement for the kids who stay but may be encouraging some of their peers to drop out. This trade-off is a legitimate subject for discussion—is it worth it?—but it’s not logical to conclude that the test score gains are meaningless. Especially since Boston charters do show a positive (though statistically insignificant) impact on college enrollment.
How about the four programs that “don’t test well”—initiatives that don’t improve achievement but do boost high school graduation rates? They are:
- Milwaukee Parental Choice
- Charlotte Open Enrollment
- Non-No Excuses Texas Charter Schools
- Chicago’s Small Schools of Choice
(Only the Texas charter schools had statistically significant impacts for both achievement (ELA and math) and high school graduation. Milwaukee’s voucher program had a significantly negative impact on ELA; the findings for math and high school graduation rates were statistically insignificant. In Charlotte and Chicago, all of the findings were statistically insignificant.)
The other day, I wrote:
It’s this category that most concerns Hitt, McShane, and Wolf, especially in the context of school choice. “In 2010,” they write, “a federally funded evaluation of a school voucher program in Washington, DC, found that the program produced large increases in high school graduation rates after years of producing no large or consistent impacts on reading and math scores.” Later they conclude that “focusing on test scores may lead authorities to favor the wrong school choice programs.”
It’s a legitimate concern, and one I share…the experience of attending a private school in the nation’s capital could bring benefits that might not show up until years later: exposure to a new peer group that holds higher expectations in terms of college-going and the like; access to a network of families that opens up opportunities; a religious education that provides meaning, perhaps a stronger grounding in both purpose and character, and that leads to personal growth.
It would be a shame—no, a tragedy—for Congress to kill this program, especially if it ends up showing positive impacts on college-going, graduation, and earnings.
So I agree that these mixed findings are a red flag and should give policymakers pause before moving to kill off such programs. I’m encouraged that there are only four such studies in the literature among the twenty-two examined here, just one of which had statistically significant findings for both achievement and high school graduation. But still, such initiatives may well be changing students’ lives, although we wouldn’t know that by looking at test scores alone.
Does that mean that we shouldn’t use test scores to hold individual schools accountable? Including closing weak charter schools or cutting off public funding to private schools of choice if they diminish achievement?
Tune in to our next installment tomorrow for a discussion of why these results for programs should not be extrapolated to individual schools.