Why disparate impact theory is a bad fit for school discipline
By Michael J. Petrilli
Editor’s note: School discipline reform has been the subject of several articles by Fordham’s Mike Petrilli in recent weeks. See this one for an overview of his concerns about the unintended consequences of top-down reform, and this one for ideas on where we might find common ground.
In 2014, in response to findings that African American students were three times as likely to be suspended as white students, the Obama Administration sent a lengthy “Dear Colleague” letter to school districts nationwide, spelling out a new policy on school discipline, motivated by disparate impact theory. It warned administrators that they could be subject to a federal civil rights investigation if their data showed significant racial disparities in the use of suspensions or expulsions, and could be found guilty of discrimination even if they had race-neutral discipline policies that were being applied even-handedly.
It’s this use of disparate impact analysis—and the threat of federal investigations based on discipline disparities alone—that gives many of us on the right such pause, and is why we believe the current administration should rescind or revise the 2014 letter. We worry that it will tie the hands of teachers and school administrators when it comes to maintaining discipline, possibly leading to greater disorder and even violence. We also worry that well-behaved students of color are most likely to suffer from the current approach. And we worry that it rests on a faulty assumption—that any racial disparity in discipline reflects discrimination rather than differences in student behavior.
Disparate impact theory: A little background
The most relevant section of the 2014 “Dear Colleague” letter states:
Schools also violate Federal law when they evenhandedly implement facially neutral policies and practices that, although not adopted with the intent to discriminate, nonetheless have an unjustified effect of discriminating against students on the basis of race. The resulting discriminatory effect is commonly referred to as “disparate impact.”
Disparate impact theory originated in the world of labor law. The classic example comes from firefighting. For decades, many urban fire departments had been the exclusive province of whites, thanks to explicit discrimination against blacks and other people of color. Once civil rights statutes banned such de jure—i.e., legally enforced—discrimination, some fire departments came up with ways to maintain the discriminatory status quo, such as requiring applicants to pass standardized tests before being hired, tests that African American and Latino applicants tended to fail. Agencies and courts turned to disparate impact theory to probe practices like those, which may have been “facially neutral” but were intended to have, and did have, discriminatory effects. Fire departments had to justify their use of such tests as being relevant to the job. If they couldn’t, the tests were deemed unlawful by the courts.
The 2014 letter from the Education and Justice Departments brought a similar approach to bear in the realm of school discipline. When determining whether schools’ policies and practices are having an “adverse impact” on students, the feds would look at:
…instances where students of a particular race, as compared to students of other races, are disproportionately: sanctioned at higher rates; disciplined for specific offenses; subjected to longer sanctions or more severe penalties; removed from the regular school setting to an alternative school setting; or excluded from one or more educational programs or activities.
If policies did result in disproportionality, and thus had an adverse impact on certain groups of students, schools would have to justify them by proving to the Department of Education that their policies were “necessary to meet an important educational goal” and that there are no “comparatively effective alternative policies or practices” that the schools could use instead.
Does student misbehavior vary by race and socioeconomic status?
The 2014 policy rested on a fundamental assumption: that differences in student behavior were not themselves driving differential rates of suspensions and expulsions. The letter stated that “research suggests that the substantial racial disparities of the kind reflected in the [Civil Rights Data Collection] data are not explained by more frequent or more serious misbehavior by students of color.”
But is that true? It’s virtually impossible for researchers to know because they can’t “see” student misbehavior, only records of disciplinary actions taken in response to misbehavior. (Maybe the ubiquitous use of cameras and the like will change that one day, for better or worse.) It is certainly likely that some of the disparities are being driven by the bias of teachers and principals, implicit or otherwise, as some studies suggest.
Furthermore, recent studies like this one from Arkansas are showing big differences in suspension rates across schools in the same district. It seems probable that some middle and high schools are taking a tougher approach to discipline than others, and some of these schools—“suspension factories” if you will—serve high proportions of children of color.
Yet just because student behavior doesn’t explain all of the suspension gap, that doesn’t mean it doesn’t explain some of it. And, as I argued a few weeks ago, it’s extremely likely that student behavior does in fact vary across different subgroups, not because of race but because of the vastly different socioeconomic circumstances that children of different groups are facing. Kids growing up in poverty are more likely to experience trauma, to live without their fathers, to go home to more violent neighborhoods, and to otherwise face all manner of difficult circumstances that make it more likely that they may misbehave at school. As African American children are three times as likely to grow up in poverty as white children (36 percent versus 12 percent, as of 2015), we would expect to see racial differences in student behavior, just as we see racial differences in achievement, which are driven not by race but by socioeconomic differences.
How might we test this theory? One way is to look at the relationship between poverty gaps and discipline disparities across a number of districts. If socioeconomic differences are a major force driving discipline disparities, then we would expect to see bigger discipline disparities in districts with bigger socioeconomic disparities, that is, in places where most of the white students are middle class or above and most of the African American students are poor. Likewise, we would see the smallest disparities in districts where everyone is middle class or everyone is poor.
To find out whether that’s the case, my colleague David Griffith took census poverty data from the new Stanford Education Data Archive and crossed it with data on out-of-school suspensions from the Civil Rights Data Collection.
Each dot in figure 1 represents one of the 120 largest school districts in the country, excluding those that have fewer than 1,000 white students or 1,000 African American students.
The graph shows a simple correlation between black-white discipline disparities (the percentage of black students given one or more out-of-school suspensions in 2013–14 divided by the percentage of white students given the same) versus black-white poverty disparities (the percentage of black children between the ages of five and seventeen in the district living below the poverty line divided by the percentage of white children living below the poverty line).
Figure 1: Black-white suspension gap versus the black-white poverty gap
Sure enough, there’s a strong relationship between the poverty gap and the suspension gap, with differences in poverty explaining more than a third of the differences in discipline disparities from district to district. Maybe something else could explain this phenomenon, but it appears likely that poor students misbehave at higher rates than non-poor students, and in some districts poor students are much more likely to be black.
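For readers who want to see the mechanics, here is a minimal sketch of how such a comparison can be computed. The district numbers below are invented purely for illustration and are not the Stanford or Civil Rights Data Collection figures. The sketch also assumes that "explaining more than a third" refers to the R-squared of a simple linear fit, which for two variables equals the squared Pearson correlation; the original analysis may have been done differently.

```python
from math import sqrt

# HYPOTHETICAL district records, invented for illustration only.
# Each tuple: (black suspension %, white suspension %,
#              black child poverty %, white child poverty %)
districts = [
    (12.0, 4.0, 36.0, 12.0),
    (9.0, 4.5, 30.0, 15.0),
    (15.0, 3.0, 40.0, 8.0),
    (8.0, 5.0, 25.0, 20.0),
    (11.0, 5.5, 33.0, 11.0),
]


def disparity_ratio(black_pct, white_pct):
    """Black rate divided by white rate, as the gaps are defined in the text."""
    return black_pct / white_pct


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


suspension_gaps = [disparity_ratio(bs, ws) for bs, ws, _, _ in districts]
poverty_gaps = [disparity_ratio(bp, wp) for _, _, bp, wp in districts]

r = pearson_r(poverty_gaps, suspension_gaps)
r_squared = r ** 2  # share of variation in suspension gaps associated with poverty gaps

print(f"correlation r = {r:.2f}, r squared = {r_squared:.2f}")
```

An R-squared above roughly 0.33 on the real data would correspond to the "more than a third" described above. As with any correlation, it shows association across districts, not what causes the gap in any one of them.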
***
Where might we turn to find out whether misbehavior itself varies across different groups? How about the students themselves? One piece of evidence comes from the Centers for Disease Control and its Youth Risk Behavior Surveillance System. In 2015, high school students were asked if they had been in a fight on school property at any time in the past 12 months. African American students were 2.2 times more likely to say yes than white students—11.4 percent to 5.2 percent. (See table 13.2 in this 2016 Department of Education report.)
Or how about lower-level offenses, the type that many discipline reformers want treated without suspensions? According to a 2009 nationally representative survey of ninth graders, 17 percent of African American students reported going to class late “sometimes,” versus 10 percent of white students. And 4.1 percent of black students reported going late to class “often,” versus 2.2 percent of white students. (See table 1.)
Table 1. Percentage distribution of ninth graders who reported going to class late in 2009-10
Student group | Never | Rarely | Sometimes | Often
All students | 44.2 | 40.0 | 12.7 | 3.2
White | 47.2 | 40.6 | 10.0 | 2.2
Black | 38.7 | 40.4 | 16.8 | 4.1
Hispanic | 41.6 | 37.9 | 16.2 | 4.4
Asian | 49.3 | 38.8 | 10.5 | 1.5
American Indian/Pacific Islander | 34.0 | 44.1 | 15.3 | 6.6
More than one race | 40.4 | 40.6 | 14.2 | 4.8
SOURCE: U.S. Department of Education, National Center for Education Statistics, High School Longitudinal Study of 2009 (HSLS:09), Base Year.
So according to the kids themselves, compared with white students, African American pupils are more than twice as likely to get into fights at school and almost twice as likely to get to class late. These differences are almost surely being driven in large part by socioeconomic factors. (See figure 2.)
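The ratios behind "more than twice" and "almost twice" can be checked directly from the percentages quoted above. This is a small sketch using only those published figures; the combined "sometimes or often" row is my own sum of the two table columns, not a number reported in the source.

```python
# Self-reported rates quoted in the text (percent of students).
fights = {"black": 11.4, "white": 5.2}          # fight on school property, 2015
late_sometimes = {"black": 16.8, "white": 10.0}  # table 1, "Sometimes"
late_often = {"black": 4.1, "white": 2.2}        # table 1, "Often"
poverty = {"black": 36.0, "white": 12.0}         # child poverty, 2015

# Derived here (not in the source): late "sometimes" or "often" combined.
late_combined = {
    group: late_sometimes[group] + late_often[group]
    for group in ("black", "white")
}


def rate_ratio(rates):
    """Black rate divided by white rate."""
    return rates["black"] / rates["white"]


ratios = {
    "fights": rate_ratio(fights),
    "late sometimes": rate_ratio(late_sometimes),
    "late often": rate_ratio(late_often),
    "late sometimes or often": rate_ratio(late_combined),
    "poverty": rate_ratio(poverty),
}

for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

The fighting ratio comes out near 2.2 and the tardiness ratios fall between roughly 1.7 and 1.9, which is the basis for the "almost twice" wording.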
Figure 2. Race, class, and student behavior
Is every school system in the country guilty of discrimination?
Imagine that you run a school system with pupil demographics that reflect national averages. Your African American students are three times likelier to live in poverty than your white students, are more than twice as likely to get into fights at school, and almost twice as likely to be chronically tardy to class. According to the 2014 “Dear Colleague” letter, if you suspend African American students at twice the rate of white students, you can be subject to an OCR investigation—even if those suspensions are completely justified by student behavior. Because your policy of suspending students who get into fights on school grounds will be said to have had an “adverse impact” on black students, you will face, at minimum, the negative publicity that comes from a civil rights investigation.
And when federal investigators come knocking, you will have to convince them (and not even an impartial judge) that (a) the suspensions serve an educational purpose and (b) there’s not a better approach. Otherwise you can be found guilty of discrimination when in fact what you did was treat all kids the same according to their actions.
The message to administrators nationwide is quite clear: Get your discipline data in line, even if that means under-disciplining certain groups of students, or else you can expect a civil rights investigation. This is why many of us don’t think it’s a stretch to say that the 2014 policy imposed racial quotas on discipline and has likely had a chilling effect on discipline nationwide.
***
Though plenty of conservatives disagree, personally I believe that disparate impact theory has a limited but real role to play in ferreting out discrimination in certain domains, including hiring. When a facially neutral policy is intended to have a discriminatory impact, it’s appropriate for the government to put an end to it.
But it’s not a good fit for the complicated issue of school discipline. Here, much of the racial disparity stems from differences in actual student behavior, which in turn are related to differences in socioeconomic circumstances. Threatening districts with civil rights investigations may reduce the number of suspensions, especially for children of color, but it may also increase disorder in the classroom by depriving teachers of an effective tool for maintaining order. That only serves to harm the peers of disruptive students, who are most likely to be children of color themselves.
To be sure, it remains essential for students and their families to have the right to file complaints with the Office for Civil Rights if they feel they have been subject to discrimination, and for investigators to examine the facts of their cases and come to appropriate judgments.
But jumping to conclusions from districts’ raw discipline data ought to end. School systems should know that if they are treating students fairly, the federal government will have their backs—even if some groups of students are suspended at higher rates than others.
I was four years old in September 1967 on my first day of kindergarten at Countrywood Elementary School in South Huntington School District 13 on Long Island. Before my first year of formal education was over, Martin Luther King, Jr., would be shot to death in Memphis. Robert Kennedy, too, in Los Angeles. More than 30,000 American boys had been killed in Vietnam by the time riots disrupted the Democratic National Convention, just weeks before I started first grade in Mrs. Bobowitz’s class. Riots and social unrest were not infrequent in the America of my early childhood years: Watts, Detroit, Newark, Washington, D.C. Airline hijackings were common, too. My dad flew for American.
My parents made no attempt that I’m aware of to shield me from the events of those turbulent years. There was always a copy of Newsday and the New York Daily News on the kitchen table, and the TV was rarely turned off in our house. I remember the moon landing and the Manson murders, both of which occurred in the sixth summer of my life. Memory is untrustworthy, but despite growing up in the most tumultuous years of the last few generations, I don’t believe I got a sense from the adults in my life—my parents, teachers, or others—that the world was unusually dangerous, volatile, or spiraling out of control. I went to school and became obsessed with animals, airplanes, and the New York Mets. I could ride my bike and play kickball in the street with my friends unsupervised, as long as I stayed within the sound of my mother’s shouts (not her sight) and came inside when the streetlights came on.
Last month the New York Times ran a piece titled “Can Kindness Be Taught?” about a “social-emotional learning” (SEL) curriculum developed by the Center for Healthy Minds at the University of Wisconsin–Madison, “in which preschoolers are introduced to a potpourri of sensory games, songs and stories that are designed to help them pay closer attention to their emotions.” Children who received the “kindness training” became more altruistic, the paper reported. “It also strengthened children’s ability to focus and modestly boosted their academic performance.” I’m out of my depth on whether personality traits can be “taught” in school and agnostic at best on whether they ought to be. But a quote from a teacher in the Times piece leapt out at me. “Our world is kind of a scary place,” said Danielle Mahoney-Kertes, a literacy coach at P.S. 212 in Queens, which implements the pre-K Kindness Curriculum. “We can’t always control what is happening outside us. But what we’re teaching them is that they can control how they respond.”
Her words made me wonder which would have a more significant effect on students, particularly the very youngest ones: a good social-emotional learning curriculum? Or a teacher who thinks the world is a scary place? Perhaps early childhood teachers who view the world as awe-inspiring and who are eager to share their optimism and excitement with students might be more beneficial than teaching coping skills.
My question struck a chord with Holly Korbey, one of the more thoughtful education writers I know, who has been reporting an upcoming piece on social-emotional learning, visiting about two dozen classrooms, mostly in Tennessee and Nevada, which implement the sort of kindness and “mindfulness” training highlighted in the Times piece. “When you have an SEL hammer, everything looks like a nail,” she observed about the classrooms she’s visited. “I noticed that kids talked a lot more about their stress and problems.”
It should go without saying (alas, in nuance-averse 2018, few assumptions can safely go unsaid) that we ought not be blithe about societal problems or the degree to which the impulse to insulate very young children from very real problems is a form of “privilege.” But neither should we overstate the dangers of the present moment. Mostly I wonder if it’s not worth thinking long and hard about the effects on children of a rapidly growing educational movement that proceeds from the pessimistic assumption that the world is so cracked and broken that even the very youngest children need tools to make their way in it. At the very least, we might consider the age at which it’s appropriate to introduce these curricula and tools.
Korbey, who is the mother of three boys, notes that asking children to be problem-solvers puts the focus on the problem. “I wonder whether to value what we want them to value (the planet, democracy, kindness), we should focus more on wonder, optimism, and magnificence. That doesn’t mean we don’t get to the problems eventually, but maybe for older kids,” she said. “You’ve got to love the world before you can save it.”
My thought exactly.
In my book, state-level policymaking should be like good parenting. It should incentivize the behaviors you’re looking to inspire, grant autonomy (when your charges have earned it), and refrain from too much meddling or coddling. It should be transparent and honest, truthful about tradeoffs between short-term discomfort and long-term gain, and motivated by a clear compass rooted in what’s in the best interest of the kids’ wellbeing.
So why does Ohio’s latest softening on what we expect of our high schoolers bring to mind so many parallels to helicopter parenting? Allow me to explain.
I first learned about helicopter parenting from my husband, a psychotherapist who counsels a number of adolescents and young adults. Years ago, he began noting (broadly, never in specifics) that many young people he counseled seemed to lack the fortitude and emotional resilience to overcome basic life obstacles. For instance, they might have a panic attack after earning a “C” on a paper, find themselves bedridden with depression if they didn’t get into their first-choice college, or wind up suicidal after a break-up with a girlfriend or boyfriend.
A common thread among these clients was that their parents tended to “hover,” micromanaging their lives and, worse yet, protecting them from the natural consequences of their decisions and from the discomfort that comes with failure. At that point, the term “helicopter parenting” wasn’t yet part of my lexicon, nor had I seen expert takes from university deans, psychologists, or doctors that brought it fully into the mainstream. But I had heard enough to be suspicious of participation trophies and the self-esteem movement. I vowed that if and when I had children of my own, I’d allow them the dignity of facing the truth, even when, and maybe especially when, it included failure. It seemed clear that the alternative was infinitely worse, and much harder to recover from.
Since then, these themes have shown up somewhat unexpectedly in my own line of work—education policy. The “opt out movement,” wherein parents keep their kids home on state testing days, struck me as an offshoot of helicopter parenting—especially among those reluctant to expose children to stress or anxiety. (Certainly, there are kids for whom anxiety is a legitimate challenge, though such debilitating levels are arguably rare.)
Support for higher academic standards, which Ohio began moving toward in 2009, and the fight to overcome the “honesty gap”—the yawning chasm between states’ sunny numbers on student proficiency versus more rigorous national measures (e.g., NAEP or ACT)—made deep sense. It paralleled my nascent views on parenting and my antagonism toward helicoptering, inasmuch as I believed that if young people aren’t prepared for life—whether academically or emotionally—it’s better that parents know right away, while they still have time to adjust course.
Most recently, Ohio’s debate over graduation requirements has me looking through this lens again. Ohio lawmakers must decide whether to accept the recommendations passed by the State Board of Education this week to extend the latest “fix” to the classes of 2019 and 2020. This menu of options allows students to earn a diploma without demonstrating mastery in any academic content area. A new “jobs readiness seal” earned by completing a checklist of vague and subjective traits such as “global/intercultural fluency” and “dressing and acting appropriately” takes us even further away from the mark. Are we really ready to award diplomas so subjectively?
These options weaken what is already a somewhat meaningless diploma, amount to state-sanctioned low expectations, and set Ohio back three decades when it comes to expecting high school students to meet a basic threshold of competency in key subjects. (Ohio’s 117th General Assembly passed a law in 1987 that required the graduating class of 1994 and classes thereafter to pass ninth grade proficiency exams.)
Like helicoptering, creating competency-free options for young people to graduate is detrimental no matter how loving it’s intended to be. Students, rather than schools and adults charged with running them, will bear the costs down the line. And the cruel irony is that the students for whom we’ve created these alternatives, unlike those who are fiercely helicoptered, are far less likely to have a safety net to fall back on when their academic deficiencies catch up to them.
To some, criticizing the notion of diplomas-as-participation-trophies might come across as elitist. After all, who are we to stand between eighteen-year-olds and their high school diplomas? I worry less about the fact that we’re letting students off the hook than the reality that we’re letting schools off the hook. If Ohio lawmakers accept the State Board’s recommendation and extend these soft alternatives for several more years, they are tacitly agreeing with the idea that mastery is just not possible for some students.
At the end of the day, Ohio’s weakening of graduation standards reminds me of the most disturbing aspects of helicopter parenting. It represents the path of least resistance, regardless of the long-term consequences for students whose interests we’re supposed to protect. It robs students of the opportunity to work hard and achieve something meaningful. It allows adults to save face, albeit temporarily. And, sadly, the repercussions will likely not be felt until years down the line, once those same students enter a world for which they are utterly unprepared.
On this week’s podcast, Benjamin Boer, deputy director at Advance Illinois, joins Mike Petrilli and Alyssa Schwenk to discuss how a coalition of advocates succeeded in getting the Land of Lincoln to overhaul its inequitable school funding formula. During the Research Minute, Amber Northern examines the relationship between high school value added and students’ college success.
Daniel Hubbard, “More Gains Than Score Gains? High School Accountability and College Success,” University of Michigan (October 2017).
Critics of test-based accountability sometimes argue that there’s little evidence that schools that boost students’ test scores also prepare them for long-term success. A recent Institute of Education Sciences–commissioned study by Daniel Hubbard helps to fill this gap by examining how attending high schools with high value added affects students’ first year grades in college.
Hubbard uses student-level test scores and demographic data from public middle and high schools in Michigan to estimate school-level value-added scores and then merges them with college data to measure post-secondary grade point average. His sample includes all students in Michigan public schools who first sat for the eighth grade math and reading Michigan Educational Assessment Program state test (MEAP) between the 2005–06 and 2007–08 school years. To be included in the sample, they must also have taken the eleventh grade state test and a course in a Michigan public college within five years of taking their eighth grade test.
Hubbard uses a number of empirical adjustments and other tests of robustness to address the problem of selection and sorting into high school and college. That includes, for instance, restricting the sample to students who are very likely to go to college, meaning they meet all of the ACT’s benchmarks for college readiness. In theory, these students have less leeway for their college-going decisions to be altered by the quality of their high school.
Yet these results and others do not change the overall tenor of the key finding, which is that there is a statistically significant positive relationship between high schools’ value added and college course grades. A student attending a school with value added one standard deviation above average earns college grades about 0.09 grade points higher than an identical student in an average high school, which is about one third of the difference between a B and a B+. Results by subject area show similar positive, statistically significant results for both tested (math and English language arts) and untested subjects, the latter of which include a wide spectrum of courses such as psychology, business, and even welding.
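The "one third of the difference between a B and a B+" comparison is simple arithmetic, sketched here. It assumes the common 4.0 grade-point scale on which a B is 3.0 and a B+ is 3.3; the summary above does not state which scale the study uses, so those two values are an assumption.

```python
# ASSUMED grade-point values on a conventional 4.0 scale (not stated in the source).
B = 3.0
B_PLUS = 3.3

# From the text: grade-point gain associated with one standard deviation
# higher high school value added.
effect = 0.09

# Fraction of the B-to-B+ step that the effect represents.
share_of_step = effect / (B_PLUS - B)

print(f"share of a B to B+ step: {share_of_step:.2f}")
```

Under that assumption the effect is about 0.3 of the step between the two grades, which matches the "about one third" characterization.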
Effects are larger for black students than for white students and slightly larger for poor students. They are also larger for students in low-scoring (based on high schools’ average eighth grade exam scores) but high value-added schools. In other words, attending an effective school, as measured by value-added gains, is especially important for disadvantaged students.
Hubbard concludes that his results “imply that schools with high value added are not earning those scores by teaching to the test or by reallocating resources toward tested subjects, but instead by preparing students effectively to perform well on the standardized test and beyond.” That’s mighty good news for the high-flying schools that invest copious blood, sweat, and tears into preparing their students not just for the here and now, but for the elsewhere and later. And it’s also good news for the testing-and-accountability movement, given that it shows that test score gains are related to other outcomes most everyone agrees are important.
SOURCE: Daniel Hubbard, “More Gains than Score Gains? High School Quality and College Success,” Working Paper (October 13, 2017).
Evidence continues to mount that students at urban charter schools are achieving higher academic growth than their traditional public school peers. Replicating successful models requires understanding what features of these charter schools are contributing to their gains. Patrick J. Wolf of the University of Arkansas and Shannon Lasserre-Cortez of American Institutes for Research have taken an important step in a recently published report for the U.S. Department of Education. In the study, they examine correlations between various school features and student achievement growth in New Orleans charter schools serving grades three through eight.
The study used school-level value-added scores in English language arts, math, and science for 2012–13 and 2013–14, provided by the Louisiana Department of Education (LDOE). It included fourteen organizational, operational, and instructional features of schools as potential indicators of charter school effectiveness, using data provided by LDOE and the New Orleans Parents’ Guide, a local nonprofit. The analysts used regression analyses to determine the association of these indicators with variations in student achievement growth, as determined by value-added scores.
They found that during the 2012–13 school year, including kindergarten, having an extended school year, and having more experienced teachers all had statistically significant positive associations with growth in English language arts. Including kindergarten was also significantly associated with higher growth in math, a subject for which none of the other thirteen indicators showed a significant association. On the negative side, higher percentages of teachers with a graduate degree and higher student-teacher ratios were significantly associated with declines in ELA growth. Higher student-teacher ratios were also associated with declines in science growth, as were higher levels of support staff, such as speech and occupational therapists, counselors, and mentors.
The report does, however, note a number of limitations to the data. Most significantly, the analysts caution against any conclusions of causation, due to the non-experimental nature of the study. Additionally, none of the statistically significant findings for the 2012–13 school year remained statistically significant in 2013–14, further limiting conclusions. Finally, the study’s fourteen indicators excluded many relevant school characteristics for a variety of reasons. For example, tutoring and transportation were ubiquitous among New Orleans charters, eliminating those as viable variables. And others, like higher expectations, school-leadership quality, and parent involvement, lacked reliable data.
This exploratory study does not provide any concrete answers about which charter school features should be replicated to improve student learning. But the findings do provide valuable guidance for which school features warrant more rigorous study.
SOURCE: Patrick J. Wolf and Shannon Lasserre-Cortez, “An Exploratory Analysis of Features of New Orleans Charter Schools Associated with Student Achievement Growth,” Institute of Education Sciences; U.S. Department of Education (January 2018).