Voters cannot possibly comprehend school districts’ tax requests
It’s October, and that means election season. One important decision facing many Buckeye voters is whether to approve their school districts’ tax requests. These referenda represent a unique intersection of direct democracy and public finance; unlike most tax policies, which are set by legislatures, these measures give voters the opportunity to decide, in large part, their own property-tax rates. In Ohio, districts must seek voter approval for property taxes above 10 mills (equivalent to 1 percent) on the taxable value of their property.
Some citizens will enter the voting booth well-informed about these tax issues, but for others, the question printed on the ballot might be all they know. Voters have busy lives and they may not always carefully follow their district’s finances and tax issues. This means that the ballot itself ought to clearly and fairly present the proposition to voters. State law prescribes certain standard ballot language, but districts have some discretion in how the proposition is written. County boards of elections and the Secretary of State approve the final language. How does the actual language read? Is it impartial? Can it be easily understood?
Let’s take a look at a few high-profile ballot issues facing voters in November. First, here is the tax issue posed to Cincinnati voters:
Shall a levy be imposed by the Cincinnati City School District, County of Hamilton, Ohio, for the purpose of PROVIDING FOR THE EMERGENCY REQUIREMENTS OF THE SCHOOL DISTRICT in the sum of $48,000,000 and a levy of taxes to be made outside of the ten-mill limitation estimated by the county auditor to average seven and ninety-three hundredths (7.93) mills for each one dollar of valuation, which amounts to seventy-nine and three-tenths cents ($0.793) for each one hundred dollars of valuation, for five (5) years, commencing in 2016, first due in calendar year 2017?
As with all property-tax issues, one of the most complicated terms is “mill”—the unit in which the levy is expressed, equal to one-thousandth of a dollar, or $1 of tax for every $1,000 of taxable value. None of us, however, goes to the supermarket and buys 100 mills’ worth of groceries, and in the realm of taxes we’re more accustomed to seeing them expressed as percentages—a 6 percent sales tax, for instance. Because millage rates are so rarely used in everyday life, a voter may find it hard to discern the size of the request. Is 7.93 mills a huge tax hike, or relatively affordable? Unless a voter has done her homework, she probably wouldn’t know. But voters shouldn’t be expected to be tax experts or follow the news to understand the impact on their personal finances. Simpler, less technical language would help the average voter better understand the question. Perhaps the tax could also be stated as a percentage or in more realistic dollar terms—for instance, the proposed 7.93-mill levy would cost roughly $793 per year on a property with a taxable value of $100,000.
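For readers who want to check the math themselves, here is a minimal sketch of the mill-to-dollar conversion. The function and the sample values are my own illustration, and it ignores Ohio-specific adjustments (such as assessment ratios and rollback credits) that affect actual bills:

```python
def levy_cost(mills, taxable_value):
    """Annual tax from a levy: one mill is $1 of tax per $1,000 of taxable value."""
    return mills / 1000 * taxable_value

# Cincinnati's estimated rate from the ballot language above: 7.93 mills.
# Note: actual bills also reflect Ohio's assessment ratio and any rollbacks/credits.
print(levy_cost(7.93, 100_000))  # 793.0 -> about $793 a year on $100,000 of taxable value
print(levy_cost(7.93, 100))      # 0.793 -> the "$0.793 per $100 of valuation" quoted on the ballot
```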
Also noticeable in this tax request is the “emergency” language—it is hard to miss when printed in capital letters. While the district is not in fiscal emergency, it is seeking an emergency levy nevertheless. The state permits this type of levy when districts are projecting a financial deficit in future years. But the prominent ballot language could affect the electoral outcome, especially if marginal or undecided voters tip the scales. Perhaps the district is indeed in financial straits, but shouldn’t that case be made independent of the ballot itself? Opponents might argue that the district could address the deficit in other ways, such as by renegotiating unaffordable teacher union contracts. Referenda should be presented as neutrally as possible,[1] because we know from surveys that the wording of questions can alter the results. Though allowed, the use of the word “emergency,” which carries a powerful connotation, is likely to influence voters.[2]
Now let’s turn to the 274-word question facing Columbus voters.
Shall the Columbus City School District be authorized to do the following: 1. Issue bonds for the purpose of improving the safety and security of existing buildings including needed repairs and/or replacement of roofing, plumbing, fire alarms, electrical systems, HVAC, and lighting; equipping classrooms with upgraded technology; acquiring school buses and other vehicles; and other improvements in the principal amount of $125,000,000, to be repaid annually over a maximum period of 30 years, and levy a property tax outside the ten-mill limitation, estimated by the county auditor to average over the bond repayment period 0.84 mill for each one dollar of tax valuation, which amounts to $0.084 for each one hundred dollars of tax valuation, to pay the annual debt charges on the bonds, and to pay debt charges on any notes issued in anticipation of those bonds? 2. Levy an additional property tax to provide funds for the acquisition, construction, enlargement, renovation, and financing of permanent improvements to implement ongoing maintenance, repair and replacement at a rate not exceeding 0.5 mill for each one dollar of tax valuation, which amounts to $0.05 for each one hundred dollars of tax valuation, for a continuing period of time? 3. Levy an additional property tax to pay current operating expenses (including expanding Pre-Kindergarten education; improving the social, emotional, and physical safety of students; expanding career exploration opportunities; reducing class sizes; providing increased support to students with exceptional needs; and enhancing reading and mathematics instruction) at a rate not exceeding 5.58 mills for each one dollar of tax valuation, which amounts to $0.558 for each one hundred dollars of tax valuation, for a continuing period of time?
I won’t repeat the point about millage, but let me make three additional observations. First and most obviously, this is a complicated request: The district is seeking approval for a tax package that includes not only debt financing but also funding for capital improvements and day-to-day operations. This puts a daunting burden on voters who must either gather the requisite information beforehand, or spend serious time in the booth reading and understanding it.
Second, consider how different Columbus’s tax request is from Cincinnati’s. Columbus is seeking a fixed-rate levy at a maximum of 0.5 mills for permanent improvements and 5.58 mills for operations. In contrast, Cincinnati is seeking a fixed-sum levy generating $48 million per year, where the tax rate could vary (note the “estimated” rate). Also, there is no set time at which Columbus’s tax would expire, while Cincinnati’s would sunset after five years. This illustrates how varied Ohio’s property-tax types are, adding more complexity to what voters must know in order to make an informed decision.
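To make the distinction concrete, here is a minimal sketch of how the two levy types behave as a district’s tax base changes. The total-valuation figures are hypothetical, not either district’s actual tax base:

```python
# Fixed-rate levy (Columbus-style): the rate is locked, so revenue moves with total valuation.
def fixed_rate_revenue(mills, total_taxable_value):
    return mills / 1000 * total_taxable_value

# Fixed-sum "emergency" levy (Cincinnati-style): the dollar amount is locked, so the
# effective rate is recalculated as valuation changes.
def fixed_sum_rate(target_dollars, total_taxable_value):
    return target_dollars / total_taxable_value * 1000  # expressed in mills

# Hypothetical tax base growing from $6.0 billion to $6.5 billion:
for value in (6_000_000_000, 6_500_000_000):
    print(round(fixed_rate_revenue(5.58, value)),       # revenue rises with valuation
          round(fixed_sum_rate(48_000_000, value), 2))  # mills fall as valuation rises
```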
Third, note how the 5.58-mill request lists several specific purposes of the levy, such as expanded pre-K, reduced class sizes, and other initiatives. Other districts’ tax requests don’t include such specific lists and could be thought of as more neutral. For instance, Cleveland’s levy request simply states that it would be used for “current expenses for the school district and partnering community schools.” Similarly, Hilliard’s levy request says its purpose is for “current operating expenses.” That’s it. Nothing more with respect to the levy’s purpose. Does enumerating a handful of likable programs improve the chances of passage? It’s hard to know, of course, but such lists do seem to frame the tax in a more favorable light.
One could argue that voters are responsible for educating themselves before they enter the booth, and that the question itself doesn’t matter. To be fair, local media usually cover school tax issues—albeit much less than top-of-the-ticket races—and I suspect a fair number of voters come modestly well-informed. But we also know that some voters won’t be as attuned. That means the words on the ballot matter and, if the examples of Cincinnati and Columbus are any indication, the language of property-tax referenda could be made more understandable and fair. Accomplishing this will probably require revisions to state tax law and/or changes in how county boards oversee districts’ ballot language.
To be clear, I’m not taking a position on either of these tax issues. The benefits of each tax could very well outweigh the costs, or vice-versa. Nor am I suggesting that direct democracy is an inappropriate way of setting tax policy. Other taxing arrangements, of course, have their own set of challenges. My point is that so long as voters are tasked with setting property tax rates, the referenda should be presented as clear, simple, and unbiased propositions. As economist John Cochrane has argued, one imperative of modern governing is to “bring a reasonable simplicity to our public life.” Reasonable simplicity in tax referenda language seems to be warranted.
[1] In the case of the “Brexit” vote, the neutrality of the referendum language came into question and the government was forced to revise it. In Indiana, school tax referenda language has been disapproved by the state on the grounds that it might bias the vote. See here, here, and here for examples of disapproved ballot language.
The Ohio Department of Education (ODE) recently released the results of its revised sponsor evaluation, including new ratings for all of the state’s charter-school sponsors. Called “authorizers” in most other states, sponsors are the entities responsible for monitoring and oversight of charter schools. Under the current rating system, sponsors are evaluated in three areas—compliance, quality practice, and school academic outcomes—and receive overall ratings of “Exemplary,” “Effective,” “Ineffective,” or “Poor.” Of the sixty-five Buckeye State sponsors evaluated, five were rated “Effective,” thirty-nine “Ineffective,” and twenty-one “Poor.” Incentives are built into the system for sponsors rated “Effective” or “Exemplary” (for instance, only having to be evaluated on the quality practice component every three years); however, sponsors rated “Ineffective” are prohibited from sponsoring new schools, and sponsors rated “Poor” have their sponsorship revoked.
Number of charter schools by sponsor rating
[[{"fid":"117360","view_mode":"default","fields":{"format":"default"},"type":"media","link_text":null,"attributes":{"height":"300","width":"559","style":"width: 350px; height: 188px;","class":"media-element file-default"}}]]
Evaluating sponsors is a key step in the direction of accountability and quality control, especially in Ohio, where the charter sector has been beset with performance challenges. Indeed, the point of implementing the evaluation was twofold. First, the existence of the evaluation system and its rating rubric is meant to prod sponsors to focus on the academic outcomes of the charter schools in their portfolios. Second, the ratings are designed to help sponsors improve their own work, which should result in stronger oversight (without micromanagement) of schools and an improved charter sector. Results-driven accountability is important, as is continually improving one’s practice.
What happens next is also important. ODE has time to improve its sponsor evaluation system before the next cycle, and it should take that opportunity seriously. Strengthening both the framework and the process will improve the evaluation. Let us offer a few ideas.
First, the academic component should be revised to more accurately capture whether schools are making a difference for their students. Largely as a function of current state policy, Ohio charters are mostly located in economically challenged communities. As we’ve long known and are reminded of each year when state report cards on schools and districts are released, academic outcomes correlate closely with demographics. So we need to look at the gains students are (or aren’t) making in these schools, as well as their present achievement. In communities where children are well below grade level, the extent and velocity of growth matter enormously. Make no mistake: proficiency is also important. But schools whose pupils consistently make well over a year of achievement growth within a single school year are doing what they’re supposed to: helping kids catch up and preparing them for the future.
It’s critical that achievement and growth both be given their due when evaluating Ohio schools—and the entities that sponsor them. Fortunately, Ohio will soon unveil a modified school-accountability plan under the federal Every Student Succeeds Act (ESSA): This would be a perfect opportunity to rebalance school report cards in a way that places appropriate weight—for all public schools and sponsors—on student growth over time.
Because dropout recovery charters are graded on a different scale from other kinds of charters, their sponsors may get artificially high ratings on the academic portion of the sponsor evaluation. That needs fine-tuning too.
The compliance component of the sponsor evaluation system also needs attention. The current version looks at compliance with “all laws and rules”—a list of 319 laws and rules applicable to Ohio’s charter schools, many of which don’t apply to individual sponsors. (For example, many sponsors have no e-schools in their portfolios, so the laws and rules that apply to such schools aren’t pertinent to them.) Yet all Ohio sponsors were forced to gather or draft more than a hundred documents and memos—many of them duplicative—for each of their schools over a thirty-day period. A better approach would be to figure out what applies and what matters most, then examine compliance against those provisions. For example, current item 209 (“The School displays a US flag, not less than five feet in length, when school is in session”) is not as important as whether the school has a safety plan (i.e., how to deal with armed intruders). ODE should focus on compliance with the most critical regulations on a regular basis while spot-checking the more picayune ones periodically. Another option would be to review a sample of the required documents each year, much as an auditor randomly reviews transactions. The current compliance regimen is hugely burdensome and, in many cases, yields very little payoff.
The sponsor evaluation is critically important, and reflects continued progress in Ohio’s efforts to improve charter school outcomes. But it’s also important to get it right if it’s indeed going to improve sponsor practice and in turn the charter sector. In its current form, it measures how well a sponsor responded to rubric questions and whether there were enough staff on hand to upload documents. It needs to quickly move to 2.0 if it seeks to be a credible and effective instrument long-term.
This report from A+ Colorado examines Denver’s ProComp (Professional Compensation System for Teachers), a system forged collaboratively between the district and teachers union in 2005 that was on the vanguard of reforming teacher pay scales. The analysis is timely for Denver Public Schools and the Denver Classroom Teachers Association, who are back at the negotiating table (the current agreement expires in December 2017).
The A+ report outlines the urgency of getting ProComp’s next iteration right. Denver loses about half of its newly hired teachers within the first three years—a turnover rate that is costly not only for the district, which must recruit, hire, and train new teachers, but also for the students who are taught by inexperienced educators (research shows that effectiveness increases greatly over a teacher’s first five years). Denver Public Schools faces another challenge as well: the city’s cost of living has increased sharply. The report notes that more than half of all renters face “serious cost burdens,” meaning they spend more than 30 percent of income on housing. The situation is worse for homeowners or would-be homeowners. Thus, ProComp is a critical part of “making DPS an attractive place to teach.”
ProComp was revolutionary at its outset. Funded in part through an annual $25 million property tax increase (the cost for the entire system is a sizeable $330 million for 4,300 teachers), it aimed to reward teachers working in hard-to-staff positions and schools, as well as those demonstrating instructional effectiveness, measured in part by student test scores. The average teacher’s salary change in a given year looks markedly different under ProComp than in traditional pay systems. Last year, teachers received an average $1,444 cost-of-living increase, a $1,253 increase in base pay, and a $4,914 bonus through one-time incentives. Yet A+ finds that the system still “strongly tracks with experience” and that “teacher pay only looks modestly different than it would under a more traditional salary schedule.” That’s because ProComp maintains traditional “steps” for salary based on teachers’ years of experience and credentials. Increases to base pay are determined by negotiated cost-of-living increases, as well as by meeting ProComp objectives. One-time bonuses are available for serving in hard-to-serve schools, boosting student test scores, or working in a high-performing or high-growth school. When surveyed, Denver’s teachers perceived ProComp as repackaging the same salary into “salary plus bonuses” in exchange for extra work.
A+ finds that, despite the intentions and theory of change behind ProComp (to incentivize and reward teachers and ultimately drive student achievement), studies to date have shown mixed results. While the Center for Education Data and Research found small positive effects on student achievement pre- and post-ProComp, that study couldn’t prove causality. A+ concludes that it’s “hard to prove any measurable student achievement gains attributable to ProComp.” Another study, from Harvard University, found that teachers whose students attained the highest and lowest levels of math growth earned about the same.
Even the $25 million pot of money—just 8 percent of the district’s total spending on teacher pay—isn’t targeted to reward individual teachers for effectiveness. In 2015–16, 27 percent of these one-time dollars were allocated for market incentives. Ten percent went to teachers who gained additional education, while 52 percent were aligned to student outcomes—but mostly at the building level. The authors further find that the system is difficult for teachers to understand—a “hodgepodge of incentives” in desperate need of being streamlined and better aligned to solving district challenges.
Toward that end, A+ makes good recommendations for improving Denver’s system: 1) “Front load” the salary schedule dramatically, awarding 10 percent increases in the first five years (with 1 percent increases thereafter, up to year fifteen); 2) Streamline salary increases and prioritize expertise, specifically by offering two lanes based on education level, instead of seven, and allowing subject-matter experts to earn more; 3) Increase pay for teachers teaching in, and returning to, the highest-need schools; 4) Allow base pay increases, rather than stipends, for taking on leadership roles, thereby better aligning pay with one’s career ladder; 5) Reward high performance among individual teachers, either through more bonuses or through additional promotional opportunities such as leadership roles and advancement on the salary ladder.
Perhaps the most valuable contribution this report makes is a powerful reminder that ProComp (and any teacher pay system, for that matter) should be aligned with district goals. If Denver wants to mitigate teacher turnover, its pay scale must do more to incentivize teachers to stay at earlier points in their careers. The brief is also pertinent nationally. As the breakdown of Cleveland’s promising teacher pay system reminds us, the challenge lies not only in crafting innovative pay systems but in sustaining them over the long haul. In that respect, there’s a lot to learn from Denver’s eleven-year-old program.
SOURCE: A+ Colorado, “A Fair Share: A New Proposal for Teacher Pay in Denver” (September 2016).
On October 12, in the ornate Rotunda and Atrium of the Ohio Statehouse, surrounded by family and many of the state’s top education leaders, some of Ohio’s highest-performing beginning teachers were honored for demonstrating superior practice. We at Educopia, Ohio’s partner in administering the Resident Educator Summative Assessment (RESA), feel truly privileged to have hosted the event, which recognized Ohio educators who earned the top 100 overall scores on RESA in each of the past three years. More than 120 of the state’s highest-scoring teachers attended, joined by their spouses, children, and parents in celebration of the honor. State Superintendent Paolo DeMaria; Representative Andrew Brenner, chair of the House Education Committee; and other state policymakers attended the event. Seeing the teachers beam with pride in front of their families and hearing their sincere gratitude for being recognized for their professional excellence was by far the most moving experience of my career in education policy.
For background, RESA is required for all third-year educators seeking a permanent teaching license in Ohio. It consists of four performance tasks that teachers complete by submitting videos, lesson plans, and student assignments from their actual teaching. The assessment was custom-developed for Ohio with the assistance of national experts Charlotte Danielson and Mari Pearlman to accurately mirror Ohio’s Teaching Standards. Ohio educators, who complete extensive training and earn certification by passing a rigorous examination, score the RESA submissions. The teachers honored at the event were among a very select group: over 15,900 educators have taken RESA since its first year in 2013-2014.
The Ohio Resident Educator program gives new teachers the chance to develop their competencies with the support of a mentor. According to Connie Ball, a program coordinator at Worthington Schools, “The Ohio Resident Educator program provides strong support for beginning teachers allowing them the grace of time to grow in the profession and continue to learn through the guidance of a strong mentorship program and a network of their peers. The program encourages teachers to ask, ‘how can I be a better educator tomorrow than I was today?’ and our teachers are certainly meeting that challenge.”
Through RESA, the state then determines whether candidates have the knowledge and skills to lead a classroom anywhere in the state. This process allows local leadership to focus on what they're best situated to do, which is to work with teachers to help them address areas for improvement. It's a bit like the AP test, in which the test is a consistent bar that all students must pass to get credit, and an AP teacher’s job is to help the students get over it. In Ohio, local leaders and mentors are there to help teachers develop the skills assessed on RESA so they can pass and earn their professional license.
RESA is an objective measure of important teaching practices, such as lesson planning, differentiation of instruction, use of assessment, and the ability to engage students intellectually so they understand concepts deeply. It also measures a teacher's ability to reflect and identify ways to improve her own practice, which is absolutely essential in a profession that requires an ongoing commitment to continual improvement.
Demonstrating the skills that RESA measures is a lot of work, as any teacher will tell you. Just as teachers and schools must commit to ongoing improvement, Educopia, the state’s testing vendor, is gathering feedback and working with the Ohio Department of Education to streamline the assessment to alleviate teacher burden. Still, the RESA “tasks” are not busywork; they capture essential skills required of any effective teacher.
On questionnaires distributed at the end of the event, teachers provided suggestions on how to improve RESA and wrote about what they gained from the RESA process. Among their comments:
Pre-K teacher Jessica Russell with State Superintendent of Public Instruction Paolo DeMaria
All photos used in this piece are by kind permission of Educopia/Matt Verber
This was the first year that Educopia hosted such an event to honor outstanding RESA candidates, and it is just the first step in our efforts to recognize high-performing educators in Ohio. We encourage these teachers to continue their professional growth and to consider future roles as teacher leaders, so that they can share what they clearly do so well. Although the event on October 12th honored a select group of teachers who scored in the top 100 on RESA, we hope districts across Ohio recognize all their teachers who are successful on the assessment, which is truly an accomplishment that deserves celebration.
Matt Verber is the Executive Director of Policy & Advocacy of Educopia.
Our goal with this post is to convince you that continuing to use status measures like proficiency rates to grade schools is misleading and irresponsible—so much so that the results from growth measures ought to count much more—three, five, maybe even nine times more—than proficiency when determining school performance under the Every Student Succeeds Act (ESSA). We draw upon our experience with our home state of Ohio, whose current accountability system generates separate school grades for proficiency and for growth.
We argue three points: that proficiency rates are poor measures of school quality; that growth measures are truer indicators of school quality; and that growth measures don’t let too many bad schools “off the hook.”
Finally, we tackle a fourth point, addressing the most compelling argument against growth measures: that schools can score well on them even if their low-income students and students of color don’t close gaps in achievement and college-and-career readiness.
(And these arguments are on top of one of the best reasons to support growth models: they encourage schools to pay attention to all students, including their high achievers.)
Point #1: Proficiency rates are poor measures of school quality.
States should use proficiency rates cautiously because of their correlation with student demographics and prior achievement—factors that are outside of schools’ control. Let’s illustrate what this looks like in the Buckeye State. One of Ohio’s primary school-quality indicators is its performance index (PI)—essentially, a weighted proficiency measure that awards more credit when students achieve at higher levels. Decades of research have shown the existence of a link between student proficiency and student demographics, and that unfortunate relationship persists today. Chart 1 displays the correlation between PI scores and a school’s proportion of economically disadvantaged (ED) pupils. Schools with more ED students tend to post lower PI scores—and vice-versa.
Chart 1: Relationship between performance index scores and percent economically disadvantaged, Ohio schools, 2015–16
Data source: Ohio Department of Education. Notes: Each point represents a school’s performance index score and its percentage of economically disadvantaged students. The red line displays the linear relationship between the variables. Several high-poverty districts in Ohio participate in the Community Eligibility Provision program; in turn, all of their students are reported as economically disadvantaged. As a result, some less impoverished schools (in high-poverty districts) are reported as enrolling all ED students, explaining some of the high PI scores in the top right portion of the chart.
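For readers who want to reproduce a Chart 1-style analysis from published report-card data, here is a minimal sketch. The file name and column names are assumptions for illustration, not ODE’s actual schema:

```python
import numpy as np
import pandas as pd

# Hypothetical file with one row per school; column names are illustrative only.
schools = pd.read_csv("ohio_report_card_2015_16.csv")  # columns: pi_score, pct_econ_disadv

x = schools["pct_econ_disadv"]
y = schools["pi_score"]

# Pearson correlation and a simple linear fit, analogous to the red line in Chart 1.
r = np.corrcoef(x, y)[0, 1]
slope, intercept = np.polyfit(x, y, 1)
print(f"correlation: {r:.2f}; fitted line: PI = {intercept:.1f} + {slope:.2f} * (% ED)")
```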
Given this strong correlation, it’s not surprising that almost all high-poverty urban schools in Ohio get failing grades on the performance index. In 2015–16, a staggering 93 percent of public schools in Ohio’s eight major cities received a D or F on this measure, including several well-regarded schools (more on those below). Adding to their misery, urban schools received even worse ratings on a couple of Ohio’s other proficiency-based measures, such as its indicators-met and annual measurable objectives components. Parents and students should absolutely know whether students are proficient in key subjects—and on track for future success. But that’s a different question from whether their schools should be judged by this standard.
Point #2: Growth measures are truer indicators of school quality.
Because they account for prior achievement, ratings based on student growth are largely independent of demographics. This helps us make better distinctions in the performance of high-poverty schools. Like several other states, Ohio uses a value-added measure developed by the analytics firm SAS. (Other states utilize a similar type of measure called “student growth percentiles.”) When we look at the value-added ratings from Ohio’s urban schools, we see differentiation in performance. Chart 2 below shows a fairer balance across the A-F categories on this measure: 22 percent received an A or B rating; 15 percent received C’s; and 63 percent were assigned a D or F rating.*
Chart 2: Rating distribution of Ohio’s urban schools, performance index versus “value added,” 2015–16
*Due to transitions in state tests, Ohio rated schools on just one year of value-added results in 2014–15 and 2015–16, leading to some swings in ratings. In previous years the state used a multi-year average, and it will do so again starting in 2016–17; this helps improve the stability of these ratings.
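To illustrate what “accounting for prior achievement” means in practice, here is a heavily simplified residual-gain sketch. The file and column names are invented for illustration, and Ohio’s actual SAS value-added model is considerably more sophisticated than this:

```python
import numpy as np
import pandas as pd

# Hypothetical student-level file; column names are illustrative, not the state's schema.
students = pd.read_csv("student_scores.csv")  # columns: school_id, prior_score, current_score

# Predict this year's score from last year's, then average each school's residuals.
slope, intercept = np.polyfit(students["prior_score"], students["current_score"], 1)
students["residual"] = students["current_score"] - (intercept + slope * students["prior_score"])

school_growth = students.groupby("school_id")["residual"].mean().sort_values()
print(school_growth)  # above zero: students gained more than predicted from their prior scores
```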
We suppose one could argue that the performance-index distribution more accurately depicts what is going on in Ohio’s urban schools: Nearly every school, whether district or charter, is failing. Yet we know from experience that this simply isn’t true. Yes, terrible schools exist, but there are also terrific ones whose efforts are best reflected in student growth. In fact, we proudly serve as the charter authorizer for KIPP Columbus and Columbus Collegiate Academy-Main. Both schools have earned an impressive three straight years of value-added ratings of “A,” indicating sustained excellence that is making a big impact in their students’ lives. Yet both of these high-poverty charter schools were assigned Ds on the performance index for 2015–16. That is to say, their students are making impressive gains—catching up, even—but not yet at “grade level” in terms of meeting academic standards. If we as an authorizer relied solely or primarily on PI ratings, these great schools might be shut—wrongly.
Point #3: Growth measures don’t let too many bad schools “off the hook.”
One worry about a growth-centered approach is that it might award honors grades to mediocre or dismal schools. But how often does this occur in the real world? As chart 2 indicates, 63 percent of urban public schools in Ohio received Ds or Fs on the state’s value-added measure last year. In the two previous years, 46 and 39 percent of urban schools were rated D or F. To be sure, fewer high-poverty schools will flunk under value-added than under a proficiency measure. But a well-designed growth-centered system will identify a considerable number of chronically underperforming schools, as indeed it should.
Point #4: It’s true that schools can score well on growth measures even if their low-income students and/or students of color don’t close gaps in achievement and college-and-career readiness. But let’s not shoot the messenger.
Probably the strongest argument against using growth models as the centerpiece of accountability systems is that they don’t expect “enough” growth, especially for poor kids and kids of color. The Education Trust, for example, is urging states to use caution in choosing “comparative” growth models, including growth percentiles and value-added measures, because they don’t tell us whether students are making enough progress to hit the college-ready target by the end of high school, or whether low-performing subgroups are making fast enough gains to close achievement gaps. And that much is true. But let’s keep this in mind: Closing the achievement gap, or readying disadvantaged students for college, is not a one-year “fix.” It takes steady progress—and gains accumulated over time—for lower-achieving students to draw even with their peers. An analysis of Colorado’s highest-performing schools, for example, found that the trajectory of learning gains for the lowest-performing students simply wasn’t fast enough to reach the high standard of college readiness. An article by Harvard’s Tom Kane reports that the wildly successful Boston charter schools cut the black-white achievement gap by roughly one-fifth each year in reading and one-third in math. So even in the most extraordinary academic environments, disadvantaged students may need many years to draw even with their peers (and perhaps longer to meet a high college-ready bar). That is sobering indeed.
We should certainly encourage innovation in growth modeling—and in state accountability—that can generate more transparent results on “how much” growth is happening in a school and whether such growth is “enough.” But the first step is accepting that student growth is the right yardstick, not status measures. And the second step is to be realistic about how much growth is humanly possible in a single year, even in the very best schools.
***
Using proficiency rates to rate high-poverty schools is unfair to those schools and has real-world consequences. Not only does this policy give the false impression that practically all high-poverty schools are ineffective, but it also demeans educators in high-needs schools who are working hard to advance student learning. Plus, it actually weakens the accountability spotlight on the truly bad high-poverty schools, since they cannot be distinguished from the strong ones. Moreover, it can lead to unintended consequences such as shutting schools that are actually benefitting students (as measured by growth), discouraging new-school startups in needy communities (if social entrepreneurs believe that “failure” is inevitable), or thwarting the replication of high-performing urban schools. Lastly, assigning universally low ratings to virtually all high-poverty schools could breed resentment and pushback, pressuring policy makers to water down proficiency standards or to ease up on accountability as a whole.
Growth measures won’t magically ensure that all students reach college and career readiness by the end of high school, or close our yawning achievement gaps. But they do offer a clearer picture of which schools are making a difference in their students’ academic lives, allowing policy makers and families to better distinguish the school lemons from peaches. If this information is put to use, students should have more opportunities to reach their lofty goals. Measures of school quality should be challenging, yes, but also fair and credible. Growth percentiles and value-added measures meet those standards. Proficiency rates simply do not. And states should keep that in mind when deciding how much weight to give to these various indicators when determining school grades.
The central problem with making growth the polestar of accountability systems, as Mike and Aaron argue, is that it is only convincing if one is rating schools from the perspective of a charter authorizer or local superintendent who wants to know whether a given school is boosting the achievement of its pupils, worsening their achievement, or holding it in some kind of steady state. To parents choosing among schools, to families deciding where to live, to taxpayers attempting to gauge the ROI on schools they’re supporting, and to policy makers concerned with big-picture questions such as how their education system is doing when compared with those in another city, state, or country, that information is only marginally helpful—and potentially quite misleading.
Worse still, it’s potentially very misleading to the kids who attend a given school and to their parents, as it can immerse them in a Lake Wobegon of complacency and false reality.
It’s certainly true, as Mike and Aaron say, that achievement tends to correlate with family wealth and with prior academic achievement. It’s therefore also true that judging a school’s effectiveness entirely on the basis of its students’ achievement as measured on test scores is unfair because, yes, a given school full of poor kids might be moving them ahead more than another school (one with higher scores and a population of rich kids) is moving its students. Indeed, the latter might be adding little or no value. (Recall the old jest about Harvard: Its curriculum is fine and its faculty is strong, but what really explains its reputation is its admissions office.)
It’s further true that to judge a school simply on the basis of how many of its pupils clear a fixed “proficiency” bar, or because its “performance index” (in Ohio terms) gets above a certain level, not only fails to signal whether that school is adding value to its students but also neglects whatever is or isn’t being learned by (or taught to) the high achievers who had already cleared that bar when they arrived in school.
Yes, yes and yes. We can travel this far down the path with Mike and Aaron. But no farther.
Try this thought experiment. You’re evaluating swim coaches. One of them starts with kids most of whom already know how to swim and, after a few lessons, they’re all making it to the end of the pool. The other coach starts with aquatic newbies and, after a few lessons, some are getting across but most are foundering mid-pool and a few have drowned. Which is the better coach? What grade would you give the second one?
Now try this one. You’re evaluating two business schools. One enrolls upper middle class students who emerge—with or without having learned much—and join successful firms or start successful new enterprises of their own. The other enrolls disadvantaged students, works very hard to educate them, but after graduating most of them fail to get decent jobs and many of their start-up ventures end in bankruptcy. Which is the better business school? What grade would you give the second one?
The point, obviously, is that a school’s (or teacher’s or coach’s) results matter in the real world, more even than the gains its students made while enrolled there. A swim coach whose pupils drown is not a good coach. A business school whose graduates can’t get good jobs or start successful enterprises is not a business school that deserves much praise. Nor, if you were selecting a swim coach or business school for yourself or your loved one, would you—should you—opt for one whose former charges can’t make it in the real world.
Public education exists in the real world, too, and EdTrust is right that we ought not to signal satisfaction with schools whose graduates aren’t ready to succeed in what follows when those schools have done what they can.
Mike and Aaron are trying so hard to find a way to heap praise on schools that “add value” to their pupils that they risk leaving the real world in which those pupils will one day attempt to survive, even to thrive.
Sure, schools whose students show “growth” while enrolled there deserve one kind of praise—and schools that cannot demonstrate growth don’t deserve that kind of praise. But we mustn’t signal to students, parents, educators, taxpayers or policymakers that we are in any way content with schools that show growth if their students aren’t also ready for what follows.
Yes, school ratings should incorporate both proficiency and growth, but should they, as Mike and Aaron urge, give far heavier weight to growth? A better course for states is to defy the federal Education Department’s push for a single rating for schools and give every school at least two grades, one for proficiency and one for growth. The former should, in fact, incorporate both proficiency and advanced achievement, and the latter should take pains to calculate growth for all students, not just those “growing toward proficiency.” Neither is a simple calculation—growth being far trickier—but better to have both than to amalgamate them into a single, less revealing grade or adjective. Don’t you know quite a bit more about a school when you learn that it deserves an A for proficiency and a C for growth—or vice versa—than when you simply learn that it got a B? On reflection, how impressed are you by a high school—especially a high school—that looks good on growth metrics but leaves its graduates (and, worse, its dropouts) ill-prepared for what comes next? (Mike and Aaron agree with us that giving a school two—or more—grades is more revealing than a single consolidated rating.)
We will not here get into the many technical problems with measures of achievement growth—they can be significant—and we surely don’t suggest that school ratings and evaluations should be based entirely on test scores, no matter how those are sliced and diced. People need to know tons of other things about schools before legitimately judging or comparing them. Our immediate point is simply that Mike and Aaron are half-right. It’s the half that would let kids drown in Lake Wobegon that we protest.