A new Mathematica study examines whether principal evaluations are accurate predictors of principal effectiveness as measured by student achievement. The validity of teacher evaluation measures has received some research attention, but principal evaluation measures are less studied.
The authors examine a principal evaluation measure called the “Framework for Leadership” (FLL), which was developed by the Pennsylvania Department of Education as part of a mandated revision of the state’s principal evaluation process. Superintendents and other district supervisors use the tool to assess principals, and it includes twenty leadership practices grouped into four domains. These domains comprise practices that, when employed by principals, the state believes can raise student achievement. The four domains are strategic/cultural leadership, systems leadership, leadership for learning, and professional and community leadership (more on some of these later).
The study uses data from the pilot implementation of the FLL, which carried no consequences for participating principals, during the 2013–14 school year. It focuses on the 305 of the 517 pilot principals for whom the analysts had suitable administrative data. Those data included state test scores for all Pennsylvania students who took state math and reading tests from 2006–07 to 2013–14 in grades three through eight and grade eleven. Also included were test scores from other grades and subjects where the data were available, such as science in grades four, eight, and eleven and writing in grades five, eight, and eleven.
To account for school-level factors outside of the current principal’s control (such as neighborhood safety or the effectiveness of teachers hired by previous principals), analysts compared the school’s value added before the arrival of the sampled principal with its current value added. Then they assessed the extent to which principals with higher value added earned higher FLL scores than principals with lower value added.
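The comparison described above can be illustrated with a minimal sketch. It is not the study's actual model, which is not detailed here; it only shows the basic arithmetic under simplifying assumptions. The function names, the field names, and all of the numbers are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def principal_contribution(va_before, va_current):
    """Change in a school's value added since the principal arrived.

    Subtracting the pre-arrival value removes school-level factors
    that were already in place before the principal took over.
    """
    return va_current - va_before


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical principals: school value added before arrival, current
# school value added, and the principal's evaluation rating.
principals = [
    {"va_before": -0.10, "va_current": 0.05, "rating": 2.4},
    {"va_before": 0.20, "va_current": 0.20, "rating": 2.0},
    {"va_before": 0.05, "va_current": 0.00, "rating": 1.8},
    {"va_before": -0.05, "va_current": 0.15, "rating": 2.7},
]

contributions = [
    principal_contribution(p["va_before"], p["va_current"]) for p in principals
]
ratings = [p["rating"] for p in principals]

# A positive r means higher-rated principals tend to show larger gains.
r = pearson(contributions, ratings)
print(f"correlation between ratings and value-added change: {r:.2f}")
```

In this toy data the second school has the highest current value added but no change since the principal arrived, which is why the before-and-after difference, not the current level, is what gets compared with the rating.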
Key findings: The FLL scores are positively and statistically significantly correlated with value-added estimates, more so in math than in other subjects. The strongest links are in two domains: “systems leadership,” which includes practices like “establishing and implementing expectations for students and staff” and “ensuring a high-quality, high-performing staff”; and “professional and community leadership,” which includes attributes such as “shows professionalism” and “supports professional growth [of staff].”
The results are driven mostly by principals with at least three years of tenure at their schools; none of the links between FLL scores and value added were significant for principals with one or two years of tenure at their schools. This is presumably because, as other studies have shown, it can take multiple years for principals to make a concrete impact on schools.
It’s encouraging that the practices and behaviors of principals, as measured by an evaluation tool, actually link to student achievement. Perhaps studies like this one will help us to define empirically what we mean by principal quality. That’s been a black box for far too long.
SOURCE: Moira McCullough, Stephen Lipscomb, Hanley Chiang, and Brian Gill, "Do Principals’ Professional Practice Ratings Reflect Their Contributions to Student Achievement?," Mathematica (June 2016).