The 2018 PISA results are out. Generally, countries scored within an expected range given their past records. Except one. The scores are astonishing for B-S-J-Z, an acronym for the four Chinese provinces that participated: Beijing, Shanghai, Jiangsu, and Zhejiang. Out of seventy-seven international systems, B-S-J-Z ranked number one in all three subjects: reading, math, and science.
The four Chinese provinces taking PISA changed from 2015 to 2018, with Zhejiang taking the place of Guangdong. The 2018 group’s scores are dramatically higher than those of the 2015 group (which appropriately is called B-S-J-G). In fact, the differences are so large that they are bound to raise eyebrows.
B-S-J-Z’s scores are 61 scale score points higher in reading (555 versus 494), 60 points higher in math (591 versus 531), and a whopping 72 points higher in science (590 versus 518). How uncommon are differences like these? To answer that question, I examined PISA data from 2006–2015.
For each three-year test interval, I computed the changes for each country on the three PISA tests and converted them to absolute values. That produced 497 observations, with a mean of 9.5 points and a standard deviation of 8.6 points.
So the typical change in a nation’s scores is about 10 points. The differences between the 2015 and 2018 Chinese participants are at least six times that amount. The differences are also at least seven times the standard deviation of all interval changes. Highly unusual.
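For readers who want to check the arithmetic, here is a minimal sketch in Python. It assumes only the figures reported above (the 9.5-point mean and 8.6-point standard deviation of interval changes, and the subject scores for B-S-J-G and B-S-J-Z) and simply expresses each 2015-to-2018 gap as a multiple of those baselines; it is not the original analysis of the full 2006–2015 data.

```python
# Sketch of the comparison described above (not the author's actual analysis):
# express the B-S-J-G (2015) to B-S-J-Z (2018) score gaps as multiples of the
# typical three-year change in a country's PISA scores.

# Figures reported in the text: typical absolute change across all
# country-by-subject, three-year intervals (2006-2015).
MEAN_ABS_CHANGE = 9.5   # scale score points
SD_ABS_CHANGE = 8.6     # scale score points

# Scores by subject: (2015 B-S-J-G, 2018 B-S-J-Z), as reported in the text.
scores = {
    "reading": (494, 555),
    "math": (531, 591),
    "science": (518, 590),
}

for subject, (score_2015, score_2018) in scores.items():
    diff = abs(score_2018 - score_2015)
    print(
        f"{subject:8s}: +{diff} points "
        f"({diff / MEAN_ABS_CHANGE:.1f}x the typical change, "
        f"{diff / SD_ABS_CHANGE:.1f}x the SD of all interval changes)"
    )
```

Running the sketch shows each gap is at least six times the typical change and at least seven times the standard deviation, consistent with the figures above.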
A reasonable hypothesis is that changing the provinces participating in PISA, even if only one of the four changed, influenced the test scores. Indeed, when I originally composed a thread for Twitter on this topic, I overlooked the change and treated the 2015–2018 score differences as if the participating provinces were the same. I apologize for the error. My mistake does underscore, however, the larger issue: that PISA scores from China should be viewed skeptically.
Why was Guangdong, China’s most populous province, dropped from participating and Zhejiang added? Is it only a coincidence that scores soared after the change?
The past PISA scores of Chinese provinces have been called into question (by me and others) because of the culling effect of hukou on the population of fifteen-year-olds, and because the OECD allows China to approve which provinces are tested. In 2009, PISA tests were administered in twelve Chinese provinces, including several rural areas, but only scores from Shanghai were released.
Three years later, the BBC reported, “The Chinese government has so far not allowed the OECD to publish the actual data.” To this day, the data have not been released.
The OECD responded to past criticism by attacking critics and conducting data reviews behind closed doors. A cloud hangs over PISA scores from Chinese provinces. I urge the OECD to release, as soon as possible, the results of any quality checks of 2018 data that have been conducted, along with scores, disaggregated by province, from both the 2015 and 2018 participants.
The credibility of international assessments rests on the transparency of test procedures, including how participants are selected and the rules for reporting test results. The OECD risks undermining the credibility of PISA by not being open about how it conducts the assessment in China.
Editor’s note: This essay was originally published as part of a piece in the Washington Post.