Computer-adaptive testing (CAT) is on the rise in K–12 schools, from Seattle to Virginia and everywhere in between. Proponents of the algorithm-driven testing model say that having variable problem sets geared toward the demonstrated capabilities of individual students allows for more accurate assessment, especially for very high and very low achievers—with less gaming and guessing—along with a focus on student growth rather than raw scores. Opponents, meanwhile, contend that those features play into the fears of test-averse students and can depress their scores. New research from Japan may offer some clues to help unravel this conundrum.
Japanese education officials have been resistant to using computer-adaptive testing for several reasons, including those noted above, but also some that are peculiar to Japan. There, large groups of students taking the same test in the same place at the same time is a longstanding cultural touchstone that would be entirely eliminated by administration of tests that must be taken on electronic devices, have different questions for each student, and likely require differing amounts of time for students to complete. Nevertheless, CAT’s expansion into international arenas like PISA and the Test of English as a Foreign Language (TOEFL) has made the Education Ministry take a second look. Three researchers from institutions across the country developed a computer-adaptive testing protocol designed to determine not only how students performed, but also how they perceived the testing method and whether their own attitudes toward testing were reinforced or allayed by the experience.
The researchers conducted two experiments. The first was to determine whether students who have test anxiety perform more poorly on computer-adaptive tests than those who do not. They compiled a proxy measure of anxiety based on students’ self-reported achievement goals (whether they are focused on performance or mastery, à la Dweck) and approach to learning (deep vs. surface). A total of 870 students (474 fifth graders, 174 sixth graders, 163 seventh graders, and 58 students from an unknown grade) from the Kansai region were recruited between June and September 2020 to participate in the first experiment. (Japanese schools mostly remained open during the pandemic.) A total of 415 students were male; 383 were female. Data indicate that a student’s performance goal did not directly predict their test outcome, implying that the computer-adaptive test did not reinforce any preexisting anxiety about test taking. Additionally, students were surveyed post-test regarding their attitudes toward three types of assessments: the computer-adaptive version they had just taken, the traditional fixed-item version with which they were familiar, and a putative human-adaptive test, which is a hybrid of the two. Despite limited familiarity with CAT, students in the first experiment expressed more positive attitudes toward it than toward the traditional fixed-item and human-adaptive versions.
The second experiment, conducted in Kansai from March to July 2021, involved an additional 745 students (540 fifth graders, 90 sixth graders, 100 seventh graders, and 15 students from unknown grade levels), roughly half male and half female. The basics of the previous experiment were repeated with the same results. However, this time students were given more detailed versions of the two surveys, aimed at identifying potential mechanisms at work. Results revealed that students who felt that the value of testing was primarily to show improvement or to get them to develop study habits were least accepting of computer-adaptive testing, although the difference between them and their peers was minimal. Students who felt that the value of testing was to determine comparative achievement levels preferred a putative human-adaptive test over the other two options.
What might all this mean? The researchers believe it means that computer-adaptive testing does not make students who are prone to anxiety regarding their test performance any more or less anxious than a traditional fixed-item test would. In fact, most students seem unopposed to the new model regardless of their level of test anxiety. In the highly traditional Japanese education sector, that’s saying something. More specifically, if the purpose of testing is made clear to students (that is, what their teachers and school leaders are going to do with the resulting data), nearly any type of test seems to be acceptable to them. And, as noted above, the data obtained from a computer-adaptive test are focused more on growth measures than on raw or scaled scores, which can be compared among students. As long as teachers, administrators, and policymakers are good with that new focus, it seems that CAT could be expanded in K–12 education without much concern about student acceptance.
SOURCE: Takayuki Goto, Kei Kano, and Takayuki Shiose, “Students’ acceptance on computer-adaptive testing for achievement assessment in Japanese elementary and secondary school,” Educational Psychology (July 2023).