The essay-scoring competition that just concluded offered a mere $60,000 as a first prize, but it drew 159 teams. At the same time, the Hewlett Foundation sponsored a study of automated essay-scoring engines now offered by commercial vendors. The researchers found that the engines produced scores effectively identical to those of human graders.
Barbara Chow, education program director at the Hewlett Foundation, says: “We had heard the claim that the machine algorithms are as good as human graders, but we wanted to create a neutral and fair platform to assess the various claims of the vendors. It turns out the claims are not hype.”
If the thought of an algorithm replacing a human causes queasiness, consider this: In states’ standardized tests, each essay is typically scored by two human graders; machine scoring replaces only one of the two. And humans are not necessarily ideal graders: they provide an average of only three minutes of attention per essay, Ms. Chow says.