Beware the Robo-Grader

A friend drew my attention to an op-ed in the Boston Globe about a robo-reader that Pearson was developing to grade the essay section of the PARCC exam. If you think this idea is a disaster in the making, the author of the op-ed, Les Perelman, who was the director of Writing Across the Curriculum at MIT, would agree with you.

An excerpt:

PARCC, the consortium of states including Massachusetts that is developing assessments for the Common Core Curriculum, has contracted with Pearson Education, the same company that graded the notorious SAT essay, to grade the essay portions of the Common Core tests. Some students throughout Massachusetts just took the pilot test, which wasted precious school time on an exercise that will provide no feedback to students or to their schools.

It was, however, not wasted time for Pearson. The company is using these student essays to train its robo-grader to replace one of the two human readers grading the essay, although there are no published data on their effectiveness in correcting human readers.

Robo-graders do not score by understanding meaning but almost solely by use of gross measures, especially length and the presence of pretentious language. The fallacy underlying this approach is confusing association with causation. A person makes the observation that many smart college professors wear tweed jackets and then believes that if she wears a tweed jacket, she will be a smart college professor.

Robo-graders rely on the same twisted logic. Papers written under time pressure often have a significant correlation between length and score. Robo-graders are able to match human scores simply by over-valuing length compared to human readers. A much publicized study claimed that machines could match human readers. However, the machines accomplished this feat primarily by simply counting words.

Read the rest.