Computers now critiquing essays

Grading: Software will soon be used to score compositions on the Graduate Management Admission Test, but critics say machines can't analyze writing.

February 01, 1999 | By Michael Stroh, Sun Staff

Can a computer tell good prose from bad?

Some 200,000 prospective business school applicants are about to find out.

Starting Feb. 10, those who take the Graduate Management Admission Test will have their two essay questions judged -- at least in part -- by a computer.

Test administrators say a computerized grader will help them cut costs and the time it takes to return tests. But critics argue that computers, no matter how sophisticated, have no place judging writing.

"My question is: Why are you assigning essays in the first place if you're not going to read them?" said Dennis Baron, head of the English Department at the University of Illinois in Urbana-Champaign.

Until now, two human readers have scored each GMAT essay on a scale of zero to six points. If the scores disagree by more than one point, the essay goes to a third reader for a decision.

Starting this month, the GMAT's new grading software, dubbed "e-rater," will serve as the second reader.

Developed by Educational Testing Service in Princeton, N.J., e-rater took five years of research and testing to create and is programmed to evaluate more than 50 elements of an essay, ranging from content to organization of ideas.

To grade each GMAT essay, e-rater analyzes its contents and, based on what it finds, assigns a numeric score using grading criteria previously determined by human test administrators.
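The article does not spell out e-rater's internals, so the following is only a minimal sketch of feature-based scoring: extract a handful of measurable traits from the text, then combine them with weights standing in for the human-calibrated criteria. The feature names and weights below are invented for illustration; only the zero-to-six scale comes from the article.

```python
import re

# Toy illustration of feature-based essay scoring. The real e-rater
# reportedly evaluates more than 50 elements; these three are made up.

def extract_features(essay: str) -> dict:
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "paragraph_count": len(paragraphs),
    }

# Invented weights standing in for criteria fit to human-scored essays.
WEIGHTS = {"word_count": 0.005, "avg_sentence_length": 0.05,
           "paragraph_count": 0.4}

def e_rater_style_score(essay: str) -> float:
    features = extract_features(essay)
    raw = sum(WEIGHTS[name] * value for name, value in features.items())
    return round(max(0.0, min(6.0, raw)), 1)  # clamp to the 0-6 scale
```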

"We're not evaluating creative writing here," said Frederic McHale of the Graduate Management Admission Council, which represents business schools around the country and hired ETS to develop e-rater. "What we're judging is: Can a person organize their thoughts and ideas and express them coherently on a specific topic?"

McHale said some school administrators were skeptical about replacing man with machine. "Their first thought was, 'Are you going to score what Shakespeare would write?'" he said.

But researchers who tested e-rater's scoring ability found it was as likely to agree with the first scorer as a human second reader was, a conclusion that McHale says put university officials more at ease.

If e-rater differs by more than a point from its human partner, the essay will be judged by a second human reader before going to a final referee.
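That adjudication flow can be stated precisely. A minimal sketch, assuming the two agreeing reads are averaged (the article does not say how agreeing scores are combined):

```python
def grade_essay(first_human: float, erater: float,
                second_human, referee) -> float:
    """Adjudication flow described in the article. second_human and
    referee are callables standing in for the extra human reads; the
    averaging rule is an assumption, not a detail from the article."""
    if abs(first_human - erater) <= 1:
        return (first_human + erater) / 2     # readers agree
    second = second_human()                   # e-rater is disputed
    if abs(first_human - second) <= 1:
        return (first_human + second) / 2
    return referee()                          # final referee decides

# Example: e-rater disagrees by two points, so a second human reads.
print(grade_essay(3, 5, lambda: 4, lambda: 3.5))  # prints 3.5
```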

"It's still not ready for use by itself," McHale concedes. "If I was highly creative or unique about the way I write down my ideas, it's possible that a human reader could see that and give a higher score to an applicant, while e-rater might not."

By using a computer, GMAT administrators expect to halve the 10-day period it now takes to grade and return the test and eventually lower the exam's $150 price tag.

The digital-age approach to grading essays has sent some test preparation services scrambling to revise their strategies.

Kaplan Educational Centers recently dispatched an updated battle plan to clients preparing for an encounter with e-rater.

"Use transitional phrases like `therefore,' `since,' and `for example,' so that the computer can recognize that your essay contains a structured argument," reads one suggestion. "Use synonyms for more important terms. The computer rewards strong vocabulary," advises another.

"It's not a radical departure from the strategy we've been teaching for years," said Trent Anderson, executive director of graduate and professional programs at Kaplan, who added that the company has been tracking developments in computerized essay grading for some time.

But the notion of students making any change in style to satisfy a computer sets some professors' teeth on edge.

In a recent editorial in the Chronicle of Higher Education, Baron of the University of Illinois argued that forcing students to write for a computer cheapens the value of essay writing and said a computer can be used in better ways to assess knowledge.

"That's why we developed multiple choice," he declared.

Baron said his students had mixed reactions when he asked them how they would like to have their classroom essays graded by a computer.

One lamented that the computer would not be able to tell if a student's writing improved over time.

Another wondered whether the software would reward creativity or extra effort.

Like it or not, the GMAT may be just the beginning for robot readers. Some educators say it won't be long before students from elementary school through college will have their prose graded by machine.

Researchers at the University of Colorado and New Mexico State University are putting the final touches on their Intelligent Essay Assessor, a technology designed to appeal to overworked teachers who want to dig themselves out from mountains of paperwork.

Like e-rater, the Intelligent Essay Assessor can be "trained" to distinguish good writing from bad by feeding it pre-graded essays on the same subject, said Peter Foltz, a psychologist at New Mexico State who co-developed the technology.
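Published descriptions of the Intelligent Essay Assessor say it is built on latent semantic analysis; the sketch below keeps the training idea (score a new essay by its similarity to pre-graded ones) but substitutes raw word overlap for the semantic comparison the real system performs.

```python
import math
from collections import Counter

def vectorize(essay: str) -> Counter:
    return Counter(essay.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_score(essay: str, graded: list[tuple[str, float]],
                  k: int = 3) -> float:
    """Average the grades of the k pre-graded essays most similar to
    this one. Word overlap stands in for latent semantic analysis."""
    v = vectorize(essay)
    ranked = sorted(((cosine(v, vectorize(text)), score)
                     for text, score in graded), reverse=True)
    top = ranked[:k]
    return sum(score for _, score in top) / len(top)
```

Swap in any set of teacher-graded essays on a topic, and the same loop produces scores for new submissions on that topic.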
