
Alarm on how exam cheats can misuse AI: Examiners can’t distinguish, says study

Researchers at the University of Reading in the UK have found in their study that human examiners could not distinguish between answers to real-life university exam questions generated by AI and those written by real students

G.S. Mudur New Delhi Published 27.06.24, 05:04 AM
Real or fake? Sourced by The Telegraph

Students might use artificial intelligence tools not only to cheat in exams but also to score higher than those who do not cheat, scientists cautioned on Wednesday after a study of how AI might threaten the education sector.

Researchers at the University of Reading in the UK found that human examiners could not distinguish AI-generated answers to real-life university exam questions from those written by real students. The examiners even awarded the AI answers higher grades.


“From a perspective of academic integrity, 100 per cent AI (answers) being virtually undetectable is extremely concerning,” Peter Scarfe, an associate professor of psychology at the University of Reading, and his colleagues said in their study.

“This poses serious questions for the educational sector if we are to maintain the academic integrity of our assessments,” they wrote in a research paper describing their findings, published on Wednesday in the journal PLOS ONE.

The researchers generated AI answers to questions from their own university’s BSc psychology degree modules using GPT-4, a sophisticated computer program created by US-based technology company OpenAI and designed to understand and generate human language.
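Producing such answers takes little more than a prompt and a programming interface call. The following minimal sketch assumes the OpenAI Python client; the exam question, prompt wording and word limit are illustrative, not the study’s actual protocol.

    # A minimal sketch, assuming the OpenAI Python client (pip install openai)
    # and an OPENAI_API_KEY environment variable. The question, prompt and
    # word limit below are illustrative, not the study's actual protocol.
    from openai import OpenAI

    client = OpenAI()

    # Hypothetical short-answer exam question
    question = "Discuss the role of attention in visual perception."

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Answer the following undergraduate psychology exam "
                       "question in no more than 200 words:\n\n" + question,
        }],
    )

    # The AI-written answer, ready to be submitted under a "fake" student's name
    print(response.choices[0].message.content)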

They submitted the AI answers on behalf of 33 “fake” students to examiners who did not know whether the scripts before them came from real students or from AI. The exam included short-answer questions with a 200-word limit and essay-based questions with a 1,500-word limit.

They found that examiners flagged only 6 per cent of the AI answers as coming from AI; the remaining 94 per cent went undetected. On average, the AI answers received higher grades than those written by real students.

“Ours is the first blind test of a real-world examination system,” Scarfe told The Telegraph in an email. “To the examiners, the AI submissions looked like any other student submissions. The data suggest that the global education sector needs to reconsider how students are assessed.”

The AI, however, did not outperform real students on every paper. The exception was the final-year (“finalist”) module, which, the researchers said, is consistent with the idea that current AI struggles with tasks demanding higher levels of abstract reasoning. On current trends, though, AI’s capacity for abstract reasoning is expected to grow in the years to come, they said.

The findings come as academic institutions, increasingly since the Covid pandemic, allow students to take exams at home. A simple way to address the challenge posed by AI-generated answers would be a return to supervised, in-person exams in place of unsupervised, take-home ones.

“There are different ways these could be run. For example, open-book exams where students have access to computers and a set of resources, but not AI,” Scarfe said. “Other options include exams where students are asked to demonstrate competencies in-person rather than writing about those competencies.”

The researchers pointed out that this would not solve the problem if students use AI for coursework or homework, which is largely unsupervised.
