A research group led by Hiroaki Funayama, a doctoral student at the Graduate School of Tohoku University, has developed a framework for assuring grading quality in the automatic scoring of written answers by artificial intelligence (AI), in which the work is divided between human graders and the AI. The group showed that grading quality can be appropriately controlled with this systematic framework.

With the advent of machine learning methods based on deep learning, the accuracy of automatic scoring of written answers by AI has improved remarkably. In particular, automatic scoring of short-answer questions, which call for written answers of several dozen characters, achieves the same grading quality as a human grader on some questions. However, it is difficult for a scoring AI to appropriately grade answers containing unfamiliar expressions that do not appear in its training data, and this is a major obstacle to the practical use of automatic scoring by AI.

Therefore, the research group built a scoring framework in which an automatic scoring system and human graders cooperate. The framework uses confidence, a measure of how reliable the scoring AI's result is. The confidence of the automatic scoring result is checked for each answer, and if it is low, the answer is re-graded by a human grader.

First, based on a small amount of graded answer data, the lower bound on confidence needed to achieve the desired grading quality is estimated. During actual automatic scoring, whenever the confidence falls below this lower bound, a human re-grades the answer so that the desired grading quality is achieved.
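The two steps described above can be sketched in Python. This is only an illustrative sketch, not the research group's implementation: the function names `estimate_threshold` and `route`, the use of simple accuracy (agreement with the human score) on the automatically accepted answers as the quality measure, and the sample data are all assumptions made for this example.

```python
from typing import List, Tuple


def estimate_threshold(graded: List[Tuple[float, bool]],
                       target_accuracy: float) -> float:
    """Estimate the lower bound on confidence from a small graded set.

    graded holds (confidence, is_correct) pairs, where is_correct says
    whether the AI score matched the human score (an assumed quality
    measure). Returns the lowest confidence value such that the answers
    at or above it reach target_accuracy, or infinity if no value does.
    """
    ranked = sorted(graded, key=lambda pair: pair[0], reverse=True)
    threshold = float("inf")  # default: send every answer to a human
    correct = 0
    for count, (confidence, is_correct) in enumerate(ranked, start=1):
        correct += int(is_correct)
        # Evaluate only after the last answer sharing this confidence value.
        if count < len(ranked) and ranked[count][0] == confidence:
            continue
        if correct / count >= target_accuracy:
            threshold = confidence
    return threshold


def route(confidence: float, threshold: float) -> str:
    """Keep the AI score if confidence reaches the lower bound,
    otherwise hand the answer to a human grader."""
    return "auto" if confidence >= threshold else "human"


# Hypothetical graded sample: (confidence, AI score matched the human score)
sample = [(0.95, True), (0.90, True), (0.80, True),
          (0.70, False), (0.60, True), (0.40, False)]
threshold = estimate_threshold(sample, target_accuracy=0.9)
decisions = [route(confidence, threshold) for confidence, _ in sample]
```

In this sketch, a stricter quality target raises the lower bound, so more answers are sent to human graders and the grading cost rises accordingly.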

In this study, the group ran simulations on data sets of written-answer questions from Japan and from the English-speaking world, confirmed that the framework has the expected effect, and demonstrated its feasibility. The group also found that the higher the agreement rate between human graders' scores, the lower the cost at which high-quality scoring can be achieved. These findings are expected to advance the practical use of automatic scoring.

Paper information: [The 23rd International Conference on Artificial Intelligence in Education (AIED2022)] Balancing Cost and Quality: An Exploration of Human-in-the-Loop Frameworks for Automated Short Answer Scoring

University Journal Online Editorial Department

This is the online editorial department of the university journal.
Articles are written by editorial staff who have a high level of knowledge and interest in universities and education.