The Catholic University of Korea

Research Results


Professor Kang-min Kim’s Team Develops AI That Predicts the Perceived Difficulty of Exam Questions Before Testing

  • Writer: External Affairs Team
  • Date: 2024.11.28

- Developed an AI that predicts how difficult students will find each question, before the test questions are disclosed, by linking Item Response Theory (IRT) with 65 large language models (LLMs)

- The question difficulty prediction system (LLaSA) was unveiled for the first time in a poster presentation at EMNLP, a prestigious natural language processing conference


*Image description: Diagram of the question difficulty prediction system (LLaSA) developed by Professor Kang-min Kim’s team at the Catholic University of Korea.


  A research team led by Professor Kang-min Kim of the Department of Data Science (jointly affiliated with the Department of Artificial Intelligence) at the Catholic University of Korea has developed an AI technology that predicts how difficult students are likely to find each test question and helps allocate points accordingly, before the test is given to students. The technology builds on large language models (LLMs) such as ChatGPT.

The results were first presented on November 12 in a poster session at "Empirical Methods in Natural Language Processing (EMNLP) 2024," one of the world’s leading conferences on natural language processing, held in Miami, USA. The paper was also published in "EMNLP 2024 Findings," earning recognition for its quality.


  Traditionally, many educational institutions have relied heavily on the qualitative judgment of question writers when creating exam questions and assigning points, which makes it hard to reflect the difficulty students actually experience. To compensate, exams such as TEPS use methods like Item Response Theory (IRT), which estimates each question's difficulty from large volumes of student answer records and adjusts point allocations accordingly. The limitation of this approach is that students' answer records must be collected in advance.
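
For readers unfamiliar with IRT, one widely used formulation is the two-parameter logistic model, shown here purely as an illustration. The article does not say which IRT variant TEPS or the research team uses, and the symbols below are notation assumed for this sketch.

```latex
P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + \exp\bigl(-a_j(\theta_i - b_j)\bigr)}
```

Here \theta_i is the ability of test-taker i, b_j is the difficulty of question j, and a_j is the question's discrimination. Estimating b_j requires many observed answers X_{ij}, which is why answer records normally have to be collected first.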


  To overcome this, Professor Kang-min Kim’s research team replaced the step of collecting answer records from student groups with responses from 65 large language models (LLMs). The resulting LLaSA system predicts the difficulty that a student group of a given level would actually perceive, without the questions first having to be shown to students.


  The newly developed question difficulty prediction system (LLaSA) improves difficulty prediction by applying IRT to the responses of large language models. From 65 LLMs capable of solving problems in various fields and formats, the research team selected the language model whose ability was closest to that of actual students, and this model solved the test questions in place of the students. Based on the model's answer records, the system adjusted the points allocated to each question according to its difficulty, so that points can be assigned effectively before the actual exam.
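
The idea of turning answer records into difficulty estimates and point allocations can be sketched roughly as follows. This is a minimal illustration and not the team's LLaSA implementation: it uses a simple smoothed log-odds of incorrect answers instead of a fitted IRT model, and the function names, the sample response matrix, and the point-allocation rule are all assumptions made for this example.

```python
import math


def difficulty_from_responses(responses):
    """Estimate per-question difficulty from 0/1 answer records.

    responses: one row per simulated test-taker (for example, an LLM),
    each row holding 1 (correct) or 0 (incorrect) per question.
    Returns the log-odds of an incorrect answer for each question,
    with add-one smoothing so the value stays finite.
    NOTE: a proportion-based stand-in for IRT, assumed for illustration.
    """
    n_takers = len(responses)
    n_items = len(responses[0])
    difficulties = []
    for j in range(n_items):
        correct = sum(row[j] for row in responses)
        p_correct = (correct + 1) / (n_takers + 2)
        difficulties.append(math.log((1 - p_correct) / p_correct))
    return difficulties


def allocate_points(difficulties, total_points=100):
    """Split total_points so that harder questions receive more points.

    The easiest question gets weight 1 and every other question gets
    1 plus its difficulty gap to the easiest one (a hypothetical rule).
    """
    lowest = min(difficulties)
    weights = [d - lowest + 1.0 for d in difficulties]
    scale = total_points / sum(weights)
    return [w * scale for w in weights]


# Hypothetical answer records: 4 simulated test-takers, 3 questions.
sample_responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]
sample_difficulties = difficulty_from_responses(sample_responses)
sample_points = allocate_points(sample_difficulties, total_points=100)
```

In this sketch, a question that most simulated test-takers answer correctly ends up with a low difficulty and few points, while a question most of them miss ends up with more.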


  The study found that the system's difficulty prediction performance was 8–23% higher than that of existing AI-based methods, which analyze only the content of the question. The system also proved versatile: by adjusting which language models are included, it can respond flexibly to changes in the makeup of the student group.
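
One way to picture adjusting the composition of language models for a different student group is sketched below. The article does not describe how the team selects or weights models, so the selection rule here (pick the models whose accuracy is closest to the student group's expected accuracy) is an assumption, and the model names and accuracy figures are made up for illustration.

```python
def select_models(model_accuracy, target_accuracy, k=3):
    """Return the names of the k models closest in accuracy to the target.

    model_accuracy: dict mapping a model name to its accuracy (0 to 1)
    on some reference question set.
    target_accuracy: expected accuracy of the student group being modeled.
    Ties are broken alphabetically so the result is deterministic.
    NOTE: a hypothetical selection rule, not the method from the paper.
    """
    ranked = sorted(
        model_accuracy,
        key=lambda name: (abs(model_accuracy[name] - target_accuracy), name),
    )
    return ranked[:k]


# Hypothetical accuracies for five placeholder models.
accuracies = {
    "model_a": 0.35,
    "model_b": 0.55,
    "model_c": 0.62,
    "model_d": 0.80,
    "model_e": 0.91,
}

# A mid-level student group and a high-level group yield different line-ups.
mid_level_group = select_models(accuracies, target_accuracy=0.60, k=2)
high_level_group = select_models(accuracies, target_accuracy=0.85, k=2)
```

Changing only the target accuracy changes which models stand in for the students, which is the kind of flexibility the paragraph above describes.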


  This research, led by Professor Kang-min Kim, was conducted as part of the National Research Foundation of Korea’s Excellent Young Researcher Support Project. The poster at EMNLP 2024 was presented by the co-first authors, Jae-woo Park (undergraduate student, School of Information, Communications, and Electronic Engineering) and Sung-jin Park (combined B.A.-M.S. program student, Department of Artificial Intelligence). The paper was also published in the conference's Findings, recognized for its excellence.


  Professor Kang-min Kim commented, “This study confirmed that students’ problem-solving abilities can be effectively modeled using large language models. Based on these results, we expect more scientific and advanced question difficulty prediction systems to be applied in educational settings, which will enhance exam differentiation.” He added, “It is very meaningful that world-class research results have come from undergraduate students in the departments of Artificial Intelligence and Data Science, which the Catholic University of Korea established in 2020 and 2021, respectively.”