In this paper, we evaluate which error message types are most effective through a large-scale randomized controlled trial (RCT) conducted in an open-access, online introductory computer science course with over 9,000 students from 150 countries. We assess existing error message enhancement strategies, as well as two novel approaches of our own: (1) generating error messages using OpenAI’s GPT in real time and (2) constructing error messages that incorporate the course discussion forum. By examining students’ direct responses to error messages and their behavior throughout the course, we quantitatively evaluate the immediate and longer-term efficacy of different error message types. We find that students using GPT-generated error messages repeat an error 23.5% less often in the subsequent attempt, and resolve an error in 36.1% fewer additional attempts, compared to students using standard error messages. We also analyze the results across demographic groups to identify any disparities in the impact of different error message types. We find no significant difference in the effectiveness of GPT-generated error messages for students from varying socioeconomic and demographic backgrounds. Our findings underscore GPT-generated error messages as the most helpful error message type, particularly as an intervention that is effective across demographics.
Fri 22 Mar (displayed time zone: Pacific Time, US & Canada)
15:45 - 17:00 | LLMs - Error message and Coding struggles | Papers at Oregon Ballroom 204 | Chair(s): Celine Latulipe (University of Manitoba)
15:45 | 25m Talk | A Large Scale RCT on Effective Error Messages in CS1 | Global, Papers | Sierra Wang (Stanford University), John C. Mitchell (Stanford University), Chris Piech (Stanford University)
16:10 | 25m Talk | dcc --help: Transforming the Role of the Compiler by Generating Context-Aware Error Explanations with Large Language Models | Global, Papers | Andrew Taylor (University of New South Wales, Sydney), Alexandra Vassar (University of New South Wales, Sydney), Jake Renzella (University of New South Wales, Sydney), Hammond Pearce (University of New South Wales, Sydney)
16:35 | 25m Talk | Exploring Novice Programmers' Testing Behavior: A first step to define coding struggle | Papers | Gabriel Silva de Oliveira (North Carolina State University), Zhikai Gao (North Carolina State University), Sarah Heckman (North Carolina State University), Collin Lynch (North Carolina State University)