In this post, I present a dialogue between two experts in computer science education, exploring the need to rethink assessment in the field in the GenAI era. The dialogue is a synthesis of insights gathered from various sources, including conversations, discussions, and email exchanges on the SIGCSE mailing lists regarding assessment in computer science. Using these materials, I crafted and refined the dialogue with the help of Claude and Perplexity; after several iterations, it reached the final version that follows.
A conversation between Prof. Sarah C. and Prof. David R. during a computer science department coffee break.
Sarah: [stirring her coffee] David, I’ve been struggling with our upper-level course assessments lately. With students using AI for practically everything, I’m not even sure what we should be evaluating anymore.
David: [nods in agreement] Tell me about it. Remember when we used to spend hours grading semicolons and bracket matching? Now AI handles all that syntax perfectly.
Sarah: Exactly! And that makes me wonder—what skills really matter now? Should we still care if students remember to put semicolons at the end of their lines, or should we focus entirely on higher-order thinking?
David: [leaning forward] That’s an interesting point. I’ve noticed in my algorithms class that students can get perfect syntax from AI, but many still struggle with fundamental problem decomposition. Maybe we should shift our focus there?
Sarah: But how do we assess that? I’ve always relied on project work for evaluation, but now I can’t tell if I’m assessing the student’s understanding or their ability to prompt an AI.
David: [thoughtfully] You know, yesterday a student showed me how they used AI to debug their code. The interesting part wasn’t that they found the bug—it was how they interpreted the AI’s suggestions and decided which ones made sense for their specific problem.
Sarah: That’s fascinating! So maybe instead of testing if they can spot a missing semicolon, we should assess their ability to:
1. Interpret AI suggestions critically
2. Choose appropriate data structures and explain why
3. Evaluate different solutions
David: Right! And what about testing their ability to write good test cases? AI can help write code, but students still need to think about edge cases and potential failures.
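To make David’s point concrete, here is a minimal sketch of the kind of edge-case thinking an assessment might ask students to demonstrate. The `average` function and its tests are a hypothetical example of mine, not material from any particular course.

```python
import unittest

def average(values):
    """Return the arithmetic mean of a list of numbers."""
    if not values:  # edge case: the mean of an empty list is undefined
        raise ValueError("average() of an empty list is undefined")
    return sum(values) / len(values)

class TestAverage(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(average([2, 4, 6]), 4)

    def test_single_element(self):
        self.assertEqual(average([5]), 5)

    def test_negatives_cancel(self):
        self.assertEqual(average([-3, 3]), 0)

    def test_empty_list_raises(self):
        with self.assertRaises(ValueError):
            average([])

if __name__ == "__main__":
    unittest.main()
```

The grading target here is not the arithmetic but the coverage: did the student think to test the empty list, a single element, and negative values, and can they explain why each case matters?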
Sarah: [excited] Yes! And algorithmic complexity questions, those require higher-order thinking. Like asking them to analyze why one solution would be more efficient than another, considering both time and space trade-offs.
In upper-level courses, we can ask students to evaluate the strengths and limitations of using a specific problem (e.g., the 3-SAT problem) as a basis for reductions in NP-completeness proofs, or to propose and justify a novel application of an NP-complete problem in a field outside of theoretical computer science, such as economics, transportation, medicine, or social sciences.
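As one possible illustration of the trade-off analysis described above, an assessment item might present two correct solutions to the same task and ask students to argue which is preferable and when. The duplicate-detection example below is a hypothetical sketch of mine, not a prescribed exercise.

```python
# Both functions correctly detect duplicates; students are asked to explain
# why, not just whether, one is preferable for a given workload.

def has_duplicates_nested(items):
    """O(n^2) time, O(1) extra space: compares every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_set(items):
    """O(n) expected time, O(n) extra space: remembers elements seen so far."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False

# An answer worth full credit: the set-based version scales to large inputs but
# costs memory and requires hashable items; the nested version is fine for tiny
# inputs or when extra memory is unavailable.
```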
David: [pulls out a notebook] Let’s break this down. What if our assessments focused on:
- Problem decomposition strategies
- Algorithm selection and justification
- System scalability considerations
- Performance optimization reasoning
Sarah: And we could have them explain their reasoning! Like, “Why did you choose this data structure over alternatives?” Even if AI helped them implement it, they need to understand the implications of their choices.
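A minimal sketch of the kind of justification Sarah has in mind, using a hypothetical membership-lookup scenario: both options below are correct, and the assessment target is the student’s explanation of when each is the better choice.

```python
usernames = [f"user{i}" for i in range(100_000)]

def is_taken_list(name, names=usernames):
    # Option A: linear scan of the list -- O(n) per lookup, no extra memory.
    return name in names

username_set = set(usernames)

def is_taken_set(name, names=username_set):
    # Option B: hash-set lookup -- O(n) setup and O(n) extra memory,
    # then O(1) expected time per lookup.
    return name in names

print(is_taken_list("user42"), is_taken_set("user42"))   # True True
print(is_taken_list("newbie"), is_taken_set("newbie"))   # False False
```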
David: [pauses] But what about basic programming skills? We still need to ensure they can code independently when needed, right?
Sarah: Maybe we need a hybrid approach? Proctored exams for basic skills, but project work that explicitly embraces AI while focusing on these higher-order abilities we’re discussing?
David: [nodding] That makes sense. We could have them document their AI interactions, too: what they asked, why they asked it that way, and how they validated the responses.
Sarah: [finishing her coffee] You know, this conversation has made me feel better about the whole AI situation. We’re not losing the ability to assess; we’re evolving what we assess to match the real-world skills our students will need.
David: Exactly! In industry, they’ll need to work with AI tools anyway. We should teach them to use these tools effectively while ensuring they understand the fundamental principles that make them good computer scientists.
Sarah: [standing up] Well, I need to revise my assessment plans for next semester. But this helps—focus on the thinking and problem solving, not just the coding.
David: [also getting up] And maybe we should share these thoughts at the next faculty meeting? I bet others are wrestling with the same questions.
Sarah: Definitely. You know what? For the first time in a while, I’m actually excited about redesigning my assessments. This feels like an opportunity rather than just a challenge or a threat.
David: [smiling] Welcome to the future of computer science, where knowing how to think is more important than remembering where to put your semicolons!
The advent of GenAI has precipitated a critical need to re-evaluate assessment methods in computer science education. Traditional assessment approaches, which often center on coding tasks and theoretical examinations, are increasingly susceptible to AI-assisted completion. The challenge is particularly acute in computer science because GenAI tools are becoming ever more sophisticated at generating code, debugging programs, and explaining complex concepts; educators must adapt to ensure that assessments genuinely measure students’ understanding and skills rather than their ability to leverage AI tools.
The need to rethink assessment in this context stems from several factors. Firstly, the ease with which GenAI can produce code or solutions to standard problems necessitates a shift towards evaluating higher-order thinking skills, creativity, and the ability to apply knowledge in novel contexts. Secondly, as AI becomes an integral part of the software development process, assessment methods must evolve to reflect this reality, potentially incorporating AI tools as part of the learning and evaluation process rather than viewing them as a threat. Thirdly, there’s a growing need to assess students’ ability to critically evaluate AI-generated content, understanding its limitations and potential biases.
Moreover, this reassessment will provide an opportunity to align educational practices more closely with the rapidly evolving tech industry landscape, where AI collaboration is becoming the norm. By adopting new assessment strategies, educators can better prepare students for future careers where AI literacy and the ability to work alongside AI systems will be crucial. Ultimately, rethinking assessment in computer science education in the GenAI era is not just about maintaining academic integrity; it’s about equipping students with the skills and mindset necessary to thrive in an AI-augmented world.
In summary, while GenAI tools are powerful aids in software development, they do not solve every problem automatically. Developers still need to apply their knowledge, creativity, and problem-solving skills to use these tools effectively and to address complex software development challenges. Ultimately, the true value of GenAI lies not in replacing computer scientists, but in enhancing their capabilities and empowering them to tackle challenges that were once considered insurmountable.
Orit Hazzan is a professor at the Technion’s Faculty of Education in Science and Technology. Her research focuses on computer science, software engineering, and data science education. Additional details about Hazzan’s professional work can be found on her website; her email is oritha@technion.ac.il.