I'm a CS professor, which, admittedly, is probably the one field where gen AI is almost certainly a net win out of the box, so I understand my experience won't transfer easily to humanities or social sciences, but I'll share it anyway.
I teach the intro to programming course, and we've always had to deal with cheating, because tools that make it easier to write large chunks of code --unlike tools for the same purpose in general prose-- have been around for decades. So we use a hybrid evaluation system with in-person exams and take-home, longer-term projects.
The in-person exam is where you test for individual low-level hard skills like actually writing code --i.e., knowing the syntax-- and mastering data structures and algorithms, plus perhaps some more abstract problem-solving skills, but in very narrow setups.
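To make that concrete, here is a minimal sketch of the kind of narrow, closed-book exercise I have in mind (the language is just an assumption here; any intro language works the same way):

    # Hypothetical exam-style exercise: count word frequencies
    # with a plain dict -- the kind of small, self-contained task
    # you can check in person without any tooling.
    def word_counts(text):
        counts = {}
        for word in text.lower().split():
            counts[word] = counts.get(word, 0) + 1
        return counts

    # Typical follow-up questions: why a dict and not a list of
    # pairs? What does each lookup cost?
    assert word_counts("the cat saw the dog") == {
        "the": 2, "cat": 1, "saw": 1, "dog": 1
    }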
The long-term projects test for planning skills, communication, documentation, and capacity to pivot and adapt to changing requirements.
Now here's the kicker: students can cheat in person, but it's much easier to cheat in projects. So we make them present their projects and go very deep into explaining all their design decisions.
I don't really care what they use, as long as they can explain their whole process. Code generation tools don't really change anything in this picture for me; they're just another hammer in their toolset. As long as they can stand behind all their design decisions and explain what each piece of code, down to a variable assignment, is doing there, I'm fine. My experience is that cheaters are super easy to discover if you ask deep enough in the exposition.
This deep face-to-face evaluation does require a significantly bigger effort from evaluators, but we've already been doing that in CS for years. I understand other disciplines that have relied more on asynchronous evaluation have some reckoning to do.
"So we make them present their projects and go very deep into explaining all their design decisions ... My experience is that cheaters are super easy to discover if you ask deep enough in the exposition."
Loving that. It makes sense and matches my intuition for how it would work. I don't think there's any fundamental difference between what you do and applying it to the humanities.
Great write-up as usual.