Hi everyone,
Hope you're all doing well. I have a favor to ask.
I am currently supervising a student who is investigating the reliability and perceived usefulness of Large Language Models (LLMs) as assessment aids for proof-based undergraduate mathematics. We are looking for instructors to help us evaluate how AI-generated feedback holds up against the real thing.
The Study Design: We're using authentic student-generated proofs from existing courses and comparing the numerical scores assigned by instructors against those from an LLM. The study uses a blinded comparative-judgment design: you would evaluate paired human and LLM grades and feedback, both with and without a rubric, and provide brief reflections on their quality.
How you can help: The survey takes about 25–30 minutes, and we would be grateful if you (or anyone you know) would be willing to participate.
If you’re up for it, please send your email address to ede...@calstatela.edu and we will get the link over to you.
Thank you so much for considering this and for supporting our student research!
Best,