From the outset, I just want to make clear that there is nothing particularly clever or innovative about my assessment design. Many talented academic and learning design colleagues could no doubt take what I will discuss here and improve it markedly.
What I do want to focus on is what generative AI cannot and does not do. People, including the CEO of OpenAI, Sam Altman, compare ChatGPT to a calculator. A calculator performs calculations; ChatGPT guesses. The difference is important.
Regardless, the differences between generative AI, large language models, and humans will become increasingly important as the gap between them narrows. This is an important consideration for the design of assessment tasks; a consideration that has suddenly become critical, particularly for those relying heavily on written artefacts for assessment.
Now, a bit of context: most of my students will likely be future primary and secondary school teachers in Queensland, Australia. My goal is to help them make evidence-informed decisions that give their own students the greatest likelihood of succeeding, no matter their background or capabilities.
My courses form part of accredited programs. There are rules. My students need to demonstrate that they meet the Australian Professional Standards for Teachers. The assessment tasks I set align with and are compelled by the rules.
Within this constrained context, one of the main tasks I assign is for students to develop one, or a series of, lesson plans. In addition to the plan(s), they also need to orally provide some context to the plan(s) and/or deliver part thereof. This is done in class or via video. Students also need to provide a commentary to justify the decisions that went into the plan(s).
I admit that assessment design has not always been my strong suit. I'm not part of the clique of regular authors in Assessment and Evaluation in Higher Education. What constantly plays on my mind when I think of assessment is the tension between the notion of learning as performance or outcome vs. learning as a process (this article is the best overview I know of about this issue).
How do we usually infer learning in education, particularly higher education? Through single-shot assessment tasks. In other words, we greatly privilege the outcomes of learning - usually via performance through the production of a particular kind of artefact - over the process of learning. The product is our sole window to learning. Think of all the years of hard work that go into the program of research in a PhD, yet students in many contexts are examined only through what is on the page in a dissertation.
To reiterate the problem here: there are now tools, currently freely available, that allow students to produce artefacts of many different kinds in seconds without going through any of the processes we traditionally need to go through to produce them.
For many people, the emergence of generative AI has come as a sudden and unwelcome surprise in a decade where the world has had enough of unwelcome surprises. The widespread mainstream media attention on ChatGPT would seem to support this observation.
Some of my more technology-oriented colleagues are bewildered at the bewilderment. I have probably been a little more surprised than I should have been. When I was a third-year student (a little longer ago than I care to admit), one task my lecturer gave me was to break ELIZA and try to figure out why it broke. ELIZA is a language-based conversational program that was used to simulate a psychotherapist.
The part of my assessment tasks that ChatGPT handled very poorly was justifying the lesson plans by explaining judgements and decisions. It offered a generic explanation that did not resemble the attached plan and provided no sense of the complex classroom context students are asked to consider. The two components were completely decoupled. ChatGPT could spit out a justification for a lesson plan but not for the lesson plan in question.
Because the critical review was the task I was most concerned about, I entered prompts about it many times in many ways. Each time I was met with a wall of modality-based learning styles, right-brain, left-brain nonsense, and digital natives claptrap among a steady rotation of the greatest hits of misconceptions of teaching and learning.
Apart from intuitively appealing but incorrect claims, ChatGPT also makes other fundamental errors. As many have pointed out, when it is asked to add references, it either makes them up or misattributes claims.
The idea that a program will be able to detect the use of generative AI also seems the wrong way to go here. Apart from it turning into an arms race of sorts, there are ethical implications for running student work through such programs, as Prof Phill Dawson from Deakin University pointed out here. Policing the use of these tools seems futile to me in the long run.
What seems to be core to ChatGPT failing the tasks I assign my students is that they move away from, or at least de-emphasise, generic artefacts like lesson plans, lab reports and exam responses as the sole object of assessment. These artefacts still provide an important foundation: students use them to justify and explain the judgements and decisions that led to the artefact as it is represented. Whether ChatGPT is used to create the artefact or not is mostly irrelevant. Without going through the thinking and learning behind the output, ChatGPT seems only to be able to offer a generic justification that is decoupled from the artefact itself.