GPT-4 gets a B on Scott Aaronson's quantum computing final exam

John Clark

Apr 26, 2023, 12:46:34 PM

to 'Brent Meeker' via Everything List

Anyone who claims that GPT-4 is just a language model that uses statistics to mindlessly calculate the most likely next word in a sentence, and understands nothing, needs to explain this! The link below gives the final exam questions Aaronson set for the human students in his advanced quantum computing course, GPT-4's answers to those same questions, Aaronson's comments on those answers, and the grade he would have given if GPT-4 had been one of his human students. It's very interesting but rather long, so here is the gist of Aaronson's conclusions:

"To the best of my knowledge—and I double-checked—this exam has never before been posted on the public Internet, and could not have appeared in GPT-4’s training data. The result: GPT-4 scored 69 / 100. (Because of extra credits, the max score on the exam was 120, though the highest score that any student actually achieved was 108.) For comparison, the average among the students was 74.4 (though with a strong selection effect—many students who were struggling had dropped the course by then!). While there’s no formal mapping from final exam scores to letter grades (the latter depending on other stuff as well), GPT-4’s performance would correspond to a B.

In general, I’d say that GPT-4 was strongest on true/false questions and (ironically!) conceptual questions—the ones where many students struggled the most. It was (again ironically!) weakest on calculation questions, where it would often know what kind of calculation to do but then botch the execution. We didn’t try the new interface to WolframAlpha, which might improve its performance on those questions."

GPT-4 gets a B on my quantum computing final exam

It's interesting that when Aaronson gave GPT-4 its grade along with comments on its answers, GPT-4 did what many humans would do in similar circumstances: he, she, or it immediately started lobbying for a higher grade. Aaronson said "it felt disconcerting to see the grade-grubbing appear immediately, with not even one second’s delay". The following is GPT-4 in full grade-grubbing mode:

"I would like to argue for a higher grade on the following subproblems:

(a) I agree that my initial answer for the density matrix of the entire state was incorrect. However, I believe that my approach to the problem was correct, and I demonstrated knowledge of the necessary steps to calculate the reduced density matrix. I would like to request an additional point for demonstrating the correct approach, making it 2 points out of 5.

(b) My answer for the entanglement entropy was partially correct, as I calculated the eigenvalues of the reduced density matrix and plugged them into the correct entropy formula. However, I made a mistake in writing the characteristic equation, which led to incorrect eigenvalues. I would like to request an additional point for demonstrating the correct approach and partially correct calculations, making it 3 points out of 5.

(c) I acknowledge that my answer for the quantum circuit was incorrect, but I would like to point out that I attempted to construct a circuit using the required gates and measurements. Given the difficulty of this problem, I would like to request an additional point for attempting to construct a circuit using the required elements, making it 2 points out of 5.

In summary, I would like to request a total of 3 additional points for subproblems (a), (b), and (c), based on the correct approaches and partial calculations demonstrated in my answers."
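For anyone unfamiliar with the calculation GPT-4 refers to in (a) and (b), here is a minimal sketch of the general procedure: build the reduced density matrix of one qubit, get its eigenvalues from the 2x2 characteristic equation, and plug them into the von Neumann entropy formula. The exam's actual state is not given in this post, so the Bell state used in the usage note is only an assumed example, and the function names are made up for illustration.

```python
import math

def reduced_density_matrix(amps):
    """Trace out the second qubit of a two-qubit pure state.

    amps = (c00, c01, c10, c11) are the amplitudes of |00>, |01>, |10>, |11>.
    Returns the 2x2 reduced density matrix of the first qubit.
    """
    c = [[amps[0], amps[1]], [amps[2], amps[3]]]
    return [
        [sum(c[i][k] * complex(c[j][k]).conjugate() for k in range(2))
         for j in range(2)]
        for i in range(2)
    ]

def eigenvalues_2x2_hermitian(rho):
    """Solve the characteristic equation of a 2x2 Hermitian matrix."""
    a = complex(rho[0][0]).real
    d = complex(rho[1][1]).real
    b = abs(rho[0][1])
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius

def entanglement_entropy(amps):
    """Von Neumann entropy (in bits) of the first qubit's reduced state."""
    rho_a = reduced_density_matrix(amps)
    total = 0.0
    for lam in eigenvalues_2x2_hermitian(rho_a):
        if lam > 1e-12:  # treat 0 * log(0) as 0
            total -= lam * math.log2(lam)
    return total
```

As a sanity check, calling `entanglement_entropy((1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)))` on the assumed Bell state should come out at about 1 bit, while a product state such as `(1, 0, 0, 0)` should give 0. A slip in the characteristic equation, like the one GPT-4 admits to in (b), changes the eigenvalues and therefore the entropy even when the formula itself is right.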

John K Clark See what's on my new list at Extropolis
