Scoring Multiple Choice Tests

Anatoliy Russ

Mar 6, 2012, 1:12:23 PM

So I'm coming to you in search of help. I'm looking for a quick way to score multiple choice tests and then run follow-up analysis on the results (Cronbach's alpha, point-biserial correlations, etc.). In my case I'm looking to score about 20-30 tests, each with 50-60 questions.

The reason I'm hopeful about finding a quick way to do this is a doc I found on the SPSS site. It says on page 2, "Once responses to multiple choice questions are entered into SPSS, scoring the results against the answer key is a straight-forward procedure within SPSS."
http://www.spsstools.net/Syntax/ItemAnalysis/UsingSPSSforItemAnalysis.pdf
It basically says scoring a test is trivial, which I haven't found to be the case.

The solutions that I have found usually involve manually selecting all the fields or doing a search and replace. That isn't feasible in my case because of the large number of tests and the number of questions on each test.

Can you point me to an online guide or explain how to do this type of work quickly?

Art Kendall

Mar 6, 2012, 1:33:42 PM

There are several approaches to scoring multiple choice tests.
First, though, please clarify what you are asking.
Do you have 20 or 30 distinct test forms that have different answer keys?
Are there different multiple choice response formats, e.g., 3 values,
4 values, 5 values?
Is there partial credit for some response values?

How is the data entered: numbers 1 to 5, strings a to e, or some of
each kind of entry?

How do you want to treat non-response?

Do you want the resulting score to be the percent of attempted
questions that were correct, the number of items correct, or the
percent of all items correct whether or not they were attempted?

Do all of the items on a test measure a single construct, or does a
test yield several scores?
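
The three scoring conventions asked about above differ only in the denominator. A minimal Python sketch, offered as an illustration rather than as anyone's actual method: the key, the responses, and the "N" non-response code are all hypothetical here.

```python
# Hypothetical answer key and one examinee's responses.
# "N" marks a non-response; that code is an assumption for this sketch.
KEY = "ABCDA"
NO_RESPONSE = "N"

def score_summary(responses, key=KEY):
    """Return the three candidate scores for one examinee."""
    correct = sum(r == k for r, k in zip(responses, key))
    attempted = sum(r != NO_RESPONSE for r in responses)
    return {
        "number_correct": correct,
        # Guard against an examinee who attempted nothing.
        "pct_of_attempted": 100.0 * correct / attempted if attempted else None,
        "pct_of_all_items": 100.0 * correct / len(key),
    }

summary = score_summary("ABNDC")
print(summary)
```

With one skipped item, the percent of attempted items is higher than the percent of all items, which is why the choice needs to be made before scoring.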

Art Kendall
Social Research Consultants

Anatoliy Russ

Mar 6, 2012, 2:08:36 PM

to A...@drkendall.org

I'll try to answer as completely as I can.

There are 20-30 distinct tests, each with a unique answer key.
The tests vary but for the most part are multiple choice A,B,C,D questions where no response gets tagged with an "N" and is in essence treated as an answer choice.
The questions are scored as being correct or incorrect, so no partial credit is given.
The score will be calculated as percent correct of all items, regardless of whether or not they were attempted.
The test measures a single construct.

I know how to convert the test responses so that they are assigned either a correct (1) or an incorrect (0), but I was hoping to go deeper than that. My hope was that if I used the actual responses I would be able to see the effectiveness of the distractors for each question by looking at their point-biserial correlations and P-values.

Art Kendall

Mar 6, 2012, 3:54:39 PM


Assuming that you created a new set of dichotomous variables scored for
correctness, one dichotomy per input item, you would still have the
original variables.

A point-biserial is a short-cut calculation of a Pearson correlation
when working by hand.
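
To see that equivalence numerically, here is a minimal Python sketch. The 0/1 item vector and the total scores are made-up illustration data, and the function names are hypothetical. It compares the textbook hand formula, (M1 - M0) / s * sqrt(p * q) with s the population standard deviation, against an ordinary Pearson correlation.

```python
import math
import statistics

def pearson(x, y):
    """Ordinary Pearson correlation computed from sums of squares."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(item, total):
    """Hand formula: (M1 - M0) / s * sqrt(p * q), s = population SD."""
    ones = [t for i, t in zip(item, total) if i == 1]
    zeros = [t for i, t in zip(item, total) if i == 0]
    p = len(ones) / len(item)
    sd = statistics.pstdev(total)  # divides by n, not n - 1
    return (statistics.mean(ones) - statistics.mean(zeros)) / sd * math.sqrt(p * (1 - p))

# Made-up data: 1 = chose the option, 0 = did not; totals are test scores.
item = [1, 0, 1, 1, 0, 1]
total = [27, 20, 30, 25, 22, 28]

r_pearson = pearson(item, total)
r_pb = point_biserial(item, total)
print(r_pearson, r_pb)  # the two values agree up to rounding error
```

So any routine that produces Pearson correlations on a 0/1 variable is already producing point-biserials.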

Use the extension command SPSSINC CREATE DUMMIES to create dichotomous
variables representing all of the values for the input items,
maybe something like Item01Ans1 Item01Ans2 ... This set would contain
(# of items * # of response codes) dichotomous variables.
Then
CORRELATIONS Item01Ans1 ... WITH somevar.

I suggest that you post to SPSSX-L asking whether anyone has adapted
SPSSINC CREATE DUMMIES to create dummies from all the items in a scale,
given that they all have the same response code values.

Perhaps you could create (or commission) Python code to adapt the names
of the variables in the large set that represent correct answers to
improve readability of the output.

Since the proportions endorsing options are going to influence the
magnitude of possible correlations, take the correlations and p-values
with a large grain of salt. Check the SPSSX-L archives for "item
response theory" and "Rasch Model" since procedures for these have been
discussed there over the last few years.

Art Kendall
Social Research Consultants

Anatoliy Russ

Mar 6, 2012, 5:32:04 PM

to A...@drkendall.org

Thanks for the reply, but that was what I was hoping to avoid. The document I originally posted led me to believe I could do this type of analysis easily right out of the box.

I will see what I can do with CREATE DUMMIES, but I think this is a lost cause for me.

Thank you for the info anyway.

David Marso

Mar 6, 2012, 7:47:30 PM

It *is* an utterly trivial, straightforward task if you do something like the following ;-):

DATA LIST / Q1 TO Q20 (20A1).
BEGIN DATA
ABBACDCNCBDCABBANCBA
DCACBBCADNBCACBCABAC
ABCBDCABBCADNACBCABA
ABCBACBCADACBCADBBCA
END DATA.

STRING #KEY (A20).
COMPUTE #KEY="ABCBACBCADACBCADBBCA".
DO REPEAT Q=Q1 TO Q20 /X=1 TO 20.
+ COMPUTE SCORE=SUM(SCORE,Q EQ CHAR.SUBSTR(#KEY,X,1)).
END REPEAT.
LIST.
--
Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 SCORE
A B B A C D C N C B D C A B B A N C B A 4.00
D C A C B B C A D N B C A C B C A B A C 3.00
A B C B D C A B B C A D N A C B C A B A 7.00
A B C B A C B C A D A C B C A D B B C A 20.00

Number of cases read: 4 Number of cases listed: 4
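
For readers who want to check the logic outside SPSS, the same position-by-position comparison can be sketched in Python. This is only an illustration of the idea, not a replacement for the syntax above; the key and the four response strings are copied from the post, and the function name is hypothetical.

```python
# Answer key and response strings copied from the post above.
KEY = "ABCBACBCADACBCADBBCA"
RESPONSES = [
    "ABBACDCNCBDCABBANCBA",
    "DCACBBCADNBCACBCABAC",
    "ABCBDCABBCADNACBCABA",
    "ABCBACBCADACBCADBBCA",
]

def score(responses, key):
    """Count positions where the response matches the key.

    An "N" (no response) never matches a key letter, so it simply
    counts as wrong, as in the SPSS version.
    """
    return sum(r == k for r, k in zip(responses, key))

scores = [score(row, KEY) for row in RESPONSES]
print(scores)  # [4, 3, 7, 20], matching the SCORE column in the listing
```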

David Marso

Mar 6, 2012, 7:54:03 PM

And the following (assuming the same test data as previous):

STRING #KEY (A20).
COMPUTE #KEY="ABCBACBCADACBCADBBCA".

DO REPEAT Q=Q1 TO Q20 /D=D1 TO D20/X=1 TO 20.
COMPUTE D=Q EQ SUBSTR(#KEY,X,1).
END REPEAT.
COMPUTE SCORE=SUM(D1 TO D20).
LIST.

d
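
Keeping one 0/1 variable per item, as in this second version, is what makes the follow-up statistics from the original question possible. A rough Python sketch under stated assumptions: the data are the four test rows from the earlier post, the function names are hypothetical, and Cronbach's alpha is computed with the standard formula k/(k-1) * (1 - sum of item variances / variance of total score). With only four cases the numbers themselves mean nothing; the point is the mechanics.

```python
import statistics

KEY = "ABCBACBCADACBCADBBCA"
RESPONSES = [
    "ABBACDCNCBDCABBANCBA",
    "DCACBBCADNBCACBCABAC",
    "ABCBDCABBCADNACBCABA",
    "ABCBACBCADACBCADBBCA",
]

def item_matrix(rows, key):
    """One 0/1 entry per examinee per item (the D1..D20 variables)."""
    return [[int(r == k) for r, k in zip(row, key)] for row in rows]

def cronbach_alpha(matrix):
    """Standard alpha formula using sample (n - 1) variances throughout."""
    k = len(matrix[0])
    item_vars = [statistics.variance(col) for col in zip(*matrix)]
    total_var = statistics.variance([sum(row) for row in matrix])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

D = item_matrix(RESPONSES, KEY)
scores = [sum(row) for row in D]                         # same as SCORE
difficulty = [statistics.mean(col) for col in zip(*D)]   # proportion correct per item
alpha = cronbach_alpha(D)
print(scores, difficulty[0], alpha)
```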

Art Kendall

Mar 7, 2012, 8:51:05 AM

Those are great ways to do the first part of what the OP said was
needed. The OP also wanted correlations of dummies for each response
category with the scale score.

The second part would be to create a series of dichotomies for each item
in order to look at the quality of the distractors. There would be as
many dichotomous variables as there were values in the response scale
plus dichotomies for each kind of missing value.

I suggested that the OP check the archives to see if anyone had
generalized SPSSINC CREATE DUMMIES for such a task. Using Python (which
I am ashamed to say I still have not learned) it should be possible to
create readable variable names by combining the item variable name,
the response scale value, and a letter to indicate "right" or
"wrong", with "internal skip" or "end skip" as extra response values.

That would create new variables like
Item01_A_W Item01_B_R Item01_C_W Item01_D_W Item01_I_W Item01_E_W
Item02_A_R Item02_B_W Item02_C_W Item02_D_W Item02_I_W Item02_E_W

I am still catching up with stuff from being on the road for a month so
I have not had much time to think about it. But without the item-scoring
part of the new variable name it sounds like it should be doable with a
macro or writing a snippet of syntax to generate syntax.
Item01_A Item01_B Item01_C Item01_D Item01_I Item01_E

The OP wanted correlations of dummies for each response category with
the scale score which would then be straightforward.
CORRELATIONS Item01_A_W TO Item60_E_W WITH score.
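
A minimal Python sketch of that naming scheme, for illustration only. It assumes the test data and key from David Marso's earlier post, uses the single "N" non-response code from that data instead of separate internal-skip and end-skip codes, and all function names are hypothetical. Options that nobody chose (or everybody chose) have no variance, so their correlation is reported as None.

```python
import math

KEY = "ABCBACBCADACBCADBBCA"
OPTIONS = "ABCDN"  # "N" = no response, as in the thread's test data
RESPONSES = [
    "ABBACDCNCBDCABBANCBA",
    "DCACBBCADNBCACBCABAC",
    "ABCBDCABBCADNACBCABA",
    "ABCBACBCADACBCADBBCA",
]

def dummy_name(item_no, option, key):
    """Build names like Item01_A_R: item number, option, R(ight)/W(rong)."""
    flag = "R" if key[item_no - 1] == option else "W"
    return f"Item{item_no:02d}_{option}_{flag}"

def make_dummies(rows, key, options=OPTIONS):
    """Map each dummy name to a 0/1 list with one entry per examinee."""
    dummies = {}
    for i in range(1, len(key) + 1):
        for opt in options:
            dummies[dummy_name(i, opt, key)] = [int(row[i - 1] == opt) for row in rows]
    return dummies

def pearson(x, y):
    """Pearson correlation, or None when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

scores = [sum(r == k for r, k in zip(row, KEY)) for row in RESPONSES]
dummies = make_dummies(RESPONSES, KEY)
correlations = {name: pearson(values, scores) for name, values in dummies.items()}

for name in ("Item01_A_R", "Item01_B_W", "Item01_D_W"):
    print(name, dummies[name], correlations[name])
```

With four cases these correlations are meaningless as statistics; the sketch only shows how the names and dummy values line up.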

Art Kendall
Social Research Consultants

Anatoliy Russ

Mar 7, 2012, 5:03:25 PM

This is great, I'll play around with it to do what I need. I think I won't have a problem with the scoring portion of the test now. Yay!

David Marso

Mar 7, 2012, 6:22:50 PM

to A...@drkendall.org

'We donta need no stinkin Python' ;-)
---

DEFINE Dummies ( VRoot !TOKENS(1) / VINDEX !TOKENS(1)/VAR !TOKENS(1)) .
!LET !VLIST=""
!DO !A=1 !TO 5
!LET !VNAME=!CONCAT(!VROOT,"_", !SUBSTR(!EVAL(!ANS),!A,1) ,"_" )
!IF (!SUBSTR(!EVAL(!KEY),!VINDEX, 1) !EQ !SUBSTR(!EVAL(!ANS),!A,1) ) !THEN
!LET !VNAME=!CONCAT(!VNAME,"C")
!ELSE
!LET !VNAME=!CONCAT(!VNAME,"W")
!IFEND
!LET !VLIST=!CONCAT(!VLIST, !VNAME, " ")
!DOEND
!LET !FIRST=!HEAD (!VLIST)
!LET !LAST = !VNAME
NUMERIC !VLIST (F1).
VECTOR !VROOT = !FIRST TO !LAST .
RECODE !VLIST (ELSE=0).
COMPUTE !VROOT(INDEX(!QUOTE(!EVAL(!ANS)),!Var, 1))=1.
!ENDDEFINE.

DEFINE !KEY() ABCBACBCADACBCADBBCA !ENDDEFINE .
DEFINE !ANS() ABCDN !ENDDEFINE .

DEFINE LOOPIT ()
!DO !I = 1 !TO 20
Dummies VRoot !CONCAT(X,!I) VIndex !I VAR !CONCAT(Q,!I).
!DOEND
!ENDDEFINE.

DATA LIST / Q1 TO Q20 (20A1).
BEGIN DATA
ABBACDCNCBDCABBANCBA
DCACBBCADNBCACBCABAC
ABCBDCABBCADNACBCABA
ABCBACBCADACBCADBBCA
END DATA.

LOOPIT .

<SNIP>

Anatoliy Russ

Mar 8, 2012, 1:47:44 PM

to A...@drkendall.org

Since you've been so helpful :) could you point me toward a way to produce a report like one I found? The sample looks like this:

TOTAL *RR17

TOTAL
---------------------------------
| RR17 | % of Total N | Mean |
---------------------------------
| 1 | 15.1% | 26.293|
| 2 | 3.3% | 25.667|
| 3* | 69.7% | 27.905|
| 4 | 11.1% | 25.233|
| 9 | .7% | 2.500|
| Total | 100.0% | 27.103|
---------------------------------

Where A=1, B=2, ..., 9=Missing
The column "% of Total N" is the percentage of the sample choosing each alternative, and "Mean" is the average total test score for students who chose each alternative.
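
The two columns described above can be sketched in a few lines of Python, purely to pin down what is being computed. The assumptions: the data are the four letter-coded test rows and the key from earlier in the thread (not the numeric 1-4/9 coding of the sample report), the function name is hypothetical, and an asterisk marks the keyed option.

```python
from collections import defaultdict

KEY = "ABCBACBCADACBCADBBCA"
RESPONSES = [
    "ABBACDCNCBDCABBANCBA",
    "DCACBBCADNBCACBCABAC",
    "ABCBDCABBCADNACBCABA",
    "ABCBACBCADACBCADBBCA",
]

def option_report(rows, key, item_no):
    """For one item: percent choosing each option and their mean total score."""
    scores = [sum(r == k for r, k in zip(row, key)) for row in rows]
    by_option = defaultdict(list)
    for row, total in zip(rows, scores):
        by_option[row[item_no - 1]].append(total)
    n = len(rows)
    report = {}
    for opt in sorted(by_option):
        totals = by_option[opt]
        label = opt + ("*" if opt == key[item_no - 1] else "")
        report[label] = (100.0 * len(totals) / n, sum(totals) / len(totals))
    report["Total"] = (100.0, sum(scores) / n)
    return report

report = option_report(RESPONSES, KEY, item_no=1)
for label, (pct, mean_total) in report.items():
    print(f"{label:<6} {pct:6.1f}% {mean_total:8.3f}")
```

A well-behaved item would show a higher mean total score for the starred option than for the distractors.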

David Marso

Mar 9, 2012, 7:48:54 AM

to A...@drkendall.org

You have a new question. Please repost as a new topic/thread.
Meanwhile take a look at the AGGREGATE command.
--

Art Kendall

Mar 10, 2012, 12:30:49 PM

I have put this in a new thread because as David pointed out this is a
new topic. Netiquette is to start a new thread.

The file "final" has the info.

See the syntax below the sig block.

Alternative approaches might be to use FREQUENCIES and MEANS and then
OMS or to use CTABLES.

Art Kendall
Social Research Consultants

*VARSTOCASES was losing files so they were copied to temp1 and temp2.
*.
*this is still a KLUDGE.
*.
*It has not yet been re-drafted for efficiency.
*It has not yet been re-drafted for readability.
* The macros and test data were from a post by David Marso.
*.
DEFINE DUMCODES (QX !TOKENS(1)/VR !TOKENS(1)/MAX !TOKENS(1)/ ANS
!TOKENS(1)/ KEY !TOKENS(1) )
NUMERIC !CONCAT(SUM_,!QX)(F8.0).
!DO !I = 1 !TO !MAX
!LET !VLIST=""
!DO !A=1 !TO !LENGTH(!ANS)
!LET !VNAME=!CONCAT(!VR,!I,"_",!SUBSTR(!ANS,!A,1) ,"_" )
!IF (!SUBSTR(!KEY,!I,1)=!SUBSTR(!ANS,!A,1)) !THEN !LET !VN="C"
!LET !CINDEX=!A !ELSE !LET !VN="W" !IFEND
!LET !VNAME=!CONCAT(!VNAME,!VN) !LET !VLIST=!CONCAT(!VLIST,!VNAME," ")
!DOEND
!LET !FIRST=!HEAD(!VLIST) !LET !LAST=!VNAME
+ NUMERIC !VLIST (F1).
+ RECODE !VLIST (ELSE=0).
+ VECTOR !CONCAT(!VR,!I)=!FIRST TO !LAST .
+ COMPUTE !CONCAT(!VR,!I)(INDEX(!QUOTE(!ANS),!CONCAT(!QX,!I), 1))=1.
+ COMPUTE !CONCAT(SUM_,!QX)=SUM(!CONCAT(SUM_,!QX),!CONCAT(!VR,!I)(!CINDEX)).

!DOEND
!ENDDEFINE.

DATA LIST / Q1 TO Q20 (20A1).
BEGIN DATA
ABBACDCNCBDCABBANCBA
DCACBBCADNBCACBCABAC
ABCBDCABBCADNACBCABA
CBDCABBANCBAABBACDCN
BCACBCABACDCACBBCADN
BCADNACBCABAABCBDCAB
ABCBACBCADACBCADBBCA
AAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBBB
ABCBACBCADACBCADBBCA
END DATA.
dataset name rawfile.
EXECUTE.
dataset copy temp1.
dataset activate temp1.
SET MPRINT ON.
DUMCODES QX Q VR X MAX 20 ANS ABCDN KEY ABCBACBCADACBCADBBCA .
* new part.
dataset name widefile.
dataset copy temp2.
dataset activate temp2.
VARSTOCASES
/ID=id
/MAKE dummy FROM X1_A_C to X20_N_W
/INDEX=Index1(dummy)
/KEEP=SUM_Q
/NULL=KEEP.
DATASET NAME longfile.
numeric question (n2).
string answer (a5)correctwrong(a1).
do if substr(index1,3,1) eq '_'.
compute question = number(substr (index1,2,1),n2).
compute answer = substr(index1,4,1).
compute correctwrong =substr(index1,6,1).
ELSE.
compute question = number(substr (index1,2,2),n2).
compute answer = substr(index1,5,1).
compute correctwrong =substr(index1,7,1) .
end if.
string star(a1).
recode correctwrong ('W'= ' ')('C'='*')into star.
COMPUTE SUM2_Q = SUM_q* DUMMY.
MISSING VALUES SUM2_Q (0).
dataset declare HITS.
aggregate outfile = HITS
/break = question answer
/star = first(star)
/pct_hits = pin(dummy,1,1)
/hit_score = mean(sum2_q).
dataset activate HITS.
formats pct_hits (pct6.1) hit_score(f7.3) .
dataset activate longfile.
dataset declare totals.
temporary.
compute answer='total'.
aggregate outfile = totals
/break = question
/answer = first(answer)
/pct_hits = pin(dummy,0,1)
/hit_score = mean(sum2_q).
dataset activate totals.
formats pct_hits (pct6.1) hit_score(f7.3) .
dataset declare final.
add files file=hits /file= totals.
dataset name final.
sort cases by question answer.


David Marso

Mar 11, 2012, 12:27:46 AM

to A...@drkendall.org

This should do the same thing. I deleted a previous posting; this one supersedes it because of a correction to the order of the INDEXes in C2V and the removal of some unnecessary logic.
--
DEFINE DUMCODEX
(QX !TOKENS(1)/ VR !TOKENS(1)/MAX !TOKENS(1)/ANS !TOKENS(1)/KEY !TOKENS(1) )

!DO !I = 1 !TO !MAX

!LET !VLIST="" !LET !VNI=!CONCAT(!VR,!I,"_")
!DO !A=1 !TO !LENGTH(!ANS)
!IF (!SUBSTR(!KEY,!I,1)=!SUBSTR(!ANS,!A,1)) !THEN !LET !VN="C" !LET !CI=!A !ELSE !LET !VN="W" !IFEND
!LET !VLIST=!CONCAT(!VLIST," ",!VNI,!SUBSTR(!ANS,!A,1),"_",!VN )
!DOEND
!IF (!I !EQ 1) !THEN !LET !FIRSTV=!HEAD(!VLIST) !IFEND
+ NUMERIC !CONCAT(SUM_,!QX)(F8.0).

+ NUMERIC !VLIST (F1).
+ RECODE !VLIST (ELSE=0).

+ VECTOR !CONCAT(!VR,!I)=!HEAD(!VLIST) TO !CONCAT(!VNI,!SUBSTR(!ANS,!A,1),"_",!VN ).
+ COMPUTE !CONCAT(!VR,!I)(INDEX(!QUOTE(!ANS),!CONCAT(!QX,!I), 1))=1.
+ COMPUTE !CONCAT(SUM_,!QX) = SUM(!CONCAT(SUM_,!QX),!CONCAT(!VR,!I)(!CI)).
!DOEND
VARSTOCASES
/ID = id1
/MAKE Q FROM !FIRSTV TO !CONCAT(!VNI,!SUBSTR(!ANS,!A,1),"_",!VN )
/INDEX=QuestNo(!MAX) QOptions(!LENGTH(!ANS))
/KEEP Sum_q.
AGGREGATE OUTFILE * /BREAK QuestNo QOptions Q /M_Q=MEAN(Sum_Q)/N=N.
COMPUTE CORRECT=(SUBSTR(!QUOTE(!ANS),QOptions,1) EQ SUBSTR(!QUOTE(!KEY) ,QuestNo,1)) AND Q.
IF Q=1 PCT_1=N/(N+LAG(N)).
EXE.
SELECT IF Q.
EXE.
!ENDDEFINE.
***********************************.
INPUT PROGRAM.
+ NUMERIC ID (F8).
+ STRING Q1 TO Q20 (A1).
+ LOOP ID=1 TO 10000.
+ DO REPEAT Q=Q1 TO Q20.
+ COMPUTE Q=SUBSTR("ABCDN",TRUNC(UNIFORM(5))+1,1).
+ END REPEAT.
+ END CASE.
+ END LOOP.
+ END FILE.
END INPUT PROGRAM.
EXE.
DUMCODEX QX Q VR X MAX 20 ANS ABCDN KEY ABCBACBCADACBCADBBCA .