
Oct 26, 2011, 12:38:24 PM10/26/11

to Stanford AI Class

Hey all,

I really enjoyed the cancer problem from this week's class. I

discussed it with a few friends of mine, none of whom found the puzzle

as interesting as I did and all of whom struggled to follow my

explanation of the solution. I ended up drafting an email

walkthrough, and I thought it'd be interesting to share it here.

Enjoy!

-------------

First, the puzzle (in case anyone here didn't read it):

There's a 1% chance you have cancer. You are given a test with the

following limitations:

* If you HAVE the cancer, it is 10% likely to return a false negative,

* If you DO NOT HAVE the cancer, it is 20% likely to return a false

positive

The test is administered twice. Both results are positive. What is

the likelihood that the tests are accurate?

-------------

Let’s start with the base information. We’re given three data points,

which I expand into six:

1% chance of having the cancer

99% chance of not having the cancer

90% chance of a positive result on a cancer-positive patient

10% chance of a negative result on a cancer-positive patient

20% chance of a positive result on a cancer-negative patient

80% chance of a negative result on a cancer-negative patient

We’re given two positive results.

Now we look at Test One’s outcome by itself for a moment. Here’s the

full breakdown:

Cancer/Test

• Positive/Positive = 1% * 90% = 0.9%

• Positive/Negative = 1% * 10% = 0.1%

• Negative/Positive = 99% * 20% = 19.8%

• Negative/Negative = 99% * 80% = 79.2%

Overall, that translates to a 20.7% chance of a positive result and a

79.3% chance of a negative result. We’re only interested in positive

results, so we throw out the negative ones leaving us with 20.7% (only

0.9% of which are accurate). So the likelihood of an accurate

positive outcome after one test is 9/207, which can be reduced to

1/23. Roughly 4.35%.
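The single-test arithmetic above can be checked with a few lines of Python. This is just a sketch of the calculation; the variable names are mine, not anything from the class:

```python
# Numbers straight from the puzzle
p_cancer = 0.01
p_pos_given_cancer = 0.90   # true positive rate (1 - 10% false negative)
p_pos_given_healthy = 0.20  # false positive rate

# Joint probabilities for the two ways to see a positive result
true_pos = p_cancer * p_pos_given_cancer          # 0.9% of all patients
false_pos = (1 - p_cancer) * p_pos_given_healthy  # 19.8% of all patients

# Bayes: P(cancer | one positive result) = 0.9 / 20.7 = 1/23
p_cancer_given_pos = true_pos / (true_pos + false_pos)
print(p_cancer_given_pos)  # ≈ 0.04348
```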

This is the point at which I made my mistake. Once I had the 1/23 I

looked at it and said, “okay, now I just need to do it again.” I know

I’m getting a positive result in Test Two, and I know it’s a 1/23 of

being “successful”, so I look at the likelihood of “failure” (22/23)

and square it:

22/23 * 22/23 = 484/529

If 484/529 is the chance of getting two false positives, then 45/529

is the chance that at least one of the tests was accurate. Which sort

of makes sense, right? The percentage is going from 4.35% to 8.51%,

which seems reasonable. Except for one problem.
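For the record, here is that flawed "square the failure rate" calculation sketched in Python. It treats the two positive results as independent 1-in-23 draws, which is exactly the mistake explained next:

```python
from fractions import Fraction

# Wrong approach: treat each positive as an independent 1-in-23 event
p_accurate = Fraction(1, 23)
p_inaccurate = 1 - p_accurate          # 22/23

both_inaccurate = p_inaccurate ** 2    # 484/529
at_least_one_accurate = 1 - both_inaccurate
print(at_least_one_accurate, float(at_least_one_accurate))  # 45/529 ≈ 0.0851
```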

You either have the cancer or you don’t; that condition can’t change

from one test to the next. So while we have four possibilities for

the first “roll” (the Patient/Test chart above), the second test is

limited by the condition(s) we picked for the first test. This gives

us three pieces of information at the start of Test Two:

• The positive result from Test One is accurate in 1 case and

inaccurate in 22,

• The result from Test Two will be positive,

• Regardless of the test results, we can only be in one state at a

time - actually positive or actually negative (unless you’re

Schrödinger’s cancer patient). That is, if you got a false positive in

Test One then you can’t get a true positive in Test Two, and vice

versa.

That simplifies the hell out of Test Two (particularly since we get to

throw out all the negative results). Here it is:

Test One was positive (1 in 23): Test Two is 90% likely to return a

positive result [1 * 0.9 = 0.9]

Test One was negative (22 in 23): Test Two is 20% likely to return a

positive result [22 * 0.2 = 4.4]

Plugging those in we get odds of 9:44, i.e. 9/(9 + 44) = 9/53. The
likelihood of cancer has jumped from 4.35% to 16.98%.
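The corrected sequential reasoning can be sketched the same way: run one Bayes update per positive result, feeding the 1/23 posterior from Test One back in as the prior for Test Two. The function name and defaults are mine:

```python
from fractions import Fraction

def update_on_positive(prior,
                       p_pos_if_cancer=Fraction(9, 10),
                       p_pos_if_healthy=Fraction(1, 5)):
    """Bayes update of P(cancer) after observing one positive test."""
    num = prior * p_pos_if_cancer
    return num / (num + (1 - prior) * p_pos_if_healthy)

prior = Fraction(1, 100)                      # 1% base rate
after_one = update_on_positive(prior)         # 1/23
after_two = update_on_positive(after_one)     # 9/53
print(after_one, after_two, float(after_two)) # 1/23 9/53 ≈ 0.1698
```

Using exact fractions makes the 1/23 and 9/53 from the walkthrough fall out directly instead of hiding behind rounded decimals.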


Oct 26, 2011, 1:25:00 PM10/26/11

to Stanford AI Class

Minor correction to that final bit:

Test One was a true positive (1 in 23): Test Two is 90% likely to
return a positive result [1 * 0.9 = 0.9]

Test One was a false positive (22 in 23): Test Two is 20% likely to
return a positive result [22 * 0.2 = 4.4]

Oct 26, 2011, 1:38:36 PM10/26/11

to stanford...@googlegroups.com

Thank you, I missed that one and now I understand.

--

Camilo Cervantes S.

Systems Engineering student

Universidad de Córdoba

Grupo de desarrollo 64 bits

www.group64bits.co.cc

313 561 0849 - 300 213 6906

2011/10/26 Nathan Foley <nrf...@gmail.com>


