
Oct 26, 2011, 12:38:24 PM10/26/11

to Stanford AI Class

Hey all,

I really enjoyed the cancer problem from this week's class. I

discussed it with a few friends of mine, none of whom found the puzzle

as interesting as I did and all of whom struggled to follow my

explanation of the solution. I ended up drafting an email

walkthrough, and I thought it'd be interesting to share it here.

Enjoy!

-------------

First, the puzzle (in case anyone here didn't read it):

There's a 1% chance you have cancer. You are given a test with the

following limitations:

* If you HAVE the cancer, it is 10% likely to return a false negative,

* If you DO NOT HAVE the cancer, it is 20% likely to return a false

positive

The test is administered twice. Both results are positive. What is

the likelihood that the tests are accurate?

-------------

Let’s start with the base information. We’re given three data points,

which I expand into six:

1% chance of having the cancer

99% chance of not having the cancer

90% chance of a positive result on a cancer-positive patient

10% chance of a negative result on a cancer-positive patient

20% chance of a positive result on a cancer-negative patient

80% chance of a negative result on a cancer-negative patient

We’re given two positive results.

Now we look at Test One’s outcome by itself for a moment. Here’s the

full breakdown:

Cancer/Test

• Positive/Positive = 1% * 90% = 0.9%

• Positive/Negative = 1% * 10% = 0.1%

• Negative/Positive = 99% * 20% = 19.8%

• Negative/Negative = 99% * 80% = 79.2%

Overall, that translates to a 20.7% chance of a positive result and a

79.3% chance of a negative result. We’re only interested in positive

results, so we throw out the negative ones leaving us with 20.7% (only

0.9% of which are accurate). So the likelihood of an accurate

positive outcome after one test is 9/207, which can be reduced to

1/23. Roughly 4.35%.
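The single-test arithmetic above can be checked with a few lines of Python. This is just a sketch of the calculation; the variable names are mine, not anything from the class:

```python
# Numbers straight from the puzzle
p_cancer = 0.01
p_pos_given_cancer = 0.90   # true positive rate (1 - 10% false negative)
p_pos_given_healthy = 0.20  # false positive rate

# Joint probabilities for the two ways to see a positive result
true_pos = p_cancer * p_pos_given_cancer          # 0.9% of all patients
false_pos = (1 - p_cancer) * p_pos_given_healthy  # 19.8% of all patients

# Bayes: P(cancer | one positive result) = 0.9 / 20.7 = 1/23
p_cancer_given_pos = true_pos / (true_pos + false_pos)
print(p_cancer_given_pos)  # ≈ 0.04348
```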

This is the point at which I made my mistake. Once I had the 1/23 I

looked at it and said, “okay, now I just need to do it again.” I know

I’m getting a positive result in Test Two, and I know it’s a 1/23 of

being “successful”, so I look at the likelihood of “failure” (22/23)

and square it:

22/23 * 22/23 = 484/529

If 484/529 is the chance of getting two false positives, then 45/529

is the chance that at least one of the tests was accurate. Which sort

of makes sense, right? The percentage is going from 4.35% to 8.51%,

which seems reasonable. Except for one problem.
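For the record, here is that flawed "square the failure rate" calculation sketched in Python. It treats the two positive results as independent 1-in-23 draws, which is exactly the mistake explained next:

```python
from fractions import Fraction

# Wrong approach: treat each positive as an independent 1-in-23 event
p_accurate = Fraction(1, 23)
p_inaccurate = 1 - p_accurate          # 22/23

both_inaccurate = p_inaccurate ** 2    # 484/529
at_least_one_accurate = 1 - both_inaccurate
print(at_least_one_accurate, float(at_least_one_accurate))  # 45/529 ≈ 0.0851
```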

You either have the cancer or you don’t; that condition can’t change

from one test to the next. So while we have four possibilities for

the first “roll” (the Patient/Test chart above), the second test is

limited by the condition(s) we picked for the first test. This gives

us three pieces of information at the start of Test Two:

• The positive result from Test One is accurate in 1 case and

inaccurate in 22,

• The result from Test Two will be positive,

• Regardless of the test results, we can only be in one state at a

time - actually positive or actually negative (unless you’re

Schrödinger’s cancer patient). That is, if you got a false positive in

Test One then you can’t get a true positive in Test Two, and vice

versa.

That simplifies the hell out of Test Two (particularly since we get to

throw out all the negative results). Here it is:

Test One was positive (1 in 23): Test Two is 90% likely to return a

positive result [1 * 0.9 = 0.9]

Test One was negative (22 in 23): Test Two is 20% likely to return a

positive result [22 * 0.2 = 4.4]

Plugging those in we get odds of 9:44, i.e. 9/(9 + 44) = 9/53. The
likelihood of cancer has jumped from 4.35% to 16.98%.
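The corrected sequential reasoning can be sketched the same way: run one Bayes update per positive result, feeding the 1/23 posterior from Test One back in as the prior for Test Two. The function name and defaults are mine:

```python
from fractions import Fraction

def update_on_positive(prior,
                       p_pos_if_cancer=Fraction(9, 10),
                       p_pos_if_healthy=Fraction(1, 5)):
    """Bayes update of P(cancer) after observing one positive test."""
    num = prior * p_pos_if_cancer
    return num / (num + (1 - prior) * p_pos_if_healthy)

prior = Fraction(1, 100)                      # 1% base rate
after_one = update_on_positive(prior)         # 1/23
after_two = update_on_positive(after_one)     # 9/53
print(after_one, after_two, float(after_two)) # 1/23 9/53 ≈ 0.1698
```

Using exact fractions makes the 1/23 and 9/53 from the walkthrough fall out directly instead of hiding behind rounded decimals.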


Oct 26, 2011, 1:25:00 PM10/26/11

to Stanford AI Class

Minor correction to that final bit:

Test One was a true positive (1 in 23): Test Two is 90% likely to
return a positive result [1 * 0.9 = 0.9]

Test One was a false positive (22 in 23): Test Two is 20% likely to
return a positive result [22 * 0.2 = 4.4]

Oct 26, 2011, 1:38:36 PM10/26/11

to stanford...@googlegroups.com

Thank you, I missed that one and now I understand.

--

Camilo Cervantes S.

Systems Engineering student

Universidad de Córdoba

Grupo de desarrollo 64 bits

www.group64bits.co.cc

313 561 0849 - 300 213 6906

2011/10/26 Nathan Foley <nrf...@gmail.com>


