Hey folks, I'm confused by a question in the quizzes for Unit 5 - Machine Learning. If anybody can help with this it would be great.
So for the straight application of Bayes' Theorem, we use
P(spam | "secret","is","secret") = P("secret","is","secret" | spam) * P(spam) / P("secret","is","secret")
In the explanation, he uses the law of total probability for the denominator and calculates this:
P("secret","is","secret")
= P("secret","is","secret" | spam) * P(spam) + P("secret","is","secret" | ham) * P(ham)
= P("secret" | spam)^2 * P("is" | spam) * P(spam) + P("secret" | ham)^2 * P("is" | ham) * P(ham)
= 3/9 * 3/9 * 1/9 * 3/8 + 1/15 * 1/15 * 1/15 * 5/8
= 1 / 216 + 1 / 5400
= 13 / 2700
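As a sanity check on the arithmetic above, here is a minimal Python sketch using exact fractions. The variable names are my own, and the probabilities are simply the values plugged in above, not recomputed from the messages.

```python
from fractions import Fraction as F

# Values taken from the calculation above (variable names are illustrative).
p_spam, p_ham = F(3, 8), F(5, 8)
p_secret_spam, p_is_spam = F(3, 9), F(1, 9)
p_secret_ham, p_is_ham = F(1, 15), F(1, 15)

# Each term is P(words | class) * P(class), with the words multiplied
# together separately within each class.
spam_term = p_secret_spam**2 * p_is_spam * p_spam
ham_term = p_secret_ham**2 * p_is_ham * p_ham
total = spam_term + ham_term

print(spam_term, ham_term, total)  # 1/216 1/5400 13/2700
```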
However, I thought it would be easier to calculate P("secret","is","secret") directly, by counting how many words are in the dictionary in total, and how many of those are instances of the specified word:
P("secret","is","secret")
= P("secret")^2 * P("is")
= 4/24 * 4/24 * 2/24
= 1 / 432
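The direct version can be checked the same way. This is only a sketch of my own attempt: the counts 4/24 and 2/24 are the ones I used above, the 13/2700 is the figure from the explanation, and the names are made up for illustration.

```python
from fractions import Fraction as F

# Word frequencies pooled over all messages, as in my attempt above.
p_secret, p_is = F(4, 24), F(2, 24)
direct = p_secret**2 * p_is

# Result from the explanation's total-probability calculation.
total_probability = F(13, 2700)

print(direct)                       # 1/432
print(direct == total_probability)  # False
```

So the arithmetic in both versions checks out, and the two approaches really do give different numbers.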
I can't see why my calculation is incorrect. Any ideas?
Richard
--
Richard Neilsen
richard...@gmail.com
"If we do not steer, we run the danger of ending up where we are going."
-- Eliezer Yudkowsky