Gus Gassmann writes:
The double "money" was an obvious typo. It should have been "naughty".
> On May 8, 12:00 pm, FFMG wrote:
> > On Tuesday, 8 May 2012 16:41:18 UTC+2, Jussi Piitulainen wrote:
> > > FFMG writes:
> > > > > You have made an incorrect independence assumption. As both
> > > > Sorry, that's not an assumption, that's the way the problem
> > > > And they are independent variables, the presence of "naughty"
> > > > The formula is P(C|F1...Fn) = P(C)P(F1|C)...P(Fn|C)
> > > > So, given the problem in my original post, the result is not
> > > Probability theory only gives you
> > > Then come the independence assumptions which allow you to expand
> > > If "naughty" and "money" were exactly independent and
> > > Since we don't want to accept 1/2 = 1 and we think that relative
> > So, if I understand you correctly the 2 issues at hand are:
> Neither. There is no requirement on the number of documents, and the
The independence assumptions are an essential part of the Naive Bayes
I don't have any personal experience with such methods, however. My
That wouldn't be Naive. I don't see what to do without Bayes either,
> > So, as the formula seems to be correct in my example, I guess my
> > question would be, is there any way of bringing the number back
> > between 0 and 1? Or can I simply assume that anything > 1 is in
> > fact 1 (or almost 1)?
> As Jussi explained, you have to use the data correctly, and you got
is that just me? And what if there are no instances of "naughty" AND
"money" when assigning the probabilities?
> > Following on to that, I also see many examples where the
> > denominator can be ignored as it can be regarded as constant. But
> > then how can I calculate how close the probability of a number is
> > to 1? (because without a denominator I have no idea how close the
> > probability of a document is to be 'spam').
> You'll have to explain better what you mean by this. As is, it make
The denominator does not matter when one is comparing alternatives
that have the same denominator. When one says that the posterior is
proportional to the prior and the likelihood, one thinks of that
denominator as an uninteresting proportionality constant.
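
To make the role of that constant concrete, here is a minimal sketch in
Python. It assumes two classes and entirely made-up numbers: the priors,
the per-word likelihoods, and the function name naive_bayes_posteriors
are illustrative and not taken from this thread. The products
P(C)P(F1|C)...P(Fn|C) are only scores; dividing each by their sum over
all classes is what yields values between 0 and 1.

```python
from math import prod


def naive_bayes_posteriors(priors, likelihoods, features):
    """Return P(class | features) for each class under the Naive Bayes assumption.

    priors:      dict mapping class -> P(class)
    likelihoods: dict mapping class -> dict mapping feature -> P(feature | class)
    features:    the features observed in the document
    """
    # Unnormalized score per class: P(C) * P(F1|C) * ... * P(Fn|C).
    scores = {
        c: priors[c] * prod(likelihoods[c][f] for f in features)
        for c in priors
    }
    # The denominator is the sum of the scores over all classes.
    evidence = sum(scores.values())
    if evidence == 0:
        raise ValueError("all class scores are zero; cannot normalize")
    return {c: s / evidence for c, s in scores.items()}


# Hypothetical numbers, chosen only for illustration.
priors = {"spam": 0.5, "ham": 0.5}
likelihoods = {
    "spam": {"naughty": 0.8, "money": 0.6},
    "ham": {"naughty": 0.1, "money": 0.2},
}
posteriors = naive_bayes_posteriors(priors, likelihoods, ["naughty", "money"])
print(posteriors)
```

With these numbers the scores are 0.24 for "spam" and 0.01 for "ham".
Comparing them already picks "spam", and dividing by their sum, 0.25,
gives roughly 0.96 and 0.04, which is the kind of "how close to 1"
figure asked about above.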
Perhaps it's so that an actual probability P("spam" | data) alone