NLTK naive bayes raw probabilities instead of labels

J

Nov 28, 2013, 5:50:06 PM

to nltk-...@googlegroups.com

Hi,

For the NLTK naive bayes I was wondering if there was a way to return the raw probabilities (and not just the label)?

Alex Rudnick

Nov 28, 2013, 7:23:26 PM

to nltk-...@googlegroups.com

Yes!

The function you want there is prob_classify instead of just classify.
It returns a distribution over the different possible labels.

--
-- alexr

J

Nov 28, 2013, 7:49:05 PM

to nltk-...@googlegroups.com

Hm, it's just giving me:

<ProbDist with 2 samples>

Alex Rudnick

Nov 28, 2013, 8:01:07 PM

to nltk-...@googlegroups.com

Right! That's a ProbDist object. It's basically a mapping from labels
to probabilities.

You can ask it what the possible labels are with .samples() and get
the probability for a given label with .prob(). For example, consider
this two-way classification task:

>>> import nltk
>>> examples = [({"blue":True, "red":False},"BlueOne"), ({"blue":False,"red":True},"RedOne")]
>>> classifier = nltk.classify.naivebayes.NaiveBayesClassifier.train(examples)
>>> classifier.prob_classify({"blue":True,"red":False})
<ProbDist with 2 samples>
>>> dist = classifier.prob_classify({"blue":True,"red":False})
>>> list(dist.samples())
['BlueOne', 'RedOne']
>>> dist.prob("BlueOne")
0.9
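
To see every label's probability at once instead of querying one label at a time, the two calls above can be wrapped in a small helper. This is only a minimal sketch: `label_probabilities` is a hypothetical name, not part of NLTK, and `StandInDist` is a stand-in that mimics just the `.samples()` and `.prob()` methods so the snippet runs without a trained classifier.

```python
def label_probabilities(dist):
    """Return {label: probability} for any object exposing the
    .samples() and .prob() methods described above (hypothetical helper)."""
    return {label: dist.prob(label) for label in dist.samples()}


class StandInDist:
    """Stand-in for the object returned by prob_classify.
    Assumption: only .samples() and .prob() are needed here."""

    def __init__(self, probs):
        self._probs = dict(probs)

    def samples(self):
        return self._probs.keys()

    def prob(self, label):
        return self._probs.get(label, 0.0)


# Values taken from the two-way example above.
dist = StandInDist({"BlueOne": 0.9, "RedOne": 0.1})
probs = label_probabilities(dist)
total = sum(probs.values())
print(probs)
print(total)
```

With a real classifier, passing the result of `classifier.prob_classify(features)` to the helper should work the same way, and checking that the values sum to roughly 1 is a quick sanity test.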

On Thu, Nov 28, 2013 at 7:49 PM, J <jrubi...@gmail.com> wrote:
> Hm, it's just giving me:
> <ProbDist with 2 samples>

--
-- alexr

J

Nov 28, 2013, 8:17:21 PM

to nltk-...@googlegroups.com

Okay. I'm confused, as my probabilities aren't adding up to 1:

dist = classifier.prob_classify(dialogue_act_features(new_row[28]))
print(list(dist.samples()))
prob_one = dist.prob("1")
prob_zero = dist.prob("0")
print(prob_one)
print(prob_zero)

['1', '0']
2.10487845223e-140
4.40636834365e-194
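
The thread does not say why these values fail to sum to 1. Assuming they are unnormalized scores that are still proportional to the true posteriors (an assumption, not something confirmed here), one way to make them comparable is to divide each by their sum. A minimal sketch; `normalize` is a hypothetical helper, not an NLTK function:

```python
def normalize(scores):
    """Rescale non-negative scores so they sum to 1 (hypothetical helper)."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {label: value / total for label, value in scores.items()}


# The two values printed above, keyed by label.
scores = {"1": 2.10487845223e-140, "0": 4.40636834365e-194}
posterior = normalize(scores)
print(posterior["1"])
print(posterior["0"])
```

Under that assumption, label "1" takes essentially all of the probability mass, since its score is more than fifty orders of magnitude larger than the score for "0".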
