Issue 702 in nltk: Patch for /trunk/nltk/nltk/probability.py;


nl...@googlecode.com

Jul 18, 2011, 11:23:10 AM
to nltk-...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Patch

New issue 702 by rico.sen...@gmx.ch: Patch for
/trunk/nltk/nltk/probability.py;
http://code.google.com/p/nltk/issues/detail?id=702

NaiveBayesClassifier.train initializes the selected probability estimator
with a 'bins' argument. This leads to an exception when trying to set
MLEProbDist as estimator.

Minimal example that throws an error:

>>> from nltk import NaiveBayesClassifier
>>> from nltk.probability import MLEProbDist
>>> train = [(dict(a=1), 'positive'),(dict(a=0), 'negative'),]
>>> classifier = NaiveBayesClassifier.train(train,estimator=MLEProbDist)
>>> print(classifier.classify(dict(a=1)))

Expected output:
positive


This patch makes MLEProbDist accept a bins argument, without actually using
it. Ideally, the init arguments of the different ProbDists should be even
more harmonized to make them more interchangeable, but this is a start.
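The idea behind the patch can be sketched without NLTK. The class and helper below are hypothetical stand-ins (SimpleMLEProbDist and build_estimator are not NLTK names, and this is not the attached patch itself); they only illustrate how an optional, ignored `bins` parameter lets a caller that always passes `bins` work with an unsmoothed estimator.

```python
from collections import Counter


class SimpleMLEProbDist:
    """Hypothetical stand-in for an MLE distribution (not NLTK's class)."""

    def __init__(self, freqdist, bins=None):
        # `bins` is accepted only for signature compatibility with
        # smoothing estimators; it is deliberately ignored.
        self._freqdist = Counter(freqdist)
        self._total = sum(self._freqdist.values())

    def prob(self, sample):
        # Relative frequency; 0.0 for unseen samples or an empty distribution.
        if self._total == 0:
            return 0.0
        return self._freqdist[sample] / self._total


def build_estimator(estimator, counts, bins):
    # Mimics a caller that always passes `bins`, as the report says train() does.
    return estimator(counts, bins=bins)


dist = build_estimator(SimpleMLEProbDist, {"positive": 3, "negative": 1}, bins=2)
print(dist.prob("positive"))
```

Without the `bins=None` parameter, the call in build_estimator would fail with a TypeError for an unexpected keyword argument, which is the kind of failure the report describes.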

Attachments:
probability.py.patch 429 bytes

nl...@googlecode.com

Jul 28, 2011, 3:15:40 AM
to nltk-...@googlegroups.com
Updates:
Status: Fixed
Owner: StevenBird1
Labels: Component-probability

Comment #1 on issue 702 by StevenBird1: Patch for
/trunk/nltk/nltk/probability.py;
http://code.google.com/p/nltk/issues/detail?id=702

Thanks for the fix; adopted in r8804.


nl...@googlecode.com

Mar 8, 2012, 11:43:35 PM
to nltk-...@googlegroups.com

Comment #2 on issue 702 by anirudh....@gmail.com: Patch for
/trunk/nltk/nltk/probability.py;
http://code.google.com/p/nltk/issues/detail?id=702

>>> import spambot as s
>>> c=s.spamclassifier()
>>> c.classifier()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "spambot.py", line 58, in classifier
    cl= NaiveBayesClassifier.train(training_set)
  File "/usr/local/lib/python2.7/dist-packages/nltk/classify/naivebayes.py", line 215, in train
    label_probdist = estimator(label_freqdist)
  File "/usr/local/lib/python2.7/dist-packages/nltk/probability.py", line 916, in __init__
    LidstoneProbDist.__init__(self, freqdist, 0.5, bins)
  File "/usr/local/lib/python2.7/dist-packages/nltk/probability.py", line 802, in __init__
    'must have at least one bin.')
ValueError: A ELE probability distribution must have at least one bin.


My code was working perfectly fine until yesterday. When I ran it last
night, it threw this error, and I am now unable to run
NaiveBayesClassifier.train(args).
However, when I ran a simple example with the Naive Bayes classifier in
the interpreter, it ran flawlessly.
Please get back to me; my project depends on this.