New issue 702 by rico.sen...@gmx.ch: Patch for
/trunk/nltk/nltk/probability.py;
http://code.google.com/p/nltk/issues/detail?id=702
NaiveBayesClassifier.train initializes the selected probability estimator
with a 'bins' argument. This leads to an exception when trying to set
MLEProbDist as estimator.
Minimal example that throws an error:
>>> from nltk import NaiveBayesClassifier
>>> from nltk.probability import MLEProbDist
>>> train = [(dict(a=1), 'positive'), (dict(a=0), 'negative')]
>>> classifier = NaiveBayesClassifier.train(train, estimator=MLEProbDist)
>>> print(classifier.classify(dict(a=1)))
Expected output:
positive
This patch makes MLEProbDist accept a bins argument without actually using
it. Ideally, the __init__ arguments of the different ProbDists should be
harmonized further so that they are interchangeable, but this is a start.
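The idea behind the patch can be illustrated with a small standalone sketch. It does not use NLTK and is not the attached patch: SketchMLEProbDist and build_estimator are hypothetical names, and a plain collections.Counter stands in for a frequency distribution. It only shows how an estimator can accept a bins argument for signature compatibility and ignore it.

```python
from collections import Counter


class SketchMLEProbDist:
    """Hypothetical stand-in for an MLE estimator; not NLTK's implementation."""

    def __init__(self, freqdist, bins=None):
        # `bins` is accepted only so the constructor matches the signature
        # of estimators that need it; it is deliberately ignored here.
        self._freqdist = freqdist
        self._total = sum(freqdist.values())

    def prob(self, sample):
        # Maximum likelihood estimate: relative frequency of the sample.
        if self._total == 0:
            return 0.0
        return self._freqdist[sample] / self._total


def build_estimator(estimator, freqdist):
    # Mimics a caller that always passes `bins`, as described in the report.
    return estimator(freqdist, bins=len(freqdist))


label_counts = Counter(["positive", "negative", "positive"])
dist = build_estimator(SketchMLEProbDist, label_counts)
print(dist.prob("positive"))
```

With this shape, any estimator that takes (freqdist, bins=None) can be swapped in by the caller without special-casing.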
Attachments:
probability.py.patch 429 bytes
Comment #1 on issue 702 by StevenBird1: Patch for
/trunk/nltk/nltk/probability.py;
http://code.google.com/p/nltk/issues/detail?id=702
Thanks for the fix; adopted in r8804.
>>> import spambot as s
>>> c=s.spamclassifier()
>>> c.classifier()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "spambot.py", line 58, in classifier
    cl= NaiveBayesClassifier.train(training_set)
  File "/usr/local/lib/python2.7/dist-packages/nltk/classify/naivebayes.py", line 215, in train
    label_probdist = estimator(label_freqdist)
  File "/usr/local/lib/python2.7/dist-packages/nltk/probability.py", line 916, in __init__
    LidstoneProbDist.__init__(self, freqdist, 0.5, bins)
  File "/usr/local/lib/python2.7/dist-packages/nltk/probability.py", line 802, in __init__
    'must have at least one bin.')
ValueError: A ELE probability distribution must have at least one bin.
My code was working perfectly fine until yesterday. When I ran it last
night, it threw this error, and I am now unable to run
NaiveBayesClassifier.train(args).
However, when I ran a simple example with the Naive Bayes classifier in
the interpreter, it ran flawlessly.
Please get back to me. My project depends on this.
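One possible cause, offered as an assumption rather than a diagnosis: the traceback fails while building the label distribution, which would be consistent with training_set being empty when it reaches train (for example, if the data source returned nothing on that run). A minimal check before training might look like the sketch below. The helper name require_nonempty_training_set is hypothetical and not part of NLTK.

```python
def require_nonempty_training_set(training_set):
    """Hypothetical helper: fail early with a clearer message than the
    'must have at least one bin' error raised deep inside the estimator."""
    # Materialize the input so generators are not silently consumed.
    training_set = list(training_set)
    if not training_set:
        raise ValueError("training set is empty; nothing to train on")
    # Collect the distinct labels so they can be inspected before training.
    labels = {label for _features, label in training_set}
    return training_set, labels


examples = [({"a": 1}, "positive"), ({"a": 0}, "negative")]
checked, labels = require_nonempty_training_set(examples)
print(len(checked), sorted(labels))
```

Calling such a check just before NaiveBayesClassifier.train(training_set) in spambot.py would show whether the input is the problem.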