Error getting classifier accuracy

Aaron O'Hare

Jan 27, 2018, 5:45:15 PM

to nltk-users

Fairly new to Python/NLTK so forgive me if this is a basic question.

The classifier appears to be running/working fine but when trying to retrieve the accuracy via nltk.classify.accuracy I am encountering a ValueError.

Is this related to the training set being contained within [({xxx})] while the test set is contained within [xxx]?

The error states:

    results = classifier.classify_many([fs for (fs, l) in gold])
ValueError: too many values to unpack (expected 2)

The code:

import nltk
from nltk.tokenize import word_tokenize

# Labelled training sentences: (text, label)
train = [('train', 'train'),
         ('next train in', 'train'),
         ('When is the next train', 'train'),
         ('How long until the next train', 'train'),
         ("Where is the next train", 'train'),
         ('dart', 'train'),
         ('next dart in', 'train'),
         ('When is the next dart', 'train'),
         ('How long until the next dart', 'train'),
         ("Where is the next dart", 'train'),
         ("Show me where", 'map'),
         ("Directions to", 'map'),
         ('map', 'map')]

# Vocabulary of every lower-cased token seen in the training sentences
all_words = set(word.lower() for passage in train for word in word_tokenize(passage[0]))

# Training set: a list of (feature dict, label) pairs
t = [({word: (word in word_tokenize(x[0])) for word in all_words}, x[1]) for x in train]

classifier = nltk.NaiveBayesClassifier.train(t)
classifier.show_most_informative_features()

test_sentence = 'Whatever my message is, hopefully something about trains'

# Test features: a single feature dict, with no label attached
test_sent_features = {word.lower(): (word in word_tokenize(test_sentence.lower())) for word in all_words}

print(classifier.classify(test_sent_features))

# This is the call that raises the ValueError
print(nltk.classify.accuracy(classifier, test_sent_features))

I'm sure there's something simple I'm overlooking, but I can't seem to spot it. I would appreciate any input on this, thanks.
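
For what it's worth, the shape difference mentioned above does look like the likely cause. The traceback line unpacks every item of the second argument as a (featureset, label) pair, so a bare feature dict gets iterated key by key and each key (a string) fails to unpack. Below is a minimal sketch of that behaviour in plain Python. Note the assumptions: featuresets_from_gold is a hypothetical stand-in that only mirrors the one line shown in the traceback, str.split stands in for word_tokenize, the vocabulary is shortened, and the 'train' label on the test sentence is made up for illustration.

```python
# Hypothetical stand-in that mirrors only the line shown in the traceback;
# it is not NLTK's full accuracy implementation.
def featuresets_from_gold(gold):
    return [fs for (fs, l) in gold]


# Shortened, assumed vocabulary standing in for all_words from the post.
all_words = {"train", "next", "dart", "map"}


def features(sentence):
    # str.split is a simplification standing in for word_tokenize.
    tokens = set(sentence.lower().split())
    return {word: (word in tokens) for word in all_words}


test_sent_features = features("When is the next train")

# Case 1: pass the bare dict, as in the post. Iterating a dict yields its
# keys, and a string key longer than two characters cannot be unpacked
# into (fs, l), which raises the ValueError.
try:
    featuresets_from_gold(test_sent_features)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Case 2: wrap the features in a list of (featureset, label) pairs, the
# same shape as the training set. The label "train" is an assumption made
# for this example; accuracy can only be measured against known labels.
gold = [(test_sent_features, "train")]
print(featuresets_from_gold(gold) == [test_sent_features])
```

If that reading is right, passing a labelled list of this shape as the second argument to nltk.classify.accuracy, rather than a single feature dict, should avoid the error. A test set with more than one labelled sentence would also give a more meaningful accuracy figure.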
