I downloaded the latest NLTK from the website to try some of the exercises on p. 17 of the original NLTK book. I am using IDLE with Python 2.7.10.
Can someone please tell me why I am getting a "u" in front of every entry in the frequency distribution output below? I tried both text1 and text2, which I downloaded as part of NLTK. Is there a way to eliminate the prefix? Is it caused by my Python version? Everything else seems to work fine.
Thanks very much.
>>> fdist1 = FreqDist(text2)
>>> fdist1
FreqDist({u',': 9397, u'to': 4063, u'.': 3975, u'the': 3861, u'of': 3565, u'and': 3350, u'her': 2436, u'a': 2043, u'I': 2004, u'in': 1904, ...})
>>> vocab1 = fdist1.keys()
>>> vocab1[:50]
[u'succour', u'four', u'woods', u'hanging', u'woody', u'conjure', u'looking', u'eligible', u'scold', u'unsuitableness', u'meadows', u'stipulate', u'leisurely', u'bringing', u'disturb', u'internally', u'hostess', u'mohrs', u'persisted', u'Does', u'succession', u'tired', u'cordially', u'pulse', u'elegant', u'second', u'sooth', u'shrugging', u'abundantly', u'errors', u'forgetting', u'contributed', u'fingers', u'increasing', u'exclamations', u'hero', u'leaning', u'Truth', u'here', u'china', u'hers', u'natured', u'substance', u'unwillingness', u'pretensions', u'reports', u'NOT', u'NOW', u'projection', u'sweetest']
>>>
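
For reference, here is a minimal sketch without NLTK that seems to show the same thing. It assumes the tokens in text2 are unicode strings, which is my guess and not something I have confirmed; the `words` list is just a hand-typed stand-in for `vocab1`. The prefix appears to belong to how the list is displayed, not to the words themselves.

```python
from __future__ import print_function

# Hypothetical stand-in for vocab1[:3]; assumes the tokens are unicode strings.
words = [u'succour', u'four', u'woods']

# Displaying the list uses repr() on each item; on Python 2.7 that
# representation includes the u prefix for unicode strings.
shown = repr(words)
print(shown)

# Printing the items themselves goes through str-style output,
# so no prefix is shown on either Python 2 or Python 3.
joined = ', '.join(words)
print(joined)
```

If that guess is right, the "u" would only be a display marker for the string type, and the stored words and counts would be unaffected.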