Hi, I am using NLTK for an analysis in Portuguese.
The problem is that I am using a corpus that is not from NLTK.
I have already converted it into nltk.text, but it can't 'read' special characters like é, í, ç, ...
So I really need help here, because if I use my decoded text, which is a plain string, I can't do collocations, for example.
How do I decode NLTK text as UTF-8?
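
To make the question concrete, here is a minimal sketch of one way this could be approached. It assumes Python 3 and that the corpus is a plain-text file stored as UTF-8; the sample sentence, the temporary file, and the regex tokenizer are hypothetical stand-ins, not the actual corpus or setup. The idea is to decode when reading the file, tokenize the resulting string, and only then build the NLTK objects from the token list. The NLTK part is guarded so the sketch still runs where NLTK is not installed.

```python
import os
import re
import tempfile

# Hypothetical sample standing in for the external Portuguese corpus.
sample = "A ação é difícil. A ação é rápida. O coração é forte."

# Store it as UTF-8 bytes, the way a corpus file on disk might be saved (assumption).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as f:
    f.write(sample.encode("utf-8"))
    path = f.name

# Decode while reading by naming the encoding explicitly.
with open(path, encoding="utf-8") as f:
    raw = f.read()
os.remove(path)

# Simple tokenizer for illustration; in Python 3, \w matches accented letters in str.
tokens = re.findall(r"\w+", raw.lower())

try:
    import nltk
    from nltk.collocations import BigramCollocationFinder

    # Build the Text from a list of tokens, not from one long string.
    text = nltk.Text(tokens)

    # Count bigrams directly from the same token list.
    finder = BigramCollocationFinder.from_words(tokens)
    top_bigrams = finder.ngram_fd.most_common(3)
except ImportError:
    # NLTK is not available here; the decoding and tokenizing steps still apply.
    text = None
    top_bigrams = None

print(tokens)
print(top_bigrams)
```

The point of the sketch is that the text is decoded once, at read time, and everything afterwards works on a list of tokens, so accented characters such as é and ç stay intact and collocation tools receive tokens instead of a single string.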