I am facing a problem loading a Sindhi corpus. The corpus is Unicode-based. I get the following error when reading the file. Please help me rectify it.
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-3-62b88b16c0e0> in <module>()
1 import unicodedata
2 import nltk
----> 3 contents = open("D:\SindhiCorpus.txt").read()
4 len(contents)
C:\Users\mazhar\Anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
21 class IncrementalDecoder(codecs.IncrementalDecoder):
22 def decode(self, input, final=False):
---> 23 return codecs.charmap_decode(input,self.errors,decoding_table)[0]
24
25 class StreamWriter(Codec,codecs.StreamWriter):
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 30: character maps to <undefined>
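
The traceback shows that `open()` fell back to the Windows default codec (cp1252), which has no mapping for byte 0x90. A minimal sketch of reading the file with an explicit encoding follows. It assumes the corpus is saved as UTF-8, which the post does not confirm. The helper name `read_corpus` and the sample text are hypothetical, and a temporary file stands in for the real corpus so the snippet runs anywhere.

```python
import os
import tempfile


def read_corpus(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the platform default.

    `encoding="utf-8"` is an assumption about how the corpus was saved.
    """
    with open(path, encoding=encoding) as f:
        return f.read()


# Hypothetical stand-in for the real corpus: a short Sindhi string
# written to a temporary UTF-8 file.
sample = "سنڌي ٻولي"
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

contents = read_corpus(demo_path)
os.remove(demo_path)  # clean up the temporary file

print(len(contents))  # number of characters, not bytes
```

For the file in the question, the call would presumably look like `read_corpus(r"D:\SindhiCorpus.txt")`, with a raw string so the backslash is taken literally. If UTF-8 also fails, the file may have been saved in another Unicode encoding. `"utf-8-sig"` (UTF-8 with a byte order mark) and `"utf-16"` are the usual candidates to try on Windows.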