I have trained a Word2Vec model on a corpus written in Hindi.
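from gensim import models
from gensim.models.word2vec import LineSentence
# Train on a file with one whitespace-tokenized sentence per line, then save.
sentences = LineSentence('/home/gaurish/Desktop/AllTokens.txt')
model = models.Word2Vec(sentences)
model.save('mymodel')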
But when I load the saved model to compute cosine similarity, I get this error:
gaurish@gaurish-Studio-1457:~/Desktop$ python loadPy.py
2015-12-25 11:06:59,909 : INFO : loading Word2Vec object from mymodel
2015-12-25 11:07:00,082 : INFO : setting ignored attribute syn0norm to None
2015-12-25 11:07:00,082 : INFO : setting ignored attribute cum_table to None
hi
Traceback (most recent call last):
  File "loadPy.py", line 17, in <module>
    print new_model.vocab["अशूभ"]
KeyError: '\xe0\xa4\x85\xe0\xa4\xb6\xe0\xa5\x82\xe0\xa4\xad'
Am I doing something wrong?
The vocab object, when printed on the terminal, looks like this:
word2vec.Vocab object at 0x7fdaa2637ed0>, u'\u0924\u093e\u0917': <gensim.models.word2vec.Vocab object at 0x7fdaa2ac7150>, u'\u0926\u094b\u0928\u0936\u0947': <gensim.models.word2vec.Vocab object at 0x7fdaa2411bd0>, u'\u0930\u0938\u094d\u0924\u094d\u092f\u093e\u0902\u0924': <gensim.models.word2vec.Vocab object at 0x7fdaa2ab9a90>, u'\u0915\u093e\u0933\u0916\u093e\u0902\u0924': <gensim.models.word2vec.Vocab object at 0x7fdaa3020050>, u'\u0928\u093e\u0936\u093e\u0921\u0940': <gensim.models.word2vec.Vocab object at 0x7fdaa2411c50>, u'\u0909\u092a\u0928\u093f\u0937\u0926': <gensim.models.word2vec.Vocab object at 0x7fdaa2411c90>, u'\u0917\u0942': <gensim.models.word2vec.Vocab object at 0x7fdaa2411cd0>, u'\u091c\u092e\u092a': <gensim.models.word2vec.Vocab object at 0x7fdaa2dbd0d0>, u'\u092c\u0938\u0924\u0932\u094b': <gensim.models.word2vec.Vocab object at 0x7fdaa2411d50>, u'\u092a\u093f\u0924\u094d\u0924': <gensim.models.word2vec.Vocab object at 0x7fdaa2411d90>, u'\u0908-\u092e\u0947\u0932': <gensim.models.word2vec.Vocab object at 0x7fdaa2411dd0>, u'\u0935\u093e\u092f\u0942': <gensim.models.word2vec.Vocab object at 0x7fdaa24b4b10>, u'\u0939\u093f\u0902\u0926\u0942\u0902': <gensim.models.word2vec.Vocab object at 0x7fdaa2411e50>, u'\u0935\u094d\u0939\u0921\u092a\u0923': <gensim.models.word2vec.Vocab object at 0x7fdaa2411e90>, u'\u0915\u0930\u092a\u093e\u091a\u0947': <gensim.models.word2vec.Vocab object at 0x7fdaa2411ed0>, u'\u0938\u0902\u0917\u0923\u0915\u0940': <gensim.models.word2vec.Vocab object at 0x7fdaa2411f10>, u'\u091c\u092e\u093e': <gensim.models.word2vec.Vocab object at 0x7fdaa2411f50>, u'\u091a\u0930\u093f\u0924\u094d\u0930\u0935\u093e\u0928': <gensim.models.word2vec.Vocab object at 0x7fdaa2a95050>, u'2551': <gensim.models.word2vec.Vocab object at 0x7fdaa2917e50>, u'\u0932\u0915\u094d\u0937\u094d\u092e\u0940\u091a\u0940': <gensim.models.word2vec.Vocab object at 0x7fdaa241f050>, u'\u0927\u094b\u0902\u092a\u0930\u093e': <gensim.models.word2vec.Vocab object at 
0x7fdaa241f090>, u'10641': <gensim.models.word2vec.Vocab object at 0x7fdaa2c49f50>, u'\u0935\u0916\u0926\u093e\u0915': <gensim.models.word2vec.Vocab object at 0x7fdaa2d83cd0>, u'13826': <gensim.models.word2vec.Vocab object at 0x7fdaa309d2d0>, u'\u0938\u0902\u0935\u0938\u093e\u0930\u093e\u091a\u0947\u0930': <gensim.models.word2vec.Vocab object at 0x7fdaa28fc7d0>, u'\u0915\u093e\u0930\u093e\u0925\u094d\u092f\u093e\u091a\u0940': <gensim.models.word2vec.Vocab object at 0x7fdaa327ee90>, u'\u091b\u0924\u094d\u0930\u092a\u0924\u0940': <gensim.models.word2vec.Vocab object at 0x7fdaa241f210>, u'\u092e\u0941\u0933\u093e\u0935\u094d\u092f\u093e': <gensim.models.word2vec.Vocab object at 0x7fdaa241f250>, u'\u0935\u0947\u0936\u094d\u092f\u093e': <gensim.models.word2vec.Vocab object at 0x7fdaa241f290>, u'\u092a\u0941\u0928\u0930\u093e\u0935\u0943\u0924\u094d\u09
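
One thing that stands out when comparing the two outputs: the KeyError shows a UTF-8 byte string ('\xe0\xa4\x85...'), while every key in the vocab dump is a unicode object (u'\u0924...'). Below is a minimal sketch of that mismatch. It is only an illustration under assumptions: the plain dict stands in for new_model.vocab, the 'Vocab placeholder' value and the lookup helper are hypothetical, and it presumes the word is actually present in the vocabulary.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for new_model.vocab: a plain dict with a unicode key.
vocab = {u'\u0905\u0936\u0942\u092d': 'Vocab placeholder'}

# The bytes reported in the KeyError. This is what a plain "..." literal
# holds in a Python 2 source file saved as UTF-8.
byte_key = b'\xe0\xa4\x85\xe0\xa4\xb6\xe0\xa5\x82\xe0\xa4\xad'


def lookup(vocab, key):
    """Decode byte-string keys to unicode before looking them up (hypothetical helper)."""
    if isinstance(key, bytes):
        key = key.decode('utf-8')
    return vocab.get(key)


# The raw bytes do not match the unicode key, so a direct lookup misses.
found_raw = byte_key in vocab
# After decoding, the same word is found.
found_decoded = lookup(vocab, byte_key)

print(found_raw)
print(found_decoded)
```

If this is the cause, the lookup in loadPy.py would probably need a unicode key, for example new_model.vocab[u"अशूभ"] or new_model.vocab["अशूभ".decode("utf-8")], assuming Python 2 (which the print statement suggests). If the key is still missing after that, the word may simply not be in the vocabulary.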