Gensim WordRank wrapper -- ValueError: stat: path too long for Windows


Xi Lu

Jan 23, 2018, 2:41:03 PM
to gensim
Hello all,

I am trying to use models.wrappers.wordrank to train word embeddings on my own corpus. I followed the code on the official website (https://radimrehurek.com/gensim/models/wrappers/wordrank.html):

 model = gensim.models.wrappers.Wordrank.train('C:/Users/dummy/wordrank', corpus_file=MY_CORPUS_STR, out_name='wr_model')

But I keep getting this error:

  File "C:\Users\...\Anaconda3\lib\site-packages\gensim\models\wrappers\wordrank.py", line 95, in train
    copyfile(corpus_file, os.path.join(meta_dir, corpus_file.split('/')[-1]))

  File "C:\Users\...\Anaconda3\lib\shutil.py", line 103, in copyfile
    if _samefile(src, dst):

  File "C:\Users\...\Anaconda3\lib\shutil.py", line 88, in _samefile
    return os.path.samefile(src, dst)

  File "C:\Users\...l\Anaconda3\lib\genericpath.py", line 90, in samefile
    s1 = os.stat(f1)

  ValueError: stat: path too long for Windows

I don't think a path like 'C:/Users/dummy/wordrank' counts as a long one, but I keep getting this error and cannot find a solution online. Could you help me with this?

Is there a way to train WordRank embeddings with code similar to Word2Vec or Doc2Vec in gensim?

I am using Windows 10, Spyder 3.2.6, Python 3.5, and the latest gensim library.

Thank you all for help!

Ivan Menshikh

Jan 24, 2018, 1:07:27 AM
to gensim
Hello Xi,

Workaround: try moving wordrank and corpus_file to the top level of the file system, i.e. `C:/wordrank` and `C:/my_corpus`.
The `f1` in the traceback doesn't look like the path to wordrank; it is probably the path to corpus_file.
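
Since the traceback fails inside `os.stat` on the value passed as `corpus_file`, a quick check before calling `train` can show whether that value is usable as a path at all. The sketch below is only an illustration: the helper name `looks_like_file_path` is made up here, and the 260-character threshold assumes the classic Windows MAX_PATH limit, which may not apply on systems with long paths enabled.

```python
import os


def looks_like_file_path(value, max_len=260):
    """Return (ok, reason) for a string meant to be passed as corpus_file.

    Hypothetical helper, not part of gensim. max_len=260 assumes the
    legacy Windows MAX_PATH limit.
    """
    if "\n" in value:
        return False, "contains newlines; looks like corpus text, not a path"
    if len(value) >= max_len:
        return False, "longer than %d characters" % max_len
    try:
        exists = os.path.isfile(value)
    except (OSError, ValueError):
        # Older Python versions on Windows can raise instead of returning False.
        return False, "the OS rejected it as a path"
    if not exists:
        return False, "no such file"
    return True, "ok"
```

If the check reports newlines or an over-long value, one possible cause is that the variable holds the corpus text itself rather than the name of a file containing it.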

Xi Lu

Jan 24, 2018, 11:19:40 AM
to gen...@googlegroups.com
Thank you for the clarification!

I also have a question I want to confirm: my computer runs Windows 10. To use WordRank through gensim, is it true that I first need to go to the WordRank author's repo and download his code?

I used to think that gensim already included all the code for WordRank, but there is nothing in the folder C:\Users\...\Anaconda3\Lib\site-packages\gensim\models\wrappers\wordrank

Should the path to wordrank contain all the C++ code for WordRank? Or is an empty folder OK, used only for saving the generated files?

Thanks!

--
You received this message because you are subscribed to a topic in the Google Groups "gensim" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/gensim/3OaC4oMAHyg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to gensim+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ivan Menshikh

Jan 25, 2018, 12:15:50 AM
to gensim
We provide only a "wrapper" for WordRank, i.e. a Python runner. As the backend, we use WordRank from https://bitbucket.org/shihaoji/wordrank.
The path to wordrank should point to the WordRank directory (the repository) with already compiled binaries.
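
As a rough way to confirm that the directory really holds compiled binaries before calling the wrapper, something like the following could be used. This is a sketch under assumptions: the helper `missing_binaries` is hypothetical, and the file names in `ASSUMED_BINARIES` are a guess at what the wrapper invokes, so adjust them to whatever your WordRank build actually produces.

```python
import os

# ASSUMPTION: these relative names are a guess at the executables the
# wrapper looks for inside the WordRank directory; verify against your build.
ASSUMED_BINARIES = (
    os.path.join("glove", "vocab_count"),
    os.path.join("glove", "cooccur"),
    os.path.join("glove", "shuffle"),
    "wordrank",
)


def missing_binaries(wr_path, expected=ASSUMED_BINARIES):
    """Return the expected executables that are not found under wr_path."""
    missing = []
    for rel in expected:
        base = os.path.join(wr_path, rel)
        # Accept either a bare name or a Windows-style .exe suffix.
        if not (os.path.isfile(base) or os.path.isfile(base + ".exe")):
            missing.append(rel)
    return missing
```

An empty result means every assumed executable was found; an empty or freshly created folder would report all of them as missing.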


Xi Lu

Jan 25, 2018, 10:50:51 AM
to gen...@googlegroups.com
Oh, now I finally see.

Thank you very much for the clarification and the pointer!
