Training word2vec model on CentOS: "UserWarning: C extension not loaded for Word2Vec"


Bo Wang

Jul 28, 2015, 3:54:03 PM
to gensim

I have been trying to train a word2vec model with gensim on CentOS 6.5 (Python 2.6.6). This is the warning message I keep receiving:

Training embedding model... /usr/lib64/python2.6/site-packages/gensim/models/word2vec.py:651: UserWarning: C extension not loaded for Word2Vec, training will be slow. Install a C compiler and reinstall gensim for fast training.
warnings.warn("C extension not loaded for Word2Vec, training will be slow. "

I do have gcc installed:

$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)

$ which gcc
/usr/bin/gcc

$ which g++
/usr/bin/g++

I also tried uninstalling and then reinstalling gensim, but it still shows the same warning that the C extension was not loaded and training will be slow.

Does anyone have any idea what causes this and what I should do? Thanks!

Matti Lyra

Jul 28, 2015, 4:22:19 PM
to gensim, bwa...@googlemail.com
I don't have a solution, just wanted to chime in that I am experiencing the exact same problem on OS X (10.9.5) running inside a conda environment. I've tried installing with `pip`, with `conda` (gensim: 0.10.3-np19py27_0) and directly from source; none of them works, meaning `gensim.models.word2vec.FAST_VERSION` is always set to -1.

$ gcc -v
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix

$ which gcc
/usr/bin/gcc

Matti Lyra

Jul 28, 2015, 4:58:55 PM
to gen...@googlegroups.com, bwa...@googlemail.com
Ok, I think I solved the mystery.

The scipy version I am using (0.16.0) does not have the `scipy.linalg.blas.fblas` module, as it has been deprecated. The Cython/C wrapper for the fast parts of word2vec and doc2vec, however, still expects that module to be there. Everything compiles fine, but importing `gensim.models.word2vec_inner` fails because it can't find the `fblas` module in scipy.
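
A minimal sketch for checking whether an installed scipy still exposes the legacy name, assuming (as described above) that the only thing that matters is whether `scipy.linalg.blas` has an attribute called `fblas`. The helper names `has_legacy_fblas` and `check_installed_scipy` are made up for this illustration.

```python
import importlib


def has_legacy_fblas(blas_module):
    """Return True if the given module still exposes the legacy `fblas` name."""
    return hasattr(blas_module, "fblas")


def check_installed_scipy():
    """Return True/False for the installed scipy, or None if scipy is missing."""
    try:
        blas = importlib.import_module("scipy.linalg.blas")
    except ImportError:
        return None
    return has_legacy_fblas(blas)


if __name__ == "__main__":
    # False here would match the situation described in this post.
    print("scipy.linalg.blas exposes fblas:", check_installed_scipy())
```

A result of `False` means `from scipy.linalg.blas import fblas` would raise an ImportError on that installation.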

scipy.linalg.blas
http://docs.scipy.org/doc/scipy/reference/linalg.blas.html

There are two possible solutions; the second involves pulling the sources and installing from there, the first doesn't:

1) roll back to a version of scipy that still has that module - I don't know which version that would be, though

or 2) the slightly more complicated solution:
2.1) pip install cython
2.2) modify `gensim/models/word2vec_inner.pyx` and `gensim/models/doc2vec_inner.pyx` so that the fblas import on line 18 of both files changes from
`from scipy.linalg.blas import fblas`

to
`from scipy.linalg import blas as fblas`

Then delete the associated `.so` and `.c` files and call

$ cython word2vec_inner.pyx
$ cython doc2vec_inner.pyx

this will regenerate the '.c' files.
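
The one-line import change from step 2.2 can also be applied programmatically. The sketch below does only the text substitution on a string. The function name `patch_fblas_import` is hypothetical, and it assumes the old import sits on a line of its own. Reading and writing the two `.pyx` files is left to the caller, and a later reply in this thread reports a Cython compiler crash after making this change, so treat it as an illustration rather than a verified fix.

```python
import re

# Matches the old import when it is alone on a line (trailing spaces allowed).
OLD_IMPORT = re.compile(
    r"^from scipy\.linalg\.blas import fblas[ \t]*$", re.MULTILINE
)
NEW_IMPORT = "from scipy.linalg import blas as fblas"


def patch_fblas_import(source):
    """Return (patched_source, number_of_lines_replaced)."""
    return OLD_IMPORT.subn(NEW_IMPORT, source)


if __name__ == "__main__":
    sample = "import cython\nfrom scipy.linalg.blas import fblas\n"
    patched, count = patch_fblas_import(sample)
    print(count, "line(s) replaced")
    print(patched)
```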

Then install normally using `python setup.py install` from the gensim root that you pulled from github. To verify that it all worked according to plan try

>>> from gensim.models import word2vec
>>> word2vec.FAST_VERSION
0

The output should be non-negative.
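
The same check can be wrapped in a small script. This is a sketch that assumes only what this thread states: `FAST_VERSION` is -1 when the compiled extension failed to load and non-negative otherwise. The helper names `describe_fast_version` and `installed_fast_version` are invented for the example.

```python
def describe_fast_version(value):
    """Map a FAST_VERSION value to a short human-readable verdict."""
    if value is None:
        return "unknown (gensim missing or FAST_VERSION not present)"
    if value >= 0:
        return "compiled extension loaded (fast path)"
    return "compiled extension NOT loaded (slow pure-Python path)"


def installed_fast_version():
    """Return gensim's FAST_VERSION, or None if it cannot be read."""
    try:
        from gensim.models import word2vec
    except Exception:  # gensim absent or broken; nothing to report
        return None
    return getattr(word2vec, "FAST_VERSION", None)


if __name__ == "__main__":
    print(describe_fast_version(installed_fast_version()))
```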

It also appears that there is already a bug filed for this on github https://github.com/piskvorky/gensim/issues/382


/m


Radim Řehůřek

Jul 29, 2015, 9:13:02 AM
to gensim, matti...@gmail.com, bwa...@googlemail.com, matti...@gmail.com
Hello guys,

yes, the issue has been filed at

No one has submitted the (simple) patch yet, so this is a good chance to contribute to gensim :)

Thank you for reporting,
Radim

Bo Wang

Jul 29, 2015, 9:37:13 AM
to gensim, matti...@gmail.com
Thanks Matti! I have modified both .pyx files and removed the corresponding .so and .c files. But when I call


$ cython word2vec_inner.pyx
$ cython doc2vec_inner.pyx

I receive the following error from Cython:

Compiler crash traceback from this point on:
  File "Cython/Compiler/Visitor.py", line 173, in Cython.Compiler.Visitor.TreeVisitor._visit (/tmp/pip_build_root/cython/Cython/Compiler/Visitor.c:4469)
    return handler_method(obj)
  File "Cython/Compiler/Visitor.py", line 508, in Cython.Compiler.Visitor.MethodDispatcherTransform.visit_BinopNode (/tmp/pip_build_root/cython/Cython/Compiler/Visitor.c:10624)
    return self._visit_binop_node(node)
  File "Cython/Compiler/Visitor.py", line 519, in Cython.Compiler.Visitor.MethodDispatcherTransform._visit_binop_node (/tmp/pip_build_root/cython/Cython/Compiler/Visitor.c:10802)
    if obj_type.is_builtin_type:
AttributeError: 'NoneType' object has no attribute 'is_builtin_type'

Any idea why?

Thanks.

Pham Lam

Jul 30, 2015, 2:29:12 AM
to gensim, matti...@gmail.com, bwa...@googlemail.com
I faced the same problem on Ubuntu 15.04/Python 2.7.9. I solved it as follows:

1.  pip uninstall gensim
2.  pip uninstall scipy 

3. pip install --no-cache-dir scipy==0.15.1
4. pip install --no-cache-dir gensim==0.12.1

5. test in ipython:
>>> from gensim.models import word2vec
>>> word2vec.FAST_VERSION

If it outputs 1, it is OK.
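
To tell in advance whether a given scipy release falls on the affected side, a version comparison is enough. The sketch below assumes the threshold reported in this thread (0.16.0 lacks `fblas`, 0.15.1 works) and applies only to gensim releases from that period. The function names `parse_version` and `lacks_fblas` are hypothetical.

```python
import re


def parse_version(text):
    """Turn '0.16.0' or '0.16.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def lacks_fblas(scipy_version):
    """True if this scipy version is 0.16 or newer (threshold from the thread)."""
    return parse_version(scipy_version) >= (0, 16)


if __name__ == "__main__":
    for version in ("0.15.1", "0.16.0"):
        print(version, "lacks fblas:", lacks_fblas(version))
```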

Laam

Dirk Brand

Aug 14, 2015, 2:22:19 PM
to gensim, matti...@gmail.com, bwa...@googlemail.com
I can confirm that this works really well.  Now I just need to figure Doc2Vec out...

Zhong Liang Ong

Aug 31, 2015, 6:58:40 AM
to gensim, matti...@gmail.com, bwa...@googlemail.com
Thanks! Works for me too on CentOS 6.6 running Python 2.7.10 in a virtualenv.

mtudor

Sep 14, 2015, 2:40:30 PM
to gensim, matti...@gmail.com
I'm getting the same errors as Bo when trying to apply Matti's fix #2. I'm seeing this both with Anaconda on Red Hat and with a native Python install on OS X. scipy 0.16, Python 2.7.

Gordon Mohr

Sep 14, 2015, 7:04:44 PM
to gensim
The easier workaround for the scipy-0.16.0 incompatibility is to force the use of an older scipy (0.15.1), as described at:


- Gordon

Lior Magen

Apr 19, 2016, 10:38:07 AM
to gensim, matti...@gmail.com, bwa...@googlemail.com
OK, so I have been using Word2Vec for about 6 months and I've always run into the 'C compiler...' issue. I finally came across this post, saw your answer, and now it FINALLY works, and so fast: training went from a rate of 4000 words/s to 1318718 words/s. This is INSANE! Thank you so much!

Deep

Nov 16, 2016, 2:44:13 PM
to gensim, matti...@gmail.com, bwa...@googlemail.com
Hi, I am using gensim 0.13.3. I did the following, but I still get the error and word2vec is slow:

1.  pip uninstall gensim
2.  pip uninstall scipy 

3. pip install --no-cache-dir scipy==0.15.1
4. pip install --no-cache-dir gensim

Lev Konstantinovskiy

Nov 24, 2016, 3:10:56 PM
to gensim
Hi Deep,

Did you run the cython install?
pip install cython

Regards
Lev

Gordon Mohr

Nov 24, 2016, 10:47:49 PM
to gensim
Note that the `cython` package isn't actually required to install the optimized gensim libraries. (Cython has already played its role on the developers' machines, compiling the cython code to C++ files that are part of the distribution.)

But other system build-support packages are required, for example 'python-dev' and 'build-essential' on Ubuntu systems. (On CentOS, the prerequisite package names may vary.) So mainly watch for any errors when gensim is pip-installed: they will give the hint as to why the native libraries aren't compiled, and thus perhaps as to what extra steps you should add to your install.
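
If the install itself prints no errors, one way to surface the underlying cause is to import the compiled module directly and look at the exception. This sketch relies on the module name `gensim.models.word2vec_inner` mentioned earlier in this thread. The helper name `diagnose_extension` is hypothetical.

```python
import importlib
import traceback


def diagnose_extension(module_name="gensim.models.word2vec_inner"):
    """Try to import a compiled module.

    Returns None on success, otherwise the traceback text explaining why
    the import failed.
    """
    try:
        importlib.import_module(module_name)
    except Exception:  # ImportError is typical, but report anything
        return traceback.format_exc()
    return None


if __name__ == "__main__":
    problem = diagnose_extension()
    print(problem or "compiled extension imported cleanly")
```

The traceback usually names the missing piece (for instance a module that cannot be found), which is more specific than the generic "C extension not loaded" warning.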

- Gordon

Deep

Nov 30, 2016, 1:20:42 AM
to gensim
I installed both python-dev and build-essential and re-installed gensim, but still no luck. I am trying this on a Google Cloud Platform VM instance. Strangely, when gensim is installed, I don't get any errors or warnings either.

Deep

Nov 30, 2016, 1:56:18 AM
to gensim
The warnings seem to go away with gensim==0.12.1 but not with the latest version of gensim==0.13.3