error when importing gensim.test.utils


andreas heiner

Aug 25, 2024, 10:17:41 AM
to Gensim
Hi,

I need to use the GloVe model in Gensim, but I can't install the required libraries. I did a fresh install of Gensim, so I'm somewhat confused.

Did I do anything wrong?

Platform: Ubuntu Jellyfish
Python 3.12

Thanks,

andreas


--------------------------------------------------------------------------- ImportError Traceback (most recent call last) Cell In[66], line 4 2 glove_path = ~/WordVector_Models" 3 glove_file = "glove.6B.100d.txt" ----> 4 from gensim.test.utils import datapath, get_tmpfile 5 from gensim.models import KeyedVectors 6 from gensim.scripts.glove2word2vec import glove2word2vec File ~/.local/lib/python3.10/site-packages/gensim/__init__.py:11 7 __version__ = '4.3.3' 9 import logging ---> 11 from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils # noqa:F401 14 logger = logging.getLogger('gensim') 15 if not logger.handlers: # To ensure reload() doesn't add another one File ~/.local/lib/python3.10/site-packages/gensim/corpora/__init__.py:14 12 from .dictionary import Dictionary # noqa:F401 13 from .hashdictionary import HashDictionary # noqa:F401 ---> 14 from .wikicorpus import WikiCorpus # noqa:F401 15 from .textcorpus import TextCorpus, TextDirectoryCorpus # noqa:F401 16 from .ucicorpus import UciCorpus # noqa:F401 File ~/.local/lib/python3.10/site-packages/gensim/corpora/wikicorpus.py:32 30 # cannot import whole gensim.corpora, because that imports wikicorpus... 31 from gensim.corpora.dictionary import Dictionary ---> 32 from gensim.corpora.textcorpus import TextCorpus 35 logger = logging.getLogger(__name__) 37 ARTICLE_MIN_WORDS = 50 File ~/.local/lib/python3.10/site-packages/gensim/corpora/textcorpus.py:46 44 from gensim import interfaces, utils 45 from gensim.corpora.dictionary import Dictionary ---> 46 from gensim.parsing.preprocessing import ( 47 remove_stopword_tokens, remove_short_tokens, 48 lower_to_unicode, strip_multiple_whitespaces, 49 ) 50 from gensim.utils import deaccent, simple_tokenize 52 from smart_open import open ImportError: cannot import name 'remove_stopword_tokens' from 'gensim.parsing.preprocessing' (/home/ahe/.local/lib/python3.10/site-packages/gensim/parsing/preprocessing.py)

Gordon Mohr

Aug 26, 2024, 3:03:18 PM
to Gensim
Your error stack is hard to read because the rich-text format it was pasted in has lost all line breaks. Also, while you report using Python 3.12, the file paths in the error imply you may be using Python 3.10. And one report from years ago of problems with the same `remove_stopword_tokens` function seemed to be caused by inadvertently using an older Gensim.

I would suggest starting from a clean Python virtual environment of known version without Gensim & related libraries installed. Then, install what you need & try again. 

If the same problem recurs, please describe the exact minimal steps to reaching the error message, in terms of what's installed and what code is run, and then show the exact readable error message you received – which may better highlight what's going wrong in your setup.
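
As a rough sketch of what the "what's installed" part of such a report might contain, the script below prints the interpreter actually in use and the Gensim version and location it would pick up, without importing Gensim itself (so it still works when the import fails). The helper name `describe_environment` is hypothetical, not part of Gensim; it uses only the Python standard library.

```python
import sys
from importlib import metadata
from importlib.util import find_spec


def describe_environment(package="gensim"):
    """Collect interpreter and package details without importing the package.

    Hypothetical helper for illustration only; not part of Gensim.
    """
    info = {
        # Version of the interpreter running this script (e.g. 3.10 vs 3.12).
        "python_version": "%d.%d.%d" % sys.version_info[:3],
        # Path of that interpreter, which shows which environment is active.
        "executable": sys.executable,
    }
    try:
        # Read the version from the installed package metadata.
        info["package_version"] = metadata.version(package)
    except metadata.PackageNotFoundError:
        info["package_version"] = None
    # Locate the package on the import path without executing its code.
    spec = find_spec(package)
    info["package_location"] = spec.origin if spec else None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this in the same notebook kernel or shell that produces the error, and pasting its plain-text output alongside the traceback, should make any mismatch between the expected and actual Python or Gensim versions easy to spot.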

- Gordon