nltk.download('wordnet')

74 views
Skip to first unread message

fangqiang shan

unread,
Mar 2, 2023, 5:18:19 AM3/2/23
to nltk-users

Hi, does anyone have a solution for this problem? I already did this:
import nltk
print(nltk.data.path)
and got results like this:
['/root/nltk_data', '/usr/share/nltk_data', '/usr/local/share/nltk_data', '/usr/lib/nltk_data', '/usr/local/lib/nltk_data', '/kaggle/working/nltk_data', '/kaggle/working/nltk_data', '/kaggle/working/nltk_data', '/kaggle/working/nltk_data', '/kaggle/working/nltk_data', '/kaggle/working/nltk_data', '/kaggle/working/nltk_data', '/kaggle/working/nltk_data']











--------------------------------------------------------------------------- LookupError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self) 79 else: ---> 80 try: 81 root = nltk.data.find(f"{self.subdir}/{self.__name}") /opt/conda/lib/python3.7/site-packages/nltk/data.py in find(resource_name, paths) 652 "pcfg": "pcfg", --> 653 "fcfg": "fcfg", 654 "fol": "fol", LookupError: ********************************************************************** Resource 'corpora/wordnet.zip/wordnet/.zip/' not found. Please use the NLTK Downloader to obtain the resource: >>> nltk.download() Searched in: - '/root/nltk_data' - '/usr/share/nltk_data' - '/usr/local/share/nltk_data' - '/usr/lib/nltk_data' - '/usr/local/lib/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' ********************************************************************** During handling of the above exception, another exception occurred: LookupError Traceback (most recent call last) /tmp/ipykernel_27/2395884815.py in <module> 46 47 # Apply the preprocessing function to all columns of the dataframe ---> 48 preprocessed_df = kb_df.apply(preprocess_column) 49 50 # Save the preprocessed dataframe as a CSV file /opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in apply(self, func, axis, raw, result_type, args, **kwargs) 8738 kwargs=kwargs, 8739 ) -> 8740 return op.apply() 8741 8742 def applymap( /opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in apply(self) 686 return self.apply_raw() 687 --> 688 return self.apply_standard() 689 690 def agg(self): /opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in apply_standard(self) 810 811 def apply_standard(self): --> 812 results, res_index = 
self.apply_series_generator() 813 814 # wrap results /opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in apply_series_generator(self) 826 for i, v in enumerate(series_gen): 827 # ignore SettingWithCopy here in case the user mutates --> 828 results[i] = self.f(v) 829 if isinstance(results[i], ABCSeries): 830 # If we have a view on v, we need to make a copy because /tmp/ipykernel_27/2395884815.py in preprocess_column(col) 41 42 # Apply the preprocessing function to each cell in the column ---> 43 preprocessed_col = col.apply(preprocess_text) 44 45 return preprocessed_col /opt/conda/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwargs) 4355 dtype: float64 4356 """ -> 4357 return SeriesApply(self, func, convert_dtype, args, kwargs).apply() 4358 4359 def _reduce( /opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in apply(self) 1041 return self.apply_str() 1042 -> 1043 return self.apply_standard() 1044 1045 def agg(self): /opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in apply_standard(self) 1099 values, 1100 f, # type: ignore[arg-type] -> 1101 convert=self.convert_dtype, 1102 ) 1103 /opt/conda/lib/python3.7/site-packages/pandas/_libs/lib.pyx in pandas._libs.lib.map_infer() /tmp/ipykernel_27/2395884815.py in preprocess_text(text) 31 32 # Lemmatize the words ---> 33 words = [lemmatizer.lemmatize(word) for word in words] 34 35 # Join the words back into a single string /tmp/ipykernel_27/2395884815.py in <listcomp>(.0) 31 32 # Lemmatize the words ---> 33 words = [lemmatizer.lemmatize(word) for word in words] 34 35 # Join the words back into a single string /opt/conda/lib/python3.7/site-packages/nltk/stem/wordnet.py in lemmatize(self, word, pos) 38 :type word: str 39 :param pos: The Part Of Speech tag. Valid options are `"n"` for nouns, ---> 40 `"v"` for verbs, `"a"` for adjectives, `"r"` for adverbs and `"s"` 41 for satellite adjectives. 
42 :param pos: str /opt/conda/lib/python3.7/site-packages/nltk/corpus/util.py in __getattr__(self, attr) 114 # Fix for inspect.isclass under Python 2.6 115 # (see https://bugs.python.org/issue1225107). --> 116 # Without this fix tests may take extra 1.5GB RAM 117 # because all corpora gets loaded during test collection. 118 if attr == "__bases__": /opt/conda/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self) 79 else: 80 try: ---> 81 root = nltk.data.find(f"{self.subdir}/{self.__name}") 82 except LookupError as e: 83 try: /opt/conda/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self) 76 root = nltk.data.find(f"{self.subdir}/{self.__name}") 77 except LookupError: ---> 78 raise e 79 else: 80 try: /opt/conda/lib/python3.7/site-packages/nltk/data.py in find(resource_name, paths) 651 "cfg": "cfg", 652 "pcfg": "pcfg", --> 653 "fcfg": "fcfg", 654 "fol": "fol", 655 "logic": "logic", LookupError: ********************************************************************** Resource 'corpora/wordnet' not found. Please use the NLTK Downloader to obtain the resource: >>> nltk.download() Searched in: - '/root/nltk_data' - '/usr/share/nltk_data' - '/usr/local/share/nltk_data' - '/usr/lib/nltk_data' - '/usr/local/lib/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data' - '/kaggle/working/nltk_data'

J Dub

unread,
Mar 26, 2023, 5:16:17 AM3/26/23
to nltk-users
Did you try:
import nltk
nltk.download()
to launch the resource management GUI?

Even if you used it a while back, it may show that your versions are 'out-of-date'.

Reply all
Reply to author
Forward
0 new messages