nltk.download('punkt', download_dir='/var/www/capps/Data/nltk_data')
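
A slightly fuller sketch of the same idea, in case it helps: it checks that the current user can write to the target directory before calling nltk.download with download_dir. The path is the one from your message; can_write and download_punkt are illustrative helper names of my own, not part of NLTK.

```python
import os

# Directory taken from the question; adjust for your own setup.
DATA_DIR = "/var/www/capps/Data/nltk_data"


def can_write(directory):
    """Return True if the current user can create files in `directory`
    (or in its nearest existing parent, if it does not exist yet)."""
    probe = os.path.abspath(directory)
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # reached the filesystem root
            return False
        probe = parent
    return os.path.isdir(probe) and os.access(probe, os.W_OK | os.X_OK)


def download_punkt(directory=DATA_DIR):
    """Download 'punkt' into `directory`; return True on success."""
    if not can_write(directory):
        print("cannot write to", directory)
        return False
    import nltk  # imported here so can_write() is usable without NLTK
    return nltk.download("punkt", download_dir=directory)
```

Calling download_punkt() as www-data (for example from sudo -u www-data python3) should then write the files under that directory, assuming www-data has write permission there.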
Dear all,

I've started working with NLTK and I need some help.

Short question: How can I change the path where Python 3 looks for downloaded NLTK data? By default it's a folder in the user's home directory.

Long question:

I have a script that starts with:

#!/usr/bin/python3
import nltk
nltk.download('punkt')

The third line, which downloads the 'punkt' package, fails, apparently because the user doesn't have the appropriate permissions. If I run this:

try:
    nltk.download('punkt')
except BaseException as e:
    print(e)

I get the following exception:
[nltk_data] Downloading package punkt to /var/www/nltk_data...
An exception occurred
[Errno 13] Permission denied: '/var/www/nltk_data'

I am trying to change the path where 'nltk_data' is written. I do:

$ sudo -u www-data python3
import nltk
nltk.download()

Then I select c) Config, then d) Set Data Dir, enter the new path (i.e. /var/www/capps/Data/nltk_data), then go back to the main menu and quit, with a True message. When I then quit the Python 3 REPL I get the same error again:

Error in atexit._run_exitfuncs:
PermissionError: [Errno 13] Permission denied

That means the new path hasn't been saved. There seems to be a bug ticket about this (https://bugs.python.org/issue19891), and the cause appears to be that the user (www-data) doesn't have a home directory.

I then tried something different:

$ sudo -u www-data python3
import nltk
nltk.download()

There I do the same (change the path) and then download the 'punkt' data. That seems to work, and the files are written to the path of my choice. When I quit, the path reverts to the default, but the files stay.

Then I commented out the nltk.download('punkt') line in my script, but that doesn't seem to work either:

['/var/www/capps/Lab/ling', '/usr/lib/python35.zip', '/usr/lib/python3.5', '/usr/lib/python3.5/plat-x86_64-linux-gnu', '/usr/lib/python3.5/lib-dynload', '/var/www/.local/lib/python3.5/site-packages', '/usr/local/lib/python3.5/dist-packages', '/usr/lib/python3/dist-packages']
NLKT imported fine
Traceback (most recent call last):
  File "gender.py", line 105, in <module>
    parse_gender(f.read())
  File "gender.py", line 89, in parse_gender
    for sentence in nltk.sent_tokenize(text)
  File "/var/www/.local/lib/python3.5/site-packages/nltk/tokenize/__init__.py", line 105, in sent_tokenize
    tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
  File "/var/www/.local/lib/python3.5/site-packages/nltk/data.py", line 868, in load
    opened_resource = _open(resource_url)
  File "/var/www/.local/lib/python3.5/site-packages/nltk/data.py", line 993, in _open
    return find(path_, path + ['']).open()
  File "/var/www/.local/lib/python3.5/site-packages/nltk/data.py", line 701, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('punkt')
For more information see: https://www.nltk.org/data.html
Attempted to load tokenizers/punkt/PY3/english.pickle
Searched in:
- '/var/www/nltk_data'
- '/usr/nltk_data'
- '/usr/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- ''
**********************************************************************

I would be very grateful for any tips.

Cheers,
Manuel
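
Note that the "Searched in" list above does not contain /var/www/capps/Data/nltk_data, so the script never looks in the directory where the files were written. A minimal sketch of one way to address this, assuming the data really is in that directory: put it at the front of nltk.data.path (the list of directories NLTK searches) near the top of the script, before any tokenizer is loaded. with_data_dir and configure_nltk are hypothetical helper names, not NLTK functions.

```python
# Directory from the question where 'punkt' was downloaded manually.
DATA_DIR = "/var/www/capps/Data/nltk_data"


def with_data_dir(search_path, directory):
    """Return a copy of `search_path` with `directory` placed first,
    without duplicating it if it is already present."""
    return [directory] + [p for p in search_path if p != directory]


def configure_nltk(directory=DATA_DIR):
    """Make NLTK search `directory` first in the current process."""
    import nltk  # imported lazily; requires NLTK to be installed
    # Update the list in place so every reference to it sees the change.
    nltk.data.path[:] = with_data_dir(nltk.data.path, directory)
    return nltk.data.path
```

This only affects the running process, so it has to be called on every start of the script. As far as I know, setting the NLTK_DATA environment variable in the web server's environment has a similar effect, since NLTK reads it when the package is imported.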
--
You received this message because you are subscribed to the Google Groups "nltk-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nltk-users+...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/nltk-users/CALRRYPVUREc0xJBTUkVijCHgjOTHN6RKM6jiXu-GEmQanY2v0g%40mail.gmail.com.