Re: Errno 13 Permission denied - www-data unable to download data to the default folder

Manuel Souto Pico

Sep 21, 2019, 5:47:03 PM
to nltk-...@googlegroups.com
Hello again,

I was having problems posting on the googlegroups page and I got messages from the admin saying that my message was denied. That's the reason for the duplicate message, my apologies.

I have found *one* solution:

nltk.download('punkt', download_dir='/var/www/capps/Data/nltk_data')

Now the data seems to download fine, but I still get all the errors from the "Traceback" downwards.

The file that I'm trying to execute is "gender.py", available at https://github.com/foxbook/atap/tree/master/snippets/ch01

Hopefully somebody can help.

Cheers, Manuel


On Fri, 20 Sep 2019 at 21:04, Manuel Souto Pico <m.sou...@gmail.com> wrote:
Dear all,

I've started working with NLTK and I need some help.

Short question: How can I change the path where Python3 looks for downloaded data? By default it's a folder in the user's home directory.

Long question:

I have a script that starts with:

#!/usr/bin/python3
import nltk
nltk.download('punkt')

The third line, which downloads the 'punkt' package, fails, apparently because the user doesn't have the appropriate permissions. If I run this

try:
  nltk.download('punkt')
except BaseException as e:
  print(e)

I get the following exception:

[nltk_data] Downloading package punkt to /var/www/nltk_data...
An exception occurred
[Errno 13] Permission denied: '/var/www/nltk_data'

I am trying to change the path where 'nltk_data' is written. I do:

$ sudo -u www-data python3
import nltk
nltk.download()

Then I select c) Config, then d) Set Data Dir, enter the new path (i.e. /var/www/capps/Data/nltk_data), then go back to the main menu and quit, with a True message.

When I then quit the Python3 REPL I get the same error again:

Error in atexit._run_exitfuncs:
PermissionError: [Errno 13] Permission denied

And that means that the new path hasn't been saved. It seems there's a bug ticket about that (https://bugs.python.org/issue19891) and it seems the reason for the problem is that the user (www-data) doesn't have a home directory.

I have then tried something different:

$ sudo -u www-data python3
import nltk
nltk.download()

There I do the same (change the path) and then I download the 'punkt' data. That seems to work and the files are written to the path of my choice. When I quit, the path changes back to the default but the files stay.

Then I comment out the nltk.download('punkt') line in my script, but that doesn't seem to work either:

['/var/www/capps/Lab/ling', '/usr/lib/python35.zip', '/usr/lib/python3.5', '/usr/lib/python3.5/plat-x86_64-linux-gnu', '/usr/lib/python3.5/lib-dynload', '/var/www/.local/lib/python3.5/site-packages', '/usr/local/lib/python3.5/dist-packages', '/usr/lib/python3/dist-packages']
NLKT imported fine
Traceback (most recent call last):
  File "gender.py", line 105, in <module>
    parse_gender(f.read())
  File "gender.py", line 89, in parse_gender
    for sentence in nltk.sent_tokenize(text)
  File "/var/www/.local/lib/python3.5/site-packages/nltk/tokenize/__init__.py", line 105, in sent_tokenize
    tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
  File "/var/www/.local/lib/python3.5/site-packages/nltk/data.py", line 868, in load
    opened_resource = _open(resource_url)
  File "/var/www/.local/lib/python3.5/site-packages/nltk/data.py", line 993, in _open
    return find(path_, path + ['']).open()
  File "/var/www/.local/lib/python3.5/site-packages/nltk/data.py", line 701, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')

  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/english.pickle

  Searched in:
    - '/var/www/nltk_data'
    - '/usr/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
**********************************************************************

I would be very grateful for any tips.

Cheers, Manuel
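
For anyone who lands on this thread with the same pair of errors, it may help to first check which of the directories involved actually exist and are writable for the user running the script. The following is only a minimal sketch using the standard library; the helper name describe_dirs is made up for illustration, and the two paths are the ones quoted in the messages above.

```python
import os


def describe_dirs(paths):
    """Return a (path, exists, writable) tuple for each directory in paths."""
    report = []
    for path in paths:
        exists = os.path.isdir(path)
        # os.access checks permissions for the user running this script,
        # so run it as the web server user (e.g. sudo -u www-data python3).
        writable = exists and os.access(path, os.W_OK | os.X_OK)
        report.append((path, exists, writable))
    return report


# Paths taken from the error messages in this thread.
for row in describe_dirs(["/var/www/nltk_data", "/var/www/capps/Data/nltk_data"]):
    print(row)
```

A directory that shows up as missing or not writable here would be consistent with the "[Errno 13] Permission denied" message, while a directory that is writable but absent from the "Searched in" list would be consistent with the LookupError.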


Dimitriadis, A. (Alexis)

Sep 21, 2019, 6:04:35 PM
to <nltk-users@googlegroups.com>
Hi Manuel,

I'm not sure I follow all that, but NLTK doesn't keep a persistent configuration for your settings. Once you have downloaded some resources to an nltk_data location that is not in the standard list, you can tell your programs where it is by adding these lines:

    import nltk
    nltk.data.path.append("/full/path/to/your/nltk_data")

Alternately you could provide the path through the environment variable $NLTK_DATA, but if you don't know what that means, just stay with the above snippet.

Alexis
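
Putting the two halves of this thread together (downloading to a custom directory, then telling NLTK to search it), a script could do something like the sketch below. This is an illustration rather than an official recipe: the helper name ensure_punkt and the DATA_DIR constant are made up here, and the directory is simply the one mentioned earlier in the thread. The calls used are nltk.data.path, nltk.data.find and nltk.download with its download_dir argument.

```python
# Assumption: the custom directory used earlier in this thread.
DATA_DIR = "/var/www/capps/Data/nltk_data"


def ensure_punkt(data_dir=DATA_DIR):
    """Make data_dir searchable and download 'punkt' into it if missing."""
    import nltk  # third-party package: pip install nltk

    # nltk.data.path is a plain list of directories that NLTK searches.
    if data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)
    try:
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        # download_dir only controls where the files are written; it does
        # not add the directory to the search path, hence the append above.
        nltk.download("punkt", download_dir=data_dir)
    return nltk.data.path
```

Calling ensure_punkt() once near the top of the script, before nltk.sent_tokenize is used, should be enough. The $NLTK_DATA route is similar in spirit; as far as I know NLTK reads that variable when it is imported, so it would have to be set in the web server's environment (or in os.environ before the import) rather than afterwards.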


--
You received this message because you are subscribed to the Google Groups "nltk-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nltk-users+...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/nltk-users/CALRRYPVUREc0xJBTUkVijCHgjOTHN6RKM6jiXu-GEmQanY2v0g%40mail.gmail.com.

Manuel Souto Pico

Sep 22, 2019, 2:47:11 PM
to nltk-...@googlegroups.com
Bravo, Alexis.

That fixed the problem. Thank you so much!

Cheers, Manuel

Priyatam Nayak

Sep 22, 2019, 4:43:16 PM
to nltk-...@googlegroups.com
Download the nltk data manually and put it in the config folder, and also unzip the punkt folder inside the nltk folder.

--
You received this message because you are subscribed to the Google Groups "nltk-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nltk-users+...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/nltk-users/CALRRYPVUREc0xJBTUkVijCHgjOTHN6RKM6jiXu-GEmQanY2v0g%40mail.gmail.com.
--

Regards
Priyatam Nayak
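
The manual route can also be scripted. Below is a rough sketch using only the standard library; unpack_nltk_zip is a made-up helper name, and it assumes the downloaded punkt.zip contains a top-level punkt/ folder, so that extracting it under nltk_data/tokenizers produces the tokenizers/punkt/... layout that the LookupError above is searching for.

```python
import os
import zipfile


def unpack_nltk_zip(zip_path, data_dir, category="tokenizers"):
    """Extract a manually downloaded package zip into data_dir/category."""
    target = os.path.join(data_dir, category)
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

For example, unpack_nltk_zip("punkt.zip", "/var/www/capps/Data/nltk_data") would be expected to leave the files under /var/www/capps/Data/nltk_data/tokenizers/punkt. The directory still has to be on NLTK's search path for the data to be found.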