nltk.data.load() error help

Jarvin Li

Jun 27, 2016, 8:02:27 AM
to nltk-users
Dear All,

I'm experimenting with NLTK, and I am having trouble using the load() function. I need to load a file from my own machine. Based on what I found online, I tried this:

import nltk

trailfile = nltk.data.load('C:/quotefilePos.json', 'json', True, True, 'None', 'None', 'None')

After that I got some error messages that look like this:

C:\Python34\python.exe "C:/Users/JarvinLi/PycharmProjects/ThesisTrial1/Trial Loading.py"
<<Loading C://quotefilePos.json>>
Traceback (most recent call last):
  File "C:/Users/JarvinLi/PycharmProjects/ThesisTrial1/Trial Loading.py", line 3, in <module>
    trailfile = nltk.data.load(r'C:/quotefilePos.json', 'json', True, True, 'None', 'None', 'None')
  File "C:\Python34\lib\site-packages\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Python34\lib\site-packages\nltk\data.py", line 924, in _open
    return urlopen(resource_url)
  File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 464, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 487, in _open
    'unknown_open', req)
  File "C:\Python34\lib\urllib\request.py", line 442, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1253, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: c>

The problem is, I don't understand it. Any tips or help on how I can get this running?

Any reply will be welcomed. 

Thanks in advance.




George Orton

Jun 27, 2016, 8:53:38 AM
to nltk-...@googlegroups.com
Hi,

The error message "urllib.error.URLError: <urlopen error unknown url type: c>" indicates that "C:/quotefilePos.json" is not a valid URL. You will need to specify a valid URL for urllib to work correctly.
Sincerely, George
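
For readers wondering where the "c" in the error comes from, here is a minimal sketch using only the standard library. It assumes that the URL parsing behind urlopen splits the scheme the same way urllib.parse.urlparse does. The variable names are illustrative and not taken from the thread.

```python
from urllib.parse import urlparse

# The bare Windows path from the question, with no protocol prefix.
windows_path = "C:/quotefilePos.json"

parsed = urlparse(windows_path)

# Everything before the first colon is read as the URL scheme and lowercased,
# so the drive letter ends up being treated like "http" or "file" would be.
scheme = parsed.scheme
print(scheme)
```

The scheme reported here is the same single letter that appears in "unknown url type: c".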

Dimitriadis, A. (Alexis)

Jun 27, 2016, 6:26:50 PM
to nltk-...@googlegroups.com
trailfile = nltk.data.load('C:/quotefilePos.json', 'json', True, True, 'None', 'None', 'None')
...

urllib.error.URLError: <urlopen error unknown url type: c>

There was a bug, since fixed, in an older version of the NLTK that can give this error on Windows. Basically the C: at the front of your path is mistaken for a protocol, like "http:". I couldn't say if downloading the latest version will make your problem go away, but it's worth a try. If the problem persists (or if you don't feel like refreshing your nltk), explicitly add the "file:" protocol at the start of your path:

trailfile = nltk.data.load(r'file:C:\quotefilePos.json', 'json', True, True, 'None', 'None', 'None')

Alexis
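
As a rough check of why the "file:" prefix helps, the following sketch parses the prefixed string with the standard library only. It does not call NLTK, so it shows only how the string is split, not what a particular NLTK version does with it. The variable names are illustrative.

```python
from urllib.parse import urlparse

# The prefixed resource string suggested in the reply above.
prefixed = r"file:C:\quotefilePos.json"

parsed = urlparse(prefixed)

# With the explicit prefix, the first colon now follows "file", so the drive
# letter is left in the path portion instead of being read as the scheme.
prefixed_scheme = parsed.scheme
remaining_path = parsed.path
print(prefixed_scheme, remaining_path)
```

With a recognised scheme in front, the drive letter stays in the path portion instead of being read as the protocol.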