problem with "from nltk.book import *"

Katya

Apr 4, 2013, 4:37:04 AM
to nltk-...@googlegroups.com
Hi all,

When I try to run "from nltk.book import *", I get the following error message:
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.

Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
from nltk.book import *
File "D:\Python\2.7.3\lib\site-packages\nltk\book.py", line 21, in <module>
text1 = Text(gutenberg.words('melville-moby_dick.txt'))
File "D:\Python\2.7.3\lib\site-packages\nltk\corpus\util.py", line 68, in __getattr__
self.__load()
File "D:\Python\2.7.3\lib\site-packages\nltk\corpus\util.py", line 56, in __load
except LookupError: raise e
LookupError:
**********************************************************************
Resource 'corpora/gutenberg' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
Searched in:
- 'C:\\Users\\XiaoJia/nltk_data'
- 'C:\\nltk_data'
- 'D:\\nltk_data'
- 'E:\\nltk_data'
- 'D:\\Python\\2.7.3\\nltk_data'
- 'D:\\Python\\2.7.3\\lib\\nltk_data'
- 'C:\\Users\\XiaoJia\\AppData\\Roaming\\nltk_data'
**********************************************************************
>>>

However, the gutenberg corpus is actually located under: D:\Python\2.7.3\nltk_data\packages\corpora

Do you know how to deal with it?

Thank you in advance!

Katya

Apr 4, 2013, 5:22:27 AM
to nltk-...@googlegroups.com
And here is a new error message:
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.

Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>

from nltk.book import *
File "D:\Python\2.7.3\lib\site-packages\nltk\book.py", line 21, in <module>
text1 = Text(gutenberg.words('melville-moby_dick.txt'))
File "D:\Python\2.7.3\lib\site-packages\nltk\corpus\util.py", line 68, in __getattr__
self.__load()
File "D:\Python\2.7.3\lib\site-packages\nltk\corpus\util.py", line 53, in __load
root = nltk.data.find('corpora/%s' % self.__name)
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 455, in find
try: return find(modified_name)
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 445, in find
try: return ZipFilePathPointer(p, zipentry)
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 311, in __init__
zipfile = OpenOnDemandZipFile(os.path.abspath(zipfile))
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 738, in __init__
zipfile.ZipFile.__init__(self, filename)
File "D:\Python\2.7.3\lib\zipfile.py", line 714, in __init__
self._GetContents()
File "D:\Python\2.7.3\lib\zipfile.py", line 748, in _GetContents
self._RealGetContents()
File "D:\Python\2.7.3\lib\zipfile.py", line 763, in _RealGetContents
raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file
>>>

Alexis Dimitriadis

Apr 4, 2013, 6:14:20 AM
to nltk-...@googlegroups.com
Resource 'corpora/gutenberg' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
Searched in:
...

- 'D:\\Python\\2.7.3\\nltk_data'

While the path to gutenberg corpus is: D:\Python\2.7.3\nltk_data\packages\corpora

Katya,

Have you used nltk.download() to download and install the "book" bundle?

The gutenberg corpus should be in nltk_data\corpora\gutenberg. I'm not sure where the extra "packages" subdirectory came from, but it's confusing the discovery algorithm.
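To illustrate why the extra "packages" level matters, here is a minimal sketch of the kind of lookup described above. It is a simplified imitation written with the standard library only, not NLTK's actual code: the function name `locate_resource` and the temporary directory layout are hypothetical, and the rule "a resource is found if `<dir>/corpora/gutenberg` exists as a directory or as a `.zip` file" is an assumption based on the error message and traceback in this thread.

```python
import os
import tempfile


def locate_resource(search_dirs, resource="corpora/gutenberg"):
    """Return the first search directory that holds the resource, else None.

    Simplified imitation of the lookup (an assumption, not NLTK's code):
    the resource counts as found when <dir>/<resource> is a directory
    or <dir>/<resource>.zip is a file.
    """
    parts = resource.split("/")
    for base in search_dirs:
        candidate = os.path.join(base, *parts)
        if os.path.isdir(candidate) or os.path.isfile(candidate + ".zip"):
            return base
    return None


# Throwaway directory tree standing in for D:\Python\2.7.3\nltk_data.
root = tempfile.mkdtemp()
nltk_data = os.path.join(root, "nltk_data")

# Layout from the thread: an extra "packages" level hides the corpus.
os.makedirs(os.path.join(nltk_data, "packages", "corpora", "gutenberg"))
found_with_packages = locate_resource([nltk_data])

# Expected layout: corpora/gutenberg directly under the search directory.
os.makedirs(os.path.join(nltk_data, "corpora", "gutenberg"))
found_after_fix = locate_resource([nltk_data])

print(found_with_packages)           # None
print(found_after_fix == nltk_data)  # True
```

With NLTK installed, `nltk.data.path` shows the directories that are really searched, and `nltk.data.find('corpora/gutenberg')` raises the LookupError shown in the first message when the corpus is not under one of them.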

Alexis
--
You received this message because you are subscribed to the Google Groups "nltk-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nltk-users+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Katya

Apr 6, 2013, 10:52:47 AM
to nltk-...@googlegroups.com
Thank you, Alexis!

I couldn't download the "book" bundle with nltk.download(), so I just downloaded the whole package from a different site. After I removed the 'packages' subdirectory I got a new error message; perhaps you know what it might be:

>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.

Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
from nltk.book import *
File "D:\Python\2.7.3\lib\site-packages\nltk\book.py", line 21, in <module>
text1 = Text(gutenberg.words('melville-moby_dick.txt'))
File "D:\Python\2.7.3\lib\site-packages\nltk\corpus\util.py", line 68, in __getattr__
self.__load()
File "D:\Python\2.7.3\lib\site-packages\nltk\corpus\util.py", line 53, in __load
root = nltk.data.find('corpora/%s' % self.__name)
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 455, in find
try: return find(modified_name)
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 445, in find
try: return ZipFilePathPointer(p, zipentry)
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 311, in __init__
zipfile = OpenOnDemandZipFile(os.path.abspath(zipfile))
File "D:\Python\2.7.3\lib\site-packages\nltk\data.py", line 738, in __init__
zipfile.ZipFile.__init__(self, filename)
File "D:\Python\2.7.3\lib\zipfile.py", line 714, in __init__
self._GetContents()
File "D:\Python\2.7.3\lib\zipfile.py", line 748, in _GetContents
self._RealGetContents()
File "D:\Python\2.7.3\lib\zipfile.py", line 763, in _RealGetContents
raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file
>>>

Thanks a lot! 

Alexis Dimitriadis

Apr 6, 2013, 4:29:40 PM
to nltk-...@googlegroups.com
It seems to be saying that you still don't have the files you think you have. There's no way to guess what could be wrong with the object you downloaded or the way you installed it, so I'd suggest you try nltk.download() again, and if necessary figure out why it's not working for you.
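One way to narrow this down is to check which files under nltk_data merely carry a .zip name without being real zip archives, since that is what "BadZipfile: File is not a zip file" reports. The sketch below uses only the standard library (`zipfile.is_zipfile`); the helper name `find_bad_zips` and the demo files are hypothetical, and the HTML error page saved under a .zip name is just one guess at how such a file could arise.

```python
import os
import tempfile
import zipfile


def find_bad_zips(data_dir):
    """Return sorted paths of *.zip files under data_dir that are not valid zip archives."""
    bad = []
    for folder, _subdirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(folder, name)
            if name.lower().endswith(".zip") and not zipfile.is_zipfile(path):
                bad.append(path)
    return sorted(bad)


# Demo on a throwaway directory (hypothetical files, not real NLTK data).
demo_dir = tempfile.mkdtemp()
corpora = os.path.join(demo_dir, "corpora")
os.makedirs(corpora)

# A genuine zip archive.
good = os.path.join(corpora, "good.zip")
with zipfile.ZipFile(good, "w") as zf:
    zf.writestr("good/README", "placeholder text")

# A file that only looks like a zip, e.g. an error page saved as .zip.
broken = os.path.join(corpora, "broken.zip")
with open(broken, "w") as fh:
    fh.write("<html>Access denied</html>")

bad_files = find_bad_zips(demo_dir)
print(bad_files)  # lists only the path of broken.zip
```

Pointing `find_bad_zips` at the real data directory (in this thread, D:\Python\2.7.3\nltk_data) would list the files worth downloading again.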

Alexis

Katya Stolpovskaya

Apr 6, 2013, 10:49:00 PM
to nltk-...@googlegroups.com
Thank you again! From what I could find out, nltk.download() doesn't work because of proxy/firewall/etc. settings, so I was advised either to try using Cygwin (apparently many people have this problem on Windows) or to download the data from another source. So I guess I'll have to make friends with Cygwin :) Thank you very much!
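For readers with the same proxy problem, the following is a minimal sketch of routing Python's standard `urllib` through a proxy. It assumes Python 3 (the thread itself used 2.7.3), the proxy address is a placeholder, and the helper name `build_proxy_opener` is hypothetical. Whether this is enough for nltk.download() depends on the network setup, so treat it as a starting point rather than a fix.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return a urllib opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Placeholder address: replace with the proxy your network actually uses.
PROXY = "http://proxy.example.com:3128"
opener = build_proxy_opener(PROXY)

# Read back the proxy settings attached to the opener (no network access needed).
configured = {}
for h in opener.handlers:
    if isinstance(h, urllib.request.ProxyHandler):
        configured = dict(h.proxies)

print(configured)
```

Calling `urllib.request.install_opener(opener)` would make later `urllib.request.urlopen` calls in the same process use this proxy. NLTK also provides an `nltk.set_proxy` helper; its documentation describes the exact arguments.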


--
AKA XIAOJIA

Sadaf Naz

Feb 28, 2017, 6:21:12 AM
to nltk-users
Hi,
I have the same problem. I have downloaded "book" using nltk.download().