Failed installing NLTK with pip


bobtodd

Jul 19, 2011, 10:40:37 AM
to nltk-users
Howdy all,

I'm trying to install NLTK with pip on Mac OS 10.6.8 and I'm running
into trouble. Specifically I'm trying to install in a virtual
environment associated with Python 2.7.2. Whenever I try to install
NLTK using "pip install nltk" (I do have pip set to the right
environment), I receive an error that it cannot find the file
"setup.py".

I know this is an error that's been encountered by others, e.g. here:

http://groups.google.com/group/nltk-users/browse_thread/thread/3162be5b2d13e5a7

which seems to be a mirror of this:

http://www.richard-careaga.com/archives/3474

But in that thread the author suggests that the user do the
following:

cd /usr/local/var/pip/build/nltk*/nltk*
pip install .


What I in fact find in my install is this:


09:23 AM mbpbt:ml> ls
09:23 AM mbpbt:ml> mkvirtualenv ml
New python executable in ml/bin/python
Installing setuptools............done.
Installing pip...............done.
virtualenvwrapper.user_scripts creating /Users/bobtodd/.virtualenvs/ml/bin/predeactivate
virtualenvwrapper.user_scripts creating /Users/bobtodd/.virtualenvs/ml/bin/postdeactivate
virtualenvwrapper.user_scripts creating /Users/bobtodd/.virtualenvs/ml/bin/preactivate
virtualenvwrapper.user_scripts creating /Users/bobtodd/.virtualenvs/ml/bin/postactivate
virtualenvwrapper.user_scripts creating /Users/bobtodd/.virtualenvs/ml/bin/get_env_details
(ml)09:23 AM mbpbt:ml> python --version
Python 2.7.2
(ml)09:23 AM mbpbt:ml> pip install nltk
Downloading/unpacking nltk
Downloading nltk-2.0.1rc1.macosx-10.6-x86_64.tar.gz (1.9Mb): 1.9Mb downloaded
Running setup.py egg_info for package nltk
Traceback (most recent call last):
File "<string>", line 14, in <module>
IOError: [Errno 2] No such file or directory: '/Users/bobtodd/.virtualenvs/ml/build/nltk/setup.py'
Complete output from command python setup.py egg_info:
Traceback (most recent call last):

File "<string>", line 14, in <module>

IOError: [Errno 2] No such file or directory: '/Users/bobtodd/.virtualenvs/ml/build/nltk/setup.py'

----------------------------------------
Command python setup.py egg_info failed with error code 1
Storing complete log in /Users/bobtodd/.pip/pip.log
(ml)09:23 AM mbpbt:ml> cd ~/.virtualenvs/ml/
bin/ build/ include/ lib/
(ml)09:23 AM mbpbt:ml> cd ~/.virtualenvs/ml/build/nltk*/nltk*
-bash: cd: /Users/bobtodd/.virtualenvs/ml/build/nltk*/nltk*: No such file or directory
(ml)09:25 AM mbpbt:ml> ls ~/.virtualenvs/ml/build/nltk/
opt pip-egg-info
(ml)09:25 AM mbpbt:ml> ls ~/.virtualenvs/ml/build/nltk/opt/
local
(ml)09:26 AM mbpbt:ml> ls ~/.virtualenvs/ml/build/nltk/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/nltk
__init__.py       containers.pyc  help.py          sourcedstring.py
__init__.pyc      corpus          help.pyc         sourcedstring.pyc
align.py          data.py         inference        stem
align.pyc         data.pyc        internals.py     tag
app               decorators.py   internals.pyc    test
book.py           decorators.pyc  lazyimport.py    text.py
book.pyc          downloader.py   lazyimport.pyc   text.pyc
ccg               downloader.pyc  metrics          tokenize
chat              draw            misc             toolbox
chunk             etree           model            tree.py
classify          evaluate.py     nltk.jar         tree.pyc
cluster           evaluate.pyc    olac.py          treetransforms.py
collocations.py   examples        olac.pyc         treetransforms.pyc
collocations.pyc  featstruct.py   parse            util.py
compat.py         featstruct.pyc  probability.py   util.pyc
compat.pyc        grammar.py      probability.pyc  yamltags.py
containers.py     grammar.pyc     sem              yamltags.pyc
(ml)09:27 AM mbpbt:ml> ls ~/.virtualenvs/ml/build/nltk/pip-egg-info/
(ml)09:27 AM mbpbt:ml>

So I in fact don't even have the directory ...../build/nltk*/nltk*
that the author suggests I cd to.

From other materials I've found online, many users have solved the
problem (dating back a few years now) by various changes along the
lines of linking a file ..../build/nltk/setup.py to the actual file
deeper down, e.g. .../build/nltk/nltk-<version_number>/setup.py. But
as you can see from the above, I don't even have that... rather I seem
to have .../build/nltk/opt, which honestly seems to me a rather
mysterious beast, what with the Frameworks 'n' all. So I don't even
know how to apply the solutions I've come across.
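
[Editor's sketch] The symlink workarounds mentioned above all depend on a setup.py existing somewhere under the build directory. A minimal way to check that, assuming Python is available, is to walk the directory and list every setup.py found. The function name `find_setup_py` is hypothetical, and the default path is simply the one from the transcript above; an empty result would suggest the unpacked archive is not a source distribution, in which case there is nothing to link to.

```python
import os


def find_setup_py(build_dir):
    """Return a sorted list of every setup.py found under build_dir."""
    matches = []
    for root, _dirs, files in os.walk(build_dir):
        if "setup.py" in files:
            matches.append(os.path.join(root, "setup.py"))
    return sorted(matches)


if __name__ == "__main__":
    # Path taken from the transcript; adjust for your own virtualenv.
    build_dir = os.path.expanduser("~/.virtualenvs/ml/build/nltk")
    for path in find_setup_py(build_dir):
        print(path)
```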

I understand that I could try easy_install, but as pip is getting set
to replace this, I figure a solution to the issue with pip would
benefit others as well. And I understand that I could use MacPorts,
but I've ditched that for Homebrew for other reasons.

Any help would be greatly appreciated.

Kind regards,
Todd

Richard Careaga

Jul 19, 2011, 11:58:16 AM
to nltk-...@googlegroups.com
Just to eliminate the obvious, are you "in" the virtual environment to start?




Mikhail Korobov

Jul 19, 2011, 12:06:29 PM
to nltk-...@googlegroups.com
Hi Todd, this looks like http://code.google.com/p/nltk/issues/detail?id=534

Basically, nltk-2.0.1rc1.macosx-10.6-x86_64.tar.gz is broken because it contains an extra directory structure.
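
[Editor's sketch] One way to see this kind of layout problem before installing is to inspect the archive and check whether setup.py sits at the root or one directory below it, which is where a source tarball normally places it. The sketch below uses only the standard `tarfile` module. The function name `has_top_level_setup_py` and the "at most one directory deep" rule are assumptions made for illustration, not pip's actual logic, and `make_tar` is a hypothetical helper that builds tiny in-memory archives for the demonstration.

```python
import io
import posixpath
import tarfile


def has_top_level_setup_py(tar):
    """True if the archive holds setup.py at its root or one directory down."""
    for member in tar.getmembers():
        parts = posixpath.normpath(member.name).split("/")
        if member.isfile() and parts[-1] == "setup.py" and len(parts) <= 2:
            return True
    return False


def make_tar(names):
    """Hypothetical helper: in-memory tar archive of empty files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r")


# Layout of a source tarball versus the layout seen in the transcript.
source_like = make_tar(["nltk-2.0.1rc1/setup.py",
                        "nltk-2.0.1rc1/nltk/__init__.py"])
binary_like = make_tar(["./opt/local/Library/Frameworks/Python.framework/"
                        "Versions/2.6/lib/python2.6/site-packages/nltk/"
                        "__init__.py"])

print(has_top_level_setup_py(source_like))
print(has_top_level_setup_py(binary_like))
```

For a real file, `tarfile.open("path/to/archive.tar.gz")` can be passed to the same function.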

The solution that works for me is to install from the generic source tarball:

pip install http://pypi.python.org/packages/source/n/nltk/nltk-2.0.1rc1.tar.gz

Richard Careaga

Jul 19, 2011, 12:10:56 PM
to nltk-...@googlegroups.com
That works for me in virtualenv.

Mikhail Korobov wrote:
> pip install http://pypi.python.org/packages/source/n/nltk/nltk-2.0.1rc1.tar.gz

bobtodd

Jul 20, 2011, 10:35:56 AM
to nltk-users, Todd Krause
Howdy Richard and Mikhail,

Thanks very much. I hadn't found that previous thread; thanks for
pointing it out. It seems that

pip install http://pypi.python.org/packages/source/n/nltk/nltk-2.0.1rc1.tar.gz

does in fact solve the problem. That might be one to put on the NLTK
"Download" page....

As it was installing, however, it did say at the end that libyaml was
not found, which struck me as odd. I don't know how to correct that,
or what impact it would have if left uncorrected. I'll include the
output below for reference.

Thanks very much for your help.
Best,
Todd

NLTK install output:

(ml)09:24 AM mbpbt:ml> pip install http://pypi.python.org/packages/source/n/nltk/nltk-2.0.1rc1.tar.gz
Downloading/unpacking http://pypi.python.org/packages/source/n/nltk/nltk-2.0.1rc1.tar.gz
Downloading nltk-2.0.1rc1.tar.gz (2.0Mb): 2.0Mb downloaded
Running setup.py egg_info for package from http://pypi.python.org/packages/source/n/nltk/nltk-2.0.1rc1.tar.gz

Requirement already satisfied (use --upgrade to upgrade): setuptools in /Users/bobtodd/.virtualenvs/ml/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg (from nltk==2.0.1rc1)
Downloading/unpacking PyYAML==3.09 (from nltk==2.0.1rc1)
Downloading PyYAML-3.09.tar.gz (238Kb): 238Kb downloaded
Running setup.py egg_info for package PyYAML

Installing collected packages: PyYAML, nltk
Found existing installation: PyYAML 3.10
Uninstalling PyYAML:
Successfully uninstalled PyYAML
Running setup.py install for PyYAML
checking if libyaml is compilable
/usr/bin/cc -fno-strict-aliasing -arch i386 -arch x86_64 -O3 -march=core2 -msse4.1 -w -pipe -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/Cellar/python/2.7.2/include/python2.7 -c build/temp.macosx-10.5-intel-2.7/check_libyaml.c -o build/temp.macosx-10.5-intel-2.7/check_libyaml.o
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:2:18: error: yaml.h: No such file or directory
build/temp.macosx-10.5-intel-2.7/check_libyaml.c: In function ‘main’:
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: ‘yaml_parser_t’ undeclared (first use in this function)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: (Each undeclared identifier is reported only once
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: for each function it appears in.)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: expected ‘;’ before ‘parser’
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:6: error: ‘yaml_emitter_t’ undeclared (first use in this function)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:6: error: expected ‘;’ before ‘emitter’
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:8: error: ‘parser’ undeclared (first use in this function)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:11: error: ‘emitter’ undeclared (first use in this function)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:2:18: error: yaml.h: No such file or directory
build/temp.macosx-10.5-intel-2.7/check_libyaml.c: In function ‘main’:
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: ‘yaml_parser_t’ undeclared (first use in this function)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: (Each undeclared identifier is reported only once
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: for each function it appears in.)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:5: error: expected ‘;’ before ‘parser’
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:6: error: ‘yaml_emitter_t’ undeclared (first use in this function)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:6: error: expected ‘;’ before ‘emitter’
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:8: error: ‘parser’ undeclared (first use in this function)
build/temp.macosx-10.5-intel-2.7/check_libyaml.c:11: error: ‘emitter’ undeclared (first use in this function)
lipo: can't open input file: /var/folders/Ay/AyvniV1bGgGjRXMBvvzWnk+++TI/-Tmp-//ccAhi9O8.out (No such file or directory)

libyaml is not found or a compiler error: forcing --without-libyaml
(if libyaml is installed correctly, you may need to
specify the option --include-dirs or uncomment and
modify the parameter include_dirs in setup.cfg)

Running setup.py install for nltk

Successfully installed PyYAML nltk
Cleaning up...
(ml)09:24 AM mbpbt:ml> python
Python 2.7.2 (default, Jul 10 2011, 04:08:06)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> exit()
(ml)09:26 AM mbpbt:ml> pip freeze
PyYAML==3.09
distribute==0.6.19
matplotlib==1.1.0
nltk==2.0.1rc1
numpy==1.6.0
virtualenv==1.6.1
virtualenvwrapper==2.7.1
wsgiref==0.1.2
(ml)09:26 AM mbpbt:ml>


Mikhail Korobov

Jul 20, 2011, 10:48:32 AM
to nltk-...@googlegroups.com, Todd Krause
Hi Todd,

Great to hear the problem is solved!

Don't worry about the libyaml errors: libyaml is a library used to speed up PyYAML, but PyYAML works fine without it. I don't think PyYAML speed is critical for nltk, so there is nothing wrong with using pure-Python PyYAML.

Anyway, to get PyYAML to compile with libyaml support, the correct libyaml binaries and headers need to be installed. I think I managed to do this using brew (I had no success with MacPorts because it doesn't provide the necessary headers), but this was a long time ago and things may have changed since.
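
[Editor's sketch] To find out which PyYAML variant ended up installed, one option is to check for the C-accelerated loader, since PyYAML exposes `yaml.CLoader` only when its libyaml bindings were built. The function name `pyyaml_backend` and its three return strings are hypothetical, chosen only for this illustration.

```python
def pyyaml_backend():
    """Return 'libyaml', 'pure-python', or 'not-installed'."""
    try:
        import yaml
    except ImportError:
        return "not-installed"
    # CLoader is present only when the libyaml bindings were compiled.
    return "libyaml" if hasattr(yaml, "CLoader") else "pure-python"


print(pyyaml_backend())
```

With the install output shown earlier in the thread (forcing --without-libyaml), this would be expected to report the pure-Python variant.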

bobtodd

Jul 20, 2011, 11:04:58 AM
to nltk-users, Todd Krause
Howdy Mikhail,

Great. Thanks for the explanation. The note about Homebrew helps;
I've found that package manager very nice for managing library
issues. But perhaps I'll leave the libyaml issue until my NLTK or
YAML use gets heavy enough to warrant a fix.

Thanks again!
Todd