stanford.py


lisa

Nov 30, 2012, 9:12:15 AM
to nltk-...@googlegroups.com
I just updated to the latest version of NLTK and ran the following code, but got an error: NLTK was unable to find the java file!

It's odd, since the first line of output says it found the Java executable, but later it complains about the java file again. I'd like to get this working as soon as possible, and I'll be watching the thread. Thanks!

import nltk
from nltk.tag.stanford import StanfordTagger
nltk.internals.config_java("C:/Program Files/Java/jdk1.7.0_04/bin/java.exe")
path_to_model = "C:/Dropbox/_Ubuntu/project/NLTK/src/lib/wsj-0-18-bidirectional-distsim.tagger"
path_to_jar = "C:/Dropbox/_Ubuntu/project/NLTK/src/lib/stanford-postagger.jar"

tagger = StanfordTagger(path_to_model, path_to_jar)
tokens = nltk.tokenize.word_tokenize("I hope this works!")
print tagger.tag(tokens)



[Found C:/Program Files/Java/jdk1.7.0_04/bin/java.exe: C:/Program Files/Java/jdk1.7.0_04/bin/java.exe]

Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\test.py", line 9, in <module>
    print tagger.tag(tokens)
  File "C:\Python27\lib\site-packages\nltk\tag\stanford.py", line 54, in tag
    return self.batch_tag([tokens])[0]
  File "C:\Python27\lib\site-packages\nltk\tag\stanford.py", line 59, in batch_tag
    config_java(options=self.java_options, verbose=False)
  File "C:\Python27\lib\site-packages\nltk\internals.py", line 90, in config_java
    _java_bin = find_binary('java', bin, env_vars=['JAVAHOME', 'JAVA_HOME'], verbose=verbose)
  File "C:\Python27\lib\site-packages\nltk\internals.py", line 528, in find_binary
    url, verbose)
  File "C:\Python27\lib\site-packages\nltk\internals.py", line 512, in find_file
    raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
LookupError: 

===========================================================================
NLTK was unable to find the java file!
Use software specific configuration paramaters or set the JAVAHOME environment variable.
===========================================================================

Morten Minde Neergaard

Nov 30, 2012, 11:20:50 AM
to nltk-...@googlegroups.com
At 06:12, Fri 2012-11-30, lisa wrote:
> I just updated to the last version nltk and run the following code but got
> errors: NLTK was unable to find the java file!

I recently fixed the code so that it gives a warning when you instantiate
StanfordTagger directly. Try stanford.POSTagger or stanford.NERTagger instead.


Cheers,
--
Morten Minde Neergaard

lisa

Dec 3, 2012, 8:02:26 PM
to nltk-...@googlegroups.com
Update to the latest version of NLTK and it works like a charm.
Also, I tested the same code on 64-bit Windows and 64-bit Ubuntu. The latter seems to work fine, but the former fails: it kept complaining about the Java path on Windows.

A note for everyone.

lisa

Feb 21, 2013, 7:22:52 PM
to nltk-...@googlegroups.com
I ran into this issue too. Is 64-bit Windows just not working?

I have done the following:
set both JAVA_HOME and JAVAHOME
called nltk.internals.config_java("C:/Program Files/Java/jdk1.7.0_04/bin/java.exe") in my code
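
Before trying workarounds, it can help to see what a Java lookup actually has to work with. The following is a minimal diagnostic sketch, assuming Python 3 and using only the standard library; describe_java_setup is a hypothetical helper for illustration and is not part of NLTK.

```python
import os
import shutil


def describe_java_setup(environ=None):
    """Summarize the Java-related environment (hypothetical helper, not NLTK code)."""
    environ = os.environ if environ is None else environ
    report = {}
    for name in ("JAVAHOME", "JAVA_HOME"):
        value = environ.get(name)
        report[name] = {
            "value": value,
            # True only if the variable is set and names an existing directory
            "is_dir": bool(value) and os.path.isdir(value),
        }
    # shutil.which searches the given PATH; on Windows it also tries the
    # PATHEXT extensions, so "java" can match java.exe
    report["java_on_path"] = shutil.which("java", path=environ.get("PATH"))
    return report
```

Calling describe_java_setup() with no arguments inspects the real environment. A variable that is set but has "is_dir" False usually means it points at a file such as java.exe, or at a directory that does not exist.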

Kenneth

Feb 25, 2013, 4:04:05 PM
to nltk-...@googlegroups.com
I have had the same problem on Windows 7 64-bit, where I would get the "NLTK was unable to find the java file!" error.

I set JAVA_HOME and called nltk.internals.config_java manually, to no avail.

I have been able to get it working by commenting out two lines in the batch_tag function in \nltk\tag\stanford.py

They are lines 59 and 85:

config_java(options=self.java_options, verbose=False)
and 
config_java(options=default_options, verbose=False)
respectively.

If you still need to pass java_options, it should also work if you modify these calls to include the path to the JVM through the bin argument. The signature is config_java(bin=None, options=None, verbose=True)

I'm still unsure why this is happening on 64-bit windows but it is something I will be looking into more in the next few days, hope this helps.

Kenneth

lisa

Feb 25, 2013, 4:16:41 PM
to nltk-...@googlegroups.com
Hi Kenneth, 
    I have noted your solution as a backup. I got it working recently by hard-coding my system's Java path into internals.py. I think internals.py is designed to work on Linux, as the java binary it looks for doesn't have an extension (.exe).
    I hope this thread helps others save some brain cells.
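
To illustrate the suspected failure mode (a lookup that only tries the bare name java and so misses java.exe), here is a small sketch of a platform-aware search. It is not NLTK's implementation; find_java and its platform parameter are hypothetical names chosen for this example.

```python
import os


def find_java(directory, platform=os.name):
    """Return the Java launcher inside `directory`, or None (hypothetical helper)."""
    # On Windows the launcher is java.exe, so try that name first;
    # on other platforms only the bare name is expected.
    names = ["java.exe", "java"] if platform == "nt" else ["java"]
    for name in names:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

For a JDK bin directory that contains only java.exe, a lookup limited to the bare name returns None, while the "nt" branch finds the file. That matches the symptom reported here, although the actual cause in internals.py is not confirmed in this thread.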

dgraham

May 20, 2013, 10:33:12 PM
to nltk-...@googlegroups.com
Hi All,

I just had this problem today and found several bugs in internals.py. Who can I submit a potential patch to, so that it works on both Windows and Linux/Mac?

Thanks and regards,

Dougal Graham

arjun govani

May 7, 2014, 6:22:22 AM
to nltk-...@googlegroups.com, ryal...@gmail.com
Hi Kenneth,

I have the same problem on 64-bit Windows. I don't have permission to modify stanford.py, so I can't comment out those two lines. Is there any other solution?

Thank You,
Arjun  

K R Anushree

Apr 22, 2015, 3:55:26 AM
to nltk-...@googlegroups.com
A simple solution that worked for me (no need to change internals.py, update NLTK, or call nltk.internals.config_java(), which did not work for me):

Add the following to your Python code:
import os
os.environ['JAVAHOME'] = "C:/Program Files/Java/jdk1.8.0_31/bin"

Then use NERTagger as follows:
from nltk.tag.stanford import NERTagger
st = NERTagger('stanford-ner-2014-06-16/classifiers/english.all.3class.distsim.crf.ser.gz','stanford-ner-2014-06-16/stanford-ner.jar')
st.tag('John has refused the offer from Facebook'.split())

Output: [(u'John', u'PERSON'), (u'has', u'O'), (u'refused', u'O'), (u'the', u'O'), (u'offer', u'O'), (u'from', u'O'), (u'Facebook', u'ORGANIZATION'), (u'.', u'O')]

This worked like a charm for me on a Windows 7 64-bit machine.
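
The same idea can be written without hard-coding a JDK version. This is a sketch under the assumption that JAVA_HOME points at the JDK root (the usual convention), so that its bin subdirectory is what JAVAHOME should hold here; ensure_javahome is a hypothetical helper, not an NLTK function.

```python
import os


def ensure_javahome(environ=None):
    """Set JAVAHOME to <JAVA_HOME>/bin when JAVAHOME is unset (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Leave an existing JAVAHOME alone; otherwise derive it from JAVA_HOME,
    # assuming JAVA_HOME is the JDK root directory.
    if not environ.get("JAVAHOME") and environ.get("JAVA_HOME"):
        environ["JAVAHOME"] = os.path.join(environ["JAVA_HOME"], "bin")
    return environ.get("JAVAHOME")
```

Call ensure_javahome() once, before constructing the tagger. It returns None when neither variable is set, in which case JAVAHOME still has to be assigned by hand as shown above.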

