MaxentClassifier megam algorithm installation question


Bio

Jan 5, 2013, 8:49:14 AM
to nltk-...@googlegroups.com
Hello, I am attempting to use the megam algorithm with the Maxent Classifier. When I run self.classifier = nltk.MaxentClassifier.train(train_set, algorithm='megam', trace=0), I get the following error:

>>> 

Traceback (most recent call last):
  File "/Users/George Orton/text_proc_7.py", line 124, in <module>
    chunker = ConsecutiveNPChunker(train_sents)
  File "/Users/George Orton/text_proc_7.py", line 112, in __init__
    self.tagger = ConsecutiveNPChunkTagger(tagged_sents)
  File "/Users/George Orton/text_proc_7.py", line 97, in __init__
    train_set, algorithm='megam', trace=0)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/site-packages/nltk/classify/maxent.py", line 315, in train
    gaussian_prior_sigma, **cutoffs)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/site-packages/nltk/classify/maxent.py", line 1518, in train_maxent_classifier_with_megam
    stdout = call_megam(options)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/site-packages/nltk/classify/megam.py", line 163, in call_megam
    config_megam()
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/site-packages/nltk/classify/megam.py", line 59, in config_megam
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/site-packages/nltk/internals.py", line 528, in find_binary
    url, verbose)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/site-packages/nltk/internals.py", line 512, in find_file
    raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
LookupError: 

===========================================================================
NLTK was unable to find the megam file!
Use software specific configuration paramaters or set the MEGAM environment variable.

  For more information, on megam, see:
===========================================================================
>>> 

To try and resolve this I did a web search on installing megam and found: 

MegaM: http://hal3.name/megam/megam_src.tgz (requires ocaml; edit Makefile to set WITHCLIBS to point to location of ocaml .h files; locate these using macports with port contents ocaml, install megam binary, e.g. to /usr/local/bin and set MEGAMHOME to point to this directory).

I then downloaded megam and tried again but got the same error message. Because these directions say that ocaml is required and that I must edit the Makefile to set WITHCLIBS to point to the location of the ocaml .h files, I typed port contents ocaml at the terminal prompt and got:

george-ortons-macbook-pro:~ George Orton$ port contents ocaml
Warning: port definitions are more than two weeks old, consider using selfupdate
Port ocaml is not installed.

What I'm hoping is that someone can help me with this megam installation. Am I correct in assuming that ocaml is a programming language that I must first install in order to use megam? Any help with this megam installation process would be appreciated. Thanks, George

Steven Bird

Jan 5, 2013, 4:07:26 PM
to nltk-...@googlegroups.com
Hi George,

First you need to install OCaml, then build MegaM from source. See also:

-Steven Bird

Bio

Jan 6, 2013, 8:14:07 AM
to nltk-...@googlegroups.com
Hi Steven, Thank you for your response. I will try to document the process I go through as I install megam, in hopes of providing a reference for future questions about megam installation. Sincerely, George

Bio

Jan 7, 2013, 9:06:44 AM
to nltk-...@googlegroups.com
Hello, In my ongoing quest to install MegaM I have installed OCaml. I downloaded OCaml from "http://caml.inria.fr/download.en.html" and then installed it. According to the information I obtained from a web search I must next "edit Makefile to set WITHCLIBS to point to location of ocaml .h files; locate these using macports with "port contents ocaml"". So first I typed in port contents ocaml at the Terminal prompt. I got an error message saying:

george-ortons-macbook-pro:~ George Orton$ port contents ocaml
Port ocaml is not installed.

So next I used Finder (I am on a Mac) to try to locate any ocaml .h files but was unable to find any; there are lots of ocaml files on my system, but none that end in .h (at least none that I could find). I installed OCaml after downloading it, so it's not clear to me why I am getting the "not installed" message from MacPorts. If anybody has any ideas on how to proceed from here, I would appreciate the help. I know this is not specifically an nltk question, but since I only wish to use MegaM in conjunction with nltk, I am posting in this forum. Sincerely, George

Steven Bird

Jan 7, 2013, 3:50:25 PM
to nltk-...@googlegroups.com
Hi George,

Macports only knows about software that was installed using macports. Since you didn't use macports to install ocaml, you can't use macports to tell you where it is installed. You could look at the ocaml makefile to see where it installs ocaml. Or you could reinstall ocaml using macports (sudo port install ocaml). Or you could re-try searching for the ocaml install location using your finder (since it may have re-indexed your hard drive by now).
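
A minimal sketch of one more way to find the headers, assuming the `ocamlc` compiler is on the PATH: `ocamlc -where` prints OCaml's standard library directory, and the C headers (files such as `mlvalues.h`) normally sit in its `caml` subdirectory. The helper names below are hypothetical and not part of MegaM or NLTK.

```python
import os
import subprocess


def ocaml_stdlib_dir():
    """Return the directory printed by `ocamlc -where`, or None if ocamlc is unavailable."""
    try:
        result = subprocess.run(
            ["ocamlc", "-where"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # ocamlc is not installed, not on the PATH, or exited with an error.
        return None
    return result.stdout.strip() or None


def caml_include_dir(stdlib_dir):
    """Return <stdlib_dir>/caml if it contains mlvalues.h, else None."""
    candidate = os.path.join(stdlib_dir, "caml")
    if os.path.isfile(os.path.join(candidate, "mlvalues.h")):
        return candidate
    return None


if __name__ == "__main__":
    stdlib = ocaml_stdlib_dir()
    include = caml_include_dir(stdlib) if stdlib else None
    if include:
        # Candidate value for the WITHCLIBS line of MegaM's Makefile.
        print("WITHCLIBS =-I " + include)
    else:
        print("Could not locate the OCaml C headers")
```

If the script prints a WITHCLIBS line, that directory plays the same role as the Makefile's default of /usr/lib/ocaml/caml.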

-Steven Bird

Leon Derczynski

Jan 7, 2013, 4:08:31 PM
to nltk-...@googlegroups.com
Hi George,

If megam is troublesome to install on your platform, you may prefer to use the MaxEnt classifier that is bundled with NLTK. While a little slower in many situations, it is also adequate, runs natively in Python, and is suitable for a wide range of experimental purposes. It may be worth using this to conduct your experiments before investing a (potentially long) time coaxing megam into running on your platform. Good luck either way!
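
A minimal sketch of that approach: try the megam backend first and fall back to a pure-Python algorithm when the binary is missing. It assumes the trainer behaves like nltk.MaxentClassifier.train, which raises LookupError when megam cannot be found (as in the traceback earlier in this thread) and accepts algorithm='iis'. The wrapper name `train_with_fallback` is hypothetical.

```python
def train_with_fallback(train, train_set, fallback_algorithm="iis"):
    """Train with megam if available, otherwise with a built-in algorithm.

    `train` is expected to behave like nltk.MaxentClassifier.train.
    """
    try:
        return train(train_set, algorithm="megam", trace=0)
    except LookupError:
        # The megam binary was not found; use the slower pure-Python trainer.
        return train(train_set, algorithm=fallback_algorithm, trace=0)


# Usage with NLTK installed (not executed here):
#   import nltk
#   classifier = train_with_fallback(nltk.MaxentClassifier.train, train_set)
```

Passing the trainer in as an argument keeps the sketch independent of any particular NLTK version and makes the fallback easy to exercise without megam installed.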

All the best,


Leon



--
Leon R A Derczynski
Research Associate, NLP Group

Department of Computer Science
University of Sheffield
Regent Court, 211 Portobello
Sheffield S1 4DP, UK

+45 5157 4948
http://www.dcs.shef.ac.uk/~leon/

Fred Mailhot

Jan 7, 2013, 5:35:02 PM
to nltk-...@googlegroups.com
Caveat: I have had a hard time using the NLTK MaxEnt classifier for any but the most trivial problems.

Training for a classifier-based POS tagger took ~30 minutes with MegaM[1], but never converged with the NLTK classifier. By "never" I mean I gave up after >10 hrs with all 8 cores of my MacBook slammed.

[1] I.e. I have successfully installed MegaM on my MBP, OSX 10.7...I *may* have documented the process, if so, I'll report here ASAP.



Bio

Jan 8, 2013, 9:35:59 AM
to nltk-...@googlegroups.com
Hi Steven, Leon & Fred, Thank you for your responses. Fred, if by chance you documented the process of successfully installing MegaM on your MBP, I would greatly appreciate your posting it here. Steven, I decided to try your suggestion and reinstall ocaml using MacPorts. Unfortunately I received an error message:

george-ortons-macbook-pro:~ George Orton$ sudo port install ocaml
Password:
Error: Unable to open port: can't read "build.cmd": Failed to locate 'make' in path: '/opt/local/bin:/opt/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin' or at its MacPorts configuration time location, did you move it?

I'm not sure what this error means; my best guess is that the MacPorts install of ocaml is faulty. Perhaps MacPorts is unable to install ocaml because I have already installed ocaml without using MacPorts, and I would need to uninstall it before attempting a MacPorts install. Because I was unsuccessful with the MacPorts install, I decided to try editing megam's Makefile to set WITHCLIBS to the location of the ocaml .h files, per the instructions I originally found through an internet search. The problem is that I'm not sure what exactly an ocaml .h file is. When I look up ocaml in Finder I find 470 different files and folders, none of which end in .h. What I decided to try was taking the path to the folder labeled ocaml and using that as the WITHCLIBS location. Here is what my original WITHCLIBS line read:

WITHCLIBS =-I /usr/lib/ocaml/caml

and here is what I changed it to:

WITHCLIBS =-I /opt/local/var/macports/sources/rsync.macports.org/release/ports/lang

I found it odd that the path listed macports since my macports install failed, but I wasn't sure what else to try. Unfortunately this Makefile edit did not solve the problem. When I ran my original self.classifier = nltk.MaxentClassifier.train(train_set, algorithm='megam', trace=0) code (I am running the code from chapter 7, page 275 of the Natural Language Processing with Python text), I got the same error message I originally encountered:

NLTK was unable to find the megam file!
Use software specific configuration paramaters or set the MEGAM environment variable.

Clearly, the Makefile edit I tried is not correct. In reading the MegaM documentation I read about the nltk.config_megam("path to megam")  command. So I also tried putting that in my code. Unfortunately that also gave an error message:

NLTK was unable to find the /Users/George Orton/Downloads file!
Use software specific configuration paramaters or set the MEGAM environment variable.

To proceed from this point, I think I need to know which file is being referenced in the directions that say to edit the Makefile with the location of the ocaml .h files. Which of the 470 ocaml files/folders is the ocaml .h file? I am also not clear on whether I need to use the nltk.config_megam("path to megam") command when attempting to use the megam algorithm with the Maxent Classifier. If anybody has any thoughts on how to proceed from here, I would greatly appreciate the input. Sincerely, George
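
The `Failed to locate 'make'` message above points at a missing build tool rather than at OCaml itself; on a Mac, `make` usually comes with Apple's command line developer tools (an assumption about this machine, not something confirmed in the thread). A minimal sketch that reports which of the programs likely needed to build MegaM are absent from the PATH; the tool list and the function name are illustrative.

```python
import shutil


def missing_tools(tools=("make", "ocamlc", "ocamlopt"), which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    absent = missing_tools()
    if absent:
        print("Not found on PATH: " + ", ".join(absent))
    else:
        print("All build tools found")
```

Any name reported as missing would need to be installed, or its directory added to the PATH, before the build can be expected to run.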

Mikhail Korobov

Jan 8, 2013, 10:35:06 AM
to nltk-...@googlegroups.com
I also gave up installing MegaM (using ocaml from homebrew); editing the makefile didn't help me either :) 

Classifiers in scikit-learn are really good and efficient; there is a LogisticRegression classifier (see http://scikit-learn.org/dev/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression and http://scikit-learn.org/dev/modules/linear_model.html#logistic-regression). You can use it directly or through an NLTK compat layer (see https://github.com/nltk/nltk/blob/master/nltk/classify/scikitlearn.py).

On Tuesday, January 8, 2013, 20:35:59 UTC+6, user Bio wrote:

Bio

Jan 9, 2013, 9:44:01 AM
to nltk-...@googlegroups.com
Hello, I am still working on getting MegaM to work with the Maxent Classifier. I decided to try reinstalling ocaml to see if the install directories would be listed during the install process. It turns out the directory path was listed: the command line executables are all installed in /usr/local/bin. Armed with this information, I went into megam's Makefile and edited the WITHCLIBS line to show this path. I then ran my code and unfortunately got the same error message as before.

NLTK was unable to find the megam file!
Use software specific configuration paramaters or set the MEGAM environment variable.

Based on this error message I am beginning to think that it is not the WITHCLIBS setting inside megam's Makefile file that is causing the problem but the nltk specific command:

nltk.config_megam('.../path/to/megam')

which is causing the problem. I have tried adding this command to my code and I get a similar but different error message. Here is my version of the config_megam:

nltk.config_megam("/Users/George Orton/Downloads/megam_0.92")

and here is the error message I get when I run my code with this command included:

NLTK was unable to find the /Users/George Orton/Downloads/megam_0.92 file!
Use software specific configuration paramaters or set the MEGAM environment variable.

From this error message I believe that perhaps the path I am supposed to use in the config_megam command is supposed to be one of the files inside the megam folder rather than the path to the megam folder itself. So, I am wondering if anybody has any experience with this particular config_megam command and knows what '.../path/to/megam' should be? In the interim I changed the classifier from the Maxent to the NaiveBayes classifier in order to continue working through the concept presented in the Natural Language Processing with Python text where this question arose. Sincerely, George
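
Both error messages are consistent with no compiled megam executable existing yet: downloading the source archive does not produce one until the build succeeds. A minimal sketch, assuming the build leaves an executable named `megam` or `megam.opt` in the source directory (both names are assumptions here), that looks for such a file so its full path can be passed to `nltk.config_megam`. The function name and the example directory are hypothetical.

```python
import os


def find_megam_binary(directory, names=("megam.opt", "megam")):
    """Return the path of the first executable file in `directory` whose
    name is in `names`, or None if there is none."""
    for name in names:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


if __name__ == "__main__":
    # Hypothetical location; substitute the directory where MegaM was built.
    binary = find_megam_binary(os.path.expanduser("~/Downloads/megam_0.92"))
    if binary is None:
        print("No compiled megam executable found; the source still needs to be built")
    else:
        # With NLTK installed: import nltk; nltk.config_megam(binary)
        print("Pass this path to nltk.config_megam: " + binary)
```

If the function returns None for the download folder, the remaining work is in the build step rather than in the NLTK configuration call.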

Matthew Versaggi

Jun 21, 2015, 2:15:30 PM
to nltk-...@googlegroups.com

FYI ....

My posting to Stack Overflow on getting MegaM running on a Windows 7 box: