How to use the maximum entropy classifier in NLTK?


Raymond

Mar 17, 2010, 3:05:16 AM
to nltk-users
Dear all,
I am working on a text classification project. I have used the
naive Bayes classifier and it works OK, but I don't know how to use
the maxent classifier. The API doc lists the arguments the method
takes, but I don't understand them. Can someone explain?
Right now I have a feature set that is a list of tuples, each made of
a dictionary and a class label, i.e. [ ( dict, label ) ], which is what
I passed to the naive Bayes classifier.
Thank you, everyone.


Regards,
Raymond

Raymond

Mar 17, 2010, 3:07:47 AM
to nltk-users
Here are the two passages from the API doc. I don't understand what
they are saying.

-------------------------------------------------------------------------------------------------------------------------
Converting Input-Features to Joint-Features

In maximum entropy models, joint-features are required to have numeric
values. Typically, each input-feature input_feat is mapped to a set of
joint-features of the form:

joint_feat(token, label) = { 1 if input_feat(token) == feat_val
                           {      and label == some_label
                           {
                           { 0 otherwise
----------------------------------------------------------------------------------------------------------------------------
A feature encoding that generates vectors containing binary
joint-features of the form:

joint_feat(fs, l) = { 1 if (fs[fname] == fval) and (l == label)
                    {
                    { 0 otherwise
Where fname is the name of an input-feature, fval is a value for that
input-feature, and label is a label.

Typically, these features are constructed based on a training corpus,
using the train() method. This method will create one feature for each
combination of fname, fval, and label that occurs at least once in the
training corpus.

The unseen_features parameter can be used to add unseen-value
features, which are used whenever an input feature has a value that
was not encountered in the training corpus. These features have the
form:

joint_feat(fs, l) = { 1 if is_unseen(fname, fs[fname])
                    {      and l == label
                    {
                    { 0 otherwise
Where is_unseen(fname, fval) is true if the encoding does not contain
any joint features that are true when fs[fname]==fval.

The alwayson_features parameter can be used to add always-on features,
which have the form:

joint_feat(fs, l) = { 1 if (l == label)
                    {
                    { 0 otherwise

----------------------------------------------------------------------------------------------------------------------------
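
The quoted formulas can be made concrete with a small pure-Python sketch. This is not NLTK's code: the class name BinaryJointEncoding and its attributes are made up for illustration, and NLTK's own encoding class may differ in details such as how unseen-value features are indexed. It only follows the three formulas as quoted: one joint-feature per (fname, fval, label) seen in training, optional unseen-value features per (fname, label), and optional always-on features per label.

```python
class BinaryJointEncoding:
    """Toy version of the binary joint-feature scheme quoted above (hypothetical)."""

    def __init__(self, train_toks, unseen_features=False, alwayson_features=False):
        self.labels = sorted({label for _, label in train_toks})

        # One joint-feature index per (fname, fval, label) seen in training.
        self.mapping = {}
        for fs, label in train_toks:
            for fname, fval in fs.items():
                self.mapping.setdefault((fname, fval, label), len(self.mapping))

        # (fname, fval) pairs that fire at least one joint-feature;
        # is_unseen(fname, fval) is true exactly when the pair is not in here.
        self.seen = {(fname, fval) for fname, fval, _ in self.mapping}

        next_index = len(self.mapping)

        # Always-on features: one per label, value 1 whenever l == label.
        self.alwayson = {}
        if alwayson_features:
            for label in self.labels:
                self.alwayson[label] = next_index
                next_index += 1

        # Unseen-value features: one per (fname, label), as in the quoted formula.
        self.unseen = {}
        if unseen_features:
            fnames = sorted({fname for fname, _, _ in self.mapping})
            for fname in fnames:
                for label in self.labels:
                    self.unseen[(fname, label)] = next_index
                    next_index += 1

    def encode(self, fs, label):
        """Return the sorted indices of the joint-features whose value is 1."""
        active = []
        for fname, fval in fs.items():
            if (fname, fval, label) in self.mapping:
                active.append(self.mapping[(fname, fval, label)])
            elif (fname, fval) not in self.seen and (fname, label) in self.unseen:
                active.append(self.unseen[(fname, label)])
        if label in self.alwayson:
            active.append(self.alwayson[label])
        return sorted(active)


# Toy training data in the [(dict, label)] shape used in this thread.
train_toks = [
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "rainy", "windy": True}, "stay"),
]
enc = BinaryJointEncoding(train_toks, unseen_features=True, alwayson_features=True)

# "sunny" was seen with "play", so that joint-feature fires, plus the always-on one.
print(enc.encode({"outlook": "sunny", "windy": True}, "play"))
# "foggy" never occurred in training, so the unseen-value feature for "outlook" fires.
print(enc.encode({"outlook": "foggy", "windy": False}, "stay"))
```

The point of the joint form is that every feature is tied to a label, so the same input dictionary yields a different vector for each candidate label, and the model learns one weight per joint-feature.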

Jacob Perkins

Mar 17, 2010, 8:26:22 PM
to nltk-users
AFAIK you need to call the train classmethod with a list of labeled
featuresets, i.e. the same [ ( dict, label ) ] list you already have.
That'll give you a trained instance you can use for classification.

Jacob
---
http://streamhacker.com
http://twitter.com/japerk
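
A minimal sketch of the call described in the reply above, assuming NLTK and NumPy are installed. The toy data and the names TRAIN_TOKS, train_maxent and demo are made up for illustration; the "gis" algorithm and the max_iter cutoff are just one reasonable choice, and the available options may vary between NLTK versions.

```python
# Toy labeled featuresets in the [(dict, label)] shape described in this thread.
TRAIN_TOKS = [
    ({"contains_free": True, "contains_meeting": False}, "spam"),
    ({"contains_free": True, "contains_meeting": False}, "spam"),
    ({"contains_free": False, "contains_meeting": True}, "ham"),
    ({"contains_free": False, "contains_meeting": True}, "ham"),
]


def train_maxent(labeled_featuresets, max_iter=10):
    """Train a maxent classifier on (featureset, label) pairs."""
    # Imported here so the data above can be inspected without NLTK installed.
    from nltk.classify import MaxentClassifier

    return MaxentClassifier.train(
        labeled_featuresets,
        algorithm="gis",    # pure-Python trainer shipped with NLTK
        trace=0,            # silence per-iteration training output
        max_iter=max_iter,  # stop after this many iterations
    )


def demo():
    """Train on the toy data and classify one unlabeled featureset."""
    classifier = train_maxent(TRAIN_TOKS)
    sample = {"contains_free": True, "contains_meeting": False}

    # classify() takes a bare featureset (no label) and returns a label.
    print(classifier.classify(sample))

    # prob_classify() returns a probability distribution over the labels.
    dist = classifier.prob_classify(sample)
    for label in dist.samples():
        print(label, round(dist.prob(label), 3))
```

Calling demo() runs the whole thing. The training list is the same one you would pass to the naive Bayes classifier, so switching between the two should only mean changing the class whose train method you call.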

terrypink

Apr 7, 2010, 4:58:29 PM
to nltk-users

Could someone provide some more guidance on how to implement the
maximum entropy model? I had the same issue with the API. Does the
book offer any code that is similar?

dmtr

Apr 7, 2010, 10:20:31 PM
to nltk-users
Are you looking for a usage example? You can find a working example
here:
http://docs.huihoo.com/nltk/0.9.5/guides/classify.html

-- D

Steven Bird

Apr 9, 2010, 4:12:15 AM
to nltk-...@googlegroups.com

Would anyone be interested in maintaining a wiki page on the NLTK site that points to 3rd party documentation about NLTK?  It could involve more than one person.

There are some great resources out there but they're hard to find reliably.

-Steven Bird


JAGANADH G

Apr 9, 2010, 4:46:09 AM
to nltk-...@googlegroups.com
On Fri, Apr 9, 2010 at 1:42 PM, Steven Bird <steve...@gmail.com> wrote:

> Would anyone be interested in maintaining a wiki page on the NLTK site that
> points to 3rd party documentation about NLTK? It could involve more than
> one person.

+1 I am interested in this task.

> There are some great resources out there but they're hard to find reliably.



--
**********************************
JAGANADH G
http://jaganadhg.freeflux.net/blog

Maciej Pastuszka

Apr 9, 2010, 6:33:26 AM
to nltk-users
On Fri, Apr 9, 2010 at 1:42 PM, Steven Bird <stevenbi...@gmail.com> wrote:
> Would anyone be interested in maintaining a wiki page on the NLTK site that
> points to 3rd party documentation about NLTK? It could involve more than
> one person.

You can count me in as well. I would be happy to offer my help.
