Regards,
Raymond
-------------------------------------------------------------------------------------------------------------------------
Converting Input-Features to Joint-Features
In maximum entropy models, joint-features are required to have numeric
values. Typically, each input-feature input_feat is mapped to a set of
joint-features of the form:
    joint_feat(token, label) = { 1 if input_feat(token) == feat_val
                               {      and label == some_label
                               {
                               { 0 otherwise
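
To make the definition above concrete, here is a minimal sketch in plain Python. It is not NLTK code: `make_joint_feat`, the dict-based token, and the `last_letter` feature are all hypothetical names chosen for illustration.

```python
def make_joint_feat(input_feat_name, feat_val, some_label):
    """Build one binary joint-feature for an (input-feature value, label) pair."""
    def joint_feat(token, label):
        # Assumption: a token is a dict mapping input-feature names to values.
        if token.get(input_feat_name) == feat_val and label == some_label:
            return 1
        return 0
    return joint_feat

# Hypothetical input-feature: the last letter of a name.
ends_in_a_female = make_joint_feat("last_letter", "a", "female")

token = {"last_letter": "a"}
print(ends_in_a_female(token, "female"))  # 1
print(ends_in_a_female(token, "male"))    # 0
```

Each joint-feature fires only when both the input-feature value and the label match, which is why one input-feature typically expands into many joint-features.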
----------------------------------------------------------------------------------------------------------------------------
A feature encoding that generates vectors containing binary
joint-features of the form:
    joint_feat(fs, l) = { 1 if (fs[fname] == fval) and (l == label)
                        {
                        { 0 otherwise
Where fname is the name of an input-feature, fval is a value for that
input-feature, and label is a label.
Typically, these features are constructed based on a training corpus,
using the train() method. This method will create one feature for each
combination of fname, fval, and label that occurs at least once in the
training corpus.
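
A rough sketch of what that training step amounts to, in plain Python. The helper names `train_mapping` and `encode` and the toy corpus are assumptions for illustration; this follows the description above and is not the library's actual implementation.

```python
def train_mapping(train_toks):
    """Assign an index to every (fname, fval, label) seen in the corpus."""
    mapping = {}
    for featureset, label in train_toks:
        for fname, fval in featureset.items():
            key = (fname, fval, label)
            if key not in mapping:
                mapping[key] = len(mapping)
    return mapping

def encode(mapping, featureset, label):
    """Return the sparse joint-feature vector as (index, value) pairs."""
    vector = []
    for fname, fval in featureset.items():
        key = (fname, fval, label)
        if key in mapping:
            vector.append((mapping[key], 1))
    return vector

# Toy training corpus of (featureset, label) pairs; the values are made up.
train_toks = [
    ({"last_letter": "a", "first_letter": "s"}, "female"),
    ({"last_letter": "k", "first_letter": "j"}, "male"),
]
mapping = train_mapping(train_toks)
print(len(mapping))  # 4
print(encode(mapping, {"last_letter": "a", "first_letter": "j"}, "female"))  # [(0, 1)]
```

Note that `first_letter == "j"` contributes nothing for the label "female" here, because that combination never occurred in the toy corpus.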
The unseen_features parameter can be used to add unseen-value
features, which are used whenever an input feature has a value that
was not encountered in the training corpus. These features have the
form:
    joint_feat(fs, l) = { 1 if is_unseen(fname, fs[fname])
                        {      and l == label
                        {
                        { 0 otherwise
Where is_unseen(fname, fval) is true if the encoding does not contain
any joint features that are true when fs[fname]==fval.
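
Here is one way the unseen-value features could look, sketched in plain Python from the formula above. The names `train_with_unseen` and `encode` are hypothetical, and the sketch creates one unseen feature per (fname, label) pair as the formula suggests; the library's internals may be organized differently.

```python
def train_with_unseen(train_toks):
    """Build the normal mapping plus one unseen-value feature per (fname, label)."""
    mapping = {}   # (fname, fval, label) -> index
    seen = set()   # (fname, fval) pairs observed with any label
    fnames, labels = set(), set()
    for featureset, label in train_toks:
        labels.add(label)
        for fname, fval in featureset.items():
            fnames.add(fname)
            seen.add((fname, fval))
            mapping.setdefault((fname, fval, label), len(mapping))
    unseen = {}    # (fname, label) -> index, placed after the normal features
    for fname in sorted(fnames):
        for label in sorted(labels):
            unseen[(fname, label)] = len(mapping) + len(unseen)
    return mapping, seen, unseen

def encode(mapping, seen, unseen, featureset, label):
    """Return (index, value) pairs, falling back to unseen-value features."""
    vector = []
    for fname, fval in featureset.items():
        if (fname, fval, label) in mapping:
            vector.append((mapping[(fname, fval, label)], 1))
        elif (fname, fval) not in seen and (fname, label) in unseen:
            # is_unseen(fname, fval): no trained feature fires for this value.
            vector.append((unseen[(fname, label)], 1))
    return vector

train_toks = [({"last_letter": "a"}, "female"), ({"last_letter": "k"}, "male")]
mapping, seen, unseen = train_with_unseen(train_toks)
print(encode(mapping, seen, unseen, {"last_letter": "z"}, "male"))  # [(3, 1)]
print(encode(mapping, seen, unseen, {"last_letter": "a"}, "male"))  # []
```

The second call returns an empty vector because "a" was seen in training, just not with the label "male", so it does not count as unseen.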
The alwayson_features parameter can be used to add always-on features,
which have the form:
    joint_feat(fs, l) = { 1 if (l == label)
                        {
                        { 0 otherwise
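
A small plain-Python sketch of the always-on features, again only an illustration of the formula above. The names `alwayson_features` and `encode_alwayson` and the `offset` argument are assumptions, not part of the library.

```python
def alwayson_features(labels, offset=0):
    """Assign one always-on feature index per label, starting at offset."""
    return {label: offset + i for i, label in enumerate(sorted(labels))}

def encode_alwayson(alwayson, label):
    """Return the always-on entry for a label as (index, value) pairs."""
    # The feature depends on the label alone; the featureset is never consulted.
    if label in alwayson:
        return [(alwayson[label], 1)]
    return []

# Assume 4 ordinary joint-features already occupy indices 0-3.
alwayson = alwayson_features({"female", "male"}, offset=4)
print(encode_alwayson(alwayson, "female"))  # [(4, 1)]
print(encode_alwayson(alwayson, "male"))    # [(5, 1)]
```

Because such a feature fires for every token with the given label, its weight can act roughly like a per-label bias term.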
----------------------------------------------------------------------------------------------------------------------------
Jacob
---
http://streamhacker.com
http://twitter.com/japerk
-- D
Would anyone be interested in maintaining a wiki page on the NLTK site that points to 3rd party documentation about NLTK? It could involve more than one person.
There are some great resources out there, but they're hard to find reliably.
-Steven Bird
On 8 Apr 2010 12:20, "dmtr" <dchi...@gmail.com> wrote:
Are you looking for a usage example? You can find a working example
here:
http://docs.huihoo.com/nltk/0.9.5/guides/classify.html
-- D
On Apr 7, 1:58 pm, terrypink <hsinte...@hotmail.com> wrote:
> Could someone provide some more guid...
You can count me in as well. I would be happy to offer my help.