How to look up the training data affected by a specific rule produced by NLTK's BrillTaggerTrainer

Min Jun Park

Apr 2, 2015, 3:13:56 AM
to nltk-...@googlegroups.com

I wonder whether there is a way to look up the training data that was modified by a specific rule the Brill tagger has produced.

Taking the example from the BrillTaggerTrainer documentation:

>>> training_data = treebank.tagged_sents()[:100]

>>> tagger1 = tt.train(training_data, max_rules=10)
    TBL train (fast) (seqs: 100; tokens: 2417; tpls: 2; min score: 2; min acc: None)
    Finding initial useful rules...
        Found 845 useful rules.
    <BLANKLINE>
               B      |
       S   F   r   O  |        Score = Fixed - Broken
       c   i   o   t  |  R     Fixed = num tags changed incorrect -> correct
       o   x   k   h  |  u     Broken = num tags changed correct -> incorrect
       r   e   e   e  |  l     Other = num tags changed incorrect -> incorrect
       e   d   n   r  |  e
    ------------------+-------------------------------------------------------
     132 132   0   0  | AT->DT if Pos:NN@[-1]
      85  85   0   0  | NN->, if Pos:NN@[-1] & Word:,@[0]
      69  69   0   0  | NN->. if Pos:NN@[-1] & Word:.@[0]
      51  51   0   0  | NN->IN if Pos:NN@[-1] & Word:of@[0]
      47  63  16 161  | NN->IN if Pos:NNS@[-1]
      33  33   0   0  | NN->TO if Pos:NN@[-1] & Word:to@[0]
      26  26   0   0  | IN->. if Pos:NNS@[-1] & Word:.@[0]
      24  24   0   0  | IN->, if Pos:NNS@[-1] & Word:,@[0]
      22  27   5  24  | NN->-NONE- if Pos:VBD@[-1]
      17  17   0   0  | NN->CC if Pos:NN@[-1] & Word:and@[0]

Suppose I'm interested in one specific rule, say the first line of the output above, AT->DT if Pos:NN@[-1].
Is it possible to look up the training data it relates to, i.e. the sentences in training_data to which the rule AT->DT if Pos:NN@[-1] was applied?
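
To make concrete what I am after, here is a minimal sketch of the kind of lookup I mean. It assumes that the learned rules are available from tagger1.rules() and that each rule has an apply(tokens) method that retags a sentence in place and returns the changed positions, as nltk.tbl.rule.TagRule does. The names trace_rule_applications and PrevTagRule are my own, hypothetical; PrevTagRule is only a stand-in so the sketch runs without NLTK or the treebank corpus.

```python
def trace_rule_applications(rules, initial_tagged_sents):
    """Replay rules in order and record every token each rule retags.

    Returns {rule_index: [(sent_index, token_index, old_tag, new_tag), ...]}.
    Assumes each rule offers apply(tokens), which changes tokens in place
    and returns the list of changed positions.
    """
    # Work on copies so the caller's sentences are left untouched.
    sents = [list(sent) for sent in initial_tagged_sents]
    hits = {i: [] for i in range(len(rules))}
    for rule_index, rule in enumerate(rules):
        for sent_index, sent in enumerate(sents):
            before = [tag for _, tag in sent]
            for pos in rule.apply(sent):
                hits[rule_index].append(
                    (sent_index, pos, before[pos], sent[pos][1])
                )
    return hits


class PrevTagRule:
    """Hypothetical stand-in for a learned rule such as AT->DT if Pos:NN@[-1]."""

    def __init__(self, original_tag, replacement_tag, prev_tag):
        self.original_tag = original_tag
        self.replacement_tag = replacement_tag
        self.prev_tag = prev_tag

    def apply(self, tokens):
        # Find all matching positions first, then retag, so the rule
        # does not interact with its own changes.
        change = [
            i for i in range(1, len(tokens))
            if tokens[i][1] == self.original_tag
            and tokens[i - 1][1] == self.prev_tag
        ]
        for i in change:
            tokens[i] = (tokens[i][0], self.replacement_tag)
        return change


# Toy sentences as they might look after the initial (baseline) tagger.
initial = [
    [("share", "NN"), ("the", "AT"), ("news", "NN")],
    [("the", "AT"), ("dog", "NN")],
]
rules = [PrevTagRule("AT", "DT", "NN")]
hits = trace_rule_applications(rules, initial)
print(hits)

# With NLTK itself this would presumably become (assuming `baseline` is the
# initial tagger that was passed to BrillTaggerTrainer):
#   initial = [baseline.tag([w for w, _ in sent]) for sent in training_data]
#   hits = trace_rule_applications(tagger1.rules(), initial)
```

The replay records every change a rule makes, so it does not by itself distinguish the Fixed, Broken and Other columns of the trace; comparing new_tag with the gold tag in training_data at the same position would be needed for that.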

I have also posted this question on stackoverflow.com.