Hi,
Does nltk.tbl.rule.TagRule work correctly with nltk.tag.brill.BrillTagger, or are both obsolete features of the code?
I tried to create my own TagRule as follows:
from nltk.tag.brill import BrillTagger
from nltk.tbl.rule import TagRule
from nltk.tag.perceptron import PerceptronTagger

tagger0 = PerceptronTagger()
rulelist = [MyRule("NNP", "PASKA")]
btag = BrillTagger(tagger0, rulelist)
The "MyRule" class is defined as follows:
class MyRule(TagRule):
    # These come from the init:
    # original_tag = None     # The tag which this TagRule may cause to be replaced.
    # replacement_tag = None  # The tag with which this TagRule may replace another tag.

    def applies(self, tokens, index):
        paska  # undefined name: would raise NameError if applies() were ever called
        print("MyRule: applies: tokens=", tokens)
        # return False
        return True

    def apply(self, tokens, positions=None):
        # print("MyRule: apply: tokens=", tokens, "\n")
        # print("MyRule: apply: positions=", positions, "\n\n")
        # Returns: the indices of tokens whose tags were changed by this rule.
        # So if original_tag="NNP" and there is an "NNP" tag at position, say, 2,
        # then the returned list can include 2.
        modded_indices = []
        for cc, pair in enumerate(tokens):
            if pair[0].lower() == "corporation" and pair[1] == "NNP":
                print("MyRule: apply: modding pair", pair)
                modded_indices.append(cc)
        return modded_indices
Some observations:
1) When running the tagger on words, the "applies" method is never called.
2) The "apply" method is called correctly, but it never modifies (i.e. re-tags) anything.
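For comparison, here is a minimal sketch of the other way to write the rule: override only "applies" and inherit "apply". It assumes the base-class "apply" is what calls "applies" for each candidate position and writes replacement_tag into the token list. If that is right, it would explain both observations, since MyRule replaces "apply" entirely and never writes to tokens. WordRule and its word parameter are hypothetical names, and the fallback TagRule class is only a stand-in that mirrors the assumed behaviour so the sketch also runs without NLTK installed.

```python
try:
    from nltk.tbl.rule import TagRule
except ImportError:
    # Hypothetical stand-in, used only when NLTK is not installed.
    # It mirrors what the real base class is ASSUMED to do.
    class TagRule:
        def __init__(self, original_tag, replacement_tag):
            self.original_tag = original_tag
            self.replacement_tag = replacement_tag

        def apply(self, tokens, positions=None):
            if positions is None:
                positions = range(len(tokens))
            # First collect the matching indices, then rewrite the tags.
            changed = [i for i in positions if self.applies(tokens, i)]
            for i in changed:
                tokens[i] = (tokens[i][0], self.replacement_tag)
            return changed


class WordRule(TagRule):
    """Hypothetical rule: retag one specific word when it carries original_tag."""

    def __init__(self, original_tag, replacement_tag, word):
        super().__init__(original_tag, replacement_tag)
        self.word = word

    def applies(self, tokens, index):
        # Only decide whether the rule matches; do not modify tokens here.
        token, tag = tokens[index]
        return tag == self.original_tag and token.lower() == self.word


# Hand-tagged input, so no tagger model is needed for the sketch.
tokens = [("Acme", "NNP"), ("Corporation", "NNP"), ("grows", "VBZ")]
rule = WordRule("NNP", "PASKA", "corporation")

# Inherited apply(): it is assumed to call applies() once per index.
changed = rule.apply(tokens)
print(changed, tokens)
```

If the assumption holds, passing [WordRule("NNP", "PASKA", "corporation")] to BrillTagger in place of rulelist should re-tag the matching tokens, since the tagger was already seen calling "apply".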
Help appreciated! I couldn't find any examples of using the TagRule class anywhere (so maybe I'm wasting my time here...).
Regards,
Sampsa