Training sentence: [('kurian', 'O'), ...('the', 'O'), ('proceedings', 'B-LEGAL'), ('by', 'O'), ('notification', 'B-LEGAL'), ('dated', 'O'), ... ('the', 'O'), ('land', 'B-LEGAL'), ('acquisition', 'I-LEGAL'), ('act', 'B-LEGAL'), ... ('of', 'O'), ('acquisition', 'B-LEGAL'), ('is', 'O'), ('residential', 'O'), ('and', 'O'), ('commercial', 'B-LEGAL'), ('for', 'O'),...('period', 'B-LEGAL'), ('of', 'O'), ('three', 'O'), ('months', 'O'), ('from', 'O'), ('today', 'O')]
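from nltk.tag import hmm

# train_data is a list of sentences; each sentence is a list of (token, BIO tag) pairs
print("Training sentence: {}".format(train_data[0]))

# Fit a supervised HMM tagger on the labelled sentences
trainer = hmm.HiddenMarkovModelTrainer()
tagger = trainer.train_supervised(train_data)
print(tagger)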
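# Tag the first 20 test sentences and collect every token labelled as LEGAL
for tst in test_x[:20]:
    test_sentence = " ".join(tst)
    print("Test sentence: {}".format(test_sentence))
    result = tagger.tag(test_sentence.split())
    print("Tagged sentence: {}".format(result))
    # Matches both B-LEGAL and I-LEGAL, so the result is a flat list of tokens
    catchphrases = [w for w, t in result if "LEGAL" in t]
    print("Catchphrases: {}".format(catchphrases))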
Test sentence: 1 after hea...ame are dismissed
Tagged sentence: [('1', 'O'), ..., ('customs', 'B-LEGAL'), ('excise', 'I-LEGAL'), ('service', 'I-LEGAL'), ('tax', 'I-LEGAL'), ('appellate', 'O'),...('dismissed', 'O')]
Catchphrases: ['customs', 'excise', 'service', 'tax']
Test sentence: 1 this app.....accordingly
Tagged sentence: [('1', 'O'), ...('ordered', 'O'), ('accordingly', 'O')]
Catchphrases: []
Test sentence: 1 an issue ... costs
Tagged sentence: [('1', 'O'), ('an', 'O'), ('issue', 'O'), ... ('to', 'O'), ('costs', 'O')]
Catchphrases: []
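The list comprehension above returns individual tokens, so a multi-word catchphrase comes back as separate words. A minimal sketch of merging contiguous B-/I- tags into whole phrases follows. The helper name `group_catchphrases` and the `label` parameter are hypothetical and not part of the original notebook, and `sample` is an abbreviated copy of the first tagged test sentence with the elided tokens left out.

```python
def group_catchphrases(tagged, label="LEGAL"):
    """Merge contiguous B-/I- tokens into multi-word phrases (hypothetical helper)."""
    phrases, current = [], []
    for word, tag in tagged:
        if tag == "B-" + label:
            # A B- tag always starts a new phrase; close any open one first
            if current:
                phrases.append(" ".join(current))
            current = [word]
        elif tag == "I-" + label and current:
            # An I- tag extends the phrase that is currently open
            current.append(word)
        else:
            # An O tag (or an I- tag with no open phrase) ends the current phrase
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Abbreviated from the first tagged test sentence shown above
sample = [
    ("1", "O"),
    ("customs", "B-LEGAL"),
    ("excise", "I-LEGAL"),
    ("service", "I-LEGAL"),
    ("tax", "I-LEGAL"),
    ("appellate", "O"),
    ("dismissed", "O"),
]
print(group_catchphrases(sample))  # ['customs excise service tax']
```

In the notebook loop this could be called as `group_catchphrases(result)` in place of the list comprehension. An I- tag that does not follow a B- tag is dropped here; that is a design assumption, and a more lenient variant could start a new phrase instead.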