Entity Recognition Using ClearTK

Utkarsh Srivastava

Sep 3, 2015, 8:02:54 AM

to cleartk-users

Hi All,

I am fairly new to ClearTK. I am trying to implement a solution for entity recognition of a custom entity type (let's say the entity is University).

I have hand-labelled data in which every token or phrase that is a university is labelled as University, for example: "I graduated from <UNIVERSITY>SKU</UNIVERSITY> Singapore". I have also written a UIMA annotator which runs a pipeline and identifies all UNIVERSITY labels, creates a UniversityAnnotation for each and adds it to the index. All is well so far.

Next, I want ClearTK to use these annotations, together with known features such as context and word shape, to learn a model and identify the annotation in test data. I looked at the ClearTK examples, e.g. NamedEntityChunker, which identifies well-known classes like LOCATION and PERSON. I could not find an example for identifying custom annotations from hand-labelled data.
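To make "features like context, shape" concrete, here is a minimal sketch in plain Java. It makes no ClearTK or UIMA calls. The class name TokenFeatureSketch, the feature-name strings, and the "<s>"/"</s>" padding markers are all hypothetical choices for illustration, not anything taken from the ClearTK examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of ClearTK: plain-Java token features.
final class TokenFeatureSketch {

    // Word shape: uppercase -> 'A', lowercase -> 'a', digit -> '9',
    // any other character is kept as-is. Runs of the same class are collapsed,
    // so "SKU" becomes "A" and "Singapore" becomes "Aa".
    static String shape(String token) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            char cls = Character.isUpperCase(c) ? 'A'
                     : Character.isLowerCase(c) ? 'a'
                     : Character.isDigit(c) ? '9' : c;
            if (sb.length() == 0 || sb.charAt(sb.length() - 1) != cls) {
                sb.append(cls);
            }
        }
        return sb.toString();
    }

    // Features for the token at index i: its lowercased form, its shape,
    // and the lowercased neighbours within `window` positions on each side.
    // Positions outside the sentence are padded with "<s>" / "</s>".
    static List<String> features(List<String> tokens, int i, int window) {
        List<String> feats = new ArrayList<>();
        feats.add("word=" + tokens.get(i).toLowerCase(Locale.ROOT));
        feats.add("shape=" + shape(tokens.get(i)));
        for (int d = 1; d <= window; d++) {
            String prev = i - d >= 0 ? tokens.get(i - d).toLowerCase(Locale.ROOT) : "<s>";
            String next = i + d < tokens.size() ? tokens.get(i + d).toLowerCase(Locale.ROOT) : "</s>";
            feats.add("prev" + d + "=" + prev);
            feats.add("next" + d + "=" + next);
        }
        return feats;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("I", "graduated", "from", "SKU", "Singapore");
        // prints [word=sku, shape=A, prev1=from, next1=singapore]
        System.out.println(features(tokens, 3, 1));
    }
}
```

In a real pipeline these strings would be produced by whatever feature extractors the framework provides; the sketch only shows what kind of information each token contributes.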

What I am not sure about is whether my annotation (UNIVERSITY) should be another type in the NamedEntity mentiontype, or whether the UIMA annotation can be used independently, so that each token is identified as UNIVERSITY or NOT A UNIVERSITY. I am looking for directions on the best practices/approach for solving this kind of problem.
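To illustrate the "UNIVERSITY or NOT A UNIVERSITY per token" framing, here is a minimal sketch in plain Java that turns inline-tagged text into per-token labels using the common B/I/O encoding for chunking (B = first token of an entity, I = continuation, O = outside). It makes no ClearTK or UIMA calls. The class name BioLabelSketch and the "token/LABEL" output format are hypothetical, and the sketch assumes the closing tag is written </UNIVERSITY> and that tokens are whitespace-separated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ClearTK: inline tags -> B/I/O token labels.
final class BioLabelSketch {

    // Either a whole tagged span (group 1) or one untagged token (group 2).
    // Assumption: spans look like <UNIVERSITY>...</UNIVERSITY> and do not nest.
    private static final Pattern PIECE =
            Pattern.compile("<UNIVERSITY>(.*?)</UNIVERSITY>|([^\\s<]+)");

    // Returns one "token/LABEL" string per whitespace-separated token.
    static List<String> label(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = PIECE.matcher(text);
        while (m.find()) {
            if (m.group(1) != null) {
                // Tagged span: first word gets B-, the remaining words get I-.
                String[] words = m.group(1).trim().split("\\s+");
                for (int i = 0; i < words.length; i++) {
                    if (words[i].isEmpty()) {
                        continue; // empty span such as <UNIVERSITY></UNIVERSITY>
                    }
                    out.add(words[i] + (i == 0 ? "/B-UNIVERSITY" : "/I-UNIVERSITY"));
                }
            } else {
                // Untagged token: outside any entity.
                out.add(m.group(2) + "/O");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [I/O, graduated/O, from/O, SKU/B-UNIVERSITY, Singapore/O]
        System.out.println(label("I graduated from <UNIVERSITY>SKU</UNIVERSITY> Singapore"));
    }
}
```

With this encoding a single custom type needs only three labels, and multi-token names stay together as one entity rather than a run of independent yes/no decisions.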