Support for OpenNLP NameFinder


Majid Laali

Nov 2, 2015, 2:17:03 PM
to cleartk-developers
Hi, 

I am going to extend the cleartk-opennlp-tools project by adding a wrapper for the OpenNLP NameFinder. I would be grateful if someone could explain how a UIMA type should be added to the cleartk projects.

More specifically, NameFinder annotates text with spans. Each span has a start, an end, and a type (e.g. "person", "date", "location"). I believe we cannot use the existing cleartk types (e.g. Chunk or NamedEntity) for this purpose. Therefore, my suggestion is to create a new type; let's call it Span. My question is:
should the Span type be added to the cleartk-type-system project or to the cleartk-opennlp-tools project?

Thanks, 
Majid




Majid Laali, PhD Student, Concordia University
1515 St. Catherine St. West, EV9-401, Montreal QC, Canada
 

Lee Becker

Nov 2, 2015, 3:15:22 PM
to cleartk-d...@googlegroups.com


You should be able to use org.cleartk.ne.type.NamedEntity?  While it has additional fields relevant for the ACE data, your new NameFinder annotator need not populate all of them.  You can simply take its output and set the NamedEntity's start, end, and entityType fields before stuffing it into the CAS.

Majid Laali

Nov 2, 2015, 4:23:25 PM
to cleartk-developers
Thanks, Lee. I was reluctant to use org.cleartk.ne.type.NamedEntity because, as you said, most of its fields will be null. However, I will take your advice and reuse it for NameFinder.

Thanks, 
Majid






 

Majid Laali

Nov 2, 2015, 5:36:51 PM
to cleartk-developers
I realized I may not have understood your point well. org.cleartk.ne.type.NamedEntity extends org.cleartk.score.type.ScoredTOP, and therefore, to annotate a named entity, I have to use both org.cleartk.ne.type.NamedEntityMention and org.cleartk.ne.type.NamedEntity at the same time. Is this correct?

Thanks, 
Majid






 


Steven Bethard

Nov 6, 2015, 4:55:37 PM
to cleartk-developers
You want NamedEntityMention and not NamedEntity. NamedEntity is for linking a bunch of NamedEntityMentions together.

Steve
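
For readers following the thread, the mapping discussed here can be sketched roughly as below. OpenNLP's NameFinderME.find(String[]) returns spans indexed by token position (end exclusive), while a UIMA annotation such as NamedEntityMention needs character offsets, so each span has to be translated through the tokens it covers. This is only a self-contained illustration: Token, Span and Mention are hypothetical stand-ins written for this sketch, not the real ClearTK or OpenNLP classes, and the feature that should hold the type label on the real NamedEntityMention should be checked against the ClearTK type system descriptor.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch. The nested classes are hypothetical stand-ins,
// not the real OpenNLP or ClearTK types.
class NameFinderSpanSketch {

  // Stand-in for a token annotation: character offsets into the document text.
  static final class Token {
    final int begin;
    final int end;
    Token(int begin, int end) { this.begin = begin; this.end = end; }
  }

  // Stand-in for opennlp.tools.util.Span: token indexes, end exclusive, plus a type.
  static final class Span {
    final int start;
    final int end;
    final String type;
    Span(int start, int end, String type) {
      this.start = start;
      this.end = end;
      this.type = type;
    }
  }

  // Stand-in for a mention annotation: character offsets plus a type label.
  static final class Mention {
    final int begin;
    final int end;
    final String type;
    Mention(int begin, int end, String type) {
      this.begin = begin;
      this.end = end;
      this.type = type;
    }
  }

  // Translate token-indexed spans into character-offset mentions.
  static List<Mention> toMentions(List<Token> tokens, List<Span> spans) {
    List<Mention> mentions = new ArrayList<>();
    for (Span span : spans) {
      int begin = tokens.get(span.start).begin;   // first covered token
      int end = tokens.get(span.end - 1).end;     // last covered token (end is exclusive)
      mentions.add(new Mention(begin, end, span.type));
    }
    return mentions;
  }
}
```

In a real annotator, the Mention stand-in would presumably be replaced by creating a NamedEntityMention in the JCas with the computed begin and end offsets and adding it to the indexes; fields that NameFinder has no information for would simply stay unset.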


Majid Laali

Nov 7, 2015, 11:24:57 AM
to cleartk-developers
Thank you Steve, 

I added support for Apache OpenNLP named entity recognition. Please consider my pull request.

Thanks, 
Majid




 


Steven Bethard

Nov 9, 2015, 2:40:07 PM
to cleartk-developers
Thanks so much for putting together a pull request. I'll try to take a look at it as soon as I can, but it may be a few weeks.
