Testing Epic for named entity extraction

Scott Smith

Mar 24, 2015, 2:01:54 PM

to scalanlp...@googlegroups.com

At work, we're currently using OpenNLP for named entity extraction. I want to do some testing with Epic to see how the two compare (since we code in Scala, it would be nice to use a Scala library). Here is the code I'm using:

// Sentence-split, then tokenize, using the bundled English segmenter.
val pipeline = {
  MLSentenceSegmenter.bundled("en").get andThen TreebankTokenizer
}

// Wrap the bundled English NER model as a slab annotator.
val ner = Segmenter.nerSystem(NerSelector.loadNer("en").get.asInstanceOf[SemiCRF[String, String]])

try {
  val preprocessedSlab = pipeline(Slab(txt))
  val nered = ner(preprocessedSlab)
  // Print every entity mention found in each non-empty sentence.
  for ((span, sentence) <- nered.iterator[Sentence] if span.nonEmpty) {
    for ((espan, entity) <- nered.covered[EntityMention](span)) {
      println(entity.entityType + " " + preprocessedSlab.spanned(espan))
    }
  }
} catch {
  case ex: Exception => println(s"Error while processing $txt: $ex")
}

I tested this against this text:

Singer-songwriter David Crosby hit a jogger with his car Sunday evening, a spokesman said. The accident happened in Santa Ynez, California, near where Crosby lives, on January 31, 2015. Crosby was driving at approximately 50 mph when he struck the jogger, according to California Highway Patrol Spokesman Don Clotworthy. The posted speed limit was 55.

and Epic gave me these entities:

PER David Crosby

LOC Santa Ynez

LOC California

PER Crosby

PER Don Clotworthy

In comparison, OpenNLP also recognized these entities:

Sunday

evening

January 31

2015

California Highway Patrol

Epic misses the date information, for example. Is this most likely due to a difference between the models the two libraries use? Or is there something else I should be doing with Epic to match these entities?
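To make the gap concrete, here is a minimal sketch of the comparison as a plain set difference. It assumes both outputs have been collected as sets of strings and that OpenNLP found everything Epic found plus the extra items listed above; the names `CompareEntities`, `epicEntities`, `openNlpEntities`, and `missedBy` are purely illustrative and are not part of either library.

```scala
object CompareEntities {
  // Entity strings copied from the Epic output listed above (types dropped).
  val epicEntities: Set[String] =
    Set("David Crosby", "Santa Ynez", "California", "Crosby", "Don Clotworthy")

  // Assumption: OpenNLP returned Epic's entities plus the additional ones listed.
  val openNlpEntities: Set[String] = epicEntities ++
    Set("Sunday", "evening", "January 31", "2015", "California Highway Patrol")

  // Entities present in `reference` but absent from `candidate`.
  def missedBy(reference: Set[String], candidate: Set[String]): Set[String] =
    reference diff candidate

  def main(args: Array[String]): Unit = {
    // Print what Epic missed relative to OpenNLP, one entity per line.
    missedBy(openNlpEntities, epicEntities).toSeq.sorted.foreach(println)
  }
}
```

Swapping the arguments to `missedBy` shows the other direction, i.e. anything Epic found that OpenNLP did not.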

thanks

scott s

Scott Smith

Mar 25, 2015, 5:28:11 PM

to scalanlp...@googlegroups.com

I can answer my own question: yes, the difference is the model. That said, is there a way to use the OpenNLP NER models in Epic?
