Re: Clinical TempEval - Questions about the evaluation

Bethard, Steven John - (bethard)

Jan 25, 2017, 11:43:38 PM

to Clinical TempEval

> * Several possible TIMEX3s are sometimes annotated as SECTIONTIME or DOCTIME entities. If our system annotates these as TIMEX3 entities, does it impact the evaluation score? In other words, does the evaluation script take care of this situation?

The scoring script ignores SECTIONTIME and DOCTIME, and scores only the TIMEX3s. The overlap between SECTIONTIME and TIMEX3 annotations is a result of the annotation guidelines changing over time, but I’ve checked the test data, and there don’t appear to be any SECTIONTIMEs overlapping with TIMEX3s there (which I believe is correct, given the final annotation guidelines). So you should tune your systems accordingly.

> * As mentioned in the Annotation Guidelines, several parts related to medication are not annotated. Does the evaluation script ignore annotations that the system might produce for these parts? Should we make sure that these parts stay empty?

Great question! The evaluation script does not know about the different section types, so yes, your systems should not produce annotations on those parts. If you want to see an example of doing that, there’s one in Apache cTAKES:

https://github.com/apache/ctakes/blob/trunk/ctakes-temporal/src/main/java/org/apache/ctakes/temporal/ae/TemporalEntityAnnotator_ImplBase.java

If you follow that code through, the specific segments to skip are "20104", "20105", "20116", and "20138".
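For anyone not using cTAKES, here is a minimal Python sketch of the same idea: drop any predicted annotation that falls inside one of those four sections. It is not taken from cTAKES. The `[start section id="..."]` / `[end section id="..."]` marker format, the `(start, end, type)` tuple layout, and both function names are assumptions for illustration, so check them against your copy of the notes.

```python
import re

# Section IDs to skip, as listed above.
SKIPPED_SECTION_IDS = {"20104", "20105", "20116", "20138"}

# Assumed marker format; adjust if your copy of the corpus differs.
SECTION_START = re.compile(r'\[start section id="(\d+)"\]')


def skipped_spans(text):
    """Return (start, end) character offsets of the sections to skip."""
    spans = []
    for match in SECTION_START.finditer(text):
        section_id = match.group(1)
        if section_id not in SKIPPED_SECTION_IDS:
            continue
        end_marker = '[end section id="%s"]' % section_id
        end = text.find(end_marker, match.end())
        # If the end marker is missing, skip to the end of the document.
        end = len(text) if end == -1 else end + len(end_marker)
        spans.append((match.start(), end))
    return spans


def drop_annotations_in_skipped_sections(text, annotations):
    """Keep only annotations that do not overlap a skipped section.

    `annotations` is a list of hypothetical (start, end, type) tuples.
    """
    spans = skipped_spans(text)
    return [
        (start, end, kind)
        for start, end, kind in annotations
        if not any(start < s_end and s_start < end for s_start, s_end in spans)
    ]
```

Running this as a post-processing step over the system output keeps the filtering separate from the TIMEX3 and EVENT models themselves.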

> * Some parts of the corpus are annotated with DUPLICATE entities. How does the evaluation script deal with these parts if they exist in the test corpus?

The only entities that are evaluated by the evaluation script are TIMEX3, EVENT, and TLINK:

https://github.com/bethard/clinical-tempeval/blob/master/program/evaluate.py

Everything else (including DUPLICATE entities) is ignored.

Steve

julien....@gmail.com

Jan 26, 2017, 3:45:53 AM

to clinical-tempeval

Hi Steven,

I have one follow-up question.

If our system detects a TIMEX3 entity but the same span is annotated as a SECTIONTIME entity in the Gold Standard, does it count as a False Positive and lower the precision score for TIMEX3?

Similarly, if our system detects EVENT and TIMEX3 entities in parts that should have been encoded as DUPLICATE entities, does it lower the precision scores for EVENT and TIMEX3?

Julien

Bethard, Steven John - (bethard)

Jan 26, 2017, 12:46:36 PM

to julien....@gmail.com, clinical-tempeval

SECTIONTIME and DUPLICATE will be ignored entirely by the scoring script, so you will not be marked wrong for missing a SECTIONTIME or a DUPLICATE in the gold data. However, if you produce a TIMEX3 and there is no TIMEX3 there (or an EVENT and there is no EVENT there), you will be marked wrong (decreasing your precision), even if there is a SECTIONTIME, DUPLICATE, or any other annotation in the same place.
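A rough Python sketch of that behaviour, for illustration only. This is not the code in evaluate.py: the function name, the `(start, end, type)` tuple layout, and the use of exact span matching are all assumptions.

```python
def span_precision_recall(gold, predicted, annotation_type):
    """Score one annotation type, ignoring every other type entirely.

    `gold` and `predicted` are lists of hypothetical (start, end, type)
    tuples; spans are compared by exact match (an assumption).
    """
    gold_spans = {(s, e) for s, e, t in gold if t == annotation_type}
    pred_spans = {(s, e) for s, e, t in predicted if t == annotation_type}
    correct = len(gold_spans & pred_spans)
    # Report 0.0 when there is nothing to score against.
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    return precision, recall


# Gold has a SECTIONTIME at 0-5 and a TIMEX3 at 10-15.
gold = [(0, 5, "SECTIONTIME"), (10, 15, "TIMEX3")]
# The system predicts a TIMEX3 at both locations.
predicted = [(0, 5, "TIMEX3"), (10, 15, "TIMEX3")]

precision, recall = span_precision_recall(gold, predicted, "TIMEX3")
print(precision, recall)  # prints "0.5 1.0"
```

The TIMEX3 predicted over the SECTIONTIME span counts as a false positive, so precision drops to 0.5, while recall stays at 1.0 because the SECTIONTIME itself is never something the system is expected to find.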

Steve
