Dear chairs,
We are still working with the test set (version 1.6), and we have found several patterns that we would kindly ask you to clarify.
1) The rationale for annotating dates is not clear. We have collected a number of instances that show different behaviors:
92866765834567680 "In 2010, Coca-Cola was voted the most discriminatory employer ..." no annotation for spot '2010'
99852692247166976 "Auberge Resorts Announces It Is Again Honored On 2011 Travel + Leisure World ..." spot '2011' has been annotated
101107456050073601 "On September 19, 2011 the US citizens are gonna ..." 'September 19' and '2011' got annotated
91732691678007296 ""Did you know that World Ocean Day is on the 8th of June? Let's ..." spot '8th of June' is not annotated.
There are several more instances of these cases, so it would help if you could please clarify what the guidelines for annotating dates are.
2) In 101022789389131776 "Met today w/ reps of 8 million Egyptians w/ disabilities. " the spot 'Egyptians' is not annotated, while in 96116890295992320 and 103170022179999745 the spot 'Lybian' is annotated with
http://dbpedia.org/resource/Libyan_people
What's the correct behavior?
3) Finally, based on past threads, my understanding was that all occurrences of a spot should be annotated. Could you please confirm this?
However, we have found several tweets where this doesn't hold:
101389292311556096 "Slaying of 3 Muslims lays bare divisions: With police nowhere to be seen, the Muslims of ..." only one occurrence of spot 'muslim' has been annotated
92365972840775680 "Pedigree and genetics conference scheduled for Sept. 7-8 - Paulick Report: Pedigree and genetics conference sche...
http://bit.ly/oOBsVN" only one occurrence of spot 'pedigree' has been annotated
92836172925120512 "I want to hear Justin's point of view on this non stop drama. Not Scooter's no offense but all this about Justin not him." only one occurrence of spot 'justin' has been annotated
I haven't systematically checked the dataset, but I think it could contain more than these three instances.
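In case it is useful, a systematic check for point 3 could be sketched roughly as follows. This is only a minimal sketch: the function name `unannotated_repeats` is hypothetical, and it assumes each tweet is available as its text plus a list of (start, end) character offsets for the annotated spots, which is not necessarily how the test set is actually distributed. It reports every case-insensitive occurrence of an already annotated spot that does not overlap any annotation.

```python
import re


def unannotated_repeats(text, annotations):
    """List occurrences of annotated spots that carry no annotation.

    `annotations` is a list of (start, end) character offsets into `text`.
    This layout is an assumption, not the real format of the test set.
    Returns a list of (spot, start, end) tuples.
    """
    # Distinct surface forms of the annotated spots, lowercased.
    spots = {text[s:e].lower() for s, e in annotations if e > s}
    missed = []
    for spot in sorted(spots):
        # Match the spot only where it is not preceded by a word character,
        # so 'muslim' also matches inside 'Muslims' but not mid-word.
        pattern = re.compile(r"(?<!\w)" + re.escape(spot), re.IGNORECASE)
        for m in pattern.finditer(text):
            # An occurrence counts as annotated if it overlaps any annotation.
            covered = any(s < m.end() and m.start() < e for s, e in annotations)
            if not covered:
                missed.append((spot, m.start(), m.end()))
    return missed


# Usage with the tweet 92836172925120512 quoted above, assuming that only
# the first 'Justin' was annotated.
tweet = ("I want to hear Justin's point of view on this non stop drama. "
         "Not Scooter's no offense but all this about Justin not him.")
first = tweet.index("Justin")
for spot, start, end in unannotated_repeats(tweet, [(first, first + 6)]):
    print(spot, start, end, repr(tweet[start:end]))
```

Matching on the surface form alone will of course also flag occurrences that were deliberately left out, so the output would only be a list of candidates to review against the guidelines.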
Regards,
-- Ugo Scaiella