Hi,
Thanks for organizing this competition. I have a couple of questions and suggestions.
1) Segmentation
I found that in many cases the segmentation is ambiguous. For example:
Why is the segmentation "Swiss bank UBS" rather than "UBS"? In the same tweet, the annotator also annotated "UBS" with the same entity, so I feel that "UBS" is also a correct answer.
There are many other examples. Some of them are:
92693524981616640 "the Queen" or "Queen"
93734032189304832 "The Gang of Six" or "Gang of Six"
Given that the segmentation is not really clear, maybe we should not evaluate on the mentions? We could still take the ordering of the entities into account, and that alone would already give a very good picture of the results.
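To make the suggestion concrete, here is a rough sketch of what a mention-free, order-aware score could look like. It is only one possible reading of the idea, not the official metric: it assumes each tweet is reduced to the ordered list of its entity URIs and counts matches with a longest common subsequence. The function names and the example URIs are made up for illustration.

```python
from typing import List, Sequence, Tuple


def lcs_length(gold: Sequence[str], pred: Sequence[str]) -> int:
    """Length of the longest common subsequence of two entity lists."""
    prev: List[int] = [0] * (len(pred) + 1)
    for g in gold:
        curr = [0]
        for j, p in enumerate(pred, start=1):
            if g == p:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def ordered_entity_f1(gold: Sequence[str], pred: Sequence[str]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over entities only, respecting their order.

    Mention boundaries are ignored, so "Swiss bank UBS" and "UBS" score the
    same as long as they link to the same entity.
    """
    matched = lcs_length(gold, pred)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative (hypothetical) entity identifiers for one tweet.
gold_entities = ["dbpedia:Entity_A", "dbpedia:Entity_B"]
pred_entities = ["dbpedia:Entity_B", "dbpedia:Entity_A"]  # right entities, wrong order
scores = ordered_entity_f1(gold_entities, pred_entities)  # (0.5, 0.5, 0.5)
```

With this kind of score a system is penalized for wrong or misordered entities, but not for choosing "the Queen" instead of "Queen" as the mention.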
2) Mistakes/redirects on the DBpedia pages
I found some mistakes and redirects in the annotations, which could create problems in the evaluation.
For example,
There are other cases that I did not list.
These annotations will cause problems both for the evaluation and for building systems.
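One way the redirect part could be handled is to map every URI to a canonical form before comparing gold and predicted entities. The sketch below assumes a redirect table is available as a plain dictionary; the table contents and the function name are hypothetical and only show the idea.

```python
from typing import Dict


def resolve_redirect(uri: str, redirects: Dict[str, str]) -> str:
    """Follow redirect links until a page that does not redirect is reached.

    The `seen` set guards against redirect cycles in the table.
    """
    seen = set()
    while uri in redirects and uri not in seen:
        seen.add(uri)
        uri = redirects[uri]
    return uri


# Hypothetical redirect table: old or alternative titles -> canonical title.
redirect_table = {
    "dbpedia:Example_Old_Title": "dbpedia:Example_Alias",
    "dbpedia:Example_Alias": "dbpedia:Example_Title",
}

gold_uri = "dbpedia:Example_Old_Title"
pred_uri = "dbpedia:Example_Title"

# After normalization both sides point at the same page.
same_entity = resolve_redirect(gold_uri, redirect_table) == resolve_redirect(pred_uri, redirect_table)
```

This would not fix annotations that are simply wrong, but it would stop a system from being penalized for answering with the page a redirect actually leads to.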
Thanks a lot!
Ming-Wei