Dear organizers,
We would like to question a number of annotations provided in the ground truth that we believe could be revisited.
In Round 1, HardTable:
- for the table SUJBVAXP, the CEA annotation of (row=3,col=1) is currently Q414375, which is a disambiguation page, while we believe that Q1968338 should be the correct annotation.
In Round 2, ToughTable-WD:
- for the table 1C9LFOKN, the CTA annotation of (col=1) is Q11028 (information) while we believe that Q35657 (U.S. state) should be the correct type;
- for the table 8QA9EYPI, the CTA annotation of (col=0) is Q11028 again while we believe that Q5 (or Q82955) would be correct types;
- for the table LC4VF1A9, the CTA annotation of (col=0) is Q7048977 (non-physical entity) while we believe that Q5 (human) would be more appropriate;
- for the table PRDTMM8A, the CEA annotation of (row=51,col=6) is Q142 (France) while we believe that Q159 (Russia) should be the proper entity;
- for the table W1858N3I, the CEA annotation of (row=336,col=2) is Q8023663 (glossary of video game terms) while we believe that Q25397095 (sandbox game) should be the correct entity.
- for the table LC4VF1A9, cell (84,2) = "singing", Vincenzo has argued why Q27939 is a proper CEA annotation, and we would propose that Q17172850 (voice) is also a proper one, since it is also compatible with the CTA "musical instrument".
- for the table Q7CDPWKD, cell (45,6) = "Unknown (elective))", it is indeed very hard to arrive at Q186431 as the annotation, and we were ourselves fooled by Q76068807.
We suggest that all participating teams also share their doubts and suggestions to improve the gold standards if mistakes are found and acknowledged by the organizers. This so-called adjudication phase is common in many benchmarking competitions in the NLP community and would benefit everyone.
Best regards.
Raphaël Troncy ... on behalf of the DAGOBAH Team