Systematic Annotation Error in the 1.5 training set

Chang Ming-Wei

Feb 9, 2014, 11:14:17 PM
to micropo...@googlegroups.com
 Hi,
 
Here are some systematic errors in the 1.5 data. We mainly address four issues:

1) A possessive 's should not be part of a mention.
2) Quotes should not be part of a mention.
3) Entity names should be finalized: if an entity name is a redirect, we should follow it and use the final DBpedia page.
4) Some annotation mistakes.

Here are the errors we found in the training set. Please see if they make sense, and please address similar issues in the test data as well.
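For anyone applying the same cleanup to their own copy of the data, here is a minimal Python sketch of the first three rules. The function names and the `redirects` mapping are hypothetical and not part of any challenge tooling; the mapping is assumed to have been built beforehand (for example from DBpedia's redirect data).

```python
from typing import Mapping


def clean_mention(mention: str) -> str:
    """Strip surrounding quotes, then a trailing possessive 's."""
    text = mention.strip()
    # Rule 2: drop a matching pair of surrounding single or double quotes.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        text = text[1:-1]
    # Rule 1: drop a trailing possessive 's.
    if text.endswith("'s"):
        text = text[:-2]
    return text


def resolve_redirect(uri: str, redirects: Mapping[str, str]) -> str:
    """Rule 3: follow redirects until a final page is reached.

    `redirects` is an assumed, pre-built mapping from a redirecting URI
    to its target. A `seen` set guards against redirect cycles.
    """
    seen = set()
    while uri in redirects and uri not in seen:
        seen.add(uri)
        uri = redirects[uri]
    return uri


if __name__ == "__main__":
    print(clean_mention("Michael Jackson's"))
    print(clean_mention("'Bodyguard'"))
    example = {
        "http://dbpedia.org/resource/CCTV":
            "http://dbpedia.org/resource/Closed-circuit_television",
    }
    print(resolve_redirect("http://dbpedia.org/resource/CCTV", example))
```

Note that this sketch strips every trailing 's, including from names that legitimately end in one.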
 
Entity: 100998532747640832 http://dbpedia.org/resource/CCTV should be http://dbpedia.org/resource/Closed-circuit_television
Entity: 92955019615272961 http://dbpedia.org/resource/Ephesians should be http://dbpedia.org/resource/Epistle_to_the_Ephesians
Entity: 94811433027633152 http://dbpedia.org/resource/ENews_Channel should be http://dbpedia.org/resource/E!News
Entity: 99144320430518274 http://dbpedia.org/resource/Gremio should be http://dbpedia.org/resource/Gr%C3%AAmio_Foot-Ball_Porto_Alegrense
Entity: 92999023346196480 http://dbpedia.org/resource/Visa_policy_of_the_United_States should be http://dbpedia.org/resource/United_States_visas
Entity: 92999023346196480 http://dbpedia.org/resource/Visa_policy_of_the_United_States should be http://dbpedia.org/resource/United_States_visas
Entity: 91797766837252097 http://dbpedia.org/resource/News_International should be http://dbpedia.org/resource/NI_Group
Entity: 94484528395071488 http://dbpedia.org/resource/Reporter should be http://dbpedia.org/resource/Journalist
Entity: 99598183302299648 http://dbpedia.org/resource/Kansas_University should be http://dbpedia.org/resource/University_of_Kansas
Entity: 97439276295397376 http://dbpedia.org/resource/Zach_Walters_(baseball) should be http://dbpedia.org/resource/Washington_Nationals_minor_league_players#Zach_Walters
Entity: 92275090007404544 http://dbpedia.org/resource/Mail_on_Sunday should be http://dbpedia.org/resource/The_Mail_on_Sunday
Entity: 92934213975805952 http://dbpedia.org/resource/Goo_Hara should be http://dbpedia.org/resource/Goo_Ha-ra
Entity: 93274772011626496 http://dbpedia.org/resource/CNN_Money should be http://dbpedia.org/resource/CNNMoney.com
Entity: 99113051927752704 http://dbpedia.org/resource/Dietrick_Hall should be http://dbpedia.org/resource/Campus_of_Virginia_Tech
Entity: 93068195799371776 http://dbpedia.org/resource/Tulsa should be http://dbpedia.org/resource/Tulsa,_Oklahoma
Entity: 92647293219643392 http://dbpedia.org/resource/6_August should be http://dbpedia.org/resource/August_6
Entity: 91649388589494272 http://dbpedia.org/resource/School_of_Audio_Engineering should be http://dbpedia.org/resource/SAE_Institute
Entity: 92433257878134784 http://dbpedia.org/resource/Argus_Filch should be http://dbpedia.org/resource/Hogwarts_staff#Argus_Filch
Entity: 93808293348257792 http://dbpedia.org/resource/10_September should be http://dbpedia.org/resource/September_10
Entity: 103087658267455488 http://dbpedia.org/resource/Southend should be http://dbpedia.org/resource/Southend-on-Sea
Entity: 103087658267455488 http://dbpedia.org/resource/Southend_Victoria should be http://dbpedia.org/resource/Southend_Victoria_railway_station
Entity: 92201997121499136 http://dbpedia.org/resource/Pooh should be http://dbpedia.org/resource/Winnie-the-Pooh
Entity: 101392563843502082 http://dbpedia.org/resource/Austin360 should be http://dbpedia.org/resource/Austin_American-Statesman
Entity: 92714300258529280 http://dbpedia.org/resource/Catherine_II_of_Russia should be http://dbpedia.org/resource/Catherine_the_Great
Entity: 100373381538529280 http://dbpedia.org/resource/Reno_Tahoe_Open should be http://dbpedia.org/resource/Reno–Tahoe_Open
Mention: 102923598414622720 News1130's should be News1130
Entity: 94142638093115392 http://dbpedia.org/resource/Greek_debt_crisis should be http://dbpedia.org/resource/Greek_government-debt_crisis
Entity: 92741631727509504 http://dbpedia.org/resource/Krabby_Patty should be http://dbpedia.org/resource/SpongeBob_SquarePants
Entity: 101668632941178880 http://dbpedia.org/resource/Falcon_HTV-2 should be http://dbpedia.org/resource/Hypersonic_Technology_Vehicle_2
Entity: 100010138022330368 http://dbpedia.org/resource/Japanese_economy should be http://dbpedia.org/resource/Economy_of_Japan
Entity: 95069213244399617 http://dbpedia.org/resource/Norwegian_Police should be http://dbpedia.org/resource/Norwegian_Police_Service
Entity: 98116519757742080 http://dbpedia.org/resource/UEFA_European_Championship should be http://dbpedia.org/resource/UEFA_European_Football_Championship
Entity: 95553711010619393 http://dbpedia.org/resource/Sport_news should be http://dbpedia.org/resource/Sports_journalism
Entity: 91957731992420352 http://dbpedia.org/resource/New_Jersey_Nets should be http://dbpedia.org/resource/Brooklyn_Nets
Entity: 93672807057203201 http://dbpedia.org/resource/NFLPA should be http://dbpedia.org/resource/National_Football_League_Players_Association
Entity: 98017799158497280 http://dbpedia.org/resource/Demographics_of_US should be http://dbpedia.org/resource/Demographics_of_the_United_States
Entity: 93086151035994113 http://dbpedia.org/resource/News_International should be http://dbpedia.org/resource/NI_Group
Entity: 94115609981358080 http://dbpedia.org/resource/News_International should be http://dbpedia.org/resource/NI_Group
Entity: 94818647805140993 http://dbpedia.org/resource/SCAF should be http://dbpedia.org/resource/Supreme_Council_of_the_Armed_Forces
Entity: 92398639233777664 http://dbpedia.org/resource/Janelle_Monae should be http://dbpedia.org/resource/Janelle_Mon%C3%A1e
Entity: 93033242663469056 http://dbpedia.org/resource/Red_state should be http://dbpedia.org/resource/Red_states_and_blue_states
Entity: 91970435507421184 http://dbpedia.org/resource/American_people should be http://dbpedia.org/resource/Americans
Entity: 99288973943390209 http://dbpedia.org/resource/Ice_Age_(2002_film) should be http://dbpedia.org/resource/Ice_Age_(film)
Entity: 93071762639699968 http://dbpedia.org/resource/News_International should be http://dbpedia.org/resource/NI_Group
Entity: 91944850680844288 http://dbpedia.org/resource/007 should be http://dbpedia.org/resource/James_Bond_(literary_character)
Mention: 92229656115286016 Michael Jackson's should be Michael Jackson
Entity: 92630700901138432 http://dbpedia.org/resource/18_November should be http://dbpedia.org/resource/November_18
Entity: 96228351936692224 http://dbpedia.org/resource/Noynoy_Aquino should be http://dbpedia.org/resource/Benigno_Aquino_III
Entity: 93059420296183808 http://dbpedia.org/resource/Black_race should be http://dbpedia.org/resource/Black_people
Entity: 93059420296183808 http://dbpedia.org/resource/White_race should be http://dbpedia.org/resource/White_people
Entity: 99252118250201089 http://dbpedia.org/resource/I-tunes should be http://dbpedia.org/resource/ITunes
Entity: 91804404180725760 http://dbpedia.org/resource/Pee_Wee_Herman should be http://dbpedia.org/resource/Pee-wee_Herman
Entity: 92574480072839168 http://dbpedia.org/resource/Hynix should be http://dbpedia.org/resource/SK_Hynix
Mention: 100960553220046848 Sainsbury's should be Sainsbury
Entity: 93144205001629696 http://dbpedia.org/resource/United_States_court should be http://dbpedia.org/resource/United_States_federal_courts
Entity: 92680562950668288 http://dbpedia.org/resource/Fred_Weasley should be http://dbpedia.org/resource/Dumbledore's_Army#Fred_and_George_Weasley
Entity: 100930986484842496 http://dbpedia.org/resource/Londoners should be http://dbpedia.org/resource/Londoner
Entity: 102715817497604096 http://dbpedia.org/resource/A3_road_(Great_Britain) should be http://dbpedia.org/resource/A3_road
Entity: 102715817497604096 http://dbpedia.org/resource/A3_road_(Great_Britain) should be http://dbpedia.org/resource/A3_road
Entity: 92034073391923201 http://dbpedia.org/resource/RSA_Security should be http://dbpedia.org/resource/RSA_(security_firm)
Entity: 91957934195609600 http://dbpedia.org/resource/Grim_Reaper should be http://dbpedia.org/resource/Death_(personification)
Entity: 101032158956761088 http://dbpedia.org/resource/SkyNews should be http://dbpedia.org/resource/Sky_News
Mention: 102124541182091264 ICameron's should be ICameron
Entity: 92238889653256192 http://dbpedia.org/resource/Christopher_Wallace,_Jr. should be http://dbpedia.org/resource/The_Notorious_B.I.G.
Entity: 94063258499153920 http://dbpedia.org/resource/Editor should be http://dbpedia.org/resource/Editing
Entity: 92381667926351872 http://dbpedia.org/resource/Vessels should be http://dbpedia.org/resource/Vessel
Mention: 94156730929381376 Fodor's should be Fodor
Entity: 93301545906606081 http://dbpedia.org/resource/CBS_21 should be http://dbpedia.org/resource/WHP-TV
Entity: 93772300826050561 http://dbpedia.org/resource/Florida_People should be http://dbpedia.org/resource/List_of_people_from_Florida
Entity: 92205398123216896 http://dbpedia.org/resource/Ohi_Nuclear_Power_Plant should be http://dbpedia.org/resource/%C5%8Ci_Nuclear_Power_Plant
Entity: 92205398123216896 http://dbpedia.org/resource/Ohi_Nuclear_Power_Plant should be http://dbpedia.org/resource/%C5%8Ci_Nuclear_Power_Plant
Entity: 96243745539891200 http://dbpedia.org/resource/Norwegian_Government should be http://dbpedia.org/resource/Politics_of_Norway
Entity: 94764774340046849 http://dbpedia.org/resource/1._FC_Cologne should be http://dbpedia.org/resource/1._FC_K%C3%B6ln
Entity: 96273919115403264 http://dbpedia.org/resource/Vallares should be http://dbpedia.org/resource/Genel_Energy
Entity: 96369935135158272 http://dbpedia.org/resource/July_27th should be http://dbpedia.org/resource/July_27
Entity: 96369935135158272 http://dbpedia.org/resource/Physicists should be http://dbpedia.org/resource/Physicist
Entity: 91937116287799296 http://dbpedia.org/resource/Catholic should be http://dbpedia.org/resource/Catholicism
Entity: 91937116287799296 http://dbpedia.org/resource/Protestant should be http://dbpedia.org/resource/Protestantism
Entity: 96924850878287872 http://dbpedia.org/resource/Alvaro_Pereira should be http://dbpedia.org/resource/%C3%81lvaro_Pereira
Entity: 92287584750936064 http://dbpedia.org/resource/Lily_Evans should be http://dbpedia.org/resource/Order_of_the_Phoenix_(fiction)#Lily_Potter
Entity: 94247453636829184 http://dbpedia.org/resource/Finnish_people should be http://dbpedia.org/resource/Finns
Entity: 102620482892869633 http://dbpedia.org/resource/Chinese_Navy should be http://dbpedia.org/resource/People's_Liberation_Army_Navy
Entity: 91628739892490240 http://dbpedia.org/resource/Half-Blood_Prince should be http://dbpedia.org/resource/Harry_Potter_and_the_Half-Blood_Prince
Entity: 97638606843297793 http://dbpedia.org/resource/Syrian_government should be http://dbpedia.org/resource/Council_of_Ministers_(Syria)
Entity: 92643760508387328 http://dbpedia.org/resource/26_January should be http://dbpedia.org/resource/January_26
Entity: 91885515309187072 http://dbpedia.org/resource/Harry-Potter should be http://dbpedia.org/resource/Harry_Potter
Entity: 98060650676363264 http://dbpedia.org/resource/Retail_Distribution_Review should be http://dbpedia.org/resource/Financial_Services_Authority#Retail_consumers
Entity: 99594637676789760 http://dbpedia.org/resource/Ramadan_Massacre should be http://dbpedia.org/resource/Siege_of_Hama_(2011)
Entity: 103207560827514880 http://dbpedia.org/resource/SEAL_Team_Six should be http://dbpedia.org/resource/United_States_Naval_Special_Warfare_Development_Group
Entity: 95555792912134144 http://dbpedia.org/resource/Canadian_politics should be http://dbpedia.org/resource/Politics_of_Canada
Entity: 93525433684983808 http://dbpedia.org/resource/Israeli_intelligence should be http://dbpedia.org/resource/Mossad
Entity: 92006871375945728 http://dbpedia.org/resource/Good_Weekend_(Sydney_Morning_Herald) should be http://dbpedia.org/resource/The_Sydney_Morning_Herald
Mention: 92557111468363776 Afghanistan's should be Afghanistan
Entity: 91651716184948736 http://dbpedia.org/resource/Social_security_system should be http://dbpedia.org/resource/Social_security
Entity: 93960668197289984 http://dbpedia.org/resource/Scottish_and_Southern_Energy should be http://dbpedia.org/resource/SSE_plc
Mention: 93047259066925056 'Fred Weasley' should be Fred Weasley
Entity: 93923166950395904 http://dbpedia.org/resource/Cnbc_tv18 should be http://dbpedia.org/resource/CNBC-TV18
Entity: 94421647397888000 http://dbpedia.org/resource/PETA should be http://dbpedia.org/resource/People_for_the_Ethical_Treatment_of_Animals
Entity: 97969923694931968 http://dbpedia.org/resource/US_TV should be http://dbpedia.org/resource/Television_in_the_United_States
Entity: 98355416865587200 http://dbpedia.org/resource/News_International should be http://dbpedia.org/resource/NI_Group
Entity: 97408964903448576 http://dbpedia.org/resource/WKYT should be http://dbpedia.org/resource/WKYT-TV
Entity: 100947346967367680 http://dbpedia.org/resource/Hertfordshire_Police should be http://dbpedia.org/resource/Hertfordshire_Constabulary
Entity: 102156455674789888 http://dbpedia.org/resource/Italian_government should be http://dbpedia.org/resource/Politics_of_Italy
Mention: 94209038987952128 Demi's should be Demi
Mention: 93661388681125888 'Bodyguard' should be Bodyguard
Entity: 99885981905321985 http://dbpedia.org/resource/Navy_seals should be http://dbpedia.org/resource/United_States_Navy_SEALs
Entity: 99124519112949762 http://dbpedia.org/resource/Dietrick_Hall should be http://dbpedia.org/resource/Campus_of_Virginia_Tech
Entity: 91970961699647488 http://dbpedia.org/resource/Angela_Simmons should be http://dbpedia.org/resource/Run's_House#The_Simmons_Family
Entity: 99121683209785344 http://dbpedia.org/resource/Dietrick_Hall should be http://dbpedia.org/resource/Campus_of_Virginia_Tech
Entity: 92856534454906880 http://dbpedia.org/resource/Cambodian-Thai_border_dispute should be http://dbpedia.org/resource/Cambodian–Thai_border_dispute
Entity: 93099256562450433 http://dbpedia.org/resource/Harry_Potter_books should be http://dbpedia.org/resource/Harry_Potter
Entity: 93099256562450433 http://dbpedia.org/resource/Scoobydoo should be http://dbpedia.org/resource/Scooby-Doo
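
The lines above follow a fixed pattern (kind, numeric ID, old value, "should be", new value), so they can be read mechanically. A minimal Python sketch, assuming one correction per line; the `Correction` type and `parse_correction` function are hypothetical names used only for illustration:

```python
import re
from typing import NamedTuple, Optional

# "<Entity|Mention>: <numeric id> <old value> should be <new value>"
LINE_RE = re.compile(r"^(Entity|Mention): (\d+) (.+?) should be (.+)$")


class Correction(NamedTuple):
    kind: str       # "Entity" or "Mention"
    record_id: str  # numeric ID from the line, kept as a string
    old: str        # value currently in the data
    new: str        # proposed replacement


def parse_correction(line: str) -> Optional[Correction]:
    """Parse one correction line; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return Correction(*match.groups())


if __name__ == "__main__":
    print(parse_correction(
        "Mention: 92229656115286016 Michael Jackson's should be Michael Jackson"
    ))
```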
 
Ming-Wei

MSM

Feb 18, 2014, 4:48:14 PM
to micropo...@googlegroups.com, Chang Ming-Wei
Dear Ming-Wei,

Thanks very much for your comments. We have already addressed them in v1.6.
We kept the apostrophes in mentions of entities whose names contain one (e.g. Sainsbury's or Fodor's), removed the quotes inside the entity mentions, and resolved the redirects you suggested.
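
As a rough illustration of the apostrophe rule, here is a minimal Python sketch. The `KEEP_APOSTROPHE` set and the `strip_possessive` function are hypothetical; the set contains only the two names mentioned in this thread and would need to be extended by hand.

```python
# Names whose apostrophe is part of the entity name itself (assumed list,
# taken only from the examples mentioned in this thread).
KEEP_APOSTROPHE = frozenset({"Sainsbury's", "Fodor's"})


def strip_possessive(mention: str, keep=KEEP_APOSTROPHE) -> str:
    """Drop a trailing possessive 's unless the name is in `keep`."""
    if mention in keep:
        return mention
    if mention.endswith("'s"):
        return mention[:-2]
    return mention


if __name__ == "__main__":
    for m in ("Sainsbury's", "Afghanistan's", "Demi"):
        print(m, "->", strip_possessive(m))
```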

Many thanks.
Best regards,
#Microposts2014 NEEL Challenge crew