Entities not matching DBpedia taxonomy

Daniel Dahlmeier

Jan 27, 2014, 5:05:48 AM

to micropo...@googlegroups.com

Hi

some of the entities that are annotated in the training data do not seem to match any category given in the documentation.

For example the zodiac sign "Aquarius":

91645803923382272 "#Aquarius your greatest obstacle is your fear of rejection - you'd rather write down your feelings than express them face to face." Aquarius http://dbpedia.org/resource/Aquarius_(astrology)

The taxonomy for "Aquarius" in DBpedia is:

http://dbpedia.org/page/Aquarius_(astrology) -> http://dbpedia.org/class/yago/AstrologicalSigns -> Thing

Astrological Signs is not included in the taxonomy listed in the annotation guidelines (#Microposts2014 Challenge on Named Entity Extraction & Linking (NEEL) Annotation Guidelines).

The taxonomy is included below for reference.

Am I misunderstanding something? Are entities restricted to the categories from the taxonomy given in the annotation guidelines, or not?

regards,

Daniel

Taxonomy

-----------------

Amount

Animal

Bird

Insect

Event

MilitaryConflict

PoliticalEvent

SportEvent

WeatherEvent

MeetingEvent

Function

Job

Location

AdministrativeRegion

Airport

Bridge

Canal

City

Continent

Country

Hospital

Island

Museum

Lake

Lighthouse

Mountain

Park

Restaurant

River

Road

ShoppingMall

Stadium

Station

Valley

Organization

Airline

Band

Broadcast

Company

EducationalInstitution

Legislature

NonProfitOrganisation

RadioStation

SoccerClub

SportsLeague

SportsTeam

TVStation

University

PoliticalOrganisation

Person

Ambassador

Architect

Artist

Astronaut

Athlete

Celebrity

ComicsCharacter

Criminal

FictionalCharacter

Mayor

MusicalArtist

Politician

SoccerPlayer

TennisPlayer

Product

Aircraft

Album

Automobile

Book

Drug

EmailAddress

Magazine

Movie

Newspaper

OperatingSystem

PhoneNumber

ProgrammingLanguage

RadioProgram

SchoolNewspaper

Software

Song

Spacecraft

URL

VideoGame

Weapon

Website

Time

Holiday

Cardinal Direction

Language

Nationality

Numeric Expression

Day of a month

Religion

Season

AstronomicalObject

Planet

Natural Satellite

EthnicGroup

Weather

Sport Name

Fréderic Godin

Jan 27, 2014, 5:16:10 AM

to micropo...@googlegroups.com

Hi,

For me, this was not clear either.

I find many amounts of money, but they are never annotated.

I would assume that these are amounts.

Also, I currently assume that we only need to detect the given subcategories, such as 'Airport' or 'Bridge', but not locations in general.

'Location' is not even part of the DBpedia ontology. In DBpedia, 'Place' is used.

Thanks in advance for clarifying!

Best,

Fréderic

2014-01-27 Daniel Dahlmeier <ddahl...@googlemail.com>

--
You received this message because you are subscribed to the Google Groups "microposts2014" group.
To unsubscribe from this group and stop receiving emails from it, send an email to microposts201...@googlegroups.com.
Visit this group at http://groups.google.com/group/microposts2014.
For more options, visit https://groups.google.com/groups/opt_out.

MSM

Jan 28, 2014, 11:00:04 AM

to micropo...@googlegroups.com, Fréderic Godin

Hi Fréderic,

On 27/01/2014 10:16, Fréderic Godin wrote:

> Hi,
>
> For me, this was not clear either.
> I find many amounts of money, but they are never annotated.
>
> I would assume that these are amounts.

We have only considered entities that can be mapped to DBpedia. Numeric expressions such as "4 million pounds" do not have a DBpedia URI.
Could you please provide some example numeric expressions that you think should have been mapped to DBpedia?

> Also, I currently assume that we only need to detect the given subcategories, such as 'Airport' or 'Bridge', but not locations in general.
>
> 'Location' is not even part of the DBpedia ontology. In DBpedia, 'Place' is used.

Similarly, could you please provide some example entities for these types?

> Thanks in advance for clarifying!
>
> Best,
>
> Fréderic

Thanks very much,
#Microposts2014 Challenge crew

MSM

Jan 28, 2014, 11:02:22 AM

to micropo...@googlegroups.com, Daniel Dahlmeier

Hi Daniel,

Thanks very much for your comment. We have added AstrologicalSign to our taxonomy.

Many thanks,
#Microposts2014 Challenge crew

Stefano Parmesan

Jan 29, 2014, 2:37:04 AM

to micropo...@googlegroups.com

Hi everyone,

Let me try to make the issue clearer:

- The first question that arises is: what has the microposts taxonomy been built on top of?

I personally would expect to match against http://mappings.dbpedia.org/server/ontology/classes/ (used for the property rdf:type), but looking at the entries, something seems out of place. Just going through the taxonomy in order:

Amount -> can't be found in the dbpedia ontology

Animal -> http://mappings.dbpedia.org/server/ontology/classes/Animal

Bird -> http://mappings.dbpedia.org/server/ontology/classes/Bird

Insect -> http://mappings.dbpedia.org/server/ontology/classes/Insect

Event -> http://mappings.dbpedia.org/server/ontology/classes/Event

MilitaryConflict -> http://mappings.dbpedia.org/server/ontology/classes/MilitaryConflict

PoliticalEvent -> can't be found in the dbpedia ontology

SportEvent -> can't be found in the dbpedia ontology

WeatherEvent -> can't be found in the dbpedia ontology

MeetingEvent -> can't be found in the dbpedia ontology

Function -> can't be found in the dbpedia ontology

Job -> can't be found in the dbpedia ontology

Location -> can't be found in the dbpedia ontology

AdministrativeRegion -> http://mappings.dbpedia.org/server/ontology/classes/AdministrativeRegion

Airport -> http://mappings.dbpedia.org/server/ontology/classes/Airport

and so on. It seems to me that this is the wrong ontology, hence the question. I also checked the DBpedia categories (used with the property dcterms:subject), but even there something is out of place (for example, there is no http://dbpedia.org/resource/Category:Amount, even though http://dbpedia.org/resource/Category:Animal exists).

- The second thing is: should we check for exact membership, or should we also consider the parent categories?

I ask this because, for example, in tweet 91921712177889280 we find the entity http://dbpedia.org/resource/God, which is not directly in any of the categories in the taxonomy (with either rdf:type or dcterms:subject). If we check the parents, however, we eventually find http://dbpedia.org/resource/Category:Religion, which is in the taxonomy. This means that we should check all the subcategories as well; but if that is the case, what is the point of having both "Event" and all its children ("MilitaryConflict", "PoliticalEvent", ...) in the taxonomy?
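To make the difference between the two readings concrete, here is a minimal sketch that compares an exact-membership check with an ancestor walk. The hierarchy is a toy one: only the God resource and Category:Religion come from this thread, while the intermediate category, the short identifiers, and the function names are assumptions made up for illustration.

```python
from collections import deque

# Toy hierarchy: node -> its direct types/categories or broader categories.
# "Category:Deities" is a hypothetical intermediate step, not taken from DBpedia.
PARENTS = {
    "resource:God": ["Category:Deities"],
    "Category:Deities": ["Category:Religion"],
    "Category:Religion": [],
}

# Hypothetical excerpt of the challenge taxonomy, expressed as category names.
TAXONOMY = {"Category:Religion", "Category:Animal"}


def matches_exactly(node, taxonomy, parents):
    """True if one of the node's direct types/categories is in the taxonomy."""
    return any(parent in taxonomy for parent in parents.get(node, []))


def matches_via_ancestors(node, taxonomy, parents):
    """Breadth-first walk up the hierarchy; returns the first ancestor found
    in the taxonomy, or None if no ancestor matches."""
    seen = set()
    queue = deque(parents.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # guard against cycles in the category graph
        seen.add(current)
        if current in taxonomy:
            return current
        queue.extend(parents.get(current, []))
    return None


print(matches_exactly("resource:God", TAXONOMY, PARENTS))
print(matches_via_ancestors("resource:God", TAXONOMY, PARENTS))
```

With this toy data the exact check fails, while the ancestor walk reaches Category:Religion. A real system would have to fill PARENTS from DBpedia itself.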

Thanks,

2014-01-28 MSM <msm.o...@gmail.com>

--

Dott. Stefano Parmesan

Backend Web Developer and Data Lover ~ SpazioDati s.r.l.

Via del Brennero, 52 – 38122 Trento – Italy

Fréderic Godin

Jan 29, 2014, 2:43:24 AM

to micropo...@googlegroups.com

Very nice explanation Stefano!

This is the problem I've been struggling with all week.

Best,

Fréderic

2014-01-29 Stefano Parmesan <parm...@spaziodati.eu>

Fréderic Godin

Feb 3, 2014, 9:53:53 AM

to micropo...@googlegroups.com

Dear chairs,

Since it has been a week since Daniel asked the first question about the taxonomy, I was wondering if you were able to take a look at it?

I think many teams are still struggling. Or did someone already find an explanation and did I miss it?

Thanks in advance!

Best,

Fréderic

On Monday, 27 January 2014 at 11:05:48 UTC+1, Daniel Dahlmeier wrote:

Stefano Parmesan

Feb 4, 2014, 3:34:48 AM

to micropo...@googlegroups.com

We didn't, still waiting for an answer...

(it's like hearing the first verse of Comfortably Numb in your head over and over again)

Thanks and regards,

2014-02-03 Fréderic Godin <frederi...@ugent.be>:


#Microposts2014 Chairs

Feb 4, 2014, 4:27:47 AM

to micropo...@googlegroups.com

Dear Stefano, Frederic,

> - The first question that arises is: what has the microposts taxonomy
> been built on top of?

The taxonomy is derived from the NERD ontology
(http://nerd.eurecom.fr/ontology/nerd-v0.5.n3). A set of additional
classes from YAGO has been added.

> - The second thing is: should we check for exact membership, or should
> we also consider the parent categories?

This is up to you and how your system is built.

For the evaluation we *do not consider* any typing information;
we will judge the quality of your submission based on the exact
match of the pair (entity,uri).
Hence, you are free to use any ontology (for instance your own) that
better fits the model of the corpus.
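As an illustration of what exact-match scoring on (entity,uri) pairs could look like, here is a minimal sketch. It is not the official evaluation script: the function name, the inclusion of the tweet id in each item, and the second URI in the sample system output are assumptions for the example.

```python
def score(gold, predicted):
    """Precision, recall and F1 over exactly matching annotation tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold annotation taken from the training example quoted in this thread.
gold = {
    ("91645803923382272", "Aquarius",
     "http://dbpedia.org/resource/Aquarius_(astrology)"),
}

# Hypothetical system output: one exact match plus one pair whose URI differs,
# which therefore counts as a false positive.
predicted = {
    ("91645803923382272", "Aquarius",
     "http://dbpedia.org/resource/Aquarius_(astrology)"),
    ("91645803923382272", "Aquarius",
     "http://dbpedia.org/resource/Aquarius_(constellation)"),
}

precision, recall, f1 = score(gold, predicted)
print(precision, recall, f1)
```

Because only the pair is compared, the class of the entity never enters the computation, which is consistent with typing information being ignored.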

Cheers,
#Microposts2014 Challenge crew

Mena Badieh Habib Morgan

Feb 7, 2014, 7:39:57 AM

to micropo...@googlegroups.com

Hello,
The provided NERD taxonomy is not consistent with DBpedia ontology.
For example there is no equivalent for EmailAddress in the DBpedia ontology.
How could I develop my own methods of disambiguation while you are directing me to a specific ontology of a specific tool?
The task is to link entities to DBpedia, so DBpedia ontology is the only ontology that should be used. Right?
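The mismatch can be listed mechanically. The sketch below is only an illustration: it hard-codes the classes reported earlier in this thread as present in the DBpedia ontology rather than querying DBpedia, and the variable and function names are assumptions.

```python
# Base URL of the DBpedia ontology class pages cited earlier in this thread.
BASE = "http://mappings.dbpedia.org/server/ontology/classes/"

# Classes reported in this thread as present in the DBpedia ontology.
# Hard-coded here; a real check would look each class up instead.
REPORTED_PRESENT = {
    "Animal", "Bird", "Insect", "Event", "MilitaryConflict",
    "AdministrativeRegion", "Airport",
}

# Excerpt of the challenge taxonomy (the entries checked in this thread).
TAXONOMY_EXCERPT = [
    "Amount", "Animal", "Bird", "Insect", "Event", "MilitaryConflict",
    "PoliticalEvent", "SportEvent", "WeatherEvent", "MeetingEvent",
    "Function", "Job", "Location", "AdministrativeRegion", "Airport",
    "EmailAddress",
]


def class_url(name):
    """URL where the class page would live if the class existed."""
    return BASE + name


def split_by_presence(taxonomy, ontology_classes):
    """Partition taxonomy entries into (present, missing), keeping order."""
    present = [name for name in taxonomy if name in ontology_classes]
    missing = [name for name in taxonomy if name not in ontology_classes]
    return present, missing


present, missing = split_by_presence(TAXONOMY_EXCERPT, REPORTED_PRESENT)
for name in present:
    print(name, "->", class_url(name))
print("No DBpedia ontology class reported for:", ", ".join(missing))
```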

Thanks

Mena

#Microposts2014 Chairs

Feb 7, 2014, 8:10:56 AM

to micropo...@googlegroups.com

Dear Mena,

Thanks for sharing your concern.

As we already stated in a previous email in reply to a similar
question, we used the NERD ontology plus a few additions from YAGO2
to prepare both the training and test sets.

> The provided NERD taxonomy is not consistent with DBpedia ontology.
> For example there is no equivalent for EmailAddress in the DBpedia ontology.
> How could I develop my own methods of disambiguation while you are
> directing me to a specific ontology of a specific tool?
> The task is to link entities to DBpedia, so DBpedia ontology is the only
> ontology that should be used. Right?

We have decoupled the typing task from the disambiguation one. Hence,
DBpedia is not the only ontology used to prepare the corpora. Having
said that, you are free to use only DBpedia.

Hope it helps.

Best regards,
#Microposts2014 Challenge crew

Ugo Scaiella

Feb 13, 2014, 10:03:57 AM

to micropo...@googlegroups.com

Dear Chairs,

The main concern regarding the taxonomy is that it is not clear whether participants must filter their results using that taxonomy before submission, or whether you will handle this task yourselves before running the evaluation scripts.

Actually, (almost) all the annotators that participants are using do not rely on that taxonomy, and they will most likely annotate tweets with DBpedia concepts that are not contained in it. This doesn't mean that the annotator is not working properly; it is just that this challenge focuses on a reduced set of DBpedia concepts.

And this is perfectly fine, but the issue is that you will evaluate both precision and recall, so it is important to remove all DBpedia URIs that are not contained in that taxonomy. Otherwise, if the annotator correctly finds a relevant DBpedia URI that is not part of the taxonomy, it will be counted as a false positive, lowering the annotator's precision and in turn the overall F1 score.

Could you please clarify what URIs should be included in the TSV files to be submitted?

If you do not apply such a filter yourselves, i.e. participants have to filter out URIs that are not part of that taxonomy, I think you should clarify how to match a DBpedia URI against that taxonomy, because it is still not clear.
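For illustration, a participant-side filter of the kind described above could look like the following sketch. Everything in it is an assumption: the taxonomy excerpt, the URI-to-types lookup (which would in practice come from DBpedia, NERD or YAGO typing), the second example URI, and the three-column TSV layout. Whether such filtering is required at all is exactly the open question.

```python
# Hypothetical excerpt of the challenge taxonomy (AstrologicalSign was added
# by the chairs earlier in this thread).
TAXONOMY = {"Person", "Location", "Organization", "AstrologicalSign"}

# Hypothetical lookup: URI -> set of class names assigned to that resource.
TYPES_OF = {
    "http://dbpedia.org/resource/Aquarius_(astrology)": {"AstrologicalSign"},
    "http://dbpedia.org/resource/Some_Other_Concept": {"SomeOtherClass"},
}


def filter_by_taxonomy(annotations, types_of, taxonomy):
    """Keep annotations whose URI has at least one type in the taxonomy.
    URIs with no known types are dropped."""
    kept = []
    for tweet_id, mention, uri in annotations:
        if types_of.get(uri, set()) & taxonomy:
            kept.append((tweet_id, mention, uri))
    return kept


def to_tsv_line(annotation):
    """Serialise one annotation as a tab-separated line (assumed layout)."""
    return "\t".join(annotation)


system_output = [
    ("91645803923382272", "Aquarius",
     "http://dbpedia.org/resource/Aquarius_(astrology)"),
    ("91645803923382272", "rejection",
     "http://dbpedia.org/resource/Some_Other_Concept"),
]

for annotation in filter_by_taxonomy(system_output, TYPES_OF, TAXONOMY):
    print(to_tsv_line(annotation))
```

Dropping unknown URIs is the conservative choice for precision, but it costs recall whenever the type lookup is incomplete.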