Fwd: [Dbpedia-discussion] DBpedia Data Quality Evaluation Campaign

Amrapali Zaveri

Nov 27, 2012, 10:40:23 AM
to dbpedia-da...@googlegroups.com, Dr.Jens Lehmann, Dimitris Kontokostas, Mohamed Morsey, Mohamed Sherif, Sören Auer
FYI, email below from Magnus Knuth.

Begin forwarded message:

From: "Knuth, Magnus" <Magnus...@hpi.uni-potsdam.de>
Subject: Re: [Dbpedia-discussion] DBpedia Data Quality Evaluation Campaign
Date: 27 November 2012 16:25:31 CET
To: Amrapali Zaveri <zav...@informatik.uni-leipzig.de>

Hello Amrapali,

I just used your TripleCheckMate tool and have some questions and remarks.

The tool is not really easy or enjoyable to use. For several entities I get so many triples that I have trouble scanning them all by eye.

Why did you include the dbp-prop triples? Are they somehow relevant for your evaluation? I always considered them as some kind of raw-data or fallback, to see what was not properly extracted by the framework.

Did you use the dataset I provided to preselect entities that have been marked as potentially inconsistent, or are they selected randomly?

The error types are defined pretty narrowly, and it was hard for me to realize that the error I had in mind was not in your list. It would be helpful to have a way to submit new error types of one's own. It is also problematic that I have to select the error for each triple I consider wrong, since often the same error holds for a number of triples.

I would really like to join the discussion about the evaluation and cleanup efforts for DBpedia and Linked Data in general. Maybe we can talk on the phone about further ideas.

Kind regards
Magnus


Regards,
Ms. Amrapali Zaveri Gokhale

University of Leipzig - Department of Computer Science
Paulinum 618, Augustusplatz 10, 04109 Leipzig, Germany
http://aksw.org/AmrapaliZaveri

Magnus Knuth

Nov 27, 2012, 11:12:42 AM
to dbpedia-da...@googlegroups.com, Dr.Jens Lehmann, Dimitris Kontokostas, Mohamed Morsey, Mohamed Sherif, Sören Auer, zav...@informatik.uni-leipzig.de
Cool, you forwarded my mail already to this group. :)

Amrapali Zaveri

Nov 28, 2012, 6:07:03 AM
to Knuth, Magnus, dbpedia-da...@googlegroups.com
Dear Magnus,

Thanks for your email.

On 27 Nov 2012, at 16:25, Knuth, Magnus wrote:

> Hello Amrapali,
>
> I just used your TripleCheckMate tool and have some questions and remarks.
>
> The tool is not really easy or enjoyable to use. For several entities I get so many triples that I have trouble scanning them all by eye.
It does depend on the resource: some have a lot of triples while others have very few.
>
> Why did you include the dbp-prop triples? Are they somehow relevant for your evaluation? I always considered them as some kind of raw-data or fallback, to see what was not properly extracted by the framework.
We included everything so as to get an idea of the overall quality of DBpedia. We are aware that the dbpprop triples are not always of the best quality, but sometimes important information that is not extracted with any of the other properties is extracted with the dbpprop properties.
>
> Did you use the dataset I provided to preselect entities that have been marked as potentially inconsistent, or are they selected randomly?
No, not for this evaluation since we were interested in all kinds of errors.
>
> The error types are defined pretty narrowly, and it was hard for me to realize that the error I had in mind was not in your list. It would be helpful to have a way to submit new error types of one's own.
We have provided a comment box when choosing the error type where you may add another error type, if it does not already exist.
>
> It is also problematic that I have to select the error for each triple I consider wrong, since often the same error holds for a number of triples.
Not sure what you mean. We need to record the error type for each triple for our evaluation.
>
> I would really like to join the discussion about the evaluation and cleanup efforts for DBpedia and Linked Data in general. Maybe we can talk on the phone about further ideas.
Sure!
>
> Kind regards
> Magnus
>
>
> Am 22.11.2012 um 10:24 schrieb Amrapali Zaveri:
>
>> Hi,
>>
>> In case you missed the following email, here's a chance to win either a Samsung Galaxy Tab 2 or an Amazon voucher worth 300 Euro!
>>
>> Please help us in evaluating the quality of DBpedia by using the tool: http://nl.dbpedia.org:8080/TripleCheckMate/. Those who evaluate 10 or more resources have a higher chance of winning. So, go ahead, start evaluating now!
>>
>> Thank you very much for your time.
>> Regards,
>> DBpedia Data Quality Evaluation Team.
>>
>> On 15 Nov 2012, at 17:58, zav...@informatik.uni-leipzig.de wrote:
>>
>>> Dear all,
>>>
>>> As we all know, DBpedia is an important dataset in Linked Data as it
>>> is not only connected to and from numerous other datasets, but it also
>>> is relied upon for useful information. However, quality problems are
>>> inherent in DBpedia, be it in terms of incorrectly extracted values or
>>> datatype problems, since it contains information extracted from
>>> crowd-sourced content.
>>>
>>> However, not all the data quality problems are automatically
>>> detectable. Thus, we aim at crowd-sourcing the quality assessment of
>>> the dataset. In order to perform this assessment, we have developed a
>>> tool whereby a user can evaluate a random resource by analyzing each
>>> triple individually and store the results. Therefore, we would like to
>>> request you to help us by using the tool and evaluating a minimum of 3
>>> resources. Here is the link to the tool:
>>> http://nl.dbpedia.org:8080/TripleCheckMate/, which also includes
>>> details on how to use it.
>>>
>>> In order to thank you for your contributions, a lucky winner will win
>>> either a Samsung Galaxy Tab 2 or an Amazon voucher worth 300 Euro. So,
>>> go ahead, start evaluating now! The deadline for submitting your
>>> evaluations is 9th December, 2012.
>>>
>>> If you have any questions or comments, please do not hesitate to
>>> contact us at dbpedia-da...@googlegroups.com.
>>>
>>> Thank you very much for your time.
>>>
>>> Regards,
>>> DBpedia Data Quality Evaluation Team.
>>> https://groups.google.com/d/forum/dbpedia-data-quality
>>>
>>> _______________________________________________
>>> Dbpedia-discussion mailing list
>>> Dbpedia-d...@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>>