Some thoughts re benchmarking dataset dynamics solutions ...


mhausenblas

Dec 5, 2009, 11:15:58 AM
to Dataset Dynamics

juum

Dec 5, 2009, 1:41:14 PM
to Dataset Dynamics
Hi Michael,
not bad, suga dady is alive ;-)

I'm not really sure what the benchmark is for, or whether it is a
benchmark yet rather than just a test bed.

Benchmarks, in general, are for measuring performance or
characteristics of hardware or software.
So my main question is: "What characteristics do we want to measure?"
- delta functions for granularity
- how long does it take to compute the deltas?
- how big are the deltas?
- how long does it take to transform a dataset from one version into
another (based on the delta notifications)?
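
The measurements listed above can be sketched roughly as follows. This is only a minimal illustration, assuming a dataset version is a plain set of (subject, predicate, object) tuples and a delta is a pair of added and removed triples; the function names and the example triples are made up for the sketch and do not come from any existing dady tool.

```python
import time

def compute_delta(old, new):
    """Return (added, removed) triples between two dataset versions."""
    old, new = set(old), set(new)
    return new - old, old - new

def apply_delta(dataset, added, removed):
    """Transform one version into the next using the delta."""
    return (set(dataset) - removed) | added

def measure(old, new):
    """Collect the characteristics from the list: delta size and timings."""
    start = time.perf_counter()
    added, removed = compute_delta(old, new)
    delta_seconds = time.perf_counter() - start

    start = time.perf_counter()
    rebuilt = apply_delta(old, added, removed)
    apply_seconds = time.perf_counter() - start

    return {
        "delta_size": len(added) + len(removed),  # how big is the delta
        "delta_seconds": delta_seconds,           # time to compute it
        "apply_seconds": apply_seconds,           # time to transform v1 into v2
        "roundtrip_ok": rebuilt == set(new),      # sanity check on the delta
    }

# Hypothetical example data (two versions of a tiny dataset).
v1 = {("ex:a", "ex:name", "Alice"), ("ex:a", "ex:knows", "ex:b")}
v2 = {("ex:a", "ex:name", "Alicia"), ("ex:a", "ex:knows", "ex:b"),
      ("ex:b", "ex:name", "Bob")}
result = measure(v1, v2)
print(result["delta_size"], result["roundtrip_ok"])
```

A triple-level set difference is just one possible delta function; coarser or finer granularities (per resource, per graph) would give different sizes and timings for the same pair of versions.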

And/or do we want to provide a solution to simulate dataset dynamics?
We could offer a framework (e.g. a Java servlet) which takes the setup
as input and returns the respective dataset URIs. Once the framework
is activated, it changes the datasets according to their change
frequency and some change functions (e.g. randomly change property
data values or change triples ...).
Example:
Input:
10 datasets (size 100 triples each) of type 1,
20 datasets (size 100 triples each) of type 4,
5 datasets (size 100 triples each) of type 30.

I would suggest adding more relevant dimensions:
- ratio between ADD, DEL and UPDATE changes
- simulate changes for links and/or object values.
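
A rough sketch of such a simulator, using the example setup and the ADD/DEL/UPDATE ratio as inputs. Everything here is an assumption made for illustration: the thread does not define what a dataset "type" means, so the sketch simply treats the type number as the number of changes applied per simulation step, and the names `make_dataset`, `change_step` and `SETUP` are hypothetical.

```python
import itertools
import random

_fresh = itertools.count()  # unique ids for newly created subjects and values

def make_dataset(name, size):
    """Build a synthetic dataset of `size` triples."""
    return {(f"{name}/s{i}", "ex:value", str(i)) for i in range(size)}

def change_step(dataset, n_changes, ratio, rng):
    """Apply n_changes in place; ratio holds (ADD, DEL, UPDATE) weights."""
    counts = {"ADD": 0, "DEL": 0, "UPDATE": 0}
    for _ in range(n_changes):
        kind = rng.choices(["ADD", "DEL", "UPDATE"], weights=ratio)[0]
        if not dataset:
            kind = "ADD"  # nothing left to delete or update
        n = next(_fresh)
        if kind == "ADD":
            dataset.add((f"new/s{n}", "ex:value", f"v{n}"))
        else:
            s, p, o = rng.choice(sorted(dataset))
            dataset.remove((s, p, o))
            if kind == "UPDATE":
                # UPDATE = same subject and property, new object value
                dataset.add((s, p, f"v{n}"))
        counts[kind] += 1
    return counts

# (number of datasets, triples per dataset, type) from the example above.
# ASSUMPTION: "type" is read as changes per step; the thread leaves it open.
SETUP = [(10, 100, 1), (20, 100, 4), (5, 100, 30)]

rng = random.Random(42)
datasets = [
    (dtype, make_dataset(f"ds/type{dtype}/{i}", size))
    for count, size, dtype in SETUP
    for i in range(count)
]

totals = {"ADD": 0, "DEL": 0, "UPDATE": 0}
for dtype, triples in datasets:
    counts = change_step(triples, n_changes=dtype, ratio=(2, 1, 1), rng=rng)
    for kind, c in counts.items():
        totals[kind] += c
print(totals)
```

Changing links rather than literal values would only need a different UPDATE branch (replace the object with another subject URI); the ratio tuple is the knob for the ADD/DEL/UPDATE dimension.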

Have a nice weekend,
Juergen

Michael Hausenblas

Dec 5, 2009, 1:56:18 PM
to dady

Juergen,

You've got some good questions in there, indeed!

> So my main question is: "What characteristics do we want to measure?"

Hm, looking at slide 6 ("Solutions") of [1], I'd say the main question is a
rather high-level one: given a dataset with certain dynamics (in terms of
change volume, change frequency, etc.), I as the consumer of this dataset
want to assess which solution to pick (is crawling best for me, is a
notification-based approach good, etc.). Of course this also depends on the
use case (for an indexer the answer might look different than for an
app). However, if we don't have a defined set of datasets with certain
dynamics specified upfront, how shall we test our implementation? And once
someone else comes along and wants to compare theirs, how is that done?
So, you get the idea, right? ;)

Cheers,
Michael

[1] http://webofdata.files.wordpress.com/2009/12/ldc09-demo-screencast.pdf

--
Dr. Michael Hausenblas
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

behas

Dec 6, 2009, 3:53:55 PM
to Dataset Dynamics
Concerning evaluation and benchmarking: check the "unofficial"
document Niko described here:
http://groups.google.com/group/dataset-dynamics/web/technologies-to-build-on

Michael Hausenblas

Dec 7, 2009, 9:10:19 AM
to dady
Bernhard, Niko,

> Concerning evaluation and benchmarking: check the "unofficial"
> document Niko described here:
> http://groups.google.com/group/dataset-dynamics/web/technologies-to-build-on

This is in fact a perfect starting point, yes.

BTW, did we fix the date/time for our telecon already?


Cheers,
Michael


> From: behas <bernhard....@univie.ac.at>
> Reply-To: dady <dataset-...@googlegroups.com>
> Date: Sun, 6 Dec 2009 12:53:55 -0800 (PST)
> To: dady <dataset-...@googlegroups.com>
> Subject: [dady] Discussion on benchmarking
>
>

Bernhard Haslhofer

Dec 7, 2009, 9:48:57 AM
to dataset-...@googlegroups.com
yep, http://groups.google.com/group/dataset-dynamics/web/teleconferences

Wed. 6pm GMT+1 (Vienna) = 5pm Irish time?

bernhard

______________________________________________________
Research Group Multimedia Information Systems
Department of Distributed and Multimedia Systems
Faculty of Computer Science
University of Vienna

Postal Address: Liebiggasse 4/3-4, 1010 Vienna, Austria
Phone: +43 1 42 77 39635 Fax: +43 1 4277 39649
E-Mail: bernhard....@univie.ac.at
WWW: http://www.cs.univie.ac.at/bernhard.haslhofer

Daniel Koller

Dec 7, 2009, 10:05:21 AM
to dataset-...@googlegroups.com
BTW, my Skype ID: dk19061979

Daniel

On Mon, Dec 7, 2009 at 3:48 PM, Bernhard Haslhofer <bernhard....@univie.ac.at> wrote:

yep, http://groups.google.com/group/dataset-dynamics/web/teleconferences

--
---
Daniel Koller
Jahnstrasse 20
80469 München * dako...@googlemail.com

Michael Hausenblas

Dec 7, 2009, 10:11:02 AM
to dady

> yep, http://groups.google.com/group/dataset-dynamics/web/teleconferences
>
> Wed. 6pm GMT+1 (Vienna) = 5pm Irish time?

You're right. Must have slipped my attention, sorry (aka Michael's such a
slacker ;)

Cheers,
Michael

