Re: Issue 44 in void-impl: Mirroring of datasets should be describable with voiD


void...@googlecode.com

Dec 8, 2010, 3:22:31 AM
to void-di...@googlegroups.com

Comment #7 on issue 44 by K.J.W.Alexander: Mirroring of datasets should be
describable with voiD
http://code.google.com/p/void-impl/issues/detail?id=44

A year on, there are now a lot more real-world situations that need to be
described with something like "void:mirror": OpenLink, Talis, Swirrl, TSO,
Sindice etc. are all hosting copies of datasets published primarily somewhere
else.

Some scenarios:

X is exact copy of Y
X is copy of older version of Y
X is copy of subset of Y
X is superset of copy of Y (e.g. John Goodwin took the BIS dataset and
  augmented it with selected Ordnance Survey data)
X is modified copy of Y (e.g. John changed some of the triples in BIS to
  point to OS PostcodeUnits instead of BIS locations)
X provides similar service to Y (e.g. overlap between various music
  datasets)
Uberblic synthesises data from dbpedia, geonames etc., mirroring changes in
  real time, but using different URIs
... will have to do for now, gotta go...

void...@googlecode.com

Dec 8, 2010, 5:21:09 AM
to void-di...@googlegroups.com

Comment #8 on issue 44 by rich...@cyganiak.de: Mirroring of datasets should
be describable with voiD

@Keith: Good observation, the relationships between datasets out there
definitely have become a lot more complex recently.

Some of this can be expressed with void:subset, and for most you'd need
more than just void:mirror. The use case of “dataset X is a modified
version of dataset Y” falls mostly under Issue 3, IMO.
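
To make the void:subset part concrete, here is a minimal sketch for the "X is copy of subset of Y" scenario. It only states which slice of Y is meant. The "X is a copy of that slice" link is exactly what VoID cannot say yet, so it is left out. The example.org URIs and the function name are made up for illustration; void:subset and void:Dataset are existing VoID terms.

```python
from typing import List

VOID = "http://rdfs.org/ns/void#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def subset_triples(whole: str, part: str) -> List[str]:
    """Return N-Triples lines saying that `part` is a void:subset of `whole`."""
    return [
        f"<{whole}> <{RDF_TYPE}> <{VOID}Dataset> .",
        f"<{part}> <{RDF_TYPE}> <{VOID}Dataset> .",
        # void:subset points from the containing dataset to the contained one.
        f"<{whole}> <{VOID}subset> <{part}> .",
    ]


if __name__ == "__main__":
    # Placeholder URIs, not real datasets.
    for line in subset_triples("http://example.org/Y", "http://example.org/Y-slice"):
        print(line)
```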

void...@googlecode.com

Dec 8, 2010, 5:40:21 AM
to void-di...@googlegroups.com

Comment #9 on issue 44 by Michael.Hausenblas: Mirroring of datasets should
be describable with voiD

Keith, is this a proposal to do it in MS2? If so, -1.

Don't get me wrong, I think it is important, but it looks to me as if
addressing it now would risk screwing up the release plan.

Thoughts?

Keith Alexander

Dec 8, 2010, 12:55:39 PM
to void-di...@googlegroups.com
Michael,
I agree. I didn't change the release number; I'm just marking the place
because it's been cropping up for me again recently.

Richard,
Yes. Two broad use cases strike me:
1. Provenance (issue 3). Possibly a little complex, lots of variations
to consider.
2. X is down/slow/out of date, try Y instead. It may be easier to have
some kind of simple relationship that facilitates this, something like
void:alternativeDataset.
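
A rough sketch of how a client might use the second case, assuming the alternatives have already been read out of a voiD description into a plain dict. void:alternativeDataset is only a suggested name here, not an existing VoID term, and pick_dataset and the is_usable check are hypothetical helpers.

```python
from typing import Callable, Dict, List, Optional


def pick_dataset(
    primary: str,
    alternatives: Dict[str, List[str]],
    is_usable: Callable[[str], bool],
) -> Optional[str]:
    """Return the primary dataset if usable, else the first usable alternative.

    `alternatives` maps a dataset URI to the URIs it lists via the
    (hypothetical) void:alternativeDataset property.
    """
    for candidate in [primary, *alternatives.get(primary, [])]:
        if is_usable(candidate):
            return candidate
    return None  # nothing usable was found


if __name__ == "__main__":
    # Placeholder URIs; pretend X is down and Y is up.
    alts = {"http://example.org/X": ["http://example.org/Y"]}
    print(pick_dataset("http://example.org/X", alts, lambda uri: uri.endswith("/Y")))
```

What "usable" means (reachable, fast enough, fresh enough) is left to the caller, since those are three different checks.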

void...@googlecode.com

Mar 23, 2011, 6:39:03 PM
to void-di...@googlegroups.com

Comment #10 on issue 44 by rich...@cyganiak.de: Mirroring of datasets
should be describable with voiD

FWIW, I just had a use case where this would be handy.

All the bio2rdf datasets have two mirrors. Each dataset has a SPARQL
endpoint at http://foo.bio2rdf.org/sparql. The DNS server for
foo.bio2rdf.org is configured to round-robin resolve the domain to either
server1.foo.bio2rdf.org or server2.foo.bio2rdf.org. So effectively there
are these two servers, which are mirrors of each other, and a main URL that
randomly resolves to either of them.

If we had modelled this in VoID somehow, then we could think about allowing
the expression of this kind of stuff in CKAN.
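
One possible shape for such a description, purely as a sketch: one dataset for the main URL and one per server, each with its own void:sparqlEndpoint, linked back to the main one. void:mirror is hypothetical (it is what this issue asks for), the example.org dataset URIs are invented, and the /sparql path on the two server hosts is an assumption. "foo" is the same placeholder as above.

```python
from typing import List

VOID = "http://rdfs.org/ns/void#"
MIRROR = VOID + "mirror"  # HYPOTHETICAL property, not defined in VoID


def mirror_description(dataset: str, main_endpoint: str, mirror_endpoints: List[str]) -> List[str]:
    """Return N-Triples lines for a dataset and its mirrors.

    Each mirror gets its own dataset URI (`dataset` + "/mirrorN") so that it
    can carry its own void:sparqlEndpoint.
    """
    lines = [f"<{dataset}> <{VOID}sparqlEndpoint> <{main_endpoint}> ."]
    for i, endpoint in enumerate(mirror_endpoints, start=1):
        mirror = f"{dataset}/mirror{i}"
        lines.append(f"<{mirror}> <{VOID}sparqlEndpoint> <{endpoint}> .")
        lines.append(f"<{mirror}> <{MIRROR}> <{dataset}> .")
    return lines


if __name__ == "__main__":
    for line in mirror_description(
        "http://example.org/dataset/foo",          # made-up dataset URI
        "http://foo.bio2rdf.org/sparql",
        [
            "http://server1.foo.bio2rdf.org/sparql",  # path assumed
            "http://server2.foo.bio2rdf.org/sparql",  # path assumed
        ],
    ):
        print(line)
```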
