A year on, there are now many more real-world situations that need to be
described with something like "void:mirror": OpenLink, Talis, Swirrl, TSO,
Sindice, etc. are all hosting copies of datasets published primarily somewhere
else.
Some scenarios:
X is exact copy of Y
X is copy of older version of Y
X is copy of subset of Y
X is superset of copy of Y (e.g. John Goodwin took the BIS dataset and
augmented it with selected Ordnance Survey data)
X is modified copy of Y (e.g. John changed some of the triples in BIS to
point to OS PostcodeUnits instead of BIS locations)
X provides similar service to Y (e.g. overlap between various music
datasets)
Uberblic synthesises data from DBpedia, GeoNames, etc., mirroring changes in
real time, but using different URIs
... will have to do for now, gotta go...
Keith, is this a proposal to do it in MS2? If so -1
Don't get me wrong, I think it is important, but it looks to me as if
addressing it now would risk screwing up the release plan.
Thoughts?
Richard,
Yes. Two broad use cases strike me:
1. Provenance (issue 3). Possibly a little complex, lots of variations
to consider.
2. X is down/slow/out of date, try Y instead. Maybe easier to have
some kind of simple relationship that facilitates this, something like
void:alternativeDataset.
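
To make use case 2 concrete, here is a rough sketch of how a client might use such a relationship. Everything in it is an assumption for illustration: the example.org endpoint URLs, the `first_available` helper, and the `fake_probe` health check are hypothetical, and void:alternativeDataset itself is only a suggested name, not an existing VoID term.

```python
from typing import Callable, Iterable, Optional


def first_available(endpoints: Iterable[str],
                    is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint for which is_up() is true, else None."""
    for url in endpoints:
        if is_up(url):
            return url
    return None


# Hypothetical data: the primary dataset X first, then alternatives that a
# description might list with something like void:alternativeDataset.
ENDPOINTS = [
    "http://x.example.org/sparql",
    "http://y.example.org/sparql",
]


def fake_probe(url: str) -> bool:
    # Stand-in for a real health check; pretends that X is down.
    return url != "http://x.example.org/sparql"


chosen = first_available(ENDPOINTS, fake_probe)
```

The probe is passed in as a function so that "down", "slow" and "out of date" can each be checked differently without changing the fallback logic.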
FWIW, I just had a use case where this would be handy.
All the bio2rdf datasets have two mirrors. Each dataset has a SPARQL
endpoint at http://foo.bio2rdf.org/sparql. The DNS server for
foo.bio2rdf.org is configured to round-robin resolve the domain to either
server1.foo.bio2rdf.org or server2.foo.bio2rdf.org. So effectively there
are these two servers, which are mirrors of each other, and a main URL that
randomly resolves to either of them.
If we had modelled this in VoID somehow, then we could think about allowing
the expression of this kind of stuff in CKAN.
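
As a rough sketch of what "modelled in VoID somehow" could mean for the bio2rdf setup, the following treats the description as plain (subject, predicate, object) tuples. The void:mirror property is the hypothetical one floated earlier in this thread, the `:foo...` dataset names are made up, and the /sparql path on the two servers is assumed; only void:sparqlEndpoint is an existing VoID term.

```python
# Triples as plain tuples; no RDF library is needed for the illustration.
TRIPLES = [
    (":foo", "void:sparqlEndpoint", "http://foo.bio2rdf.org/sparql"),
    # Hypothetical property: each server's dataset mirrors the main one.
    (":foo-server1", "void:mirror", ":foo"),
    (":foo-server2", "void:mirror", ":foo"),
    # Endpoint paths on the individual servers are assumed.
    (":foo-server1", "void:sparqlEndpoint",
     "http://server1.foo.bio2rdf.org/sparql"),
    (":foo-server2", "void:sparqlEndpoint",
     "http://server2.foo.bio2rdf.org/sparql"),
]


def mirror_endpoints(dataset, triples):
    """Return the sorted SPARQL endpoints of all datasets mirroring `dataset`."""
    mirrors = {s for s, p, o in triples
               if p == "void:mirror" and o == dataset}
    return sorted(o for s, p, o in triples
                  if p == "void:sparqlEndpoint" and s in mirrors)


endpoints = mirror_endpoints(":foo", TRIPLES)
```

A catalogue such as CKAN could then list the individual mirrors alongside the main URL instead of only the round-robin address.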