Modelling non-RDF datasets


Alasdair Gray

Mar 12, 2013, 6:11:26 AM
to void-di...@googlegroups.com
Hi All,

We are making extensive use of VoID in the Open PHACTS project [1]. Thanks for providing a great vocabulary. We are currently deciding how to model a recurring use case of needing to describe non-RDF datasets and manage linksets to them.

In the VoID vocabulary, a dataset is defined to be [2]

A set of RDF triples that are published, maintained or aggregated by a single provider.

Since all predicates are defined with a domain/range of void:Dataset, this would mean that it would be incorrect to use them for any dataset that is not a set of RDF triples.

Should we go ahead and use the predicates despite this inaccurate interpretation of the non-RDF dataset? 

Is there another vocabulary that allows for the modelling of linksets that does not restrict the dataset to a set of RDF triples? I am aware of DCAT [3] but do not see suitable linking predicates.

Should we develop a set of super-properties that do not have the domain/range restrictions?

Thanks,

Alasdair

Keith Alexander

Mar 12, 2013, 9:24:36 AM
to void-di...@googlegroups.com
Hi,

On Tuesday, March 12, 2013, Alasdair Gray wrote:
> Since all predicates are defined with a domain/range of void:Dataset, this would mean that it would be incorrect to use them for any dataset that is not a set of RDF triples.
>
> Should we go ahead and use the predicates despite this inaccurate interpretation of the non-RDF dataset?

Which VoID predicates in particular would be useful to you for non-RDF datasets?

> Is there another vocabulary that allows for the modelling of linksets that does not restrict the dataset to a set of RDF triples? I am aware of DCAT [3] but do not see suitable linking predicates.

Out of interest, how do your linksets linking non-RDF datasets work? What do you use for the subject and object URIs in a linking triple?

Thanks,

Keith

> Should we develop a set of super-properties that do not have the domain/range restrictions?
>
> Thanks,
>
> Alasdair

--
You received this message because you are subscribed to the Google Groups "void-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to void-discussi...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Alasdair Gray

Mar 12, 2013, 10:05:14 AM
to void-di...@googlegroups.com, keithal...@keithalexander.co.uk

Hi Keith,


The use case we are primarily focusing on at the moment is the one of linking an RDF dataset to a non-RDF dataset which has, for example, a web page per data record. We would want to be able to link-out to these web pages to support our users in viewing more information about the data that they are seeing. Specifically, a user has searched for a chemical compound and got some basic information back from the Open PHACTS platform which includes the fact that it interacts with a certain ligand in PDB. We would want to support the ability to click on the ligand to jump to the record in PDB.


So far we have been modelling these as VoID linksets, where the link predicate would be something that states that this is a user-facing page about the concept. Therefore we are looking at the void:subjectsTarget and void:objectsTarget predicates in this case.
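
For illustration, a linkset of this kind could be described with four triples. The sketch below emits them as N-Triples from Python. The example.org URIs are hypothetical, and foaf:page is only one possible choice for the "user-facing page" link predicate; nothing here is taken from the Open PHACTS data.

```python
# Sketch only: all example.org URIs and the choice of foaf:page are assumptions.
VOID = "http://rdfs.org/ns/void#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_linkset(linkset, subjects_target, objects_target, link_predicate):
    """Return N-Triples lines describing a VoID linkset between two datasets."""
    triples = [
        (linkset, RDF_TYPE, VOID + "Linkset"),
        (linkset, VOID + "subjectsTarget", subjects_target),
        (linkset, VOID + "objectsTarget", objects_target),
        (linkset, VOID + "linkPredicate", link_predicate),
    ]
    return [f"<{s}> <{p}> <{o}> ." for s, p, o in triples]


lines = describe_linkset(
    linkset="http://example.org/void#compounds-to-pdb",
    subjects_target="http://example.org/void#compounds",  # the RDF dataset
    objects_target="http://example.org/void#pdb-pages",   # the set of web pages
    link_predicate=FOAF + "page",
)
print("\n".join(lines))
```

The objects target here is the collection of web pages, which is precisely the thing that is not a set of RDF triples, so whether it may be used with void:objectsTarget remains the open question.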


Alasdair


Rob Atkinson

Jul 14, 2013, 8:47:12 PM
to void-di...@googlegroups.com

I have related questions:

1) If the definition of void:Dataset is a "set", can that set then be empty although the set of concepts or resources is not? (The concepts exist and can be referenced in triples, but there are no RDF statements available.) Would this require an empty RDF container response?

If so, one can describe a set of resources that do not _yet_ have RDF resources available, but where URIs exist. (This seems to match the web page case, where it is unknown whether the web page has RDFa embedded, or whether an equivalent RDF resource will be made available in future, as with Wikipedia and DBpedia.)

2) Is there actually a requirement that RDF is available for each entity, and if so, would the void:Dataset description be a valid response? 

"If the entities described in a dataset are identified by HTTP URIs, then it is a reasonable assumption that resolving such a URI will return an RDF description of the entity." [1]

However, void:uriLookupEndpoint [2] provides an optional explicit mechanism to access RDF, so the "reasonable assumption" above would not appear to be a strong requirement.

Perhaps it is "reasonable" therefore for any request for RDF from a URI in the void:uriSpace of the dataset to just return the dataset description? If so, this is a non-information resource -> information resource redirection issue, and thus any void:Dataset meets the definition by virtue of description, provided that redirect is supported.
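
The void:uriSpace test mentioned above amounts to a string-prefix check. As a rough sketch, assuming made-up dataset names and URI spaces, a server could use something like the following to decide which dataset description to return for a requested URI:

```python
# Sketch only: the dataset names and URI spaces below are invented examples.
def datasets_for_uri(uri, uri_spaces):
    """Return the names of datasets whose void:uriSpace is a prefix of the URI."""
    return [name for name, space in uri_spaces.items() if uri.startswith(space)]


uri_spaces = {
    "pdb-pages": "http://example.org/pdb/",
    "compounds": "http://example.org/compound/",
}
matches = datasets_for_uri("http://example.org/pdb/1ABC", uri_spaces)
print(matches)
```

A URI outside every listed space yields an empty list, in which case no dataset description would be returned.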


In summary, I think it is reasonable that the Open World assumption should allow us to have empty sets of RDF. Hence we could use void:Dataset for any dataset that could be described in RDF at some stage, where there is utility in describing it using the semantics of VoID rather than re-inventing all that (nice) stuff and buying into the problem of migrating it all as RDF resources become available and as other issues around what RDF is required get resolved.

Regards 
Rob Atkinson

Richard Cyganiak

Jul 15, 2013, 4:42:09 AM
to void-di...@googlegroups.com, void-di...@googlegroups.com
Hi Rob,

On 15 Jul 2013, at 01:47, Rob Atkinson <robatki...@gmail.com> wrote:
> I have related questions:
>
> 1) If the definition of void:Dataset is a "set", can that set then be empty although the set of concepts or resources is not? (The concepts exist and can be referenced in triples, but there are no RDF statements available.)

It's a corner case, but I can't see anything wrong with it.

> Would this require an empty RDF container response?

Well, if the URL of a data dump is indicated then the dump should be empty, but indicating a data dump is optional.

> If so, one can describe a set of resources that do not _yet_ have RDF resources available, but where URIs exist. (This seems to match the web page case, where it is unknown whether the web page has RDFa embedded, or whether an equivalent RDF resource will be made available in future, as with Wikipedia and DBpedia.)

That's correct, although VoID doesn't provide a lot of interesting properties for saying useful things about such sets of resources. There might be other more useful vocabularies for this use case. You might want to look into POWDER, for example.

> 2) Is there actually a requirement that RDF is available for each entity, and if so, would the void:Dataset description be a valid response?
>
> "If the entities described in a dataset are identified by HTTP URIs, then it is a reasonable assumption that resolving such a URI will return an RDF description of the entity." [1]
>
> However, void:uriLookupEndpoint [2] provides an optional explicit mechanism to access RDF, so the "reasonable assumption" above would not appear to be a strong requirement.

The spec says that consumers of your VoID description can reasonably expect an RDF description. If you know that no such response exists, then you're potentially leading such consumers astray. I would recommend against doing that. The void:uriLookupEndpoint property is for a completely different use case and rarely used in practice. Its existence can't be interpreted as nullifying statements made elsewhere in the spec.

> Perhaps it is "reasonable" therefore for any request for RDF from a URI in the void:uriSpace of the dataset to just return the dataset description?

That hinges on the question of whether you consider the description of the dataset to be a reasonably accurate description of the individual resources. That's a subjective question. I'd say it's better than nothing, and may or may not be good enough, depending on what you want to achieve.

> If so, this is a non-information resource -> information resource redirection issue, and thus any void:Dataset meets the definition by virtue of description, provided that redirect is supported.
>
> In summary, I think it is reasonable that the Open World assumption should allow us to have empty sets of RDF. Hence we could use void:Dataset for any dataset that could be described in RDF at some stage, where there is utility in describing it using the semantics of VoID rather than re-inventing all that (nice) stuff and buying into the problem of migrating it all as RDF resources become available and as other issues around what RDF is required get resolved.

My take on this is that a very sparse machine-readable description of a resource is not good, but is better than no machine-readable description at all. Beyond that general observation, it's all a question of the practical goal you want to achieve.

Best,
Richard

> Regards
> Rob Atkinson




Rob Atkinson

Jul 15, 2013, 10:10:00 PM
to void-di...@googlegroups.com
Many thanks for your response Richard.

FYI, my application does indeed extend the VoID vocabulary to support descriptions of URL template access to non-RDF resources. We are currently exploring the potential of the RDF Data Cube vocabulary here, particularly as many of the resources are natively available using SDMX. The RDF-generating adaptor pattern [1] is great where the publishers of the data are willing and able to provide a redirect, but we want to be able to describe cases where this is not yet available, and easily adapt if we succeed in improving access at a later date.

I did hear a rumour that VoID spec updates were under consideration, but since you have not mentioned this I'm guessing this concern is out of scope? Is there a link to a description of the scope of any VoID refinements?

Regards
Rob Atkinson




Richard Cyganiak

Jul 16, 2013, 5:16:02 AM
to void-di...@googlegroups.com
On 16 Jul 2013, at 03:10, Rob Atkinson <robatki...@gmail.com> wrote:
> FYI, my application does indeed extend the VoID vocabulary to support descriptions of URL template access to non-RDF resources. We are currently exploring the potential of the RDF Data Cube vocabulary here, particularly as many of the resources are natively available using SDMX. The RDF-generating adaptor pattern [1] is great where the publishers of the data are willing and able to provide a redirect, but we want to be able to describe cases where this is not yet available, and easily adapt if we succeed in improving access at a later date.

Okay, sounds good.

> I did hear a rumour that VoID spec updates were under consideration, but since you have not mentioned this I'm guessing this concern is out of scope?

I was planning to get a group of people together for doing an update earlier this year, but I ended up not having the bandwidth due to changes in my role here at DERI. I'm still interested in doing an update, and I think there's enough experience with VoID now to do one, but such an effort needs somebody to drive it who can invest the necessary time…

> Is there a link to a description of the scope of any VoID refinements?

The scope for such an update is not clearly defined, but the idea would be to look at how people are using VoID in practice, and make it work better for those cases. Possible work items might include alignment with DCAT, Data Cube, and other metadata efforts; support for provenance; better alignment with SPARQL Service Descriptions; seeing if anything can be done in relation to the Linked Data Platform WG's work; etc.

There's an issue tracker for VoID, and if you have a VoID extension or feature request, then this would be a good place to document it:
http://code.google.com/p/void-impl/issues/list?q=product=vocab

Kjetil Kjernsmo

Jul 16, 2013, 6:32:52 AM
to void-di...@googlegroups.com
On Tuesday 16. July 2013 11.16.02 Richard Cyganiak wrote:
> I was planning to get a group of people together for doing an update
> earlier this year, but I ended up not having the bandwidth due to changes
> in my role here at DERI. I'm still interested in doing an update, and I
> think there's enough experience with VoID now to do one, but such an
> effort needs somebody to drive it who can invest the necessary time…

Just so you know about it, I'm interested, but not this year... :-)

Cheers,

Kjetil

Rob Atkinson

Jul 16, 2013, 8:13:55 PM
to void-di...@googlegroups.com
OK thanks, very helpful

I'm potentially going to be keen to harvest some VoID and see how we can cross-reference to authoritative geographical object identifiers (as opposed to the default but limited model of GeoNames). I guess we'd be interested in building a graph of things connected with GeoNames using VoID, and then seeing how we can describe virtual subsets of GeoNames that actually represent the links, i.e. what is linked are actually all counties in N. Ireland, not all GeoNames entities.

I'm not sure what the state of the art in VoID crawling is, or how consistent VoID usage is. (Once again, any links appreciated, or I'd be interested in collaborating in the analysis phase if someone is planning to tackle this.)

We have a project this calendar year to see what can be done with existing content in the Environmental Assessment space, and next year would be keen to bring our experiences to the table with our Use Cases.

Regards

Richard Cyganiak

Jul 17, 2013, 2:35:08 AM
to void-di...@googlegroups.com, void-di...@googlegroups.com
On 17 Jul 2013, at 01:13, Rob Atkinson <robatki...@gmail.com> wrote:
> I'm not sure what the state of the art in VoID crawling is, or how consistent VoID usage is. (Once again, any links appreciated, or I'd be interested in collaborating in the analysis phase if someone is planning to tackle this.)

My subjective impression: Usage is quite inconsistent, perhaps due to lack of a VoID validation tool and strong consuming apps.

Best,
Richard

Rob Atkinson

Jul 17, 2013, 7:22:50 PM
to void-di...@googlegroups.com
Hmm,

Is there interest or activity in a VoID validation tool? This might be a future activity I'd be interested in if we can include extensions in the validation process, i.e. it should be able to take the VoID type specialisations needed by a specific community. So it could be deployed in "vanilla" or "cookies and cream" modes.

I have found VoID very well designed in general. I have found I needed to use nearly all of it and, modulo this issue of whether I can describe non-RDF resources legally, have been able to use its semantics faithfully, though I've only done basic reasoning around object behaviour based on its dataset container. I'm not expert enough to know whether it works with other logic, or what people might try to do, but as a potential publisher of a VoID specialisation profile it would be good to know what basic Use Cases I could test against.

Where VoID needs extension, I generally feel these extensions are not its concern; the problem is the lack of a suitable standard. I'd like to see standard profiles of VoID for the Linked Data Platform technical feature set, and see LDP extended to handle binding identifiers to REST service endpoints, including QB, but also simpler URL templates.

Regards
Rob

Alasdair J G Gray

Jul 18, 2013, 3:58:38 AM
to void-di...@googlegroups.com, EU openPHACTS project members based at the University
Hi Rob,

There is certainly interest in validating VoID documents against community profiles. I am involved with two such initiatives in overlapping communities.

The first is the Open PHACTS drug discovery platform project [1]. Within this project we have specified a profile for VoID [2] with the properties that we require for providing provenance trails of the data and enabling linking between datasets. To validate that VoID descriptions conform to our profile we have a validation service [3]. This service can easily be reconfigured for other profiles by changing a single configuration file.

The second activity I'm involved with is the W3C Health Care and Life Sciences Interest Group [4]. Within this group we are in the process of specifying a profile for HCLS datasets. We plan to use the same validation service as the Open PHACTS project but with its own configuration file.

The code for our validator is available from [5]. We'd be happy to get involved with work to enable this to be used by a wider community.

Best regards,

Alasdair



Rob Atkinson

Jul 19, 2013, 1:50:48 AM
to void-di...@googlegroups.com, EU openPHACTS project members based at the University
That's very interesting indeed. I'll see when I can squeeze in a play with this.

Is there any chance it can be configured with a "vanilla" profile to meet Richard's desire for an online validation tool? Obviously it would need expert review that it's actually a valid conformance check, and a process to fix errors and extend test coverage.

Rob

Richard Cyganiak

Jul 19, 2013, 3:31:43 AM
to void-di...@googlegroups.com, void-di...@googlegroups.com
One thing worth mentioning here is W3C's upcoming workshop on RDF validation:

It may be the kick-off for a working group in that area. So we may get a standard way of describing validation rules and constraints on RDF, with corresponding tools.

Best,
Richard


Alasdair J G Gray

Jul 19, 2013, 9:52:46 AM
to void-di...@googlegroups.com, EU openPHACTS project members based at the University
Rob,

On 19 Jul 2013, at 06:50, Rob Atkinson <robatki...@gmail.com> wrote:

> That's very interesting indeed. I'll see when I can squeeze in a play with this.
>
> Is there any chance it can be configured with a "vanilla" profile to meet Richard's desire for an online validation tool?

What would you like to see in such a "vanilla" profile? The VoID specification does not mandate that any particular predicates are used, which is why we created the Open PHACTS profile. If we had a suitable profile defined (MUST/SHOULD/MAY properties) we could generate the configuration file for the validator.
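
To make the MUST/SHOULD/MAY idea concrete, here is a minimal sketch of a profile check in Python. The profile contents are invented for illustration; this is neither the Open PHACTS profile nor the configuration format of our validator, just one way such a check might look.

```python
# Sketch only: this profile is hypothetical and is not the Open PHACTS profile.
PROFILE = {
    "MUST": {"dcterms:title", "dcterms:license"},
    "SHOULD": {"void:sparqlEndpoint", "void:dataDump"},
    "MAY": {"void:exampleResource"},
}


def validate(used_properties, profile=PROFILE):
    """Return (errors, warnings) for the properties used on one dataset description."""
    used = set(used_properties)
    # A missing MUST property is an error; a missing SHOULD property is a warning.
    errors = sorted(f"missing MUST property {p}" for p in profile["MUST"] - used)
    warnings = sorted(f"missing SHOULD property {p}" for p in profile["SHOULD"] - used)
    return errors, warnings


errors, warnings = validate({"dcterms:title", "void:dataDump"})
print(errors)
print(warnings)
```

MAY properties are listed only for documentation; their absence produces neither an error nor a warning.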

Alasdair