Can a reference dataset be used in a SHACL constraint in EDG?

Fan Li

Jul 11, 2019, 2:56:44 PM
to TopBraid Suite Users
For example, if there is an attribute "country code", can I constrain its value to one of the values in the "Country Code" reference dataset? Thanks!

Rob Atkinson

Jul 11, 2019, 7:49:14 PM
to TopBraid Suite Users
Hi - I have been thinking about this too - it seems a challenge in an "open world" or with large reference data sets.

sh:class allows you to define an acceptable class, which all your country codes belong to - but using it means importing all the content (with explicit rdf:type declarations) into the graph with the SHACL rules. This won't scale to a bigger dataset, such as biological species or even car models.

Maybe the rules engine could be expected to dereference a URI and find the class of the referenced instance - but that only works in a Linked Data world where URIs return an RDFS profile. This may need a control on the rule, e.g. sh_x:performLookup with rdfs:range xsd:boolean.

Alternatively, maybe a sh_xxx:lookup with options - e.g. a SPARQL query template that names the graph - or maybe just the graph identifier?

Holger Knublauch

Jul 11, 2019, 8:13:17 PM
to topbrai...@googlegroups.com


On 12/07/2019 09:49, Rob Atkinson wrote:
> Hi - I have been thinking about this too - it seems a challenge in an "open world" or with large reference data sets.
>
> sh:class allows you to define an acceptable class, which all your country codes belong to - but using it means importing all the content (with explicit rdf:type declarations) into the graph with the SHACL rules. This won't scale to a bigger dataset, such as biological species or even car models.

I wouldn't see that as a problem. The validation of sh:class happens in the *data graph*, i.e. against the instances. The shapes graph only needs the class reference, e.g.

    sh:class ex:Country

would suffice in the shapes graph, and only the data graph needs to owl:import the actual instances of ex:Country.

Regardless of where these instances live, sh:class would be insufficient to express that *only* those values are permitted for any data graph. So for example, another data graph may owl:import additional ex:Country instances, or even just define them itself. The only built-in mechanism to cover finite enumerations in SHACL core would be sh:in, but that will not scale for large data sets, and is not very modular.
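
For comparison, here is a minimal sketch of the sh:in approach mentioned above. The ex: namespace, the ex:countryCode property and the three sample codes are assumptions made up for illustration. Every permitted value has to be listed inside the shape itself, which is why this approach neither scales to large code lists nor stays modular.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/ns#> .    # hypothetical namespace

ex:AddressShape
    a sh:NodeShape ;
    sh:targetClass ex:Address ;
    sh:property [
        sh:path ex:countryCode ;          # hypothetical property
        # Every permitted value is enumerated inline in the shapes graph
        sh:in ( "AU" "DE" "US" ) ;
    ] .
```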

I believe this indicates the need for another SHACL constraint component, which could have a syntax such as

ex:Address
    sh:property [
        sh:path ex:country ;
        sh:class ex:Country ;
        ext:graph <http://my-dataset.org> ;
        ext:classInGraph ex:Country ;
    ] .

This constraint component would take a graph and a class as arguments, and then only permit the instances from the given graph. Such a constraint component is reasonably easy to define in SHACL-SPARQL, and is thus supported by the official standard, see

https://www.w3.org/TR/shacl/#constraint-components-syntax

The SPARQL query would be something like

ASK {
    GRAPH $graph {
        $value rdf:type/rdfs:subClassOf* $classInGraph
    }
}

(Replacing $graph with the given ext:graph and $classInGraph with the given ext:classInGraph).
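
As a rough sketch of how such a component might be declared in SHACL-SPARQL: the ext: namespace IRI, the ontology IRI and the component name ext:ClassInGraphConstraintComponent are assumptions, not existing TopBraid or DASH terms. Only ext:graph, ext:classInGraph and the ASK pattern come from the message above. The sketch also assumes that the validation engine lets the validator reach the graph named by ext:graph through a GRAPH clause, which the SHACL standard does not guarantee for every engine.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ext:  <http://example.org/ext#> .   # hypothetical namespace

# Prefix declarations that the SPARQL validator below reuses via sh:prefixes
<http://example.org/ext>
    a owl:Ontology ;
    sh:declare [
        sh:prefix "rdf" ;
        sh:namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#"^^xsd:anyURI ;
    ] ;
    sh:declare [
        sh:prefix "rdfs" ;
        sh:namespace "http://www.w3.org/2000/01/rdf-schema#"^^xsd:anyURI ;
    ] .

ext:ClassInGraphConstraintComponent         # hypothetical component name
    a sh:ConstraintComponent ;
    rdfs:label "Class-in-graph constraint component" ;
    # Parameter names are the local names of the paths: $graph and $classInGraph
    sh:parameter [
        sh:path ext:graph ;
        sh:nodeKind sh:IRI ;
    ] ;
    sh:parameter [
        sh:path ext:classInGraph ;
        sh:nodeKind sh:IRI ;
    ] ;
    sh:validator [
        a sh:SPARQLAskValidator ;
        sh:message "Value is not an instance of {$classInGraph} in graph {$graph}" ;
        sh:prefixes <http://example.org/ext> ;
        # The ASK query must return true for every value that conforms
        sh:ask """
            ASK {
                GRAPH $graph {
                    $value rdf:type/rdfs:subClassOf* $classInGraph .
                }
            }
            """ ;
    ] .
```

Because both parameters are mandatory, the component would only apply to property shapes that carry both ext:graph and ext:classInGraph.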

BTW in the example above I left the sh:class statement to help tools such as input forms to still make sense of the situation - these would not know much about the semantics of ext:classInGraph.


> Maybe the rules engine could be expected to dereference a URI and find the class of the referenced instance - but that only works in a Linked Data world where URIs return an RDFS profile. This may need a control on the rule, e.g. sh_x:performLookup with rdfs:range xsd:boolean.
>
> Alternatively, maybe a sh_xxx:lookup with options - e.g. a SPARQL query template that names the graph - or maybe just the graph identifier?

Yes to the latter, see one such approach above.

Holger




--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/topbraid-users/1c2bfbbb-4a1d-4630-8b92-7f7d6e54da9f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Fan Li

Jul 12, 2019, 8:09:16 AM
to TopBraid Suite Users
Hi Holger: Thanks for your detailed explanation! I am new to SHACL and EDG... How can I define a new Constraint Component in EDG or Composer?

Holger Knublauch

Jul 12, 2019, 7:02:58 PM
to topbrai...@googlegroups.com

They are "just" instances of sh:ConstraintComponent that you can edit with TBC. Look at the instances such as sh:ClassConstraintComponent in dash.ttl for examples. In a nutshell, a constraint component defines the names of parameters (sh:parameter) and one or more so-called validators. The validators are typically SPARQL queries and in those the values of the parameters can be used as pre-bound variables.

If you want to go down this route and run into roadblocks I'd be happy to help you work out this use case.

Holger


Holger Knublauch

Jul 14, 2019, 2:10:18 AM
to topbrai...@googlegroups.com

Hi Fan Li,

I was just reminded that we do have support for one such pattern built into EDG already. Take a look at the class edg:PropertyValueSet, which describes the values of a (reference) dataset by means of class, graph and property. This is meant to be used in cases where you want to use something like a Country *Code*, not the Country URI resource, as values. You can define an instance of edg:PropertyValueSet to point at these values and then use edg:propertyValueSet to link a property shape with your edg:PropertyValueSet. This will then apply constraint checking to verify that the values are in fact valid country codes.

Does this resemble your use case?

Holger


Fan Li

Jul 17, 2019, 12:42:15 AM
to TopBraid Suite Users
Hi Holger,

Thanks much for your follow-up! I have been exploring the 1st option you offered, and by now I sort of understand how to add a sh:ConstraintComponent instance in dash.ttl. My question is whether I should just modify the dash.ttl file or somehow extend it by creating my own constraint declaration file that imports dash.ttl?

I will look into the 2nd option you just mentioned now. I will report back later.

Thanks again,
Fan


Holger Knublauch

Jul 17, 2019, 12:44:42 AM
to topbrai...@googlegroups.com


On 17/07/2019 14:42, Fan Li wrote:
> Hi Holger,
>
> Thanks much for your follow-up! I have been exploring the 1st option you offered, and by now I sort of understand how to add a sh:ConstraintComponent instance in dash.ttl. My question is whether I should just modify the dash.ttl file or somehow extend it by creating my own constraint declaration file that imports dash.ttl?

No, dash.ttl is a system file that cannot be modified and will be overwritten with the next TopBraid update.

Instead, create your own .ttl file that owl:imports dash.ttl and then declares the additional constraint component(s). Then owl:import that .ttl file into your Ontology asset collection, using the Edit Includes dialog.
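
A minimal sketch of what the header of such a file might look like. The graph IRI http://example.org/my-constraints is a made-up placeholder, and the import assumes that dash.ttl is available under its usual graph IRI http://datashapes.org/dash.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Hypothetical graph IRI for the custom constraint file
<http://example.org/my-constraints>
    a owl:Ontology ;
    # Pull in the DASH declarations instead of editing dash.ttl itself
    owl:imports <http://datashapes.org/dash> .

# Declarations of the additional sh:ConstraintComponent instance(s) go here
```

Keeping the custom declarations in a separate file means a TopBraid update can replace dash.ttl without touching them.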

Holger
