Understanding the limits of SHACL


Steve Ray

Dec 9, 2020, 1:25:34 PM
to TopBraid Suite Users
I'd like to run the following scenario by you to confirm whether my understanding of what I can and cannot do with TBC and SHACL is correct.

Suppose I have the following pattern:

[Attached diagram: image.png]

I want to use a SHACL PropertyShape with sh:values to give me an inferred "shortcut" relation from Class A to Class D. However, in doing so, I want to filter on values of Prop B and Prop C, so it is not just a simple matter of a 3-step sh:path statement.
I read your documentation at https://www.topquadrant.com/graphql/values.html with interest on this topic, and have concluded that while a single PropertyShape cannot achieve this, I can nest two PropertyShapes:
the first gives me a filtered "shortcut" relation from Class A to Class C (filtering on Prop B), which I can then use in defining a second filtered "shortcut" that filters on Prop C and gets me to Class D. This conclusion is based on the fact that one cannot nest sh:values statements inside a single PropertyShape definition - is this true? (If it IS possible to nest sh:values statements, does the inner sh:values statement play the role of sh:nodes for the outer filterShape?)

Corollary questions:
Must Class A be the targetClass of both of the above PropertyShapes?
Must I run the SHACL reasoner beforehand for the inferred relations to be usable?

I also tried to achieve the same behavior with an embedded SPARQL query instead of native SHACL statements, which allowed me to create the inferred shortcut relation with a single SPARQL query. However, as you document at the same link above, under the section "Use of Inferred Values using SPARQL", I can only use the shortcut property if:
a) I use the (?a ?shortcut-property) tosh:values ?d syntax, or
b) I run the SHACL reasoner before making a query.

Is this correct?

Finally, you also solicit feedback on whether the integration with SPARQL should work more directly, without the use of the tosh:values magic property, to which I will add a strong "YES PLEASE!". I say this because my colleagues are looking for use-cases where the inferred shortcut property is indistinguishable from a regular explicit property when querying.



Steve


Holger Knublauch

Dec 9, 2020, 7:11:37 PM
to topbrai...@googlegroups.com


On 2020-12-10 4:25 am, Steve Ray wrote:
> I'd like to run the following scenario by you to confirm whether my understanding of what I can and cannot do with TBC and SHACL is correct.
>
> Suppose I have the following pattern:
>
> [Attached diagram: image.png]
>
> I want to use a SHACL PropertyShape with sh:values to give me an inferred "shortcut" relation from Class A to Class D. However, in doing so, I want to filter on values of Prop B and Prop C, so it is not just a simple matter of a 3-step sh:path statement.
> I read your documentation at https://www.topquadrant.com/graphql/values.html with interest on this topic, and have concluded that while a single PropertyShape cannot achieve this, I can nest two PropertyShapes:
> the first gives me a filtered "shortcut" relation from Class A to Class C (filtering on Prop B), which I can then use in defining a second filtered "shortcut" that filters on Prop C and gets me to Class D. This conclusion is based on the fact that one cannot nest sh:values statements inside a single PropertyShape definition - is this true? (If it IS possible to nest sh:values statements, does the inner sh:values statement play the role of sh:nodes for the outer filterShape?)

Maybe this discussion should be grounded in specific examples to make sure we are talking about the same things.

But sh:values rules can *depend* on each other, if that's what you mean by nesting. So if you have

    sh:values [ sh:path ex:other ]

and ex:other is itself backed by a sh:values rule, then those values are computed on the fly. The same works if sh:path expressions are used deeper in the expression tree.
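
As a rough illustration of such a dependency, a second rule can simply walk a property that is itself inferred by a first rule. All ex: names below are made up for this sketch and do not come from Steve's model; it also assumes both property shapes are attached to the same class.

```turtle
@prefix ex:   <http://example.org/ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .

# Hypothetical class that doubles as a node shape, so both rules apply to its instances.
ex:A
  a rdfs:Class, sh:NodeShape ;
  sh:property ex:A-shortcutToC, ex:A-shortcutToD .

# Rule 1: ex:shortcutToC is inferred by walking ex:toB and then ex:toC.
ex:A-shortcutToC
  a sh:PropertyShape ;
  sh:path ex:shortcutToC ;
  sh:values [
      sh:path ( ex:toB ex:toC ) ;
    ] .

# Rule 2: starts from the values of ex:shortcutToC (inferred by rule 1)
# and then follows ex:toD from each of them.
ex:A-shortcutToD
  a sh:PropertyShape ;
  sh:path ex:shortcutToD ;
  sh:values [
      sh:path ex:toD ;
      sh:nodes [ sh:path ex:shortcutToC ] ;
    ] .
```

Here the inner sh:path ex:shortcutToC sits one level down in the expression tree, which is the "deeper" case mentioned above.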


> Corollary questions:
> Must Class A be the targetClass of both of the above PropertyShapes?

I don't think so, but a specific example may help.

> Must I run the SHACL reasoner beforehand for the inferred relations to be usable?

I don't think so.


> I also tried to achieve the same behavior with an embedded SPARQL query instead of native SHACL statements, which allowed me to create the inferred shortcut relation with a single SPARQL query. However, as you document at the same link above, under the section "Use of Inferred Values using SPARQL", I can only use the shortcut property if:
> a) I use the (?a ?shortcut-property) tosh:values ?d syntax, or
> b) I run the SHACL reasoner before making a query.
>
> Is this correct?

Yes, SPARQL only ever sees the asserted triples in normal basic graph patterns (?s ?p ?o). So if you want special handling of inferred values, you need to go through our magic properties or materialize the inferences beforehand.


> Finally, you also solicit feedback on whether the integration with SPARQL should work more directly, without the use of the tosh:values magic property, to which I will add a strong "YES PLEASE!". I say this because my colleagues are looking for use-cases where the inferred shortcut property is indistinguishable from a regular explicit property when querying.

Yes, I know. This is not trivial; otherwise we would have done it already. It wouldn't be a big deal if the data were static. Then we could simply offer a button that (temporarily) adds the inferences graph to the query graph. However, many real-world use cases would have too many triples for this to work, and cache invalidation after edits is a major problem.

So: how big is your data (asserted vs inferred triples) and how often does the data change?

As I probably said before, other technologies that we bundle with TopBraid make working with inferred values more "natural": both GraphQL and Active Data Shapes (ADS, JavaScript) compute inferences on the fly. These implementations are more under our own control, so I was able to inject additional logic that runs when the fields are queried. The SPARQL engine is based on simple SPO graph queries, where adding this logic would be too expensive. For example, given a subject, our code would first need to compute the context (rdf:types, applicable shapes etc.) in order to determine the applicable rules. In GraphQL and ADS the context is provided by the surrounding object, e.g.

    person.address.zipCode

will know that the zipCode is computed through a sh:values rule at ex:Address-zipCode, and that address would be computed from ex:Person-address. This context makes the dynamic computation much easier and faster. Furthermore, GraphQL and ADS do not need to support pesky inverse cases where you need to go from an object to the subject (?s $p $o)...
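
For illustration only, the two shapes named above might be declared roughly like this. The shape names ex:Person-address and ex:Address-zipCode are the ones used in the text; everything else (the namespace, ex:residence, ex:postalArea, ex:code and the rule bodies) is invented for this sketch.

```turtle
@prefix ex:   <http://example.org/ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .

ex:Person
  a rdfs:Class, sh:NodeShape ;
  sh:property ex:Person-address .

# person.address: the surrounding Person object tells the engine that this rule applies.
ex:Person-address
  a sh:PropertyShape ;
  sh:path ex:address ;
  sh:class ex:Address ;               # values are Addresses, which sets the context for the next step
  sh:values [
      sh:path ex:residence ;          # hypothetical asserted property
    ] .

ex:Address
  a rdfs:Class, sh:NodeShape ;
  sh:property ex:Address-zipCode .

# address.zipCode: evaluated with an Address as the focus node.
ex:Address-zipCode
  a sh:PropertyShape ;
  sh:path ex:zipCode ;
  sh:values [
      sh:path ( ex:postalArea ex:code ) ;   # hypothetical two-step path
    ] .
```

A plain triple pattern such as ?s ex:zipCode ?o carries no such context, which is the difficulty described for SPARQL.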

Holger







--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/topbraid-users/CAGUep8515RB8ZCxqx2YBou_dp%2BxyFKE5ZkRqB0Ho5jC-prixCQ%40mail.gmail.com.

Steve Ray

Dec 10, 2020, 1:33:06 PM
to TopBraid Suite Users
Holger,
OK, here's the actual situation I'm in. If you look at the instances in this diagram, I'm trying to get from b20:Fan_20 to b20:Fan20OutletPressure:
[Attached diagram: Pasted image 20201207150105.png]
...I'm trying to see if I can use native SHACL calls to duplicate the following embedded SPARQL version:

b20:OutletPressureValueShape
  a sh:PropertyShape ;
  sh:path c223:hasOutletPressure ;
  sh:name "Outlet Pressure Value Shape" ;
  sh:values [
      sh:select """SELECT ?prop
WHERE {
?anyOutlet <http://www.w3.org/2000/01/rdf-schema#subClassOf>* <http://data.ashrae.org/standard223/1.0/model/core#OutletPoint> .
$this <http://data.ashrae.org/standard223/1.0/model/core#hasConnectionPoint> ?cp .
?cp a ?anyOutlet .
?cp <http://data.ashrae.org/standard223/1.0/model/core#hasProperty> ?prop .
?prop <http://qudt.org/schema/qudt/hasQuantityKind> <http://qudt.org/vocab/quantitykind/Pressure> .
} """ ;
    ] ;
.

I recognize that I would not be able to support the use of the variable ?anyOutlet, but I'd be interested to know whether the rest is still possible: that is, filtering on the type of the node reached by the first jump along the path, and then filtering on the hasQuantityKind value of the node reached by the second jump. I'm also confused about when to use sh:nodes (is it always needed when using sh:filterShape?).

I started trying to nest sh:values with the following malformed code, but just got confused...

b20:OutletPressureValueShape
  a sh:PropertyShape ;
  sh:path c223:hasOutletPressure ;
  sh:name "Outlet Pressure Value Shape" ;
  sh:values [
      sh:filterShape [
          sh:property [
              sh:path c223:hasProperty ;
              sh:hasValue b20:Fan20OutletPressure ;
            ] ;
        ] ;
      sh:nodes [
          sh:values [
              sh:filterShape [
                  sh:property [
                      sh:path rdf:type ;
                      sh:hasValue c223:AirOutletPoint ;
                    ] ;
                ] ;
              sh:nodes [
                  sh:path c223:hasConnectionPoint ;
                ] ;
            ] ;
        ] ;
    ] ;
.


P.S., I have tried, and failed, to get the prefixes working inside the embedded SPARQL query. But that's another story.

Steve




Holger Knublauch

Dec 10, 2020, 9:56:56 PM
to topbrai...@googlegroups.com

Hi Steve,

Here is a screenshot (of the EDG 7.0 file editing capability) that shows how I developed this example: [screenshot not reproduced]

The ontology may not exactly align with what you have in mind, but it's hopefully close enough. File is attached.

For copy and paste, here is the sh:values expression, with annotations. To follow your diagram from left to right, you need to read the expression inside-out, i.e. start at the bottom. The innermost expression is evaluated first, and its resulting nodes are then passed up to the enclosing expression, and so on.

  sh:values [ # finally, take the ex:hasValue values of the nodes delivered by the sh:nodes expression
      sh:path ex:hasValue ;
      sh:nodes [ # filter the nodes from the nested expression to only allow those that have ex:Pressure
          sh:filterShape [
              sh:property [
                  sh:path ex:hasQuantityKind ;
                  sh:hasValue ex:Pressure ;
                ] ;
            ] ;
          sh:nodes [ # take the ex:hasProperty values of the nodes returned by the nested sh:nodes expression
              sh:path ex:hasProperty ;
              sh:nodes [ # evaluated first: take all ex:hasConnectionPoint values of the start (focus) node
                  sh:filterShape [ # and keep only instances of ex:AirOutletPoint. This is an ordinary SHACL constraint
                      sh:class ex:AirOutletPoint ;
                    ] ;
                  sh:nodes [
                      sh:path ex:hasConnectionPoint ;
                    ] ;
                ] ;
            ] ;
        ] ;
    ] ;
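
To make the inside-out evaluation concrete, here is a small hand-made data set that the expression above could be run against. All instance names and literal values are invented for this sketch (they are not taken from the attached file), and it assumes that ex:hasValue points to plain literals. The comments trace which nodes survive each stage.

```turtle
@prefix ex:   <http://example.org/pressure#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:AirOutletPoint a rdfs:Class .
ex:AirInletPoint  a rdfs:Class .

# Focus node (an instance of whichever class carries the property shape).
# Innermost path: ex:hasConnectionPoint yields ex:Outlet1 and ex:Inlet1.
ex:Fan1
  ex:hasConnectionPoint ex:Outlet1, ex:Inlet1 .

# sh:class ex:AirOutletPoint keeps ex:Outlet1 ...
ex:Outlet1
  a ex:AirOutletPoint ;
  ex:hasProperty ex:OutletPressure1, ex:OutletTemp1 .

# ... and drops ex:Inlet1, so ex:InletPressure1 is never reached.
ex:Inlet1
  a ex:AirInletPoint ;
  ex:hasProperty ex:InletPressure1 .

# The ex:hasQuantityKind filter keeps ex:OutletPressure1 ...
ex:OutletPressure1
  ex:hasQuantityKind ex:Pressure ;
  ex:hasValue 250.0 .          # outermost path: this literal is the inferred value

# ... and drops ex:OutletTemp1.
ex:OutletTemp1
  ex:hasQuantityKind ex:Temperature ;
  ex:hasValue 21.5 .

ex:InletPressure1
  ex:hasQuantityKind ex:Pressure ;
  ex:hasValue 101.0 .
```

Under these assumptions the only value inferred for ex:Fan1 would be 250.0.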

Whether this is the best way to express such things is of course questionable if you already have SPARQL in routine use. One reason in favor of SHACL node expressions is that nested sh:values expressions also apply inferences (recursively). Another advantage of SHACL is that it supports inline SHACL constraints; for example, the sh:class operator is arguably more elegant than trying the same in SPARQL. Having said this, SHACL filters may be much slower, whereas hasValue matches are directly optimized in SPARQL.

To make prefixes work in sh:sparql, use the SHACL button on the namespaces widget in TBC. This will produce something like

<http://example.org/pressure>
  rdf:type owl:Ontology ;
  swa:defaultNamespace "http://example.org/pressure#" ;
  rdfs:label "New File (pressure.ttl)" ;
  owl:imports <http://datashapes.org/dash> ;
  sh:declare ex:PrefixDeclaration ;
.

ex:PrefixDeclaration
  rdf:type sh:PrefixDeclaration ;
  sh:namespace "http://example.org/pressure#"^^xsd:anyURI ;
  sh:prefix "ex" ;
.

and you can then say

sh:sparql [
    sh:prefixes <http://example.org/pressure> ;
    sh:select "..."
]
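
Putting the pieces together for an inferred property, a sketch could look as follows. It assumes that sh:prefixes is honored on a sh:select node expression under sh:values in the same way as inside sh:sparql. ex:Fan and ex:outletPressure are placeholder names, and the rdfs prefix is declared explicitly so that the query string does not rely on any built-in prefixes.

```turtle
@prefix ex:   <http://example.org/pressure#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# The ontology carries the prefix declarations that the query may use.
<http://example.org/pressure>
  a owl:Ontology ;
  sh:declare [
      sh:prefix "ex" ;
      sh:namespace "http://example.org/pressure#"^^xsd:anyURI ;
    ] ;
  sh:declare [
      sh:prefix "rdfs" ;
      sh:namespace "http://www.w3.org/2000/01/rdf-schema#"^^xsd:anyURI ;
    ] .

# Placeholder class that carries the inferred property.
ex:Fan
  a rdfs:Class, sh:NodeShape ;
  sh:property ex:Fan-outletPressure .

ex:Fan-outletPressure
  a sh:PropertyShape ;
  sh:path ex:outletPressure ;
  sh:values [
      sh:prefixes <http://example.org/pressure> ;   # enables ex: and rdfs: in the query
      sh:select """
        SELECT ?value
        WHERE {
            $this ex:hasConnectionPoint ?cp .
            ?cp a/rdfs:subClassOf* ex:AirOutletPoint .
            ?cp ex:hasProperty ?prop .
            ?prop ex:hasQuantityKind ex:Pressure .
            ?prop ex:hasValue ?value .
        }
        """ ;
    ] .
```

The a/rdfs:subClassOf* path is the hand-written counterpart of what sh:class does in the node-expression version.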

HTH
Holger

[Attachment: pressure.ttl]

Steve Ray

Dec 11, 2020, 12:59:47 PM
to TopBraid Suite Users
Holger,
Your example is tremendously helpful! Thanks so much. I can see how the sh:filterShape and sh:nodes statements always come in pairs, and that the outermost sh:path (inside the sh:values) needs an sh:nodes to operate on. These are all things that I wasn't sure about.

I will use this approach in the future. In our work, we are going to be assuming OWL RL reasoning, so the embedded sh:class ex:AirOutletPoint should even work with a superClass, as in my original example. I understand that if I don't run the OWL RL reasoner first, then I would need to fall back to the SPARQL query approach if I want to query against the superclass of AirOutletPoint.

Thanks again. 

Steve




Holger Knublauch

Dec 11, 2020, 7:29:20 PM
to topbrai...@googlegroups.com
On 2020-12-12 3:59 am, Steve Ray wrote:

> Holger,
> Your example is tremendously helpful! Thanks so much. I can see how
> the sh:filterShape and sh:nodes statements always come in pairs, and
> that the outermost sh:path (inside the sh:values) needs an sh:nodes to
> operate on. These are all things that I wasn't sure about.
>
> I will use this approach in the future. In our work, we are going to
> be assuming OWL RL reasoning, so the embedded sh:class
> ex:AirOutletPoint should even work with a superClass, as in my
> original example. I understand that if I don't run the OWL RL reasoner
> first, then I would need to fall back to the SPARQL query approach if
> I want to query against the superclass of AirOutletPoint.

Just to be extra clear: sh:values expressions will not automatically use
other types of reasoning such as OWL RL. Those triples would need to be
asserted. However, operators such as sh:class will automatically walk
up/down the rdfs:subClassOf hierarchy and thus have some kind of simple
reasoning built in.
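
A minimal sketch of that distinction, using made-up names in the example namespace (ex:OutletPoint, ex:Outlet1 and ex:OutletPointFilter are hypothetical):

```turtle
@prefix ex:   <http://example.org/pressure#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .

ex:OutletPoint a rdfs:Class .
ex:AirOutletPoint
  a rdfs:Class ;
  rdfs:subClassOf ex:OutletPoint .

# Only the subclass is asserted as the type of this instance.
ex:Outlet1 a ex:AirOutletPoint .

# A shape that filters on the superclass. ex:Outlet1 conforms, because sh:class
# follows rdfs:subClassOf from the asserted type up to ex:OutletPoint.
ex:OutletPointFilter
  a sh:NodeShape ;
  sh:class ex:OutletPoint .

# By contrast, the triple  ex:Outlet1 a ex:OutletPoint  is not in the graph.
# A plain pattern such as  ?cp a ex:OutletPoint  only matches once that triple
# has been asserted or materialized, for example by an OWL RL reasoner.
```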

Holger


Steve Ray

Dec 11, 2020, 8:08:44 PM
to TopBraid Suite Users
I see your point. Using sh:class still worked with the superclass (c223:InletPoint).

Here is my working shape (this time for an InletPoint, which has a subclass AirInletPoint):

b20:InletPressureShape
  rdf:type sh:PropertyShape ;
  sh:path c223:hasInletPressure ;
  sh:name "Inlet Pressure Shape" ;
  sh:values [
      sh:path c223:hasValue ;
      sh:nodes [
          sh:filterShape [
              sh:property [
                  sh:path qudt:hasQuantityKind ;
                  sh:hasValue qudtqk:Pressure ;
                ] ;
            ] ;
          sh:nodes [
              sh:path c223:hasProperty ;
              sh:nodes [
                  sh:filterShape [
                      sh:class c223:InletPoint ;
                    ] ;
                  sh:nodes [
                      sh:path c223:hasConnectionPoint ;
                    ] ;
                ] ;
            ] ;
        ] ;
    ] ;
.


I do find that I still need to run the SHACL reasoner before the c223:hasInletPressure relation returns a result in the query below, but I don't need to run the OWL RL rules. 


SELECT *
WHERE {
  ?s c223:hasInletPressure ?press .
}





Steve



