Best practices guidance

Steve Ray

Oct 12, 2020, 2:39:03 PM
to TopBraid Suite Users
This question is directed at anyone in this group who is fluent in
SHACL. I'm looking for guidance on best practices for modeling the
following kind of pattern:

A PowerMeasurementDevice takes measurements of Power.

Is it better practice to put a SHACL constraint on the instances of a
class called PowerMeasurement (that they should be measurements of
Power), where:

PowerMeasurementDevice producesMeasurements PowerMeasurement .


or, should I just define a property shape for PowerMeasurementDevice
such that the property producesMeasurements has the appropriate
constraints? In this case, I don't need to create a special class
called PowerMeasurement.

Any thoughts one way or the other? I'm leaning toward the explicit
definition of a PowerMeasurement class, but it's more intuition than
principle.


Steve

Irene Polikoff

Oct 12, 2020, 3:04:07 PM
to topbrai...@googlegroups.com
I do not understand the difference between the two options you are outlining. In the second case, what would the constraint look like?

The bigger question that would ultimately determine the answer is “what would you like to accomplish by producing this model, and what do you expect to see in the typical data you will use with it?”. In other words, what do you want it to do? For example, what constraint violations would you like it to produce when validating what kind of data, and what inferences would you like it to create, if any?

There are a few general best practices that can be stated in the abstract, such as:

  • Do not use inverse properties; instead, use shapes with inverse paths
  • When deciding what direction to create a property in, go from “many” to “one”

But for most guidance, the data and goals need to be understood. There are many different options and alternatives with SHACL. Even the ones above are not “100% must do”; ultimately, it all depends.
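
As a minimal sketch of the first bullet, assuming a hypothetical ex: namespace and an ex:PowerMeasurement class (neither is defined in this thread): instead of declaring an inverse property such as "producedBy", a shape on the measurement can reach back to the device through sh:inversePath.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/ns#> .   # hypothetical namespace

# Constrains measurements "from the other side" of ex:producesMeasurements
# without introducing a separate inverse property.
ex:PowerMeasurementShape
  a sh:NodeShape ;
  sh:targetClass ex:PowerMeasurement ;
  sh:property [
    a sh:PropertyShape ;
    sh:path [ sh:inversePath ex:producesMeasurements ] ;
    sh:class ex:PowerMeasurementDevice ;
    sh:minCount 1 ;   # every measurement comes from some device
    sh:maxCount 1 ;   # and from only one
    sh:name "produced by" ;
  ] ;
.
```

The cardinalities are illustrative only; whether a measurement must have exactly one producing device depends on the data.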


Michael DeBellis

Oct 12, 2020, 3:16:32 PM
to topbrai...@googlegroups.com

> I'm looking for guidance on best practices for modeling the
> following kind of pattern:
> A PowerMeasurementDevice takes measurements of Power.

In my opinion, you haven't said enough about your problem for anyone to give a really good answer to the question. You can model this with either OWL or SHACL. Actually, you will probably use OWL either way, just to model that a PowerMeasurementDevice is an instance of Device (which is probably a subclass of something else) and that producesMeasurements is an object property. So the real question is: is it better to use OWL alone or OWL + SHACL?

The standard answer to these kinds of questions is usually: "use OWL for reasoning and SHACL for constraints" but of course things like the domain and range of a property (like most things that can be modeled in SHACL) can be modeled either way, as an axiom for reasoning or as a constraint on data. 
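
A minimal sketch of the two readings, using a hypothetical ex: namespace and class names taken from the original question. The first block states domain and range as axioms that a reasoner uses to infer types; the second states a similar expectation as a SHACL constraint that is only checked against data.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

# Axiom for reasoning: anything that produces measurements is inferred
# to be a device, and anything produced is inferred to be a measurement.
ex:producesMeasurements
  a owl:ObjectProperty ;
  rdfs:domain ex:PowerMeasurementDevice ;
  rdfs:range ex:PowerMeasurement .

# Constraint on data: values that are not PowerMeasurements are reported
# as violations; nothing is inferred.
ex:PowerMeasurementDeviceShape
  a sh:NodeShape ;
  sh:targetClass ex:PowerMeasurementDevice ;
  sh:property [
    sh:path ex:producesMeasurements ;
    sh:class ex:PowerMeasurement ;
  ] .
```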

I think what it really comes down to is: what kind of data do you have? If you are creating, say, an expert system ontology from scratch and are also going to populate it with rules in SWRL or some other language, then OWL alone is probably better. If you have a lot of existing real-world data which may contain values of producesMeasurements that don't fit the domain and/or range, then you probably want to use SHACL. Also, how volatile is your data? If there is constantly new data coming in (which may not match the constraint), then SHACL is probably better. If it is relatively static and your data will tend to be well formed, then OWL axioms are probably better.

As you probably know, the issue with OWL axioms is that they are used for inference rather than checking: an individual that doesn't fit the declared domain or range is simply inferred to belong to that class, and as soon as that conflicts with something else in the ontology (for example, a disjointness axiom) your entire ontology is inconsistent, which means it can't be used for any reasoning until the bad data is fixed or removed. With SHACL you just get a report (or some other user-defined action) about the bad data, and it doesn't make your entire model inconsistent.

Michael 


Irene Polikoff

Oct 12, 2020, 6:02:34 PM
to topbrai...@googlegroups.com
I am pretty sure this was a question about the most appropriate SHACL modeling pattern as opposed to OWL vs SHACL.

Regarding reasoning, TopBraid users typically use SHACL rules for reasoning. SWRL is pretty obsolete. It will never be a standard and is mainly supported by Protege. 
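
For readers who have not seen SHACL rules, here is a minimal sketch using sh:TripleRule from the SHACL Advanced Features vocabulary. The ex: namespace and the ex:MeasurementDevice class are assumptions for illustration, not part of any model discussed here.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/ns#> .   # hypothetical namespace

# For every subject of ex:producesMeasurements, infer the triple
#   <that subject> rdf:type ex:MeasurementDevice .
ex:MeasurementDeviceRules
  a sh:NodeShape ;
  sh:targetSubjectsOf ex:producesMeasurements ;
  sh:rule [
    a sh:TripleRule ;
    sh:subject sh:this ;
    sh:predicate rdf:type ;
    sh:object ex:MeasurementDevice ;
  ] .
```

This plays roughly the role an rdfs:domain axiom would play in OWL, but it lives in the shapes graph alongside the constraints.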

I was also assuming that PowerMeasurementDevice (if such a class is created) could be a subclass of MeasurementDevice and that its instances would be the actual devices - maybe from different manufacturers, etc. Of course, the decision on where to stop with classes and subclasses and start using instances is also situational - it depends.

Steve Ray

Oct 12, 2020, 6:15:00 PM
to TopBraid Suite Users
Yes, sorry, I was indeed asking about SHACL modeling patterns. I will
think some more so that I can distill my confusion into something more
clear, and perhaps gain insight along the way.

Steve

Gary Murphy

Oct 13, 2020, 8:59:16 AM
to TopBraid Suite Users
I don't know if this helps, but in my case I use SHACL to measure how far a data item is from its definition (I count errors and warnings, then report a critique). I also needed to report the SHACL shapes involved and allow overrides. I chose to put my rules into their own NodeShapes rather than extend the data item classes. I don't have any compelling reason to split them like this; even as extensions to the data classes, the PropertyShapes could still have their own IDs. But this division has worked for me.
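
A rough sketch of what such a split can look like, with every name in the ex: namespace invented for illustration (this is not the actual Schema App model). The rules sit in a standalone NodeShape that targets the data class, each PropertyShape has its own IRI so a report can name it, sh:severity separates errors from warnings, and sh:deactivated is one standard way to override an individual check.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

# Standalone shape: the data class ex:PowerMeasurementDevice is untouched.
ex:DeviceChecks
  a sh:NodeShape ;
  sh:targetClass ex:PowerMeasurementDevice ;
  sh:property ex:DeviceChecks-producesMeasurements ,
              ex:DeviceChecks-label .

# Counted as an error.
ex:DeviceChecks-producesMeasurements
  a sh:PropertyShape ;
  sh:path ex:producesMeasurements ;
  sh:minCount 1 ;
  sh:severity sh:Violation ;
  sh:message "A device should produce at least one measurement." .

# Counted as a warning, and currently switched off as an override.
ex:DeviceChecks-label
  a sh:PropertyShape ;
  sh:path rdfs:label ;
  sh:minCount 1 ;
  sh:severity sh:Warning ;
  sh:deactivated true .
```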



--
Gary Murphy
Developer, Schema App
ga...@schemaapp.com
www.schemaapp.com

Steve Ray

Oct 13, 2020, 3:52:53 PM
to TopBraid Suite Users
Irene,
Thanks for the feedback. I was working on a "double-hop" property
shape as part of this exercise, and stumbled on what I would
characterize as a UI bug in TBC. Note in the screenshot that when I
chose "Add Empty Row" when setting a value for sh:hasValue, the
datatype shows up as string.
Autocompletion works as expected (see second screenshot).
But here's the source code view (see third screenshot). I lost a
couple of hours wondering why the validation was failing until Ralph
helpfully pointed out the quotation marks in the source code, which of
course should not be there.
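
Since the screenshots are not reproduced in this text, here is a sketch of the kind of difference involved. The shape names are hypothetical, and the exact string the form saved is a guess; the point is only that a quoted value is a string literal while the intended value is an IRI.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

# What a string input field produces: an xsd:string literal.
# No quantity kind in the data is ever equal to a string, so validation fails.
ex:BrokenShape
  a sh:PropertyShape ;
  sh:path ( ex:producesMeasurements qudt:unit qudt:hasQuantityKind ) ;
  sh:hasValue "http://qudt.org/vocab/quantitykind/ForcePerArea" .

# What was intended: the IRI of the quantity kind itself.
ex:FixedShape
  a sh:PropertyShape ;
  sh:path ( ex:producesMeasurements qudt:unit qudt:hasQuantityKind ) ;
  sh:hasValue <http://qudt.org/vocab/quantitykind/ForcePerArea> .
```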



Steve


Screen Shot 2020-10-13 at 12.44.07 PM.png
Screen Shot 2020-10-13 at 12.48.12 PM.png
Screen Shot 2020-10-13 at 12.49.14 PM.png

Steve Ray

Oct 13, 2020, 4:52:57 PM
to TopBraid Suite Users
Irene,
To answer your question about what my second case would look like,
here's a working example:

test1:PressureMeasuringDevice
  a owl:Class ;
  a sh:NodeShape ;
  rdfs:subClassOf owl:Thing ;
  sh:property [
      a sh:PropertyShape ;
      sh:path test1:producesMeasurements ;
      sh:class qudt:Quantity ;
      sh:name "produces measurements" ;
    ] ;
  sh:property [
      a sh:PropertyShape ;
      sh:path (
          test1:producesMeasurements
          qudt:unit
          qudt:hasQuantityKind
        ) ;
      sh:hasValue <http://qudt.org/vocab/quantitykind/ForcePerArea> ;
      sh:name "produces measurements" ;
    ] ;
.

Note that this does not require the definition of a constrained class
PowerMeasurement. Instead, it just says "I expect an instance of the
generalized qudt:Quantity class, but it had better have a unit with a
quantitykind of ForcePerArea".

So, the advantage of this method is that there are fewer subclasses of my
Measurement class that I would have to define. The cost is more
complex property shapes.
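
For comparison, the first option (an explicit, constrained measurement class) could be sketched roughly as follows. The ex: namespace and the class name ex:PressureMeasurement are assumptions made for this illustration.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix quantitykind: <http://qudt.org/vocab/quantitykind/> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

# The unit constraint lives on the measurement class itself.
ex:PressureMeasurement
  a rdfs:Class , sh:NodeShape ;
  rdfs:subClassOf qudt:Quantity ;
  sh:property [
    sh:path ( qudt:unit qudt:hasQuantityKind ) ;
    sh:hasValue quantitykind:ForcePerArea ;
    sh:name "quantity kind" ;
  ] .

# The device shape only needs a simple one-hop sh:class check.
ex:PressureMeasuringDevice
  a rdfs:Class , sh:NodeShape ;
  sh:property [
    sh:path ex:producesMeasurements ;
    sh:class ex:PressureMeasurement ;
    sh:name "produces measurements" ;
  ] .
```

One difference worth keeping in mind: sh:hasValue only requires that at least one value reached by the path equals the given term. With the three-step path on the device, a device with several measurements passes as soon as one of them leads to ForcePerArea, whereas here each ex:PressureMeasurement instance is checked on its own.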

I'm quite interested in your opinion of this.


Steve

Holger Knublauch

Oct 13, 2020, 6:16:20 PM
to topbrai...@googlegroups.com
Hi Steve,

the UI needs to decide what kind of input widget it presents based on
what it can find out from the ontology. The case of sh:hasValue is
difficult, because it would need to make some assumptions about what
type of value is most likely needed. In your case this is extra
difficult because the property shape doesn't specify sh:datatype or
sh:class, and even worse, uses a property path. Due to lack of other
info it thus "guesses" you may want to use an xsd:string and presents
you with a string input field.
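
A sketch of the same kind of property shape with more type information attached. Whether any particular editor then offers a resource picker instead of a string field is an assumption, not something confirmed in this thread; the ex: names are hypothetical.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix quantitykind: <http://qudt.org/vocab/quantitykind/> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

ex:PressureKindShape
  a sh:PropertyShape ;
  sh:path ( ex:producesMeasurements qudt:unit qudt:hasQuantityKind ) ;
  sh:class qudt:QuantityKind ;   # values are expected to be quantity kinds
  sh:nodeKind sh:IRI ;           # and to be IRIs, not literals
  sh:hasValue quantitykind:ForcePerArea .
```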

Some things are better edited in source code, and I recommend taking a
look at the source code from time to time. The EDG web UI allows
arranging the source code and form panels side by side, making it easier
to spot such mistakes.

Holger

Steve Ray

Oct 13, 2020, 6:33:04 PM
to TopBraid Suite Users
Understood. I'll try to remain vigilant about those little widgets.
This was a subtle one. It really had me questioning whether I
understood SHACL at all!

Steve

Irene Polikoff

Oct 13, 2020, 7:19:05 PM
to topbrai...@googlegroups.com
Hi Steve,

In principle, there could be several kinds of things that one may expect to be instances of test1:PressureMeasuringDevice.

Since I don’t have a good example with pressure at the moment, let's talk, instead, about WeightMeasuringDevice such as a scale. The idea is the same:

1. There are different models of scales, e.g., https://www.amazon.com/eufy-Bluetooth-Measurements-Composition-Analysis/dp/B07GZBXCH6, so a model could be an instance. Models don’t really have quantities, since a quantity is a specific measurement with a value. But one could say that a model is used to measure human weight as opposed to the weight of food.
2. There are instances of scales - a specific scale of the model https://www.amazon.com/eufy-Bluetooth-Measurements-Composition-Analysis/dp/B07GZBXCH6, with its specific barcode.
3. There are applied instances of scales. In other words, it is not only a specific scale, but a scale that is being used to measure my weight.

I think you are describing the third situation because you want to specify quantities, e.g., Irene’s weight would be an instance of qudt:Quantity. Presumably, in your data you will have instances of qudt:Quantity and you want to ensure that they have the right unit of measure.

Your model would work fine for validating existing data. For example, if you have an instance of a weight measuring device and it is said to produce measurements (quantities) that are not measured in units of the right quantity kind, e.g., it is said that Irene’s weight is measured in meters, you will get a violation. The device will be flagged as invalid.
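
A small data sketch of that situation, using hypothetical ex: names and a weight-measuring shape written by analogy with the pressure example earlier in the thread. The unit-to-quantity-kind triple is repeated here only to keep the example self-contained; normally it would come from the QUDT vocabulary.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix quantitykind: <http://qudt.org/vocab/quantitykind/> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

# Shape: a weight measuring device must produce a quantity whose unit
# has the quantity kind Mass.
ex:WeightMeasuringDevice
  a rdfs:Class , sh:NodeShape ;
  sh:property [
    sh:path ( ex:producesMeasurements qudt:unit qudt:hasQuantityKind ) ;
    sh:hasValue quantitykind:Mass ;
  ] .

# Data: the weight is (wrongly) recorded in meters.
ex:scale1
  a ex:WeightMeasuringDevice ;
  ex:producesMeasurements ex:irenesWeight .

ex:irenesWeight
  a qudt:Quantity ;
  qudt:unit unit:M .

unit:M qudt:hasQuantityKind quantitykind:Length .
```

Validating this data should report a violation whose focus node is ex:scale1, the device, not ex:irenesWeight.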

I don’t like that both property shapes in your example have the same sh:name - this is confusing.

If, on the other hand, you want to be able to create data based on this model, e.g., when you create a device and then create a quantity "Irene’s weight", you want to be offered only the appropriate units for selection because this quantity is measured by your device, then this model would not support it in EDG, and I doubt any tool would support it out of the box.

If quantities such as Irene’s weight already exist when you create a device, you may want to be offered just the quantities that use appropriate units of measure when describing the device. This would not work out of the box either.

You could create a custom application targeted to your use of your specific modeling pattern.

Also, with data validation, is it the device that is invalid or is it the quantity that is invalid? Can quantities exist independently of the devices that measure them? Maybe the issue is with the quantity (it got described with the wrong units) and not with the device. In other words, it is correct to say that a specific device is used to measure my weight, but it is not correct to say that my weight could be measured in meters. Further, data about my weight could exist without knowing the specific scale that measured it.

In other words, the model design depends on whether it is only for validating already existing data or also for guiding you in correctly creating new data. It also depends on what the focus of your validation is: devices or quantities.
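
If the quantity should be the thing that gets flagged, one possible sketch is a shape whose focus nodes are the measured quantities rather than the devices. The ex: names are hypothetical, and the example assumes for simplicity that every device in the graph measures mass; a mixed graph would need a narrower target.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix qudt: <http://qudt.org/schema/qudt/> .
@prefix quantitykind: <http://qudt.org/vocab/quantitykind/> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

# Focus nodes are the objects of ex:producesMeasurements, so a wrong
# unit is reported against the quantity, not against the device.
ex:MeasuredWeightShape
  a sh:NodeShape ;
  sh:targetObjectsOf ex:producesMeasurements ;
  sh:class qudt:Quantity ;
  sh:property [
    sh:path ( qudt:unit qudt:hasQuantityKind ) ;
    sh:hasValue quantitykind:Mass ;
    sh:name "quantity kind" ;
  ] .
```

Quantities that exist without any known device are simply not targeted by this shape; covering them would need an additional target such as sh:targetClass.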
