On 18 Feb 2022, at 20:51, Tim Smith <smith...@gmail.com> wrote:
--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/topbraid-users/CAF0WbnLbWWJoc%2BE3vOAiCvLRVtNhvUvBPiEJdH32dZ1McZu0nw%40mail.gmail.com.
On 22 Feb 2022, at 13:41, Tim Smith <smith...@gmail.com> wrote:
Hi David,
Thank you for the clarification. My impression of TeamWorks was different. This feels like "open loop control" of a closed loop process given that one workflow can overwrite another unless strict human processes are followed. My use case requires branching and versioning with "github-like" functionality where multiple people/roles can edit the same graph with conflicting edits reconciled at commit time.
We have a need to have versions of both ontologies and instance data with compatible versions linked together.
Since all edits can be controlled through EDG (vs Git where the individual changes have to be detected and reconciled), maybe an extension of the TBC Diff engine might help here?
Anyway, I was hoping to use versioning/diff reconciliation to help justify the cost of EDG. I do not have a TSM at the moment. What is the best way to submit a feature request?
Hi Tim,
A few further opinions below.
On 22 Feb 2022, at 13:41, Tim Smith <smith...@gmail.com> wrote:
> Hi David,
> Thank you for the clarification. My impression of TeamWorks was different. This feels like "open loop control" of a closed loop process given that one workflow can overwrite another unless strict human processes are followed. My use case requires branching and versioning with "github-like" functionality where multiple people/roles can edit the same graph with conflicting edits reconciled at commit time.
IMO thinking of ontologies as software artefacts is a good approach. However, assuming that ontologies (RDF graphs) can be versioned and diff'ed using the same approach as, say, Java source code (sequences of lines of text) does not seem to me like a good way to start. In RDF-land, even ontologies are actually data. The definition of “conflicting edits” is also a big one, as things that conflict in RDF-land can be completely unrelated as far as a normal diff tool is concerned.
> We have a need to have versions of both ontologies and instance data with compatible versions linked together.
Of course, everyone who has ever used an ontology has that requirement. However, a tool cannot do all the heavy lifting. For example, if your ontology changes in a way that makes existing data invalid, then some sort of data migration/transformation is needed. The simple addition of "<x> sh:minCount 1" in a SHACL property shape can make millions of data instances invalid and no diff tool will find that. Only good business processes outside anything in EDG, or any other tool, can properly support this in the general case.
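To make the sh:minCount point concrete, here is a minimal Python sketch. It is not SHACL and not anything from EDG: triples are plain tuples, the `ex:` names and both helper functions are hypothetical, and the "validator" only counts values for a single path. It shows an ontology diff of exactly one added triple while 999 of 1000 data instances become invalid.

```python
# Toy model (assumption): a graph is a set of (subject, predicate, object) tuples.

def diff(old, new):
    """Triple-level diff: returns (added, deleted) sets."""
    return new - old, old - new

def min_count_violations(data, target_class, path, min_count):
    """Instances of target_class with fewer than min_count values for path."""
    instances = {s for (s, p, o) in data if p == "rdf:type" and o == target_class}
    counts = {}
    for (s, p, o) in data:
        if p == path:
            counts[s] = counts.get(s, 0) + 1
    return {i for i in instances if counts.get(i, 0) < min_count}

# Hypothetical shape graph, before and after adding the constraint.
ontology_v1 = {
    ("ex:PersonShape", "sh:targetClass", "ex:Person"),
    ("ex:PersonShape", "sh:property", "ex:PersonShape-email"),
    ("ex:PersonShape-email", "sh:path", "ex:email"),
}
ontology_v2 = ontology_v1 | {("ex:PersonShape-email", "sh:minCount", "1")}

# Hypothetical instance data: 1000 persons, only one of them has an email.
data = {(f"ex:person{i}", "rdf:type", "ex:Person") for i in range(1000)}
data.add(("ex:person0", "ex:email", "a@example.org"))

added, deleted = diff(ontology_v1, ontology_v2)
violations = min_count_violations(data, "ex:Person", "ex:email", 1)

print(len(added), len(deleted))  # the diff sees one added triple, nothing deleted
print(len(violations))           # the data, untouched, now has 999 invalid instances
```

The diff and the validation answer different questions, which is why the impact of the change never shows up in the diff output.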
> Since all edits can be controlled through EDG (vs Git where the individual changes have to be detected and reconciled), maybe an extension of the TBC Diff engine might help here?
As I mentioned, in a working copy you are not actually changing a graph. A working copy is a layer of changes (literally additions and deletions) to be applied over the production copy, which the UI makes look like normal triples when viewing through the lens that is the working copy. That said, we are looking at potential improvements.
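The "layer of changes" description can be sketched roughly as follows. This is only an illustration of the idea as described in this thread, not EDG's actual implementation; the `WorkingCopy` class and its method names are made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class WorkingCopy:
    """Hypothetical overlay: additions and deletions layered over production."""
    production: set                      # shared reference, never copied
    additions: set = field(default_factory=set)
    deletions: set = field(default_factory=set)

    def add(self, triple):
        self.deletions.discard(triple)
        if triple not in self.production:
            self.additions.add(triple)

    def remove(self, triple):
        self.additions.discard(triple)
        if triple in self.production:
            self.deletions.add(triple)

    def view(self):
        """What the UI would show: production minus deletions plus additions."""
        return (self.production - self.deletions) | self.additions

production = {("ex:a", "rdfs:label", "A"), ("ex:b", "rdfs:label", "B")}
wc = WorkingCopy(production)
wc.remove(("ex:a", "rdfs:label", "A"))
wc.add(("ex:a", "rdfs:label", "Alpha"))

# Production is untouched by edits made in the working copy.
print(len(production))

# A commit from elsewhere changes production; the view picks it up
# immediately because it is computed over the live production graph.
production.add(("ex:c", "rdfs:label", "C"))
print(sorted(wc.view()))
```

Because the view is computed over whatever production currently contains, changes committed by others appear in the working copy without being part of its own change layer.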
On 22 Feb 2022, at 17:32, Tim Smith <smith...@gmail.com> wrote:
Hi David,
A few thoughts below.
On Tue, Feb 22, 2022 at 9:57 AM David Price <dpr...@topquadrant.com> wrote:
> Hi Tim,
> A few further opinions below.
> On 22 Feb 2022, at 13:41, Tim Smith <smith...@gmail.com> wrote:
>> Hi David,
>> Thank you for the clarification. My impression of TeamWorks was different. This feels like "open loop control" of a closed loop process given that one workflow can overwrite another unless strict human processes are followed. My use case requires branching and versioning with "github-like" functionality where multiple people/roles can edit the same graph with conflicting edits reconciled at commit time.
> IMO thinking of ontologies as software artefacts is a good approach. However, assuming that ontologies (RDF graphs) can be versioned and diff'ed using the same approach as, say, Java source code (sequences of lines of text) does not seem to me like a good way to start. In RDF-land, even ontologies are actually data. The definition of “conflicting edits” is also a big one, as things that conflict in RDF-land can be completely unrelated as far as a normal diff tool is concerned.
I was definitely not thinking of diff as in a textual difference operation. My reference to GitHub was intended to be an analogy, not a literal interpretation. I believe the concepts of branching, diff'ing and merging apply both to code and graphs, with the implementation being wildly different. Comparing graphs at the triple level, including handling b-nodes properly, was what I was thinking, thus my reference to the TBC Diff engine.
>> We have a need to have versions of both ontologies and instance data with compatible versions linked together.
> Of course, everyone who has ever used an ontology has that requirement. However, a tool cannot do all the heavy lifting. For example, if your ontology changes in a way that makes existing data invalid, then some sort of data migration/transformation is needed. The simple addition of "<x> sh:minCount 1" in a SHACL property shape can make millions of data instances invalid and no diff tool will find that. Only good business processes outside anything in EDG, or any other tool, can properly support this in the general case.
I wasn't thinking of a tool to manage the transformation of instance data from one ontology to another. I agree, that would be a real challenge. I was thinking of simply ensuring an instance graph can know what version of an ontology it was populating. This requires defining versions of ontologies beyond putting a "v<X.x>" in the name of the ontology graph. This could be inferred from the import statements, or a specific predicate could be used to point to the defining ontology. While I'm not looking for this capability here, I am a fan of using a transformation ontology to define the transformations and a SPARQL/SPIN/SHACL engine to execute them. I have built such an ontology and have used it successfully a number of times. I believe this was also the strategy behind the SPINMap mapping technology in TBC.
>> Since all edits can be controlled through EDG (vs Git where the individual changes have to be detected and reconciled), maybe an extension of the TBC Diff engine might help here?
> As I mentioned, in a working copy you are not actually changing a graph. A working copy is a layer of changes (literally additions and deletions) to be applied over the production copy, which the UI makes look like normal triples when viewing through the lens that is the working copy. That said, we are looking at potential improvements.
I understand. Fundamentally, I think of a "working copy" as originating from the production graph at a specific point in time (i.e. a "branch" in the version tree), even if the graph isn't duplicated.
I think the risk in the current design is that the layer of changes is currently defined as anything the user has entered as a change since the working copy was created. As you said, working copies do not know what changes other working copies have committed to the production graph.
So if you make changes, and I make changes, if I commit first, your changes can overwrite mine and vice versa.
What's also confusing is that, currently, if this happens, your changes will show up in my working copy (because it's just loading the latest production graph). BUT, when I run the Comparison Report in my working copy, which explicitly says "Compares this working copy with the production copy", it only shows the changes I have entered, even though the production copy has changed since my workflow was initiated.
Maybe I worked with version control systems for too long, LOL???
While this can be managed at the work process level, it would be better to prevent the occurrence within EDG. For example, when a Working Copy is displaying the changes, it could show all proposed changes (from my working copy) as well as all committed changes that have occurred since the working copy was initiated (EDG should have the change history, but maybe there are edge cases?). Changes that conflict could be highlighted. Conflict definitions could be declared in an ontology, much like the rules of the TBC Diff engine.
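A rough Python sketch of that proposal, to show what I mean. Everything here is an assumption for illustration: the change history is modelled as a list of commits, each a list of ("add" | "del", triple) entries, the working copy remembers the history index it started from, and "conflict" is arbitrarily defined as both sides touching the same subject/predicate pair. A real conflict definition would need to be richer and, ideally, declarative.

```python
def touched(changes):
    """(subject, predicate) pairs touched by a list of (op, triple) changes."""
    return {(s, p) for _op, (s, p, _o) in changes}

def review(history, base_index, proposed):
    """Return (committed_since, conflicts) for a working copy.

    history     list of commits made to production, oldest first
    base_index  number of commits that existed when the working copy started
    proposed    the working copy's own (op, triple) changes
    """
    committed_since = [c for commit in history[base_index:] for c in commit]
    conflicts = touched(proposed) & touched(committed_since)
    return committed_since, conflicts

# Hypothetical history: commit 0 predates my working copy, 1 and 2 came later.
history = [
    [("add", ("ex:a", "rdfs:label", "A"))],
    [("del", ("ex:a", "rdfs:label", "A")), ("add", ("ex:a", "rdfs:label", "Apple"))],
    [("add", ("ex:b", "rdfs:label", "B"))],
]
proposed = [
    ("del", ("ex:a", "rdfs:label", "A")),
    ("add", ("ex:a", "rdfs:label", "Alpha")),
    ("add", ("ex:c", "rdfs:label", "C")),
]

committed_since, conflicts = review(history, 1, proposed)
print(len(committed_since))  # changes others committed after my branch point
print(conflicts)             # pairs that both sides edited
```

With this in place, a comparison report could list the proposed changes, the changes committed since the branch point, and the overlap, instead of silently letting the later commit win.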
From TBC help:
The Diff Engine
TopBraid's Diff engine is entirely declarative and can be modified or extended to get customized output. The various diff classes have SPARQL queries attached to them via diff:rule. These queries are called by the engine to create the instances of this class. The GRAPH keyword of SPARQL is used to query the old graph versus the new graph. If you want to add your own kinds of diff outputs, then simply add such diff:rules.
The second step of the Diff engine is to call all spin:rules for the constructed instances of the diff classes. These rules can be used to post-process the raw output from the first step, e.g. to create human-readable labels. The same mechanism can be used to create higher-level diff objects from the lower-level triple change objects.
You can modify the diff.ttl file in your workspace to adjust the behavior of the diff engine for your needs.
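The two-step structure described in the help text can be mimicked in a few lines of plain Python. To be clear, this is only an analogy: it does not use diff:rule, spin:rules or SPARQL, the function names are invented, and blank nodes are ignored entirely (they would need proper matching rather than comparison by identifier).

```python
def raw_changes(old, new):
    """Step 1 analog: compare old and new graph, emit low-level triple changes."""
    added = [("added", t) for t in sorted(new - old)]
    deleted = [("deleted", t) for t in sorted(old - new)]
    return added + deleted

def summarize(changes):
    """Step 2 analog: post-process raw changes into labelled higher-level entries.

    A deletion and an addition on the same subject and predicate are merged
    into one 'changed' entry.
    """
    added, deleted = {}, {}
    for kind, (s, p, o) in changes:
        bucket = added if kind == "added" else deleted
        bucket.setdefault((s, p), []).append(o)
    labels = []
    for (s, p) in sorted(set(added) | set(deleted)):
        new_vals = added.get((s, p), [])
        old_vals = deleted.get((s, p), [])
        if new_vals and old_vals:
            labels.append(f"{s} {p}: changed {old_vals} -> {new_vals}")
        elif new_vals:
            labels.append(f"{s} {p}: added {new_vals}")
        else:
            labels.append(f"{s} {p}: deleted {old_vals}")
    return labels

# Hypothetical old and new graphs as sets of (subject, predicate, object).
old = {("ex:a", "rdfs:label", "A"), ("ex:b", "rdfs:label", "B")}
new = {("ex:a", "rdfs:label", "Alpha"), ("ex:c", "rdfs:label", "C")}

raw = raw_changes(old, new)
labels = summarize(raw)
for line in labels:
    print(line)
```

The interesting part is the second step: conflict definitions or other higher-level change types could be layered on the raw triple changes in the same way.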
On Feb 22, 2022, at 2:41 PM, David Price <dpr...@topquadrant.com> wrote: