Dear,
2 simple/short technical questions:
- Can the generated comments at the start of a turtle file be avoided somehow?
- Can the order of the items in the file be preserved (related to # comments)?
Thx Michel
Dr. ir. H.M. (Michel) Böhms
Senior Data Scientist
This message may contain information that is not intended for you. If you are not the addressee or if this message was sent to you by mistake, you are requested to inform the sender and delete the message. TNO accepts no liability for the content of this e-mail, for the manner in which you use it and for damage of any kind resulting from the risks inherent to the electronic transmission of messages.
--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Sorry, a bit of a late reply…
When I save, all comments (edited/added manually) are gone and the order is changed.
(I can’t remember changing a setting for that behaviour, so I guess it is the default.)
After saving (in ttl), the order seems to be:
- Comments baseURI/prefix/baseuri
- Prefixes
- Changed external entities
- Ontology declaration
- Classes (alphabetical)
- Properties (datatype/object mixed) (alphabetical)
Is there a way to overrule this, stick to the manually determined order and keep my own comments?
Gr Michel
Hi Holger, see after >:
On 12/07/2017 21:11, Bohms, H.M. (Michel) wrote:
Is there a way to overrule this and stick to manually determined order and keeping own comments?
Yes, the only way to keep these is to not use TBC at all.
What you are asking for (and in the parallel thread) is almost impossible to implement for us. You are asking for a system that not only parses Turtle but also preserves the details of the formatting (e.g. commas vs semicolon) and # comments (which the parsers usually throw away).
Seriously, if these low-level details of the TTL syntax are relevant to you, just use text editors.
Greetings Michel
Hi Michel,
First, a bit of clarity seems to be required with respect to some basics of RDF/OWL. RDF/XML, Turtle, etc. encode graphs, which by definition have no order or hierarchy. There is no top, bottom or middle of a graph. Therefore, ordering is not important in any Turtle or RDF/XML encoding. Contrast that with XML, which encodes a strictly ordered set of XML elements and hierarchy/containment relationships, plus attributes on the XML elements. Graphs and XML are nothing alike, and mixing requirements between them is not a sensible thing to do.
>The situation is IMHO subtly different:
It does not make sense to decide whether to use IFC EXPRESS and STEP file format vs IFC OWL and Turtle for data exchange based on a manually controlled Turtle file format. If that is important in your argument, IMO you’ve got entirely the wrong argument :-)
If you have further concerns, I suggest you take them up with me directly rather than on the user forum, since I understand the IFC STEP issues and most Composer developers and users don’t. I’m happy to help make the argument to IFC STEP users. How about: for real-world uses of RDF/OWL, it is very likely that the data will end up in an RDF database anyway, and so the fact that it was ever encoded as Turtle disappears entirely.
>Finally, here we totally agree! If Turtle (or any concrete RDF syntax) becomes completely irrelevant in future because all instantiation and changing is done via a direct interface like SPARQL, all issues that arose from combining RDF Documents and RDF Datasets also become fully irrelevant! So let’s work hard to make Turtle a formal W3C Condemnation (is that the opposite of a Recommendation?).
>Gr Michel
Dear Irene
Under the assumption:
“rdf data is always in an RDF database (accessible by SPARQL) and RDF Documents are ONLY for semantic exchange between such systems”
I fully agree 100 % with all your statements (and those of your colleagues).
So the issue is about the “assumption”.
I observe many situations where the primary/reference data (typically an ontology) is in an RDF Document. You might say “this is not good” or “this won’t be the case in future” etc., but it is simply a situation I observe a lot (and the current RDF 1.1 specifications are certainly not clear that this shouldn’t be the case either).
An example (other than my own cmo.ttl): https://www.w3.org/ns/prov.ttl
This is a reference specification published on the web as an RDF document (in various dereferenceable serialisations). It can be imported into any RDF DB you like, but the reference spec IS an RDF Document (so this is a different status than “just for exchange”).
In this ontology you find many comments like:
“## Definitions from other ontologies”
Or
“# The following was imported from http://www.w3.org/ns/prov-dc#”
Now imagine you are the owner of such a reference ontology as an RDF Document.
You should be aware that when editing your ontology, most tools (other than text editors) will delete both order and comments (in general: all non-semantic aspects of that document).
When I say “delete” I of course mean the sequence (parse without recording, then write), which effectively deletes the comments and the specific order after editing.
The current formal specifications do not tell us much about which of the assumptions above is right. Turtle specifies a comment mechanism but does not say, e.g., “be careful, comments are only relevant when writing files; after parsing all will be lost” or something similar.
I also agree (with Holger) that if that second interpretation (RDF doc as primary/reference) is assumed, it is quite a job to record the non-semantic info and reuse it when writing out again. (In the ISO STEP world the same issue is relevant, and some tools actually retain the non-semantic data in STEP Physical Files to support more deterministic documents. In some situations this simplifies model comparisons; I am not saying this is the right approach, only that it happens.)
I hope I have at least made the issue clearer now.
Greetings, Michel
From: topbrai...@googlegroups.com [mailto:topbrai...@googlegroups.com] On Behalf Of Irene Polikoff
Sent: Thursday 13 July 2017 20:32
To: topbrai...@googlegroups.com
Subject: Re: [topbraid-users] tbc ttl file questions
Michel,
Serialization and deserialization provide a way for data to be translated into a format that can be used for transmission, interchange, storage in a file system, etc., with the ability for it to be reconstructed later to create a semantically identical clone of the data.
The goal of RDF serializations and tool interoperability is to ensure that if tool A produces a serialization of a graph X, tool B can read it in and understand it as graph X. Tool B can then, in its turn, produce a serialization of graph X, tool A can import it, and it is still the same graph. The serialization output of A may not look exactly the same as the serialization output of B, but their semantic interpretation is always the same.
The serialization/deserialization process is not intended to ensure that the sequence of bytes in a file will be exactly the same. In the case of both the RDF/XML and Turtle formats, there are several syntactic variations for representing the same information. The simplest RDF serialization is N-Triples. There is little room in it for syntactic variations as it just contains triple statements. However, even with that simplicity, there are variants since the order of statements may vary. The bottom line is that if you are using serializations for interchange and parse them to deserialize for use in some target system, you need a parser that will understand what the serialization means semantically and will not rely purely on the byte sequence.
If the TBC parser were ignoring something that captured the semantics of the data, this would be a bug. I do not think that is the case. The comma is not ignored; it is correctly understood by deserialization when data is imported into TBC. “Deleting it” is not even a concept, because once data is deserialized, the comma no longer exists. We now have a graph. When you save it, it is serialized anew, without any memory or consideration of how its serialization looked when it came in.
As long as the serialization still represents a semantically identical object, it is correct.
Regards,
Irene Polikoff
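To make the round-trip argument concrete, here is a minimal Python sketch. It is not how TBC or any real Turtle parser works: the parser handles only a made-up one-statement-per-line subset, and the names `parse_ntriples_like` and `serialize` are hypothetical. It only illustrates that comments and statement order are gone once the text has become a graph, while the graph itself survives the round trip.

```python
def parse_ntriples_like(text):
    """Parse a tiny, assumed subset: one 's p o .' statement per line.

    Blank lines and whole-line '#' comments are skipped, so they never
    reach the graph. Terms are assumed to contain no whitespace.
    """
    graph = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        s, p, o = line.rstrip(".").split()
        graph.add((s, p, o))
    return graph


def serialize(graph):
    """Write the graph back out; this writer chooses its own (sorted) order."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(graph)) + "\n"


original = """\
# Definitions from other ontologies
ex:b ex:p ex:c .
ex:a ex:p ex:b .
"""

graph_in_tool_a = parse_ntriples_like(original)
rewritten = serialize(graph_in_tool_a)
graph_in_tool_b = parse_ntriples_like(rewritten)

print(rewritten != original)               # do the bytes differ?
print(graph_in_tool_a == graph_in_tool_b)  # is it still the same graph?
```

Because the graph is held as a set, there is simply no place where the original comment or ordering could be remembered, which is the point made above.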
On Jul 13, 2017, at 4:13 AM, Bohms, H.M. (Michel) <michel...@tno.nl> wrote:
Seriously, if these low-level details of the TTL syntax are relevant to you, just use text editors.
- Yes, low-level syntax issues ARE very relevant. They are the foundation of everything we do in the end. When we convince our clients to move from SPFF or XML to RDF and its serializations, they expect implementations that support these specs 100%. If a comment is a feature of that spec, if a comma is a feature of that spec, they do not expect a parser and/or writer to ignore or even delete them. Anyway, as said before, let’s agree to disagree (although your views in these matters highly surprise me, I must say).
Right, and I am actually in favour of that.
The fact is that not all serialisations follow that idea. Turtle and RDF/XML do have comments and people ARE using them in practice (cmo, prov, …).
So let’s conclude:
“Despite being offered the possibility, better not use comments in your RDF Documents (or assume any order)”.
In a sense these things are one order worse than annotations: where annotations are processed but not interpreted, comments are not even processed.
So we agree on the “what should be”, but differ on the use in practice. That’s fine for me. Maybe you can consider adding a warning on save in your system: “when saving, not all features of your original input will be retained”.
Thx for the discussion,
Michel
From: topbrai...@googlegroups.com [mailto:topbrai...@googlegroups.com] On Behalf Of Holger Knublauch
Sent: Friday 14 July 2017 00:05
To: topbrai...@googlegroups.com
Subject: Re: [topbraid-users] tbc ttl file questions
FWIW some formats such as JSON (and thus JSON-LD) don't even support comments. The philosophy behind that is that every piece of information should become data and not be hidden in a specific serialization.
Holger
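One possible reading of "every piece of information should become data" is sketched below in Python. This is purely an illustration under stated assumptions, not a TBC feature: the function name `comments_as_annotations` and the subject `<http://example.org/onto>` are hypothetical, and attaching every comment to a single subject via `rdfs:comment` is just one choice among many. Only whole-line `#` comments are handled.

```python
def comments_as_annotations(turtle_text, subject="<http://example.org/onto>"):
    """Turn whole-line '#' comments into rdfs:comment statements on `subject`.

    Only lines whose first non-blank character is '#' count as comments,
    so a '#' inside an IRI such as <http://www.w3.org/ns/prov-dc#> on a
    statement line is left alone.
    """
    statements = []
    for line in turtle_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            text = stripped.lstrip("#").strip()
            # Escape backslashes and double quotes for a Turtle string literal.
            escaped = text.replace("\\", "\\\\").replace('"', '\\"')
            statements.append(f'{subject} rdfs:comment "{escaped}" .')
    return statements


example = """\
## Definitions from other ontologies
ex:a ex:p <http://www.w3.org/ns/prov-dc#> .
"""

for statement in comments_as_annotations(example):
    print(statement)
```

Once a remark is an ordinary triple like this, it survives parsing and saving in any tool, at the cost of losing its position in the file.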
See my previous reply.
The ordering issue is really the same as the comment issue (both being non-semantic aspects of an RDF document).
The comma issue is something very different and (maybe you noticed 😊) I have a very strong opinion here.
I will try to reformulate one more time and then shut up.
Note 1: time-complexity typically works the other way round (space-time complexity is constant), but that wasn’t the issue here. If time-complexity IS the issue, you had better forget Turtle and stick with plain triples. (This is true in general; in theory it depends on the balance between the time to read an item and the time to process it: better space-complexity means less to read, but the actual processing normally costs more than you gain.)
Note 2: a, b and c all replace strings by predefined characters (; , []), so the end result is always shorter than the original.
In my earlier example:
Triples:
a1 b1 c1.
a1 b2 c2.
a1 b2 c3.
a1 b3 c4.
c4 b4 c5.
Applying a, b and c:
a1 b1 c1; b2 c2, c3; b3 [c4 b4 c5].
Applying only a and c (so leaving out b, the comma):
a1 b1 c1; b2 c2 ; b2 c3; b3 [c4 b4 c5].
As can be seen, the second version is longer than the first one.
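The length comparison can also be checked mechanically. The Python sketch below is only an illustration under assumptions: the function names are made up, the `[…]` nesting step is left out for brevity (so `c4 b4 c5.` stays a separate statement), and the input is assumed to be already sorted so that equal subjects and predicates are adjacent.

```python
from itertools import groupby

TRIPLES = [
    ("a1", "b1", "c1"),
    ("a1", "b2", "c2"),
    ("a1", "b2", "c3"),
    ("a1", "b3", "c4"),
    ("c4", "b4", "c5"),
]


def write_plain(triples):
    """One full 's p o.' statement per triple, no abbreviation."""
    return " ".join(f"{s} {p} {o}." for s, p, o in triples)


def write_abbreviated(triples, use_comma=True):
    """Reuse the subject with ';' and, optionally, the predicate with ','."""
    statements = []
    for subject, by_subject in groupby(triples, key=lambda t: t[0]):
        parts = []
        for predicate, by_predicate in groupby(by_subject, key=lambda t: t[1]):
            objects = [o for _, _, o in by_predicate]
            if use_comma:
                parts.append(f"{predicate} {', '.join(objects)}")
            else:
                parts.extend(f"{predicate} {o}" for o in objects)
        statements.append(f"{subject} {'; '.join(parts)}.")
    return " ".join(statements)


plain = write_plain(TRIPLES)
with_comma = write_abbreviated(TRIPLES, use_comma=True)
without_comma = write_abbreviated(TRIPLES, use_comma=False)

for label, text in [("plain", plain), ("; only", without_comma), ("; and ,", with_comma)]:
    print(f"{label:8} {len(text):3}  {text}")
```

In this sketch the output that uses both `;` and `,` comes out shorter than the one using `;` alone, and both are shorter than the plain triples, which matches the space argument made here.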
David said earlier: “efficient” is in the eye of the beholder.
This is of course related. “Efficient” only makes sense if you specify space or time. Here we talked about space-complexity/efficiency.
It could of course be that for some reason you optimized space and time complexity by just leaving out “b”.
Gr Michel
On 14 Jul 2017, at 09:13, Bohms, H.M. (Michel) <michel...@tno.nl> wrote:
- A directed graph when represented as triples has bad space-complexity.
- Turtle improves the space-complexity by systematically shortcutting the three components of a triple:
- (a) Reuse of a subject > ;
- (b) Reuse of a predicate > ,
- (c) Reuse of an object as the next subject > […]