Separate namespaces (and possibly graphs) for instance data and model on XML import


Rob Atkinson

Aug 14, 2019, 8:07:05 PM
to TopBraid Suite Users
Checking I'm not missing something here: can I import XML and have the model (classes and properties) in a different namespace than the instance data?

When using the <sml:ConvertXMLToRDF> tag (NB: I am not able to use the XML import wizard, as I need a repeatable process under program control)

you have the option of specifying an XSD.

However, AFAICT it forces both the model (elements => class and property mappings) and the instances into the same namespace, in the same output graph.

This is somewhat inconsistent with improvements elsewhere to separate model and instances.

I can deal with this by post-processing to strip the model triples into a separate graph and update all the namespaces, but if there were already a way to control different namespaces for model and instances it would be much better.

Rob Atkinson

Holger Knublauch

Aug 14, 2019, 8:41:59 PM
to topbrai...@googlegroups.com

Hi Rob,

If you already have existing classes and properties with sxml: annotations, then the algorithm should reuse those instead of creating new classes. See the comment at sml:ConvertXMLToRDF:

Converts an arbitrary XML input document into an RDF graph using the Semantic XML mapping approach. The input graph of this module may contain class definitions that have sxml: declarations attached to them and these will be used for the instances. For more, see Help > Import and Export > Creating, Importing, Querying, Saving XML documents with Semantic XML.

One approach to produce a suitably annotated ontology is to import an XSD, another is to import an XML instance file, then delete the instances and adjust the namespaces. Use the resulting file as input to the sml:ConvertXMLToRDF step.

Holger

--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/topbraid-users/c5e86483-2869-4696-96a4-fb2991ec80db%40googlegroups.com.

Rob Atkinson

Aug 15, 2019, 1:36:53 AM
to TopBraid Suite Users

I have tried reading this a few times and am still a little lost...

1) I don't see how the context "If a process ontology is used to control the Semantic XML mapping" relates to the arguments for

sml:ConvertXMLToRDF (sm:Module)

Converts an arbitrary XML input document into an RDF graph using the Semantic XML mapping approach. The input graph of this module may contain class definitions that have sxml: declarations attached to them and these will be used for the instances. For more, see Help > Import and Export > Creating, Importing, Querying, Saving XML documents with Semantic XML.

 
  Template Module

See Also

Arguments

sml:baseURI: The base URI of the new RDF (for the creation of the new class and property names).
sml:replace (xsd:boolean): [Optional] If true then the resulting output graph will not include the input graph, i.e. only the new triples will be returned.
sml:xml: The XML document that shall be converted to RDF. To avoid character encoding issues, we strongly recommend this value to be a reference to an already parsed XML document, and not a literal. In other words, use "Add SPARQL expression" from the drop down menu and enter ?varName and do not use a string value such as {?varName}. The actual document parsing should be handled by predecessing modules such as sml:ImportXMLFromURL.
sml:xmlType (xsd:string): [Optional] An (optional) type indicator for the Semantic XML conversion. Current supported values are "XHTML" (treats the input as HTML source, and may run a tidy algorithm in case the HTML is not well-formed XHTML).

2) I think I can follow how to manually construct an RDFS class model with sxml: annotations, but an example would be really helpful here!

3) I don't really see why you couldn't take an XSD and map it to a namespace rather than having to do this manually. I am trying to minimise the number of 'unnatural acts' needed for someone who knows XML to see an equivalent RDF model.

So with the current state, if there really is a way to use the SXML annotations in ConvertXMLToRDF, then I think I'd need to build a pre-processing step to read the XSD, generate the equivalent process model in a target namespace and inject these annotations. Note that TQ is inspecting the XSD and building those classes anyway, so I guess this is where you suggest throwing away the instances to get these classes.

I get the feeling all the pieces I need are there, but I can't quite get a handle on how to access them individually in the right order :-)



Irene Polikoff

Aug 15, 2019, 8:38:46 AM
to topbrai...@googlegroups.com
Rob,

As I understand it, Holger’s recommendation is to first import the XSD. This will create an annotated ontology.

If you do not have an XSD, then import the XML file and delete all instances from the resulting RDF, leaving only classes and properties.

This will automatically create the annotated ontology, which can then be used to import XML, achieving separation of classes/properties and data.

These steps can be parts of a single process, following each other. When sml:ConvertXMLToRDF is invoked the second time (to convert instance data), its input graph should contain the ontology produced by the previous step.
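
The two-pass idea above can be illustrated with a small Python sketch. This is only a toy stand-in and not TopBraid's algorithm: `toy_convert`, the `element_to_class` mapping (playing the role of the sxml: annotations), the `r-N` instance names and the example.org namespaces are all hypothetical, and triples are plain tuples of strings.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"


def toy_convert(xml_text, base_uri, element_to_class=None):
    """Toy stand-in for an XML-to-RDF step (NOT TopBraid's algorithm).

    element_to_class maps XML element names to existing class IRIs; an
    element missing from it gets a new class minted under base_uri.
    """
    mapping = dict(element_to_class or {})
    triples = []
    for index, element in enumerate(ET.fromstring(xml_text).iter()):
        if element.tag not in mapping:
            # No existing class for this element: mint one under base_uri.
            mapping[element.tag] = base_uri + element.tag
            triples.append((mapping[element.tag], RDF_TYPE, RDFS_CLASS))
        # Every element becomes an instance of its (new or reused) class.
        triples.append((f"{base_uri}r-{index}", RDF_TYPE, mapping[element.tag]))
    return triples, mapping


SAMPLE = "<order><item/><item/></order>"
MODEL_NS = "http://example.org/model#"  # hypothetical namespaces
DATA_NS = "http://example.org/data#"

# Pass 1: derive the classes in the model namespace, then drop the instances.
first_pass, element_to_class = toy_convert(SAMPLE, MODEL_NS)
model = [t for t in first_pass if t[2] == RDFS_CLASS]

# Pass 2: convert the data with the model supplied as input, so the existing
# classes are reused and only instance triples are produced.
instances, _ = toy_convert(SAMPLE, DATA_NS, element_to_class)
```

In this toy, the second pass yields instances named under the data namespace whose types point at classes in the model namespace. Whether the real module behaves the same way depends on the sxml: declarations present in its input graph.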

After the data is converted, you may want to consider running a transformation of the ontology to create shapes. We have not yet upgraded all importers to work directly with ontologies defined only using SHACL. Or you could do this right after ontology creation, provided you only generate and add shapes, without removing any other definitions that are there.

Rob Atkinson

Aug 15, 2019, 7:07:50 PM
to TopBraid Suite Users
Unfortunately, in a programmatic context "first import XSD" is not really specific enough to interpret. It could mean:

1) use a LoadFromXML and ConvertXMLToRDF sequence to get it into the current query graph
2) do it as a manual step in advance (not relevant in this context)
3) inject it as an owl:imports into some graph (but which: the current query graph or some temp graph?), then access it via graphWithImports

In general these "Use Cases" really need to be specified in terms of explicit pre-conditions and post-conditions and, if necessary, specific steps (which tag or function to actually invoke).

As it happens, I have got around it by post-processing the ConvertXMLToRDF output to force all model elements into a separate namespace and then filter them out in a separate step, which avoids loading all the instances twice.
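
For readers wanting to try the same workaround outside TopBraid, here is a rough Python sketch of that kind of post-processing. It is not the actual code used in this thread: the function name, the tuple-of-strings triple representation and the example.org namespaces are assumptions for illustration, and literals are treated as plain strings, which a real RDF library would handle properly.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MODEL_TYPES = {
    "http://www.w3.org/2000/01/rdf-schema#Class",
    "http://www.w3.org/2002/07/owl#Class",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property",
    "http://www.w3.org/2002/07/owl#ObjectProperty",
    "http://www.w3.org/2002/07/owl#DatatypeProperty",
}


def split_model_and_instances(triples, model_ns):
    """Partition triples into (model, instances), moving model terms to model_ns."""
    triples = list(triples)
    # Model terms are the subjects declared as classes or properties.
    model_terms = {s for s, p, o in triples if p == RDF_TYPE and o in MODEL_TYPES}

    def rename(term):
        # Only declared model terms are renamed; everything else is unchanged.
        if term not in model_terms:
            return term
        local = term.rsplit("#", 1)[-1].rsplit("/", 1)[-1]
        return model_ns + local

    model, instances = [], []
    for s, p, o in triples:
        # Triples about a model term go to the model graph, the rest to the data.
        target = model if s in model_terms else instances
        target.append((rename(s), rename(p), rename(o)))
    return model, instances


DATA = "http://example.org/data#"    # hypothetical namespaces
MODEL = "http://example.org/model#"
OWL = "http://www.w3.org/2002/07/owl#"
converted = [
    (DATA + "Person", RDF_TYPE, OWL + "Class"),
    (DATA + "name", RDF_TYPE, OWL + "DatatypeProperty"),
    (DATA + "r-1", RDF_TYPE, DATA + "Person"),
    (DATA + "r-1", DATA + "name", "Alice"),
]
model, instances = split_model_and_instances(converted, MODEL)
```

The split happens in a single pass over the converted output, so the instance document only has to be loaded once. References to model terms inside the instance triples (predicates and rdf:type objects) are rewritten to the new namespace at the same time.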


Holger Knublauch

Aug 15, 2019, 7:08:59 PM
to topbrai...@googlegroups.com

In addition to what Irene said, if the pre-built tooling doesn't fit your needs, there is always the option of going through third-party technologies such as XSL transformations (there is an SM module to execute those if you have a script to produce RDF/XML).

Holger

Rob Atkinson

Aug 15, 2019, 8:24:57 PM
to TopBraid Suite Users

I can deal with it (and have done), but it was just a shame not to be able to access the internal components that seem to do this job anyway.