File conversion from .ttl to .rdf format fails

Rich Keller

Sep 22, 2017, 8:41:25 PM
to TopBraid Suite Users

Hi. I am using the TBC Export/Merge/Convert wizard to convert some of my Turtle files to RDF/XML. Most of the files convert without a problem, but three of them cause TBCME to freeze. I tried to get information from the log file, but on most attempts nothing was written. I did manage to capture an exception in one case, while converting a file named SectorLocationInst.ttl. The Turtle file is presumably well-formed, since it is readable by TBCME and other tools, yet Jena is complaining about improper formatting (in the generated RDF file, not the Turtle file). Here is the trace. The file could be sent off list, if needed.

Rich

---------------------

!ENTRY org.topbraid.core 4 0 2017-09-22 14:55:58.793
!MESSAGE While reading: http://atmweb.arc.nasa.gov/ontology/SectorLocationInst from L/DW/SectorLocationInst.rdf
!STACK 0
org.apache.jena.riot.RiotException: [line: 1569, col: 55] XML document structures must start and end within the same entity.
    at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
    at org.apache.jena.riot.lang.LangRDFXML$ErrorHandlerBridge.fatalError(LangRDFXML.java:238)
    at org.apache.jena.rdfxml.xmlinput.impl.ARPSaxErrorHandler.fatalError(ARPSaxErrorHandler.java:47)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.warning(XMLHandler.java:199)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.fatalError(XMLHandler.java:229)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.endEntity(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl.endEntity(Unknown Source)
    at org.apache.xerces.impl.XMLEntityManager.endEntity(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.scanLiteral(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.scanAttributeValue(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanAttribute(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.jena.rdfxml.xmlinput.impl.RDFXMLParser.parse(RDFXMLParser.java:150)
    at org.apache.jena.rdfxml.xmlinput.ARP.load(ARP.java:118)
    at org.apache.jena.riot.lang.LangRDFXML.parse(LangRDFXML.java:134)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:178)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:859)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:259)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:245)
    at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:69)
    at org.apache.jena.rdf.model.impl.ModelCom.read(ModelCom.java:305)
    at org.topbraid.core.registry.GraphSource.loadModel(GraphSource.java:148)
    at org.topbraid.core.registry.IFileGraphSource.loadModel(IFileGraphSource.java:109)
    at org.topbraid.core.registry.GraphSource.loadPhysicalSource(GraphSource.java:118)
    at org.topbraid.core.registry.GraphSource.loadGraph(GraphSource.java:70)
    at org.topbraid.core.io.IO.loadAndRegister(IO.java:327)
    at org.topbraid.core.io.IO.load(IO.java:292)
    at org.topbraidcomposer.core.io.TBCIO.loadModel(TBCIO.java:340)
    at org.topbraidcomposer.core.io.TBCIO.loadModel(TBCIO.java:375)
    at org.topbraidcomposer.core.io.TBCIO$2$1.run(TBCIO.java:428)
    at java.lang.Thread.run(Thread.java:745)

Andy Seaborne

Sep 23, 2017, 8:30:33 AM
to TopBraid Suite Users
Hi Rich,

It appears the file is being read because it is in the TBC workspace.

Is there a log message about the base URI already being in use?

Some possibilities ("guesses"):
  • the RDF/XML file is truncated
  • the RDF/XML file is being read before it has been fully written
  • there are illegal characters getting into the XML tags
Knowing what's in SectorLocationInst.rdf, as well as the TTL file, would help. But if the problem is due to timing, the .rdf file may end up valid on disk, because it was only partially written at the moment it was read.

You could try to run the Jena tools directly as a check:

riot --pretty RDF/XML YourData.ttl > YourData.rdf
riot --validate YourData.rdf

    Andy
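
If the Jena command-line tools are not at hand, the "truncated file" guess above can also be checked with a plain XML well-formedness test. The following is a minimal sketch in Python using only the standard library; the function name `check_well_formed` is made up for this illustration, and it only checks XML well-formedness, not whether the content is valid RDF/XML. It uses Python's built-in expat parser rather than Xerces, so the wording of any error will differ from the trace above.

```python
import xml.sax


def check_well_formed(path):
    """Return (True, None) if the file parses as XML, else (False, message).

    Hypothetical helper: checks XML well-formedness only, not RDF/XML validity.
    """
    try:
        # The base ContentHandler ignores every event; only parse errors matter here.
        xml.sax.parse(path, xml.sax.ContentHandler())
        return True, None
    except xml.sax.SAXParseException as exc:
        where = f"line {exc.getLineNumber()}, col {exc.getColumnNumber()}"
        return False, f"{where}: {exc.getMessage()}"


if __name__ == "__main__":
    import sys

    # Usage: python check_xml.py SectorLocationInst.rdf
    ok, message = check_well_formed(sys.argv[1])
    print("well-formed" if ok else f"not well-formed ({message})")
```

A file cut off partway through an element or attribute should be reported as not well-formed, along with the position where the parser gave up.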


Andy Seaborne

Sep 27, 2017, 9:00:42 AM
to TopBraid Suite Users
Here is an update: Rich is having problems sending messages to the group.

The TTL data files take a very long time (minutes) to be written as RDF/XML in the pretty format. This seems to be because there are a large number of possible ways to write the data and every possibility is being investigated. Rich (on Mac) then sees either no output or, occasionally, truncated output. I don't get that effect (I'm on Linux), but the TBC-Eclipse process is locked up while writing. (Dialog boxes for Eclipse and Eclipse-based applications work completely differently on these two systems.) The Eclipse platform does scan for files in the background, and that triggers refresh actions in TBC, which read the file while it is still being written. This would explain what Rich is seeing.

In addition, in several versions of TBC, whichever export file format for RDF/XML is chosen, it ends up using the RDF/XML pretty writer. That's not meant to happen: there are two options in the dropdown, one of which is supposed to be pretty (AKA "abbreviated") and the other the basic block-oriented form, which does not run the computationally expensive formatting algorithm.

    Andy
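
The race described above (a background scan reading a file that is still being written) is a general problem, and one common way to avoid it is to write to a temporary file and rename it into place once it is complete. The sketch below illustrates that general technique in Python. It is not what TBC or Jena does, and the name `write_atomically` is invented for this example.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write bytes to `path` so that a concurrent reader sees either the old
    file or the complete new file, never a partial one.

    Hypothetical helper illustrating the write-then-rename pattern.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, a slow serialization only delays when the finished file appears; a scanner watching the directory may still notice the temporary file, so it would need to ignore names ending in `.tmp`.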