Loading XML file into Semantic XML taking hours

Tim Smith

Jun 18, 2020, 10:48:06 PM
to topbrai...@googlegroups.com
Hi,

I am attempting to open an XML file in TBC 6.4 beta on Windows 10. The file is 840,000 rows and ~50MB on disk. -Xmx is set to 28GB.
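For reference, the heap setting is in the Eclipse-style .ini file next to the TBC executable (the exact file name may differ per install), in the section after -vmargs:

-vmargs
-Xmx28g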

TBC has been chewing CPU for over three hours. Memory consumption fluctuates between ~5GB and ~8GB, so I don't think it is memory-bound.

Is there any way to tell if it's making progress? Is this file too large to load?

Thanks,

Tim

Rob Atkinson

Jun 18, 2020, 11:05:22 PM
to TopBraid Suite Users

I'm not sure whether it will work or not, but a warning: AFAIK, if you leave an XML file in an open project in a workspace, it is parsed on TBC startup, and this can take a long time too. It may only apply to XSD files; life is too short to check this.

Tim Smith

Jun 18, 2020, 11:15:53 PM
to topbrai...@googlegroups.com
Thanks for the heads-up, Rob. I'll remove it from the workspace before I restart TBC. For now, it gets to run overnight.

Holger Knublauch

Jun 18, 2020, 11:21:02 PM
to topbrai...@googlegroups.com

Yeah, in general I wouldn't be surprised if a file of that size doesn't work well with Semantic XML. There are lots of layers of indirection and local memory. It would be best handled with a streaming parser, which we probably don't have integrated well enough. Are there any other tools you could use to preprocess the XML? For example, define an XSLT to convert it to RDF/XML?
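Just to sketch the streaming idea (outside of TBC, in Python with rdflib; the element names and namespace below are made up and would have to match the actual export schema):

# Walk the XML with iterparse so the file is never held as a full DOM,
# emit triples as elements complete, and clear them to keep memory roughly flat.
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/plc#")  # placeholder namespace
g = Graph()

for event, elem in ET.iterparse("ladder_logic_dump.xml", events=("end",)):
    if elem.tag == "Tag":  # hypothetical element of interest
        uri = URIRef(EX[elem.get("name", "unnamed")])
        g.add((uri, RDF.type, EX.Tag))
        if elem.text:
            g.add((uri, EX["value"], Literal(elem.text.strip())))
        elem.clear()  # drop the finished subtree

g.serialize("ladder_logic_dump.ttl", format="turtle")

Something along those lines should keep memory roughly constant regardless of file size, and the resulting Turtle can be imported directly.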

Holger

Tim Smith

Jun 19, 2020, 12:03:16 AM
to topbrai...@googlegroups.com
Hi Holger,

Well, it did finish after 4+ hours. I will certainly look at other tools; Haskell was recommended and looks promising. In the meantime, I saved the triples into an RDF file so I can analyze them tomorrow.

If this works, I may need to do it on a regular basis. The file is an XML dump of a ladder logic program. I'm hoping to extract the data and alarm flows into and out of the PLC that runs the program and populate EDG. This would be a big step towards automating the modeling of our OT layer. I may also be able to capture and model the control loops.

Tim
