Parsing trig file


devm...@gmail.com

Jan 5, 2018, 8:41:18 AM
to rdflib-dev
Hey everyone, I am pretty new to rdflib. I am trying to parse a TriG file and am running into some issues. My goal is to build a tool to help clean my triples. I have a Jena 2 triple store that I exported to TriG using the Jena tools. The export has some issues I need to fix before I can load it back into my database, and I am hoping rdflib will let me do that.


The first file is the complete dump of my triple store and is about 4.1 GB. When I attempt to parse it with:

from rdflib import Graph

e = Graph()
e.parse("content.trig", format="trig")

I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/trip_cleaner/venv/lib/python3.6/site-packages/rdflib/graph.py", line 1043, in parse
    parser.parse(source, self, **args)
  File "/trip_cleaner/venv/lib/python3.6/site-packages/rdflib/plugins/parsers/trig.py", line 161, in parse
    p.loadStream(source.getByteStream())
  File "/trip_cleaner/venv/lib/python3.6/site-packages/rdflib/plugins/parsers/notation3.py", line 434, in loadStream
    return self.loadBuf(stream.read())  # Not ideal
OSError: [Errno 22] Invalid argument
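
The traceback shows the parser calling stream.read() once on the whole 4.1 GB file. One possible cause (an assumption, not a confirmed diagnosis) is that some platform and Python combinations reject a single read of more than about 2 GiB with Errno 22. If that is what is happening here, reading the file in bounded chunks may avoid the error. A minimal sketch using only the standard library; read_in_chunks and its chunk_size default are hypothetical names chosen for illustration, and the demo uses a small temporary file rather than the real dump:

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=64 * 1024 * 1024):
    """Read a whole file into bytes using many bounded reads
    instead of one very large read() call."""
    parts = []
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            parts.append(chunk)
    return b"".join(parts)


# Small demonstration with a temporary file and a tiny chunk size.
payload = b"<http://example.org/a> <http://example.org/b> <http://example.org/c> .\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

result = read_in_chunks(demo_path, chunk_size=4)
os.remove(demo_path)
print(len(result))
```

The resulting bytes could then be handed to rdflib through the data argument of parse() instead of a file name. Note that this still holds the entire file in memory, just as the single read() in the traceback would.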


The second file is a drastically reduced version of content.trig, containing just the first 1000 lines (~250 triples). It parses "successfully", but the graph is empty afterwards: len(e) is 0.


Any idea on where I am going wrong?


gals...@gmail.com

Jun 22, 2018, 8:51:45 AM
to rdflib-dev
On Friday, January 5, 2018 at 15:41:18 UTC+2, devm...@gmail.com wrote:
I have the same problem with a dbpedia file, did you happen to solve it?