import job trouble

Bonnie MacKellar

Jul 29, 2014, 5:23:47 PM
to ld...@googlegroups.com
Hi,
I am extremely new to LDIF and am trying to use it on some N-Quads files from Bio2RDF. I have not gotten very far because I cannot get one of the files to load at all. This file is around 5.5 GB and contains at least 17 million quads. The other files I have tried, which are much smaller, load without problems. The import job starts, as can be seen from this console output:

Import Job Sider_meddra_freq_parsed.0 started (quad / hourly)
[INFO] Loading from /home/bonnie/ldif-0.5.2/examples/mytest/dumps/sider-meddra_freq_parsed.nq

and the status monitor shows it loading quads fairly quickly until it reaches around 15.5 to 16 million quads (the exact point has varied between runs). At that point the import job slows to a crawl and eventually stops adding quads, although the process is still running and holding a lot of memory. As far as I can tell, it must be hanging on something. Since I am new to this tool, I am at a loss as to how to debug this. I am not getting any informative output on the console. The message above is the last thing to appear.
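One way to check whether the input file itself triggers the stall is to look at the lines around the point where the import stops. The following is only a minimal sketch in Python, not part of LDIF: `scan_quads` and its parameters are hypothetical names, and the "statement ends with a period" test is a rough heuristic rather than a real N-Quads parser. It counts the lines in the file, reports the longest lines at or after a chosen line number, and lists lines that do not look like complete statements.

```python
import heapq


def scan_quads(path, start_line=1, top_n=5, max_suspicious=20):
    """Scan an N-Quads file line by line without loading it into memory.

    Returns (total_lines, longest, suspicious) where `longest` is a list of
    (byte_length, line_number) pairs and `suspicious` is a list of line
    numbers whose statement does not end with '.' (heuristic check only).
    """
    heap = []        # min-heap holding the top_n longest lines seen so far
    suspicious = []  # line numbers that do not look like complete statements
    total = 0
    with open(path, "rb") as f:
        for total, raw in enumerate(f, start=1):
            if total < start_line:
                continue  # skip the part of the file that loads fine
            heapq.heappush(heap, (len(raw), total))
            if len(heap) > top_n:
                heapq.heappop(heap)  # drop the shortest of the candidates
            stmt = raw.strip()
            is_statement = stmt and not stmt.startswith(b"#")
            if (is_statement and not stmt.endswith(b".")
                    and len(suspicious) < max_suspicious):
                suspicious.append(total)
    return total, sorted(heap, reverse=True), suspicious
```

For the file above, a call such as `scan_quads("/home/bonnie/ldif-0.5.2/examples/mytest/dumps/sider-meddra_freq_parsed.nq", start_line=15_000_000)` would restrict the report to the region where the import slows down. An unusually long line or a cluster of incomplete statements there would point at the data. If nothing stands out, the cause is more likely in the import process itself.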

I have tried running this with both the in-memory and the triple-store-backed versions, and there is no difference. It appears to me that the import process is independent of the triple store backend. I am running on an Ubuntu machine with 16 GB of memory. As long as I run with the maximum heap size set to 10 GB, it does not run out of memory, so that is not the problem.

Any advice?

thanks,
Bonnie MacKellar
