Stardog data add failure


Tze-John Tang

Jun 23, 2014, 10:50:07 PM
to sta...@clarkparsia.com
I was running stardog data add from the command line and I received a Java heap space error. Is this coming from the JVM heap running out of space, or is it the disk-based heap running out of space? And if it is disk, is there any way to point it to some location other than /tmp? /tmp is sitting at 90%, so I don't think it is getting full. I set the Stardog client max heap to 4GB and it errored out, so I am going to try 8GB and see if that helps.
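
For reference, I set the client heap by exporting JVM options before invoking the CLI; if I remember the variable name right, it was something like this (double-check it against the docs):

export STARDOG_JAVA_ARGS="-Xmx4g"
stardog data add --named-graph urn:x-abbvie:libra infrastructureDb root.n3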

When I shut down and restart the server, I see the warning:

Jun 23, 2014 9:47:47 PM com.complexible.stardog.index.disk.statistics.DiskCharacteristicSetsStatisticsLoader doLoad
WARNING: Statistics for database infrastructureDb cannot be loaded due to missing files (probably need to be rebuilt)

Do I need to do something to update the stats, or do I need to load the data again, or what? When I start up the server after the error, it takes a while to start. If I shut down and start up again, then it is instantaneous... but I still get the warning. I don't see anything show up in the server stardog.log file.

Thanks.

-tj

Tze-John Tang

Jun 24, 2014, 12:03:31 PM
to sta...@clarkparsia.com
I updated the max heap size on the server side to 4GB, and now I no longer get the Java heap error on load. But for some reason, after spending time importing the data, it comes back with no triples added and no errors. It did add an earlier file that was also large.

websvc@ua00017p:/opt/apps/stardog_data/r2rml-parser-0.5-alpha> stardog data add --named-graph urn:x-abbvie:libra infrastructureDb root.n3
Adding data from file: root.n3
Added 5,603,362 RDF triples.
websvc@ua00017p:/opt/apps/stardog_data/r2rml-parser-0.5-alpha> stardog data add --named-graph urn:x-abbvie:libra infrastructureDb anumber.n3
Adding data from file: anumber.n3
Added 0 RDF triples.

This file probably has about 7 million triples.

-tj

Evren Sirin

Jun 24, 2014, 12:20:21 PM
to Stardog
The message "Added 0 RDF triples" is probably because the triples were
already in the database. The add command will report the actual
triples added to the database not the number of triples read from the
file. Are you sure this file contains new triples?

If you are getting errors due to statistics, you can try the
"stardog-admin db optimize" command, which rebuilds the statistics from
scratch.
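
For example, for the database from your earlier messages, that would
look something like:

stardog-admin db optimize infrastructureDb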

Best,
Evren

Tze-John Tang

Jun 24, 2014, 2:32:13 PM
to sta...@clarkparsia.com
Evren,


The message "Added 0 RDF triples" is probably because the triples were
already in the database. The add command will report the actual
triples added to the database not the number of triples read from the
file. Are you sure this file contains new triples?


I am sure that these triples do not exist, because I cleared the graph before I loaded. Though I did load two other files before it (neither of which should have the same triples). I will clear it out and try again, and let you know.
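
For what it's worth, I cleared the graph with something along these lines (the exact data remove invocation is from memory, so double-check it):

stardog data remove --named-graph urn:x-abbvie:libra infrastructureDb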

Thanks.

-tj

Tze-John Tang

Jun 24, 2014, 2:35:26 PM
to sta...@clarkparsia.com
Maybe the triples did load, and it is just saying 0... I will check. I started out with about 1 million triples in the store. Then I loaded 5,000, and then, as referenced in my first post, I loaded 5 million triples. If I look at the db stats page, I see that there are now 18 million triples, which might be the case if I include the one I just loaded.

-tj

Tze-John Tang

Jun 24, 2014, 4:36:36 PM
to sta...@clarkparsia.com
Ok. I cleared the graph and loaded again, and now it gives me the number of triples loaded. What I am thinking is that when I had the Java heap error, it actually did load the triples but had problems completing the load. So when I upped the memory and it stopped crashing, the actual number of triples left to add ended up being 0.

-tj

Tze-John Tang

Jun 24, 2014, 10:37:39 PM
to sta...@clarkparsia.com


After the load, I now have about 18 million triples. But the CPU is stuck in some kind of high-iowait load, and the Stardog server is responding very slowly. I stopped and restarted the server, and now the Stardog process is taking up 100% of the CPU (not iowait). The stardog.log file is not showing me anything. Is there any way to up the logging level so that I can see what is going on?

-tj

Mike Grove

Jun 25, 2014, 7:27:24 AM
to stardog
It might be easier to grab the PID of the process and use jstack to see what the threads are doing.
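
Something like this, redirecting the output to a file you can look through (<pid> being the Stardog server's process id):

jstack <pid> > stardog-threads.txt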

Cheers,

Mike



Tze-John Tang

Jun 25, 2014, 11:48:46 AM
to sta...@clarkparsia.com
I'll try that next time. Essentially, I could not start up the store again, so I dropped the db that had the issue, created a new one (this time without search enabled), and reloaded it. This time around, it is ok.
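
If it helps, the create command I ran was roughly the following (the search option name is from memory, and I reused the old db name here for illustration, so treat it as approximate):

stardog-admin db create -n infrastructureDb -o search.enabled=false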

It is possible that there were issues related to the heap space errors I received doing the data loads originally, which might have corrupted the db.

-tj

Mike Grove

Jun 25, 2014, 11:51:33 AM
to stardog
On Wed, Jun 25, 2014 at 11:48 AM, Tze-John Tang <tzejoh...@gmail.com> wrote:
> I'll try that next time. Essentially, I could not start up the store again, so I dropped the db that had the issue, created a new one (this time without search enabled), and reloaded it. This time around, it is ok.

Was search set up to index asynchronously?  That could explain the CPU usage if it was indexing the database in the background.  Similarly, it could have been the process that updates the database statistics in the background. 

Cheers,

Mike
 


Kendall Clark

Jun 25, 2014, 11:51:40 AM
to stardog
TJ,

Would it be possible to share that data source with us so we could try to reproduce?

Cheers,
Kendall



Tze-John Tang

Jun 25, 2014, 12:11:14 PM
to sta...@clarkparsia.com
Sure, if you have some FTP site somewhere, I can dump it there for you, provided that only a select few have access and that it gets deleted soon after, as it is all internal data.

Thanks,
-tj