db create performance issue with icv.enabled and recursive property chains

Conrad Leonard

Oct 17, 2014, 12:47:36 AM
to sta...@clarkparsia.com
Hi,
I'm creating a db with a relatively small ontology: hundreds of triples in the named schema graph, nothing initially in the default graph, running with Xmx=8g.
The symptom: if I create the database with "icv.enabled=true, icv.reasoning.type=RL", the create command returns within 10 seconds or so with a success message, but the Stardog Java process stays at 100% CPU for a long time (over an hour). During this time the database is unresponsive: I cannot query it or reach it with any stardog-admin command. If I create the database with icv.enabled=false, the symptom does not occur initially, but after taking the database offline, setting icv.enabled to true and bringing it back online, the same thing happens. So it has something to do with enabling ICV guard mode on this particular ontology. I should add that at this point there are no ICV rules in place; only ICV guard mode has been set to true.

I tentatively attribute this to the recursive propertyChainAxiom statements on object properties in the ontology, of which there are a dozen or so. By "recursive" I mean that the property appears in its own chain. Example:
    <owl:ObjectProperty rdf:about="grafli:isCaptureKitOf">
        <owl:propertyChainAxiom rdf:parseType="Collection">
            <rdf:Description rdf:about="grafli:isCaptureKitOf"/>
            <rdf:Description rdf:about="grafli:hasOutput"/>
        </owl:propertyChainAxiom>
    </owl:ObjectProperty>
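
For anyone wanting to count such axioms in their own ontology, here is a minimal sketch using only the Python standard library. It assumes the RDF/XML is serialized in the same shape as the example above (chain members written as child elements with rdf:about inside a parseType="Collection" block); other serializations of the same axiom would need a real RDF parser. The function name recursive_chains and the SAMPLE wrapper document are made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"


def recursive_chains(rdfxml_text):
    """Return properties whose own identifier appears in one of their chains.

    Assumption: chains are written as in the example above, i.e. an
    owl:propertyChainAxiom child whose members carry rdf:about.
    """
    root = ET.fromstring(rdfxml_text)
    found = []
    for prop in root.iter(f"{{{OWL}}}ObjectProperty"):
        about = prop.get(f"{{{RDF}}}about")
        if about is None:
            continue
        for chain in prop.findall(f"{{{OWL}}}propertyChainAxiom"):
            members = [m.get(f"{{{RDF}}}about") for m in chain]
            if about in members:
                found.append(about)
                break  # count each property once
    return found


# Hypothetical wrapper document around the snippet from the post.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
    <owl:ObjectProperty rdf:about="grafli:isCaptureKitOf">
        <owl:propertyChainAxiom rdf:parseType="Collection">
            <rdf:Description rdf:about="grafli:isCaptureKitOf"/>
            <rdf:Description rdf:about="grafli:hasOutput"/>
        </owl:propertyChainAxiom>
    </owl:ObjectProperty>
    <owl:ObjectProperty rdf:about="grafli:hasOutput"/>
</rdf:RDF>"""

if __name__ == "__main__":
    print(recursive_chains(SAMPLE))
```

The length of the returned list corresponds to what the timing table in this post calls N_rpc.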

Removing these recursive chains one by one dramatically reduced the time during which the database was unresponsive:

N_rpc  trial_1     trial_2     trial_3
0      0m8.754s    0m8.345s    0m8.830s
1      0m8.737s    0m8.731s    0m8.669s
2      0m11.892s   0m9.855s    0m9.539s
3      0m12.098s   0m10.460s   0m12.663s
4      0m14.190s   0m15.397s   0m15.699s
5      0m14.998s   0m12.612s   0m13.846s
6      0m43.607s   0m32.458s   0m57.735s
7      0m49.735s   0m56.960s   0m42.042s
8      2m31.610s   3m32.952s   7m58.442s
9      2m18.650s   0m51.967s   4m52.401s
10     >60m.000s   10m55.271s  >20m.000s

The first column, N_rpc, is the number of recursive propertyChainAxiom statements in the ontology. The times vary considerably between trials, and individual rows are noisy, but the overall trend is the same in every trial: each additional recursive property chain tends to substantially increase the time after creation during which the db is unresponsive, presumably because it is busy computing.
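
To make the trend easier to see despite the noise, the table can be reduced to one median per row. The sketch below is only an illustration: the timings are copied from the table, entries marked ">" are runs that were stopped early, and they are treated as lower bounds, so the median of such a row should itself be read as a lower bound. The helper names to_seconds and median_by_count are made up for this example.

```python
import re
from statistics import median

# Timings copied from the table in this post (N_rpc, then three trials).
TIMINGS = """\
0 0m8.754s 0m8.345s 0m8.830s
1 0m8.737s 0m8.731s 0m8.669s
2 0m11.892s 0m9.855s 0m9.539s
3 0m12.098s 0m10.460s 0m12.663s
4 0m14.190s 0m15.397s 0m15.699s
5 0m14.998s 0m12.612s 0m13.846s
6 0m43.607s 0m32.458s 0m57.735s
7 0m49.735s 0m56.960s 0m42.042s
8 2m31.610s 3m32.952s 7m58.442s
9 2m18.650s 0m51.967s 4m52.401s
10 >60m.000s 10m55.271s >20m.000s
"""

CELL = re.compile(r"(>?)(\d+)m(\d*\.?\d*)s")


def to_seconds(cell):
    """Parse '2m31.610s' or '>60m.000s' into (seconds, is_lower_bound)."""
    m = CELL.fullmatch(cell)
    if m is None:
        raise ValueError(f"unrecognized timing: {cell!r}")
    seconds = int(m.group(2)) * 60 + float(m.group(3) or 0)
    return seconds, m.group(1) == ">"


def median_by_count(text):
    """Map N_rpc -> (median seconds, True if any trial in the row was cut off)."""
    result = {}
    for line in text.strip().splitlines():
        n, *cells = line.split()
        parsed = [to_seconds(c) for c in cells]
        seconds = [s for s, _ in parsed]
        result[int(n)] = (median(seconds), any(cut for _, cut in parsed))
    return result


if __name__ == "__main__":
    for n, (med, cut) in sorted(median_by_count(TIMINGS).items()):
        print(f"{n:>2}  {'>=' if cut else '  '}{med:9.3f}s")
```

With only three trials per row this is a rough summary, not a measurement of the growth rate.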

Is this expected behaviour, i.e. is it known that such (recursive) propertyChainAxiom statements are so costly in terms of db creation time?

regards,
Conrad.

Evren Sirin

Oct 17, 2014, 9:25:03 AM
to Stardog
By default ICV validation will check database consistency based on
your ontology using the regular OWL semantics in addition to
validating your constraints. Property chains might be expensive for
reasoning but they would not cause any work by themselves for
consistency checking unless they are relevant for a disjointness axiom
which must be the case here. You can set
icv.consistency.automatic=false in the db create command to prevent
consistency checking. Can you also share the ontology and preferably a
minimal dataset so we can take a look at what is taking so much time?
BTW, we had some improvements to property chains in the previous
release 2.2.1 so you might want to try upgrading if you are on an
older version.

Best,
Evren

Conrad Leonard

Oct 17, 2014, 8:12:52 PM
to sta...@clarkparsia.com
Hi Evren,
Please find attached a small example ontology that displays this behaviour for me. It does not contain any disjointness axioms. I am using Stardog 2.2.1. Also attached is the config file I could use to create this via the CLI, although I usually do this by POSTing JSON over HTTP.
As stated, the database creation command returns quickly, and at that point the Stardog process uses little CPU and memory. However, any further attempt to interact with that db (e.g. a query or data export) immediately pegs the Stardog Java process at 100% CPU for at least an hour; I don't know if it ever finishes because I haven't waited longer than that. Over that time memory usage climbs slowly from the initial low value to the maximum allowed by STARDOG_JAVA_ARGS.

cheers,
Conrad.
database.properties
test-tbox-plus11.owl

Mike Grove

Oct 21, 2014, 7:37:42 AM
to stardog
Thanks for sending along the test case.  It looks like it's stuck performing classification.  It eventually does complete and you can query the database, but it takes too long.  We'll look at how we can improve the performance in an upcoming version.

Cheers,

Mike