Hi,
I have been tracking down a very strange issue today that causes a query to stop working; I then have to optimize the database before the query works again. Tested on 4.2.1 and 4.2.2.
Essentially I have two components: one that uploads data, and another that queries stardog to create json files to index in elasticsearch. I've created a whole system that can deal with data being loaded in an arbitrary order. An incoming file is initially split into several pieces, which are then uploaded to stardog one at a time. Every time a piece is uploaded, a message is sent to the service that indexes to elasticsearch, telling it to query stardog again. Eventually the last piece is uploaded to stardog and a final message is sent to the indexing service. At that point the query in the indexing service should return the correct answer and update elasticsearch.
I've created a test system that does the following:
- Two threads: UPLOAD_THREAD and QUERY_THREAD.
- UPLOAD_THREAD will
  - clear stardog
  - upload the ontology
  - notify QUERY_THREAD
  - start uploading data
  - wait for notification from QUERY_THREAD
- QUERY_THREAD will
  - wait for notification from UPLOAD_THREAD
  - query stardog
  - if the query returns the correct result, notify UPLOAD_THREAD
  - if not, keep querying
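Until the real code is uploaded, here is a rough sketch of that handshake in Python. It is not the actual test code: FakeStore is a hypothetical in-memory stand-in for stardog (not a real client API), it does no reasoning, and run_round and its timeout are made up for illustration. Because there is no reasoning, the sketch queries a triple that is stated directly.

```python
import threading
import time


class FakeStore:
    """Hypothetical in-memory stand-in for stardog; not a real client API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._triples = set()

    def clear(self):
        with self._lock:
            self._triples.clear()

    def add(self, triple):
        # In this toy model every add is one committed transaction.
        with self._lock:
            self._triples.add(triple)

    def query(self, subject, predicate):
        with self._lock:
            return {o for (s, p, o) in self._triples
                    if s == subject and p == predicate}


def run_round(store, pieces, subject, predicate, expected, timeout=5.0):
    """Run one upload/query round; return True if the query became correct."""
    start_querying = threading.Event()  # UPLOAD_THREAD -> QUERY_THREAD
    result_correct = threading.Event()  # QUERY_THREAD -> UPLOAD_THREAD

    def upload_thread():
        store.clear()
        # The ontology upload would go here in the real test.
        start_querying.set()
        for piece in pieces:
            store.add(piece)
        result_correct.wait(timeout)

    def query_thread():
        start_querying.wait(timeout)
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if store.query(subject, predicate) == expected:
                result_correct.set()
                return
            time.sleep(0.001)  # keep querying until correct or timed out

    threads = [threading.Thread(target=f, daemon=True)
               for f in (upload_thread, query_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout * 2)
    return result_correct.is_set()
```

With the in-memory stand-in a round always succeeds. The problem I am reporting is that the equivalent loop against stardog eventually reaches a round where the query never becomes correct.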
The data is a linked list:
A -parent-> B -parent-> C ..... -parent->E
and E has a property arkiv:arkivskaper with a resource.
The ontology has two main axioms.
1. A parentTransitive property that is a super-property of parent
2. An arkiv:arkivskaper property that is the chain of "parentTransitive o arkivskaper".
The query asks for "A arkiv:arkivskaper ?anything" which should result in the traversal down to E and then over the declared arkiv:arkivskaper property to the resource.
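To make the expected answer concrete, here is a small Python sketch of what the two axioms should entail for the linked list. It is only an illustration, not how stardog evaluates the query: triples are plain tuples, infer_arkivskaper is a made-up helper, and I am assuming parentTransitive is declared transitive (the chain axiom applied repeatedly gives the same result either way).

```python
def infer_arkivskaper(triples, start):
    """Return the arkiv:arkivskaper values entailed for `start`."""
    parent = {}
    arkivskaper = {}
    for s, p, o in triples:
        if p == "parent":
            parent.setdefault(s, set()).add(o)
        elif p == "arkiv:arkivskaper":
            arkivskaper.setdefault(s, set()).add(o)

    # Axiom 1: parent is a sub-property of parentTransitive, so every node
    # reachable over one or more parent edges is related via parentTransitive.
    ancestors = set()
    stack = list(parent.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in ancestors:
            ancestors.add(node)
            stack.extend(parent.get(node, ()))

    # Axiom 2: parentTransitive o arkivskaper implies arkivskaper.
    results = set(arkivskaper.get(start, ()))  # directly stated values
    for node in ancestors:
        results |= arkivskaper.get(node, set())
    return results


# The linked list from above, with "R" standing in for the resource.
data = [
    ("A", "parent", "B"),
    ("B", "parent", "C"),
    ("C", "parent", "D"),
    ("D", "parent", "E"),
    ("E", "arkiv:arkivskaper", "R"),
]
answer = infer_arkivskaper(data, "A")
```

While a piece of the chain is still missing (for example the C to D link), the answer is empty, which is why the indexing service keeps querying until the last piece has been uploaded.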
Eventually, though, the query stops returning the correct answer. Stardog will also not return the correct answer. The data is in stardog, I've checked. And here is the strange part: optimising the db, or taking it offline and back online, makes the query work again.
We are using stardog as our master database and we are counting on it to be a strict serialisation layer for our system. We are assuming that once a transaction commits, any query started after that commit will return results that reflect that transaction's modifications.
I will upload the code as soon as I can, but possibly not until tomorrow.
Cheers,
Håvard M. Ottestad