Client-side replica graph gets slower if no sync is done

Andreas S. Rath (teddius)

Jul 21, 2009, 2:13:47 AM
to OpenAnzo
Dear OpenAnzo Group,

I observed that SPARQL query performance decreases over time if a
client-side replica graph is used and not synchronized with the server
for a longer period. More than 50,000 triples had been added to this
graph.
If you continue to add triples in separate transactions, query
performance decreases further, and at some point you get an out of
memory exception. Is there a way to optimize this for an off-line
client that does not synchronize very frequently?

Thanks
Andreas

Ben Szekely

Jul 21, 2009, 11:06:16 AM
to open...@googlegroups.com
Hi Andreas,
The behavior you are seeing is inherent in the design of the
client. RDF graphs and statements created/modified by non-synchronized
transactions are held in a special data structure called the transaction
queue which filters reads and writes against the local replica. This
data structure was designed to really only maintain small increments of
data between updates. However, depending on how many transactions your
50k statements are spread over, this could be fine or problematic. If
you only have a few transactions, then the transaction queue structure
carries only a small amount of overhead compared to the number of
statements, but if you have lots of transactions you will have a serious
problem. This is because each transaction maintains a mini quadstore of
additions and deletions used to filter reads and writes. Keep in mind
that unless you group operations inside anzoClient.begin() and
anzoClient.commit(), each operation will get its own transaction.
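
As a rough illustration of why the number of transactions matters more
than the number of statements, here is a toy Python model. It is not
OpenAnzo code: the class names, the add() and contains() methods, and
the idea of counting "checks" are all assumptions made for the sketch.
It only shows that a read filtered through one small store per pending
transaction gets more expensive as the transaction count grows.

```python
# Toy model only: this is NOT OpenAnzo code, just an illustration of the cost.

class Transaction:
    """One unsynchronized transaction with its own small store of changes."""

    def __init__(self):
        self.additions = set()
        self.deletions = set()


class TransactionQueue:
    """Hypothetical queue that filters reads against a local replica."""

    def __init__(self, replica):
        self.replica = set(replica)
        self.transactions = []

    def add(self, statements):
        # Each call stands for one transaction holding the given statements.
        tx = Transaction()
        tx.additions.update(statements)
        self.transactions.append(tx)

    def contains(self, statement):
        """Return (present, checks); checks counts the transactions consulted."""
        present = statement in self.replica
        checks = 0
        for tx in self.transactions:
            checks += 1
            if statement in tx.additions:
                present = True
            elif statement in tx.deletions:
                present = False
        return present, checks


statements = [("s%d" % i, "p", "o") for i in range(1000)]

# One transaction per statement, as happens without explicit grouping.
one_per_tx = TransactionQueue([])
for st in statements:
    one_per_tx.add([st])

# The same statements grouped into a single transaction.
batched = TransactionQueue([])
batched.add(statements)

print(one_per_tx.contains(statements[0]))  # (True, 1000)
print(batched.contains(statements[0]))     # (True, 1)
```

Both queues hold the same 1000 statements, but the ungrouped one has to
consult 1000 mini stores for a single lookup, while the grouped one
consults one.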

Hope this helps,
Ben

Andreas S. Rath

Jul 21, 2009, 11:08:49 AM
to open...@googlegroups.com
Hi Ben,

Thanks for your quick answer. What would be the best solution for an
offline client that synchronizes only once a week?

Best
Andreas

Ben Szekely

Jul 21, 2009, 11:15:55 AM
to open...@googlegroups.com
The best approach for now would be to reduce the total number of
transactions as much as possible. Without knowing more about your use
case, I can't really suggest anything more specific. However, I do
wonder why you can't synchronize more frequently.
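
One way to cut down the number of transactions is to buffer writes and
send them in groups. The following is a minimal Python sketch under
stated assumptions: FakeClient is a hypothetical stand-in that only
counts transactions, its add() method and the BatchingWriter class are
invented for illustration, and only the begin()/commit() pairing comes
from this thread. It is not the real client API.

```python
class FakeClient:
    """Hypothetical stand-in client; it only counts transactions."""

    def __init__(self):
        self.transactions = 0
        self.statements = []
        self._open = False

    def begin(self):
        self._open = True

    def add(self, statement):  # hypothetical method name
        if not self._open:
            # An ungrouped operation gets its own transaction.
            self.transactions += 1
        self.statements.append(statement)

    def commit(self):
        self._open = False
        self.transactions += 1


class BatchingWriter:
    """Buffers statements and writes each batch inside one begin/commit pair."""

    def __init__(self, client, batch_size=1000):
        self.client = client
        self.batch_size = batch_size
        self.pending = []

    def add(self, statement):
        self.pending.append(statement)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        self.client.begin()
        for st in self.pending:
            self.client.add(st)
        self.client.commit()
        self.pending = []


client = FakeClient()
writer = BatchingWriter(client, batch_size=1000)
for i in range(2500):
    writer.add(("s%d" % i, "p", "o"))
writer.flush()  # write the final partial batch

print(client.transactions)  # 3
```

With a batch size of 1000, the 2500 statements end up in 3 transactions
instead of 2500. The trade-off is that buffered statements are not
visible to queries until the batch is flushed.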

- Ben