4store locking strategies


Zlatko.

Jan 10, 2012, 1:11:09 PM
to 4store-support
Hello,

I wonder if someone can summarise the locking strategies implemented
in 4store. I want to understand this better as I ran into a problem
when using it for the system I am working on. At times the system has
to store a series of large batches of triples, e.g. tens of blocks of
up to a couple of million triples each, and at the same time allow
querying of the data already stored in the database. The triples are
stored by sending Turtle-formatted data to the /data/ endpoint. I
observed that when committing e.g. 1.5 million triples, which takes
about 60 seconds, the database is blocked for querying. In modern SQL
servers with row locking, it is possible to set the transaction
isolation level so that selects are possible during an insert and
simply do not see the uncommitted rows. Is this behaviour possible in
4store? Please explain how selects, inserts and deletes lock the
database.

Many Thanks,
- Zlatko.

Steve Harris

Jan 11, 2012, 5:13:16 AM
to 4store-...@googlegroups.com
The DB isn't locked for queries when writing, BUT you can only query via HTTP when the HTTP server is running (which it must have been). This includes both PUTs to /data/ and SPARQL Update requests.

You can only have one writing client at once however.

- Steve

> --
> You received this message because you are subscribed to the Google Groups "4store-support" group.
> To post to this group, send email to 4store-...@googlegroups.com.
> To unsubscribe from this group, send email to 4store-suppor...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/4store-support?hl=en.
>

--
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203 http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Steve Harris

Jan 11, 2012, 5:17:10 AM
to 4store-...@googlegroups.com
Another possibility is that you might have just run out of disk/vm. Linux's kernel elevator is not very smart, tends to favour writes, and 4store does a lot of IO when writing.

- Steve

Steve Harris

Jan 11, 2012, 5:24:10 AM
to 4store-...@googlegroups.com
Ugh, should have written "disk/vm bandwidth".

- Steve

Philip John

Jan 11, 2012, 10:27:05 AM
to 4store-...@googlegroups.com
On a similar note, I'm trying out the SPARQL 1.1 LOAD functionality, and it seems to block the HTTP server from accepting any requests until it's finished. The file I'm fetching is ~600MB and is being downloaded from a pre-signed Amazon S3 object. Is that expected (i.e., is the HTTP server single-threaded, or does it block when fetching remote files to load)?

- Phil.

Steve Harris

Jan 11, 2012, 12:08:00 PM
to 4store-...@googlegroups.com
The HTTP server is multithreaded, but there's only one write thread and multiple read threads.

LOAD would block the write thread (it's a SPARQL Update request), but I think queries should still go ahead. I'm not very confident about that, though. LOAD support is quite recent, and I doubt it's been tested much. As with other large writes, it will use a lot of disk/vm bandwidth.

- Steve
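
As a rough illustration of the "one write thread, multiple read threads" arrangement described above, here is a toy Python model. It is not 4store's code: the class name, the method names and the in-memory list standing in for the store are all made up for the sketch. It only shows the intended behaviour, where a long write occupies the single writer while reads carry on in a separate pool.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List


class SingleWriterServer:
    """Toy model: many reader threads, exactly one writer thread."""

    def __init__(self, read_threads: int = 4) -> None:
        # Reads are served by a pool, so several can run at once.
        self._readers = ThreadPoolExecutor(max_workers=read_threads)
        # A pool of size one serialises writes in submission order.
        self._writer = ThreadPoolExecutor(max_workers=1)
        self.store: List[str] = []  # hypothetical stand-in for the triple store

    def submit_read(self, query: Callable[[List[str]], object]) -> Future:
        # Each read works on a snapshot taken when the read actually runs.
        return self._readers.submit(lambda: query(list(self.store)))

    def submit_write(self, job: Callable[[List[str]], None]) -> Future:
        # Writes queue up behind whichever write is currently running.
        return self._writer.submit(job, self.store)

    def shutdown(self) -> None:
        self._readers.shutdown()
        self._writer.shutdown()


if __name__ == "__main__":
    server = SingleWriterServer(read_threads=2)
    gate = threading.Event()

    def slow_load(store: List[str]) -> None:
        gate.wait(5)                 # simulate a long-running LOAD
        store.append("loaded")

    load = server.submit_write(slow_load)
    print("triples seen by a read during the load:", server.submit_read(len).result())
    gate.set()
    load.result()
    print("triples seen after the load:", server.submit_read(len).result())
    server.shutdown()
```

In this model a second write submitted during the load simply waits its turn, which matches "you can only have one writing client at once"; whether reads really stay unblocked in 4store itself is exactly the open question in this thread.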

Phil John

Jan 11, 2012, 5:37:00 PM
to 4store-support
Hmmm, curious.

I'm definitely getting no access to the HTTP server, be it a simple
call to /status or a SPARQL query.

I've tested this with some RPMs just built from git head on a small
4+1 cluster running CentOS 6.

Phil.


Phil John

Jan 11, 2012, 5:37:51 PM
to 4store-support
And by no access, I mean the connection to do a read operation just
hangs until the LOAD is completed.

Phil.
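
One way to put numbers on "the connection just hangs" is to poll the server at a fixed interval while the LOAD runs and record how long each request takes. The sketch below does that in Python. The host, port and timeout are assumptions, the /status path is taken from the message above, and `probe` and `fetch_status` are hypothetical helper names.

```python
import time
import urllib.request
from typing import Callable, List, Optional


def fetch_status(url: str = "http://localhost:8000/status", timeout: float = 5.0) -> None:
    """Fetch the status page once. URL and timeout are assumed values."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()


def probe(fetch: Callable[[], None], samples: int, interval: float = 1.0) -> List[Optional[float]]:
    """Call fetch() `samples` times and return the latency of each call in seconds.

    A call that fails or times out is recorded as None.
    """
    latencies: List[Optional[float]] = []
    for i in range(samples):
        start = time.monotonic()
        try:
            fetch()
            latencies.append(time.monotonic() - start)
        except OSError:  # covers URLError, HTTPError and socket timeouts
            latencies.append(None)
        if i < samples - 1:
            time.sleep(interval)
    return latencies


if __name__ == "__main__":
    # Start this just before issuing the LOAD, then look for None entries
    # or latencies close to the timeout.
    for latency in probe(fetch_status, samples=30, interval=1.0):
        print("timeout/error" if latency is None else f"{latency:.3f}s")
```

A run of None entries that starts and ends with the LOAD would support the observation that reads are blocked for its whole duration, rather than merely slowed by disk contention.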

Z Zlatev

Jan 12, 2012, 10:11:28 AM
to 4store-...@googlegroups.com
I am experiencing the same 4store behaviour: once our database feeder
module sends the triples to the SPARQL HTTP server for inserting (we
use POST, with the mime type at the end of the payload specifying that
the format is Turtle), no other inserts or selects are possible. I
wonder if the writing thread simply opens the 4store data files for
writing and locks them against other concurrent usage, i.e. reading or
writing. Also, is it feasible, and how easy would it be, to extend the
pool of writing threads? For the system I am working on this is quite
critical, as writes to the database are more frequent than reads, and
4store is running on a quite powerful multi-processor/multi-core
platform (it is not a cluster installation though).

- Zlatko.

Philip John

Jan 12, 2012, 10:21:33 AM
to 4store-...@googlegroups.com
What size is the file you are sending?

I've tried splitting my N-Triples file into roughly 2 MB chunks and sending those and, whilst there are some spikes in query performance during a commit, it's probably bearable. This is running on a small cluster of 4 machines with 8 segments.

Phil.
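
The chunking approach described above could be sketched in Python roughly as follows. N-Triples is line-based, so the file can be cut on line boundaries. The 2 MB default comes from the message; the endpoint URL, the graph URI and the form field names (`data`, `graph`, `mime-type`) are assumptions about the HTTP interface and should be checked against the 4store version in use. `chunk_ntriples` and `post_chunk` are hypothetical helper names.

```python
import urllib.parse
import urllib.request
from typing import Iterable, Iterator, List


def chunk_ntriples(lines: Iterable[str], max_bytes: int = 2 * 1024 * 1024) -> Iterator[str]:
    """Yield groups of whole N-Triples lines, each at most max_bytes in UTF-8.

    A single line longer than max_bytes is emitted on its own rather than split.
    """
    batch: List[str] = []
    size = 0
    for line in lines:
        if not line.endswith("\n"):
            line += "\n"
        n = len(line.encode("utf-8"))
        if batch and size + n > max_bytes:
            yield "".join(batch)
            batch, size = [], 0
        batch.append(line)
        size += n
    if batch:
        yield "".join(batch)


def post_chunk(chunk: str, graph: str, endpoint: str = "http://localhost:8000/data/") -> int:
    """POST one chunk to the store. Endpoint and field names are assumptions."""
    body = urllib.parse.urlencode({
        "data": chunk,
        "graph": graph,
        "mime-type": "application/x-turtle",  # N-Triples is a subset of Turtle
    }).encode("utf-8")
    request = urllib.request.Request(endpoint, data=body, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # "dump.nt" and the graph URI are placeholders.
    with open("dump.nt", encoding="utf-8") as source:
        for chunk in chunk_ntriples(source):
            post_chunk(chunk, graph="http://example.org/graph")
```

One caveat with this kind of splitting: blank node labels are scoped to a single document, so a blank node that appears in two different chunks may end up as two distinct nodes in the store.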

Steve Harris

Jan 12, 2012, 1:37:38 PM
to 4store-...@googlegroups.com
Oh, interesting, could be a bug.

The guy who wrote the HTTP server is in the office tomorrow, so I'll ask him to look at it.

- Steve


Steve Harris

Jan 12, 2012, 5:37:08 PM
to 4store-...@googlegroups.com
There is a version somewhere that William Waites worked on which supports parallel connections with multi-write locking, but I think there's some performance penalty if you use it. I think it might be this version: https://github.com/wwaites/4store

Parallel reads are definitely supported by mainline, and should work.

- Steve

Phil John

May 20, 2012, 2:19:38 PM
to 4store-...@googlegroups.com
Hi Steve,

Did you ever get an answer on whether a writer in the HTTP server could block readers?

We're looking at the best way to import some large-ish datasets (typically in the low to mid tens of millions of triples, but sometimes up to 100 million at a time) and need to factor in a write lock blocking readers if so.

Kind regards,

Phil.
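
For planning purposes, a back-of-the-envelope estimate of how long such an import might hold the write thread can be made from the one figure reported earlier in this thread (about 1.5 million triples in about 60 seconds). The sketch below assumes, purely for illustration, that throughput stays constant as the load grows. That figure comes from a different installation, so the result is an order of magnitude at best. `estimated_load_seconds` is a hypothetical helper.

```python
def estimated_load_seconds(
    triples: int,
    observed_triples: int = 1_500_000,   # figure reported earlier in the thread
    observed_seconds: float = 60.0,      # figure reported earlier in the thread
) -> float:
    """Linear extrapolation of import time from one observed load."""
    rate = observed_triples / observed_seconds  # triples per second
    return triples / rate


if __name__ == "__main__":
    for n in (10_000_000, 50_000_000, 100_000_000):
        minutes = estimated_load_seconds(n) / 60
        print(f"{n:>11,} triples: about {minutes:.0f} minutes")
```

At that rate (25,000 triples per second), 100 million triples would take roughly 4,000 seconds, a little over an hour.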


Steve Harris

May 21, 2012, 6:04:02 AM
to 4store-...@googlegroups.com
Sorry, forgot to come back on this.

It can, yes. Not normally for the entire duration of the INSERT, but certainly for a portion of it.

- Steve


-- 
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ

Philip John

May 21, 2012, 6:14:29 AM
to 4store-...@googlegroups.com
Hi Steve,

Thanks for the update.

We're looking at semi-regular large loads (tens of millions of triples), so will look to schedule those overnight. During the day we have a peak of around 20-30 updates a second, with about 200-300 reads a second. Do you think a fairly beefy cluster with lots of memory and SSDs will be able to cope with that?

Regards,

Phil.

Steve Harris

May 21, 2012, 8:00:06 AM
to 4store-...@googlegroups.com
I would think so, but it all depends on the data of course.

William Waites has a version of 4store that's fully concurrent and can do multiple updates at once etc., but you pay a bit of a performance penalty for that. I think it's a branch of quite an old version. William may be able to give you more information though, and say whether it's been forward-ported.

Cheers,
   Steve

Phil John

May 25, 2012, 9:41:11 AM
to 4store-...@googlegroups.com
Understood.

I gave the concurrent branch a quick go, but after finally getting it to compile it kept not finding the shared libs for raptor when running anything, so I've somewhat given up :)

I've had a quick look at the httpd.c file and noticed that the number of concurrent queries (reads) is hardcoded as 16. I've tried upping it to higher values (along with upping the watchdog bytes per second) without apparent side effects, and indeed it improved QPS when running on a cluster of 4 extra large quadruple memory EC2 instances. Is this dangerous to do? Does 16 have some special meaning?

Regards,

Phil.

Steve Harris

May 25, 2012, 9:53:35 AM
to 4store-...@googlegroups.com
No, there's nothing special about 16. Ideally it should be dynamic, to avoid taking too many resources when idle, but it won't hurt to turn it up.

It would also be an option to define it in /etc/4store.conf.

- Steve
