To recap: the reads are done via SPARQL CONSTRUCT queries and the
writes via HTTP PUT, and I get frequent HTTP time-outs.
I've put together a little demo application which demonstrates this:
it takes a very small graph (58 triples) and repeatedly writes the
graph to 4store and then reads it back again. I wondered whether the
fact that I'm overwriting an existing graph in the store was part of
the problem, since the 1st write and read are almost instantaneous but
it hangs after the 2nd write. However, I've tried using a different
graph URI for each write/read attempt and I get exactly the same
behaviour.
The demo application can be checked out of SVN here:
https://dotnetrdf.svn.sourceforge.net/svnroot/dotnetrdf/samples/4s-stress-test
Run the 4s-stress-test.exe file from the bin\Debug directory.
It's a Windows command-line application that prompts for the URI of a
4store server and whether to use a different graph URI for each
iteration, then attempts 100 write/read operations in sequence. I was
hoping to have this run under Mono, but I've found what may be a bug
in the Mono runtime which currently prevents this.
Can you guys take a look and tell me what you think?
Rob
The problem was that I wasn't explicitly closing the connection to
4store after a POST operation, whereas with the read operations the
connection always gets closed. This causes an issue because .Net's
HTTP model permits only two concurrent connections to a given domain
at a time, so once two writes were done I had two open connections to
my 4store instance, which blocked any further connections from being
made; hence the timeouts.
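For illustration, here's a minimal sketch of the fixed pattern in Python rather than .Net (the in-process server, the /graph path and the triple are all made up for the example, not taken from the real demo): each iteration writes the graph, consumes and closes the write response, reads the graph back, and then explicitly closes the connection so it can't pile up against a per-host limit.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

store = {}  # stand-in for the graph store, keyed by request path

class Handler(BaseHTTPRequestHandler):
    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        store[self.path] = self.rfile.read(length)
        self.send_response(201)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = store.get(self.path, b"")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

graph = b"<http://example/s> <http://example/p> <http://example/o> ."
for i in range(10):
    conn = http.client.HTTPConnection(host, port)
    conn.request("PUT", "/graph", graph)
    conn.getresponse().read()      # consume the write response...
    conn.request("GET", "/graph")
    data = conn.getresponse().read()
    conn.close()                   # ...and explicitly release the connection
    assert data == graph

server.shutdown()
print("all iterations completed")
```

The crucial step is that the write response is consumed and the connection closed before the next iteration starts; skipping that is what left connections dangling in the original code.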
So this was primarily a bug in my code, but I'm wondering whether
4store is also not closing connections at its end. I have connectors
for other stores that use a similar pattern which never had this
problem (though I've now applied the fix universally for good
measure), so I don't know whether 4store is closing requests properly
after responding.
I had looking into this pencilled in for today, and I wasn't
enthusiastic about trying to reproduce the problem by running a
Windows executable, so you saved me some time there.
I believe that 4store does close connections once it has finished with
them. An earlier generation of the code had a double-close bug and we
put some effort into making sure it closed each socket exactly once.
Is it possible that the difference with the other stores is that they
speak HTTP/1.1, so you can re-use the connections, whereas 4store
only does HTTP/1.0 with exactly one connection per operation?
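One quick way to check what a server speaks is to look at the version on a response it sends back. As a sketch (against a throwaway local server here, since I don't want to assume a 4store instance is running; the handler and path are invented for the example):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.0"  # flip to "HTTP/1.1" to compare

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection(*server.server_address)
conn.request("GET", "/")
resp = conn.getresponse()
resp.read()
# resp.version is 10 for HTTP/1.0 and 11 for HTTP/1.1; an HTTP/1.0
# response with no "Connection: keep-alive" means the socket can't be
# re-used, so every operation pays for a fresh connection.
print("version:", resp.version, "connection will close:", resp.will_close)
conn.close()
server.shutdown()
```

Against a real 4store endpoint you would point the HTTPConnection at its host and port instead of the local stand-in.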
I would be interested in adding HTTP/1.1 support to 4store, but I am
quite busy for the foreseeable future. In theory offering HTTP/1.1
could reduce the setup/teardown overhead and would show a performance
improvement for some use cases, particularly with longer network
latencies. So, for long imports or slow queries on the local machine
it would make no difference, but for many sequential fast queries
over the Internet (where latencies often exceed 100ms) it could be a
very large improvement.
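To put rough numbers on that claim (a sketch with assumed figures, not measurements):

```python
# Back-of-envelope: 100 sequential fast queries over a link with ~100 ms
# round-trip time. All figures are illustrative assumptions.
rtt = 0.100    # seconds per round trip over the Internet
query = 0.005  # seconds of server-side work for a fast query
n = 100

# HTTP/1.0, one connection per operation: each query pays a TCP handshake
# (one extra round trip) on top of the request/response exchange itself.
http10 = n * (rtt + rtt + query)

# HTTP/1.1 with keep-alive: one handshake up front, then one round trip
# per query on the re-used connection.
http11 = rtt + n * (rtt + query)

print(f"HTTP/1.0: {http10:.1f}s  HTTP/1.1 keep-alive: {http11:.1f}s")
```

Under these assumptions the per-operation handshake roughly doubles the total time, and the gap only grows with latency; on localhost, where the round trip is near zero, the difference all but vanishes, which matches the point above.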
It's possible that the quality-of-implementation issues that prevent
web browsers from fully exploiting HTTP/1.1 (basically, too many web
servers crash, corrupt data or otherwise misbehave unless you limit
yourself to a "known safe" subset of the specification) need not
apply to SPARQL, so we could get some performance wins from offering
HTTP/1.1 features like pipelining.