I'm using sqlite in a speed critical application. Data integrity after
crashes is not very important as the database can be recreated from
scratch in that case.
In my tests, the writing of data with insert commands seemed very slow,
despite being wrapped in a transaction.
I have tried PRAGMA synchronous=off and also compiled the asynchronous
io module in, but that didn't help much. A little profiling revealed
that the unixSync function is called somewhere down the callstack of the
insert command. And there, full_fsync seems not to take the pragma
settings or the async io module into account.
I recompiled sqlite with the SQLITE_NO_SYNC definement and had a huge
increase in speed.
Is this intended behaviour? How does the pragma/async io module fit in,
then?
Cheers,
Benjamin
_______________________________________________
sqlite-users mailing list
sqlite...@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
I believe you've got something wrong in your tests. Because with
asynchronous module (if you indeed used its VFS when created your
connections) no actual disk I/O happens in the same thread as insert
command. With PRAGMA synchronous = off the only syncs I can think of
being called is sync on directory where journal is created and
deleted. If you don't care about database corruption and if you don't
need to do rollbacks then maybe you need to try PRAGMA journal_mode =
off too? Or at least PRAGMA journal_mode = persist if you still need
to do rollbacks.
Pavel
thanks for helping me out!
>> Is this intended behaviour? How does the pragma/async io module fit in,
>> then?
>>
> I believe you've got something wrong in your tests. Because with
> asynchronous module (if you indeed used its VFS when created your
> connections) no actual disk I/O happens in the same thread as insert
> command.
Yes, I thought so as well.
But when I profiled with google perftools, unixSync showed up in the
call stack below the place where the insert commands were executed. I
have started async io by calling sqlite3async_initialize before opening
the database. It returned SQLITE_OK. Just to make sure that I'm not
looking at the wrong code, I have inserted a hardcoded breakpoint
(asm("int3);) into the flush-function and the debugger stops frequently
at that point.
Compiling with -DSQLITE_NO_SYNC is running an order of magnitude faster
than all combinations of PRAGMA synchronous, PRAGMA journal_mode and
async io that I could think of.
Unfortunately all of this is embedded in a large scale application where
I can't easily extract a small example demonstrating the effect. The
code uses the Sql module from Qt with a sqlite backend compiled as a
plugin. Maybe something is lost along these lines.
I'll see if I can whip up anything smaller without Qt inbetween.
> With PRAGMA synchronous = off the only syncs I can think of
> being called is sync on directory where journal is created and
> deleted. If you don't care about database corruption and if you don't
> need to do rollbacks then maybe you need to try PRAGMA journal_mode =
> off too? Or at least PRAGMA journal_mode = persist if you still need
> to do rollbacks.
>
I don't need to do rollbacks and have tried journal_mode=off vs
journal_mode=memory. The former didn't make much of a difference, so I
opted for the latter, just in case.
See, the unixSync-function says:
static int unixSync(sqlite3_file *id, int flags){
/* ... */
int isDataOnly = (flags&SQLITE_SYNC_DATAONLY);
int isFullsync = (flags&0x0F)==SQLITE_SYNC_FULL;
/* ... */
rc = full_fsync(pFile->h, isFullsync, isDataOnly);
/* ... */
}
And full_fsync does call fsync unless compiled with -DSQLITE_NO_SYNC
I still need to check why unixSync is triggered by INSERT. I had
problems to get debugging info going and therefore had to resort to
looking at machine code (hence the hardcoded breakpoint).
I have set up a test program without QtSql and could not reproduce any
of the performance degradations. So the problem has nothing to do with
sqlite. Sorry for the noise.
I guess Qt either ignores some of your pragmas, or resets their values
to what it thinks is better. Also I'm afraid it ignores your async
module VFS, so I suggest you to check twice that it's really used as
you intend it to be.
Pavel