When to block with sqlite

289 views
Skip to first unread message

John Wright

unread,
Feb 21, 2010, 1:49:14 PM2/21/10
to nodejs
I want to build a no-blocking compatible sqlite bulk insert feature
for my node.js little project I am building. I am sure node would
prefer my code not to block and give it a callback when its done
loading 30,000 records. That part is easy to understand. I am
wondering best how todo this however.

Using a child process to exec the sqlite command tool import command.
Advantage, I can use the built in async capablities of node.js for
executing child processes and node can be freed to do other things
during the bulk import. Disadvantage, it seems kind of heavy weight to
exec a process when sqlite has a nice built in c interface.

I could use a C binding to sqlite (like node-sqlite), use
sqlite_prepare to prepare one insert statement and kick of the inserts
in a sqlite transaction. Advantage no child process, disadvantage, how
to make this async and get a callback. Is the node.js libeio (or
whatever it is called) interfaces hard to use and plugin in this case.
Are there many advantages to this over the child process idea.

Just not sure how to weight out these decisions in general with node.
When does it become important not to block, 10 records, 1?

Guan Yang

unread,
Feb 21, 2010, 3:01:47 PM2/21/10
to nod...@googlegroups.com
On Sun, Feb 21, 2010 at 13:49, John Wright <mrjjw...@gmail.com> wrote:
> Using a child process to exec the sqlite command tool import command.
> Advantage, I can use the built in async capablities of node.js for
> executing child processes and node can be freed to do other things
> during the bulk import. Disadvantage, it seems kind of heavy weight to
> exec a process when sqlite has a nice built in c interface.

Could call the SQLite library in a thread instead of forking a
completely separate process? I think libeio does this for some
operations, and the SQLite FAQ says that it is thread safe.

Blaine Cook

unread,
Feb 21, 2010, 3:02:42 PM2/21/10
to nod...@googlegroups.com
On 21 February 2010 19:49, John Wright <mrjjw...@gmail.com> wrote:
> Just not sure how to weight out these decisions in general with node.
> When does it become important not to block, 10 records, 1?

Obviously, this depends per-project and per-deployment, but in general
you should aim to block for the absolute minimum amount of time
possible, and you should never block whilst waiting for data over a
socket.

Every time a node app blocks anywhere adds delay to all other requests
currently in-flight within node. So, for example, if you fetch 100,000
rows and it takes 1s (totally made-up numbers, for sake of argument),
and you block the whole time in your library, then *nothing else will
happen* on that server for *1 second* – your 0.4 ms response time just
jumped to 1000.4 ms.

Use continuations and setTimeout (and other methods that I encourage
others to contribute here! :-) ) if you're doing non-network related
operations that will likely take a long time. If you're doing network
ops, make it streaming and totally non-blocking except for the
necessary buffering.

Hope that helps!

b.

Ryan Dahl

unread,
Feb 21, 2010, 3:51:59 PM2/21/10
to nod...@googlegroups.com

I'm not sure if the node-sqlite binding is actually doing things
asynchronously. Apparently there is an interface to make SQLite calls
in other threads: http://www.sqlite.org/asyncvfs.html
but I don't see any of those calls present in node-sqlite.

Guan Yang

unread,
Feb 21, 2010, 3:59:21 PM2/21/10
to nod...@googlegroups.com
On Sun, Feb 21, 2010 at 15:51, Ryan Dahl <coldre...@gmail.com> wrote:
> I'm not sure if the node-sqlite binding is actually doing things
> asynchronously. Apparently there is an interface to make SQLite calls
> in other threads: http://www.sqlite.org/asyncvfs.html
> but I don't see any of those calls present in node-sqlite.

A quick scan of that page suggests that it's something slightly
different, analogous to a filesystem where calls return immediately
without any guarantee that they were actually executed. They do this
by through VFS, SQLite's filesystem abstraction layer. Also, it's not
clear that reads (when data is on disk and not in the write queue) are
actually asynchronous because you're interfacing with the regular
SQLite API.

What you'd really want is a set of asynchronous higher-level SQLite
operations so you get callbacks when your changes are actually
committed to disk. It's been years since I last tried to embed SQLite,
but it still seems to be that running SQLite calls in a separate
thread that you manage yourself is the way to go.

Ryan Dahl

unread,
Feb 21, 2010, 4:11:38 PM2/21/10
to nod...@googlegroups.com

In that case a binding to SQLite will be a difficult undertaking
filled with ev_async calls - but perhaps truly worthwhile.

Tim Caswell

unread,
Feb 21, 2010, 4:15:31 PM2/21/10
to nod...@googlegroups.com
It's actually quite fast, see http://github.com/creationix/node-persistence/blob/master/lib/persistence/sqlite.js for an example of this.
The only problem is that we're scraping stdout and can't tell the difference between "5" and 5.  Speed isn't an issue, but rich types is.

For most uses, this is good enough though, especially if you have a higher-level abstraction on top that knows what type to expect.

Connor Dunn

unread,
Feb 21, 2010, 4:36:17 PM2/21/10
to nod...@googlegroups.com
Has anyone looked at http://github.com/rbranson/node-ffi I don't know
what I'm talking about here, but it appears to have async sqlite
bindings as an example

Ricardo Tomasi

unread,
Feb 21, 2010, 5:25:56 PM2/21/10
to nodejs
What is the cost/delay when spawning a child process? As node is
pretty lightweight I was thinking of keeping a separate process always
on, working like an HTTP interface to SQLite. It looks like a lot of
overhead but simplifies things for me, and the performance hit
wouldn't be significant for small-scale apps. Any drawbacks?

On 21 fev, 17:59, Guan Yang <g...@yang.dk> wrote:

John Wright

unread,
Feb 21, 2010, 9:17:23 PM2/21/10
to nodejs
Tim, I really got a lot of code/inspiration from node-persistence and
it prompted my question. Using a child process to interface with the
sqlite shell process might be better for node because of the built-in
async capabilities. Can you explain what is "actually quite fast"
below? You mean using a child process is actually quite fast?

On Feb 21, 2:15 pm, Tim Caswell <t...@creationix.com> wrote:
> It's actually quite fast, seehttp://github.com/creationix/node-persistence/blob/master/lib/persist...for an example of this.

Dean Landolt

unread,
Feb 21, 2010, 9:26:14 PM2/21/10
to nod...@googlegroups.com
On Sun, Feb 21, 2010 at 9:17 PM, John Wright <mrjjw...@gmail.com> wrote:
Tim,  I really got a lot of code/inspiration from node-persistence and
it prompted my question.  Using a child process to interface with the
sqlite shell process might be better for node because of the built-in
async capabilities.  Can you explain what is "actually quite fast"
below?  You mean using a child process is actually quite fast?

At a few milliseconds the cost kicking off a child process isn't much of a problem. If the problem is in getting back typed data one solution could be a simple sqlite wrapper that spits back json. It could just use the sync sqlite apis because it'll be a child process anyway.

Tim Caswell

unread,
Feb 21, 2010, 11:51:27 PM2/21/10
to nod...@googlegroups.com
I keep one persistent child process open and I was getting several thousand requests per second. Compared to my pure-js postgres driver, this is phenomenal speed. If someone could write a generic sqlite3 program that spits out typed JSON (Like Dean said), that would even easier to wrap and probably be useful a a few scripting languages.

> --
> You received this message because you are subscribed to the Google Groups "nodejs" group.
> To post to this group, send email to nod...@googlegroups.com.
> To unsubscribe from this group, send email to nodejs+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/nodejs?hl=en.
>

Reply all
Reply to author
Forward
0 new messages