Using a child process to exec the sqlite command tool import command.
Advantage, I can use the built in async capablities of node.js for
executing child processes and node can be freed to do other things
during the bulk import. Disadvantage, it seems kind of heavy weight to
exec a process when sqlite has a nice built in c interface.
I could use a C binding to sqlite (like node-sqlite), use
sqlite_prepare to prepare one insert statement and kick of the inserts
in a sqlite transaction. Advantage no child process, disadvantage, how
to make this async and get a callback. Is the node.js libeio (or
whatever it is called) interfaces hard to use and plugin in this case.
Are there many advantages to this over the child process idea.
Just not sure how to weight out these decisions in general with node.
When does it become important not to block, 10 records, 1?
Could call the SQLite library in a thread instead of forking a
completely separate process? I think libeio does this for some
operations, and the SQLite FAQ says that it is thread safe.
Obviously, this depends per-project and per-deployment, but in general
you should aim to block for the absolute minimum amount of time
possible, and you should never block whilst waiting for data over a
socket.
Every time a node app blocks anywhere adds delay to all other requests
currently in-flight within node. So, for example, if you fetch 100,000
rows and it takes 1s (totally made-up numbers, for sake of argument),
and you block the whole time in your library, then *nothing else will
happen* on that server for *1 second* – your 0.4 ms response time just
jumped to 1000.4 ms.
Use continuations and setTimeout (and other methods that I encourage
others to contribute here! :-) ) if you're doing non-network related
operations that will likely take a long time. If you're doing network
ops, make it streaming and totally non-blocking except for the
necessary buffering.
Hope that helps!
b.
I'm not sure if the node-sqlite binding is actually doing things
asynchronously. Apparently there is an interface to make SQLite calls
in other threads: http://www.sqlite.org/asyncvfs.html
but I don't see any of those calls present in node-sqlite.
A quick scan of that page suggests that it's something slightly
different, analogous to a filesystem where calls return immediately
without any guarantee that they were actually executed. They do this
by through VFS, SQLite's filesystem abstraction layer. Also, it's not
clear that reads (when data is on disk and not in the write queue) are
actually asynchronous because you're interfacing with the regular
SQLite API.
What you'd really want is a set of asynchronous higher-level SQLite
operations so you get callbacks when your changes are actually
committed to disk. It's been years since I last tried to embed SQLite,
but it still seems to be that running SQLite calls in a separate
thread that you manage yourself is the way to go.
In that case a binding to SQLite will be a difficult undertaking
filled with ev_async calls - but perhaps truly worthwhile.
On 21 fev, 17:59, Guan Yang <g...@yang.dk> wrote:
On Feb 21, 2:15 pm, Tim Caswell <t...@creationix.com> wrote:
> It's actually quite fast, seehttp://github.com/creationix/node-persistence/blob/master/lib/persist...for an example of this.
Tim, I really got a lot of code/inspiration from node-persistence and
it prompted my question. Using a child process to interface with the
sqlite shell process might be better for node because of the built-in
async capabilities. Can you explain what is "actually quite fast"
below? You mean using a child process is actually quite fast?
> --
> You received this message because you are subscribed to the Google Groups "nodejs" group.
> To post to this group, send email to nod...@googlegroups.com.
> To unsubscribe from this group, send email to nodejs+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/nodejs?hl=en.
>