pysqlite vs. APSW

367 views
Skip to first unread message

Gour-Gadadhara Dasa

unread,
Apr 30, 2011, 4:05:24 AM4/30/11
to python...@googlegroups.com
Hello!

I'm starting with Python and plan to write desktop GUI application
using QT lib and external C library wrapping it with Cython extension.

Sqlite3 will be used as application file format and considering that
we plan to store pictures in the database, I wonder whether APSW might
bet more appropriate bindings than pysqlite?

My reasoning is based on:

"APSW gives all functionality of SQLite including virtual tables,
Virtual File System (VFS), BLOB I/O, backups and file control."

(http://apidoc.apsw.googlecode.com/hg/pysqlite.html)

and http://apidoc.apsw.googlecode.com/hg/blob.html#blobio where it
says: "...An alternate approach to using blobs is to store the data in
files and store the filename in the database. Doing so loses the ACID
properties of SQLite."


Several times I was advised to use BLOBs by storing only filenames
into database, but storing complete BLOBs looks as interesting option
and 'pro' for APSW.

Any pro/cons?

We plan to make multi-platform application being able to run on
Windows/Mac/Linux/Unix and it will be developed on (Free)PC-BSD
x86_64.

Btw, any reason why there is no APSW available in FreeBSD ports?

Sincerely,
Gour

--
“In the material world, conceptions of good and bad are
all mental speculations…” (Sri Caitanya Mahaprabhu)

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810


signature.asc

Roger Binns

unread,
Apr 30, 2011, 11:56:22 AM4/30/11
to python...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[ Disclosure: I am the author of APSW so may be a little biased :) ]

On 04/30/2011 01:05 AM, Gour-Gadadhara Dasa wrote:
> "APSW gives all functionality of SQLite including virtual tables,
> Virtual File System (VFS), BLOB I/O, backups and file control."

Of those pysqlite does now support the backup API.

> Several times I was advised to use BLOBs by storing only filenames
> into database, but storing complete BLOBs looks as interesting option
> and 'pro' for APSW.

You can also store complete blobs with pysqlite. If you only need to read
or write the blobs in one go then either solution will work. APSW has the
incremental blob I/O support meaning you can treat a blob like a file handle
and do random read/writes of portions of it.

The advice to store files outside of SQLite exists primarily because of the
lack of incremental blob I/O support until not too long ago. ie when
reading or writing blobs you had to do the entire blob in one go. Not a big
deal if the size was a few kilobytes, but increasingly problematic if
multiple megabytes. The SQLite author also measured performance differences
where small blobs (less than 100kb IIRC) performed better with SQLite but
larger ones performed better using the OS file apis.

If you do store non-trivial blobs make sure they are in a separate table.
Also ensure that they are the last column of a table. SQLite uses variable
length columns and rows and skipping from one row to the next needs to skip
over the data which can take quite a bit of work if there are blobs in there.

Storing the blobs as files also makes incremental backups easier. Backup
solutions work at the file level so changing a single byte in a SQLite
database requires making a copy of the entire file. You can also use
existing platform tools ("explorer") to browse a directory full of files.

Note that either way will function correctly.

> Any pro/cons?

If you use APSW then you can use virtual tables. Your virtual table can
hide if the file is stored as a blob, in another table or as a file (or over
the network).

> Btw, any reason why there is no APSW available in FreeBSD ports?

It used to be. The biggest problem with those kind of solutions is that
they lag releases ending up way behind SQLite and APSW releases (can be
several years!)

In any event APSW is trivial to install and will even fetch the latest
SQLite for you, enable extensions (eg FTS) and make life very easy. The
final install is a single file - apsw.so in this case - so the packaging
system doesn't provide that much value.

http://apidoc.apsw.googlecode.com/hg/build.html#recommended

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iEYEARECAAYFAk28MSIACgkQmOOfHg372QTd8QCgrHmsfDgodbiBsHGSlJ2sm69G
yZwAoI4pfOyZ2YaQlkcIjE2nxumAOWHy
=Yr42
-----END PGP SIGNATURE-----

Gour-Gadadhara Dasa

unread,
Apr 30, 2011, 12:49:52 PM4/30/11
to python...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 30 Apr 2011 08:56:22 -0700
Roger Binns <rogerb-JFdGOZ7...@public.gmane.org> wrote:

> [ Disclosure: I am the author of APSW so may be a little biased :) ]

Don't worry...I noticed your name as the APSW author. :)

> APSW has the incremental blob I/O support meaning you can treat a
> blob like a file handle and do random read/writes of portions of it.

I'm not sure if we need that considering we'll probably store person's
photo(s) and different diagrams (svg, png, jpg...)

> The advice to store files outside of SQLite exists primarily because
> of the lack of incremental blob I/O support until not too long ago.

Ahh...now it's clear.

> ie when reading or writing blobs you had to do the entire blob in one
> go. Not a big deal if the size was a few kilobytes, but increasingly
> problematic if multiple megabytes. The SQLite author also measured
> performance differences where small blobs (less than 100kb IIRC)
> performed better with SQLite but larger ones performed better using
> the OS file apis.

I believe our blobs will be more then fee Ks.

> If you do store non-trivial blobs make sure they are in a separate
> table. Also ensure that they are the last column of a table. SQLite
> uses variable length columns and rows and skipping from one row to
> the next needs to skip over the data which can take quite a bit of
> work if there are blobs in there.

Thank you. Very useful tip.

> Storing the blobs as files also makes incremental backups easier.
> Backup solutions work at the file level so changing a single byte in
> a SQLite database requires making a copy of the entire file. You can
> also use existing platform tools ("explorer") to browse a directory
> full of files.

That's true, but ability to have everything in one file and not caring
about keeping many files in filesystem is advantageous for me.

> If you use APSW then you can use virtual tables. Your virtual table
> can hide if the file is stored as a blob, in another table or as a
> file (or over the network).

This is interesting.

> It used to be. The biggest problem with those kind of solutions is
> that they lag releases ending up way behind SQLite and APSW releases
> (can be several years!)

Hmm...in the case of sqlite, ports has sqlite3-3.7.6.1, which is not
so old.

I'm still very fresh with FreeBSD and already have few apps on my TODO
lists for ports, so I may take a plunge.

> In any event APSW is trivial to install and will even fetch the latest
> SQLite for you, enable extensions (eg FTS) and make life very easy.
> The final install is a single file - apsw.so in this case - so the
> packaging system doesn't provide that much value.

Another question: I've 1st edition of Owens Sqlite book, but today
was reading Using Sqlite ebook where it is stated: "..."...the
recommended way of using SQLite is to integrate the whole database
engine directly into your application." What is the recommended way
when using Sqlite with Python (e.g APSW) in order to minimize
dependency problems? Include APSW within one's application and build
similar to Cython extension (we plan to include 3rd party library and
build it as Cython extension)?


Sincerely,
Gour

- --
“In the material world, conceptions of good and bad are
all mental speculations…” (Sri Caitanya Mahaprabhu)

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (FreeBSD)

iQIcBAEBAgAGBQJNvD2wAAoJELhxXPVStcgQlA8P/2mTyRAW3v5f8FEoRVAeAcC5
4ZbtjM8MIp4LEztjdSh94821YSCG6MQsI8QFoutWYBjuNpK6IRu5XNTjo/o+wYEb
T9QrqpDAQ0xIlJ5svaBxA09ntjpkiMiJgU+QRRmIOhjmIoxnvPUp/SmxYurERSjF
ylXOYYxAuZjJZ1K+K3S5FXSpgKmp6toxfaFaEdDAgzm7lIFOybDzVs75+9K+UpWR
Wsw/4PWbbXNKRcS8v2gfh2L0WU6gVee1K8DmyUOneveVIvObh1WusVMjfhcw0crR
rXzLzd7qkpWcSO2mkbMoYBOvUfkVPkUFlojezCTUJWsa1gES/12Rol1TXV3701co
fvucV4opw2ZZqcQFSNZcMVjXzxeptyTLlm50dYBtCjtleIkyQaqHZg9OjWnVKjos
eA0iQZc478OzZEo+XmjVNO5Z7jaeckqW3tXWcf8kQxTIrbCEc4luw8kQ4FVgv/ey
+sTD0+BvHF5LLWRhP9mMsK+CqlTI2pWYnwqBnyXLevMxGUsqltjl2CNy7nBNRm4j
+3jxLXLGc/3rtBRGgjGc8dc8GSDM1Zt1wDJYPtPXEKwi02lxbtxf+WzLfR4iTAn6
0IqTlxCwPgYbFLzI8biBcKv7fTA7ybbhU8D3FUHMGioQZU9fIW/Jnv9+vK6RIC3l
wXJUv9LxX/q9rsmlPul0
=qWlk
-----END PGP SIGNATURE-----

Roger Binns

unread,
Apr 30, 2011, 1:28:41 PM4/30/11
to python...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/30/2011 09:49 AM, Gour-Gadadhara Dasa wrote:
> On Sat, 30 Apr 2011 08:56:22 -0700
> Roger Binns <rogerb-JFdGOZ7...@public.gmane.org> wrote:
>> APSW has the incremental blob I/O support meaning you can treat a
>> blob like a file handle and do random read/writes of portions of it.
>
> I'm not sure if we need that considering we'll probably store person's
> photo(s) and different diagrams (svg, png, jpg...)

As an example lets say the image is 20MB and you only want the exif
information from the beginning (a kb or two). If you use the normal SQLite
APIs then that entire 20MB has to be read into memory before you can access
the 1kb at the start. If you use the incremental blob I/O api then you can
read just that first kb.

> Hmm...in the case of sqlite, ports has sqlite3-3.7.6.1, which is not
> so old.

3.7.6.2 is the current version and the difference between that and .1 was
due a fix NetBSD needed and hence probably applies to FreeBSD too. (It was
to do with calling a pointer to the open(2) function that takes varargs.)

> Another question: I've 1st edition of Owens Sqlite book, but today
> was reading Using Sqlite ebook where it is stated: "..."...the
> recommended way of using SQLite is to integrate the whole database
> engine directly into your application." What is the recommended way
> when using Sqlite with Python (e.g APSW) in order to minimize
> dependency problems? Include APSW within one's application and build
> similar to Cython extension (we plan to include 3rd party library and
> build it as Cython extension)?

The recommended APSW build instructions include the SQLite code directly
into APSW. Any system SQLite is completely ignored. Heck it will even work
correctly if another part of the app loads a different version of SQLite
into the same process. (This happens on Mac where CoreData uses SQLite.)

I made sure APSW binary is a single file so just include that with your
application distribution. It has no external dependencies.

For a source distribution you can include the .zip from the APSW download.
You can also produce a zip that includes the SQLite source with the APSW code.

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iEYEARECAAYFAk28RsUACgkQmOOfHg372QT/rwCgvj19CJIrHLYhr/cKIvOIJhDi
ABEAn3YDGVKbMNN4H9JFFVHVyK9wur/d
=rTWt
-----END PGP SIGNATURE-----

Gour-Gadadhara Dasa

unread,
Apr 30, 2011, 3:11:07 PM4/30/11
to python...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 30 Apr 2011 10:28:41 -0700
Roger Binns <rogerb-JFdGOZ7...@public.gmane.org> wrote:

> As an example lets say the image is 20MB and you only want the exif
> information from the beginning (a kb or two). If you use the normal
> SQLite APIs then that entire 20MB has to be read into memory before
> you can access the 1kb at the start. If you use the incremental blob
> I/O api then you can read just that first kb.

I agree...it might become useful for us as well.

> I made sure APSW binary is a single file so just include that with
> your application distribution. It has no external dependencies.

Very nice.

> For a source distribution you can include the .zip from the APSW
> download. You can also produce a zip that includes the SQLite source
> with the APSW code.

Looks good...let me try it.

Thank you for your input, anf, of course, for providing APSW. ;)


Sincerely,
Gour

- --
“In the material world, conceptions of good and bad are
all mental speculations…” (Sri Caitanya Mahaprabhu)

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (FreeBSD)

iQIcBAEBAgAGBQJNvF7LAAoJELhxXPVStcgQNS4QAM0qaYikf3LABrwrd8oDVFRK
7kcdCAKMTjIULWWzEMVWVMGILRUGHUDet3f/o+Ohq6fEnhbr+jVVtbzuPjBZ7fPF
31bbk44e8gJZAQ6Jla9qfbZ+RzK4kkMa01PgL1n4CgIQOL/wOnZVYndzGxIMfBT1
BOccC2iOKX+Z4SKcguGsVuazdI2uetPtuPPp7zB+5EWR2uZjEzkBxq8HNQyYp5tW
2TKSgEmnsCA/B8aCMX76NKrtdOYhQ9GY13LhXCxiVAvdyy1uJSdhjaBMclzlgrxg
zb5J+QXnf3mUzGkrcFDSRcfjQngBONqQF3lbr0274+gO+rsvhsLih/V6Kp5vR45o
zfMn+qlIKaO6KpPggdjMU9yTxHzMLjjRL/Kd38ijhMDFWJ0a/Of49+CoXDD+kY6l
PisF6Gxy8g+tueTmAq5chvYsFyD58bW22q/RGKYI6TGsoXXJswKO5N237/nFtDgm
rYiYXlA3f3OeLPFGFt/0hM3jRRzfmG1nKNChj7wa+2c08b0ERvL6UFhfeUFjcGox
VjvgJmbjc7FdOiJ/Lg7fAT0rwUOox9BY28dWeyqz6Jw7r25Vs9YCkMNudaSyIZro
XwH/eLE/XvZN2+pw5e8kwnX0GPDT4UblDllhGaSTusEQyGlWp+5jEqz2LaBeg1of
dArUogGJdC4L2ZJpnF4W
=JHBC
-----END PGP SIGNATURE-----
Reply all
Reply to author
Forward
0 new messages