> Take it like that. Transactions are a way to provide consistent data to
> every reader/writer connected to a database.
Right, I've got that part down; it is the "what is the magic that
Web2py is doing behind my back" part that I was (and still am, a
little bit) unclear on.
Ok, just to double check, that is for all DB back-ends or just for SQLite?
Ah, OK. So it would be easier to write code that uses any DB except
for SQLite... Hmmm
Ok, so I think we've established that this is a special limitation of SQLite.
For all the other DBs, I can have my Web2py app inserting new records
willy nilly and the background process updating them (mail sent, mail
failed, whatever) without the Web2py app having performance/response
time issues due to database contention.
The trick comes in when SQLite is being used.
Just for completeness, if I do a commit on every loop, what is the
window where Web2py is causing the SQLite lock to be taken?
Is it just between the update_record and the commit?
I ask because I'm not yet clear on how it is that the commit on every
loop will allow the Web2py app to be more responsive than the commit
after all 500 cases. It must be that in the commit on every loop there
is a substantial chance for the Web2py app to get to the database. But
if a commit causes a new transaction to start, how does Web2py get in?
In fact, the example in the book shows an outer loop that looks for
new work, and when done, sleeps for 60 seconds. If the commit call
starts a new transaction, then on the very last of the 500 records,
we'll fall out of the inner loop and then sleep for 60 seconds, which
should lock Web2py out for a full minute? How does the server process
drop the transaction in that case so that the Web2py process has a
chance to get into the DB?
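For reference, here is roughly the pattern I mean, as I understand the
book's example (just a sketch, assuming db and a work table are already
defined; the queue table, its fields, and send_email() are names I made up):

import time

while True:
    rows = db(db.queue.status == 'pending').select()
    for row in rows:
        ok = send_email(row.address, row.body)  # placeholder for the real work
        row.update_record(status='sent' if ok else 'failed')
        db.commit()   # commit per record: the SQLite write lock is released here
    db.commit()       # defensive commit so nothing is left open during the sleep
    time.sleep(60)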
(I'm beginning to think that The Book should just say: This example is
for any DB except SQLite, don't use SQLite for this kind of processing
???)
> I don't think web2py is doing any magic -- I think it just relies on the
> standard database and database driver behavior.
"There in lies the rub" - the problem is that The Book isn't really
very easy to use to find out just what the DAL does for each DB and
what "shows through" as the behaviour of the underlying DB.
That is, The Book isn't really a reference manual, which is fine, but
it also leaves me wondering whether there is something the DAL does
that I just haven't found (or remembered) the right part of The Book
for.
I guess the issue here is that I don't really want to tie my code (or
the example in The Book) to any one DB, but there are DB "quirks"
which manage to show through, and as a new Web2py user, it's hard to
build up a model of what the DAL does and what leaks through. :-)
For example, the content at the link you provided ends with "7.0
Transaction Control At The SQL Level" and talks about autocommit mode
vs. "out of autocommit mode". I searched The Book for "autocommit" and
found nothing. I also searched for SQLite and didn't find anything
mentioning that (that I was able to recognize).
It matters because with autocommit off, SELECTs will acquire a SHARED lock.
If I am reading things correctly, that might cause this server script
to block any writes/updates from the Web2py app.
Specifically, if the server does not find any records to update (i.e.
no work to do), the query/SELECT it does to find work items would
provoke a SHARED lock, but with no records to update the server won't
do a commit (or anything else that I can see) that would drop that
lock?
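Or, spelled out as code (again a sketch with made-up names; whether the
extra commit in the "no work" branch is actually required is exactly
what I'm asking):

import time

while True:
    rows = db(db.queue.status == 'pending').select()
    if not rows:
        # No work found: nothing below touches the DB, so if the select
        # left a SHARED lock behind (autocommit off at the driver level),
        # only an explicit commit here would release it before sleeping.
        db.commit()
        time.sleep(60)
        continue
    # ... otherwise process the rows as usual ...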
I was being half kidding, but only half. :-)
> Maybe I should make a slice on how to fix this behaviour also for
> SQLite..... it's just a bit complicated because it requires to rebuild the
> sqlite module with an updated library.
Oh, that's unfortunate. I have to say, having SQLite as part of
Python (i.e. not rebuilding) is a huge win for getting new users to
the early success that makes them (or at least me) happy to keep using
and expanding their Web2py usage. :-)
>>> import sqlite3
>>> sqlite3.version
'2.4.1'
>>> sqlite3.sqlite_version
'3.6.22'
>>> import sqlite3
>>> sqlite3.version
'2.6.0'
>>> sqlite3.sqlite_version
'3.7.9'
sudo pip install http://pysqlite.googlecode.com/files/pysqlite-2.6.3.tar.gz --global-option build_static

I'm looking for The Book to tell me what to expect.
If it doesn't tell me, then I don't know if the current behaviour
(once I discover it) will be carried forward to future versions of
Web2py as part of the backwards compatibility promise, or, if the
current behaviour is just a quirk of the current implementation that
might change and which could bite me if I rely on it.
Can I count on the DAL setting or not setting autocommit mode? How
would I know whether changing that "behind the DAL's back" would break
anything? Does the DAL even care?
Since the DAL is trying to abstract away details about particular DB
engines so that Web2py users can write code that "doesn't care", it
also needs to say (for each DB Engine) what assumptions it is
making... that is, what matters to it and what doesn't. It's perfectly
fine to say, for each DB engine, which configurations/settings/etc.
Web2py 'cares about' and which it doesn't.
Sooo.... Have I misunderstood the SQLite locking description?
> I believe so. The shared lock does not last forever -- only as long as it
> takes to execute the select. Once the data have been read, the shared lock
> (for that particular select) is released.
Yes, if autocommit is on. The part you didn't quote was that
autocommit can be disabled, in which case a SHARED lock is held.
Since the documentation about what the DAL uses/needs/assumes about
the underlying Python layer (pysqlite in this case) isn't there, it's
just easier to avoid using it outside of a Web2py application.
> You're forgetting another "layer". DAL leverages on DBAPI implementation of
> sql drivers.
Well, that's not exactly good news for transparency and ease of documentation.
But what I would prefer to do is actually reason about what should be
happening.
Sure, there is documentation about the quirks of SQLite, but nothing
(unless I've missed it after repeated reading) about the
assumptions/defaults, etc. of how the DAL uses the intermediate glue
layers (pysqlite, etc.).
Well, earlier it was claimed that the DAL for SQLite left autocommit
on, but now:
> web2py is just using the Python standard library sqlite3 module. According to the docs, it looks like it does not default to autocommit mode.
So now I have an extra place to go look and try to figure out: does
the DAL actually implement DBAPI with autocommit or does the DAL
implement DBAPI but with an exception re: autocommit? Having more
places to look is not 'ease of documentation' from a user's
perspective. :-/
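For what it's worth, the knob in the standard library module is
isolation_level, and it is easy to poke at directly (a small sketch):

import sqlite3

# Default: isolation_level == '' -- the module issues an implicit BEGIN
# before INSERT/UPDATE/DELETE and holds that transaction open until
# commit()/rollback(). Plain SELECTs do not open a transaction here.
conn = sqlite3.connect(':memory:')
print(repr(conn.isolation_level))    # ''

# isolation_level=None -- true autocommit: every statement commits itself.
auto = sqlite3.connect(':memory:', isolation_level=None)
print(repr(auto.isolation_level))    # None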
It matters if the query at the top of the loop holds a lock on the DB,
which means that when there are no rows found to process, the database
remains locked while the server is sleeping...
Again, with autocommit off:
The server process does a query/select to see if there is any work to
do. This provokes SQLite to acquire a SHARED lock. Then the server
process decides "oops, no work to do, let's go to sleep for a minute".
Notice that it does nothing to release that lock.
> Yes, it does release the lock, as soon as the select is complete. You do not
> have to do db.commit() after a select.
Please show me where that is documented.
Section 7 of http://www.sqlite.org/lockingv3.html says that ONLY
happens when autocommit is on, and you have stated that it is off
(contrary to my mistaken memory).
Once all reading has completed, the SHARED lock is dropped.
Further, The Book says that only
inside of a Web2py app does commit/rollback happen automatically.
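As I read it, that means a standalone script has to do something like
this (my sketch; the import path, file names, and table are placeholders
for whatever your layout actually is):

from gluon.dal import DAL, Field   # using the DAL outside of web2py

db = DAL('sqlite://storage.sqlite', folder='applications/myapp/databases')
db.define_table('thing', Field('name'))

db.thing.insert(name='example')
db.commit()   # required: outside a request, nothing commits (or rolls back) for you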
> SQLite is a file-based database, so it has to use coarser locking than a typical client/server database. Other databases have "select for update" functionality, which allows you to select a set of records and lock them. web2py actually supports this, via .select(..., for_update=True).
> Anyway, you might be better off looking into using the scheduler, or a
> third-party task queue like Celery (a web2py Celery plugin was started at
> some point, though not sure it is complete).
I don't understand: how will that let me do a read-modify-update
transaction? It seems as if for_update is precisely what I need to do
that.
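For the record, the read-modify-update I have in mind would look
something like this with for_update (a sketch with made-up table/field
names; as noted, SQLite itself won't actually take row-level locks here):

# Lock the pending rows (SELECT ... FOR UPDATE on back-ends that support it),
# modify them, then release the locks with the commit.
rows = db(db.queue.status == 'pending').select(for_update=True)
for row in rows:
    row.update_record(status='processing')
db.commit()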