> The problem is that by default in most of our backends, a transaction runs
> with constraint checks deferred. This means that foreign keys are *not*
> checked for integrity until a transaction is committed. But here's the rub:
> we never commit a transaction during tests, we roll it back instead.
> Therefore, none of the queries against the DB are being checked for
> integrity. (The exception is MySQL InnoDB, which I'll get to in a sec.)
> As part of the work I was doing recently to disable and re-enable constraint
> checks during fixture loading, I realized that we were handling this sort of
> incorrectly across our backends. Postgresql in fact does provide a facility
> for enabling constraint checks: SET CONSTRAINT CHECKS ALL IMMEDIATE. This
> statement causes Postgresql to behave as it normally would, when it was
> executing queries outside of a transaction.
By default, PostgreSQL creates all constraints as NOT DEFERRABLE. To
get the observed behaviour we'd need to be explicitly defining
constraints as DEFERRABLE INITIALLY DEFERRED.
Maintaining the property of deferrable constraints seems important
here, so changing the deferrability of constraints, or overriding it
using the SET CONSTRAINTS command at the top of the transaction might
not be what we want.
What I would recommend is that we issue an explicit SET CONSTRAINTS
ALL IMMEDIATE command immediately before the ROLLBACK at *end* of
test. This will fire any outstanding checks. That way all constraint
checks will occur in the same place they would during a commit, yet we
can maintain the situation that the test ends with a rollback.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> Maintaining the property of deferrable constraints seems important
> here, so changing the deferrability of constraints, or overriding it
> using the SET CONSTRAINTS command at the top of the transaction might
> not be what we want.
Well, that's just it. We want tests to behave as much like production code as possible, so we actually *don't* want constraint checks to be deferred.
When you're working with DB normally in production, i.e. outside of a transaction, constraints are checked immediately. It's only when you get into a transaction that they are deferred. Each of our tests run inside of a transaction, which is an artificial construct. So to emulate production, I'm arguing that this property is not something we want (*unless* we are testing a transaction or transaction like behavior, in which case it does make sense to temporarily suspend constraint checks).
> What I would recommend is that we issue an explicit SET CONSTRAINTS
> ALL IMMEDIATE command immediately before the ROLLBACK at *end* of
> test. This will fire any outstanding checks. That way all constraint
> checks will occur in the same place they would during a commit, yet we
> can maintain the situation that the test ends with a rollback.
This would conceivably work I think. I'm pretty sure Ramiro was exploring this approach actually. My feeling, however, is that this still allows you to get away with stuff you might not otherwise get away with in production. I also think it's more helpful seeing exactly where an integrity issue came up so you can address it. This is, for example, what allowed me to understand the handful of bugs that were hiding, i.e. because I could trace the Intergrity Error to the exact line of code that was triggering it.
Your questions raise one issue that had not occurred to me though. One possible "problem" with putting constraint checks at the beginning is that there is now way for a test to recover from them. If you try to put bad data into the DB with immediate constraint checks on, you will raise and error *and* if I'm not mistaken the transaction will be rolled back at that very instant. So if for some reason you knew you were putting bad data in and wanted to recover from it in your test and keep going, that would not be possible. I'm not sure that's actually a problem, but it's certainly something to consider. It's another dimension of behavior that is changed.
My role here is to help with Postgres details, so any comments I make
are just attempts at providing useful information.
It sounds like there is a slight confusion about the way PostgreSQL
works. Everything is a transaction, so there is no difference between
inside/outside a transaction.
You can choose whether you wish immediate or deferred constraint
checking. It's completely user selectable, though the default is
immediate. If you're seeing deferred checks, they can be changed.
ISTM that the tests should work the same way as things do in
production, and it looks possible to make that happen.
>> What I would recommend is that we issue an explicit SET CONSTRAINTS
>> ALL IMMEDIATE command immediately before the ROLLBACK at *end* of
>> test. This will fire any outstanding checks. That way all constraint
>> checks will occur in the same place they would during a commit, yet we
>> can maintain the situation that the test ends with a rollback.
>
> This would conceivably work I think. I'm pretty sure Ramiro was exploring this approach actually. My feeling, however, is that this still allows you to get away with stuff you might not otherwise get away with in production. I also think it's more helpful seeing exactly where an integrity issue came up so you can address it. This is, for example, what allowed me to understand the handful of bugs that were hiding, i.e. because I could trace the Intergrity Error to the exact line of code that was triggering it.
It does seem sensible to make the tests work like production. Are we
sure they are different, given comments above?
> Your questions raise one issue that had not occurred to me though. One possible "problem" with putting constraint checks at the beginning is that there is now way for a test to recover from them. If you try to put bad data into the DB with immediate constraint checks on, you will raise and error *and* if I'm not mistaken the transaction will be rolled back at that very instant. So if for some reason you knew you were putting bad data in and wanted to recover from it in your test and keep going, that would not be possible. I'm not sure that's actually a problem, but it's certainly something to consider. It's another dimension of behavior that is changed.
If you want to recover from an error and then continue, you can use
SAVEPOINTs. These allow you to put a mark in the sand and return to it
in case of error.
e.g. in SQL
BEGIN;
INSERT -- succeeds
SAVEPOINT s1
INSERT -- fails check
ROLLBACK to SAVEPOINT s1;
INSERT -- succeeds
COMMIT;
I hope that info helps the decision process.
> On Sun, Jul 10, 2011 at 3:27 PM, Jim Dalton <jim.d...@gmail.com> wrote:
>> On Jul 10, 2011, at 3:13 AM, Simon Riggs wrote:
>>
>>> Maintaining the property of deferrable constraints seems important
>>> here, so changing the deferrability of constraints, or overriding it
>>> using the SET CONSTRAINTS command at the top of the transaction might
>>> not be what we want.
>>
>> Well, that's just it. We want tests to behave as much like production code as possible, so we actually *don't* want constraint checks to be deferred.
>>
>> When you're working with DB normally in production, i.e. outside of a transaction, constraints are checked immediately. It's only when you get into a transaction that they are deferred. Each of our tests run inside of a transaction, which is an artificial construct. So to emulate production, I'm arguing that this property is not something we want (*unless* we are testing a transaction or transaction like behavior, in which case it does make sense to temporarily suspend constraint checks).
>
> My role here is to help with Postgres details, so any comments I make
> are just attempts at providing useful information.
Thank you, btw, for offering your input in that capacity. I found it quite reassuring when I read your first reply that someone with your credentials was giving feedback on this issue. :)
>
> It sounds like there is a slight confusion about the way PostgreSQL
> works. Everything is a transaction, so there is no difference between
> inside/outside a transaction.
> You can choose whether you wish immediate or deferred constraint
> checking. It's completely user selectable, though the default is
> immediate. If you're seeing deferred checks, they can be changed.
>
> ISTM that the tests should work the same way as things do in
> production, and it looks possible to make that happen.
Okay, this is interesting -- and thank you for clarifying this point, which I had grossly oversimplified. I think you actually shed a huge amount of light on part of the problem here (at least for me).
So in normal, everyday use, Django opens a connection to Postgresql with an open transaction. Per the docs, it commits this transaction immediately whenever you do an INSERT or UPDATE etc. At that point Postgresql would run its constraint checks and rollback the transaction on error or else commit.
During a test, we open up a big transaction during setUp and then *disable* the transaction commit and rollback operations, i.e. a call to either does nothing. Since we can't nest transactions in Postgresql (right?) this has been the only sensible way to allow tests to proceed. The problem has been -- I am positing -- that we're getting away with a lot of stuff as a result, because constraint checks are *never* checked during testing. The small slew of bugs I found is a demonstration of that, to me.
My solution up to now, however, is also "unlifelike", which your comment has helped me to realize. It works kind of similar to the way Django works normally because in some ways Django sort of emulates immediate checks in its default behavior. But I think my present solution is a kludge in the other direction and I agree is not close enough to reality.
I'm thinking this might not be too hard to solve though. I'm thinking that rather than *only* enabling constraint checks immediately as I was suggesting before, we could instead focus on the transaction-related methods that we are presently just blanking out (by this I mean when tests are set up, transaction.commit() transaction.rollback() etc. are being monkey patched with a function that does nothing). Even though we can't nest a transaction explicitly, we can better emulate the behavior of normal Django operation. We can keep things in deferred mode as they are in production, but then do a temporary flip to IMMEDIATE at commit() which will fire the constraint checks. Presumably, we could even do a SAVEPOINT, as you suggested, at the start so that the test could be recovered and continue on if e.g. the IntegrityError needed to be caught during testing.
I imagine that this will still flush out most if not all of the little errors I had discovered (and prevent us from writing test cases going forward with similar problems) but at the same time will very closely emulate production. The challenge will be wrangling all this together with the other DBs (i.e. MySQL and SQLite) which don't share this execution model. Should be workable though.
Anyhow, thank you once again Simon for shedding light on this for me.
Cheers
Jim
> So in normal, everyday use, Django opens a connection to Postgresql with an
> open transaction. Per the docs, it commits this transaction immediately
> whenever you do an INSERT or UPDATE etc. At that point Postgresql would run
> its constraint checks and rollback the transaction on error or else commit.
Yes, each statement is a transaction in itself unless you explicitly
request an extended transaction.
> During a test, we open up a big transaction during setUp and then *disable*
> the transaction commit and rollback operations, i.e. a call to either does
> nothing. Since we can't nest transactions in Postgresql (right?) this has
> been the only sensible way to allow tests to proceed. The problem has been
> -- I am positing -- that we're getting away with a lot of stuff as a result,
> because constraint checks are *never* checked during testing. The small slew
> of bugs I found is a demonstration of that, to me.
No, sorry, nested transaction commands aren't supported.
> Anyhow, thank you once again Simon for shedding light on this for me.
No problem. Thanks to everybody working on Django.