When are constraints being checked.

590 views
Skip to first unread message

Rami

unread,
Nov 24, 2010, 5:43:33 PM11/24/10
to h2-da...@googlegroups.com
Hi,

Let's assume a db with 2 tables.

CREATE TABLE A(ID IDENTITY, FK BIGINT);
CREATE TABLE B(ID IDENTITY, FK BIGINT REFERENCES A(ID));
ALTER TABLE A ADD CONSTRAINT FOO FOREIGN KEY(FK) REFERENCES B(ID);

So A refers to B and vice versa.

Now I would like to insert a row into A and B.
row 'a' refers to row 'b' and vice versa.

SET AUTOCOMMIT OFF;
INSERT INTO A(ID, FK) VALUES(1, 1);
INSERT INTO B(ID, FK) VALUES(1, 1);
COMMIT;

At the moment these statements produce errors because the foreign key
constraint (and other constraints eg NOT NULL etc.)
are checked at the statement boundary.

It turns out there is no way to insert these rows (other than turning
referential constain checking off
in the whole database for all connections but that is obviously not good
for consistency.)

I propose that all constraints would be checked at the transaction boundary.
So within a transaction the database could be in an inconsistent state
but only for
that one transaction. And when the changes of the transaction are
published to other
connections after a succesful commit data would be in consistent state
because
all checks for the changes have been made.

Does someone see any problems to consistency with this approach?
Would this be a big change?

- Rami

Dario Fassi

unread,
Nov 24, 2010, 5:58:17 PM11/24/10
to h2-da...@googlegroups.com
Hi Rami,
I think this schema would not work because DEFERRED constraints are not supported yet.
A deferred constraint will be checked at commit time (at least not at Statement.execute time) to let the rest of transaction the opportunity to satisfy all involved deferrable constraints.

See this: http://www.h2database.com/html/grammar.html#referential_constraint
"Defines a referential constraint. If the table name is not specified, then the same table is referenced. RESTRICT is the default action. As this database does not support deferred checking, RESTRICT and NO ACTION will both throw an exception if the constraint is violated. If the referenced columns are not specified, then the primary key columns are used. The required indexes are automatically created if required. Some tables may not be referenced, such as metadata tables."

regards,
Dario

El 24/11/10 19:43, Rami escribió:

Rami

unread,
Nov 24, 2010, 6:04:15 PM11/24/10
to h2-da...@googlegroups.com
Indeed.
I was just wondering why not make all constraints deferred?
What would be the problem with that?
For example NOT NULL constraint is checked before even the BEFORE trigger is invoked.
Sometimes this can cause problems because the trigger would provide the value but now it can't.

I am just wondering what bad consequences would it have to do all the constraint checking at the transaction boundary.
I know it would have a lot of good effects.

- Rami

Dario Fassi wrote:
--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To post to this group, send email to h2-da...@googlegroups.com.
To unsubscribe from this group, send email to h2-database...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/h2-database?hl=en.

Dario Fassi

unread,
Nov 24, 2010, 7:00:35 PM11/24/10
to h2-da...@googlegroups.com

El 24/11/10 20:04, Rami escribió:
Indeed. I was just wondering why not make all constraints deferred?

Because in some (or most) cases you want an immediate rejection to rollback and release resources and locks  ASAP.
 
Suppose you have a giant transaction and the first sentence fail on a constraint you know it's unrelated with the rest of transaction,
then the db engine will have to carry out all the job (and allocating all resources)
for nothing.


What would be the problem with that?

As I can see, computational cost and resources.

 
For example NOT NULL constraint is checked before even the BEFORE trigger is invoked.
Sometimes this can cause problems because the trigger would provide the value but now it can't.

This is a different issue and in this case I can agree with you.
Don't remember how other databases handle this case.

Rami

unread,
Nov 25, 2010, 3:35:46 AM11/25/10
to h2-da...@googlegroups.com

> Because in some (or most) cases you want an immediate rejection to
> rollback and release resources and locks ASAP.
>
> Suppose you have a giant transaction and the first sentence fail on a
> constraint you know it's unrelated with the rest of transaction,
> then the db engine will have to carry out all the job (and allocating
> all resources) for nothing.

But wouldn't it be a more appropriate place to decide that when starting
a transaction.
You see when I create a referential constraint (or any other constraint
for that matter)
I don't know if someone wants to have it deferred or not.
But when someone starts his transaction he knows if he is going to break
some consistency rules or not.

So wouldn't it be better to make deferring of the constraint checks a
setting in a connection like this.

SET CHECKS_DEFERRED ON;
SET AUTOCOMMIT OFF;
INSERT ...
INSERT ...
INSERT ...
WHATEVER ...
COMMIT;

- Rami

>
>> What would be the problem with that?
>
> As I can see, computational cost and resources.
>
>> For example NOT NULL constraint is checked before even the BEFORE
>> trigger is invoked.
>> Sometimes this can cause problems because the trigger would provide
>> the value but now it can't.
>
> This is a different issue and in this case I can agree with you.
> Don't remember how other databases handle this case.
>
>> I am just wondering what bad consequences would it have to do all the
>> constraint checking at the transaction boundary.
>> I know it would have a lot of good effects.
>>
>> - Rami
>>

Sergi Vladykin

unread,
Nov 25, 2010, 4:11:13 AM11/25/10
to H2 Database
Hi,

I agree with Dario, that "fail fast" behavior is more appropriate in
the majority of cases, so making all the checks deffered is not a good
idea.
Better to do this in a Postgresql way
http://www.postgresql.org/docs/9.0/interactive/sql-set-constraints.html

regards,
Sergi

Rami

unread,
Nov 25, 2010, 4:32:28 AM11/25/10
to h2-da...@googlegroups.com
Hmm. But what I proposed was that normally (initially) when a transaction starts
the checks are immediate
CHECKS_DEFERRED = FALSE

With SET
CHECKS_DEFERRED TRUE one can make them deferred for the lifetime of that connection.

What Postgresql adds to this is an extra layer of complexity:
A way to define for foreign key constraints whether they are initially in DEFERRED or IMMEDIATE mode.
So simply setting a default value.
Well, to be fair the always NOT DEFERRABLE allows for foreign keys that can not be deferred
(why would someone want this?)

But there is a clear blunder that Postgresql has in it's constraint implementation.
It does not treat all constraints uniformly.
Quote:

"Compatibility

This command complies with the behavior defined in the SQL standard, except for the limitation that, in PostgreSQL, it does not apply to NOT NULL and CHECK constraints. Also, PostgreSQL checks non-deferrable uniqueness constraints immediately, not at end of statement as the standard would suggest."

Who would like to say in his tables definition:
COLUMN INTEGER NOT NULL
DEFERRABLE INITIALLY DEFERRED UNIQUE DEFERRABLE INITIALLY IMMEDIATE PRIMARY KEY  NOT DEFERRABLE REFERENCES FOO(x) DEFERRABLE INITIALLY DEFERRED ON UPDATE RESTRICT ON DELETE CASCADE

?!?

Most important is that we treat all the constraints in the same manner.

I have nothing against the ability to set default values to constraints
but to me it seems that the crucial feature is the construct

SET CONSTRAINTS { ALL | name [, ...] } { DEFERRED | IMMEDIATE }

The rest is building some extra features on top of that fundamental feature.

- Rami

Dario Fassi

unread,
Nov 25, 2010, 1:23:25 PM11/25/10
to h2-da...@googlegroups.com
Hi,

@Sergi

This is DB2's way too, but need grammars change in many others commands like:
 
    [CREATE | ALTER] TABLE  [ADD|ALTER] CONSTRAINTS ...

Referential Constaint:

 --+- [ NOT
DEFERRABLE ] -----------------+-
   |                                      |
   +-
DEFERRABLE INITIALLY -+- IMMEDIATE -+
                            |             |
                            +-
DEFERRED  --+

Upon creation, a constraint
is given one of three characteristics:
    DEFERRABLE INITIALLY DEFERRED,
    DEFERRABLE INITIALLY IMMEDIATE,
    NOT DEFERRABLE.
The third class is always IMMEDIATE and is not affected by the SET CONSTRAINTS command.
The first two classes start every transaction in the indicated mode, but their behavior can be changed within a transaction by SET CONSTRAINTS.

@Rami,

SET CHECKS_DEFERRED TRUE  for the lifetime of that connection.

I think that your proposition isn't enough, a finer granularity is needed because the constraint's subject is the CONSTRAINT and not the sentence .

In cases with circular references (like your initial sample) be able to setup IMMEDIATE for all sentence  in a transaction ,
make no sense because this will fail always and that is known at design time.
 

"Compatibility
This command complies with the behavior defined in the SQL standard, except for the limitation that, in PostgreSQL, it does not apply to NOT NULL and CHECK constraints. Also, PostgreSQL checks non-deferrable uniqueness constraints immediately, not at end of statement as the standard would suggest."

You are right, but this matter is mainly about REFERENTIAL constraints - nothing is perfect !

Who would like to say in his tables definition:
COLUMN INTEGER NOT NULL
DEFERRABLE INITIALLY DEFERRED UNIQUE DEFERRABLE INITIALLY IMMEDIATE PRIMARY KEY  NOT DEFERRABLE REFERENCES FOO(x) DEFERRABLE INITIALLY DEFERRED ON UPDATE RESTRICT ON DELETE CASCADE


In the grammar if any nothing is specified defaults to  NOT DEFERRABLE, so you only need to specify deferred cases (that are exceptional cases).

regards,
Dario.

Rami

unread,
Nov 25, 2010, 2:57:21 PM11/25/10
to h2-da...@googlegroups.com
Sergi,

Do you know where in the source the constraint check is done?
I tried to follow quickly from Session to Command to Table to Index but
I must have missed the constraint checks.
All other constraints (except NOT NULL) seem to inherit from a common
root (Constraint) which is a good sign :-)

- rami

Sergi Vladykin

unread,
Nov 25, 2010, 3:22:34 PM11/25/10
to H2 Database
See org.h2.table.Table.fireConstraints(...) method

Rami

unread,
Nov 25, 2010, 3:51:00 PM11/25/10
to h2-da...@googlegroups.com
Thanks.

So after a quick glance it seems that the firing of constraints is very nicely concentrated.
Only table.fireBeforeRow and table.fireAfterRow call it.

The signature is
fireConstraints(Session session, Row oldRow, Row newRow, boolean before)

The important ones are oldRow and newRow.

So if someone would like to move the constraint check to the end of transaction
one would need to create some kind of changeset caused by the transaction.

The simple way to do it would be
List<Row> oldRows, List<Row> newRows
that would simply be a list of rows that have been somehow affected by the transaction.

These lists could be populated in the same place where the constraints are now being checked.

And then at commit time the constraints could be actually checked.

And for all you speed freaks this could actually increase speed because now the constraints
are checked twice for every statement (before and after the statement). Doing it at transaction boundary
would require running them only once :-)

Comments?
Should I give this a shot?

- Rami

Sergi Vladykin

unread,
Nov 25, 2010, 4:50:42 PM11/25/10
to H2 Database
> So if someone would like to move the constraint check to the end of
> transaction
> one would need to create some kind of changeset caused by the transaction.
>
> The simple way to do it would be
> List<Row> oldRows, List<Row> newRows
> that would simply be a list of rows that have been somehow affected by
> the transaction.
>
> These lists could be populated in the same place where the constraints
> are now being checked.
>
> And then at commit time the constraints could be actually checked.

On long transactions or large data imports this can cause out of
memory errors. Also the same row can be modified mutiple times in
transaction so in these lists you will hold multiple versions of the
same row. This is only simple problems I see but I think there will be
more weird ones. Anyways just keep all changes in memory obviously is
not very good idea.
In my opinion here is needed to store only failed row keys, manage
them (delete key if row was deleted from table or if this row was
updated and constraint was satisfied, may be more complex things will
be needed for referential constraints) and if on commit at least one
failed row key is still here then transaction should be rolled back
and exception thrown.

> And for all you speed freaks this could actually increase speed because
> now the constraints
> are checked twice for every statement (before and after the statement).
> Doing it at transaction boundary
> would require running them only once :-)

It is not true. Before statement called one part of constraints and
after another part but not all constraints two times (see isBefore()).

> Comments?
> Should I give this a shot?

I think it would be nice if you will implement this feature but please
do more investigations before coding.

regards,
Sergi

Dario Fassi

unread,
Nov 25, 2010, 6:43:59 PM11/25/10
to h2-da...@googlegroups.com
Hi,

El 25/11/10 17:51, Rami escribió:


> The signature is fireConstraints(Session session, Row oldRow, Row newRow, boolean before)

> So if someone would like to move the constraint check to the end of transaction
> one would need to create some kind of changeset caused by the transaction.
>
> The simple way to do it would be
> List<Row> oldRows, List<Row> newRows
> that would simply be a list of rows that have been somehow affected by the transaction.
> These lists could be populated in the same place where the constraints are now being checked.

> Comments?

I hope it's as simple as it seems, but I'm afraid it won't be.
Every insert, update and delete in the same transaction could cascade other updates and fire triggers doing what ever they want, and all that updates have the potential to queue more deferred constraints to check on many distinct tables.
I imagine that to defer all constraint's check (those who are nowchecked before) it may be necessary to retain more information (or ab-use the transaction log) that a list of old and new row per table.
Anyway I hope to be wrong.

> Should I give this a shot?
>

Any step forward to deferred check implementation would be nice and useful even a partial one.

regards,
Dario.

Rami Ojares

unread,
Nov 26, 2010, 10:51:08 AM11/26/10
to h2-da...@googlegroups.com
Good pointers Sergi and Dario.
So my checklist is:
- test large transaction from memory point of view
- think about how to identify rows to be the same row
- do more investigations before coding
- Expect weird and disappointing things to come your way

With that and Thomas' long instructions on providing patches
I will move forward. I will first just do something simple and stupid
and try it in my own project just to see if I can make it work at all.

I will keep you posted.

- rami

Reply all
Reply to author
Forward
0 new messages