Order of primary key and unique key / index evaluation not deterministic

18 views
Skip to first unread message

Mark Rotteveel

unread,
Apr 27, 2024, 5:15:43 AMApr 27
to firebir...@googlegroups.com
While looking at this Stack Overflow question[1], I discovered that
evaluation of primary key, unique key and unique indexes is not
necessarily deterministic across backup and restore.

It seems that the evaluation order of indexes is determined by their
RDB$INDEX_ID (and not for example the sort order of names, like for
triggers with the same position), and this in turn affects which
violation fires first.

The use of RDB$INDEX_ID means that in a scenario where indexes are only
added, indexes are evaluated in their creation order. However, this goes
out the window as soon as indexes are dropped and added, as that means a
newer index can get a lower RDB$INDEX_ID, and thus if it is a unique
index, it might signal its violation before older indexes (like the one
backing a primary key)

In addition, the RDB$INDEX_ID is not preserved across backup and
restore, and neither is their order. Instead, indexes are recreated in
their physical storage order, which might be different from the original
RDB$INDEX_ID order. In other words, after a restore, a different
constraint or index might fire first compared to the original database.

Now these might seem like a minor concerns, but it can screw up your
application logic, if - as in the linked question - you expect the
primary key constraint to fire first, but now you get a unique index
violation.

The original problem of the question can - relatively - easily be fixed
by dropping both constraints/indexes and recreating them in the desired
order, but there is no guarantee that order survives the backup and
restore if the second constraint or index ends up earlier in physical
storage order.

I think that the restore should at least preserve the order of
RDB$INDEX_ID (not necessarily RDB$INDEX_ID itself). What do you think?

Mark

[1]: https://stackoverflow.com/q/78349219/466862
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
Apr 27, 2024, 5:35:49 AMApr 27
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 27.04.2024 11:15:
> I think that the restore should at least preserve the order of RDB$INDEX_ID (not
> necessarily RDB$INDEX_ID itself). What do you think?

As an application developer I see no problem: an error is an error, no matter
in which order they are returned so unless ANSI standard has something about it
I would document this undefined behavior and leave it as is.

--
WBR, SD.

Alex Peshkoff

unread,
Apr 27, 2024, 6:51:03 AMApr 27
to firebir...@googlegroups.com
+1

I also do not see problems with it.


Dimitry Sibiryakov

unread,
Apr 27, 2024, 7:03:46 AMApr 27
to firebir...@googlegroups.com
'Dimitry Sibiryakov' via firebird-devel wrote 27.04.2024 11:35:
>   As an application developer I see no problem: an error is an error, no matter
> in which order they are returned

An example just to explain my opinion: say you perform an operation in a
database that raises both error1 and error2. You can workaround error1 but not
error2. If you get error2 first you understand that this is a dead end. If you
get error1 you try to workaround it and then you get error2. Overall result -
you are still in the same dead end.
That's why IMHO order doesn't matter.

--
WBR, SD.

Mark Rotteveel

unread,
Apr 27, 2024, 7:05:22 AMApr 27
to firebir...@googlegroups.com
That highly depends on what you can do for conflict resolution.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
Apr 27, 2024, 7:08:49 AMApr 27
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 27.04.2024 13:05:
>>    An example just to explain my opinion: say you perform an operation in a
>> database that raises both error1 and error2. You can workaround error1 but not
>> error2. If you get error2 first you understand that this is a dead end. If you
>> get error1 you try to workaround it and then you get error2. Overall result -
>> you are still in the same dead end.
>>    That's why IMHO order doesn't matter.
>
> That highly depends on what you can do for conflict resolution.

From IBP Replicator's experience: almost nothing. Maximum possible action -
attempt to update existing record if insert failed with PK/UK violation. If
(original or conflict resolution) update failed with PK/UK violation - it is the
dead end.

--
WBR, SD.

Mark Rotteveel

unread,
Apr 27, 2024, 7:11:13 AMApr 27
to firebir...@googlegroups.com
There are other types of applications that need conflict resolution,
possibly even user-interactive conflict resolution.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
Apr 27, 2024, 7:27:21 AMApr 27
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 27.04.2024 13:11:
> There are other types of applications that need conflict resolution, possibly
> even user-interactive conflict resolution.

Yes, but:

1) The case in your link is exactly a replicator.
2) Interactive conflict resolution in my example is unavoidable in the described
dead end and the path you went into it is irrelevant. (IBPR does exactly that -
leave conflict to user if automatic CR failed.)

--
WBR, SD.

Mark Rotteveel

unread,
Apr 27, 2024, 9:03:57 AMApr 27
to firebir...@googlegroups.com
On 27/04/2024 13:27, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 27.04.2024 13:11:
>> There are other types of applications that need conflict resolution,
>> possibly even user-interactive conflict resolution.
>
>   Yes, but:
>
> 1) The case in your link is exactly a replicator.

That question may have sparked my question, it does not mean that is the
only use case to consider. In other words, take a wider look than just
that case.

> 2) Interactive conflict resolution in my example is unavoidable in the
> described dead end and the path you went into it is irrelevant. (IBPR
> does exactly that - leave conflict to user if automatic CR failed.)

And yet, I can foresee situations where there will be different handling
required if its is known the record already exists as opposed to
handling the case where a new record just happens to violate a unique
index because *another* record has the same combination.

Sure, there are other ways to address that, but don't act as if it is
impossible that someone builds an application that expect a certain
order of operations, and which breaks in unexpected ways if the order
changes after a backup and restore.

Mark
--
Mark Rotteveel

Ertan Küçükoglu

unread,
Apr 27, 2024, 9:20:56 AMApr 27
to firebir...@googlegroups.com
Hello,

I have a service application which uses primary key violation errors to identify existing records.
I do not use a select before insert, since inserting a record always checks for a primary key violation.
If the primary key error fires, that insert is skipped silently. Any other error is a full stop.
Since it is a service application there is no user feedback.
Having a unique key error rather than primary key violation would be a problem for that application.

So, if I have a vote then I also support that primary key errors should fire first (following RDB$INDEX order on restore).

Sure, a customer complaint would make me look for it and figure out what's wrong eventually.
That would definitely take time for someone who didn't know about such an "undefined behavior"

Thanks & Regards,
Ertan

'Dimitry Sibiryakov' via firebird-devel <firebir...@googlegroups.com>, 27 Nis 2024 Cmt, 12:35 tarihinde şunu yazdı:
--
You received this message because you are subscribed to the Google Groups "firebird-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebird-deve...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/firebird-devel/02a081b5-7f59-4e7b-80fa-ad9e905b4aa4%40ibphoenix.com.

Dimitry Sibiryakov

unread,
Apr 27, 2024, 9:22:40 AMApr 27
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 27.04.2024 15:03:
> And yet, I can foresee situations where there will be different handling
> required if its is known the record already exists as opposed to handling the
> case where a new record just happens to violate a unique index because *another*
> record has the same combination.
>
> Sure, there are other ways to address that, but don't act as if it is impossible
> that someone builds an application that expect a certain order of operations,
> and which breaks in unexpected ways if the order changes after a backup and
> restore.

Of course it is possible for anyone to build an application that relies on
constraint check order. People do stupid things all the way. It doesn't make
these things are right.
Let's take a wider look: one has table with two unique constraints. Any of
these constraints can be violated independently so the application has to be
prepared to handle such violations of both constraints.

a) Constraint1 is violated, application solves it in a way1.
b) Constraint2 is violated, application solves it in a way2.
c) Both constraints are violated.
c1) Violation of constraint1 is reported first, way1 is applied but in the
result constraint2 is violated anyway.
c2) Violation of constraint2 is reported first, way2 is applied but in the
result constraint1 is violated anyway.

I.e. case c) is unsolvable.

Example from the SO:

Record exists with values (1,2).
Record with values (1,1) is attempted to be inserted. PK violation happen,
insert is transformed into update, result is record (1,1) all is fine. This is
what topicstarter expects.
Record with values (2,2) is attempted to be inserted. UK violation happen. I
have no idea how they suppose to fix that but let's guess the same way: insert
is transformed into update by UK, result is record (2,2) in database.

Records (1,2) and (2,1) exist.
Record with values (1,1) is attempted to be inserted. If PK violation is
reported first, insert is transformed into update UK=1 WHERE PK=1 which violates
UK constraint. Failure.
If UK violation is reported first, insert is transformed into update PK=1 WHERE
UK=1 which violates PK constraint. Failure.

Here my fantasy is out of fuel so if you can imagine a way to resolve such
situation that depends on predictable order of constraint checks - I'm all ears.

--
WBR, SD.

Mark Rotteveel

unread,
Apr 27, 2024, 9:24:02 AMApr 27
to firebir...@googlegroups.com
On 27/04/2024 15:20, Ertan Küçükoglu wrote:
> So, if I have a vote then I also support that primary key errors should
> fire first (following RDB$INDEX order on restore).

To be clear, I'm not suggesting the primary key should always fire first
(I can also see situations where you might want another unique key or
unique index to fire first), I'm just saying the order of evaluation
should be consistent across backup and restore.

Mark
--
Mark Rotteveel

Ertan Küçükoglu

unread,
Apr 27, 2024, 9:39:31 AMApr 27
to firebir...@googlegroups.com
I meant that but worded wrongly.
Keeping the order across backup and restore would be just fine.

'Mark Rotteveel' via firebird-devel <firebir...@googlegroups.com>, 27 Nis 2024 Cmt, 16:24 tarihinde şunu yazdı:
--
You received this message because you are subscribed to the Google Groups "firebird-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebird-deve...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages