Intermittent deadlock errors even with wait transactions?

119 views
Skip to first unread message

Kjell Rilbe

unread,
Aug 13, 2024, 12:10:03 PM8/13/24
to firebird-support

Hi!

Trying to solve a real-world deadlock error in an app. To test how it works I've tried with both FlameRobin and isql, and I can't really get the hang of it.

Firebird version 3.0.11.33703, Windows Server 2019, 64 bit.

As far as I understand, a transaction with isolation level read committed and wait, should be able to wait for a concurrent update transaction to complete, and then successfully do its own update, basing its update on the preceding transaction's committed result.

Is this correct? If not, what did I get wrong?

FlameRobin doesn't seem to have any setting for wait timeout, but for isql I've used 600 seconds = 10 minutes, which I surely haven't exceeded. And FlameRobin does appear to wait indefinitely.

All tests are done with a tiny table that has a single bigint column, with a single record. (The app uses it to reserve global object id:s, much like a generator. I have no control over this mechanism, since it's from a third party framework, but I can control the transaction options being used.)

The SQL query is:

update t set c = c + 1;

With FlameRobin

I set isolation level to read committed and checked to enable wait.

Three FlameRobin sessions. Execute update in the first session, which will also start a transaction. Repeat in session 2 and 3, which will both end up pending/waiting. Commit in the first session. One of the other two sessions stops waiting and successfully runs its update. Commit there, and the remaining session will stop waiting, but it will sometimes succeed and sometimes fail with a deadlock error:

deadlock
update conflicts with concurrent update
concurrent transaction number is nnnnn

With isql

Three isql sessions, connect and then issue these statements:

1: set transaction read write wait isolation level read committed no record_version lock timeout 600;
2: update t set c = c + 1;
3: commit;

I run 1 (set transaction) in session 1, then session 2, then session 3. Then run 2 (update) in session 1, 2, 3. Then run 3 (commit) in session 1, 2, 3.

I also tried to run 1 + 2 (set trans + update) in session 1, then 2, then 3. Then run 3 (commit) in session 1, 2, 3.

In both cases I get failure and success in some random mixture for the third session. Sessions 1 and 2 seem to always succeed.

If I change to record_version, then sessions 2 and 3 seem to always fail immediately after doing commit in the first session.


Is this expected behavior? Please explain why. Why doesn't session 3 always succeed? (Or always fail for that matter...)


Final question: What I'm trying to achieve is a transaction setting that will let any number of concurrent connections issue that update and have them executed in som (arbitrary) order, but ensuring that whenever one update is executed it will see the result of the preceding update. The framework does this under a very short lived transaction, i.e.:

1. Start transaction.
2. update t set c = c +1;
3. select c from t; -- To check the result and use appropriate object id:s accordingly.
4. commit;

I just need those transactions to wait for one another and execute in any order, consistently basing the update on the result of the preceding committed transaction.

Mvh,
Kjell Rilbe

Dimitry Sibiryakov

unread,
Aug 13, 2024, 12:17:22 PM8/13/24
to firebird...@googlegroups.com
Kjell Rilbe wrote 13.08.2024 18:09:
> As far as I understand, a transaction with isolation level read committed and
> wait, should be able to wait for a concurrent update transaction to complete,
> and then successfully do its own update, basing its update on the preceding
> transaction's committed result.
>
> Is this correct? If not, what did I get wrong?

It is much more complicated and unpredictable. It may work like you want on
low tension but will fail with three players exactly as you found out. All is
depending on timing.

> Final question: What I'm trying to achieve is a transaction setting that will let any number of concurrent connections issue that update and have them executed in som (arbitrary) order, but ensuring that whenever one update is executed it will see the result of the preceding update. The framework does this under a very short lived transaction, i.e.:
>
> 1. Start transaction.
> 2. update t set c = c +1;
> 3. select c from t; -- To check the result and use appropriate object id:s accordingly.
> 4. commit;

What you are looking for is a pessimistic locking scenario but... The
simplest solution would be to do all updates in single session. Other sessions
won't bring speed, only problems.

--
WBR, SD.

Kjell Rilbe

unread,
Aug 13, 2024, 4:47:43 PM8/13/24
to firebird...@googlegroups.com
Den 2024-08-13 kl. 18:17, skrev 'Dimitry Sibiryakov' via firebird-support:
Kjell Rilbe wrote 13.08.2024 18:09:
As far as I understand, a transaction with isolation level read committed and wait, should be able to wait for a concurrent update transaction to complete, and then successfully do its own update, basing its update on the preceding transaction's committed result.

Is this correct? If not, what did I get wrong?

  It is much more complicated and unpredictable. It may work like you want on low tension but will fail with three players exactly as you found out. All is depending on timing.

Matches what I observe, I suppose.


Final question: What I'm trying to achieve is a transaction setting that will let any number of concurrent connections issue that update and have them executed in som (arbitrary) order, but ensuring that whenever one update is executed it will see the result of the preceding update. The framework does this under a very short lived transaction, i.e.:

1. Start transaction.
2. update t set c = c +1;
3. select c from t; -- To check the result and use appropriate object id:s accordingly.
4. commit;

  What you are looking for is a pessimistic locking scenario but... The simplest solution would be to do all updates in single session. Other sessions won't bring speed, only problems.

OK. I think this kind-of defeats the purpose of wait semantics. I do realize that this is a complex system so I can fully understand that it's not straightforward to implement it, especially considering performance is important. I assume that the problem stems from the fact that once the server knows which records will be updated, and thus checked for pending updates from other transactions, it will already have evaluated all expressions, and thus have read the record versions it has access to (depending on isolation level and record_version or no record_version. Once the pending updates that it's waiting for are committed, it needs to decide if the update query will give correct results or not based on the record versions it has used during execution. In my simple case it might appear straightforward, but to solve that in the general case does appear to be a daunting task.

I note that in the documentation it says that the transaction setting restart requests is described in source code as "Restart all requests in the current attachment to utilize the passed transaction.", but that it's unclear what it actually does, and that we shouldn't use it. I tried using it, but it didn't help. I do think that the description seems to indicate that it would lead to re-execution of the query (queries?) in the transaction that ended up waiting, and at that re-execution it would read any new record versions that have become current according to the transactions isolation level etc. If that were the case, i think it should solve my problem, but apparently not.

Out of curiosity I would be happy to gain insight into that seemingly poorly documented transaction setting, if anyone has any knowledge and would care to share. :-)

In the meantime, it looks like I will have to use a mutex or a retry mechanism in my app.

Thanks,
Kjell

Vlad Khorsun

unread,
Aug 13, 2024, 5:47:04 PM8/13/24
to firebird-support

I do realize that this is a complex system so I can fully understand that it's not straightforward to implement it, especially considering performance is important. I assume that the problem stems from the fact that once the server knows which records will be updated, and thus checked for pending updates from other transactions, it will already have evaluated all expressions, and thus have read the record versions it has access to (depending on isolation level and record_version or no record_version. Once the pending updates that it's waiting for are committed, it needs to decide if the update query will give correct results or not based on the record versions it has used during execution.

  Good and correct explanation.

  Firebird 4 have solution for this kind of problem - new mode for READ COMMITTED isolation level - READ COMMITTED READ CONSISTENCY.

  It uses first optimistic way for update and then, if update conflict happens, fallback to pessimistic mode and restarts request (query).
carefully read "Handling of Update Conflicts" chapter.

Regards,
Vlad

Dimitry Sibiryakov

unread,
Aug 13, 2024, 5:48:27 PM8/13/24
to firebird...@googlegroups.com
Kjell Rilbe wrote 13.08.2024 22:47:
> I note that in the documentation it says that the transaction setting restart
> requests is described in source code as "Restart all requests in the current
> attachment to utilize the passed transaction.", but that it's unclear what it
> actually does, and that we shouldn't use it. I tried using it, but it didn't
> help. I do think that the description seems to indicate that it would lead to
> re-execution of the query (queries?) in the transaction that ended up waiting,
> and at that re-execution it would read any new record versions that have become
> current according to the transactions isolation level etc. If that were the
> case, i think it should solve my problem, but apparently not.

Request restart on conflict requires Firebird 4 and Read Committed Read
Consistency transaction TIL. They are not available in your 3.0.

--
WBR, SD.

Kjell Rilbe

unread,
Aug 14, 2024, 3:54:06 AM8/14/24
to firebird...@googlegroups.com
Den 2024-08-13 kl. 23:47, skrev Vlad Khorsun:

I do realize that this is a complex system so I can fully understand that it's not straightforward to implement it, especially considering performance is important. I assume that the problem stems from the fact that once the server knows which records will be updated, and thus checked for pending updates from other transactions, it will already have evaluated all expressions, and thus have read the record versions it has access to (depending on isolation level and record_version or no record_version. Once the pending updates that it's waiting for are committed, it needs to decide if the update query will give correct results or not based on the record versions it has used during execution.

  Good and correct explanation.

Thanks. :-)


  Firebird 4 have solution for this kind of problem - new mode for READ COMMITTED isolation level - READ COMMITTED READ CONSISTENCY.

  It uses first optimistic way for update and then, if update conflict happens, fallback to pessimistic mode and restarts request (query).
carefully read "Handling of Update Conflicts" chapter.

Nice! (And thanks to you too Dimitry Sibiryakov for also mentioning this.)

Looks like we should move forward with a Firebird upgrade.

Kjell

Mark Rotteveel

unread,
Aug 14, 2024, 5:47:58 AM8/14/24
to firebird...@googlegroups.com
On 13/08/2024 22:47, Kjell Rilbe wrote:
> I note that in the documentation it says that the transaction setting
> restart requests is described in source code as "Restart all requests in
> the current attachment to utilize the passed transaction.", but that
> it's unclear what it actually does, and that we shouldn't use it. I
> tried using it, but it didn't help. I do think that the description
> seems to indicate that it would lead to re-execution of the query
> (queries?) in the transaction that ended up waiting, and at that re-
> execution it would read any new record versions that have become current
> according to the transactions isolation level etc. If that were the
> case, i think it should solve my problem, but apparently not.

The RESTART REQUESTS part of the SET TRANSACTION is not something you
should use, and AFAIK has no bearing on what you're trying to do here.

It is documented because it is part of the syntax, but given it is
unclear what it really does, you shouldn't use it.

In any case, as others have said, your problem can probably be addressed
with Firebird 4.

Mark
--
Mark Rotteveel

Kjell Rilbe

unread,
Aug 14, 2024, 6:29:00 AM8/14/24
to firebird...@googlegroups.com
Den 2024-08-14 kl. 11:47, skrev 'Mark Rotteveel' via firebird-support:
On 13/08/2024 22:47, Kjell Rilbe wrote:
I note that in the documentation it says that the transaction setting restart requests is described in source code as "Restart all requests in the current attachment to utilize the passed transaction.", but that it's unclear what it actually does, and that we shouldn't use it. I tried using it, but it didn't help. I do think that the description seems to indicate that it would lead to re-execution of the query (queries?) in the transaction that ended up waiting, and at that re- execution it would read any new record versions that have become current according to the transactions isolation level etc. If that were the case, i think it should solve my problem, but apparently not.

The RESTART REQUESTS part of the SET TRANSACTION is not something you should use, and AFAIK has no bearing on what you're trying to do here.

It is documented because it is part of the syntax, but given it is unclear what it really does, you shouldn't use it.

No, obviously not. I just thought it would be interesting to test if it made any difference. It didn't.


In any case, as others have said, your problem can probably be addressed with Firebird 4.

Yes. Will schedule that work for "near future".

Kjell

Svein Erling Tysvær

unread,
Aug 16, 2024, 5:34:45 AM8/16/24
to firebird...@googlegroups.com
Updating Firebird version is probably the best way for you to go, but your original query made me wonder whether the old "An Auditable Series of Numbers" on http://www.ibobjects.com/TechInfo.html could be useful for your case.

Svein Erling

--
You received this message because you are subscribed to the Google Groups "firebird-support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebird-suppo...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/firebird-support/b0238730-ae6b-40f7-af1e-07cf355baa16%40marknadsinformation.se.
Reply all
Reply to author
Forward
0 new messages