Unexpected HeuristicMixedException during "rollback prepared" after "prepare" failed?


Wladimir Hofmann

Feb 21, 2024, 4:05:28 PM
to narayana-users
Hi,
I am running an XA transaction via quarkus/narayana-jta with multiple Postgres DBs. When the prepare of a resource does not succeed (due to a deferred constraint violation), the subsequent abort of the transaction leads to an attempted "rollback prepared", which then fails ("prepared transaction does not exist") and raises a HeuristicMixedException.

- Since the resource already rolled back on prepare, a HeuristicMixedException should not be expected in this case?

Given the XA specification ( https://pubs.opengroup.org/onlinepubs/009680699/toc.pdf ), it seems unclear where the problem is, as there are two (contradictory?) statements:

- Page 8: "In Phase 2, the TM issues all RMs an actual request to commit or roll back the transaction branch"
--> this seems to imply that a rollback should also be issued to RMs whose prepare failed (so the postgres XAResource should be able to handle this without throwing any exception?)

- Page 45: "a resource manager may return XA_RBINTEGRITY upon prepare: upon return, the resource manager has rolled back the branch's work and has released all resources"
--> that indicates that a subsequent phase-2-rollback by the transaction manager should not be issued on that resource?
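
To make the second reading concrete, here is a minimal sketch of how a transaction manager could decide whether a phase-2 rollback is still needed after a failed prepare. It only uses the standard `javax.transaction.xa.XAException` constants; the class `PhaseTwoSketch` and its methods are hypothetical names for illustration and are not taken from Narayana or pgjdbc. The treatment of non-XA_RB* codes (still issuing a rollback) is an assumption, not something the spec quotes above settle.

```java
import javax.transaction.xa.XAException;

// Hypothetical helper for illustration only; not Narayana or pgjdbc code.
final class PhaseTwoSketch {

    // True if the code lies in the XA_RB* range, meaning the RM reports that
    // it has already rolled back the branch (e.g. XA_RBINTEGRITY on prepare).
    static boolean isRollbackCode(int errorCode) {
        return errorCode >= XAException.XA_RBBASE
                && errorCode <= XAException.XA_RBEND;
    }

    // Under the "page 45" reading, a branch that answered prepare with an
    // XA_RB* code needs no further rollback. Assumption: for any other error
    // the branch state is unknown to the TM, so a rollback is still issued.
    static boolean needsPhaseTwoRollback(XAException prepareFailure) {
        return !isRollbackCode(prepareFailure.errorCode);
    }

    public static void main(String[] args) {
        XAException integrity = new XAException(XAException.XA_RBINTEGRITY);
        XAException rmFail = new XAException(XAException.XAER_RMFAIL);
        System.out.println(needsPhaseTwoRollback(integrity)); // false
        System.out.println(needsPhaseTwoRollback(rmFail));    // true
    }
}
```

Under the first reading (page 8), the TM would instead call rollback unconditionally and rely on the resource to answer sensibly for a branch it no longer knows.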


Can you confirm the problem? (Or is there e.g. some configuration problem in the project?)

Best regards, Wladimir

Manuel Finelli

Feb 26, 2024, 10:49:49 AM
to narayana-users

Hi Wladimir,

Thank you for your reproducer. It was very easy to use and fully demonstrated the problem.

Also, thank you for your interesting question. After reviewing the XA specification, I think that the logic employed in ArjunaJTA takes advantage of some flexibility in the spec. Basically, it treats all of xa_prepare()’s possible error codes in the same way. This choice comes from the knowledge that xa_rollback() should return XAER_NOTA when invoked with an XID that does not exist. As a consequence, Narayana calls xa_rollback() for all of xa_prepare()’s error codes.

Of course, there is an argument that Narayana shouldn’t call xa_rollback() for resource managers that already replied negatively in the `prepare` phase (from page 8 of the XA spec, section 2.3.1: “The TM does not issue Phase 2 requests to RMs that responded negatively in Phase 1. The TM does not need to record stably the decision to roll back nor the participants in a rolled back global transaction”). This would also save Narayana a useless invocation. The reason why Narayana uses this trick goes back to the days when not all RMs were XA compliant and ArjunaJTA had to deal with this situation (that’s why the code here). Nowadays, most resource managers are XA compliant, so we could/should think about improving this part of Narayana’s code base.

On the other hand, when invoking xa_rollback() on PostgreSQL’s resource manager, Narayana receives an XAER_RMERR, which is the wrong error code. That is why Narayana reports a heuristic completion: Narayana invoked rollback on the resource manager, which returned XAER_RMERR, so Narayana doesn’t know whether the resource manager rolled back the transaction’s branch or not; hence, the outcome is mixed heuristic. My plan is to open an issue in the pgjdbc community and discuss the need to return an XAER_NOTA error code instead of XAER_RMERR.

While we wait for a reply and the pgjdbc community decides what to do with their driver, I am working on an improvement to Narayana’s codebase to optimise its handling of the XA_RB* error codes.
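
As a rough illustration of why the error code returned by xa_rollback() matters, the sketch below classifies the result of a rollback issued after a failed prepare. It is an assumption-laden toy and not Narayana's actual implementation: the class `RollbackOutcomeSketch`, the `Outcome` enum and the proxy-based fake resource `failingOnRollback` are all hypothetical names; only the `javax.transaction.xa` types and constants are real.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical sketch for illustration only; not Narayana's code.
final class RollbackOutcomeSketch {

    enum Outcome { ROLLED_BACK, HEURISTIC }

    // Calls rollback and interprets the result the way the post describes:
    // XAER_NOTA means the RM no longer knows the XID, so the branch is gone;
    // any other error leaves the branch state unknown to the TM.
    static Outcome rollbackAfterFailedPrepare(XAResource resource, Xid xid) {
        try {
            resource.rollback(xid);
            return Outcome.ROLLED_BACK;
        } catch (XAException e) {
            if (e.errorCode == XAException.XAER_NOTA) {
                return Outcome.ROLLED_BACK;
            }
            return Outcome.HEURISTIC;
        }
    }

    // Fake XAResource whose rollback always fails with the given error code.
    // Other methods are not meant to be called in this sketch.
    static XAResource failingOnRollback(int errorCode) {
        InvocationHandler handler = (proxy, method, args) -> {
            if ("rollback".equals(method.getName())) {
                throw new XAException(errorCode);
            }
            return null;
        };
        return (XAResource) Proxy.newProxyInstance(
                RollbackOutcomeSketch.class.getClassLoader(),
                new Class<?>[] { XAResource.class },
                handler);
    }

    public static void main(String[] args) {
        // The Xid is irrelevant for the fake resource, so null is passed.
        System.out.println(rollbackAfterFailedPrepare(
                failingOnRollback(XAException.XAER_NOTA), null));
        System.out.println(rollbackAfterFailedPrepare(
                failingOnRollback(XAException.XAER_RMERR), null));
    }
}
```

With a resource that answers XAER_NOTA the sketch reports a clean rollback, while XAER_RMERR ends in the heuristic branch, which mirrors the difference between the expected and the observed driver behaviour described above.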

Wladimir Hofmann

Feb 27, 2024, 12:55:02 PM
to narayana-users
Hi Manuel, thank you very much for the detailed analysis and explanation! Both proposed changes sound fine; please keep me updated on further news.
Best regards, Wladimir

Manuel Finelli

Mar 7, 2024, 4:50:23 AM
to narayana-users
Hi Wladimir,

I raised a PR in Narayana to update when ArjunaJTA/JTS invokes `rollback`. In parallel, I also raised a PR to fix/discuss how pgjdbc handles rollback invocations when there has been a constraint violation during the prepare phase.

Best wishes,
Manuel