PostgreSQL rollback race condition?

What follows is an annotated PostgreSQL log captured during a two-phase commit (2PC) using Atomikos:

1. One of the connections is preparing a transaction to be committed:

2017-04-03 18:28:08[29937]:LOG:  execute <unnamed>: PREPARE TRANSACTION '1096044365_MTAuMTEuMTM5LjIzOS50bTE0OTEyNDQwODgxOTcwMTYxMA==_MTAuMTEuMTM5LjIzOS50bTI4NTI='

2. Another connection seems to query for all prepared transactions and then rolls back the one from above. (This must be the transaction timeout mechanism.)

2017-04-03 18:28:08[4747]:LOG:  execute <unnamed>: SELECT gid FROM pg_prepared_xacts where database = current_database()

2017-04-03 18:28:08[4747]:LOG:  execute <unnamed>: ROLLBACK PREPARED '1096044365_MTAuMTEuMTM5LjIzOS50bTE0OTEyNDQwODgxOTcwMTYxMA==_MTAuMTEuMTM5LjIzOS50bTI4NTI='

3. The first connection tries to commit the transaction that it previously prepared.

2017-04-03 18:28:08[29937]:LOG:  execute <unnamed>: COMMIT PREPARED '1096044365_MTAuMTEuMTM5LjIzOS50bTE0OTEyNDQwODgxOTcwMTYxMA==_MTAuMTEuMTM5LjIzOS50bTI4NTI='

2017-04-03 18:28:08[29937]:ERROR:  prepared transaction with identifier "1096044365_MTAuMTEuMTM5LjIzOS50bTE0OTEyNDQwODgxOTcwMTYxMA==_MTAuMTEuMTM5LjIzOS50bTI4NTI=" does not exist

4. (Log not shown) Endless repetitions of the sequence in #3 above for the same transaction ID, which result in "-3" internal errors.
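
For reference, the identifier in the log lines above can be taken apart with a few lines of Python. This is only a sketch: it assumes the gid has the layout formatId_base64(gtrid)_base64(bqual), which is what the string looks like, and the function name decode_gid is my own.

```python
import base64

# The gid copied verbatim from the log lines above.
GID = ("1096044365_"
       "MTAuMTEuMTM5LjIzOS50bTE0OTEyNDQwODgxOTcwMTYxMA==_"
       "MTAuMTEuMTM5LjIzOS50bTI4NTI=")


def decode_gid(gid):
    """Split a gid of the ASSUMED form formatId_base64(gtrid)_base64(bqual)."""
    format_id, gtrid_b64, bqual_b64 = gid.split("_", 2)
    gtrid = base64.b64decode(gtrid_b64).decode("ascii")
    bqual = base64.b64decode(bqual_b64).decode("ascii")
    return int(format_id), gtrid, bqual


format_id, gtrid, bqual = decode_gid(GID)
# The format id, read as four big-endian bytes, is printable ASCII.
format_tag = format_id.to_bytes(4, "big").decode("ascii")

print(format_tag)  # ATOM
print(gtrid)       # 10.11.139.239.tm149124408819701610
print(bqual)       # 10.11.139.239.tm2852
```

The digits after "tm" in the gtrid look like they start with a millisecond epoch timestamp (1491244088197, i.e. 18:28:08 UTC on the day of the log), but that reading is my guess and not something the log confirms.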

I suppose we could dramatically increase the connection timeout to avoid this (assuming the timeout mechanism is indeed responsible for the rollback in #2 above). Any other suggestions? Wild guess: do you think this is a PostgreSQL driver issue or an Atomikos issue?
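
To make the suspected interleaving concrete, here is a toy Python model of steps 1 to 3 above. It is not PostgreSQL or Atomikos code: the class PreparedXacts and the function replay are hypothetical stand-ins that only reproduce the ordering seen in the log, with "example-gid" in place of the real identifier.

```python
class PreparedXacts:
    """Toy stand-in for the server-side set of prepared transactions."""

    def __init__(self):
        self._gids = set()

    def prepare(self, gid):
        self._gids.add(gid)

    def list_gids(self):
        return sorted(self._gids)

    def rollback_prepared(self, gid):
        self._remove(gid)

    def commit_prepared(self, gid):
        self._remove(gid)

    def _remove(self, gid):
        # Both COMMIT PREPARED and ROLLBACK PREPARED consume the entry,
        # so whichever arrives second finds nothing left.
        if gid not in self._gids:
            raise LookupError(
                f'prepared transaction with identifier "{gid}" does not exist'
            )
        self._gids.remove(gid)


def replay():
    """Replay the order of statements seen in the log."""
    server = PreparedXacts()
    gid = "example-gid"

    server.prepare(gid)               # step 1: first connection prepares

    for found in server.list_gids():  # step 2: second connection lists ...
        server.rollback_prepared(found)  # ... and rolls back what it finds

    try:
        server.commit_prepared(gid)   # step 3: first connection commits
    except LookupError as exc:
        return str(exc)
    return "committed"


outcome = replay()
print(outcome)
# prepared transaction with identifier "example-gid" does not exist
```

If this ordering is what actually happens, a longer timeout would only help if the rollback in #2 really is triggered by the timeout; the model says nothing about why the second connection decides to roll back.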

Monday, April 03, 2017

