Atomikos Forum |
|
Hello everybody,
I'm trying to get XA transactions running with PostgreSQL and ActiveMQ. I am using Spring Framework 3.0.0 (declarative transactions), Hibernate 3.3.2, PostgreSQL 8.4 + JDBC4 Driver 8.4-701 (also tested with PostgreSQL 8.3 + JDBC3 Driver 8.3-603), ActiveMQ 5.3.0, Atomikos TransactionEssentials 3.6.0 (AtomikosDataSourceBean) I am getting an error with the following workflow because of a call to prepare() on a different PGXAConnection instance than the calls to start() and end(): Query database and send JMS message with JmsTemplate in XA transaction. 1. Start transaction - UserTransactionImp.begin() (atomikos/spring) 2. Query database for the current transaction timestamp 2.1. PGXAConnection.start() (postgres) 2.2. Select timestamp from db 2.3. PGXAConnection.end() (postgres) 3. Send JMS message with JmsTemplate (activemq/spring) 4. Commit transaction (2PC) - UserTransactionImp.commit() (atomikos/spring) 4.1. PGXAConnection.prepare(): currentXid is null because prepare() is called on a different PGXAConnection instance than start() and end() -> NullPointerException, transaction is NOT prepared in database 4.2. PGXAConnection.rollback(): because of the NPE in 4.1 rollback() is called on the same PGXAConnection instance as prepare() 4.3. ROLLBACK PREPARED 'xid' is executed in the database but this xid DOES NOT EXIST because the transaction was started on a different connection and PREPARE TRANSACTION failed Database log: 2010-05-12 16:59:41.761 CEST - appuser@10.10.1.3 testdb [4beac20c.396f|1/1027] LOG: duration: 0.057 ms bind S_1: BEGIN 2010-05-12 16:59:41.761 CEST - appuser@10.10.1.3 testdb [4beac20c.396f|1/1027] LOG: duration: 0.017 ms execute S_1: BEGIN 2010-05-12 16:59:41.761 CEST - appuser@10.10.1.3 testdb [4beac20c.396f|1/1027] LOG: duration: 0.297 ms parse <unnamed>: select cast(now() as timestamptz(3)) 2010-05-12 16:59:41.761 CEST - appuser@10.10.1.3 testdb [4beac20c.396f|1/1027] LOG: duration: 0.061 ms bind <unnamed>: select cast(now() as timestamptz(3)) 2010-05-12 16:59:41.762 CEST - appuser@10.10.1.3 testdb [4beac20c.396f|1/1027] LOG: duration: 0.056 ms execute <unnamed>: select cast(now() as timestamptz(3)) 2010-05-12 16:59:48.257 CEST - appuser@10.10.1.3 testdb [4beac210.3971|2/90] LOG: duration: 0.706 ms parse <unnamed>: SELECT gid FROM pg_prepared_xacts 2010-05-12 16:59:48.257 CEST - appuser@10.10.1.3 testdb [4beac210.3971|2/90] LOG: duration: 0.073 ms bind <unnamed>: SELECT gid FROM pg_prepared_xacts 2010-05-12 16:59:48.257 CEST - appuser@10.10.1.3 testdb [4beac210.3971|2/90] LOG: duration: 0.044 ms execute <unnamed>: SELECT gid FROM pg_prepared_xacts 2010-05-12 17:00:03.007 CEST - appuser@10.10.1.3 testdb [4beac210.3971|2/91] LOG: duration: 0.095 ms parse <unnamed>: ROLLBACK PREPARED '1096044365_MTAuMTAuMS4zLnRtMDAwMDUwMDAwMQ==_MTAuMTAuMS4zLnRtNQ==' 2010-05-12 17:00:03.007 CEST - appuser@10.10.1.3 testdb [4beac210.3971|2/91] LOG: duration: 0.010 ms bind <unnamed>: ROLLBACK PREPARED '1096044365_MTAuMTAuMS4zLnRtMDAwMDUwMDAwMQ==_MTAuMTAuMS4zLnRtNQ==' 2010-05-12 17:00:03.007 CEST - appuser@10.10.1.3 testdb [4beac210.3971|2/91] ERROR: prepared transaction with identifier "1096044365_MTAuMTAuMS4zLnRtMDAwMDUwMDAwMQ==_MTAuMTAuMS4zLnRtNQ==" does not exist 2010-05-12 17:00:03.007 CEST - appuser@10.10.1.3 testdb [4beac210.3971|2/91] STATEMENT: ROLLBACK PREPARED '1096044365_MTAuMTAuMS4zLnRtMDAwMDUwMDAwMQ==_MTAuMTAuMS4zLnRtNQ==' The database log shows that BEGIN (transaction) is executed with a different database connection/session (4beac20c.396f) than ROLLBACK PREPARED (4beac210.3971). It is OK to execute PREPARE TRANSACTION and COMMIT/ROLLBACK PREPARED with different connections, but PREPARE TRANSACTION was never executed because of the NPE in step 4.1 of the workflow. When debugging the application with Eclipse I can see that poolSize == maxPoolSize connections (PGXAConnection) are allocated by AtomikosDataSourceBean. start() and end() are called on one of these PGXAConnection instances. There is an additional connection allocated for recovery, prepare() is always called on that additional PGXAConnection instance and that causes the error. The only way I can prevent the error from happening is setting max_connections in the database configuration file (postgresql.conf) to the same value as maxPoolSize of the AtomikosDataSourceBean configuration. This way the additional connection for recovery cannot be allocated and prepare() is called on the same connection as start() and end(). Btw. in postgresql.conf max_prepared_transactions is >= max_transactions in all scenarios. I know that a lot of information is missing here to allow for a detailed analysis of the issue. I can provide further information like the complete spring configuration, atomikos log and application stack trace. But for the moment I just want to ask if this bug rings a bell for anybody. Is this a known limitation of PostgreSQL XA support? Any hints, comments and further questions appreciated! Thanks a lot, tscho
hello,
thanks for you reply and sorry for not reporting back earlier. an update to atomikos TransactionEssentials 3.6.5 solved this particular problem. but now i have a somehow related problem with version 3.6.5. the application is running and i force a shutdown of the database system after BEGIN but before PREPARE (to simulate a failure). when i restart the database system a ROLLBACK PREPARED 'xid' is executed on the new connection. the ROLLBACK PREPARED statement is executed over and over again. is there a way to limit the retry count? why is ROLLBACK PREPARED executed when PREPARE was never executed for the transaction? Atomikos Log: 2010-05-28 14:25:14,946 [Atomikos:9] DEBUG atomikos - XA resource 'dataSourceJTA': prepare for XID '31302E31302E312E332E746D30303034353030303031:31302E31302E312E332E746D3837' raised -3: the XA resource detected an internal error org.postgresql.xa.PGXAException: Error preparing transaction at org.postgresql.xa.PGXAConnection.prepare(PGXAConnection.java:254) at com.atomikos.datasource.xa.XAResourceTransaction.prepare(XAResourceTransaction.java:591) at com.atomikos.icatch.imp.PrepareMessage.send(PrepareMessage.java:63) at com.atomikos.icatch.imp.PropagationMessage.submit(PropagationMessage.java:114) at com.atomikos.icatch.imp.Propagator$PropagatorThread.run(Propagator.java:90) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:282) at org.postgresql.xa.PGXAConnection.prepare(PGXAConnection.java:242) ... 7 more 2010-05-28 14:25:14,947 [DefaultMessageListenerContainer-1] DEBUG atomikos - Coordinator 10.10.1.3.tm0004500001 entering state: ABORTING 2010-05-28 14:25:14,947 [DefaultMessageListenerContainer-1] DEBUG atomikos - (1.5) executing task: com.atomikos.icatch.imp.Propagator$PropagatorThread@8e683a 2010-05-28 14:25:14,948 [Atomikos:9] INFO atomikos - XAResource.rollback ( 31302E31302E312E332E746D30303034353030303031:31302E31302E312E332E746D3837 ) on resource dataSourceJTA represented by XAResource instance org.postgresql.xa.PGXAConnection@da1c9f 2010-05-28 14:25:14,949 [Atomikos:9] WARN atomikos - XA resource 'dataSourceJTA': rollback for XID '31302E31302E312E332E746D30303034353030303031:31302E31302E312E332E746D3837' raised -3: the XA resource detected an internal error org.postgresql.xa.PGXAException: Error rolling back prepared transaction at org.postgresql.xa.PGXAConnection.rollback(PGXAConnection.java:357) at com.atomikos.datasource.xa.XAResourceTransaction.rollback(XAResourceTransaction.java:659) at com.atomikos.icatch.imp.RollbackMessage.send(RollbackMessage.java:74) at com.atomikos.icatch.imp.PropagationMessage.submit(PropagationMessage.java:114) at com.atomikos.icatch.imp.Propagator$PropagatorThread.run(Propagator.java:90) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend. at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:218) at org.postgresql.jdbc2.AbstractJdbc2Connection.executeTransactionCommand(AbstractJdbc2Connection.java:650) at org.postgresql.jdbc2.AbstractJdbc2Connection.commit(AbstractJdbc2Connection.java:670) at org.postgresql.jdbc2.AbstractJdbc2Connection.setAutoCommit(AbstractJdbc2Connection.java:633) at org.postgresql.xa.PGXAConnection.rollback(PGXAConnection.java:343) ... 7 more Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92) at java.net.SocketOutputStream.write(SocketOutputStream.java:136) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123) at org.postgresql.core.PGStream.flush(PGStream.java:508) at org.postgresql.core.v3.QueryExecutorImpl.sendSync(QueryExecutorImpl.java:676) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:191) ... 11 more === shutdown/restart === 2010-05-28 14:25:24,949 [Atomikos:9] DEBUG atomikos - Propagator: retrying message: RollbackMessage to XAResourceTransaction: 31302E31302E312E332E746D30303034353030303031:31302E31302E312E332E746D3837 2010-05-28 14:25:24,949 [Atomikos:9] DEBUG atomikos - dataSourceJTA: forcing refresh of XAConnection... 2010-05-28 14:25:24,972 [Atomikos:9] INFO atomikos - XAResource.rollback ( 31302E31302E312E332E746D30303034353030303031:31302E31302E312E332E746D3837 ) on resource dataSourceJTA represented by XAResource instance org.postgresql.xa.PGXAConnection@f1036f 2010-05-28 14:25:24,973 [Atomikos:9] WARN atomikos - XA resource 'dataSourceJTA': rollback for XID '31302E31302E312E332E746D30303034353030303031:31302E31302E312E332E746D3837' raised -3: the XA resource detected an internal error org.postgresql.xa.PGXAException: Error rolling back prepared transaction at org.postgresql.xa.PGXAConnection.rollback(PGXAConnection.java:357) at com.atomikos.datasource.xa.XAResourceTransaction.rollback(XAResourceTransaction.java:659) at com.atomikos.icatch.imp.RollbackMessage.send(RollbackMessage.java:74) at com.atomikos.icatch.imp.PropagationMessage.submit(PropagationMessage.java:114) at com.atomikos.icatch.imp.Propagator$PropagatorThread.run(Propagator.java:90) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: org.postgresql.util.PSQLException: ERROR: prepared transaction with identifier "1096044365_MTAuMTAuMS4zLnRtMDAwNDUwMDAwMQ==_MTAuMTAuMS4zLnRtODc=" does not exist at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:282) at org.postgresql.xa.PGXAConnection.rollback(PGXAConnection.java:347) ... 7 more Database Log: 2010-05-28 14:25:05.216 CEST - postgres@10.10.1.3 testdb [4bffb5e5.214b|1/1409] LOG: duration: 0.016 ms bind S_1: BEGIN 2010-05-28 14:25:05.216 CEST - postgres@10.10.1.3 testdb [4bffb5e5.214b|1/1409] LOG: duration: 0.004 ms execute S_1: BEGIN 2010-05-28 14:25:05.216 CEST - postgres@10.10.1.3 testdb [4bffb5e5.214b|1/1409] LOG: duration: 0.074 ms parse <unnamed>: select cast(now() as timestamptz(3)) 2010-05-28 14:25:05.216 CEST - postgres@10.10.1.3 testdb [4bffb5e5.214b|1/1409] LOG: duration: 0.018 ms bind <unnamed>: select cast(now() as timestamptz(3)) 2010-05-28 14:25:05.216 CEST - postgres@10.10.1.3 testdb [4bffb5e5.214b|1/1409] LOG: duration: 0.011 ms execute <unnamed>: select cast(now() as timestamptz(3)) 2010-05-28 14:25:06.213 CEST - @ [4bffae25.e61|] LOG: received fast shutdown request 2010-05-28 14:25:06.213 CEST - @ [4bffae25.e61|] LOG: aborting any active transactions 2010-05-28 14:25:06.213 CEST - postgres@10.10.1.3 testdb [4bffb5f3.216b|12/0] FATAL: terminating connection due to administrator command 2010-05-28 14:25:06.213 CEST - postgres@10.10.1.3 testdb [4bffb5e5.2152|8/0] FATAL: terminating connection due to administrator command 2010-05-28 14:25:06.213 CEST - postgres@10.10.1.3 testdb [4bffb5f2.2162|11/0] FATAL: terminating connection due to administrator command 2010-05-28 14:25:06.213 CEST - postgres@10.10.1.3 testdb [4bffb5e5.2154|10/0] FATAL: terminating connection due to administrator command === shutdown/restart === 2010-05-28 14:25:16.554 CEST - [unknown]@ [unknown] [4bffb62c.222b|] LOG: connection received: host=10.10.1.3 port=49073 2010-05-28 14:25:16.554 CEST - postgres@10.10.1.3 testdb [4bffb62c.222b|] LOG: connection authorized: user=postgres database=testdb 2010-05-28 14:25:16.559 CEST - postgres@10.10.1.3 testdb [4bffb62c.222b|1/8] LOG: duration: 0.185 ms parse <unnamed>: ROLLBACK PREPARED '1096044365_MTAuMTAuMS4zLnRtMDAwNDUwMDAwMQ==_MTAuMTAuMS4zLnRtODc=' 2010-05-28 14:25:16.559 CEST - postgres@10.10.1.3 testdb [4bffb62c.222b|1/8] LOG: duration: 0.008 ms bind <unnamed>: ROLLBACK PREPARED '1096044365_MTAuMTAuMS4zLnRtMDAwNDUwMDAwMQ==_MTAuMTAuMS4zLnRtODc=' 2010-05-28 14:25:16.559 CEST - postgres@10.10.1.3 testdb [4bffb62c.222b|1/8] ERROR: prepared transaction with identifier "1096044365_MTAuMTAuMS4zLnRtMDAwNDUwMDAwMQ==_MTAuMTAuMS4zLnRtODc=" does not exist 2010-05-28 14:25:16.559 CEST - postgres@10.10.1.3 testdb [4bffb62c.222b|1/8] STATEMENT: ROLLBACK PREPARED '1096044365_MTAuMTAuMS4zLnRtMDAwNDUwMDAwMQ==_MTAuMTAuMS4zLnRtODc='
hello guy,
thanks for your reply! the transaction was definitely not prepared because the database system was shutdown before PREPARE. do you mean PostgreSQL should support a PREPARE on a different connection than BEGIN? afaik PostgreSQL does support a COMMIT/ROLLBACK PREPARED on a different connection than PREPARE but not PREPARE or COMMIT/ROLLBACK on a different connection than BEGIN. is there a way to limit the retry count of ROLLBACK PREPARED? the transaction will be rolled back due to a timeout anyway when no PREPARE is executed. Or am i missing something here!? any further comments appreciated! regards, tscho
hello guy,
regarding your reply: "If this happens across restart then this means that the transaction was prepared indeed." The error occurs across (actually is caused by) a restart of the database system, not across a restart of the application or the atomikos transaction manager. regards, tscho
There are two things here:
-the fact we call rollback is due to the fact that prepare fails with an error, so we assume the worst case, being a prepared resource -> hence the rollback call after prepare -the fact that prepare is called on a different connection than start: this is allowed by the JTA/XA specs, but you can override this by setting com.atomikos.icatch.threaded_2pc to false Best
hello,
again, thanks for your quick reply! There are two things here: -the fact we call rollback is due to the fact that prepare fails with an error, so we assume the worst case, being a prepared resource -> hence the rollback call after prepare > ok, fair enough, that makes sense! is there a way to limit the retry count of ROLLBACK PREPARED? -the fact that prepare is called on a different connection than start: this is allowed by the JTA/XA specs, but you can override this by setting com.atomikos.icatch.threaded_2pc to false > i've run this particular scenario with "true" and "false", makes no difference since PREPARE fails due to a broken connection. but this information is very important for me, because it seems that PostgreSQL does not support the JTA/XA specs in this case. if would really appreciate any further comments on this issue, because after several unsuccessful calls of ROLLBACK PREPARED i get an hibernate exception, and other unrelated transactions fail too. regards, tscho |