Atomikos Forum
Hi,
I am using Atomikos 3.4 and am experiencing site freezes. A thread dump from the server shows that a lot of requests are waiting for ConnectionPool.borrowConnection() to return a connection. borrowConnection() is waiting to acquire the lock held by ConnectionPool.shrinkPool(), and shrinkPool() sometimes takes a long time to destroy a physical connection. While it does, every call to borrowConnection() waits and the site freezes.

My suggestion is to improve shrinkPool() as follows:
1) Collect all the connections to be closed.
2) Remove them from the List maintained by ConnectionPool.
3) Create a new thread and pass it the list of connections to destroy.
4) Let the new thread call destroy() on the connections and close the physical connections.
This prevents the pool from freezing even if closing a physical connection takes a long time.
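The four steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the Atomikos code: PooledConn, SketchPool and isExpired() are hypothetical names, the minimum pool size limit is left out for brevity, and the hand-off goes through a java.util.concurrent.Executor instead of a thread created directly, so the caller decides which thread does the closing.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Executor;

/** Hypothetical stand-in for a pooled connection; not the Atomikos XPooledConnection API. */
interface PooledConn {
    boolean isExpired(long nowMillis); // idle longer than the configured maximum
    void destroy();                    // may block while the physical connection closes
}

/** Hypothetical pool showing the four steps; not the Atomikos ConnectionPool. */
class SketchPool {
    private final List<PooledConn> connections = new ArrayList<>();
    private final Executor destroyer; // assumed to run tasks on a background thread

    SketchPool(Executor destroyer) {
        this.destroyer = destroyer;
    }

    synchronized void add(PooledConn c) {
        connections.add(c);
    }

    synchronized int size() {
        return connections.size();
    }

    void shrinkPool() {
        final List<PooledConn> toDestroy = new ArrayList<>();
        final long now = System.currentTimeMillis();
        synchronized (this) {
            // Steps 1 and 2: collect expired connections and remove them
            // from the pool's list. Only this cheap part holds the pool lock.
            for (Iterator<PooledConn> it = connections.iterator(); it.hasNext();) {
                PooledConn c = it.next();
                if (c.isExpired(now)) {
                    it.remove();
                    toDestroy.add(c);
                }
            }
        }
        if (toDestroy.isEmpty()) {
            return;
        }
        // Steps 3 and 4: hand the list over; destroy() runs outside the pool lock.
        destroyer.execute(() -> {
            for (PooledConn c : toDestroy) {
                try {
                    c.destroy();
                } catch (RuntimeException e) {
                    // One failed close should not stop the remaining ones.
                }
            }
        });
    }
}
```

To follow step 3 literally, the pool could be built with an executor such as `command -> new Thread(command).start()`. Callers of borrowConnection() would then wait only for the short synchronized section, never for a slow close.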
We have two problems:
1) java.sql.SQLException: Closed Connection
2) Site freeze issue

The site freezes when scheduled jobs run. The dump shows the following locking situation:
- All the requests are waiting for ConnectionPool.borrowConnection().
- ConnectionPool.borrowConnection() is waiting for shrinkPool() to finish.
- ConnectionPool.shrinkPool() is trying to close() a connection and is waiting for the lock on that connection.
- The lock on the connection is held by a long-running scheduled job, which is still running.
Continued...
Q: If the connection is active (in use by the scheduled job), why is shrinkPool() trying to close it?
A: Look at the shrinkPool() code below. It only checks when the connection was last returned to the pool; it does not check whether the connection has been acquired since then. Say a connection was returned to the pool at 13:00 and was then acquired by a scheduled job at 13:01, and the job takes a long time. At 13:07 the job is still running. When shrinkPool() runs, it sees a lastRelease time of 13:00 and concludes that the connection has been idle since then. It tries to close the connection but cannot acquire the connection's lock, because the scheduled job is using it. So shrinkPool() waits, and so does every other caller of borrowConnection().

A long-running query can freeze the site in the same way. For example, a connection was last returned to the pool at 13:00. Request A acquires it at 13:05:30 and runs a query that takes 3 minutes. When shrinkPool() runs at 13:06, it sees a lastRelease time of 13:00 and tries to close a connection that is already in use.

Fix1: Change shrinkPool() to check whether the connection is in use before closing it (see the line marked FIX1 in the shrinkPool() method).
Fix2: Call xpc.destroy() from another thread, outside the synchronized method, so that shrinkPool() does not hold the lock for too long.
    private synchronized void shrinkPool() {
        if ( connections == null || properties.getMaxIdleTime() <= 0 ) return;

        Configuration.logDebug ( this + ": trying to shrink pool" );
        List connectionsToRemove = new ArrayList();
        int maxConnectionsToRemove = availableSize() - properties.getMinPoolSize();
        if ( maxConnectionsToRemove > 0 ) {
            for ( int i=0 ; i < connections.size() ; i++ ) {
                XPooledConnection xpc = ( XPooledConnection ) connections.get(i);
                //FIX1:
                if ( !xpc.isAvailable() ) { continue; }
                long lastRelease = xpc.getLastTimeReleased();
                long maxIdle = properties.getMaxIdleTime();
                long now = System.currentTimeMillis();
                Configuration.logDebug ( this + ": connection idle for " + (now - lastRelease) + "ms" );
                if ( ( (now - lastRelease) >= (maxIdle * 1000L) ) && ( connectionsToRemove.size() < maxConnectionsToRemove ) ) {
                    Configuration.logDebug ( this + ": connection idle for more than " + maxIdle + "s, closing it: " + xpc );
                    xpc.destroy();
                    connectionsToRemove.add(xpc);
                }
            }
        }
        connections.removeAll(connectionsToRemove);
    }

Q: Why is there a Closed Connection problem, then?
A: For almost the same reason as in the answer above. Consider this scenario: a connection was last returned to the pool at 13:00. Request A acquires it at 13:05:59.112 but has not yet run any query on it. When shrinkPool() runs at 13:06, it looks at the lastRelease time, which is 13:00, and closes the connection. When Request A then tries to use the connection, it gets a Closed Connection error.
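The timelines above can be replayed in a small model to show what the FIX1 check changes. This is a sketch under assumptions: ConnState and its methods are hypothetical names rather than Atomikos classes, times are plain millisecond offsets, and the maximum idle time of 300 seconds is an assumed value, since the configured value is not given here.

```java
/** Hypothetical model of one pooled connection's bookkeeping; not an Atomikos class. */
class ConnState {
    private long lastReleasedMillis;
    private boolean available = true;

    ConnState(long lastReleasedMillis) {
        this.lastReleasedMillis = lastReleasedMillis;
    }

    /** Borrowing marks the connection as in use but leaves the release time untouched. */
    void borrow() {
        available = false;
    }

    void release(long nowMillis) {
        available = true;
        lastReleasedMillis = nowMillis;
    }

    /** Idle check based on the last release time only. */
    boolean idleByTimestampOnly(long nowMillis, long maxIdleMillis) {
        return nowMillis - lastReleasedMillis >= maxIdleMillis;
    }

    /** Idle check with FIX1: a connection that is in use is never idle. */
    boolean idleWithFix1(long nowMillis, long maxIdleMillis) {
        return available && idleByTimestampOnly(nowMillis, maxIdleMillis);
    }
}

class ShrinkTimelineDemo {
    public static void main(String[] args) {
        long t1300 = 13L * 3_600_000L;      // 13:00, last release
        long t1306 = t1300 + 6L * 60_000L;  // 13:06, shrinkPool() runs
        long maxIdle = 300_000L;            // assumed max idle time: 300 s

        ConnState conn = new ConnState(t1300);
        conn.borrow();                      // Request A acquires it at 13:05:59.112

        System.out.println("timestamp only: " + conn.idleByTimestampOnly(t1306, maxIdle));
        System.out.println("with FIX1:      " + conn.idleWithFix1(t1306, maxIdle));
    }
}
```

With the timestamp-only check the borrowed connection still counts as idle at 13:06; with FIX1 it is skipped. In a real pool the availability check would have to run under the same lock that borrowConnection() takes, otherwise a connection could be borrowed between the check and the close.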