I found and fixed two pretty serious bugs.
1. It turns out that the version of commons-pool we were using for
connection pooling synchronizes the entire object creation. Since
the object we were creating was a socket connection, the whole TCP/IP
connection attempt happened while holding a global lock. The same
problem affects DBCP. As a result, if a connection attempt timed out,
the lock was held the entire time, preventing any new connections.
On top of that, we were not setting an explicit timeout on the
connect, and the soTimeout doesn't apply during connection
establishment either. The problem shows up when you try to connect to
a nonexistent IP: you get no response, and no other connections can
be created until the attempt gives up. The fix is to upgrade to
commons-pool 1.5, which appears to fix the problem.
This is the second time we have gotten badly burned by commons pool,
and I wonder if it doesn't make sense to just swap in a properly
written ConcurrentMap+BlockingQueue as a substitute. Most of the
configuration options commons pool offers seem to do more harm than
good, and the implementation is quite frightening. I have also been
thinking that the dynamic growth of the pool is kind of pointless; it
might make sense to just create all the connections on startup so that
things are immediately in a steady state.
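To make the ConcurrentMap+BlockingQueue idea a bit more concrete, here
is a rough sketch and not a worked-out replacement. FixedPool, its
Factory interface, prepare/checkout/checkin, and TimedConnect are all
hypothetical names made up for illustration; nothing here comes from
commons-pool or from our code. It assumes a fixed number of
connections per destination, all created up front, and it never holds
a pool-wide lock while a connection is being created. TimedConnect
shows the explicit connect timeout, which is separate from soTimeout.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical fixed-size pool: one pre-filled queue per destination key. */
final class FixedPool<K, V> {
    /** Hypothetical factory; create() may block, e.g. on a TCP connect. */
    interface Factory<K, V> {
        V create(K key) throws Exception;
    }

    private final ConcurrentMap<K, BlockingQueue<V>> queues = new ConcurrentHashMap<>();
    private final Factory<K, V> factory;
    private final int sizePerKey;

    FixedPool(Factory<K, V> factory, int sizePerKey) {
        this.factory = factory;
        this.sizePerKey = sizePerKey;
    }

    /**
     * Creates all resources for a key up front. Meant to be called once per
     * key at startup; no shared lock is held while create() runs.
     */
    void prepare(K key) throws Exception {
        BlockingQueue<V> fresh = new ArrayBlockingQueue<>(sizePerKey);
        for (int i = 0; i < sizePerKey; i++) {
            fresh.add(factory.create(key));
        }
        if (queues.putIfAbsent(key, fresh) != null) {
            throw new IllegalStateException("already prepared: " + key);
        }
    }

    /** Waits up to the given timeout for a free resource. */
    V checkout(K key, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        BlockingQueue<V> queue = queues.get(key);
        if (queue == null) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        V value = queue.poll(timeout, unit);
        if (value == null) {
            throw new TimeoutException("no free resource for " + key);
        }
        return value;
    }

    /** Returns a resource that was previously checked out. */
    void checkin(K key, V value) {
        BlockingQueue<V> queue = queues.get(key);
        if (queue == null || !queue.offer(value)) {
            throw new IllegalStateException("unexpected checkin for " + key);
        }
    }
}

final class TimedConnect {
    /** Opens a socket with an explicit connect timeout. */
    static Socket open(String host, int port, int connectTimeoutMs, int soTimeoutMs)
            throws IOException {
        Socket socket = new Socket(); // unconnected
        try {
            // Bounds the connect itself; soTimeout has no effect here.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            socket.setSoTimeout(soTimeoutMs); // bounds blocking reads only
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }
}
```

A pool of sockets would pass something like a call to
TimedConnect.open as the factory. The sketch leaves out everything a
real pool needs around broken connections (validation, replacing a
dead socket, closing on shutdown), so treat it as a starting point
for discussion only.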
2. We discovered that interrupting a thread blocked in socket IO does
not actually have any effect. As a result, the shutdown() method on
the SocketServer was allowing requests on existing connections to
continue while the server was shutting down. This led to request
timeouts during shutdowns.
The fix is to keep a Map of active sessions and forcefully close the
socket for each active session when shutdown is called to ensure no
active connections are present when we begin shutting down the storage
engines.
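For anyone curious, the session-tracking part looks roughly like the
sketch below. This is an illustration and not the actual SocketServer
code: ActiveSessions and its method names are hypothetical. It relies
on the documented behavior of Socket.close(), which makes any thread
blocked in IO on that socket throw a SocketException.

```java
import java.io.IOException;
import java.net.Socket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical registry of live sessions, keyed by a generated id. */
final class ActiveSessions {
    private final Map<Long, Socket> sessions = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    /** Called when a connection is accepted; returns the session id. */
    long register(Socket socket) {
        long id = nextId.incrementAndGet();
        sessions.put(id, socket);
        return id;
    }

    /** Called when a session ends normally. */
    void unregister(long id) {
        sessions.remove(id);
    }

    int size() {
        return sessions.size();
    }

    /**
     * Closes every tracked socket. A handler thread blocked reading from
     * one of them gets a SocketException instead of waiting for an interrupt.
     */
    void closeAll() {
        for (Map.Entry<Long, Socket> entry : sessions.entrySet()) {
            try {
                entry.getValue().close();
            } catch (IOException ignored) {
                // Best effort: we are shutting down anyway.
            }
            sessions.remove(entry.getKey());
        }
    }
}
```

The sketch assumes the accept loop has already been stopped before
closeAll() runs; otherwise a session registered during shutdown could
be missed.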
Cheers,
-Jay
I think this is platform-dependent, and mostly due to the difficulty of
interrupting the underlying OS-dependent blocking calls.
Worse, it is not implemented (at least not across the board, for the
most commonly used blocking operations) on enough platforms to make it
usable.
So yes, unfortunately one cannot count on thread.interrupt() to wake
up blocked threads in Java. :-/
(I think this was also something that NIO was hoped to help resolve)
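For what it's worth, here is a small sketch of the NIO side. It is
only an illustration: InterruptibleReadDemo and interruptBlockedRead
are made-up names. It relies on SocketChannel being an
InterruptibleChannel, which is documented to close the channel and
throw ClosedByInterruptException when a thread blocked in an IO
operation on it is interrupted.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class InterruptibleReadDemo {
    /**
     * Blocks a thread in a channel read on a loopback connection,
     * interrupts it, and reports what the read ended with.
     */
    static String interruptBlockedRead() throws IOException, InterruptedException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                String[] outcome = {"read returned normally"};
                Thread reader = new Thread(() -> {
                    try {
                        // Blocks: the peer never writes anything.
                        accepted.read(ByteBuffer.allocate(16));
                    } catch (ClosedByInterruptException e) {
                        outcome[0] = "ClosedByInterruptException";
                    } catch (IOException e) {
                        outcome[0] = e.getClass().getSimpleName();
                    }
                });
                reader.start();
                Thread.sleep(200); // give the reader time to block
                reader.interrupt();
                reader.join(5000);
                return reader.isAlive() ? "still blocked" : outcome[0];
            }
        }
    }
}
```

Note that the interrupt also closes the channel, so this is a way to
abort a connection and not a way to pause and resume a read.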
-+ Tatu +-