Oh, thanks, makes sense. Yeah, I briefly looked into how AR
ConnectionPool would deal with JDBC connections (and connection pools)
under jruby -- but decided that was indeed a whole different kind of
mess. (The 'right' solution there for actual AR is probably different
than #with_new_connection.)
In case anyone is interested in what I'm actually dealing with and what
I suspect the problems are, even though it takes some words to describe....
At the moment, I'm actually using MRI. The GIL doesn't bother me. The
main reason I need threads is because I've got some workers that need to
do a lot of HTTP connections to external services (using complex logic
between HTTP calls to decide the next call, it's not something you can
just throw at typhoeus).
MRI GIL'd threads seem to work just fine for making sure a thread gets
switched out when waiting on external I/O, so I don't need to serially
wait for a whole bunch of external servers to respond to HTTP requests
in order.
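
To make that concrete, here's a minimal sketch of the effect, not my
actual worker code. fake_http_call and fetch_all are made-up names, and
sleep stands in for a blocking HTTP request (MRI switches threads out
during sleep the same way it does while waiting on a socket):

```ruby
require 'benchmark'

# Hypothetical stand-in for a blocking HTTP request. In MRI, sleep
# releases the GIL, as does waiting on socket I/O.
def fake_http_call(host)
  sleep 0.2
  "response from #{host}"
end

# One thread per host; Thread#value joins the thread and returns the
# block's result, so results come back in the same order as hosts.
def fetch_all(hosts)
  hosts.map { |host| Thread.new { fake_http_call(host) } }.map(&:value)
end

hosts = %w[a.example b.example c.example d.example]
elapsed = Benchmark.realtime { fetch_all(hosts) }
puts format('%d calls took %.2fs', hosts.size, elapsed)
```

Four 0.2s waits overlap instead of adding up, which is all I need from
threads on the HTTP side.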
The problem is that these workers periodically do need to use AR to
store their 'findings' in the db. And that's where the mess comes in.
I'm not entirely sure what's causing the mess right now.
But I believe one aspect of it is the way MRI
Mutex/Monitor/ConditionVariable are NOT "first in, first out". It's
possible, in a race condition, for a thread waiting on a
ConditionVariable to get continuously bumped by other threads that
actually got there later but managed to grab the contested connection
sooner.
I think Java mutex/monitor/conditionvariables actually have the same
lack of first-in-first-out.
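
For comparison, here's a rough sketch of what first-in-first-out
checkout could look like in plain Ruby. This is NOT how AR's
ConnectionPool works and not my patch; FairPool and its methods are
hypothetical names. The idea is that each waiter gets its own
ConditionVariable, and a returned resource is handed directly to the
longest waiter rather than put back where a latecomer could grab it:

```ruby
require 'thread'

# Hypothetical FIFO-fair pool. Resources must be truthy objects.
class FairPool
  def initialize(resources)
    @mutex = Mutex.new
    @available = resources.dup
    @waiters = [] # tickets in arrival order
  end

  def checkout
    @mutex.synchronize do
      # Invariant: if anyone is waiting, @available is empty, so a
      # latecomer can never take a resource ahead of a waiter.
      return @available.shift unless @available.empty?

      ticket = { cond: ConditionVariable.new, resource: nil }
      @waiters << ticket
      # Loop guards against spurious wakeups.
      ticket[:cond].wait(@mutex) until ticket[:resource]
      ticket[:resource]
    end
  end

  def checkin(resource)
    @mutex.synchronize do
      if (ticket = @waiters.shift)
        ticket[:resource] = resource # direct handoff to oldest waiter
        ticket[:cond].signal
      else
        @available << resource
      end
    end
  end

  # Number of threads currently queued for a resource.
  def waiting
    @mutex.synchronize { @waiters.size }
  end
end
```

With a single shared ConditionVariable there's no such handoff: the
signaled thread still has to re-acquire the mutex and can lose the
resource to a thread that shows up in between.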
I contributed a patch to rails 3-2-stable (now in 3.2.3) that _lessened_
the frequency/likelihood of that sort of race condition, but couldn't
actually eliminate it. tenderlove at that point didn't want my patch in
master/rails4, but no matter, 'cause I'm still having problems in my
rails 3.2.3 app, which I _think_ are partially caused by the lack of
first-in-first-out, but it's possible I have no idea what's going on.
Leaked connections may also be a problem, although I've got some testing
that tries to show it's not. Or maybe it's something else I have no
idea about.
Anyhow, so I started considering different possible ways to use
Celluloid to work around this and get more predictable concurrency
semantics. There were several possible architectures I was thinking of,
depending on how AR/Celluloid interacted. Currently, I guess I'm
wondering whether there's a way to route all the AR work from my own
previously existing (not necessarily Celluloid) threads through a pool
of Celluloid threads that do the actual AR, taking advantage of
Celluloid's better control of concurrency semantics, including in some
cases first-in/first-out semantics.
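
Roughly the shape I have in mind, sketched with plain threads and a
Queue as a stand-in rather than Celluloid's actual API (DbGateway is a
hypothetical name, and the block passed to #call is where the real AR
work would go). Callers block until a worker has run their block, and
jobs are picked up in the order they were queued:

```ruby
require 'thread'

# Hypothetical gateway: a fixed set of worker threads runs all
# submitted blocks, taking them from a FIFO Queue.
class DbGateway
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # A nil job is the shutdown signal.
        while (job = @jobs.pop)
          block, reply = job
          begin
            reply << [:ok, block.call]
          rescue StandardError => e
            reply << [:error, e]
          end
        end
      end
    end
  end

  # Runs the block on a worker thread and returns its result,
  # re-raising in the caller any StandardError the block raised.
  def call(&block)
    reply = Queue.new
    @jobs << [block, reply]
    status, value = reply.pop
    raise value if status == :error
    value
  end

  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

gateway = DbGateway.new(2)
puts gateway.call { 1 + 1 } # the block would hold the AR calls
gateway.shutdown
```

The assumption (untested against real AR) is that if the number of
workers is no larger than the connection pool size, the workers never
have to fight each other for a connection, and all the contention
moves to the queue, where ordering is predictable.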
But it's gonna be a mess no matter what, and I'm not quite sure what to do.
And yeah, sadly, this is an existing fairly complicated app, that also
has third-party plugins written for it, sigh. Certainly changing the
whole architecture to use a redis queue or something would be another
alternative, but not a pleasant one at this point.
Ironically, the problems were _smaller_ when the app was Rails 2.1 with
mysql (not mysql2) -- concurrency was totally broken in that scenario,
but it was broken in a way that kept these other problems from coming
up, while still allowing adequate performance for my use case.