ActiveRecord and Celluloid?


Jonathan Rochkind

May 23, 2012, 10:01:51 AM
to Celluloid
Can anyone share any experience of using ActiveRecord from inside a
Celluloid actor?

It is not clear to me if a particular actor is guaranteed to always be
executing under the _same_ thread/fiber. My impression is that a
given actor will always execute in the same Thread -- but possibly
within a variety of fibers within that Thread, not always the same
single fiber.

This has implications for ActiveRecord, because of ActiveRecord's use
of Thread.current[] thread local storage. Contrary to some popular
belief, it turns out 'thread' local storage is actually reset for
_each fiber_, even in MRI 1.9.3, not just in jruby.
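
The fiber-local behaviour is easy to check directly. A minimal sketch, assuming Ruby 1.9.3 or later; the `:conn_id` key is just a placeholder and is not the key ActiveRecord itself uses:

```ruby
# Set a "thread local" from the main fiber of the current thread.
Thread.current[:conn_id] = :outer

# Read the same key from a new fiber running on the *same* thread.
seen_in_fiber = Fiber.new { Thread.current[:conn_id] }.resume

# Back in the original fiber the value is untouched.
seen_outside = Thread.current[:conn_id]

puts "inside fiber:  #{seen_in_fiber.inspect}"
puts "outside fiber: #{seen_outside.inspect}"
```

The fiber sees `nil` while the original fiber still sees `:outer`. Ruby 2.0 and later also offer `Thread#thread_variable_get` / `thread_variable_set`, which are shared by all fibers of a thread.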

Since ActiveRecord uses thread local storage (among other things) to
figure out the active checked out connection, if an Actor uses
different fibers under the hood without you being aware of it, this
could have.... weird implications for ActiveRecord. Possibly just
using many more AR connections than you meant to and not correctly
checking them back in when you're done -- possibly worse bugs than
this, it's not clear AR has been tested or has a clear contract for
multi-fiber use. Also possibly there being no good way to check back
in AR connection when you're done -- you may find that you _thought_
you had a checked out connection (which you'll check back in when
you're done), but suddenly you're executing in a different fiber than
you thought, AR automatically checks out a _different_ connection to
you, without you realizing it or having any good way to check it back
in.
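
To make the worry concrete, here is a toy model of it. `ToyPool` is hypothetical and is not ActiveRecord's pool; it only imitates the idea of an implicit checkout remembered under a `Thread.current[]` key:

```ruby
# Hypothetical stand-in for a pool that hands out a connection implicitly
# and caches it in Thread.current[] (which is fiber-local).
class ToyPool
  attr_reader :checked_out

  def initialize(size)
    @available = (1..size).map { |i| "conn-#{i}" }
    @checked_out = []
  end

  # Implicit checkout: reuse the connection cached for the current fiber,
  # or silently take a new one from the pool.
  def connection
    Thread.current[:toy_conn] ||= begin
      conn = @available.shift
      raise "pool exhausted" if conn.nil?
      @checked_out << conn
      conn
    end
  end
end

pool = ToyPool.new(3)

first  = pool.connection                      # main fiber
second = Fiber.new { pool.connection }.resume # new fiber, same thread
third  = Fiber.new { pool.connection }.resume # another new fiber

puts pool.checked_out.inspect
```

All three connections end up checked out by a single thread, and the two taken inside the fibers have no owner left that could check them back in.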

AR's concurrency contract is definitely a bit weird, sadly. But
perhaps I'm inventing problems that won't really exist. Has anyone
actually used AR with their actors, or investigated what's going on?

Thanks for any advice.

Artūras Šlajus

May 23, 2012, 10:27:21 AM
to cellulo...@googlegroups.com
Excerpt from my code...

arturas@zeus:~/work/spacegame/server$ cat GOTCHAS.md
# Preface

This document explains various gotchas that you need to be aware of while
developing server code.

## Fibers, Tasks and database connections

Celluloid works in a way that each method invoked gets its own Task, which
brings a new Fiber with it. JRuby currently doesn't have lightweight Fibers, so
they are emulated by a thread pool.

ActiveRecord connection pooling relies on thread locals to store the connection
id, but these are not preserved between Fibers. That means code like this leaks
DB connections:

    class WithDb
      include Celluloid

      def initialize
        # This checked-out connection is never used in #work, because #work
        # always runs in a new fiber.
        @connection = ActiveRecord::Base.connection_pool.checkout
      end

      def finalize
        ActiveRecord::Base.connection_pool.checkin(@connection)
      end

      def work
        # This method is invoked in a separate Task, a separate Fiber, and has
        # separate thread locals. That means it automatically checks out a new
        # connection.
        unit = Unit.all
        # ... do things ... #

        # And this connection is never checked back in, which means we'll
        # run out of connections in the pool.
      end
    end

The solution is one of:

* Use ```ActiveRecord::Base.connection_pool.with_connection``` when DB
connectivity is needed. Beware that this will return an existing connection if
it is not currently in use, and does not ensure that a new connection will
always be checked out.
* Use ```ActiveRecord::Base.connection_pool.with_new_connection``` when DB
connectivity is needed. This will enforce a new connection checkout.
* If you really need a separate connection, use
```ActiveRecord::Base.connection_pool.checkout_with_id```,
```ActiveRecord::Base.connection_pool.checkin``` in an ```ensure``` section and
```ActiveRecord::Base.connection_id = stored_in```.

Example:

    class Pooler
      include Celluloid

      def initialize
        @connection, @connection_id =
          ActiveRecord::Base.connection_pool.checkout_with_id
        run!
      end

      def finalize
        ActiveRecord::Base.connection_pool.checkin(@connection)
      end

      def run
        ActiveRecord::Base.connection_id = @connection_id

        # ...
      end
    end

This ensures that each #run invocation reuses the same DB connection, even
though it runs on a different Fiber/Thread.


from monkey_patches.rb

# Monkey-patches for ActiveRecord.
module ActiveRecord
  class ConnectionAdapters::ConnectionPool
    def checkout_with_id
      connection = checkout
      connection_id = current_connection_id
      raise "Connection ID #{connection_id} is already checked out!" \
        if @reserved_connections.has_key?(connection_id)
      @reserved_connections[connection_id] = connection

      [connection, connection_id]
    end

    def current_connection_id
      ActiveRecord::Base.connection_id = Celluloid.actor? \
        ? Celluloid.current_actor.object_id \
        : Thread.current.object_id
    end

    # Ensures that a new connection is checked out for the block. See
    # GOTCHAS.md
    def with_new_connection
      connection = checkout
      yield
    ensure
      checkin(connection)
    end
  end
end

So far this works...

Each actor checks out one connection and then sets the connection id
when it enters new fiber (method).
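
The idea behind `checkout_with_id` can be sketched without ActiveRecord at all. In this toy version `KeyedPool` and `ToyActor` are hypothetical names, and a plain `Fiber` stands in for a Celluloid Task; the point is only that the lookup key is the owner's id rather than something stored in the fiber:

```ruby
# Hypothetical pool that reserves connections under an explicit owner id.
class KeyedPool
  def initialize(size)
    @available = (1..size).map { |i| "conn-#{i}" }
    @reserved = {}
  end

  def checkout_for(owner_id)
    raise "owner #{owner_id} already holds a connection" if @reserved.key?(owner_id)
    conn = @available.shift
    raise "pool exhausted" if conn.nil?
    @reserved[owner_id] = conn
  end

  def connection_for(owner_id)
    @reserved.fetch(owner_id)
  end

  def checkin_for(owner_id)
    conn = @reserved.delete(owner_id)
    @available.push(conn) if conn
  end

  def reserved_count
    @reserved.size
  end
end

# Stand-in for an actor: one connection for its whole lifetime.
class ToyActor
  def initialize(pool)
    @pool = pool
    @pool.checkout_for(object_id)
  end

  # Each call runs in a fresh fiber, imitating "one Task per method call".
  def run
    Fiber.new { @pool.connection_for(object_id) }.resume
  end

  def finalize
    @pool.checkin_for(object_id)
  end
end

pool = KeyedPool.new(2)
actor = ToyActor.new(pool)
first = actor.run
second = actor.run
held = pool.reserved_count
actor.finalize

puts first.equal?(second) # same connection object in both fibers
puts held                 # connections reserved while the actor was alive
```

Both calls get the same connection even though each ran in its own fiber, and only one connection is ever reserved for the actor.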

Tony Arcieri

May 23, 2012, 10:39:06 AM
to cellulo...@googlegroups.com
On Wed, May 23, 2012 at 9:01 AM, Jonathan Rochkind <roch...@jhu.edu> wrote:
> It is not clear to me if a particular actor is guaranteed to always be
> executing under the _same_ thread/fiber. My impression is that a
> given actor will always execute in the same Thread -- but possibly
> within a variety of fibers within that Thread, not always the same
> single fiber.

Yes, that's correct. 

> This has implications for ActiveRecord, because of ActiveRecord's use
> of Thread.current[] thread local storage. Contrary to some popular
> belief, it turns out 'thread' local storage is actually reset for
> _each fiber_, even in MRI 1.9.3, not just in jruby.

Yes, this decision was made in order to facilitate compatibility with programs that (ab)use thread local storage as a tool for dynamic scope. Sadly, for applications that really do need *thread* local storage (especially in Celluloid), this turns out to be a terrible decision.

I began working on "TaskWithThreadLocals" to address this problem, but stopped working on it after a lack of interest:

 
> Since ActiveRecord uses thread local storage (among other things) to
> figure out the active checked out connection, if an Actor uses
> different fibers under the hood without you being aware of it, this
> could have.... weird implications for ActiveRecord. Possibly just
> using many more AR connections than you meant

As far as I've seen, this is the only side effect: Celluloid will use more AR connections than it really needs.

If you can reproduce other issues, let me know.

--
Tony Arcieri

Jonathan Rochkind

May 23, 2012, 10:52:48 AM
to cellulo...@googlegroups.com, Artūras Šlajus
Thanks. So you had to monkey patch AR, eh? I'm still working through
all the implications.

One note: This is NOT just a jruby issue, I'm pretty sure. I don't know
if MRI "emulates fibers with a thread pool" -- but I do know in MRI
1.9.3, Thread.current[] local storage is still per-fiber, not
per-thread, despite the name.

When I considered using AR with_connection (I am very familiar with AR's
concurrency model, alas) -- here's what I was worried about -- if in the
middle of your with_connection block, the fiber switches, suddenly AR
automatically and transparently gives you a NEW connection, which is
never checked back in.

But I guess that can't happen, because while Celluloid gives each method
invocation a new fiber --- it won't change fibers in the middle of a
method invocation somehow, is that true?

I haven't completely wrapped my head around what your monkey patch gets
you. Can you explain for dummies what it does, that just using AR's
existing with_connection wouldn't?

Thanks a lot,

Jonathan

Jonathan Rochkind

May 23, 2012, 10:59:30 AM
to cellulo...@googlegroups.com
On 5/23/2012 10:39 AM, Tony Arcieri wrote:
> As far as I've seen, this is the only side effect: Celluloid will use
> more AR connections than it really needs.

Cool, that's good to know -- but sadly, that's exactly the problem I'm
looking at a celluloid solution to try and solve.

AR's concurrency stuff is.... really wacky. And is unlikely to be made
less wacky any time soon for backwards compat reasons, among other
reasons (like the code is just really weird, and I'm not sure any
committers fully understand its intentions).

So I'm having trouble using AR in a multi-threaded concurrency scenario,
even following its contracts -- it ends up using more connections than
it really needs, which makes my db go crazy. I am having real problems
here, that I think I've diagnosed correctly (but man is concurrency
confusing to debug).

(Interestingly, part of the problem here seems to be ruby
mutex/monitor's conditionvariable monitor, where when threads are
waiting on a variable, there's no first-in/first-out guarantee. A thread
that's 'signaled' may or may not actually _get_ the mutex -- some other
thread may get it _first_, including one that wasn't even waiting at
all, it just managed to get scheduled asking for the mutex at just the
right/wrong time to jump in line).
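
Because a signalled thread is not guaranteed to be the next one to get the mutex, the usual defensive pattern is to re-check the condition in a loop after every wake-up. A minimal sketch with a hypothetical `TokenBox` class; it shows the re-check, it does not make the wait fair or first-in/first-out:

```ruby
require "thread"

# Hypothetical single-purpose token dispenser guarded by a mutex.
class TokenBox
  def initialize(count)
    @count = count
    @mutex = Mutex.new
    @available = ConditionVariable.new
  end

  def take
    @mutex.synchronize do
      # Loop, not `if`: between the signal and this thread re-acquiring the
      # mutex, some other thread may already have taken the token.
      @available.wait(@mutex) while @count.zero?
      @count -= 1
    end
  end

  def give
    @mutex.synchronize do
      @count += 1
      @available.signal
    end
  end
end

box = TokenBox.new(1)
completed = Queue.new

threads = 5.times.map do |i|
  Thread.new do
    box.take
    sleep 0.01 # pretend to use the contested resource
    completed << i
    box.give
  end
end
threads.each(&:join)

puts completed.size
```

Every thread eventually gets its turn here, but nothing in this code controls the order in which they do.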

So anyhow, I am contemplating trying to set up some architecture using
celluloid to 'gatekeep' access to AR connections. It's not entirely
clear how this would work. But the end goal is making sure no more
connections are used than needed, and letting N threads share M
connections, where M is less than N. But sadly, the characteristic
where celluloid actors end up using more connections than really needed
might be a problem here.
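
The "N threads share M connections" goal on its own can be sketched with nothing but a `Queue`. This deliberately uses plain threads instead of Celluloid, and `QueuePool` is a hypothetical name with strings standing in for real connections:

```ruby
require "thread"

# Hypothetical gatekeeper: M connections live in a queue, callers borrow one.
class QueuePool
  def initialize(connections)
    @queue = Queue.new
    connections.each { |c| @queue << c }
  end

  def with_connection
    conn = @queue.pop # blocks until some connection is free
    yield conn
  ensure
    @queue << conn if conn # always hand it back, even on error
  end
end

pool = QueuePool.new(%w[conn-1 conn-2]) # M = 2
in_use = 0
peak = 0
lock = Mutex.new

threads = 6.times.map do # N = 6
  Thread.new do
    pool.with_connection do |_conn|
      lock.synchronize do
        in_use += 1
        peak = [peak, in_use].max
      end
      sleep 0.01 # pretend to run a query
      lock.synchronize { in_use -= 1 }
    end
  end
end
threads.each(&:join)

puts "peak connections in use: #{peak}"
```

The peak never exceeds the number of connections put into the queue, since a connection can only be held by whoever popped it.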

Or possibly not, if I use celluloid `exclusive` carefully. Hmm. I'm
going to have to think through a way to do this, and will probably come
back with more questions at some point.

Thanks!

Artūras Šlajus

May 23, 2012, 11:09:30 AM
to cellulo...@googlegroups.com
On 05/23/2012 05:39 PM, Tony Arcieri wrote:
> As far as I've seen, this is the only side effect: Celluloid will use
> more AR connections than it really needs.

Connections will be checked out, and never checked in, thus you will run
out of connections sooner or later.

Jonathan Rochkind

May 23, 2012, 11:12:42 AM
to cellulo...@googlegroups.com, Artūras Šlajus
Oh wait, well, that's a much worse side effect than using more
connections than really needed.

This is true only if you use checkin/checkout ConnectionPool model
though, right?

Or have you seen it or predict it even if you always use
ConnectionPool.with_connection?

I think a single method invocation in an Actor will always be in a
single Fiber, right? So if inside a method invocation, all AR use is
wrapped in a ConnectionPool.with_connection, then I _think_ _maybe_
it'll be okay.

What do you think, Arturas?

I have previously, personally, concluded that
ConnectionPool.with_connection is the only sane way to use AR in a
concurrency scenario anyway, with or without Celluloid, with or without
fibers. The major annoyance is that AR won't let you enforce "only use
with_connection" -- if you accidentally do an AR call outside a
with_connection, it'll transparently check out a new connection without
telling you (that you'll then have trouble realizing you need to check
back in, or finding it to check it back in).
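
What "enforce only use with_connection" could look like, as a toy. `StrictPool` is hypothetical and is not an ActiveRecord API; it simply refuses to hand out a connection implicitly, while still allowing nested `with_connection` blocks:

```ruby
# Hypothetical pool that raises instead of checking out implicitly.
class StrictPool
  class NotInBlock < StandardError; end

  def initialize(connections)
    @available = connections.dup
    @lock = Mutex.new
  end

  def with_connection
    existing = Thread.current[:strict_conn]
    return yield(existing) if existing # nested call: reuse, don't re-checkout

    conn = @lock.synchronize { @available.shift }
    raise "pool exhausted" if conn.nil?
    Thread.current[:strict_conn] = conn
    begin
      yield conn
    ensure
      Thread.current[:strict_conn] = nil
      @lock.synchronize { @available.push(conn) }
    end
  end

  # Explicit failure instead of a silent checkout.
  def connection
    Thread.current[:strict_conn] or
      raise NotInBlock, "connection used outside with_connection"
  end

  def free_count
    @lock.synchronize { @available.size }
  end
end

pool = StrictPool.new(%w[conn-1])

raised_outside = begin
  pool.connection
  false
rescue StrictPool::NotInBlock
  true
end

inside = pool.with_connection { pool.with_connection { pool.connection } }

puts raised_outside
puts inside
puts pool.free_count
```

An accidental call outside the block fails loudly, and nothing is left checked out afterwards.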

Jonathan

Artūras Šlajus

May 23, 2012, 11:22:00 AM
to cellulo...@googlegroups.com
On 05/23/2012 05:52 PM, Jonathan Rochkind wrote:
> Thanks. So you had to monkey patch AR, eh? I'm still working through all
> the implications.
>
> One note: This is NOT just a jruby issue, I'm pretty sure. I don't know
> if MRI "emulates fibers with a thread pool" -- but I do know in MRI
> 1.9.3, Thread.current[] local storage is still per-fiber, not
> per-thread, despite the name.
I really hate that.

> When I considered using AR with_connection (I am very familiar with AR's
> concurrency model, alas) -- here's what I was worried about -- if in the
> middle of your with_connection block, the fiber switches, suddenly AR
> automatically and transparently gives you a NEW connection, which is
> never checked back in.
Yup, it indeed does that. And then it dies when the pool limit is reached.

>
> But I guess that can't happen, because while Celluloid gives each method
> invocation a new fiber --- it won't change fibers in the middle of a
> method invocation somehow, is that true?
It will if the method is not wrapped in exclusive and you do a sync call
to other actor.

> I haven't completely wrapped my head around what your monkey patch gets
> you. Can you explain for dummies what it does, that just using AR's
> existing with_connection wouldn't?
* Use ```ActiveRecord::Base.connection_pool.with_connection``` when DB
connectivity is needed. Beware that this will return an existing connection
if it is not currently in use and ***does not ensure that a new connection
will always be checked out***.

Basically #with_connection checks if any of the existing connections may
be reused.

For true concurrency each actor should have its own connection.
#checkout_with_id checks out a connection and gives its ID to me. Which
I can then set when I enter the exclusive block of my actor method.

Artūras Šlajus

May 23, 2012, 11:22:10 AM
to cellulo...@googlegroups.com
On 05/23/2012 06:12 PM, Jonathan Rochkind wrote:
> This is true only if you use checkin/checkout ConnectionPool model
> though, right?
.connection automatically checks out connection for you.

> Or have you seen it or predict it even if you always use
> ConnectionPool.with_connection?
No, you'll be fine. But your actors won't have separate connections too.

> I think a single method invocation in an Actor will always be in a
> single Fiber, right? So if inside a method invocation, all AR use is
> wrapped in a ConnectionPool.with_connection, then I _think_ _maybe_
> it'll be okay.
Yeah, as long as it's in exclusive block.


> I have previously, personally, concluded that
> ConnectionPool.with_connection is the only sane way to use AR in a
> concurrency scenario anyway, with or without Celluloid, with or without
> fibers. The major annoyance is that AR won't let you enforce "only use
> with_connection" -- if you accidentally do an AR call outside a
> with_connection, it'll transparently check out a new connection without
> telling you (that you'll then have trouble realizing you need to check
> back in, or finding it to check it back in).
I actually had a monkey patch for that too. I can get it from the repo
if you want :))

Artūras Šlajus

May 23, 2012, 11:28:51 AM
to cellulo...@googlegroups.com
On 05/23/2012 06:19 PM, Jonathan Rochkind wrote:
> Ah, thanks a lot Arturus. I totally understand now. Thanks SO much for
> your help understanding this stuff.
>
> This was the key part I didn't totally grok:
>
> "It will if the method is not wrapped in exclusive and you do a sync
> call to other actor."
>
> Is this true if you're doing a sync call to I/O, say Net::HTTP too? Or
> just to another actor? (But it also tells me maybe I can get away with
> what I need if I wrap all methods that are going to do AR in exclusive
> blocks? Does that make sense?)
No, just another actor. And yes, wrapping does make sense.

> Either way, this makes things totally a mess.
Indeed. Ruby-land!

> I see the point of your monkey patches now, with_connection insisting on
> a new connection. It does mean that you can't _nest_ with_connections
> anymore though, which is useful semantics of ordinary with_connection.
Well, for my use case - I have no such requirement :)

> I'm trying to think through the 'right' way to fix this in ActiveRecord,
> that we might want to submit a pull request for. (I'd, potentially, if I
> have time, submit the pull request, but I've gotten into arguments with
> @tenderlove about AR concurrency before, so I'd want to say, see, look
> Arturas agrees! heh.)
Yeah, you can mention me as arturaz on github :)

> Is it okay for each actor to have one and only one connection, even with
> the multiple fibers stuff?
Well, it depends. If different fibers will do different stuff, then they
probably shouldn't share connections. For example:

F1: start transaction
F2: expects no transaction, does stuff
F1: does stuff
F2: waits forever because it is in transaction to see changes.

And similar funky things. I'd stick to exclusive...
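
The interleaving above can be replayed with two plain fibers and a fake connection. `FakeConnection` is hypothetical, and `Fiber.yield` stands in for the point where a sync call to another actor would suspend F1:

```ruby
# Hypothetical connection that only tracks whether a transaction is open.
class FakeConnection
  attr_reader :log

  def initialize
    @in_transaction = false
    @log = []
  end

  def begin_transaction
    @in_transaction = true
  end

  def commit
    @in_transaction = false
  end

  # Record who ran what, and whether a transaction was open at the time.
  def execute(who, sql)
    @log << [who, sql, @in_transaction]
  end
end

conn = FakeConnection.new

f1 = Fiber.new do
  conn.begin_transaction
  Fiber.yield # F1 is suspended here, mid-transaction
  conn.execute(:f1, "UPDATE ...")
  conn.commit
end

f2 = Fiber.new do
  # F2 expects no transaction, but it shares the connection with F1.
  conn.execute(:f2, "INSERT ...")
end

f1.resume # F1: start transaction, then suspend
f2.resume # F2: does stuff inside F1's open transaction
f1.resume # F1: does stuff, commits

conn.log.each { |entry| puts entry.inspect }
```

F2's statement is logged with the transaction flag set, which is exactly the situation F2 did not expect.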

> Do you think your monkey patches make sense as a standard pull request
> to AR itself? Are you interested in making one? Are you interested in me
> making one? Do you have any tests for em yet?
Not sure. Probably not. Perhaps this stuff can go into celluloid wiki or
something.

TBH - multithreading in ruby is a wreck. It seems that most ruby devs
are scared as hell of multithreading and don't want to think about it.

Jonathan Rochkind

May 23, 2012, 11:30:10 AM
to cellulo...@googlegroups.com
On 5/23/2012 11:16 AM, Artūras Šlajus wrote:
>> fibers. The major annoyance is that AR won't let you enforce "only use
>> with_connection" -- if you accidentally do an AR call outside a
>> with_connection, it'll transparently check out a new connection without
>> telling you (that you'll then have trouble realizing you need to check
>> back in, or finding it to check it back in).
> I actually had a monkey patch for that too. I can get it from the repo
> if you want :))


I actually had my own monkey patch for that too once! But mine was
really messy. I'd be interested in seeing yours, if it's better than
mine I might try submitting it to Rails as a pull request.

I am really interested in trying to improve ActiveRecord here for our
use cases -- and I'm hoping some support from the Celluloid community
will help @tenderlove understand the importance of various use cases
better, that I seem to have had some trouble getting him to understand
before (it's probably my fault not explaining them well enough).

Tony Arcieri

May 23, 2012, 11:34:22 AM
to cellulo...@googlegroups.com, Artūras Šlajus
On Wed, May 23, 2012 at 10:12 AM, Jonathan Rochkind <roch...@jhu.edu> wrote:
> I have previously, personally, concluded that ConnectionPool.with_connection is the only sane way to use AR in a concurrency scenario anyway, with or without Celluloid, with or without fibers. The major annoyance is that AR won't let you enforce "only use with_connection" -- if you accidentally do an AR call outside a with_connection, it'll transparently check out a new connection without telling you (that you'll then have trouble realizing you need to check back in, or finding it to check it back in).

I've talked to tenderlove about this a little. Perhaps I should revisit it.
 
--
Tony Arcieri

Jonathan Rochkind

May 23, 2012, 11:36:58 AM
to cellulo...@googlegroups.com
On 5/23/2012 11:28 AM, Artūras Šlajus wrote:
>> Do you think your monkey patches make sense as a standard pull request
>> to AR itself? Are you interested in making one? Are you interested in me
>> making one? Do you have any tests for em yet?
> Not sure. Probably not. Perhaps this stuff can go into celluloid wiki or
> something.

Hmm, why do you think your monkey patches don't make sense for AR core?

Hmm, is it the "If different fibers will do different stuff, then they
probably shouldn't share connections" issue? Your monkey patch with
your use case still potentially leads to different fibers doing
different stuff, if a sync call to another actor is made in the middle
of AR stuff?


>
> TBH - multithreading in ruby is a wreck. It seems that most ruby devs
> are scared as hell of multithreading and don't want to think about it.
>

No kidding. That's frustrating enough to begin with, but I just think
"Oh well, okay, opensource, if the maintainers don't care about
multithreading they aren't going to do it for me and I can't expect em to."

You know what's even more frustrating? When I try to figure out how to
improve standard gems to do MT better, and have to fight with the
maintainers on it, and lose the fight. I'd link to the github issues for
AR, but they somehow got corrupted on github to be unreadable/unfindable.

If AR just said "We don't support any kind of multi-threading", okay,
you'd know to give up. But it says it does, it's documented to do so, if
it didn't they'd have to stop pretending Rails in general does with
config.threadsafe! -- but its implementation has been buggy as heck,
and its fundamental design for concurrency has some serious problems.

(Yes, one solution would be: Okay, don't use ActiveRecord. I have
various reasons for continuing to try to use AR, including lack of faith
that any other mature ruby ORM does any better, I could spend a lot of
cost-of-switch and learn-a-new-ORM time, and just wind up in the same
place. Devil you know.)

Artūras Šlajus

May 23, 2012, 11:43:34 AM
to cellulo...@googlegroups.com
On 05/23/2012 06:36 PM, Jonathan Rochkind wrote:
> On 5/23/2012 11:28 AM, Artūras Šlajus wrote:
>>> Do you think your monkey patches make sense as a standard pull request
>>> to AR itself? Are you interested in making one? Are you interested in me
>>> making one? Do you have any tests for em yet?
>> Not sure. Probably not. Perhaps this stuff can go into celluloid wiki or
>> something.
>
> Hmm, why do you think your monkey patches don't make sense for AR core?
Well, somebody might not be using celluloid ;)

> Hmm, is it the "If different fibers will do different stuff, then they
> probably shouldn't share connections" issue? Your monkey patch with your
> use case still potentially leads to different fibers doing different
> stuff, if a sync call to another actor is made in the middle of AR stuff?
All my actors are exclusive. I'm pretty safe there :)

> (Yes, one solution would be: Okay, don't use ActiveRecord. I have
> various reasons for continuing to try to use AR, including lack of faith
> that any other mature ruby ORM does any better, I could spend a lot of
> cost-of-switch and learn-a-new-ORM time, and just wind up in the same
> place. Devil you know.)
The basic idea is that Fiber behavior on Thread.current is just evil. It
shouldn't be fiber local and things would be fine there...

Artūras Šlajus

May 23, 2012, 11:45:54 AM
to cellulo...@googlegroups.com
On 05/23/2012 06:36 PM, Jonathan Rochkind wrote:
> (Yes, one solution would be: Okay, don't use ActiveRecord. I have
> various reasons for continuing to try to use AR, including lack of faith
> that any other mature ruby ORM does any better, I could spend a lot of
> cost-of-switch and learn-a-new-ORM time, and just wind up in the same
> place. Devil you know.)

BTW - if your project is still young, I recommend you to check out Scala
language and consider it for development.

I know I'd use it if I had to start over from scratch (BTW, just brought
down 3 second algo to 0.08s with conversion from jruby -> scala.
Impressive, huh?)

Jonathan Rochkind

May 23, 2012, 12:04:44 PM
to cellulo...@googlegroups.com
On 5/23/2012 11:43 AM, Artūras Šlajus wrote:
>>
>> Hmm, why do you think your monkey patches don't make sense for AR core?
> Well, somebody might not be using celluloid ;)

Sure, but is it unique to celluloid? It seems possibly generally useful
for any multi-threaded use of AR, no? But I need to think about it
more, if you don't think so or are unsure.

Tony Arcieri

May 23, 2012, 12:15:29 PM
to cellulo...@googlegroups.com
On Wed, May 23, 2012 at 11:04 AM, Jonathan Rochkind <roch...@jhu.edu> wrote:
> Sure, but is it unique to celluloid? It seems possibly generally useful for any multi-threaded use of AR, no? But I need to think about it more, if you don't think so or are unsure.

As it so happens, I'm sitting inside a talk at JRubyConf and this very issue just came up as it relates to background threads in multithreaded applications in general.

Apparently tenderlove wants to fix it. I pinged him on Twitter... hopefully he might opine on this thread.

--
Tony Arcieri

Jonathan Rochkind

May 23, 2012, 12:22:13 PM
to cellulo...@googlegroups.com
On 5/23/2012 12:15 PM, Tony Arcieri wrote:
> On Wed, May 23, 2012 at 11:04 AM, Jonathan Rochkind <roch...@jhu.edu
> <mailto:roch...@jhu.edu>> wrote:
>
> Sure, but is it unique to celluloid? It seems possibly generally
> useful for any multi-threaded use of AR, no? But I need to think
> about it more, if you don't think so or are unsure.
>
>
> As it so happens, I'm sitting inside a talk at JRubyConf and this very
> issue just came up as it relates to background threads in multithreaded
> applications in general.

Awesome. Which "very issue", we were talking about several, or at least
two:

1. The inability to prevent ActiveRecord from transparently checking out
a connection you don't know about (and thus can't easily check back in)
when you intend to always use #with_connection, but have code that in
error does an AR call without being wrapped in #with_connection.

* This one is pretty clear to me a problem for any MT use of AR, and
needs to be fixed one way or another. The trick is doing it in a way
that doesn't significantly change the contract for 'ordinary'
single-threaded Rails request loop use of AR. I've got some ideas, but
not a magic solution.

2. Arturas's issue, which is harder to explain, involving his
checkout_with_id monkey patch. Basically an ability to tell AR "I
already know about a connection, I want to make sure you use it in this
block."

* This one is quite a bit more confusing, and it's not entirely clear
to me in what use cases it would actually be appropriate/safe to use,
whether with or without Celluloid.


Tony Arcieri

May 23, 2012, 12:23:28 PM
to cellulo...@googlegroups.com
On Wed, May 23, 2012 at 11:22 AM, Jonathan Rochkind <roch...@jhu.edu> wrote:
> Awesome. Which "very issue", we were talking about several, or at least two

The general issue of AR connection leakage inside multithreaded programs, and potential solutions including explicit connection checkout
 
--
Tony Arcieri

Artūras Šlajus

May 23, 2012, 12:48:14 PM
to cellulo...@googlegroups.com
On 05/23/2012 05:52 PM, Jonathan Rochkind wrote:
> I haven't completely wrapped my head around what your monkey patch gets
> you. Can you explain for dummies what it does, that just using AR's
> existing with_connection wouldn't?
Oh, by the way.

In JRuby, if your underlying #jdbc_connection is busy from, let's say, Java
code, ActiveRecord actually has no idea about that. That's why I need
#with_new_connection :)

Jonathan Rochkind

May 23, 2012, 1:08:29 PM
to cellulo...@googlegroups.com
Oh, thanks, makes sense. Yeah, I briefly looked into how AR
ConnectionPool would deal with JDBC connections (and connection pools)
under jruby -- but decided that was indeed a whole different kind of
mess. (The 'right' solution there for actual AR is probably different
than #with_new_connection)

In case anyone is interested in what I'm actually dealing with and what
I suspect the problems are, even though it takes some words to describe....

At the moment, I'm actually using MRI. The GIL doesn't bother me. The
main reason I need threads is because I've got some workers that need to
do a lot of HTTP connections to external services (using complex logic
between HTTP calls to decide the next call, it's not something you can
just throw at typhoeus).

MRI GIL'd threads seem to work just fine for making sure a thread gets
switched out when waiting on external I/O, so I don't need to serially
wait for a whole bunch of external servers to respond to HTTP requests
in order.

The problem is that these workers periodically do need to use AR to
store their 'findings' in the db. And that's where the mess comes in.

I'm not entirely sure what's causing the mess right now.

But I believe one aspect of it is the way MRI
mutex/monitor/conditionvariables are NOT "first in, first out". It's
possible, in a race condition, for a thread waiting on a
conditionvariable to get continuously bumped by other threads that
actually got there later but managed to get the contested connection
sooner.

I think Java mutex/monitor/conditionvariables actually have the same
lack of first-in-first-out.

I contributed a patch to rails 3-2-stable (now in 3.2.3) that _lessened_
the frequency/likelihood of that sort of race condition, but couldn't
actually eliminate it. tenderlove at that point didn't want my patch in
master/rails4, but no matter, because I'm still having problems in my
rails 3.2.3 app. I _think_ those are partially caused by the lack of
first-in-first-out, but it's possible I have no idea what's going on.
Leaked connections may also be a problem, although I've got some testing
that tries to show it's not. Or maybe it's something else I have no idea about.

Anyhow, so I started considering different possible ways to use
Celluloid to work around this and get more predictable concurrency
semantics. There were several possible architectures I was thinking of,
depending on how AR/Celluloid interacted. Currently, I guess I'm
wondering whether there's a way to do all my AR from my own previously
existing (not necessarily Celluloid) threads, mediated through a pool of
Celluloid threads that do the actual AR, with Celluloid's better
control of concurrency semantics including, in some cases,
first-in/first-out semantics.

But it's gonna be a mess no matter what, I'm not quite sure what to do.
And yeah, sadly, this is an existing fairly complicated app, that also
has third-party plugins written for it, sigh. Certainly changing the
whole architecture to use a redis queue or something would be another
alternative, but not a pleasant one at this point.

Ironically, the problems were _smaller_ when the app was Rails 2.1 with
mysql (not mysql2) -- concurrency was totally broken in that scenario,
but, ironically, it was broken in a way that kept these other problems
from coming up, but still allowed adequate performance for my use case.

Artūras Šlajus

May 23, 2012, 1:11:35 PM
to cellulo...@googlegroups.com
What about just using a non-blocking http library and having a single
thread for activerecord?
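
A rough sketch of the "single thread for ActiveRecord" half of that idea, using ordinary blocking worker threads instead of a non-blocking HTTP library. `DbWriter` is a hypothetical name and the block passed to `submit` stands in for the real AR call:

```ruby
require "thread"

# Hypothetical writer: all database work is funnelled through one thread,
# so only that thread would ever hold a connection.
class DbWriter
  def initialize
    @jobs = Queue.new
    @thread = Thread.new do
      # Run jobs one at a time until the :stop marker arrives.
      while (job = @jobs.pop) != :stop
        job.call
      end
    end
  end

  def submit(&work)
    @jobs << work
  end

  def shutdown
    @jobs << :stop
    @thread.join
  end
end

saved = []
writer = DbWriter.new

workers = 4.times.map do |i|
  Thread.new do
    finding = "result-#{i}"            # stands in for the HTTP work
    writer.submit { saved << finding } # stands in for the AR save
  end
end
workers.each(&:join)
writer.shutdown

puts saved.sort.inspect
```

The workers never touch the "database" themselves; they only enqueue work, and the single writer thread runs it in the order it arrived in the queue.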

Ben Langfeld

May 23, 2012, 1:23:12 PM
to cellulo...@googlegroups.com
You might even consider using Sequel, since it claims to fix this problem.

Regards,
Ben Langfeld

Jonathan Rochkind

May 23, 2012, 1:36:22 PM
to cellulo...@googlegroups.com
On 5/23/2012 1:11 PM, Artūras Šlajus wrote:
> What about just using a non-blocking http library and having a single
> thread for activerecord?

Definitely one possibility. But there are a bunch of plugins written by
other people that would need to be changed to use the non-blocking http
library.

Worse, some of these plugins need to, say, use SOAP, so use a ruby SOAP
gem, that may not be able to use a non-blocking http library without a
patch there. Or use some other client/agent/wrapper gem for some other
specialized stuff, that may not be able to use a standard preferred
non-blocking http library.

It would wind up being a cascade of code changes required, including
patches to other people's gems in sometimes non-trivial ways. (Or
non-trivial changes to my stuff or my collaborator's plugins to stop
using those gems).

I've got a bunch of not so great options in front of me, I've spent some
time considering them all.

Sean McKibben

May 23, 2012, 2:02:41 PM
to cellulo...@googlegroups.com
I'm in a pretty similar boat, though fortunately it is mostly greenfield at this point. Based on today's discussion, we're going to take out our AR code, probably use Mike Perham's connection_pool gem to dole out PG connections from a single actor and hopefully use ActiveModel objects in a Value Object + mediator pattern.

I'm glad our AR code only made its way into the celluloid arena Monday, so I can still back it out. I'm definitely not in a place where I could jump to Scala, plus I'm optimistic that the Ruby community can actually make the transition to more comfortable multithreading. If not, it would be a major loss IMHO.

Sean

Ben Langfeld

unread,
May 23, 2012, 2:06:04 PM5/23/12
to cellulo...@googlegroups.com
Sean: Have you tried/considered Sequel, rather than rolling your own
stuff as you described? It kinda claims to be a silver-bullet and
could possibly be a big time-saver.

Regards,
Ben Langfeld

Sean McKibben

unread,
May 23, 2012, 2:37:15 PM5/23/12
to cellulo...@googlegroups.com
I did use Sequel with the version of our app that runs in 2.2, but it has been a while. I forget if you provide connections to it or if it maintains its own pool, and perhaps it has changed since we first used it. I am definitely going to see what it can do for us. From a persistence standpoint, what we're doing from our actors (which get started from a sidekiq process) is mostly stuff like SQL level atomic incrementing, etc. and we're storing some things in Riak already, so we're pretty used to ActiveModel + MultiJson for object persistence anyway.


Sean

Kevin Bouwkamp

unread,
Nov 13, 2013, 12:05:24 PM11/13/13
to cellulo...@googlegroups.com
As it pertains to the AR connection pool situation, I've had luck wrapping all of my database code inside of my actors with this function, in Rails 4:

def db
  # Recover connections that dead threads never checked back in.
  ActiveRecord::Base.connection_pool.reap
  yield
ensure
  # Runs whether or not the block raised; any exception still propagates.
  ActiveRecord::Base.connection.close if ActiveRecord::Base.connection
  ActiveRecord::Base.clear_active_connections!
end

# example
db do
  Model.destroy_all
end

I originally found this solution on StackOverflow a while back when having the same issue with rufus-scheduler, but I am having trouble finding it now.
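
For anyone wondering why the ensure matters, here is a minimal sketch of the check-out / check-in pattern the helper relies on. TinyPool and its methods are hypothetical stand-ins made up for illustration, not ActiveRecord's pool; the only point is that the connection goes back to the pool even when the block raises.

```ruby
require "thread"

# Hypothetical stand-in for a connection pool (NOT ActiveRecord's implementation).
class TinyPool
  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  # Check a connection out, yield it, and always check it back in.
  def with_connection
    conn = @available.pop
    yield conn
  ensure
    @available << conn if conn
  end

  # Number of connections currently sitting in the pool.
  def available_count
    @available.size
  end
end

pool = TinyPool.new(2)

begin
  pool.with_connection { |conn| raise "query failed on #{conn}" }
rescue RuntimeError
  # The error still reaches the caller; only the check-in is guaranteed.
end

leaked = 2 - pool.available_count
```

If the ensure were missing, every failed block would permanently take one connection out of the pool, which is the kind of leak discussed earlier in this thread.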

Kevin

Kamil Kukura

unread,
Dec 29, 2017, 5:43:24 PM12/29/17
to Celluloid
Looking into ActiveRecord 5, it seems it doesn't use thread locals (with one exception in the ActiveRecord::NoTouching module). In ConnectionPool (the class that ConnectionPool::Reaper is nested in) there is a private method:

#--
# From the discussion on GitHub:
# This hook-in method allows for easier monkey-patching fixes needed by
# JRuby users that use Fibers.
def connection_cache_key(thread)
  thread
end

Normally it's called with Thread.current, so perhaps this could be overridden to return Fiber.current.
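
A rough sketch of that idea, untested against real ActiveRecord. FakePool, FiberCacheKey and FiberKeyedPool are hypothetical names made up here; FakePool mirrors only the hook quoted above. It also shows the fiber-local behaviour of Thread.current[] that started this thread.

```ruby
require "fiber" unless Fiber.respond_to?(:current)

# Hypothetical stand-in: caches one "connection" per cache key.
class FakePool
  def initialize
    @cache = {}
  end

  def connection
    @cache[connection_cache_key(Thread.current)] ||= Object.new
  end

  private

  # Same shape as the hook quoted above: key by thread.
  def connection_cache_key(thread)
    thread
  end
end

# Assumed override: ignore the thread and key by the running fiber instead.
module FiberCacheKey
  private

  def connection_cache_key(_thread)
    Fiber.current
  end
end

class FiberKeyedPool < FakePool
  prepend FiberCacheKey
end

# Thread#[] storage is fiber-local, so a new fiber does not see the value.
Thread.current[:marker] = :root
seen_in_fiber = Fiber.new { Thread.current[:marker] }.resume

thread_pool = FakePool.new
fiber_pool = FiberKeyedPool.new
outer_thread_conn = thread_pool.connection
outer_fiber_conn = fiber_pool.connection

inner_thread_conn = nil
inner_fiber_conn = nil
Fiber.new do
  inner_thread_conn = thread_pool.connection
  inner_fiber_conn = fiber_pool.connection
end.resume

same_by_thread = outer_thread_conn.equal?(inner_thread_conn) # one entry per thread
same_by_fiber = outer_fiber_conn.equal?(inner_fiber_conn)    # one entry per fiber
```

Keyed by thread, both fibers share one cache entry; keyed by fiber, each fiber gets its own. In a real pool that would presumably mean every fiber has to check its connection back in itself, so treat this as a starting point rather than a fix.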

viren...@teliax.com

unread,
Jan 25, 2018, 9:38:16 PM1/25/18
to Celluloid

> I think a single method invocation in an Actor will always be in a 
> single Fiber, right? So if inside a method invocation, all AR use is 
> wrapped in a ConnectionPool.with_connection, then I _think_ _maybe_ 
> it'll be okay. 
Yeah, as long as it's in exclusive block. 

What I take from the exchange above is that wrapping AR use in with_connection inside an exclusive block is the way to handle AR concurrency with Celluloid. But has anyone checked whether AR actually reconnects in that setup (when reconnect logic is in play)?
