Merb 1.0.15: run_later seems broken w/ clusters


Gary Yngve

Dec 31, 2009, 7:09:30 PM
to merb
run_later is working just fine in bin/merb -i or bin/merb, but no
longer seems to be working with the -c parameter (clustering).

The diffs btwn 1.0.12 and 1.0.15 are very small.. looks like the
worker starting was moved a little bit.. about to dive in more..
Has anyone been able to get run_later working on Merb 1.0.15 w/
clustering?

-Gary

Gary Yngve

Dec 31, 2009, 8:26:22 PM
to merb
So on 1.0.15, I added some logging regarding run_later, initializing a
worker, and entering the work loop..

I get:


initializing! [82326, false, 0]
in worker! [82326, false, 0]
QUEUEING! [82327, true, 0]

and nothing else..
where 82326 is the PID of the spawner, 82327 is the PID of one of the
merb workers, the boolean is Merb::Worker.started?, and the number is
the size of the queue.

I trigger a few more requests..
QUEUEING! [82327, true, 1]
QUEUEING! [82327, true, 2]
QUEUEING! [82327, true, 3]

the queue isn't being emptied..

The problem is the Worker should have been initialized in PID 82327,
right?

-Gary

Martin Gamsjaeger

Jan 1, 2010, 12:34:04 PM
to me...@googlegroups.com
Gary,

This is a known bug that also affects merb/master. I'm not really
familiar with those parts of the code, but here's an excerpt from a
conversation with someone on IRC who summarized it as follows:

"The background worker is started after you fork the spawner, but
before you fork workers so, although the workers each have their own
work_queue (which is getting filled up), none of them have a
background worker thread to process the queue."

A patch for that would be highly appreciated :)

snusnu

> --
>
> You received this message because you are subscribed to the Google Groups "merb" group.
> To post to this group, send email to me...@googlegroups.com.
> To unsubscribe from this group, send email to merb+uns...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/merb?hl=en.
>
>
>

Gary Yngve

Jan 2, 2010, 2:48:33 PM
to merb
Ah, here is the key info from Kernel.fork:

"The thread calling fork is the only thread in the created child
process. fork doesn't copy other threads."

So the @worker instance variable and the started? method in Merb::Worker
are misleading when in a child: the worker object is still there, but its
thread is dead.
Here is what Merb::Worker looks like in the child just after a fork:
#<Merb::Worker:0x37ca9a4 @thread=#<Thread:0x37ca88c dead>>
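
The same behaviour can be reproduced outside Merb. What follows is only a minimal standalone sketch in plain Ruby, not Merb code: the method name thread_alive_in_child? is made up for the illustration, and it assumes a platform where Kernel.fork is available. It starts a background thread, forks, and has the child report through a pipe whether the inherited thread object is still alive.

```ruby
# Standalone illustration (not Merb code): a thread started before fork
# is not running in the forked child.
def thread_alive_in_child?
  worker = Thread.new { sleep }   # stand-in for a background worker thread
  reader, writer = IO.pipe

  pid = Kernel.fork do
    reader.close
    # Only the thread that called fork exists in the child, so the
    # inherited `worker` object refers to a dead thread here.
    writer.write(worker.alive? ? "alive" : "dead")
    writer.close
    exit!(0)                      # leave without running inherited at_exit hooks
  end

  writer.close
  child_status = reader.read      # what the child saw
  reader.close
  Process.wait(pid)

  parent_alive = worker.alive?    # the parent's copy is unaffected by the fork
  worker.kill
  [parent_alive, child_status]
end

parent_alive, child_status = thread_alive_in_child?
puts "parent: #{parent_alive ? 'alive' : 'dead'}, child: #{child_status}"
```

The object itself survives the fork, which is why a check like "is @worker set?" still passes in the child; only asking the thread whether it is alive gives the real answer.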

Here is my fix (a little ugly because of the need to nuke the
singleton) that is the simplest possible patch to get me unblocked:

merb-core/rack/adapter/abstract.rb:

  def self.spawn_worker(port)
    worker_pid = Kernel.fork
+   unless Merb::Worker.start.thread.alive?
+     Merb::Worker.instance_eval("@worker = nil")
+     Merb::Worker.start
+   end
    ...

You'll probably want to abstract the code into
Merb::Worker.alive?
and
Merb::Worker.restart

so that the code then reads
Merb::Worker.restart unless Merb::Worker.alive?
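
To make the alive?/restart idea concrete, here is a rough sketch on a hypothetical stand-in class. BackgroundWorker, its QUEUE constant and job_runs_in_child? are all invented for this example and are not Merb::Worker's actual implementation; the sketch also assumes Kernel.fork is available. It queues a job inside a forked child, once without restarting the worker and once with the restart.

```ruby
require "thread"

# Hypothetical stand-in for a singleton background worker; NOT Merb's real code.
class BackgroundWorker
  QUEUE = Queue.new

  class << self
    # Create the singleton worker once and return it.
    def start
      @worker ||= new
    end

    # True only if a worker exists and its thread is running in this process.
    def alive?
      !@worker.nil? && @worker.thread.alive?
    end

    # Drop the (possibly dead) singleton and start a fresh one.
    def restart
      @worker = nil
      start
    end
  end

  attr_reader :thread

  def initialize
    # Pop queued callables and run them, forever.
    @thread = Thread.new { loop { QUEUE.pop.call } }
  end
end

# Fork a child, queue one job in it, and report whether the job got run.
def job_runs_in_child?(restart_after_fork)
  BackgroundWorker.start          # worker thread is started before the fork
  reader, writer = IO.pipe

  pid = Kernel.fork do
    reader.close
    if restart_after_fork
      BackgroundWorker.restart unless BackgroundWorker.alive?
    end
    ran = false
    BackgroundWorker::QUEUE << lambda { ran = true }
    # Poll for up to about a second; with no live worker the job never runs.
    20.times { break if ran; sleep 0.05 }
    writer.write(ran ? "ran" : "stuck")
    writer.close
    exit!(0)                      # leave without running inherited at_exit hooks
  end

  writer.close
  result = reader.read
  reader.close
  Process.wait(pid)
  result
end

puts "without restart: #{job_runs_in_child?(false)}"
puts "with restart:    #{job_runs_in_child?(true)}"
```

Keeping the check in alive? rather than in a "has start been called?" flag is the point: the flag is copied into the child by fork, the thread is not.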

-Gary

jonah honeyman

Jan 6, 2010, 4:31:58 AM
to me...@googlegroups.com
Gary,

Can you submit the patch to Lighthouse? I would like to see this merged into master and 1.0.x ASAP. I also think that a 1.0.16 release with this applied is more than appropriate.

--
-jonah

Gary Yngve

Jan 6, 2010, 3:45:12 PM
to merb
Jonah, done.

Sadly I don't have the time to invest to figure out how to write a
unit test for this.. seems tricky.

I think it's scary that the passenger change was introduced without
realizing that it broke clusters, and it would be good to avoid a
repeat.
When a CEO has to weigh the benefits of a sexy web framework that
makes engineers happy vs. the risk that new versions of the framework
may break in unpredictable ways that cost engineering man-hours or
impact customers, it is going to be a tough sell.

-Gary

Pavel Kunc

Feb 6, 2010, 8:57:36 AM
to merb
I've tried to apply the patch against 1.1.0.pre (the change which broke
it is in 1.1.0.pre as well). I'm on Mac OS X and tested on Ruby 1.9.1
and 1.8.7.

The problem is that on 1.9.1 on OS X the cluster workers don't even start
to listen and the whole cluster looks dead. I'm not saying this is
caused by the patch.

On 1.8.7 I managed to run the cluster and applied the patch, however
the patch didn't fix the problem. I still don't get anything from
the background threads.

Can you guys share the system configuration on which the patch fixes
the problem, please?

Pavel Kunc

Feb 7, 2010, 3:26:43 PM
to merb
Good news:

We've applied an altered patch after a long and deep discussion and
reading of the Merb source with Jonathan. The patch we applied fixes
run_later behaviour on Merb 1.1.0.pre & Thin. It has been tested on
1.9.1 and 1.8.7 and worked on both.

HOWEVER!
We didn't have an opportunity to test it on Passenger, though we think
the patch should not affect Passenger functionality.

WE WOULD BE GRATEFUL IF SOMEBODY COULD TEST THE PATCH ON THEIR PASSENGER
SETUP AND GIVE FEEDBACK!

I'm not going to release new 1.1.0.pre gems to GT until this has been
tested on Passenger and other servers/platforms.

Please provide your feedback to:
https://merb.lighthouseapp.com/projects/7433/tickets/1288-run_later-doesnt-work-on-ubuntu

Thanks a lot Gary for your research!

Pavel
