Complex situation


Gonzalo Rodríguez-Baltanás Díaz

Nov 1, 2010, 7:41:53 PM
to ruby...@googlegroups.com

Hello


I have a situation that I don't know how to solve.

We have 1 Worker_Manager and a fixed number of Workers.

The Workers are created inside the Worker_Manager using fork{ ... }

Each worker should notify the Worker_Manager when it finishes its task.

The Worker_Manager should stop the EM and the BrB service once all workers have contacted the Worker_Manager.

Currently I have a counter in Worker_Manager. Every time a worker finishes, I call a method on the Worker_Manager that increments that counter. When the counter reaches the desired number (the number of workers), it stops the EM and BrB::Service.

The problem is that there is a deadlock in this design.

If I make the method call that increments the counter at Worker_Manager a _block call, then when the services are stopped there won't be a return, so the worker is left waiting.

If it is not a _block call, then something weird happens (even if I put in a sleep(4)) and the Worker_Manager blocks too.

I don't know what more to try. Do you see anything wrong?

You can reproduce all this with:
rspec spec/worker_manager/WorkerManager_spec.rb -f d -c

Thank you.

Code:
http://github.com/Nerian/DPovray/tree/WorkerManager

See worker_manager.rb , method render_scene

See worker.rb , method start_your_work

Guillaume Luccisano

Nov 3, 2010, 1:11:53 AM
to ruby...@googlegroups.com
Hi,

Do you still have the issue? I cloned your project and the spec is running now.

2010/11/1 Gonzalo Rodríguez-Baltanás Díaz <sio...@gmail.com>

Gonzalo Rodríguez-Baltanás Díaz

Nov 3, 2010, 7:55:28 AM
to ruby...@googlegroups.com

Hi!

But did you use the WorkerManager branch?

Gonzalo Rodríguez-Baltanás Díaz

Nov 3, 2010, 9:13:37 PM
to ruby...@googlegroups.com

I think that it never gets out of the BrB::Service loop. Check the last commit in the WorkerManager branch. The Workers communicate with the Worker_Manager, and registrations and unregistrations are done right. But I think I am not doing the shutting-everything-down part right.

def report(arg)
  puts ">>> #{arg}"
  @number_of_completions += 1
  if @number_of_completions == @subjobs.count
    BrB::Service.stop_service
  end
end

This method is called on the Worker_Manager by a Worker when it completes its job. When the number of workers that have completed their jobs equals the number of subjobs, the EventMachine loop should stop.

But if you run the tests, it seems that it never stops. I am really stuck here :) Any tips?

Regards,
Gonzalo


On 03/11/2010, at 06:11, Guillaume Luccisano wrote:

Guillaume Luccisano

Nov 3, 2010, 9:21:57 PM
to ruby...@googlegroups.com
Yes, sorry, I've been kind of busy recently, but I will try to have a look tonight!
I'll keep you posted!

2010/11/3 Gonzalo Rodríguez-Baltanás Díaz <sio...@gmail.com>

Guillaume Luccisano

Nov 4, 2010, 12:51:51 AM
to ruby...@googlegroups.com
OK, so after a try: if you want to leave the EM.run block, you should call EM.stop somewhere, for example just after your BrB::Service.stop_service.
I hope this answers your question!
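
A minimal sketch of that ordering, assuming the completion counter lives in a plain Ruby object. So that it runs without the BrB and EventMachine gems, the shutdown step is passed in as a block; `CompletionTracker` and its block argument are hypothetical names for illustration, not part of either library. In the real Worker_Manager the block would presumably call BrB::Service.stop_service followed by EM.stop.

```ruby
# Hypothetical helper (not part of BrB or EventMachine): counts worker
# completions and fires a shutdown block once every worker has reported.
class CompletionTracker
  attr_reader :completions

  def initialize(expected, &on_all_done)
    @expected = expected
    @completions = 0
    @on_all_done = on_all_done
    @done = false
  end

  # Plays the role of `report` from the thread: one call per finished worker.
  def report(arg)
    puts ">>> #{arg}"
    @completions += 1
    return if @done || @completions < @expected

    @done = true          # guard so the shutdown block runs only once
    @on_all_done.call
  end

  def done?
    @done
  end
end

# Usage: record the shutdown order instead of touching real services.
shutdown_log = []
tracker = CompletionTracker.new(3) do
  shutdown_log << :stop_service  # real app (assumed): BrB::Service.stop_service
  shutdown_log << :em_stop       # real app (assumed): EM.stop, so EM.run returns
end
3.times { |i| tracker.report("worker #{i} done") }
```

The guard flag is there because a late or duplicate report should not trigger a second shutdown attempt.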

2010/11/3 Guillaume Luccisano <guillaume...@gmail.com>

Gonzalo Rodríguez-Baltanás Díaz

Nov 4, 2010, 4:26:09 PM
to ruby...@googlegroups.com

Thanks for the clarification!

Now the code runs perfectly. I also added a sleep() before the workers start working, because they were being created before the EM had finished starting; that is why the app used to freeze.

Thank you for your help!

Guillaume Luccisano

Nov 4, 2010, 4:29:19 PM
to ruby...@googlegroups.com
You are welcome, I'm glad I could help.
Anyway, the sleep is probably not the best solution. Can't you implement a callback mechanism instead?
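
One possible shape for such a mechanism, sketched with only the Ruby standard library and assuming a Unix-like platform where fork is available. This is not the project's code: `run_workers_after_ready` is a hypothetical name, and the two pipes stand in for the BrB calls. Each forked worker blocks until the manager signals that its service is up, so no sleep is needed.

```ruby
# Sketch: the manager tells forked workers when it is safe to start,
# instead of each worker sleeping for a guessed interval.
def run_workers_after_ready(worker_count)
  ready_r, ready_w = IO.pipe  # manager -> workers: "service is up"
  done_r, done_w = IO.pipe    # workers -> manager: "I finished"

  pids = worker_count.times.map do |i|
    fork do
      ready_w.close
      done_r.close
      ready_r.read                        # blocks until the manager closes ready_w
      done_w.write("worker #{i} done\n")  # stand-in for the report call
      done_w.close
      exit!(0)                            # skip at_exit hooks inherited from the parent
    end
  end

  ready_r.close
  done_w.close
  # Real app (assumed): the EM / BrB service would be started at this point.
  ready_w.close                           # EOF wakes every waiting worker at once

  reports = done_r.read.lines.map(&:chomp)  # returns once all workers closed done_w
  done_r.close
  pids.each { |pid| Process.wait(pid) }
  reports.sort
end

reports = run_workers_after_ready(3)
puts reports
```

Closing the write end of the readiness pipe acts as the callback: every worker's read returns at the same moment, however long the manager took to start.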

2010/11/4 Gonzalo Rodríguez-Baltanás Díaz <sio...@gmail.com>

Gonzalo Rodríguez-Baltanás Díaz

Nov 4, 2010, 6:16:48 PM
to ruby...@googlegroups.com

Hello

I made a Gist with a description of the problem. I agree that sleep is not a good solution. I am going to think about it now that I am refactoring all the current code.
