We just ran into a situation that feels like one many projects will have, so I wanted to discuss some options for solving it.
We've got an Actor that produces work very quickly (i.e. it just pulls records from the database) and sends that work as individual messages to some (routed) Actors that do long-running processing on each of them.
Our problem is that the producer is creating work much faster than the worker actors can work through it. The queued messages will eventually eat up our heap space, so I guess we need to throttle the producer.
These are some solutions we thought about; however, all of them seem to have caveats. I'd like to know if there's a better one out there and which one you would choose:
1) have thousands of worker actors. That doesn't work for us because they depend on a database, which is our actual bottleneck.
2) use a bounded mailbox size for the worker actors. That would then block the producing Actor when it tries to send even more messages to the workers. Sounds like what we need, however this doesn't work with remote Actors: in that case it doesn't block but sends the message to the dead letter queue.
3) use Derek Wyatt's PressureQueue (https://github.com/derekwyatt/PressureQueue-Concept/). It's a custom mailbox for the worker actors that delays the submission of new messages based on the mailbox size. Sounds like a really nice solution, I'm just a bit unsure if that's `the Akka way`, also because I don't see anyone actually using it and there have been no commits for 10 months.
4) the producer could pull only the IDs from the database and we hope that those fit into memory - i.e. we'd have millions of IDs as messages floating around. The workers then fetch the complete record later on and slowly get the job done. This obviously only works if all the IDs fit into memory.
5) use the TimerBasedThrottler (http://letitcrash.com/post/28901663062/throttling-messages-in-akka-2), which makes our producer create only X amount of work per time unit. The problem here is how do I get the X and the time unit? It's only ever going to be a rough guess, so I'm either missing out on performance (if my workers could go faster) or potentially running out of memory (if my workers can't catch up, e.g. because of other load on the system).
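To make the blocking behaviour described in option 2 concrete, here is a minimal sketch in plain Java. It is not Akka code and does not use Akka's bounded mailbox: it only models the local case with a JDK `ArrayBlockingQueue` between one fast producer thread and one slow worker thread. The class name `BoundedBufferSketch`, the method `run`, and the capacity and item counts are all made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for option 2: a bounded buffer between a fast
// producer and a slow worker. Plain JDK threads, not Akka mailboxes.
class BoundedBufferSketch {

    /** Returns {itemsProcessed, largestBacklogSeenByProducer}. */
    static int[] run(int capacity, int totalItems) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(capacity);
        AtomicInteger processed = new AtomicInteger();

        // Slow consumer: takes one item at a time and "works" on it.
        Thread worker = new Thread(() -> {
            try {
                for (int i = 0; i < totalItems; i++) {
                    buffer.take();            // waits until an item is available
                    Thread.sleep(1);          // simulate long-running processing
                    processed.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        int maxBacklog = 0;
        try {
            // Fast producer: put() blocks while the buffer is full, so the
            // backlog can never grow beyond `capacity`.
            for (int i = 0; i < totalItems; i++) {
                buffer.put(i);
                maxBacklog = Math.max(maxBacklog, buffer.size());
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while producing", e);
        }
        return new int[] {processed.get(), maxBacklog};
    }

    public static void main(String[] args) {
        int[] result = run(5, 50);
        System.out.println("processed=" + result[0] + " maxBacklog=" + result[1]);
    }
}
```

The memory bound comes purely from `put` blocking. That is also the weakness raised in this thread: a blocked producer cannot do anything else, and the sketch says nothing about the remote case.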
Any thoughts?
Michael
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
And here's how we ended up implementing it, including a test:
http://www.michaelpollmeier.com/akka-work-pulling-pattern-to-throttle-work/
On Tuesday, 12 February 2013 16:59:10 UTC+13, Michael Pollmeier wrote:
Thanks for your thoughts, guys!
Another problem I found with the PressureQueue concept is that it blocks the producing actor while it enqueues the message, which means the producer can't react to other messages in the meantime.
We've come up with a simple acknowledgement idea inspired by Derek's article and Evan's hint:
- workers register themselves at the master
- the master sends all workers a WorkAvailable message if it has some work
- workers then request work from the master, which responds with a piece of work
- when a worker has finished that piece of work, it requests more work from the master
- in case there's no more work available, the master simply doesn't respond
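The steps above can be sketched as a small single-threaded model in plain Java. This is not the implementation from the linked blog post and it does not use Akka: the `Dispatcher` class is a made-up FIFO that stands in for asynchronous message delivery, and `Master`, `Worker`, `Stats`, and all method names are hypothetical. It only shows that the master hands out one item per request, so the number of items in flight never exceeds the number of workers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.stream.IntStream;

class WorkPullingSketch {
    // Stand-in for message passing: every "send" enqueues one delivery.
    static final class Dispatcher {
        private final Queue<Runnable> deliveries = new ArrayDeque<>();
        void send(Runnable delivery) { deliveries.add(delivery); }
        void runUntilIdle() {
            Runnable next;
            while ((next = deliveries.poll()) != null) next.run();
        }
    }

    // Instrumentation only, not part of the protocol.
    static final class Stats { int handedOut, completed, maxInFlight; }

    static final class Master {
        private final Dispatcher dispatcher;
        private final Iterator<Integer> source;  // lazily produced work, e.g. a DB cursor
        private final Stats stats;
        private final List<Worker> workers = new ArrayList<>();

        Master(Dispatcher dispatcher, Iterator<Integer> source, Stats stats) {
            this.dispatcher = dispatcher; this.source = source; this.stats = stats;
        }
        void register(Worker w) { workers.add(w); }          // workers register
        void announce() {                                    // WorkAvailable to all workers
            if (!source.hasNext()) return;
            for (Worker w : workers) dispatcher.send(w::workAvailable);
        }
        void requestWork(Worker w) {
            if (!source.hasNext()) return;                   // no work left: no reply
            int item = source.next();                        // pull exactly one item
            stats.handedOut++;
            stats.maxInFlight = Math.max(stats.maxInFlight, stats.handedOut - stats.completed);
            dispatcher.send(() -> w.work(item));
        }
    }

    static final class Worker {
        private final Dispatcher dispatcher;
        private final Master master;
        private final Stats stats;

        Worker(Dispatcher dispatcher, Master master, Stats stats) {
            this.dispatcher = dispatcher; this.master = master; this.stats = stats;
        }
        void start() { dispatcher.send(() -> master.register(this)); }
        void workAvailable() { dispatcher.send(() -> master.requestWork(this)); }
        void work(int item) {
            stats.completed++;                               // the slow processing would go here
            dispatcher.send(() -> master.requestWork(this)); // finished: ask for more
        }
    }

    /** Returns {completedItems, maxItemsInFlight}. */
    static int[] run(int workerCount, int itemCount) {
        Dispatcher dispatcher = new Dispatcher();
        Stats stats = new Stats();
        Master master = new Master(dispatcher, IntStream.range(0, itemCount).iterator(), stats);
        for (int i = 0; i < workerCount; i++) new Worker(dispatcher, master, stats).start();
        dispatcher.send(master::announce);                   // delivered after the registrations
        dispatcher.runUntilIdle();
        return new int[] {stats.completed, stats.maxInFlight};
    }

    public static void main(String[] args) {
        int[] result = run(3, 10);
        // prints "completed=10 maxInFlight=3"
        System.out.println("completed=" + result[0] + " maxInFlight=" + result[1]);
    }
}
```

Because the master only reads from its source when a worker asks, the producer side never runs ahead of the workers, and nothing blocks. In this model a worker that receives WorkAvailable while already busy would request a second item; the sketch avoids that by announcing only once.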
Best regards
Michael
On Monday, 11 February 2013 04:40:11 UTC, Michael Pollmeier wrote:
[quoted original message trimmed; it repeats the first post in this thread]