[erlang-questions] time consuming operations inside gen_server

Martin Dimitrov

Dec 12, 2012, 4:27:42 AM
to Erlang Questions
Hi all,

In our application, we have a gen_server that does a time consuming
operation. The message is of type cast thus the caller process doesn't
sit and wait for the operation to finish. But while the gen_server is
busy with the cast message, it doesn't serve any other call, right?

So, would it be appropriate to create a process that will do the time
consuming operation and then notify the gen_server?

Thanks for looking into this.

Best regards,
Martin
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Attila Rajmund Nohl

Dec 12, 2012, 4:42:02 AM
to Martin Dimitrov, Erlang Questions
2012/12/12 Martin Dimitrov <mrtndi...@gmail.com>:
> Hi all,
>
> In our application, we have a gen_server that does a time consuming
> operation. The message is of type cast thus the caller process doesn't
> sit and wait for the operation to finish. But while the gen_server is
> busy with the cast message, it doesn't serve any other call, right?
>
> So, would it be appropriate to create a process that will do the time
> consuming operation and then notify the gen_server?

The pattern I use in this case is to still use gen_server:call (if the
caller needs to be blocked), start a separate process for the
time-consuming stuff, return {noreply, ...} from the handle_call (so
the gen_server can handle other calls), then call gen_server:reply
from the started process when the time-consuming operation has finished.
Of course, the applicability of this pattern depends on what you
actually do.
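
A minimal sketch of the pattern described above, assuming a hypothetical module name (slow_call_server) and a hypothetical slow_square/1 function standing in for the time-consuming operation. The caller blocks in gen_server:call, while the server itself returns {noreply, ...} immediately and stays free to serve other calls:

```erlang
-module(slow_call_server).
-behaviour(gen_server).

-export([start_link/0, compute/2, ping/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Blocks the caller until the worker replies (or the call times out).
compute(Server, N) ->
    gen_server:call(Server, {compute, N}, 10000).

%% A cheap call, used to show that the server stays responsive.
ping(Server) ->
    gen_server:call(Server, ping).

init([]) ->
    {ok, no_state}.

handle_call({compute, N}, From, State) ->
    %% Hand the slow work to a separate process and return at once.
    %% The worker answers the caller directly with gen_server:reply/2.
    spawn_link(fun() ->
        Result = slow_square(N),
        gen_server:reply(From, {ok, Result})
    end),
    {noreply, State};
handle_call(ping, _From, State) ->
    {reply, pong, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Hypothetical stand-in for the time-consuming operation.
slow_square(N) ->
    timer:sleep(200),
    N * N.
```

Because the worker is linked and the server does not trap exits, a crash in the worker also takes the server down in this sketch. Whether that is acceptable depends on the application.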

Martin Dimitrov

Dec 12, 2012, 4:46:59 AM
to Attila Rajmund Nohl, Erlang Questions
Thanks, this sounds very cool.

Roman Gafiyatullin

Dec 12, 2012, 4:58:00 AM
to Martin Dimitrov, Erlang Questions
Hello Martin,

In these cases the problem is usually solved as Attila Rajmund Nohl said.
Since it's a common situation, I wrote a lib for that: https://github.com/RGafiyatullin/gen_wp
See src/gen_wp_example.erl for how to use it.

Questions and constructive criticism are welcome and appreciated.

--
RG

Max Bourinov

Dec 12, 2012, 5:04:18 AM
to Martin Dimitrov, Erlang Questions
Hi Martin,

First of all, I try to avoid calls in my systems. I believe that if there is a call, there is a potential lock. Casts are friends.

In your case I would suggest having "special" workers for time-consuming operations. Workers should be gen_servers and should be part of your supervision tree.

This is just a general suggestion - of course each case is unique. Please adapt it to your case if possible.

Best regards,
Max

Martin Dimitrov

Dec 12, 2012, 5:11:25 AM
to Max Bourinov, Erlang Questions
Hi,

This is actually a cast, but since it takes a long time the gen_server will be
blocked, and I want to avoid this.

Thanks for the replies,

Martin

Martin Dimitrov

Dec 12, 2012, 5:27:23 AM
to Roman Gafiyatullin, Erlang Questions
I browsed through the files and it looks very interesting. What does "wp"
stand for?

Martin

Michael Truog

Dec 12, 2012, 5:28:52 AM
to Martin Dimitrov, Erlang Questions
On 12/12/2012 01:27 AM, Martin Dimitrov wrote:
> So, would it be appropriate to create a process that will do the time
> consuming operation and then notify the gen_server?

Yes, processes are cheap in Erlang. There is no reason not to create a separate process for a task that runs for a while, to make sure the task doesn't stop your gen_server from processing its message queue. You could do it with erlang:spawn_link or supervisor:start_child, both of which are typically used in this type of situation. You want to avoid using erlang:process_flag(trap_exit, true), because it is simpler to have a supervisor manage the process than to build manual, custom supervisor-like functionality within your gen_server.

There is another reason why long-running tasks often need separate processes, which is a separate concern. If a task uses a large quantity of memory quickly, you want that memory freed quickly for other processes. When a short-lived process dies, all the memory it used for the task is reclaimed at once, whereas in a longer-lived process the memory would just accumulate and would not be collected quickly.
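
As an illustration of the supervisor:start_child route, here is a minimal sketch of a simple_one_for_one supervisor that starts one short-lived process per task. The module name (task_sup) and the run/2 and start_task/1 functions are hypothetical, and the map-based child spec assumes a reasonably recent OTP release:

```erlang
-module(task_sup).
-behaviour(supervisor).

-export([start_link/0, run/2, start_task/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link(?MODULE, []).

%% Start one supervised, short-lived process that runs Fun.
run(Sup, Fun) when is_function(Fun, 0) ->
    supervisor:start_child(Sup, [Fun]).

%% Child start function. It runs inside the supervisor process, so
%% spawn_link/1 links the new task to the supervisor as required.
start_task(Fun) ->
    {ok, spawn_link(Fun)}.

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,
                 period => 10},
    Child = #{id => task,
              start => {?MODULE, start_task, []},
              restart => temporary,     % never restart a finished or crashed task
              shutdown => brutal_kill},
    {ok, {SupFlags, [Child]}}.
```

A gen_server could call task_sup:run(Sup, Fun) from handle_cast, with Fun ending in something like gen_server:cast(Server, {done, Result}) to notify the server when the work is finished. All of the task's memory is released when the process exits.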

Max Lapshin

Dec 12, 2012, 5:29:24 AM
to Martin Dimitrov, Erlang Questions
Call is a throttling and overload control mechanism.

If your worker is too slow, you will ruin your Erlang VM with several million casts in the message queue.

Loïc Hoguin

Dec 12, 2012, 5:30:05 AM
to Max Bourinov, Erlang Questions
Workers shouldn't be gen_servers. If they are, they are quite poor ones
considering you'll have a hard time inspecting, debugging or tracing
them, because your sys:get_status/1 call will simply time out while
the gen_server is busy doing whatever it's doing. You also likely have a
single call/cast for it, so you don't use much of gen_server itself.

It's OK to write processes that aren't gen_servers, especially in cases
like this. All you have to do is spawn a supervised worker passing it
your Pid, monitor it and then wait for either it to go down or finish
the task and give a result which you'll get back as a message. The
worker can simply stop after it sends said message. If you don't expect
a result, it's even simpler!
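
A minimal sketch of this approach for the cast case, with hypothetical names (cast_offload, slow_work/1). As a simplification it uses spawn_monitor/1 rather than a supervised worker: the worker is a plain process that sends its result back as a message and then stops, and the server learns about crashes through the 'DOWN' message:

```erlang
-module(cast_offload).
-behaviour(gen_server).

-export([start_link/0, process/2, results/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

process(Server, Job) ->
    gen_server:cast(Server, {process, Job}).

results(Server) ->
    gen_server:call(Server, results).

%% State: {Pending, Done}. Pending holds {WorkerPid, MonitorRef, Job}.
init([]) ->
    {ok, {[], []}}.

handle_call(results, _From, {_Pending, Done} = State) ->
    {reply, lists:reverse(Done), State}.

handle_cast({process, Job}, {Pending, Done}) ->
    Server = self(),
    %% The worker is a plain process: do the work, report, exit.
    {Pid, Ref} = spawn_monitor(fun() ->
        Server ! {job_done, self(), slow_work(Job)}
    end),
    {noreply, {[{Pid, Ref, Job} | Pending], Done}}.

handle_info({job_done, Pid, Result}, {Pending, Done}) ->
    case lists:keytake(Pid, 1, Pending) of
        {value, {Pid, Ref, Job}, Rest} ->
            %% Result received: drop the monitor and any queued 'DOWN'.
            erlang:demonitor(Ref, [flush]),
            {noreply, {Rest, [{Job, {ok, Result}} | Done]}};
        false ->
            {noreply, {Pending, Done}}
    end;
handle_info({'DOWN', Ref, process, _Pid, Reason}, {Pending, Done}) ->
    %% The worker died before delivering a result.
    case lists:keytake(Ref, 2, Pending) of
        {value, {_, Ref, Job}, Rest} ->
            {noreply, {Rest, [{Job, {error, Reason}} | Done]}};
        false ->
            {noreply, {Pending, Done}}
    end;
handle_info(_Other, State) ->
    {noreply, State}.

%% Hypothetical stand-in for the time-consuming operation.
slow_work(Job) ->
    timer:sleep(100),
    {processed, Job}.
```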

--
Loïc Hoguin
Erlang Cowboy
Nine Nines
http://ninenines.eu

Fred Hebert

Dec 12, 2012, 8:23:26 AM
to Martin Dimitrov, Erlang Questions
Hi Martin,

I gave a quick read to this thread and there are a few things I think
should be mentioned in order to make a decision. I'm writing this as
some kind of general guide I follow mentally, so please do not feel
patronized if you find I approach things at a basic level that's too
simple for your level of expertise. I'm writing it for you, but also for
myself (or anyone else finding it over google or whatever).

I believe you won't solve this problem by leaving things as they are,
but there are properties to figure out regarding the kind of work you're
doing:

1. Is this queue build-up related to temporary overflow? Does it happen
at peak time, in bursts, or is it a continuous overflow?
2. Are the tasks you're running in any way bound by time? What I mean
here is to ask how long you're allowed to wait. Is it milliseconds,
seconds, or hours, before a cast is a problem?
3. Are you in charge of producing the events in-system, or is it
something triggered by user actions, outside of your control?
4. Why does it take long to process? Is it CPU-bound, waiting on other
workers, or I/O-bound (disk, network), slowing your server down?
5. What's the nature of events you're handling?

Answering each of these questions will be the first step to being able
to pick an adequate solution. Here are a few possibilities:

- If you're in charge of producing events (your system creates them from
some static data source, for example), you can regulate them by having
a fixed number of producers and synchronous calls to put back-pressure
from your server on the workers. They won't do more work than the
consuming part of your system can handle.

In general terms, applying back-pressure this way is the most
efficient way to solve and survive all overload issues. It's a bit
tricky because it means you're pushing the problem up a level in your
stack, until at some point you either relieve the pressure or
push the back-pressure all the way back to users, and that's
sometimes not acceptable. Pushing it back to some load-balancing
mechanism that dispatches through more instances is often acceptable
as an alternative.

- You may expect tasks to be long to run, but to be fast to be
acknowledged. In this case, moving to an asynchronous model makes
sense. This can be done by spawning workers to do tasks while the
server simply accepts the queries, responds to them, and queues up
answers that have yet to come.

This form of concurrency is different from adding processes for the
sake of parallelism. We don't expect to handle more requests (or at
least more than Original*NumberOfCores), we just want to make the
accepting/response of events and their handling disjoint, not
happening in the same timeline to avoid blocking.

- If requests are to be very short-lived, it may be interesting to just
not handle them when the system is very busy, and fail. This is a bit
more tricky to put in place, and I believe it's rarely a good solution
based on the nature of the problem you solve. Systems where you're
allowed to give up and not handle things are a bit rare, I believe.

- If your handling of events is slow due to the task simply being long
to handle, you could try to figure the ideal rate at which you process
data and the overload you could handle in peak hours. This means
figuring out how many requests per second (or whatever time slice) you
can handle, how much you receive, and then finding a way to raise this
value through optimization, or adding more processes or more machines to
handle it. This is often a good way to proceed, as far as I know, when
you deal with predictable levels of overload or a constant level of
overload.

If the problem is CPU-bound, then there's an upper
limit to how much parallelism will help you. Better algorithm or data
structure choice, going down to HiPE or C, or finally using more
computers to do the work can all be considered.

If it's network or disk bound, then you have plenty of different
options to try. SSDs, compressing data, buffering before pushing it
around, possibly merging events or dropping non-vital ones, adding
more end-points (similar to sharding) may all help reduce that cost.

- It's possible you have different kinds of events, either from different
sources or to different endpoints. If that happens, it may be
interesting to quickly dispatch events from your central process to
workers dedicated to a source and/or an endpoint. This will naturally
divide the workload done and may solve your problem to some extent. If
what you get is extremely uneven distribution of events (for example,
95% of them are from one source to one endpoint), then dividing your
dispatching and handling is likely to only help about 5% of requests.

- Dropping out of OTP is a possibility, as Loïc suggested, but I would
personally only do it once you know for a fact OTP is one of the
bottlenecks causing problems. I think OTP-by-default is a sane thing to
have, and while dropping down is always an option, I think you should
be able to debate and prove why it made sense to do it before doing
it.

- If you manage to fix your sequential bottleneck, you'll possibly find
out you're creating a new one further down the system, up until you
either get rid of all of them, or you reach a point where you need to
apply back-pressure at a hard limit. This one is particularly painful
because it may mean parts of your hard work need to be undone to start
bubbling the back-pressure mechanisms up until a higher level.

It may be interesting to make sure you know the true underlying cause
of your problem there, to avoid optimizing towards a wall that way.
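
The dispatch-by-source idea from the list above can be sketched as follows. All names here are hypothetical (source_dispatcher, the Sink process that receives results, the 100 ms sleep standing in for real work). The central process only routes events, and each source gets its own worker, so a slow source does not hold up the others:

```erlang
-module(source_dispatcher).
-behaviour(gen_server).

-export([start_link/1, event/3]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Sink is a pid that receives {handled, Source, Payload} messages.
start_link(Sink) ->
    gen_server:start_link(?MODULE, Sink, []).

%% Fire-and-forget: the dispatcher only routes, so it returns quickly.
event(Dispatcher, Source, Payload) ->
    gen_server:cast(Dispatcher, {event, Source, Payload}).

%% State: {Sink, [{Source, WorkerPid}]}.
init(Sink) ->
    {ok, {Sink, []}}.

handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast({event, Source, Payload}, {Sink, Workers0}) ->
    {Pid, Workers} = worker_for(Source, Sink, Workers0),
    Pid ! {event, Payload},
    {noreply, {Sink, Workers}}.

%% Look up the worker dedicated to Source, starting it on first use.
worker_for(Source, Sink, Workers) ->
    case lists:keyfind(Source, 1, Workers) of
        {Source, Pid} ->
            {Pid, Workers};
        false ->
            Pid = spawn_link(fun() -> worker_loop(Source, Sink) end),
            {Pid, [{Source, Pid} | Workers]}
    end.

%% Each worker handles the events of one source, one at a time.
worker_loop(Source, Sink) ->
    receive
        {event, Payload} ->
            timer:sleep(100),   % stand-in for the slow handling
            Sink ! {handled, Source, Payload},
            worker_loop(Source, Sink)
    end.
```

This sketch applies no back-pressure: if one source produces faster than its worker can consume, that worker's message queue still grows without bound.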

That's about what I can manage to think of this morning. I hope this
proves helpful to anybody out there and that I didn't insert too many
typos or errors.

Regards,
Fred.

Serge Aleynikov

Dec 12, 2012, 8:47:20 AM
to bour...@gmail.com, erlang-q...@erlang.org
Max,

If the system implementation is entirely based on casts, then it would
only work properly under small loads, otherwise its lack of congestion
control would expose it to "fast producer slow consumer" problems. In
the latter case a gen_server that for one reason or another is not
fetching messages from its message queue fast enough would accumulate
many messages in its queue which will degrade performance of the entire
system.

On the other hand, the use of calls removes this problem by slowing down
the producer, and the gen_server's implementer has a design decision to
make in terms of how to ensure that producer is only minimally delayed
and the server is scalable. This may indeed involve spawning processes
from gen_server or using a pool of resources, and returning an error
when all resources are busy, or some other design.
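
One possible shape for such a design, as a sketch with hypothetical names (bounded_server, slow_work/1): producers use a synchronous call, the server spawns at most MaxWorkers processes, and it returns {error, busy} immediately when all of them are in use:

```erlang
-module(bounded_server).
-behaviour(gen_server).

-export([start_link/1, run/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link(MaxWorkers) ->
    gen_server:start_link(?MODULE, MaxWorkers, []).

%% Synchronous: the producer waits for the result or for {error, busy}.
run(Server, Job) ->
    gen_server:call(Server, {run, Job}).

%% State: {MaxWorkers, Pending}; Pending holds {MonitorRef, From}.
init(MaxWorkers) ->
    {ok, {MaxWorkers, []}}.

handle_call({run, _Job}, _From, {Max, Pending} = State)
  when length(Pending) >= Max ->
    %% All workers are in use: refuse instead of queueing.
    {reply, {error, busy}, State};
handle_call({run, Job}, From, {Max, Pending}) ->
    {_Pid, Ref} = spawn_monitor(fun() ->
        gen_server:reply(From, {ok, slow_work(Job)})
    end),
    {noreply, {Max, [{Ref, From} | Pending]}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({'DOWN', Ref, process, _Pid, Reason}, {Max, Pending}) ->
    case lists:keytake(Ref, 1, Pending) of
        {value, {Ref, From}, Rest} ->
            %% A normal exit means the worker already replied itself.
            case Reason of
                normal -> ok;
                _ -> gen_server:reply(From, {error, Reason})
            end,
            {noreply, {Max, Rest}};
        false ->
            {noreply, {Max, Pending}}
    end;
handle_info(_Other, State) ->
    {noreply, State}.

%% Hypothetical stand-in for the time-consuming operation.
slow_work(Job) ->
    timer:sleep(200),
    {processed, Job}.
```

The producer decides what to do on {error, busy}: retry later, drop the job, or propagate the error upstream. The call uses the default 5-second timeout, so slow_work/1 is assumed to finish well within that.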

Serge
