GenServers and scaling

Matthew Welch

May 9, 2014, 11:16:20 AM

to elixir-l...@googlegroups.com

How well do GenServers scale? It seems that since each call or cast can update the state of the server, all the calls must be handled sequentially. Even if you offload the state, doesn't the presence of that mechanism turn GenServers into potential bottlenecks?

And on a related note, is it faster to implement everything as a cast? If you need a return value, you could pass a callback or a pid to send a message to. Would that make things faster or just complicate them?

Daniel Goertzen

May 9, 2014, 12:06:24 PM

On Fri, May 9, 2014 at 10:16 AM, Matthew Welch <matthewma...@gmail.com> wrote:

> How well do GenServers scale? It seems that since each call or cast can update the state of the server, all the calls must be handled sequentially. Even if you offload the state, doesn't the presence of that mechanism turn GenServers into potential bottlenecks?

It certainly is a bottleneck if you design everything to pass through one GenServer. If you can, try to design your system so it doesn't do that.

> And on a related note, is it faster to implement everything as a cast? If you need a return value, you could pass a callback or a pid to send a message to. Would that make things faster or just complicate them?

But that is exactly what call does for you, so you would just be reinventing the wheel.
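To make the comparison concrete, here is a minimal sketch (assuming a recent Elixir version) of a hand-rolled "cast plus reply pid" next to the built-in call. The `Counter` module and the `increment_via_cast/2` helper are hypothetical names used only for illustration.

```elixir
defmodule Counter do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  # Built-in request/reply: blocks until the server answers.
  def increment(pid), do: GenServer.call(pid, :increment)

  # Hand-rolled equivalent: a cast carrying the caller's pid and a unique
  # reference, followed by a receive that waits for the matching reply.
  def increment_via_cast(pid, timeout \\ 5_000) do
    ref = make_ref()
    GenServer.cast(pid, {:increment, self(), ref})

    receive do
      {^ref, reply} -> reply
    after
      timeout -> exit(:timeout)
    end
  end

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}

  @impl true
  def handle_cast({:increment, caller, ref}, count) do
    send(caller, {ref, count + 1})
    {:noreply, count + 1}
  end
end

{:ok, pid} = Counter.start_link(0)
1 = Counter.increment(pid)
2 = Counter.increment_via_cast(pid)
```

Both paths cost one message in each direction. The built-in call additionally monitors the server, so the caller notices a crash instead of waiting for the full timeout, which the hand-rolled version above does not do.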

Cheers,
Dan.

José Valim

May 9, 2014, 12:18:10 PM

Also, call works as a backpressure mechanism. If you only use cast, you can overload the gen server and the clients will continue sending messages.
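A rough sketch of the overload case, assuming a recent Elixir version. `SlowSink` is a hypothetical server that takes 50 ms per message; the client fires 100 casts without ever being slowed down, and the mailbox length shows the backlog.

```elixir
defmodule SlowSink do
  use GenServer

  def start_link(), do: GenServer.start_link(__MODULE__, nil)

  @impl true
  def init(nil), do: {:ok, nil}

  @impl true
  def handle_cast(:job, state) do
    # Stand-in for slow work.
    Process.sleep(50)
    {:noreply, state}
  end
end

{:ok, sink} = SlowSink.start_link()

# Each cast returns immediately, so nothing limits the sender.
for _ <- 1..100, do: GenServer.cast(sink, :job)

# Most of the jobs are still waiting in the server's mailbox.
{:message_queue_len, queued} = Process.info(sink, :message_queue_len)
IO.puts("messages still queued: #{queued}")
```

With `GenServer.call` in place of the cast, the same loop would take roughly 100 x 50 ms, because each request would wait for its reply before the next one is sent.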

José Valim

www.plataformatec.com.br

Skype: jv.ptec

Founder and Lead Developer

Martin Schurrer

May 9, 2014, 1:19:52 PM

On 09/05/14 18:06, Daniel Goertzen wrote:
> It certainly is a bottleneck if you design everything to pass through
> one GenServer. If you can, try to design your system so it doesn't do that.

A great example (IMO) of how to do that is gproc, and especially
gproc_pool's "bottleneck free" (well, it moves the bottleneck to ETS)
implementation of worker pools:

https://github.com/uwiger/gproc/blob/master/doc/gproc_pool.md#concepts

"The server gproc_pool is used to serialize pool management updates, but
worker selection is performed entirely in the calling process, and can
be performed by several processes concurrently."

Luckily for me, I can just use ETS and gproc, treat them as black boxes
that just work, and focus my thinking on the easy-to-reason-about
GenServers, even though they are bottlenecks.

--

Kind regards,
Martin Schurrer

Saša Jurić

May 9, 2014, 2:29:18 PM

A couple of my own thoughts:

A single GenServer (or any process, for that matter) runs concurrently with others but is sequential internally. This is an inherent property of processes. The consequence is that actions are serialized (which is good), but a single process can become a bottleneck.

If many processes depend on a single one, then that process will be a bottleneck, and the system might not scale well.

There are some ways around it:

1. If actions need not be serialized, run them in different processes.

2. If writes must be serialized, but not reads, use ETS, and synchronize only writes (this is essentially what Martin mentioned).

3. If actions must be serialized, consider whether the entire action needs serializing or just some part of it. If the latter, move code that can run concurrently out of the process. Also, try optimizing the code that must be serialized.
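Option 2 can be sketched roughly as follows, assuming a recent Elixir version. The `KV` module and the `:kv_sketch` table name are hypothetical. Reads go straight to ETS from the calling process, while writes are funneled through the single process that owns the table.

```elixir
defmodule KV do
  use GenServer

  @table :kv_sketch

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  # Reads run in the calling process, so they never queue behind the server.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Writes go through the owner process and are therefore serialized.
  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})

  @impl true
  def init(nil) do
    # :protected means any process may read, but only the owner may write.
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    {:ok, nil}
  end

  @impl true
  def handle_call({:put, key, value}, _from, state) do
    :ets.insert(@table, {key, value})
    {:reply, :ok, state}
  end
end

{:ok, _pid} = KV.start_link()
:ok = KV.put(:a, 1)
{:ok, 1} = KV.get(:a)
```

The write path is still a single process, so this only helps when reads dominate; the table itself becomes the shared resource instead of the GenServer's mailbox.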

Multiple gen_servers (or processes) are, of course, scalable if you have thousands of processes that are mostly independent. A napkin diagram showing inter-process dependencies will immediately reveal possible bottlenecks. This is why processes are great: you can easily reason about the concurrency of your entire system.

Regarding calls vs. casts, I prefer to use casts unless a response is needed. Calls are performance and scalability killers, and potential sources of deadlock. Note that sometimes you may want to use a call to return the success of a write. However, if you treat success and failure equally, then just use casts. Making some custom scheme that turns a call into a cast, only to send back the message later, is just reinventing the wheel. However, you may want to issue an operation, then do something else, and pick up the response later. In this scenario, you should check out xgen's tasks.
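As an illustration of the "issue an operation, do something else, pick up the response later" pattern, here is a small sketch. It uses the `Task` module from Elixir's standard library as a stand-in, which is an assumption on my part; the post itself refers to xgen.

```elixir
# Start the work in a separate process and keep going.
task =
  Task.async(fn ->
    # Stand-in for a slow operation.
    Process.sleep(50)
    21 * 2
  end)

# Do something else in the meantime.
other = Enum.sum(1..10)

# Pick up the response later; this blocks only if the result is not ready yet.
result = Task.await(task, 1_000)
```

Note that `Task.async/1` links the task to the caller, so a crash in the task also takes down the process that started it.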

I strongly disagree that calls are a good tool for backpressure control (despite seeing this pattern mentioned). It is a hacky, implicit way of limiting a client, and it can't help in all situations (e.g. many clients attacking a single server). Furthermore, a call timeout will not remove the message from the server's queue.
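The timeout point can be shown with a small sketch, assuming a recent Elixir version. `Slow` is a hypothetical server that is first kept busy for 100 ms; the client then issues a call with a 10 ms timeout and gives up, yet the request is still sitting in the mailbox and gets handled afterwards.

```elixir
defmodule Slow do
  use GenServer

  def start_link(), do: GenServer.start_link(__MODULE__, 0)

  @impl true
  def init(handled), do: {:ok, handled}

  @impl true
  def handle_cast(:busy, handled) do
    # Keeps the server occupied so later requests have to wait in the queue.
    Process.sleep(100)
    {:noreply, handled}
  end

  @impl true
  def handle_call(:work, _from, handled), do: {:reply, :done, handled + 1}
  def handle_call(:handled, _from, handled), do: {:reply, handled, handled}
end

{:ok, pid} = Slow.start_link()
GenServer.cast(pid, :busy)

# The client gives up after 10 ms...
timed_out =
  try do
    GenServer.call(pid, :work, 10)
    false
  catch
    :exit, {:timeout, _} -> true
  end

# ...but the :work request stays queued and is processed once the server is free.
handled = GenServer.call(pid, :handled, 1_000)
IO.inspect({timed_out, handled})
```

So a client-side timeout frees the client but not the server: under load, the server keeps spending time on requests nobody is waiting for.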

For explicit backpressure and load management, I use a middle-man process. This is formulated in a library called workex (https://github.com/sasa1977/workex), which gave me good results in production. Having a middle-man process induces some performance penalty, but it gives you control over your message queue. You can prioritize, bundle, and discard messages as you please. There is also a popular jobs library by Ulf Wiger (https://github.com/esl/jobs). I haven't used it, but given Ulf's reputation, I'd trust it to be good, most probably better than what I wrote.
