As a noobie to CB with minimal technical qualifications, this kind of solid engineering evaluation is profoundly reassuring. Performance, reliability, and supportability are all high on my list as I set out to build the applications to support my publishing business.
I do have a question about changing gen-servers to modules, however: As I understand it, the system-at-large is protected from failure of a well-supervised gen-server. Would this still be true if you changed existing gen-servers to modules?
Thanks for your fine work.
LRP
I think the underlying problem might not be the number of calls to
gen_server per se but the fact that CB uses single registered
processes in many places so that the message mailboxes get backed up.
A more OTP-style solution would be to dispatch requests to worker processes
that return values to the requester when ready. BossNews uses this
architecture, where each channel is managed by a worker process. A
central boss_news registered process simply dispatches messages to the
appropriate channel, and the channel manager sends the reply to the
requester when the work is done.
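To make the dispatch idea concrete, here is a minimal sketch of the pattern. It is not the actual BossNews code: the module name, message shapes, and the use of a fun for the work are all made up for illustration, and in a real system the workers would live under a supervisor rather than being linked directly.

```erlang
-module(dispatch_sketch).
-behaviour(gen_server).

-export([start_link/0, request/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical names throughout; this only illustrates the routing idea.

start_link() ->
    gen_server:start_link(?MODULE, #{}, []).

%% Looks synchronous to the caller, but the dispatcher never does the work.
request(Dispatcher, {Channel, Work}) when is_function(Work, 0) ->
    gen_server:call(Dispatcher, {request, Channel, Work}).

%% State is a map of Channel => worker pid.
init(Workers) ->
    {ok, Workers}.

%% Find (or start) the worker for the channel, hand it the caller's From,
%% and return at once so the next message in the mailbox can be handled.
handle_call({request, Channel, Work}, From, Workers) ->
    {Pid, Workers1} = worker_for(Channel, Workers),
    Pid ! {work, From, Work},
    {noreply, Workers1}.

handle_cast(_Msg, Workers) -> {noreply, Workers}.
handle_info(_Msg, Workers) -> {noreply, Workers}.

worker_for(Channel, Workers) ->
    case maps:find(Channel, Workers) of
        {ok, Pid} ->
            {Pid, Workers};
        error ->
            Pid = spawn_link(fun() -> worker_loop(Channel) end),
            {Pid, Workers#{Channel => Pid}}
    end.

%% The worker does the (possibly slow) job and replies to the original caller.
worker_loop(Channel) ->
    receive
        {work, From, Work} ->
            gen_server:reply(From, {Channel, Work()}),
            worker_loop(Channel)
    end.
```

With this shape, a slow request on one channel delays only that channel's worker; the registered dispatcher's mailbox keeps draining.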
Did you try the SMP test I suggested, i.e. run the tests again with
SMP disabled (-smp disable)? This should reveal whether the problem is
really CPU time spent in gen_server or if it is due to poor process
design. I suspect the latter.
Nice find with max_connections. BTW I'd prefer it if max_connections
were made a configuration option, ideally that applied to all servers
(even though we might dump non-Misultin servers in the future).
So to summarize: I like the parameter tuning, but I'm not convinced
we've found the real bottleneck with gen_server. I think when we do,
we'll see the system scale as well as standalone Misultin.
Thanks for all your work on this!
Evan
--
Evan Miller
http://www.evanmiller.org/
Thanks, Bip.
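handle_call({xxx, yyy}, From, State) ->
    %% Do the slow work in a separate process so the gen_server can return
    %% immediately; the spawned process replies to the caller directly.
    spawn_link(fun() ->
        Result = slow_function(State),
        gen_server:reply(From, Result)
    end),
    {noreply, State};

The Poolboy library might help: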
https://github.com/devinus/poolboy
The basic idea is to use worker processes with one central process
that doles out work to them (e.g. retrieving session info from
Memcached). This way the central process always returns immediately
and can process the next message in its mailbox, while the workers hum
away on the blocking activity.
I haven't used it, but it looks like Poolboy will take care of the
details for you. It may not help with mock sessions (yet) but should
help with Memcached sessions. If we want to improve the scalability of
mock sessions we might use a hashing algorithm, so that there are many
stateful session processes, each of which holds a subset of the
sessions. Then a worker pool will make sense.
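A rough sketch of the hashing idea, in case it helps. None of this is existing CB code: the module, function names, and message format are hypothetical, and it uses plain processes where a real version would use supervised gen_servers.

```erlang
-module(session_shards).
-export([start/1, store/4, fetch/2, shard_for/2]).

%% Hypothetical sketch: N stateful processes, each holding a subset of the
%% sessions, with the session id hashed to pick the owning process.

start(N) when is_integer(N), N > 0 ->
    Pids = [spawn_link(fun() -> loop(#{}) end) || _ <- lists:seq(1, N)],
    list_to_tuple(Pids).

%% erlang:phash2/2 returns a value in 0..Range-1, so add 1 for element/2.
shard_for(SessionId, Shards) ->
    element(erlang:phash2(SessionId, tuple_size(Shards)) + 1, Shards).

store(Shards, SessionId, Key, Value) ->
    call(shard_for(SessionId, Shards), {store, SessionId, Key, Value}).

fetch(Shards, SessionId) ->
    call(shard_for(SessionId, Shards), {fetch, SessionId}).

%% Minimal request/reply; the monitor turns a dead shard into an exit
%% instead of a caller that hangs forever.
call(Pid, Request) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {self(), Ref, Request},
    receive
        {Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, Pid, Reason} ->
            exit(Reason)
    end.

%% Each shard keeps a map of SessionId => map of session data.
loop(Sessions) ->
    receive
        {From, Ref, {store, Id, Key, Value}} ->
            Data = maps:get(Id, Sessions, #{}),
            From ! {Ref, ok},
            loop(Sessions#{Id => Data#{Key => Value}});
        {From, Ref, {fetch, Id}} ->
            From ! {Ref, maps:get(Id, Sessions, #{})},
            loop(Sessions)
    end.
```

Because the same session id always hashes to the same process, each session's state stays in one place while unrelated sessions no longer queue up behind a single mailbox.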
By the way I think BossDB would be a great candidate for using
Poolboy, and it would let us rip out the internal pools used by the DB
adapters. Reworking BossDB to use a worker pool is one of the last
objectives for 1.0.
Let me know if there's anything I can clarify further, and thanks for
all your work on this!
Evan