Supervising multiple workers

Paul Butcher

Sep 1, 2013, 4:59:45 AM
to elixir-l...@googlegroups.com
I'm writing a supervisor that supervises multiple workers. The workers are stateless - they just need starting up and restarting if they fail. I want to parameterise my supervisor so that I can say how many workers I want it to create. My first attempt was:

defmodule CounterSupervisor do
  use Supervisor.Behaviour

  def start_link(num_counters) do
    :supervisor.start_link(__MODULE__, num_counters)
  end

  def init(num_counters) do
    workers = List.duplicate(worker(Counter, []), num_counters)
    supervise(workers, strategy: :one_for_one)
  end
end

But that gives me:

** (EXIT from #PID<0.96.0>) {:start_spec, {:duplicate_child_name, Counter}}

I've solved this by changing the initialisation of workers to:

workers = Enum.map(1..num_counters, fn(n) -> worker(Counter, [], [id: "counter#{n}"]) end)

But that feels clunky to me. Am I missing a trick? It strikes me that this must be a pretty common requirement and that there should be an easier way to achieve it?

--
paul.butcher->msgCount++

Snetterton, Castle Combe, Cadwell Park...
Who says I have a one track mind?

http://www.paulbutcher.com/
LinkedIn: http://www.linkedin.com/in/paulbutcher
MSN: pa...@paulbutcher.com
AIM: paulrabutcher
Skype: paulrabutcher

José Valim

Sep 1, 2013, 8:36:07 AM
to elixir-l...@googlegroups.com
Paul,

Your approach is correct if you are creating a handful of workers. However, if you are creating many, many workers, or adding new children dynamically (resizing the pool and so on), you want to use the `simple_one_for_one` strategy, which doesn't require a new id for each child and is also more efficient for such use cases.



José Valim
Skype: jv.ptec
Founder and Lead Developer


--
You received this message because you are subscribed to the Google Groups "elixir-lang-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Paul Butcher

Sep 1, 2013, 9:33:05 AM
to elixir-l...@googlegroups.com
Thanks José.

Is there an example of simple_one_for_one anywhere I can take a look at? I've tried to get something working, but I'm not having much luck. Given the following minimal worker and supervisor:

defmodule Counter do
  use GenServer.Behaviour

  def start_link do
    :gen_server.start_link(__MODULE__, nil, [])
  end
end

defmodule CounterSupervisor do
  use Supervisor.Behaviour

  def start_link do
    :supervisor.start_link(__MODULE__, nil) 
  end

  def init(_args) do
    workers = [worker(Counter, [])]
    supervise(workers, strategy: :simple_one_for_one)
  end
end

This is what I get in iex:

iex(1)> sup = CounterSupervisor.start_link
{:ok, #PID<0.47.0>}
iex(2)> :supervisor.start_child(sup, [])
** (exit) {{:function_clause, [{:gen, :call, [{:ok, #PID<0.47.0>}, :"$gen_call", {:start_child, []}, :infinity], [file: 'gen.erl', line: 147]}, {:gen_server, :call, 3, [file: 'gen_server.erl', line: 184]}, {:erl_eval, :do_apply, 6, [file: 'erl_eval.erl', line: 569]}, {:elixir, :eval_forms, 3, [file: 'src/elixir.erl', line: 147]}, {IEx.Server, :eval, 4, [file: '/private/tmp/elixir-lNxT/elixir-0.10.1/lib/iex/lib/iex/server.ex', line: 111]}, {IEx.Server, :wait_input, 1, [file: '/private/tmp/elixir-lNxT/elixir-0.10.1/lib/iex/lib/iex/server.ex', line: 53]}, {IEx.Server, :wait_input, 1, [file: '/private/tmp/elixir-lNxT/elixir-0.10.1/lib/iex/lib/iex/server.ex', line: 60]}, {IEx.Server, :start, 1, [file: '/private/tmp/elixir-lNxT/elixir-0.10.1/lib/iex/lib/iex/server.ex', line: 32]}]}, {:gen_server, :call, [{:ok, #PID<0.47.0>}, {:start_child, []}, :infinity]}}
    gen_server.erl:188: :gen_server.call/3
    erl_eval.erl:569: :erl_eval.do_apply/6
    src/elixir.erl:147: :elixir.eval_forms/3

What am I missing?

Dave Thomas

Sep 1, 2013, 10:35:39 AM
to elixir-l...@googlegroups.com

The List.duplicate looks suspicious to me. It looks like you're supervising the same worker n times.

What happens if you instead do

workers = List.map 1..num_counters, &worker(Counter, [])

Dave

Paul Butcher

Sep 1, 2013, 11:01:11 AM
to elixir-l...@googlegroups.com
On 1 Sep 2013, at 15:35, Dave Thomas <da...@pragprog.com> wrote:

> What happens if you instead do
>
> workers = List.map 1..num_counters, &worker(Counter, [])


A syntax error ;-)

** (SyntaxError) /Users/paul/pb7con/Book/code/ActorsNew/word_count/lib/counter.ex:47: invalid args for &, expected an expression in the format of &Mod.fun/arity, &local/arity or a capture containing at least one argument as &1, got: worker(Counter, [])

But if I do:

workers = Enum.map 1..num_counters, fn(_) -> worker(Counter, []) end

I get exactly the same error as before.

I believe I'm right in saying that worker doesn't actually start a child - it just creates a *description* of a child that is subsequently started by the supervisor. The problem is that by default worker creates a description in which the id of the child is the name of the module passed as an argument, and two children can't have the same id.

I already have a solution to that (in my original mail) but it feels "clunky" to me. José's suggestion that I should use :simple_one_for_one instead makes sense, but I've been unable to get it to work (see subsequent mail) :-(
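
A minimal sketch of the point about ids, written against the raw Erlang child spec tuple rather than the worker helper so that it does not depend on a particular Elixir release. It assumes, as described above, that worker(Counter, []) yields a spec whose id is Counter; the spec function and the variable names are made up for illustration. :supervisor.check_childspecs/1 only validates the list, so nothing is started and the Counter module does not even need to exist.

```elixir
# Build an Erlang-style child spec tuple:
# {id, {module, function, args}, restart, shutdown, type, modules}.
# The restart and shutdown values here are arbitrary illustrative choices.
spec = fn id ->
  {id, {Counter, :start_link, []}, :permanent, 5000, :worker, [Counter]}
end

# Three specs that all share the id Counter, which is what duplicating or
# mapping a plain worker(Counter, []) is assumed to produce.
same_ids = Enum.map(1..3, fn _ -> spec.(Counter) end)

# Three specs with distinct ids, mirroring the "counter#{n}" workaround.
unique_ids = Enum.map(1..3, fn n -> spec.("counter#{n}") end)

# Validate both lists without starting any processes.
duplicate_check = :supervisor.check_childspecs(same_ids)
unique_check = :supervisor.check_childspecs(unique_ids)

IO.inspect(duplicate_check)
IO.inspect(unique_check)
```

If the assumption holds, the first check should come back as an error naming the duplicated id, and the second as :ok.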

José Valim

Sep 1, 2013, 11:14:38 AM
to elixir-l...@googlegroups.com
Paul,

I put an example from a current project that uses the simple_one_for_one strategy:


Basically, the goal of those three modules is to spawn new processes with a given key and store them in an ETS table. Later, we can use this key to retrieve the spawned process PID quickly and send it a message. The ETSSupervisor is the main API and the genserver that maintains the table. The Spawner is the simple_one_for_one supervisor, which is actually responsible for keeping each spawned process running, and the Sup is another supervisor responsible for supervising the genserver and the simple_one_for_one supervisor. The name given on initialization of the main supervisor is used as the ETS table name and the server name.







Paul Butcher

Sep 1, 2013, 12:22:26 PM
to elixir-l...@googlegroups.com
Thanks, José,

From a quick glance at that code, I can't see any reason why the small example I put together wouldn't work - I guess I must be missing something relatively subtle.

I'll see if I can cut down your example until I find the discrepancy. Thanks.

jni viens

Sep 3, 2013, 8:28:22 AM
to elixir-l...@googlegroups.com
When you call start_link, the supervisor calls init/1 to find out about its children. When you call start_child, it will also call this function, with different arguments, in order to add a child dynamically. In your case you ignore the arguments, which leads to twins. That's why start_link works but start_child fails. See http://www.erlang.org/doc/man/supervisor.html

Paul Butcher

Sep 3, 2013, 9:30:16 AM
to elixir-l...@googlegroups.com
On 3 Sep 2013, at 13:28, jni viens <jni....@gmail.com> wrote:

> When you call start_link, the supervisor calls init/1 to find out about its children. When you call start_child, it will also call this function, with different arguments, in order to add a child dynamically. In your case you ignore the arguments, which leads to twins. That's why start_link works but start_child fails. See http://www.erlang.org/doc/man/supervisor.html

Thanks - that kinda makes sense. Kinda.

I'm not sure that I understand where start_child gets the arguments to send to init? What could/should I do with the argument to init instead of ignoring it?

jni viens

Sep 3, 2013, 9:56:41 AM
to elixir-l...@googlegroups.com
I just read the code again, and I'm not sure of my answer anymore since you start a new worker... I might be wrong (happens a lot).

However, what I do know is that if you call start_child(sup, [ARGS]) you'll end up with init([ARGS]) in the case of a simple_one_for_one supervisor. Otherwise, these ARGS contain a few specific key/values for a "childspec" (start function, etc). Check the description of "start_child" in the doc I linked.



Paul Butcher

Sep 3, 2013, 10:47:19 AM
to elixir-l...@googlegroups.com
On 3 Sep 2013, at 14:56, jni viens <jni....@gmail.com> wrote:

> I just read the code again, and I'm not sure of my answer anymore since you start a new worker... I might be wrong (happens a lot).
>
> However, what I do know is that if you call start_child(sup, [ARGS]) you'll end up with init([ARGS]) in the case of a simple_one_for_one supervisor. Otherwise, these ARGS contain a few specific key/values for a "childspec" (start function, etc). Check the description of "start_child" in the doc I linked.

Perhaps I'm missing something, but I can't see anything in the document you link to that describes what arguments will be passed to init, or how they should be used. This is the only relevant portion I can find:

> If the case of a simple_one_for_one supervisor, the child specification defined in Module:init/1 will be used and ChildSpec should instead be an arbitrary list of terms List. The child process will then be started by appending List to the existing start function arguments, i.e. by calling apply(M, F, A++List) where {M,F,A} is the start function defined in the child specification.

That says (if I'm reading it correctly) that the arguments given to start_child will be appended to the arguments within the ChildSpec, and subsequently sent to the worker start function (i.e. M.F). There's no mention of what arguments will be passed to init itself, or what init should do with them?

What am I missing?
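
A small sketch of the apply(M, F, A ++ List) rule from the quoted documentation, using plain function application instead of a real supervisor. The Echo module and every name in it are hypothetical stand-ins; the only thing being illustrated is where the two argument lists end up, namely in the worker's start function rather than in the supervisor's init/1.

```elixir
defmodule Echo do
  # Hypothetical stand-in for a worker's start function. It reports the
  # arguments it was called with instead of starting a process.
  def start_link(from_spec, from_start_child) do
    {:received, from_spec, from_start_child}
  end
end

# {M, F, A} as it would appear in the child spec returned by init/1.
{mod, fun, spec_args} = {Echo, :start_link, [:from_spec]}

# The list that would be handed to :supervisor.start_child(sup, extra_args).
extra_args = [:from_start_child]

# What the quoted documentation says the supervisor does with the two lists.
result = apply(mod, fun, spec_args ++ extra_args)
IO.inspect(result)
```

With an empty A in the child spec and an empty list passed to start_child, as in the Counter example, the same rule reduces to calling Counter.start_link with no arguments.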

James Fish

Sep 3, 2013, 1:43:47 PM
to elixir-l...@googlegroups.com
Hi Paul,

Notice that an iex session prints the result of each line. At iex(1) you call start_link/0, which returns {:ok, #PID<...>}. You pass this value to :supervisor.start_child/2 in iex(2), whereas you really just want to pass the second element (the pid and not the whole tuple).

A supervisor is a :gen_server so :supervisor.start_child/2 uses :gen_server.call/3 to call the supervisor to start the child, and :gen_server.call/3 uses :gen.call/4. The type {:ok, pid()} is not a valid target of a call made by :gen.call/4 so the function clause error occurs.
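
A sketch of the corrected session, matching on the tuple so that only the pid reaches :supervisor.start_child/2. To keep it independent of any particular Elixir release, it is written against the raw Erlang behaviours instead of GenServer.Behaviour and Supervisor.Behaviour. The child spec tuple is assumed to be equivalent to worker(Counter, []), and the restart limits are arbitrary values chosen for illustration.

```elixir
defmodule Counter do
  @behaviour :gen_server

  def start_link do
    :gen_server.start_link(__MODULE__, nil, [])
  end

  # Minimal callbacks so the worker can start and sit idle.
  def init(state), do: {:ok, state}
  def handle_call(_request, _from, state), do: {:reply, :ok, state}
  def handle_cast(_request, state), do: {:noreply, state}
end

defmodule CounterSupervisor do
  @behaviour :supervisor

  def start_link do
    :supervisor.start_link(__MODULE__, nil)
  end

  def init(_args) do
    # Raw child spec tuple, assumed equivalent to worker(Counter, []).
    child = {Counter, {Counter, :start_link, []}, :permanent, 5000, :worker, [Counter]}
    # {strategy, max_restarts, max_seconds}; the limits 3 and 5 are arbitrary.
    {:ok, {{:simple_one_for_one, 3, 5}, [child]}}
  end
end

# Match on the tuple so that sup is bound to the pid alone.
{:ok, sup} = CounterSupervisor.start_link()

# Start three workers from the single template spec; [] adds no extra arguments.
pids =
  Enum.map(1..3, fn _ ->
    {:ok, pid} = :supervisor.start_child(sup, [])
    pid
  end)

IO.inspect(length(pids))
```

Because every child is started from the same template spec, no per-child id is needed, which is the property the original question was after.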

Paul Butcher

Sep 4, 2013, 5:37:39 AM
to elixir-l...@googlegroups.com
D'oh!

Thanks James.

The single biggest issue I've had with Elixir is descrambling the error messages - I've had many (many!) occasions where some utterly mystifying (to me) bug turns out to be a very simple error, but the error message hasn't pointed me in the right direction. This is another such instance.

I guess it's clear to someone with previous Elixir/Erlang/OTP experience such as (I assume :-) yourself? But the fact that my mistake was *so* simple and yet nobody on this mailing list spotted it until you did 3 days after I posted it should ring alarm bells, I would think?
