Proposal: built-in DynamicSupervisor partitioning

124 views
Skip to first unread message

José Valim

unread,
Sep 2, 2021, 11:28:19 AM9/2/21
to elixir-l...@googlegroups.com
Given supervisors are also processes, they may also become bottlenecks. While this is unlikely to happen to a Supervisor, since it is mostly static, it can happen to a DynamicSupervisor.

We can address this by partitioning the dynamic supervisor. Imagine the following dynamic supervisor:

defmodule MyApp.DynamicSupervisor do
  use DynamicSupervisor

  def start_link(opts) do
    DynamicSupervisor.start_link(__MODULE__, arg, opts)
  end

  def init(_arg) do
    DynamicSupervisor.init(strategy: :one_for_one)
  end
end

In order to partition it, we can start 8 instances of said supervisor inside a regular Supervisor, and then pick one partition at random when starting a child. For example:

defmodule MyApp.Supervisor do
  use Supervisor

  @partitions 8
  @name __MODULE__

  def start_child(module, arg) do
    i = :erlang.phash2(self(), @partitions) + 1
    DynamicSupervisor.start_child(:"#{__MODULE__}#{i}", {module, arg})
  end

  def start_link do
    Supervisor.start_link(__MODULE__, arg, name: @name)
  end

  def init(arg) do
    children =
      for i <- 1..@partitions do
        name = :"#{__MODULE__}#{i}"
        Supervisor.child_spec({MyApp.DynamicSupervisor, name: name}, id: name)
      end

    Supervisor.init(children, strategy: :one_for_one)
  end
end

I would like to make the above more convenient by introducing a :partitions option to DynamicSupervisor.start_link. When given, the new option will automatically start N dynamic supervisors under a supervisor, like above:

DynamicSupervisor.start_link(__MODULE__, :ok, partitions: 8, name: @name)

For now, the :name option will be required.

Now, when spawning child processes, you will use via tuples:

DynamicSupervisor.start_child({:via, DynamicSupervisor, {@name, self()}}, {module, arg})

The via tuple has the format {:via, DynamicSupervisor, {supervisor_name, value_to_partition_on}}. Once invoked, it will take care of partitioning based on the current process and dispatching it.

Overall, encapsulating the partitioning of DynamicSupervisor (and consequently of Task.Supervisor) into an easy to use API can help to vertically scale up applications that use those constructs.

## Open questions

One of the confusing aspects of the above is that the :name option no longer reflects the name of the DynamicSupervisor but of the parent Supervisor. One alternative is to *not* accept the :name option when :partitions is given, instead we could have a :root_name option instead (or something more appropriately named).

Implementation wise, we will store the available processes in ETS or using a Registry (to be started alongside the Elixir application).

Feedback?

Ben Wilson

unread,
Sep 2, 2021, 11:48:19 AM9/2/21
to elixir-lang-core
Implemented this way, options like max_restarts max_children and so on would occur per partition. I take it the plan would be to simply note that in the docs? I don't see any easy way to enforce those values across all partitions, which I think is just fine.

which_children/1 and so on won't work as expected either, so maybe we want a helper function inside of DynamicSupervisor that flat_maps that over the child supervisors?

Overall though, +1 from me.

José Valim

unread,
Sep 2, 2021, 2:01:34 PM9/2/21
to elixir-l...@googlegroups.com
Thank you Ben. Those are very good points and questions which I am not sure I can get satisfactory answers for. I will investigate more and follow up.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/bae8be24-a2e5-4a07-89db-7758ccfaf0d4n%40googlegroups.com.

Felipe Stival

unread,
Sep 2, 2021, 2:27:15 PM9/2/21
to elixir-l...@googlegroups.com
+1. 

As of now, I see no harm in the complexity gain related to the change. 

I like that if this feature is added on DynamicSupervisor (instead of as a new module or dependency) it can be used for free in TaskSupervisor as well.

--

Christopher Keele

unread,
Sep 2, 2021, 4:44:58 PM9/2/21
to elixir-lang-core
This sounds great to me. A couple of thoughts:

1. DynamicSupervisor.start_link(__MODULE__, :ok, partitions: 8) — should we default partitions to 1?
  Technically a more aggressive value would be acceptable, since the goal is for this to be a transparent improvement to DynamicSupervisor.
  But I can imagine removing an old bottleneck possibly having unintended side effects in applications, ex different throughput characteristics may overwhelm otherwise tuned aspects of their systems.
2. Pick one partition at random when starting a child — should this be a different algorithm, customizable, or selectable from some presets ?
  Indeed the sample code you include doesn't seem to be true random, but a hash-bucket approach.
  I guess round-robin would be most obvious algorithm, but that requires co-ordination, and the stateless hashing approach has a similar distribution with enough load.
  

Reply all
Reply to author
Forward
0 new messages