Proposing Registry


José Valim

Oct 25, 2016, 5:06:47 PM
to elixir-l...@googlegroups.com
Hello everyone,

I would like to propose the addition of the Registry project to Elixir:


The Registry project is a local and scalable key-value process storage in Elixir. It encapsulates 3 known use cases:

* Process registry: to register processes with dynamic names. Often Elixir developers need to rely on gproc or other tools for this.
* Code dispatching: dispatch a module/function associated with a given key
* PubSub implementation: send messages to local processes registered under a given topic
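
A minimal sketch of the three use cases listed above. It assumes the Registry API as it later shipped in Elixir's standard library (`Registry.start_link/1` with keyword options), which differs from the 0.1.0 API proposed here; the `Demo.*` names are hypothetical.

```elixir
# Assumption: this uses the Registry API that later shipped in Elixir's
# standard library, not the 0.1.0 API proposed in this thread.
# The Demo.* names are hypothetical.

# 1. Process registry: unique keys, reachable through :via tuples.
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Names)
via = {:via, Registry, {Demo.Names, {:room, 42}}}
{:ok, room} = Agent.start_link(fn -> [] end, name: via)
found = Registry.lookup(Demo.Names, {:room, 42})

# 2. Code dispatching: store a {module, function} pair under a key.
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.Dispatch)
{:ok, _} = Registry.register(Demo.Dispatch, "shout", {String, :upcase})

caller = self()

Registry.dispatch(Demo.Dispatch, "shout", fn entries ->
  for {_pid, {module, function}} <- entries do
    send(caller, {:dispatched, apply(module, function, ["hello"])})
  end
end)

dispatched =
  receive do
    {:dispatched, result} -> result
  after
    1_000 -> :timeout
  end

# 3. PubSub: every process registered under a topic receives the message.
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.PubSub)
{:ok, _} = Registry.register(Demo.PubSub, "news", [])

Registry.dispatch(Demo.PubSub, "news", fn entries ->
  for {pid, _value} <- entries, do: send(pid, {:broadcast, "news", "Registry proposed"})
end)

broadcast =
  receive do
    {:broadcast, "news", text} -> text
  after
    1_000 -> :timeout
  end
```

The only structural difference between the three cases is the `keys:` option and what is stored as the value: nothing for names, a module/function pair for dispatching, and an arbitrary term for subscribers.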

There are probably other use cases waiting to be discovered. :)
You can learn more in the documentation:


The project clocks in at only 700 LOC, including documentation, and performs well. We have extracted, improved, and generalized the patterns from Phoenix.PubSub, the exact implementation used to manage and publish messages to 2 million subscribers.

When benchmarking thousands of processes registering serially, it is twice as slow as local atoms, albeit 33% faster than gproc. In concurrent cases, it distributes well across all cores, becoming only 15% slower than local atoms and 3x faster than gproc (gproc seems to be serial in its default configuration).

Please give it a try and let us know what you think.


José Valim
Skype: jv.ptec
Founder and Director of R&D

Brian Cardarella

Oct 25, 2016, 8:08:37 PM
to elixir-lang-core, jose....@plataformatec.com.br
Is this meant to be used instead of ets?

Chris Keele

Oct 25, 2016, 8:44:02 PM
to elixir-lang-core
Can this be used as a general key-value store, or generally register anything other than processes? The docs suggest not. In which case, would Process.Registry be the name of this thing when added to the stdlib? We don't have a strong tradition of nested namespaces in the stdlib but this seems like a good candidate to me.

Saša Jurić

Oct 26, 2016, 2:48:52 AM
to elixir-lang-core, jose....@plataformatec.com.br
Looks great! A very welcome addition :-)

Looking at docs and code, it seems that Registry suffers from a mild case of the :supervisor syndrome :-) Namely, there's some conditional logic and documentation depending on the registry type (unique vs duplicate). I wonder if these should be split in two, say Registry.Unique and Registry.Duplicate, or maybe just Registry and PubSub, since the latter is not a :via compatible registry anyway?

I also agree with Chris that it should reside under the Process namespace.

Eugeniu Tambur

Oct 26, 2016, 4:19:05 AM
to elixir-lang-core, jose....@plataformatec.com.br
Looks great!!! 
I think it will be a very welcome addition :-)


José Valim

Oct 26, 2016, 5:37:11 AM
to Eugeniu Tambur, elixir-lang-core
> Is this meant to be used instead of ets?

Generally no. However, if you need to store data that is automatically removed when a process dies, it may be a good candidate.
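
To illustrate the point about data that disappears together with its process, here is a rough sketch. It assumes the Registry API that later shipped in Elixir's standard library; `Demo.Sessions` and `Demo.Wait` are hypothetical names, and the polling helper exists only because cleanup after an exit may be asynchronous.

```elixir
defmodule Demo.Wait do
  # Hypothetical helper: polls until no entry is left under the key.
  def until_gone(registry, key, attempts \\ 100)
  def until_gone(_registry, _key, 0), do: :timeout

  def until_gone(registry, key, attempts) do
    case Registry.lookup(registry, key) do
      [] ->
        :gone

      _entries ->
        Process.sleep(10)
        until_gone(registry, key, attempts - 1)
    end
  end
end

{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Sessions)
parent = self()

# The spawned process owns the entry: the value lives only as long as it does.
pid =
  spawn(fn ->
    {:ok, _} = Registry.register(Demo.Sessions, "user:1", %{cart: [:book]})
    send(parent, :registered)

    receive do
      :stop -> :ok
    end
  end)

receive do
  :registered -> :ok
end

before_exit = Registry.lookup(Demo.Sessions, "user:1")

# Stop the owner and wait for it to terminate.
ref = Process.monitor(pid)
send(pid, :stop)

receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
end

# No explicit delete was issued; the entry goes away with the process.
after_exit = Demo.Wait.until_gone(Demo.Sessions, "user:1")
```

With plain ets the caller would have to monitor the process and delete the row itself, which is the bookkeeping the registry takes over.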

> would Process.Registry be the name of this thing when added to the stdlib?

That's a good question. Right now the word registry is highly specific to processes, which is why we didn't consider Process.Registry, but we will keep it in mind. One downside of this change is that I could see a strong argument being made for Process.Supervisor (instead of Supervisor). Both are process specific after all.

> Looking at docs and code, it seems that Registry suffers from a mild case of the :supervisor syndrome :-)

We had this in mind when discussing the registry. At this point, it is rather an :ets syndrome: functions receive the same arguments and return the same values, although they may behave slightly differently depending on the registry type. After your question, I have decided to quantify those a bit more:

Supervisor:

* Functions that behave exactly the same: count_children/1, which_children/1
* Functions that expect same inputs/outputs but behave differently: start_link/3 (due to differences in the init/1 callback)
* Functions that expect different inputs/outputs: start_child/2, terminate_child/2
* Functions that are not supported by all types: delete_child/2, restart_child/2

Registry:

* Functions that behave exactly the same: start_link/3, dispatch/3, keys/2, unregister/2
* Functions that expect same inputs/outputs but behave differently: register/3 (one type supports multiple registrations)
* Functions that expect different inputs/outputs: (none)
* Functions that are not supported by all types: whereis/2 (which powers :via)

As you can see, more than half of the functions in Supervisor behave differently depending on the type; many have different inputs/outputs or are not supported across types.

For the registry, we don't have functions that expect different inputs and outputs depending on the type, and there is only a single function that is not supported across registries. Most operations behave exactly the same, except register/3 (which is the point of having different registries).
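
The difference in register/3 can be sketched as follows. This assumes the Registry API that later shipped in Elixir's standard library (not the 0.1.0 API quantified above); `Demo.Unique` and `Demo.Duplicate` are hypothetical names.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Unique)
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.Duplicate)

# Unique registry: a second registration under the same key is rejected.
{:ok, _} = Registry.register(Demo.Unique, :key, :first)
second_unique = Registry.register(Demo.Unique, :key, :second)

# Duplicate registry: same call, same arguments, but both entries are kept.
{:ok, _} = Registry.register(Demo.Duplicate, :key, :first)
second_duplicate = Registry.register(Demo.Duplicate, :key, :second)

duplicate_values =
  Demo.Duplicate
  |> Registry.lookup(:key)
  |> Enum.map(fn {_pid, value} -> value end)
  |> Enum.sort()
```

The inputs and the shape of the return value are identical in both cases; only the outcome of the second call depends on the registry kind.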

Code-wise there are some checks on the registry type, but they are mostly optimizations. For example, the checks on dispatch/3 and keys/2 are optimizations and could be removed by doing more work at some point.

It is hard to know if two functionalities should be kept apart or together. Right now, it makes sense to keep them together but this may not be true in the future when the API surface grows. On the other hand, we could split them and realize in 2 years that even with API additions, they behave the same. Our best bet is to quantify how similar they are to each other now and try to estimate what we may want from the registry in the future.

Furthermore, if we split it apart, I don't believe we should call it PubSub since it isn't one (although it can be used to build one).

Thanks everyone for the feedback so far!

Saša Jurić

Oct 26, 2016, 6:19:52 AM
to elixir-l...@googlegroups.com

On 26 Oct 2016, at 11:36, José Valim <jose....@plataformatec.com.br> wrote:

> Registry:
>
> * Functions that behave exactly the same: start_link/3, dispatch/3, keys/2, unregister/2

Looking at docs for dispatch/3, keys/2, unregister/2, I see different explanations for unique vs duplicate. For example for keys/2: "If the registry is unique, the keys are unique. Otherwise they may contain duplicates if the process was registered under the same key multiple times." So the signatures might be the same, but behavior varies.

It’s true that it’s more like an ets syndrome, but it still feels like these two versions are doing different things. One is for a unique registration, another for process groups. Separating them would slightly simplify the interface of each. For example, in a registry I don’t need dispatch/3, but rather value/2. Via registration would be possible only with the registry, not with groups. There wouldn’t be whereis in groups.

Another question: maybe I’m missing it, but how can I get a value associated with the process(es)? It seems I can only do a side-effect with dispatch.

José Valim

Oct 26, 2016, 6:34:49 AM
to elixir-l...@googlegroups.com
> Looking at docs for dispatch/3, keys/2, unregister/2, I see different explanations for unique vs duplicate. For example for keys/2: "If the registry is unique, the keys are unique. Otherwise they may contain duplicates if the process was registered under the same key multiple times." So the signatures might be the same, but behavior varies.

The documentation for keys/2 could be described as: "Returns all keys for the given process."

If the registry is unique, it will obviously not have duplicate keys, and vice-versa. The fact the documentation is being clear does not imply a change of behaviour. As mentioned, the implementation could be the same if optimizations are removed. In fact they are the same for unregister/2. The difference of behaviour is only on register/3.
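
As a rough illustration of keys/2 and unregister/2 on a duplicate registry, assuming the Registry API that later shipped in Elixir's standard library (`Demo.Topics` is a hypothetical name):

```elixir
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.Topics)

# The same process registers twice under "a" and once under "b".
{:ok, _} = Registry.register(Demo.Topics, "a", 1)
{:ok, _} = Registry.register(Demo.Topics, "a", 2)
{:ok, _} = Registry.register(Demo.Topics, "b", 3)

# keys/2 returns all keys for the given process, in no particular order,
# so the result is sorted before it is inspected.
keys_before = Demo.Topics |> Registry.keys(self()) |> Enum.sort()

# unregister/2 removes every entry the calling process holds under the key.
:ok = Registry.unregister(Demo.Topics, "a")
keys_after = Registry.keys(Demo.Topics, self())
```

On a unique registry the same calls apply unchanged; the list simply cannot contain a key twice because register/3 never allowed it.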

I could make the docs worse to prove this point, but I would prefer to keep the docs clear, with examples and conditions, rather than succinct.

> It’s true that it’s more like an ets syndrome, but it still feels these two versions are doing different things. One is for a unique registration, another for process groups. Separating them would slightly simplify interface of each.

Except it isn't a process group. You can't look up all processes with their values on a given key.

> For example, in a registry I don’t need dispatch/3, but rather value/2.

Both are useful. Being able to dispatch/3 without caring what the registry kind is may be useful. And for when you need the value, there is whereis/2.

José Valim

Oct 26, 2016, 6:44:01 AM
to elixir-l...@googlegroups.com
> I could make the docs worse to prove this point but I would prefer to keep the docs clear with examples and conditions than to be succinct.

Unless it turns out not listing how each registry behaves is clearer. In this case, we should remove the conditional paragraphs. Feedback here is appreciated.

> Except it isn't a process group. You can't lookup all processes with their values on a given key.

We could make it a process group, though, but that doesn't play well with partitioning (we would need to do N serial lookups, where N is the number of partitions). Food for thought.

Sorry for being too succinct in the previous e-mail. :)

Saša Jurić

Oct 26, 2016, 8:07:20 AM
to elixir-l...@googlegroups.com
On 26 Oct 2016, at 12:43, José Valim <jose....@plataformatec.com.br> wrote:

>> I could make the docs worse to prove this point but I would prefer to keep the docs clear with examples and conditions than to be succinct.

> Unless it turns out not listing how each registry behaves is clearer. In this case, we should remove the conditional paragraphs. Feedback here is appreciated.

If they remain unified, I think the differences should be listed. Which is an implicit way of me saying that I think two different things are conflated into one :P


>> Except it isn't a process group. You can't lookup all processes with their values on a given key.

> We could make it a process group though but it doesn't play well with partitioning (we need to do N serial lookups where N is the number of partitions). Food for thought.

Why can’t you do something similar to what you’re doing with dispatch: N concurrent reads and then reduce the results into a list?

> Being able to dispatch/3 without caring what is the registry kind may be useful.

What would be an example for this?

> And for when you need the value, there is whereis/2.

Ah, somehow I missed that whereis also returns a value, sorry.

matteo brancaleoni

Oct 26, 2016, 8:47:27 AM
to elixir-lang-core, jose....@plataformatec.com.br
Hi,
 

> * Process registry: to register process with dynamic names. Often Elixir developers need to rely on gproc or other tools.

How is this different from registering a GenServer with the {:global, term} name, where term can be dynamic?

(I'm rather new to Elixir, so excuse me if this is a silly question.)


José Valim

Oct 26, 2016, 8:59:51 AM
to matteo brancaleoni, elixir-lang-core
> Why can’t you do similar to what you’re doing with dispatch: N concurrent reads and then reduce results into a list?

We could do that, although I am afraid doing N concurrent reads will likely add more overhead than not. We would need to spawn multiple processes and then copy the data back.

To put it another way, we could support it, but there is a cost per partition which I am afraid we can't get around.

> How is different from registering a GenServer with the {:global, term} name where term can be dynamic ?

The global term is distributed. It means that the value is global across *all nodes*, which also incurs communication costs on multi-node setups.
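
A small side-by-side sketch of the two naming styles. The local half assumes the Registry API that later shipped in Elixir's standard library, and `Demo.Local` and the `{:cart, 1}` key are hypothetical. It runs on a single node, so it shows only the syntax, not the cross-node cost.

```elixir
# Local: the name is known only to the Registry on this node.
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.Local)
local_name = {:via, Registry, {Demo.Local, {:cart, 1}}}
{:ok, local} = Agent.start_link(fn -> :local end, name: local_name)

# Global: the name goes through Erlang's :global, which keeps it
# consistent across every connected node.
global_name = {:global, {:cart, 1}}
{:ok, global} = Agent.start_link(fn -> :global end, name: global_name)

# Both names are dynamic terms and are used the same way by callers.
local_state = Agent.get(local_name, & &1)
global_state = Agent.get(global_name, & &1)

local_entries = Registry.lookup(Demo.Local, {:cart, 1})
global_pid = :global.whereis_name({:cart, 1})
```

From the caller's side the two are interchangeable; the difference is where the name lives and what it costs to register it.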




José Valim

Oct 26, 2016, 11:04:06 AM
to matteo brancaleoni, elixir-lang-core
@voltone and others have benchmarked the registry. I have consolidated @voltone's results here:


Summary: on a machine with 40 cores, a Registry with 40 partitions provides better results across the board for concurrent registration. Without partitioning, the Registry starts to scale poorly when the concurrency factor is somewhere between 8 and 20.
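
For reference, a partitioned registry can be sketched like this. It assumes the `partitions:` option and `Registry.count/1` of the Registry that later shipped in Elixir's standard library; `Demo.Partitioned` and the worker keys are hypothetical, and this is not the benchmark code.

```elixir
# One partition per scheduler, mirroring "40 partitions on 40 cores".
{:ok, _} =
  Registry.start_link(
    keys: :unique,
    name: Demo.Partitioned,
    partitions: System.schedulers_online()
  )

parent = self()

# 100 processes register concurrently, each under its own key.
for i <- 1..100 do
  spawn_link(fn ->
    {:ok, _} = Registry.register(Demo.Partitioned, {:worker, i}, i)
    send(parent, {:registered, i})

    # Stay alive so the registration is not cleaned up.
    receive do
      :stop -> :ok
    end
  end)
end

# Wait until every registration has been confirmed.
for i <- 1..100 do
  receive do
    {:registered, ^i} -> :ok
  end
end

registered = Registry.count(Demo.Partitioned)
worker_seven = Registry.lookup(Demo.Partitioned, {:worker, 7})
```

Callers never name a partition; the registry picks one internally, so the calling code is the same with 1 or 40 partitions.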

Surprisingly, gproc performs quite well, even in concurrent scenarios. <speculation>I am assuming serializing writes ensures there is no contention for concurrent threads, so less coordination and process switches.</speculation>

Erlang's built-in atom registration does not perform well in highly concurrent scenarios, but that's fine. It was not meant to support dynamic name registration anyway (and doing so would certainly be a bug in your app!).



Chris Keele

Oct 26, 2016, 11:12:44 AM
to elixir-lang-core, eugen...@gmail.com, jose....@plataformatec.com.br
> Right now the word registry is highly specific to processes, that's why we didn't consider Process.Registry but we will keep it in mind. One downside for this change is that I could see a strong argument being made for Process.Supervisor (instead of Supervisor). Both are process specific after all.

This is a good point. My only thought is that Supervisor is made to be used to build others (Task.Supervisor), so it stays out of the way at the top level, whereas Registry cannot be used that way (AFAIK), and we might want to create other kinds of Registry in the future (Module.Registry, Node.Registry, or something), so it stays out of the way by being under the Process namespace.

> @voltone and others have benchmarked the registry.

Those are some pretty incredible benchmarks! :o

José Valim

Oct 26, 2016, 4:46:13 PM
to elixir-l...@googlegroups.com
I have bumped the version to 0.2.0. This version removes whereis/2 in favor of lookup/2. The new lookup/2 function works on both unique and duplicate registries.

put_info/3 and update/3 have also been added. 
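
A rough sketch of lookup/2 on both registry kinds, assuming the Registry API that later shipped in Elixir's standard library. The functions `update_value/3`, `put_meta/3` and `meta/2` used below are the standard-library functions that appear to be the closest counterparts of update/3 and put_info/3; that mapping is an assumption, and the `Demo.*` names are hypothetical.

```elixir
{:ok, _} = Registry.start_link(keys: :unique, name: Demo.U)
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.D)

{:ok, _} = Registry.register(Demo.U, :k, 1)
{:ok, _} = Registry.register(Demo.D, :k, 1)
{:ok, _} = Registry.register(Demo.D, :k, 2)

# lookup/2 has the same shape for both kinds: a list of {pid, value} pairs.
unique_entries = Registry.lookup(Demo.U, :k)
duplicate_entries = Demo.D |> Registry.lookup(:k) |> Enum.sort()

# A process may update its own value (unique registries only).
updated = Registry.update_value(Demo.U, :k, &(&1 + 1))

# Registry-level metadata, stored next to the entries.
:ok = Registry.put_meta(Demo.U, :owner, "demo")
owner = Registry.meta(Demo.U, :owner)
```

A unique registry returns at most one pair and a duplicate registry returns any number, so callers can treat both as "zero or more entries".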



--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/08f13768-4b85-43d9-816f-dffd0951854c%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Piotr Szeremeta

Oct 26, 2016, 4:53:55 PM
to elixir-lang-core, jose....@plataformatec.com.br
Hi

Are there plans to support queries in dispatch/3? Something like:

iex> {:ok, _} = Registry.start_link(:duplicate, Registry.DispatcherTest)
iex> {:ok, _} = Registry.register(Registry.DispatcherTest, {Greetings, "hello"}, {IO, :inspect})
iex> Registry.dispatch(Registry.DispatcherTest, {Greetings, :_}, fn entries ->
...>   for {pid, {module, function}} <- entries, do: apply(module, function, [pid])
...> end)

On the other hand this could be somewhat emulated with:

iex> {:ok, _} = Registry.start_link(:duplicate, Registry.DispatcherTest)
iex> {:ok, _} = Registry.register(Registry.DispatcherTest, "hello", :world)
iex> Registry.dispatch(Registry.DispatcherTest, "hello", fn entries ->
...>   for entry <- entries do
...>     case entry do
...>       {pid, :world} -> do_stuff
...>       _ -> dont_do_stuff
...>     end
...>   end
...> end)

José Valim

Oct 26, 2016, 5:06:27 PM
to elixir-l...@googlegroups.com
No plans.




ILYA Khlopotov

Oct 27, 2016, 9:47:18 AM
to elixir-l...@googlegroups.com
Looks great!!

My 2 cents: please consider changing the link to a monitor (or supporting it as an option) to avoid repeating gen_event's mistakes.
What if we don't want our process to die when the registry crashes? We would have to trap exits and handle EXIT messages, but how would we know the pid of the registry partition to match on?
With monitors, the process could pass a callback MFA to be called when either the registry or the process itself dies, something similar to gen_server's terminate.

ILYA Khlopotov

Oct 27, 2016, 11:56:33 AM
to elixir-l...@googlegroups.com
In addition to dispatch, which dispatches to all processes, it would be nice to be able to dispatch to 1 process (or m out of n). Possible dispatch strategies:
- round robin (unlikely to be possible, since it is not a stateless strategy)
- random
- m processes chosen randomly
- provided MFA dispatch strategy


José Valim

Oct 27, 2016, 12:33:54 PM
to elixir-l...@googlegroups.com
There is a lookup function that would allow you to retrieve all entries for a given key and then you can perform the dispatching strategy as you wish.
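
A sketch of how such strategies could be built on top of lookup/2. It assumes the Registry API that later shipped in Elixir's standard library; `Demo.Workers`, the `:pool` key and the message shapes are hypothetical.

```elixir
{:ok, _} = Registry.start_link(keys: :duplicate, name: Demo.Workers)
parent = self()

# Five workers register under the same key, each storing its id as value.
for id <- 1..5 do
  spawn_link(fn ->
    {:ok, _} = Registry.register(Demo.Workers, :pool, id)
    send(parent, {:ready, id})

    receive do
      {:job, from, job} -> send(from, {:done, id, job})
    end
  end)
end

for id <- 1..5 do
  receive do
    {:ready, ^id} -> :ok
  end
end

entries = Registry.lookup(Demo.Workers, :pool)

# Strategy "random": a single entry picked uniformly at random.
{pid, chosen_id} = Enum.random(entries)
send(pid, {:job, self(), :resize})

reply =
  receive do
    {:done, id, :resize} -> id
  after
    1_000 -> :timeout
  end

# Strategy "m out of n": here m = 2 distinct entries picked at random.
two = Enum.take_random(entries, 2)

# Strategy "caller-provided": any function over the entries will do.
lowest = Enum.min_by(entries, fn {_pid, id} -> id end)
```

Round robin would need state kept somewhere between calls, which is why it does not fit a stateless lookup like this one.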




José Valim

Oct 27, 2016, 8:34:56 PM
to elixir-l...@googlegroups.com
Registry has been updated to v0.3.0 with:

  1. Improved performance for 1 partition (most common case)
  2. Allow a process to update its own value
  3. Support for named listeners
  4. Allow metadata to be stored alongside the registry

The last two features should make the registry more extensible. Listeners allow you to give a named process when the registry is started; that process will receive events whenever a process registers or unregisters a key. You could use those features to add await support to the registry or even use the registry as a pool. We have added an example of using the registry as a sojourn pool to the examples directory. The pool is only an example; do not use it in production:


The pool above measures the sojourn time, which is how long messages stay in each pooled process queue. That provides an idea of how fast processes are responding. Whenever there is a dispatch, we pick two random pooled processes and choose the one with the most recent reply and smaller sojourn time.
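
The "pick two at random, keep the better one" selection described above can be sketched as a pure function. This is not the code from the examples directory: the module name, the shape of the stored value, and the exact ordering (smaller sojourn time first, most recent reply as tie-breaker) are assumptions made for illustration.

```elixir
defmodule Demo.TwoChoices do
  # Each entry is {pid, %{sojourn: microseconds, last_reply: timestamp}}.
  # This value shape is an assumption made for the sketch.

  def pick([]), do: nil
  def pick([only]), do: only

  def pick(entries) do
    [first, second] = Enum.take_random(entries, 2)
    better(first, second)
  end

  # Assumed ordering: the smaller sojourn time wins; ties go to the
  # entry with the most recent reply.
  def better({_, a} = first, {_, b} = second) do
    if {a.sojourn, -a.last_reply} <= {b.sojourn, -b.last_reply}, do: first, else: second
  end
end

# Atoms stand in for pids to keep the example deterministic.
fast = {:fast, %{sojourn: 120, last_reply: 1_000}}
slow = {:slow, %{sojourn: 900, last_reply: 1_050}}

choice = Demo.TwoChoices.pick([fast, slow])
```

Sampling two candidates instead of scanning all of them keeps the cost of each dispatch constant while still steering work away from slow processes.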

The pool itself works by starting a worker and a registry. The worker is a listener of the registry, and a supervisor guarantees that both are restarted in case either of them crashes. Whenever a new process is added to the registry, the worker is notified and starts sending sampling messages to that process, which updates its own entry in case of crashes.
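
A minimal sketch of a listener receiving registry events, plus metadata set at start-up. It assumes the `listeners:` and `meta:` options of the Registry that later shipped in Elixir's standard library; `Demo.Watched` and `:demo_listener` are hypothetical names, and the current process plays the role of the worker.

```elixir
# The listener must be a named process, so the current process
# registers itself under a hypothetical name before the registry starts.
Process.register(self(), :demo_listener)

{:ok, _} =
  Registry.start_link(
    keys: :unique,
    name: Demo.Watched,
    listeners: [:demo_listener],
    meta: [pool_size: 3]
  )

{:ok, _} = Registry.register(Demo.Watched, :worker, :idle)

# The listener is told which process registered which key and value.
registered =
  receive do
    {:register, Demo.Watched, :worker, pid, value} -> {pid, value}
  after
    1_000 -> :timeout
  end

:ok = Registry.unregister(Demo.Watched, :worker)

unregistered =
  receive do
    {:unregister, Demo.Watched, :worker, pid} -> pid
  after
    1_000 -> :timeout
  end

# Metadata given at start-up can be read back by any process.
pool_size = Registry.meta(Demo.Watched, :pool_size)
```

A listener that also needs to hear about crashes would have to monitor the registered pid itself once it receives the register event.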

I hope it serves as inspiration for playing with the registry for other use cases.

Paul Schoenfelder

Oct 27, 2016, 8:59:29 PM
to elixir-l...@googlegroups.com
Hey José,

The use case I've been working on/with, and am most interested in, is a better distributed (global) registry. Is this something that Registry will support (or perhaps even currently supports)? I haven't yet had time to look at Registry in detail, but it was the first thing that came to mind when I saw the announcement.

Paul



matteo brancaleoni

Oct 28, 2016, 12:54:55 AM
to elixir-l...@googlegroups.com
Hi,

On Fri, Oct 28, 2016 at 2:59 AM, Paul Schoenfelder <paulscho...@gmail.com> wrote:
> Hey José,
>
> The use case I've been working on/with, and am most interested in, is a better distributed (global) registry, is this something that Registry will support (or perhaps even currently supports)? I haven't yet had time to look at Registry in detail, but it was the first thing that came to mind when I saw the announcement.

Under the hood it uses Erlang's :ets term storage, which is not distributed and cannot be made distributed.
Maybe with mnesia something can be done :)

Paul Schoenfelder

Oct 28, 2016, 1:46:23 AM
to elixir-l...@googlegroups.com
Right, distribution implies some form of replication, which is what mnesia does under the covers. Mnesia isn't a good fit though, because it can't handle dynamic node membership in a cluster (so any kind of autoscaling orchestration is out). The library I wrote uses eventually consistent replication, but I'd love to see a solution which is either part of Registry, or can be built on it as an extension of some kind.

Paul

José Valim

Oct 28, 2016, 4:35:33 AM
to elixir-l...@googlegroups.com
> Right, distribution implies some form of replication, which is what mnesia does under the covers. Mnesia isn't a good fit though, because it can't handle dynamic node membership in a cluster (so any kind of autoscaling orchestration is out). The library I wrote uses eventually consistent replication, but I'd love to see a solution which is either part of Registry, or can be built on it as an extension of some kind.

The Registry is local by definition. The Elixir team currently has no plans to tackle a distributed Registry, especially because a global, distributed, unique registry is currently being tackled by the OTP team.

We do have plans to tackle a distributed process group (which Phoenix.Presence already is) and a distributed process registry as part of the Phoenix.PubSub "suite" (and not Elixir) using many of the concepts from Gabi's talk at ElixirConf. But it is still too early to tell.

José Valim

Oct 28, 2016, 6:25:30 PM
to elixir-l...@googlegroups.com
The benchmark results have been refreshed: https://docs.google.com/spreadsheets/d/1MByRZJMCnZ1wPiLhBEnSRRSuy1QXp8kr27PIOXO3qqg/edit#gid=0

Thanks to Bram Verburg for running the new batch and updating the data!




José Valim

Nov 2, 2016, 5:20:48 AM
to elixir-l...@googlegroups.com
The Registry has been merged into Elixir master.




Paweł Dawczak

Nov 2, 2016, 12:31:58 PM
to elixir-lang-core, jose....@plataformatec.com.br
Absolutely great news! Thanks for your work!

