[Proposal] GenServer delayed initialization callback


Michał Muskała

May 15, 2016, 8:45:27 AM
to elixir-l...@googlegroups.com
Hello everybody,

Today GenServer (and all stdlib behaviours) are initialized
synchronously: start_link returns only after init/1 has returned. But
in many cases the initialization is expensive, or is not essential to
the GenServer's operation, and can be delayed, at least partially.

There are a couple of solutions to this problem. One is sending itself a
message from init/1, which is error prone, because we have no
guarantees that this will be the first message received by the
process. The other one is to use :proc_lib or :gen directly, similar
to how it's used in the connection library
https://github.com/fishcakez/connection/blob/master/lib/connection.ex#L595
- this solution is correct, but very complicated and requires advanced
knowledge of OTP internals.
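
For reference, the :proc_lib route mentioned above could look roughly like the sketch below. It is a minimal illustration, not the code used by the connection library: the module name DelayedInit, the init_it/2 entry point and the sleep standing in for expensive work are all assumptions made for the example. Only :proc_lib.start_link/3, :proc_lib.init_ack/2 and :gen_server.enter_loop/3 are real OTP functions.

```elixir
defmodule DelayedInit do
  use GenServer

  # Hypothetical example module. start_link/1 bypasses GenServer.start_link
  # and spawns the process through :proc_lib directly.
  def start_link(arg) do
    :proc_lib.start_link(__MODULE__, :init_it, [self(), arg])
  end

  # Entry point of the spawned process (public so :proc_lib can call it).
  def init_it(parent, arg) do
    # Essential setup would go here. Acknowledging the parent makes
    # start_link/1 return {:ok, pid} before the slow part runs.
    :proc_lib.init_ack(parent, {:ok, self()})

    # Stand-in for expensive work (opening a socket, loading from a database).
    Process.sleep(100)
    state = %{arg: arg, ready: true}

    # Hand control to the regular gen_server loop. Messages that arrived
    # during the slow step stay queued and are handled from here on.
    :gen_server.enter_loop(__MODULE__, [], state)
  end

  # Not invoked by this start path; defined because the behaviour requires it.
  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call(:state, _from, state), do: {:reply, state, state}
end
```

With this sketch, DelayedInit.start_link(:db) returns as soon as the ack is sent, and a GenServer.call(pid, :state) made right away simply waits in the mailbox until the slow step has finished.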

I'd like to propose adding another callback to GenServer called
delayed_init/1, which would be called if a new tuple {:delayed_init,
arg, state} is returned from init/1. The callback would be called
after start_link has returned, but before any message is processed by
the server.
The delayed_init/1 callback would support returning:
{:ok, state} - for entering the normal GenServer loop
{:ok, state, timeout | :hibernate} - similar to init/1
{:stop, reason, state} - similar to the return values of handle_call/3 and handle_cast/2

This would make it easy to deal with the problem from application code
and simplify libraries such as connection significantly, removing all
the OTP plumbing code.

Example use cases include opening sockets or ports, loading some
state from external resources like databases, doing some expensive
initialization that we know usually succeeds and that should not block
starting the supervision tree, or awaiting the start of other parts of
the supervision tree or other applications before processing messages.

What do you think about that?

Michał.

Andrea Leopardi

May 15, 2016, 9:34:17 AM
to elixir-lang-core
I'm not sure about this. It would make Elixir's GenServer different from Erlang's gen_server, and I'm not sure that's ok. I agree that it's nasty to do this correctly, but it's still very possible.
Also, if you depend on said database resources/expensive initialization for your application to function properly, then I think it's ok to have a long initialization. Section 2.2 in chapter 2 of Erlang in Anger has a very interesting discussion about this.

Maybe this is more suited to an external library (similar to (gen :P)connection)?

Michał Muskała

May 15, 2016, 10:20:44 AM
to elixir-l...@googlegroups.com
Yes, diverging from Erlang's gen_server is an issue, but we're already
doing that, albeit on a smaller scale. On the other hand, this change
is a pure extension: if not used, it does not affect your code in
any way.

As with most such things, synchronous vs asynchronous initialization is
a tradeoff. I'd say it should be easy for developers to choose which
guarantees interest them. Right now one is extremely easy to achieve,
and the other is rather complicated. I agree, though, that synchronous
initialization is the better choice in most circumstances, but when you
need async, well - you need it ;)
Another case for async initialization is that synchronous
initialization happens one process at a time. If you have a lot of
processes that all need to load some state from a database, doing it
concurrently should be a huge improvement in startup time. The
guarantee of the supervision tree is weakened only slightly if a
process earlier in the supervision tree checks for the availability of
the connection.

Doing this as a separate library was my initial idea, until I tried to
find a name and couldn't come up with anything satisfying other than
just "GenServer". At that point I decided to write this proposal
first, but doing it as a library is still something I'm considering.
Writing a proposal was also a way to gather opinions on usefulness of
such a library.
> --
> You received this message because you are subscribed to the Google Groups
> "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elixir-lang-co...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elixir-lang-core/f766ac8f-a9e3-4bf7-a98a-ce280e7fd69e%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

James Fish

May 15, 2016, 10:23:24 AM
to elixir-l...@googlegroups.com
I think the larger problem being tackled here is that current OTP behaviours require the callbacks to be purely message based and do not make it easy to dispatch internal events. Have you looked at :gen_statem in OTP 19 rc1? I think it solves this problem quite well, or at least I have not seen anything I think is better. Once OTP 19 lands I'll release :gen_connection, which is :connection but using a :gen_statem instead of a :gen_server. It simplifies the code enormously and works better with OTP's logging and debugging features. It might be that you want to propose including GenStatem instead of a delayed init.

Peter Hamilton

May 15, 2016, 11:07:02 AM
to elixir-l...@googlegroups.com

One nitpick. Merge arg and state in the return in init. Having them separate is confusing. Delayed init takes one parameter and returns a new state. What happens to the originally returned state?


Eric Meadows-Jönsson

May 15, 2016, 1:10:08 PM
to elixir-l...@googlegroups.com
> There are couple solutions to this problem. One is sending itself a
> message from init/1, which is error prone, because we have no
> guarantees that this will be the first message received by the
> process.

What are the instances when a message can arrive before one sent from init/1?



Michał Muskała

May 15, 2016, 1:13:15 PM
to elixir-l...@googlegroups.com
2016-05-15 17:06 GMT+02:00 Peter Hamilton <petergh...@gmail.com>:
> One nitpick. Merge arg and state in the return in init. Having them separate
> is confusing. Delayed init takes one parameter and returns a new state. What
> happens to the originally returned state?
>

It was my intention to have a delayed_init/2 function taking the
argument and state as arguments. Thank you for pointing that out.

On Sun, May 15, 2016, 7:23 AM James Fish <ja...@fishcakez.com> wrote:
>
> I think the larger problem being tackled here is that current OTP
> behaviours require the callbacks to be purely message based and do not make
> it easy to dispatch internal events. Have you looked at :gen_statem in OTP
> 19 rc1? I think it solves this problem quite well, or at least I have not
> seen something I think is better. Once OTP 19 lands I'll release
> :gen_connection which is :connection but using a :gen_statem instead of a
> :gen_server. It simplifies the code enormously and works better with OTPs
> logging and debugging features. It might be that you want to propose
> including GenStatem instead of a delayed init.

I have to admit, I hadn't looked into gen_statem before. Now that I
have, I can see some overlap between what it offers and my proposal,
especially the ability to call `enter_loop` by hand. My main use case
for this feature was offering custom gen_server-like behaviours, so
I'm fine with handling that level of detail in a lib.
Do you intend to keep the gen_server like interface for gen_connection
or rather migrate to something more similar to gen_statem?

Michał.

José Valim

May 15, 2016, 1:14:46 PM
to elixir-l...@googlegroups.com
Michał, we can also call enter_loop for GenServer.

Eric, the process is registered before we call init. So another process can send the registered name a message.
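
The ordering described here can be made visible with a small sketch. The module RaceDemo and its poke/0 helper are hypothetical names invented for this illustration, and the blocking receive in init/1 is only a deterministic stand-in for slow setup work; real code would not wait for a client like this.

```elixir
defmodule RaceDemo do
  use GenServer

  def start_link do
    GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  # Hypothetical client: waits until the name is registered, then sends
  # :early followed by :go. It can do so while init/1 is still running,
  # because the name is registered before init/1 is called.
  def poke do
    case Process.whereis(__MODULE__) do
      nil ->
        Process.sleep(1)
        poke()

      pid ->
        send(pid, :early)
        send(pid, :go)
    end
  end

  @impl true
  def init(:ok) do
    # Selective receive: take :go out of the mailbox but leave :early in
    # place. This stands in for initialization that takes a while.
    receive do
      :go -> :ok
    end

    # The "send yourself a message from init/1" workaround.
    send(self(), :connect)
    {:ok, []}
  end

  # Record every plain message in arrival order.
  @impl true
  def handle_info(msg, seen), do: {:noreply, seen ++ [msg]}

  @impl true
  def handle_call(:seen, _from, seen), do: {:reply, seen, seen}
end
```

If poke/0 is started in a separate process before RaceDemo.start_link/0 is called, the server handles :early before :connect, so the message sent from init/1 is not the first one processed.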

Peter Hamilton

May 15, 2016, 1:16:25 PM
to elixir-l...@googlegroups.com
On Sun, May 15, 2016 at 10:10 AM Eric Meadows-Jönsson <eric.meado...@gmail.com> wrote:
>> There are couple solutions to this problem. One is sending itself a
>> message from init/1, which is error prone, because we have no
>> guarantees that this will be the first message received by the
>> process.
>
> What are the instances when a message can arrive before one sent from init/1?

Convention can provide fairly strong guarantees, but there are no absolute ones. An example of it not being the first message:

def init(_) do
  # other_process is assumed to be a pid or registered name known here
  send(other_process, {:new_pid, self()})
  send(self(), :connect)
  {:ok, nil}
end

In theory, there's a race condition there in which other_process could attempt to send a message that arrives before :connect.

Aside from this example, are named processes registered before or after init runs? If it's before, that's another avenue.

Saša Jurić

May 15, 2016, 1:54:49 PM
to elixir-lang-core
I think this is an excellent idea! Performing possibly long-running initialization is currently burdened with some caveats, which are not always obvious, especially to newcomers. Having a callback which is invoked after init returns (and the ack has been sent to the parent), but before the receive loop starts, would be a great simplification. I'd call the callback post_init, and make it non-optional. The callback then becomes a part of the GenServer lifecycle, and there are no changes to return tuples from init/1. The __using__ macro of GenServer could provide the default impl, which would do nothing, thus ensuring that there are no breaking changes.

I wonder if this would be best solved at the OTP level. That's of course assuming that the OTP team would be open to such a change.

Either way, I'm definitely +1 on this feature.



James Fish

May 15, 2016, 2:06:39 PM
to elixir-l...@googlegroups.com
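> Do you intend to keep the gen_server like interface for gen_connection
> or rather migrate to something more similar to gen_statem?

Exactly the same API.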



Michał Muskała

May 16, 2016, 9:27:20 AM
to elixir-l...@googlegroups.com
I misunderstood the enter_loop function, so I'm not fine with that as
the API after all.

2016-05-15 19:54 GMT+02:00 Saša Jurić <sasa....@gmail.com>:
> I'd call the callback
> post_init, and make it non-optional. The callback then becomes a part of the
> GenServer lifecycle, and there are no changes to return tuples from init/1.
> The __using__ macro of GenServer could provide the default impl, which would
> do nothing, thus ensuring that there are no breaking changes.

I like the name post_init better.

The problems I see with an implicitly invoked callback are:
- it silently fails on older versions, leaving the process
uninitialized, while returning an unknown tuple would cause an error
- the initialization params need to be passed through the state, so if
they are not normally part of the state, I now need some additional
field to store them.

Sasa Juric

May 16, 2016, 10:05:25 AM
to elixir-l...@googlegroups.com
Another idea is to consider supporting explicit acknowledgement to the parent. If I’m not mistaken, the parent is blocked until :proc_lib.init_ack is invoked. Perhaps the callback could call init_ack manually from init/1, to allow the parent to move on. However, now we need to inform the behaviour that we’ve already notified the parent ourselves, so it doesn’t send the ack again. Maybe introducing responses like {:acknowledged, standard_init_response} would do the trick. 

Another (possibly ugly) variation is to introduce a wrapper function and use proc dict to cache acknowledgment. The client code could look like:

def init(arg) do
  # do essential init here
  GenServer.send_ack()
  # do long running init, and return standard init response
end

While admittedly hacky, this has the following benefits:

1. No need for additional init responses, which are possibly error prone anyway
2. No additional callback required
3. No need to store arg in the state
4. This code will crash explicitly on older versions. However, older code will work precisely as before.

On the flip side, it’s a bit implicit and polymorphic - parent is notified after init, unless explicitly done from init/1. Relying on proc dict is also somewhat controversial, though not unprecedented. IIRC, OTP uses it for some internal stuff (e.g. $ancestors) already.

Thoughts?

