You don't have to implement your own supervisor to get this kind of
behavior; simply move the connection out of initialization. As a general
rule, initialization should never depend on anything outside your
node's control -- especially not something across the network.
It is less complicated to either:
1. Write a service manager: A connection manager process whose job is
to know which connections have failed and how long ago, and which
implements *exactly* the kind of backoff you want. The workers start up
disconnected and expose a connect/0 call for the manager to use.
2. Write smarter workers: The connection processes themselves are
written to handle the case where the connection is lost and implement
reconnect backoff themselves.
Which way you choose to do this is up to you. Neither is very complicated.
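
As a rough illustration of the "smarter workers" option, here is a
minimal sketch of a gen_server that starts disconnected and retries
with a doubling delay. The module name smart_worker, the
is_connected/0 call, and the delay and timeout values are all
assumptions made up for this example; a real worker would also handle
incoming data and might do the connect in a separate process so it
never blocks.

```erlang
-module(smart_worker).
-behaviour(gen_server).

-export([start_link/2, is_connected/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(MIN_DELAY, 500).    % ms, assumed initial retry delay
-define(MAX_DELAY, 30000).  % ms, assumed ceiling for the retry delay

-record(s, {host, port, socket = none, delay = ?MIN_DELAY}).

start_link(Host, Port) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, {Host, Port}, []).

is_connected() ->
    gen_server:call(?MODULE, is_connected).

%% init/1 touches nothing outside the node. It only queues the first
%% connection attempt, so the supervisor sees a successful start even
%% when the remote side is down.
init({Host, Port}) ->
    self() ! connect,
    {ok, #s{host = Host, port = Port}}.

handle_call(is_connected, _From, State = #s{socket = Socket}) ->
    {reply, Socket =/= none, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Try to connect. On failure, schedule a retry and double the delay
%% (up to ?MAX_DELAY); on success, reset the delay.
handle_info(connect, State = #s{host = Host, port = Port, delay = Delay}) ->
    case gen_tcp:connect(Host, Port, [binary, {active, true}], 3000) of
        {ok, Socket} ->
            {noreply, State#s{socket = Socket, delay = ?MIN_DELAY}};
        {error, _Reason} ->
            erlang:send_after(Delay, self(), connect),
            {noreply, State#s{delay = min(Delay * 2, ?MAX_DELAY)}}
    end;
%% Losing the connection is an expected event, not a crash: go back to
%% the disconnected state and start reconnecting.
handle_info({tcp_closed, Socket}, State = #s{socket = Socket}) ->
    self() ! connect,
    {noreply, State#s{socket = none}};
handle_info(_Other, State) ->
    {noreply, State}.
```

The supervisor above this worker needs no special restart logic; it
only ever sees a worker that started fine and is either connected or
working on it.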
Losing an external resource is not a *fault* in your program, but rather
an expected case that you know about and are discussing right now.
Putting this into supervisors is overloading and specializing
supervisors to handle a state management task that belongs either in a
sub-service manager process or inside the state of the workers
themselves. I tend to opt for the "write a service manager" approach
when we have a simple_one_for_one type supervisor structure (typical of
the case where we have multiple incoming connections, the
service->worker pattern), and the "write smarter workers" approach when
we have a predetermined number of connections of various types (often
meaning named workers connecting to specific external resources like one
connection each to a DB, an upstream feed, and a presence service, all
of which have totally different code internally).
"Write smarter workers" sometimes becomes "write a backoff connection
behavior" so that the details of backoff can be implemented just once,
but if it is three or fewer modules... meh.
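
One way to implement the backoff details just once, without going as
far as a full custom behavior, is a small pure helper module that each
worker keeps in its state. The sketch below is only an assumption about
what that could look like; the module name backoff and the functions
new/2, fail/1 and succeed/1 are invented for this example.

```erlang
-module(backoff).
-export([new/2, fail/1, succeed/1]).

-record(backoff, {min, max, current}).

%% Min and Max are delays in milliseconds.
new(Min, Max) when is_integer(Min), Min > 0, is_integer(Max), Max >= Min ->
    #backoff{min = Min, max = Max, current = Min}.

%% Call after a failed attempt. Returns the delay to wait before the
%% next attempt, plus the updated state with the delay doubled and
%% capped at Max.
fail(B = #backoff{max = Max, current = Current}) ->
    {Current, B#backoff{current = min(Current * 2, Max)}}.

%% Call after a successful connection to reset the delay to Min.
succeed(B = #backoff{min = Min}) ->
    B#backoff{current = Min}.
```

A worker would call backoff:fail/1 when a connect attempt fails, pass
the returned delay to erlang:send_after/3, and call backoff:succeed/1
once it is connected again.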
-Craig