Auto-reconnect in transport.


Jonathan Slenders

Dec 16, 2013, 3:51:11 AM
to python...@googlegroups.com

Hi all, for the asyncio_redis library, I need a pointer on how to implement auto reconnect.

We currently do:
loop.create_connection(RedisProtocol, 'localhost', 6379)

But we want to create a transport that establishes a new TCP connection when the existing one is closed (or has timed out?). As I understand it, it should be the transport that calls connection_made and connection_lost again for each new connection.

Thanks!
Jonathan

Guido van Rossum

Dec 16, 2013, 1:24:39 PM
to Jonathan Slenders, python-tulip
I suppose you want to reuse the Protocol instance?

Perhaps in the connection_lost method you can call create_connection()
again with a protocol factory argument that's just "lambda: self".
--
--Guido van Rossum (python.org/~guido)
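
A minimal sketch of the suggestion above, written with current async/await syntax. It assumes the protocol is handed its own host and port; the class name ReconnectingProtocol, the max_connections cap and the throwaway local server are illustrative only and are not part of asyncio or asyncio_redis. Because connection_lost is a plain callback, the create_connection coroutine is scheduled as a task rather than awaited:

```python
import asyncio


class ReconnectingProtocol(asyncio.Protocol):
    """Hypothetical protocol that reconnects by reusing this same instance."""

    def __init__(self, host, port, max_connections=3):
        # The protocol has to remember its own connection parameters.
        self.host, self.port = host, port
        self.max_connections = max_connections
        self.connections = 0
        self.done = asyncio.get_running_loop().create_future()

    def connection_made(self, transport):
        self.transport = transport
        self.connections += 1

    def connection_lost(self, exc):
        if self.connections >= self.max_connections:
            self.done.set_result(self.connections)
            return
        # Reconnect with a factory that returns this instance again.
        loop = asyncio.get_running_loop()
        self._reconnect_task = loop.create_task(
            loop.create_connection(lambda: self, self.host, self.port))


async def demo():
    # Local server that drops every client immediately.
    server = await asyncio.start_server(
        lambda reader, writer: writer.close(), "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]
    protocol = ReconnectingProtocol(host, port)
    loop = asyncio.get_running_loop()
    await loop.create_connection(lambda: protocol, host, port)
    count = await asyncio.wait_for(protocol.done, timeout=5)
    server.close()
    return count


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A real client would probably add a delay or backoff between attempts and handle a failed create_connection; both are left out here to keep the sketch short.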

Jonathan Slenders

Dec 16, 2013, 2:55:54 PM
to Guido van Rossum, python-tulip
Yes, I'd like to reuse the Protocol class.

That would work, but why is it that Twisted uses such a complicated ReconnectingClientFactory?

If we are going to implement a lot of protocols on top of asyncio, and if it's proven that such a reconnecting strategy works better, would that make a good candidate for the asyncio core itself?




Jonathan Slenders

Dec 16, 2013, 3:17:38 PM
to python...@googlegroups.com, Guido van Rossum, jona...@slenders.be
There is another issue:

We could do something like this in the protocol:

class Protocol:
  def connection_lost(self, exc):
    loop.create_connection(lambda: self, self.transport.address, self.transport.port)

but there is no way to retrieve the address or port from the transport or protocol. Somehow, we should be able to create a new, but identical transport instance from the same protocol.

Saúl Ibarra Corretgé

Dec 16, 2013, 3:51:33 PM
to python...@googlegroups.com
On 12/16/2013 09:17 PM, Jonathan Slenders wrote:
> There is another issue:
>
> We could do something like this in the protocol:
>
> class Protocol:
> def connection_lost(self, exc):
> loop.create_connection(lambda: self, self.transport.address,
> self.transport.port)
>
> but there is no way to retrieve the address or port from the transport
> or protocol. Somehow, we should be able to create a new, but identical
> transport instance from the same protocol.
>

You can get the peer address out of a transport with
transport.get_extra_info('peername').

That should be the host, port tuple where the transport was connected.
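
A small sketch of reading the peer address this way, using a throwaway local server so it runs without any external host (the function name peer_of_local_connection is made up for the example). Note that 'peername' is the resolved socket address, not the host name originally passed to create_connection, and that for IPv6 the tuple has four fields rather than two:

```python
import asyncio


async def peer_of_local_connection():
    # Throwaway local server so the example needs no external host.
    server = await asyncio.start_server(
        lambda reader, writer: writer.close(), "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]

    loop = asyncio.get_running_loop()
    transport, _ = await loop.create_connection(asyncio.Protocol, host, port)
    # For IPv4 this is a (host, port) tuple; IPv6 adds two more fields.
    peername = transport.get_extra_info("peername")
    transport.close()
    server.close()
    return peername, (host, port)


if __name__ == "__main__":
    print(asyncio.run(peer_of_local_connection()))
```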


--
Saúl Ibarra Corretgé
bettercallsaghul.com

Guido van Rossum

Dec 16, 2013, 4:02:01 PM
to Jonathan Slenders, python-tulip
On Mon, Dec 16, 2013 at 11:55 AM, Jonathan Slenders
<jona...@slenders.be> wrote:
> Yes, I'd like to reuse the Protocol class.
>
> That would work, but why is it that Twisted uses such a complicated
> ReconnectingClientFactory?
> http://twistedmatrix.com/trac/browser/tags/releases/twisted-8.2.0/twisted/internet/protocol.py#L198

You'd have to ask Twisted.

> If we are going to implement a lot of protocols on top of asyncio, and if
> it's proven that such a reconnecting strategy works better, would that make
> a good candidate for the asyncio core itself?

But it's far from proven at this point. I'd like to see more people
experiment with different ways of reconnecting for different types of
protocols before adding to asyncio's feature list.

Glyph

Dec 17, 2013, 1:21:14 AM
to Guido van Rossum, Jonathan Slenders, python-tulip

On Dec 16, 2013, at 1:02 PM, Guido van Rossum <gu...@python.org> wrote:

> > That would work, but why is it that Twisted uses such a complicated
> > ReconnectingClientFactory?
> > http://twistedmatrix.com/trac/browser/tags/releases/twisted-8.2.0/twisted/internet/protocol.py#L198
>
> You'd have to ask Twisted.

This is widely regarded as a mistake and we are trying to fix it.  Putting this into the transport has all kinds of knock-on additional complexity; for example, you can connect either using connectTCP(), or connectTCP().connect().  You get two methods called for every connection drop (ClientFactory.clientConnectionLost, Protocol.connectionLost) which are purely redundant.

This is another case where pushing more convenience functionality into a primitive just means more re-implementation of that convenience functionality in different primitive implementations.  All of this reconnecting logic can be higher-level, defined in terms of wrappers around protocols, without any additional complexity on the part of the caller.

Technically it's not deprecated yet, but that's only because the replacement isn't done.  You can see the work (slowly) proceeding here: <http://twistedmatrix.com/trac/ticket/4735>.

-glyph

Antoine Pitrou

Dec 17, 2013, 6:27:48 AM
to python...@googlegroups.com
On Mon, 16 Dec 2013 22:21:14 -0800
Glyph <gl...@twistedmatrix.com> wrote:
>
> This is another case where pushing more convenience functionality into a primitive just means more re-implementation of that convenience functionality in different primitive implementations. All of this reconnecting logic can be higher-level, defined in terms of wrappers around protocols, without any additional complexity on the part of the caller.

I don't understand how reconnection can be a "wrapper around protocols".
To reconnect you have to interact with a given transport. Worse, you
must *decide* whether to reconnect or not (because a normal connection
close should usually not lead you to reconnect). So it *is*
transport-specific already.

Actually, the general pattern may be to call self.transport.reconnect()
from the protocol's connection_lost().

Regards

Antoine.
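
A rough sketch of the "decide whether to reconnect" step, under stated assumptions. In asyncio, connection_lost(exc) receives None both for a regular EOF from the peer and for a local close() or abort(), so the exception alone cannot tell a deliberate close from a dropped connection. The closed_by_us flag and the PolicyProtocol name below are illustrative, not an asyncio API:

```python
import asyncio


class PolicyProtocol(asyncio.Protocol):
    """Hypothetical protocol that records whether a reconnect is wanted."""

    def __init__(self):
        self.transport = None
        self.closed_by_us = False
        self.should_reconnect = None

    def connection_made(self, transport):
        self.transport = transport

    def close(self):
        # Application-level close: remember that it was intentional.
        self.closed_by_us = True
        self.transport.close()

    def connection_lost(self, exc):
        # exc is None for a peer EOF and for a local close()/abort(),
        # so the flag is what separates the two cases.
        self.should_reconnect = not self.closed_by_us
```

How the reconnect is then carried out (by the protocol, a wrapper, or the transport) is exactly the open question in this thread; the sketch only covers the decision.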


Jonathan Slenders

Dec 18, 2013, 4:15:37 AM
to python...@googlegroups.com


On Tuesday, December 17, 2013 at 12:27:48 UTC+1, Antoine Pitrou wrote:
> Actually, the general pattern may be to call self.transport.reconnect()
> from the protocol's connection_lost().

Yes, it would be very useful to have a reconnect method on the transport.
I was trying something like calling the following coroutine in connection_lost:


def reconnect(self):
  # Read the peer address from the old transport before replacing it.
  info = self.transport.get_extra_info('peername')
  loop = asyncio.get_event_loop()
  transport, protocol = yield from loop.create_connection(lambda: self, *info)
  self.transport = transport

This will probably work, but it makes an assumption about which kind of transport the protocol is using. What if this is a Unix socket instead of a TCP socket? Does get_extra_info return what needs to be passed into create_connection?

I'd like any of the following:

  transport.reconnect()
or
  new_transport = transport.reconnect(protocol_instance)

Glyph

Dec 18, 2013, 6:16:47 AM
to Antoine Pitrou, python-tulip
On Dec 17, 2013, at 3:27 AM, Antoine Pitrou <soli...@pitrou.net> wrote:

> On Mon, 16 Dec 2013 22:21:14 -0800
> Glyph <gl...@twistedmatrix.com> wrote:
>
> > This is another case where pushing more convenience functionality into a primitive just means more re-implementation of that convenience functionality in different primitive implementations.  All of this reconnecting logic can be higher-level, defined in terms of wrappers around protocols, without any additional complexity on the part of the caller.
>
> I don't understand how reconnection can be a "wrapper around protocols".
> To reconnect you have to interact with a given transport. Worse, you
> must *decide* whether to reconnect or not (because a normal connection
> close should usually not lead you to reconnect).

You create a protocol factory wrapper (PFW) around an application protocol factory (PF) and some instructions for how to create an outbound connection that takes a protocol factory.

When PFW is asked to create a protocol, it asks PF to create a protocol P, and then creates a wrapped protocol PW, which delegates everything except connection_lost to P.

PW.connection_lost then executes the instructions to create a new outgoing connection with PF, and you get a new protocol for that new connection.
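
The PFW / PF / PW arrangement described above might look roughly like this in asyncio terms. This is a sketch under assumptions, not Twisted's or asyncio's implementation: the names ReconnectingFactory and ReconnectingWrapper, the connect callable (the "instructions" for making an outbound connection) and the max_attempts cap are all invented for illustration.

```python
import asyncio


class ReconnectingWrapper(asyncio.Protocol):
    """PW: delegates to the wrapped protocol P, but hooks connection_lost."""

    def __init__(self, inner, on_lost):
        self.inner = inner
        self.on_lost = on_lost

    def connection_made(self, transport):
        self.inner.connection_made(transport)

    def data_received(self, data):
        self.inner.data_received(data)

    def eof_received(self):
        return self.inner.eof_received()

    def connection_lost(self, exc):
        self.inner.connection_lost(exc)
        self.on_lost(exc)


class ReconnectingFactory:
    """PFW: wraps a protocol factory PF plus instructions for connecting."""

    def __init__(self, protocol_factory, connect, max_attempts=3):
        self.protocol_factory = protocol_factory
        self.connect = connect  # callable taking a factory, returns a coroutine
        self.max_attempts = max_attempts
        self.attempts = 0
        self.finished = asyncio.get_running_loop().create_future()

    def __call__(self):
        self.attempts += 1
        return ReconnectingWrapper(self.protocol_factory(), self._lost)

    def _lost(self, exc):
        if self.attempts >= self.max_attempts:
            self.finished.set_result(self.attempts)
        else:
            # A fresh protocol P is built for every new connection.
            self._task = asyncio.ensure_future(self.connect(self))


async def demo():
    # Local server that drops every client immediately.
    server = await asyncio.start_server(
        lambda reader, writer: writer.close(), "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]
    loop = asyncio.get_running_loop()
    created = []

    def make_protocol():
        created.append(asyncio.Protocol())
        return created[-1]

    factory = ReconnectingFactory(
        make_protocol, lambda f: loop.create_connection(f, host, port))
    await factory.connect(factory)
    await asyncio.wait_for(factory.finished, timeout=5)
    server.close()
    return len(created), len(set(map(id, created)))
```

Here the application protocol never sees connection parameters; only the connect callable does, and each connection gets its own protocol instance.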

> So it *is* transport-specific already.

No, it's protocol-specific.  If it were "transport-specific" that would mean that it would make sense to reconnect all TCP connections, for example.


> Actually, the general pattern may be to call self.transport.reconnect()
> from the protocol's connection_lost().

If you need to do something in connection_lost anyway, then why not just call create_connection again yourself?  The whole point of having transport-level support for reconnection, it seems to me, would be to implement application-independent reconnection policies.

(Surely you're not suggesting that you *re-use* the protocol instance?  That would be like re-using a stack frame for a subsequent call.)

In the simple case of TCP with happy eyeballs, and your original connection was to a host name and not an IP address, there are a multitude of things that "reconnect()" could mean.  Some examples:

  • start doing exactly what create_connection did in the first place, i.e. talk to your DNS servers, issue some parallel connections, return the first one.
  • do the second part of what create_connection does; assume the hostname resolution is going to return the same thing (perhaps you're within the TTL window on the records you'd previously resolved, in which case it's very nearly obligated to) but issue a bunch of connections
  • do only the TCP-level reconnecting, assuming the exact same IP address and port number.

Then if you imagine you had a connection to an SSH server, you could do all of those things with the host name and then also make some decisions about whether to make a new connection at all, or to simply re-issue the command over the existing SSH channel.

'transport.reconnect' is not a broad enough interface to communicate all of this stuff.  As the SSH example illustrates, you can't just add some flags, because every new transport implementation would add a slew of its own.  Plus, you kinda want to communicate these instructions to the application in some way in advance, and you don't want to do all this work for *re*connection in a way that can't be used for connecting in the first place.

To sum up: you cannot "re-connect" a transport.  Once the stream has ended, it's ended, and you have to make a whole ton of decisions about how to create one that is similar but will always, ultimately, be distinct.

-glyph

Antoine Pitrou

Dec 18, 2013, 8:43:19 AM
to python...@googlegroups.com
On Wed, 18 Dec 2013 03:16:47 -0800
Glyph <gl...@twistedmatrix.com> wrote:
> > So it *is* transport-specific already.
>
> No, it's protocol-specific. If it were "transport-specific" that would mean that it would make sense to reconnect all TCP connections, for example.

You misunderstood what I said. It is transport-specific because you
must examine the specific reasons for disconnection before deciding
whether to reconnect or not.

> > Actually, the general pattern may be to call self.transport.reconnect()
> > from the protocol's connection_lost().
>
> If you need to do something in connection_lost anyway, then why not just call create_connection again yourself?

Because you don't know the connection parameters from inside the
protocol. It's irrational to ask the protocol to know about the
connection parameters when the whole point of protocols is to be
decoupled from transport and connection characteristics.

Regards

Antoine.


Glyph

Dec 18, 2013, 11:02:48 PM
to Antoine Pitrou, python-tulip
On Dec 18, 2013, at 5:43 AM, Antoine Pitrou <soli...@pitrou.net> wrote:

> > If you need to do something in connection_lost anyway, then why not just call create_connection again yourself?
>
> Because you don't know the connection parameters from inside the
> protocol. It's irrational to ask the protocol to know about the
> connection parameters when the whole point of protocols is to be
> decoupled from transport and connection characteristics.

But, as you've said, the protocol needs to examine the reasons for disconnection anyway, and different disconnection reasons may result in different connection parameters.  And as I explained previously, even if you have *some* information about your current transport, there's information about how to re-create that transport ("connection parameters", as you say) which can vary by protocol even if the transport itself remembers everything.

So if you have a protocol that needs to know how to re-establish its connection, just write a protocol that knows what "re-establish" means.

-glyph

Antoine Pitrou

Dec 19, 2013, 2:08:31 AM
to python...@googlegroups.com
On Wed, 18 Dec 2013 20:02:48 -0800
Glyph <gl...@twistedmatrix.com> wrote:
> On Dec 18, 2013, at 5:43 AM, Antoine Pitrou <soli...@pitrou.net> wrote:
>
> >> If you need to do something in connection_lost anyway, then why not just call create_connection again yourself?
> >
> > Because you don't know the connection parameters from inside the
> > protocol. It's irrational to ask the protocol to know about the
> > connection parameters when the whole point of protocols is to be
> > decoupled from transport and connection characteristics.
>
> But, as you've said, the protocol needs to examine the reasons for disconnection anyway, and different disconnection reasons may result in different connection parameters. And as I explained previously, even if you have *some* information about your current transport, there's information about how to re-create that transport ("connection parameters", as you say) which can vary by protocol even if the transport itself remembers everything.

Uh, no. The protocol needs to decide *whether* to reconnect, it
certainly shouldn't know *how* to reconnect (other than say
`self.transport.reconnect()`). Otherwise the protocol / transport
abstraction barrier is broken.

Regards

Antoine.


Glyph

Dec 20, 2013, 4:06:29 AM
to Antoine Pitrou, python-tulip
You're right that there should be an abstraction barrier here. In Twisted, the abstraction here is called an "endpoint". However, while this abstraction should not have to be part of every protocol, it is also not part of every transport. A transport is an object that can move some bytes around. "Re-connecting" is not necessarily part of that contract, even for a "client" transport. The fact that TCP client superficially makes it possible to do this is misleading.

However, I guess you're not buying my previous argument about TCP or SSH reconnection though, so let's talk about something entirely different, not based on the same substrate :).

Consider a serial port. You "connect" to a serial transport because control over a serial port is exclusive. However, if the connection is lost (i.e. the serial device is removed) there's no sensible thing for "reconnect" to mean. Re-establishing the connection may well mean waiting for a USB hotplug event to tell you that a *new* serial device is available, and at that point you still have to figure out if it's the right kind somehow. So unlike with TCP where it means one of N ambiguous things, in this case it means *zero* things unless you have additional application-specific logic for auditing metadata about available serial devices.

Is that a more compelling example?

-glyph

Antoine Pitrou

Dec 20, 2013, 9:58:40 AM
to python...@googlegroups.com
On Fri, 20 Dec 2013 01:06:29 -0800
Glyph <gl...@twistedmatrix.com> wrote:
>
> You're right that there should be an abstraction barrier here. In Twisted, the abstraction here is called an "endpoint". However, while this abstraction should not have to be part of every protocol, it is also not part of every transport. A transport is an object that can move some bytes around. "Re-connecting" is not necessarily part of that contract, even for a "client" transport. The fact that TCP client superficially makes it possible to do this is misleading.

This is true, but then reconnect() can simply raise
NotImplementedError.

> Consider a serial port. You "connect" to a serial transport because control over a serial port is exclusive. However, if the connection is lost (i.e. the serial device is removed) there's no sensible thing for "reconnect" to mean.

Same answer :-)

Regards

Antoine.


Jonathan Slenders

Dec 26, 2013, 7:27:26 AM
to python...@googlegroups.com
Hi all,

I implemented auto reconnect for the redis library:

It happens here:

There is a pubsub example now that reconnects successfully:

Cheers,
Jonathan

Guido van Rossum

Dec 26, 2013, 4:14:55 PM
to Jonathan Slenders, python-tulip
If you look in the Tulip examples directory there's a reconnecting
client for a custom protocol in cacheclt.py.

FWIW I expect that the dynamic creation of protocol classes in Jonathan's
example is wasteful -- a Python class object costs a lot of resources.
A plain old un-nested class and a lambda would seem more appropriate.

Stefan Scherfke

Sep 18, 2015, 6:49:27 AM
to python-tulip, jonathan...@gmail.com, gu...@python.org
Hi,

I want to implement automatic reconnect for a request-reply channel
built with asyncio.  The challenge here (in contrast to what was
discussed before) is that the server is stateful and needs to detect
whether a client is reconnecting.

Before I discuss my ideas and questions, I’d like to briefly explain the
channel’s architecture (source code):

The UI is a Channel object that a client creates when it connects to
a server and that a server creates for each new connection.

A client sends a request to the server and waits for a reply with
rep = await channel.send('ohai')

The server side channel waits for requests and replies to them with
req = await channel.recv()
req.reply('cya')

=> Complete example

The Channel object wraps a ChannelProtocol which is very similar to the
built-in asyncio streaming protocol (except that my messages are not
"lines" but length-terminated JSON objects).  I don’t use custom
transport implementations.
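
For readers who have not seen this kind of framing, here is a minimal sketch. The actual wire format of the channel is not shown in this thread, so the 4-byte big-endian length header and the names encode and Decoder are assumptions made purely for illustration:

```python
import json
import struct

HEADER = struct.Struct("!L")  # assumed: 4-byte big-endian length prefix


def encode(obj):
    """Serialize obj as JSON and prepend the payload length."""
    payload = json.dumps(obj).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


class Decoder:
    """Incremental decoder, fed from something like data_received()."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data):
        """Add received bytes and return every complete message."""
        self._buffer += data
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # message is incomplete; wait for more bytes
            payload = self._buffer[HEADER.size:end].decode("utf-8")
            messages.append(json.loads(payload))
            del self._buffer[:end]
        return messages
```

With a format like this, a connection that drops mid-message leaves a partial frame in the decoder's buffer, which is the situation the questions further down are about.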

When a client reconnects, both the client's and the server's channel
instances need to stay the same.  Creating a new protocol instance (if
needed) after a reconnect should be no problem, because it is not
exposed to the user.

I think the only way for the server to detect if a new client connects
or an existing client reconnects is that the client uses some kind of
session ID.  The server associates a session ID with a channel.  It
reuses channel instances for known SIDs and creates new channels otherwise.

This means that a Channel will always send a greeting message to the
server after it has successfully connected.  It also needs to send
a goodbye message when it disconnects on purpose (to tell the server
to close the session/channel).

Since every message already has a message type (REQUEST, RESULT,
EXCEPTION), it wouldn’t hurt to add two more types (SESSION_START,
SESSION_STOP).
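
The server-side bookkeeping this implies could be sketched as follows. Everything here is hypothetical: the Channel stand-in, the SessionRegistry name and the use of uuid4 for session IDs are assumptions for the example, not the actual channel implementation.

```python
import uuid


def new_session_id():
    """Client side: pick an ID once and keep it across reconnects."""
    return uuid.uuid4().hex


class Channel:
    """Stand-in for the channel object described above (hypothetical)."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.protocol = None  # replaced on every (re)connect


class SessionRegistry:
    """Server side: maps session IDs to long-lived channels."""

    def __init__(self):
        self._channels = {}

    def session_start(self, session_id, protocol):
        """Handle SESSION_START: reuse a known channel or create one."""
        channel = self._channels.get(session_id)
        reconnected = channel is not None
        if channel is None:
            channel = self._channels[session_id] = Channel(session_id)
        channel.protocol = protocol  # the protocol instance may be new
        return channel, reconnected

    def session_stop(self, session_id):
        """Handle SESSION_STOP: the client left on purpose."""
        self._channels.pop(session_id, None)
```

A real server would likely also expire sessions whose client never comes back, since a crashed client cannot send SESSION_STOP.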

So my first question is: Does this sound reasonable?

What I described above should work well when the connection drops
*before* a request is made or *between* a request and its reply.  But
what happens if a disconnect happens while a request or reply is being
transmitted (after the user called channel.send() but before the server
received the complete message)?

Is there a way to detect this?  Or should the client use a timeout and
resend the same message (with the same message ID) after a timeout
occurs?  The server could then decide whether to execute the message
(when the original message got lost on the way from the client to the
server) or just re-send its reply (when the reply got lost on its way
from the server back to the client).
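
The server half of that idea, sketched with a plain dictionary as the reply cache. The RequestHandler name and the execute callable are illustrative assumptions; requests that are still executing when the duplicate arrives are not handled here:

```python
class RequestHandler:
    """Server side: run each message ID at most once, replay lost replies."""

    def __init__(self, execute):
        self._execute = execute
        self._replies = {}  # message ID -> reply already produced

    def handle(self, message_id, payload):
        if message_id not in self._replies:
            # Unknown ID: the request is new, or its first copy was lost.
            self._replies[message_id] = self._execute(payload)
        # Known ID: the reply was lost, so the cached one is sent again.
        return self._replies[message_id]
```

In practice the cache would need a bound or an acknowledgement from the client, otherwise it grows for the lifetime of the session.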

And how could I test this?  It's relatively easy to provoke disconnects
before a request or between a request and reply, but I have no idea how
to break the connection exactly when a message is being transferred.
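
One possible way to provoke this in a test, sketched under assumptions: instead of trying to cut the connection at the right instant, the test client deliberately writes only part of a frame and then closes. The 4-byte length header is the same assumed format as in the framing sketch, and the stream-based server stands in for the real ChannelProtocol:

```python
import asyncio
import struct


async def demo():
    outcome = asyncio.get_running_loop().create_future()

    async def handle(reader, writer):
        try:
            (length,) = struct.unpack("!L", await reader.readexactly(4))
            await reader.readexactly(length)
            outcome.set_result("complete")
        except (asyncio.IncompleteReadError, ConnectionError) as exc:
            # The peer went away in the middle of a message.
            outcome.set_result(type(exc).__name__)
        finally:
            writer.close()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]

    reader, writer = await asyncio.open_connection(host, port)
    frame = struct.pack("!L", 100) + b"x" * 100
    writer.write(frame[:10])  # header plus a few payload bytes only
    await writer.drain()
    writer.close()  # disconnect before the rest is sent
    result = await asyncio.wait_for(outcome, timeout=5)
    server.close()
    return result


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The same trick should work in the other direction by having a test server send a truncated reply.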

Cheers,
Stefan