Connection timeout in create_connection


Jonathan Slenders

Apr 3, 2014, 5:34:05 AM4/3/14
to python...@googlegroups.com
Hi all,

For asyncio_redis, tumb1er posted an issue reporting that create_connection can block forever.

First he proposed wrapping create_connection in wait_for with a timeout=1 parameter, but that would leave open the file descriptor created inside BaseEventLoop.create_connection(). He therefore proposes creating the socket ourselves using sock_connect, and, if that fails due to a timeout, calling sock.close() manually.


I'm not sure whether I understand this correctly. How is it possible that loop.create_connection blocks? Doesn't it just fail after a while if the host is not reachable? How do other libraries handle this? I don't want to copy-paste much of the asyncio code to solve this issue, especially because it's not directly related to asyncio_redis, but something that would apply to other libraries as well.
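For reference, here is a minimal sketch of the workaround as I understand it (the helper name and the 1-second default are illustrative, not actual asyncio_redis code; modern async/await syntax is used here for readability, where tulip-era code would use @asyncio.coroutine and yield from):

```python
import asyncio
import socket

async def connect_with_timeout(host, port, timeout=1.0):
    # Create the socket ourselves so that we control its lifetime.
    loop = asyncio.get_running_loop()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)  # sock_connect requires a non-blocking socket
    try:
        await asyncio.wait_for(loop.sock_connect(sock, (host, port)), timeout)
    except (asyncio.TimeoutError, OSError):
        sock.close()  # without this, the file descriptor would leak
        raise
    return sock
```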


Jonathan.

Andrew Svetlov

Apr 3, 2014, 5:47:43 AM4/3/14
to Jonathan Slenders, python...@googlegroups.com
I think technically you can wait a long time if, for example, a firewall
eats the TCP packets. The code will receive an error in the end, but
only after a long timeout.
--
Thanks,
Andrew Svetlov

Victor Stinner

Apr 3, 2014, 6:10:56 AM4/3/14
to Jonathan Slenders, python-tulip
Hi,

2014-04-03 11:34 GMT+02:00 Jonathan Slenders <jonathan...@gmail.com>:
> First he proposed to wrap create_connection into wait_for with a timeout=1
> parameter, but that would leave the file descriptor open that was created in
> BaseEventLoop.create_connection(). Therefore he proposes to create the
> socket ourself, using sock_connect , and if that fails due to a timeout,
> call sock.close manually.

When you use wait_for(), the coroutine gets a CancelledError and so
can clean up its data (close the socket).

If create_connection() doesn't close the socket on error, it's a bug.
Please open an issue.
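To make this concrete, here is a small self-contained illustration (the Resource class is a stand-in for the socket): when wait_for() times out, the wrapped coroutine is cancelled at its await point and can release what it acquired before re-raising.

```python
import asyncio

class Resource:
    """Stand-in for a socket; only tracks whether close() was called."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

async def open_resource(res, delay):
    try:
        await asyncio.sleep(delay)  # stands in for a slow connect
    except asyncio.CancelledError:
        res.close()  # clean up before propagating the cancellation
        raise
    return res

async def main():
    res = Resource()
    try:
        await asyncio.wait_for(open_resource(res, 10), timeout=0.01)
    except asyncio.TimeoutError:
        pass
    return res.closed

print(asyncio.run(main()))  # prints True: the resource was released
```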

> How is it possible that loop.create_connection blocks?

The slowest part is probably the DNS resolution. By default,
getaddrinfo() is called in a thread.

> Doesn't it just fail after a while if the host is not reachable?

Network operations can be slow, but create_connection() should not
"block": the coroutine should run in the background, like any "async"
task.

It would help to identify exactly which Python line hangs.

Victor

Jonathan Slenders

Apr 3, 2014, 7:16:43 AM4/3/14
to python...@googlegroups.com, Jonathan Slenders
By blocking, I meant "not progressing anymore", not that it blocks the thread.


Should we catch CancelledError here at this line?

Guido van Rossum

Apr 3, 2014, 1:33:24 PM4/3/14
to Jonathan Slenders, python-tulip
On Thu, Apr 3, 2014 at 4:16 AM, Jonathan Slenders <jonathan...@gmail.com> wrote:
> By blocking, I meant "not progressing anymore", not that it blocks the thread.
>
> Should we catch CancelledError here at this line?


Or use a finally block. There are probably some other places where we don't clean up when unexpected exceptions strike. This was a good catch, Jonathan! We should make sure this is fixed in as many places as possible before 3.4.1 goes out.
 




--
--Guido van Rossum (python.org/~guido)

Jonathan Slenders

Apr 5, 2014, 4:24:31 AM4/5/14
to python...@googlegroups.com, Jonathan Slenders, gu...@python.org
How much time do we have for 3.4.1?

I think in this case it should catch CancelledError, not use a finally block. On success, we don't want to close the socket when the coroutine terminates. Do we?
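That said, a finally block can still work if it is guarded by a success flag, so the socket is only closed on failure. A sketch, with an illustrative helper name:

```python
import asyncio
import socket

async def connect_cleanup_on_failure(sock, addr):
    # Close the socket on any failure (including the CancelledError
    # injected by wait_for), but never on success: the caller owns it then.
    loop = asyncio.get_running_loop()
    sock.setblocking(False)
    connected = False
    try:
        await loop.sock_connect(sock, addr)
        connected = True
        return sock
    finally:
        if not connected:
            sock.close()
```

The alternative, catching CancelledError and closing before re-raising, is equivalent; the flag is just what makes the finally variant safe on the success path.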

Guido van Rossum

Apr 5, 2014, 12:49:39 PM4/5/14
to Jonathan Slenders, python-tulip
I believe Victor already put a patch up for review, although I cannot find the issue right now. IIRC I even reviewed it -- but no search keywords I try show it, only the (AFAIK derived) http://bugs.python.org/issue21155

Jonathan Slenders

Apr 16, 2014, 5:20:00 AM4/16/14
to python...@googlegroups.com, Jonathan Slenders, gu...@python.org
Just a thought... what if the event loop used weakrefs for sockets? Would we ever have to handle CancelledError then?
Would the socket be closed automatically by the garbage collector once there is no reference to the transport anymore, and because of that no reference to the socket anymore either?

I notice in my own projects that cancellation can be hard to get right, and it's too easy to forget about, leading to weird bugs.

Glyph

Apr 17, 2014, 1:27:13 AM4/17/14
to Jonathan Slenders, python...@googlegroups.com, Guido van Rossum

On Apr 16, 2014, at 5:20 AM, Jonathan Slenders <jonathan...@gmail.com> wrote:

> Just a thought... what if the event loop used weakrefs for sockets? Would we ever have to handle CancelledError then?
> Would the socket be closed automatically by the garbage collector once there is no reference to the transport anymore, and because of that no reference to the socket anymore either?

Relying on this is a recipe for disaster.  It's a great way to get mysterious unbounded resource leaks, mysterious GC-preventing circular references, and crashes when you run out of file descriptors.

All of this is two or three times as bad if you care about running on PyPy, where garbage collection semantics mean files may not be closed for much longer if there isn't a lot of memory pressure.

-glyph

Jonathan Slenders

Apr 18, 2014, 5:05:24 AM4/18/14
to python...@googlegroups.com, Jonathan Slenders, Guido van Rossum
Okay, thanks, definitely true.