Weak UDP support in asyncio

Victor Stinner

Feb 26, 2015, 8:37:55 AM

to python-tulip

Hi,

I'm trying to port a project using eventlet to asyncio, but I don't
know how to use UDP. I like asyncio.open_connection() because it makes
it possible to write sequential code using yield from.

UDP doesn't seem to be well supported in asyncio today :-( For
example, there are loop.sock_recv() and loop.sock_sendall() methods for
stream protocols (TCP), but no loop.sock_recvfrom() or
loop.sock_sendto() for datagram protocols (UDP).
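
As a rough illustration of what such a sock_recvfrom() helper could
look like on a selector-based event loop, here is a minimal sketch
built on loop.add_reader(). The function name and signature are
assumptions made for this example, not an asyncio API described in
this thread, and add_reader() is not available on the Windows proactor
event loop. The socket is assumed to be non-blocking.

```python
import asyncio
import socket


def sock_recvfrom(loop, sock, bufsize):
    """Hypothetical helper: return a Future resolving to (data, addr).

    `sock` is assumed to be a non-blocking datagram socket.
    """
    fut = loop.create_future()

    def _on_readable():
        if fut.done():
            # Already resolved or cancelled: do not consume a datagram.
            return
        try:
            result = sock.recvfrom(bufsize)
        except (BlockingIOError, InterruptedError):
            # Spurious wakeup: stay registered and wait again.
            return
        except OSError as exc:
            fut.set_exception(exc)
        else:
            fut.set_result(result)

    loop.add_reader(sock, _on_readable)
    # Stop watching the socket once the future completes or is cancelled.
    fut.add_done_callback(lambda _fut: loop.remove_reader(sock))
    return fut
```

Inside a coroutine this would be used as
"data, addr = await sock_recvfrom(loop, sock, 64 * 1024)", where await
is the newer spelling of yield from.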

I know that UDP is connectionless and gives almost no guarantee that
data is delivered, but sending a datagram may take time. sendto() can
temporarily fail with EWOULDBLOCK. We have to buffer datagrams.

I see different options to write sequential UDP code:

- use a protocol which writes into a queue, and read from the queue
(no change required in asyncio)

- add sock_recvfrom() and sock_sendto() methods to the event loop (it
would be better to implement them in asyncio to have portable code)

- add stream classes like asyncio.open_connection() for TCP (may be
implemented in a third party library/code, until the idea is accepted
into asyncio)
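
The first option above (a protocol that writes into a queue) can be
sketched roughly as follows. This is only an illustration under stated
assumptions: the class name QueuedDatagramProtocol, its recvfrom()
method, the serve() coroutine and the port number are all made up for
this example, and the queue is unbounded. It uses async/await, the
newer spelling of yield from.

```python
import asyncio


class QueuedDatagramProtocol(asyncio.DatagramProtocol):
    """Hypothetical helper: buffers each datagram as a (data, addr) tuple."""

    def __init__(self):
        # FIFO of (data, addr) tuples; datagrams are never concatenated.
        self._queue = asyncio.Queue()
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        # Called by the event loop once per received datagram.
        self._queue.put_nowait((data, addr))

    async def recvfrom(self):
        # Pop the oldest (data, addr) tuple, waiting if none is buffered.
        return await self._queue.get()


async def serve(host="127.0.0.1", port=9999):
    # The address and port are arbitrary values for the example.
    loop = asyncio.get_running_loop()
    transport, protocol = await loop.create_datagram_endpoint(
        QueuedDatagramProtocol, local_addr=(host, port))
    try:
        while True:
            data, addr = await protocol.recvfrom()
            print("received", len(data), "bytes from", addr)
    finally:
        transport.close()
```

With such a helper the receive loop keeps the sequential shape of the
eventlet code, while the protocol callbacks stay hidden in the helper.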

For a UDP client, the stream API can be as simple as
reader.read(data) and writer.write(data), the address can be implicit
(it's always the same).

For a UDP server, it's more complex. Sending a datagram requires an
explicit address. Receiving a datagram must also return the sender's
address, so the StreamReader.read() API doesn't fit.

When the listening socket becomes readable, we call recvfrom() which
returns (data, addr). Does it make sense to concatenate two datagrams
from the same client? I don't think so. So we should store (data,
addr) in a FIFO container and provide an API to pop one tuple, called
"recvfrom" for example.

New question: how should we specify the size parameter of recvfrom()?

recvfrom() has a size parameter. Currently, the default size is 256
KB. For TCP, it doesn't really matter since we can just buffer small
or large packets: they are all concatenated, and the consumer
specifies how many bytes are required (read(n)).

It's possible to change the max_size attribute of a
_SelectorDatagramTransport, but this attribute is currently not
documented and it has an issue. When a transport is created, the
transport immediately starts to listen for read events. If the
max_size attribute is modified too late, the transport may already
have called recv() or recvfrom() with the old max_size value.

The max_size attribute is specific to selectors based on the select()
API. On Windows, a proactor event loop doesn't have such a default
size, since the transport only starts reading when a recv() method is
explicitly called.

The code using eventlet (socket is eventlet.green.socket, not socket
from the Python stdlib):

def start_udp(self):
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    udp.bind((cfg.CONF.collector.udp_address,
              cfg.CONF.collector.udp_port))

    self.udp_run = True
    while self.udp_run:
        data, source = udp.recvfrom(64 * 1024)
        sample = msgpack.loads(data, encoding='utf-8')
        self.dispatcher_manager.map_method('record_metering_data',
                                           sample)

Victor

Antoine Pitrou

Feb 26, 2015, 10:50:06 AM

to python...@googlegroups.com

On Thu, 26 Feb 2015 14:37:34 +0100
Victor Stinner <victor....@gmail.com>
wrote:

> I see different options to write sequential UDP code:

The thing is, "sequential UDP code" doesn't really mean anything...
What is your use case?

> - add stream classes like asyncio.open_connection() for TCP (may be

UDP is not a stream protocol...

Regards

Antoine.

Victor Stinner

Feb 26, 2015, 10:53:13 AM

to Antoine Pitrou, python-tulip

2015-02-26 16:49 GMT+01:00 Antoine Pitrou <soli...@pitrou.net>:
> What is your use case?

See the code at the end of my message:

while True:
    data, addr = sock.recvfrom()
    ...

I expect a similar syntax using asyncio:

while True:
    data, addr = yield from sock.recvfrom()
    ...

Victor

Antoine Pitrou

Feb 26, 2015, 10:55:11 AM

to python...@googlegroups.com

On Thu, 26 Feb 2015 16:52:52 +0100
Victor Stinner <victor....@gmail.com>
wrote:

That's not a use case, just some code :-)

What is the protocol you are handling?

Regards

Antoine.

Victor Stinner

Feb 26, 2015, 10:59:26 AM

to Antoine Pitrou, python-tulip

2015-02-26 16:55 GMT+01:00 Antoine Pitrou <soli...@pitrou.net>:
> That's not a use case, just some code :-)
>
> What is the protocol you are handling?

It's not a known protocol. It's just data serialized by msgpack. The
code comes from the Ceilometer project of OpenStack.

The server only receives data, it doesn't send UDP packets.

Victor

Antoine Pitrou

Feb 26, 2015, 11:03:01 AM

to python...@googlegroups.com

On Thu, 26 Feb 2015 16:59:05 +0100
Victor Stinner <victor....@gmail.com>
wrote:

Are you implementing the server? I think a "stream"-like functionality
would be mostly useful for clients. A server is better suited to the
traditional Protocol idiom IMHO.

In any case, in UDP you want the server to send a response to the
client (if only an "ACK"); otherwise the client doesn't know whether the
message was received or not.

On the client side, a "client" object would make sense to expose
sendto() and recvfrom(), I guess. You also want some timeout handling...
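
A "client" object of that kind might be sketched as below. This is a
minimal sketch under assumptions: ClientProtocol, DatagramClient and
open_datagram_client() are hypothetical names invented for the
example, and the timeout is handled with asyncio.wait_for(). It uses
async/await, the newer spelling of yield from.

```python
import asyncio


class ClientProtocol(asyncio.DatagramProtocol):
    """Hypothetical protocol that queues every reply datagram."""

    def __init__(self):
        self.replies = asyncio.Queue()

    def datagram_received(self, data, addr):
        self.replies.put_nowait((data, addr))


class DatagramClient:
    """Hypothetical wrapper exposing sendto() and recvfrom()."""

    def __init__(self, transport, protocol):
        self._transport = transport
        self._protocol = protocol

    def sendto(self, data):
        # The remote address is implicit: the endpoint was created
        # with remote_addr, so no address is passed here.
        self._transport.sendto(data)

    async def recvfrom(self, timeout):
        # Raises asyncio.TimeoutError if no reply arrives in time.
        return await asyncio.wait_for(self._protocol.replies.get(), timeout)

    def close(self):
        self._transport.close()


async def open_datagram_client(host, port):
    loop = asyncio.get_running_loop()
    transport, protocol = await loop.create_datagram_endpoint(
        ClientProtocol, remote_addr=(host, port))
    return DatagramClient(transport, protocol)
```

A caller would send a request with client.sendto(payload) and then
wait for the acknowledgement with
"data, addr = await client.recvfrom(timeout=2.0)", deciding for itself
whether to resend when the timeout expires.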

Regards

Antoine.

Victor Stinner

Feb 26, 2015, 11:35:31 AM

to Antoine Pitrou, python-tulip

2015-02-26 17:02 GMT+01:00 Antoine Pitrou <soli...@pitrou.net>:
> Are you implementing the server?

Yes.

> I think a "stream"-like functionality
> would be mostly useful for clients. A server is better suited to the
> traditional Protocol idiom IMHO.

I really hate the protocol API. It's too hard to use it for my little head.

I prefer regular question-answer code, what I call "sequential" code:
---
data = sock.recv(100)
...
sock.send(reply)
---

In asyncio, it's written something like:
---
data = yield from reader.read(100)
...
writer.write(reply)
yield from writer.drain() # optional
---

asyncio supports streams for the server side. See the "TCP echo server
using streams" example:
https://docs.python.org/dev/library/asyncio-stream.html#tcp-echo-server-using-streams

You can compare it to the protocol version:
https://docs.python.org/dev/library/asyncio-protocol.html#asyncio-tcp-echo-server-protocol

> In any case, in UDP you want the server to send a response to the
> client (if only an "ACK"); otherwise the client doesn't know whether the
> message was received or not.

In the specific case of Ceilometer: there is no ack, the server only
consumes input packets and dropped packets are ignored. It doesn't
really matter if a few packets are lost. For example, Ceilometer can
be used to store the CPU usage. It doesn't matter if you lose some
points.

Victor

Antoine Pitrou

Feb 26, 2015, 11:41:09 AM

to python...@googlegroups.com

On Thu, 26 Feb 2015 17:35:11 +0100
Victor Stinner <victor....@gmail.com>
wrote:

> > I think a "stream"-like functionality
> > would be mostly useful for clients. A server is better suited to the
> > traditional Protocol idiom IMHO.
>
> I really hate the protocol API. It's too hard to use it for my little head.

Perhaps you should get used to it :-)

Especially if your server only receives data, it's very simple: just
implement the data_received() method (or datagram_received(), I don't
remember).

(it's also probably faster, in case you care about performance ;-))
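
For a datagram endpoint the callback is datagram_received(data, addr).
A receive-only server in the Protocol style could look roughly like
the sketch below. CollectorProtocol, the handle_sample callback, the
port number and the one-hour sleep are assumptions made for the
example, not code from Ceilometer.

```python
import asyncio


class CollectorProtocol(asyncio.DatagramProtocol):
    """Hypothetical receive-only UDP server."""

    def __init__(self, handle_sample):
        # handle_sample(data, addr) is a plain callable supplied by the user.
        self.handle_sample = handle_sample

    def datagram_received(self, data, addr):
        # Called by the event loop once per datagram; no reply is sent.
        self.handle_sample(data, addr)

    def error_received(self, exc):
        print("UDP error:", exc)


async def main(host="127.0.0.1", port=4952):
    # The address and port are arbitrary values for the example.
    loop = asyncio.get_running_loop()
    transport, _protocol = await loop.create_datagram_endpoint(
        lambda: CollectorProtocol(lambda data, addr: print(addr, data)),
        local_addr=(host, port))
    try:
        await asyncio.sleep(3600)  # serve for one hour, then shut down
    finally:
        transport.close()
```

The whole server logic lives in one callback, which is the simplicity
argued for here; decoding the payload would go inside handle_sample.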

> In the specific case of Ceilometer: there is no ack, the server only
> consumes input packets and dropped packets are ignored. It doesn't
> really matter if a few packets are lost. For example, Ceilometer can
> be used to store the CPU usage. It doesn't matter if you lose some
> points.

I would disagree with that :-) If you start losing packets it probably
means you are in the middle of something interesting, so you definitely
want to have metrics for that particular moment.

Regards

Antoine.

Victor Stinner

Feb 26, 2015, 11:48:14 AM

to Antoine Pitrou, python-tulip

2015-02-26 17:40 GMT+01:00 Antoine Pitrou <soli...@pitrou.net>:
> Especially if your server only receives data, it's very simple: just
> implement the data_received() method (or datagram_received(), I don't
> remember).
>
> (it's also probably faster, in case you care about performance ;-))

It's already hard enough to explain asyncio to developers using
eventlet. I don't want to show them protocols.

It's already very hard to justify that the code must be modified to
add a few yield-from.

Why don't you want to give users the choice between the
transport/protocol API and the streams API?

Victor

Antoine Pitrou

Feb 26, 2015, 12:17:14 PM

to python...@googlegroups.com

On Thu, 26 Feb 2015 17:47:53 +0100
Victor Stinner <victor....@gmail.com>
wrote:

> It's already hard enough to explain asyncio to developers using
> eventlet. I don't want to show them protocols.
>
> It's already very hard to justify that the code must be modified to
> add a few yield-from.
>
> Why don't you want to give users the choice between the
> transport/protocol API and the streams API?

I am just explaining to you how to write a UDP server without a "yield
from" facility. You can write such a facility if you want. Frankly, I
have never found protocols difficult to understand: it's just
event-driven programming...

That said, if your UDP server only ever receives datagrams and never
sends them, and doesn't do anything complicated with them, then a
synchronous server is probably just as good (IMHO).

Regards

Antoine.

Guido van Rossum

Feb 26, 2015, 1:07:44 PM

to Antoine Pitrou, python-tulip

I didn't follow all of that, but sendto() exists on the selector loop. You should ignore errors from it, since it's just as likely that the kernel accepts the packet (so sendto() succeeds) but some other layer or router drops it. There's absolutely no point in buffering and retrying based on the error returned by sendto() -- most likely if it returns an error there's bad congestion somewhere and retrying will make things worse. If you don't want your users to have to write a Protocol themselves, write some helper library. You could also work directly with the socket if you really want to.
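
The "send and ignore errors" advice could be sketched as follows,
using the sendto() method of the datagram transport. This is only an
illustration: FireAndForgetProtocol, send_samples(), the logger name
and the port number are hypothetical, and errors are merely counted
and logged, never retried.

```python
import asyncio
import logging

log = logging.getLogger("udp-sender")


class FireAndForgetProtocol(asyncio.DatagramProtocol):
    """Hypothetical sender-side protocol: errors are noted, never retried."""

    def __init__(self):
        self.errors = 0

    def error_received(self, exc):
        # Log and move on: no buffering and no retry.
        self.errors += 1
        log.warning("UDP error ignored: %r", exc)


async def send_samples(samples, host="127.0.0.1", port=4952):
    # The address and port are arbitrary values for the example.
    loop = asyncio.get_running_loop()
    transport, protocol = await loop.create_datagram_endpoint(
        FireAndForgetProtocol, remote_addr=(host, port))
    try:
        for payload in samples:
            # sendto() is not a coroutine; it returns immediately.
            transport.sendto(payload)
    finally:
        transport.close()
    return protocol.errors
```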