At the 73rd IETF meeting in Minneapolis, I talked to Lars Eggert, who
has agreed to sponsor my asynchronous channels draft. This means
that, pending review, this document will be published as a standards
track RFC.
<http://tools.ietf.org/html/draft-thomson-beep-async>
If you have any feedback on this document, I'd appreciate it if you
could send that feedback to this list, or to me directly.
Regards,
Martin
The draft looks good to me. Really interesting work. Congratulations!
--
Francis Brosnan Blazquez <fra...@aspl.es>
Advanced Software Production Line, S.L.
I am unconvinced of the value of this draft, for three reasons:
a - This draft describes how to build into BEEP something you can
already do with BEEP (several ways).
I can write a trivial example using vortex that implements the
motivating use-case for those unconvinced.
b - BEEP adoption suffers from perceived complexity.
I don't personally think it is more complex than it has to be, but it
would be better to document how to use what it offers than to extend
it.
c - BEEP adoption suffers from non-interoperability of existing toolkits.
If implementors are already not writing interoperable toolkits when
implementing the existing RFC3080/3081, optional extensions are going
to make this situation worse.
Cheers,
Sam
Sorry. Couldn't resist implementing this.
Draft states following use case:
"""
Asynchronous applications require a protocol that is able to support
a large number of concurrent outstanding requests. The analogy of a
channel as a thread does not scale to the large number of threads
used in modern systems. Modern applications regularly have large
numbers of concurrent processing threads. Thus, a better way of
multiplexing large numbers of concurrent requests is required.
This document describes an BEEP feature, an extension to BEEP, that
enables the creation of an asynchronous channel. An asynchronous
channel is a channel where response ordering is not fixed to the
order of the requests sent by the client peer. An asynchronous
channel is identical to other channels, using unmodified framing;
only requests may be processed in parallel and responses may be sent
in any order.
"""
Note that Vortex delivers each MSG on a separate thread, exactly as
described above.
BEEP msgnos are ordered across a channel, of course, but application
msgs can be replied to whenever you want. The profile designer specifies
the content and meaning of the BEEP message payload.
Profile designers are not required to map BEEP internal msgno 1-to-1
to their profile's message identifiers. Often, this is convenient.
When it is not convenient, don't do it.
Server:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <vortex.h>

#define PROFILE "http://example.com/beep/delayed"
#define USAGE "usage: %s <port>\n"

/*
  MSG payload is msgid
  server waits a random amount of time, and sends reply.
  RPY payload is "msgid delay", where msgid is the msgid of the request being
  responded to, and delay is how long the server delayed before sending the
  reply.
  Actual code might put msgid in a MIME header, and the payload would be the
  request information. Or the request would be xml encoded, and include a msgid.
  There are lots of variations of this, none of them requiring extending the BEEP
  protocol.
*/

/* msgnos of received MSGs that have not been replied to yet, oldest first */
VortexQueue* msgno_queue;

void frame_received(VortexChannel* chan, VortexConnection* conn,
                    VortexFrame* frame, void* v)
{
    int msgno = vortex_frame_get_msgno(frame);
    int delay = (rand() % 30) + 1;
    const char* msgid = vortex_frame_get_payload(frame);
    char str[128];

    /* remember this msgno so that RPYs still go out in msgno order */
    vortex_queue_push(msgno_queue, INT_TO_PTR(msgno));
    printf("msgno %d msgid %s delay %d\n", msgno, msgid, delay);
    sleep(delay);

    /* reply on the oldest outstanding msgno, which need not be ours;
       the msgid in the payload tells the client which request this answers */
    msgno = PTR_TO_INT(vortex_queue_pop(msgno_queue));
    snprintf(str, sizeof(str), "%s %d", msgid, delay);
    vortex_channel_send_rpy(chan, str, strlen(str), msgno);
}

int main(int argc, char ** argv)
{
    if (argc < 2) {
        printf(USAGE, argv[0]);
        return 1;
    }
    msgno_queue = vortex_queue_new();
    vortex_init();
    vortex_log_enable(1);
    vortex_profiles_register(PROFILE, NULL, NULL, NULL, NULL,
                             frame_received, NULL);
    vortex_listener_new("0.0.0.0", argv[1], NULL, NULL);
    vortex_listener_wait();
    vortex_exit();
    return 0;
}
Client:
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <vortex.h>

#define PROFILE "http://example.com/beep/delayed"

/* application-level message id, carried in the MSG payload.
   Replies may be delivered on different threads, so real code
   would guard this counter; it is left bare to keep the demo short. */
int msgid = 0;

void send_msg(VortexChannel* channel)
{
    int ok = vortex_channel_send_msgv(channel, 0, "%d", ++msgid);
    assert(ok);
}

/* print each RPY ("msgid delay") and immediately send another request */
void on_frame(VortexChannel* channel, VortexConnection* conn,
              VortexFrame* frame, void* v)
{
    int msgno = vortex_frame_get_msgno(frame);
    const char* content = vortex_frame_get_payload(frame);
    printf("(msgno %d) %s\n", msgno, content);
    send_msg(channel);
}

int main (int argc, char ** argv)
{
    VortexConnection * connection = NULL;
    VortexChannel * channel = NULL;

    if (argc < 3) {
        fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
        return 1;
    }
    vortex_init ();
    connection = vortex_connection_new(argv[1], argv[2], NULL, NULL);
    if (!vortex_connection_is_ok(connection, false)) {
        fprintf(stderr, "Unable to connect remote server, error was: %s\n",
                vortex_connection_get_message(connection));
        return 1;
    }
    channel = vortex_channel_new(connection, 0,
                                 PROFILE,
                                 NULL, NULL, /* no close handling */
                                 on_frame, NULL,
                                 NULL, NULL /* no async channel creation */
                                 );
    if (channel == NULL) {
        fprintf(stderr, "Unable to create the channel..\n");
        return 1;
    }
    printf(".. send msg\n");
    /* five requests outstanding at once */
    send_msg(channel);
    send_msg(channel);
    send_msg(channel);
    send_msg(channel);
    send_msg(channel);
    {
        /* block until a key is pressed */
        char c;
        read(0, &c, 1);
    }
    vortex_exit ();
    return 0 ;
}
Client output:
% ./async-client localhost 3333
.. send msg
(msgno 0) 3 28
(msgno 1) 1 14
(msgno 2) 2 17
(msgno 3) 5 24
(msgno 4) 4 26
(msgno 5) 10 2
(msgno 6) 11 3
(msgno 7) 9 10
(msgno 8) 8 13
(msgno 9) 12 8
Note: BEEP msgno order is maintained; application msg order is asynchronous.
Cheers,
Sam
> BEEP msgnos are ordered across a channel, of course, but application
> msgs can be replied whenever you want. The profile designer specifies
> the content and meaning of beep message payload.
>
> Profile designers are not required to map BEEP internal msgno 1-to-1
> to their profile's message identifiers. Often, this is convenient.
> When it is not convenient, don't do it.
I can provide some additional real-world evidence to back up Sam's
claims.
The way Xgrid uses BEEP is to send an immediate empty RPY for each MSG
that is received. The RPY in this case acts as nothing more than an
acknowledgement that the MSG was received. In Xgrid, all messages,
whether they are requests, responses, or notifications, are mapped to
BEEP MSGs. When Xgrid wants to reply to a request, it just sends it
in another MSG. The body of each MSG indicates whether the message is
a request or a response, and each request includes a correlation ID
which is then also included in the corresponding response.
The bodies of requests Xgrid sends look something like this pseudo-xml:
<message>
<message-type>request</message-type>
<message-name>status</message-name>
<message-correlation-id>87</message-correlation-id>
<message-body>...</message-body>
</message>
It turns out that Xgrid actually responds to all messages in order, so
this technique isn't being used to allow for out-of-order responses,
even though it could be. In Xgrid's case, this technique is being
used to allow for notifications -- messages for which there is no
response. The use of the ANS-style replies is not suitable for Xgrid
because of the requirement that each ANS message have a unique ID,
which results in both the sender and receiver having to keep track of
every outstanding ANS ID. For long-running connections with millions
of notifications this puts pressure on memory and reduces performance.
So I have to agree with Sam that on the face of things this proposal
for a new asynchronous channel type seems unnecessary. Applications
do not need to use BEEP message-numbers, they can ignore that detail
of the BEEP implementation and just implement their own message
tracking through the message headers or bodies.
However, while we're on the topic of improving the BEEP protocol, if
it were up to me the one change I would consider making is to add a
new message type, the notification (NFN). This would be like a MSG,
except that the peer does not send a RPY. This might be a simpler
change that achieves the OP's goals. However, it really isn't very
important, because it is easy to send empty RPYs, and the advantage of
sending empty RPYs is that they act like an acknowledgement of
receipt. If you were using NFN style notification messages the only
acknowledgement you would get would be an eventual SEQ frame.
-David
I have a couple of responses to your message, as well as some comments
on the draft.
===
>> c - BEEP adoption suffers from non-interoperability of existing
>> toolkits.
>
> Feature negotiation ensures that this option cannot be used without
> agreement. I can't see how this work affects interoperability.
If I create a profile that requires the use of a new feature, then
that profile can not be implemented on a BEEP stack that does not
provide that feature, so that profile cannot interoperate with that
BEEP stack. Perhaps this isn't the classic definition of
interoperability, but it is something to consider.
However, upon further consideration, I do not see how a profile could
actually _require_ the use of the async feature, so the
interoperability argument is moot.
>> The way Xgrid uses BEEP is to send an immediate empty RPY for each
>> MSG
>> that is received.
>
> This is a valid choice for your application. This solution doesn't
> presume to prevent this sort of behaviour, only expand the choices
> available to application protocol designers.
My point is that the protocol may already be flexible enough to solve
the problem without inventing new features. I do not think expanding
choices for the sake of choices is a good idea. If there is already a
good way of doing something, we shouldn't add an additional way.
> (Note that empty RPYs are a redundant receipt indication; you get
> receipt indications at the TCP layer as well.)
I do not think the TCP layer can indicate that the remote
application has read all of the message's frames from the socket, can
it? That is what the empty RPY indicates, right?
===
Anyways, reading through the draft, I think one reason I am resistant
to this feature is that I do not fully understand when it would be better
than the alternatives. As I see it, the two main alternatives to
async channels are empty replies and multiple channels.
The consequence of empty replies is that it pushes the burden of
message correlation onto the application layer. I can appreciate that
this is undesirable for applications that don't already have a reason
to correlate messages. But this could be hidden from the application
at the toolkit level. Rather than inventing a new type of channel, we
could define some standard MIME headers that applications that require
async messaging should use. And toolkits could add features to
automatically add these headers and correlate replies with messages
using them. Toolkits that didn't provide these features automatically
could still interoperate with those that did, as long as the profile
implementor handled the message correlation manually.
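To make the MIME-header idea concrete, here is a minimal sketch in C of what a toolkit (or a profile implementor doing it by hand) might do. The header name X-Correlation-ID and the functions corr_wrap and corr_unwrap are made up for illustration; nothing in RFC 3080 or the draft defines them. The sketch also assumes the correlation header is the first and only header in the payload.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header name; not defined by RFC 3080 or by the draft. */
#define CORR_HEADER "X-Correlation-ID: "

/* Write "X-Correlation-ID: <id>\r\n\r\n<body>" into out.
 * Returns the payload length, or -1 if it does not fit. */
int corr_wrap(char *out, size_t out_size, unsigned long id, const char *body)
{
    int n = snprintf(out, out_size, CORR_HEADER "%lu\r\n\r\n%s", id, body);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/* Parse a payload produced by corr_wrap. On success, store the ID in *id,
 * point *body at the content after the blank line, and return 0.
 * Returns -1 if the header is missing or malformed. */
int corr_unwrap(const char *payload, unsigned long *id, const char **body)
{
    size_t hlen = strlen(CORR_HEADER);
    char *end;

    assert(payload && id && body);
    if (strncmp(payload, CORR_HEADER, hlen) != 0)
        return -1;
    *id = strtoul(payload + hlen, &end, 10);
    if (end == payload + hlen || strncmp(end, "\r\n\r\n", 4) != 0)
        return -1;
    *body = end + 4;
    return 0;
}
```

The sender would call corr_wrap before handing the payload to the toolkit's send call; the receiver would call corr_unwrap on each payload and use the ID to find the matching request.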
The other option is multiple channels. Sending messages on separate
channels has the asynchronous behavior you want, right? By using one
channel per message you can implement your profile on any BEEP stack.
The client can control the number of parallel requests it makes by
limiting the number of channels it opens, and the server can control
the number of parallel requests it has to handle by limiting the
number of channels it allows the client to open. Again, the toolkit
could provide features to automatically open and close channels for
each message that is sent, or keep channel pools open, so the
application still wouldn't have to deal with these details.
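As a rough illustration of the channel-pool idea, a toolkit could keep something like the following bookkeeping. This is only a sketch under my own assumptions: channel_pool, pool_acquire and pool_release are hypothetical names, a slot index stands in for an open channel, opening and closing the channels is left to the toolkit, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_MAX 8   /* illustrative upper bound on pooled channels */

/* One slot per open channel; busy[i] is 1 while a request is outstanding. */
typedef struct {
    int busy[POOL_MAX];
    int limit;        /* how many channels this peer allows itself */
} channel_pool;

void pool_init(channel_pool *p, int limit)
{
    assert(p && limit > 0 && limit <= POOL_MAX);
    p->limit = limit;
    for (int i = 0; i < POOL_MAX; i++)
        p->busy[i] = 0;
}

/* Returns a free slot for the next request, or -1 if all are in use. */
int pool_acquire(channel_pool *p)
{
    for (int i = 0; i < p->limit; i++) {
        if (!p->busy[i]) {
            p->busy[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Called once the reply on that channel has been received in full. */
void pool_release(channel_pool *p, int slot)
{
    assert(slot >= 0 && slot < p->limit);
    p->busy[slot] = 0;
}
```

When pool_acquire returns -1 the caller has reached its own limit on parallel requests and has to queue the message or wait for a release.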
In fact, the one-channel-per-message strategy provides even greater
asynchrony than async channels, because the frames of both the MSGs
and the RPYs on independent channels can be interleaved over the
socket. The async channels feature specifically excludes this: "An
asynchronous channel must still observe the rules in [RFC3080]
regarding segmented messages. Each message must be completed before
any other message can be sent on that same channel."
However, I am concerned about this statement: "Different "ANS"
messages that are sent in a one-to-many exchange may be interleaved
with responses to other "MSG" messages." This seems to be a bad
idea... is this saying that ANS and RPY frames can be interleaved on
async channels? Why?
I am especially concerned about the note following that statement: "It
is recommended that BEEP peers do not generate interleaved ANS
segments." Why is it recommended that you do not interleave ANS
segments? Isn't this a basic feature of BEEP? Is this statement meant
only in the context of async channels, or all channels?
Based on this analysis it looks like async channels provide a
compromise between empty RPYs, which put the burden of message
correlation onto the application, and multiple channels, which put
the burden of channel management onto the application. Both of the
burdens could be lessened by toolkits or frameworks, without changing
the BEEP wire protocol. But if you just want your server to be able
to answer out of order, but you don't want to correlate the replies
yourself, and you don't want the answers interleaved, then I see how
the async channel is the most efficient solution.
-David
This is far-fetched.
Vortex is acting correctly. It delivers the MSGs to the API serially
(to different threads).
BEEP places no requirements on what "process" means for a profile.
[reordered]
> Sam:
>> BEEP msgnos are ordered across a channel, of course, but application
>> msgs can be replied whenever you want.
>
> This is in direct contravention of the quoted requirement, above.
BEEP RPYs are passing over the wire in the RFC defined order.
I could write the code in the other toolkits I use, too (beepy,
swirl/beepcore-c, and beep4j).
> Of course, to operate in this mode, a client will have to implement a demux layer that redirects responses to the entity that originated the request. Without this layer, the BEEP stack would redirect responses to the wrong entity.
> In your example, the entity that makes request 3 would receive response 5 instead. This is less of a problem in C, but much more of a problem for an OO language implementation, c.f. beep4j.
> You've made a conscious decision to move the complexity to the application.
Yes, where it belongs.
Demonstrably trivial complexity (implementable in a few lines of C
code in my example; I don't believe it's harder with beep4j. We use it,
and it's a good toolkit).
The complexity of the code that implements application functionality
will dwarf these few lines.
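For anyone who wants to see what such a client-side demux layer might look like, here is a minimal sketch in plain C, with no toolkit calls. It assumes the "msgid delay" RPY payload from the example earlier in the thread. The names (demux_register, demux_on_reply, store_delay) and the fixed-size table are illustrative only, and there is no locking, which a multi-threaded client would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_PENDING 64   /* illustrative limit on outstanding requests */

typedef void (*reply_fn)(int msgid, const char *payload, void *user);

typedef struct {
    int      used;
    int      msgid;
    reply_fn fn;
    void    *user;
} pending;

static pending table[MAX_PENDING];

/* Remember who is waiting for the reply to msgid. Returns 0, or -1 if full. */
int demux_register(int msgid, reply_fn fn, void *user)
{
    assert(fn != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!table[i].used) {
            table[i].used = 1;
            table[i].msgid = msgid;
            table[i].fn = fn;
            table[i].user = user;
            return 0;
        }
    }
    return -1;
}

/* Route a "msgid delay" payload to whoever registered that msgid.
 * Returns 0 if delivered, -1 if unparsable or nobody is waiting. */
int demux_on_reply(const char *payload)
{
    int msgid;

    if (sscanf(payload, "%d", &msgid) != 1)
        return -1;
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (table[i].used && table[i].msgid == msgid) {
            table[i].used = 0;   /* one reply per request */
            table[i].fn(msgid, payload, table[i].user);
            return 0;
        }
    }
    return -1;
}

/* Example callback: stores the delay reported by the server in *user. */
void store_delay(int msgid, const char *payload, void *user)
{
    int id, delay;

    (void)msgid;
    if (sscanf(payload, "%d %d", &id, &delay) == 2)
        *(int *)user = delay;
}
```

The client would call demux_register just before sending each MSG and call demux_on_reply from its frame handler, so the entity that made request 3 gets response 3 regardless of which BEEP msgno carried it.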
This is what BEEP is designed for, to be a protocol to build
protocols, not a finished thing implementing everything you could
want.
> Sam:
>> c - BEEP adoption suffers from non-interoperability of existing toolkits.
>
> Feature negotiation ensures that this option cannot be used without agreement. I can't see how this work affects interoperability.
Some toolkits will have it, some won't. Profiles specified as needing
the extension will be implementable with some toolkits, not with
others.
Cheers,
Sam
Below, "message" means a *complete* MSG, RPY, ERR, or ANS, not a frame.
Interleaving of ANS frames provides a useful feature not commonly
implemented by toolkits other than beepcore-c.
AFAICT, several toolkits (beep4j, for example) started off only
working in terms of messages, for both send and receive.
This doesn't allow a peer to process data as it arrives in frames, and
can cause deadlock if the window fills. So, eventually they allow
frames to be received by applications as they arrive, perhaps
optionally (vortex will accumulate all frames into a complete message,
if requested, for example, before presenting them to an application).
It is quite useful to expose frames to applications on the sending
side, too. For example, it allows a single RPY to have an arbitrarily
large size, so an entire file could be transferred in a single RPY
(ANS, MSG, etc.) in block-size frames.
All the toolkit APIs since beepcore-c (that I've seen) force the
application to present the entire payload of a MSG, RPY, ANS, etc. as
a contiguous chunk of data in a single API call when sending.
Anyhow, RFC3080 describes MSG/ANS as a one-to-many exchange, and each
one of the entities on the "many" side is allowed to generate its ANS
as a series of frames. Without this capability, the entities would be
forced to provide complete ANSs serially, instead of in parallel.
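To show what sending a message in block-size frames means on the wire, here is a small sketch that splits one payload into RFC 3080 data frames, with the continuation indicator "*" on every frame but the last and "." on the last. The function names beep_format_frame and beep_split are mine and do not come from any toolkit. The sketch ignores SEQ window accounting, and it leaves out ANS frames, which carry an extra ansno field in the header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one data frame: "<type> <channel> <msgno> <more> <seqno> <size>\r\n",
 * then the payload octets, then the "END\r\n" trailer.
 * Returns the frame length in octets, or -1 if out is too small. */
int beep_format_frame(char *out, size_t out_size, const char *type,
                      int channel, int msgno, int more,
                      unsigned long seqno, const char *chunk, size_t size)
{
    int n = snprintf(out, out_size, "%s %d %d %c %lu %zu\r\n",
                     type, channel, msgno, more ? '*' : '.', seqno, size);
    if (n < 0 || (size_t)n + size + 5 >= out_size)
        return -1;
    memcpy(out + n, chunk, size);
    memcpy(out + n + size, "END\r\n", 6);   /* trailer plus terminating NUL */
    return n + (int)size + 5;
}

/* Write payload to `to` as frames of at most max_chunk octets each.
 * Returns the number of frames written, or -1 on error. */
int beep_split(FILE *to, const char *type, int channel, int msgno,
               unsigned long seqno, const char *payload, size_t len,
               size_t max_chunk)
{
    char frame[512];
    size_t off = 0;
    int frames = 0;

    assert(max_chunk > 0 && max_chunk <= 256);
    do {
        size_t size = len - off < max_chunk ? len - off : max_chunk;
        int more = off + size < len;        /* more frames follow this one */
        /* seqno counts payload octets on the channel, modulo 2^32 */
        int n = beep_format_frame(frame, sizeof frame, type, channel, msgno,
                                  more, (seqno + off) & 0xFFFFFFFFUL,
                                  payload + off, size);
        if (n < 0)
            return -1;
        fwrite(frame, 1, (size_t)n, to);
        off += size;
        frames++;
    } while (off < len);
    return frames;
}
```

A sender that exposes frames to the application could produce each chunk as the data becomes available, instead of needing the whole payload in memory before the first octet goes out.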
Sam
RFC 3080 only defines wire representation. Peer behaviour is defined
by a profile, not RFC3080.
Cheers,
Sam
Implementors are having enough trouble getting basic communication going!
Beepy and beep4j wouldn't communicate at all when we first tried:
beepy had broken SEQ handling, beep4j had broken MIME header
format parsing, and beepcore-c wouldn't talk to vortex. I haven't
tried the first two against the latter two yet. Hopefully it will go
well.
> Can you provide a use case?
But those gripes aside, I sure can. Both for an ANS, and just
generally for transferring files larger than should be manipulated
in-memory.
RFC3080: "in a one-to-many exchange, multiple answers may be
simultaneously in progress"
MSG: <get-all-movies actress="uma thurman">
ANS: a movie, type and name defined in MIME headers, with Uma Thurman in it
...
The MSG would kick into gear a heavy duty search of known movie
download sites, and each ANS would be a single movie, sent as it is
found in small chunks (movies are big!), possibly multiple ones would
be found at once and sent.
Anyhow, like async channels, it's a feature that you can do without,
you can build it on top of smaller message exchanges tied together by
identifiers that are not BEEP message numbers.
For example, we transfer large dynamically generated PDFs over BEEP.
With beepcore-c, as data is ready we would write out a frame of a RPY
(which might get further subdivided as needed to fit window size), but
instead we RPY with an "it's coming at you soon with this message-id",
and then send each chunk of data as it is available with a MSG (which
gets an empty RPY).
Mind you, we reply before we actually process the request to generate
and send the file.
Actually, we have a "thin" UI client (beep4j) that uses BEEP to talk
to its server (heavily modified beepy), and almost everything that
happens on the client involves a MSG being sent to the server,
followed by an immediate RPY <ok/>, and then the server starts doing
whatever processing is required for the requested action.
Would you really consider our use of BEEP a massive conformance breach
of an RFC3080 "process before sending RPY" restriction on the
implementation of profiles?
Cheers,
Sam