Reading fixed amount of data from non-blocking channel

Mark Janssen

Jul 5, 2007, 1:25:05 PM

While trying to implement an SCGI server in Tcl, I had some trouble
finding a clean way to use read in an event handler to read a specific
amount of data.

Some background: The SCGI protocol, when sending a request, first
sends the length of the whole request terminated by a ':'. After that
I would like to be able to read either the whole request or nothing at
all.

I first tried something like:

read $chan $length ; if {[fblocked $chan]} { # not the whole message}

but this doesn't work, because it appears [read] never sets fblocked
and just returns what's available up until then. It therefore seems
necessary to collect the read data until the whole request has been
received and store the intermediate read data for instance in a global
var.

This requires some bookkeeping and checks to determine if all the data
has been read and the responsibility to clean up the partially or
completely read data in case of errors or after handling the request.
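The bookkeeping described above could look roughly like the following. This is only a sketch, not the code from the wiki page: the names ::pending, expectBytes, readExact and the callback are made up for illustration, and it assumes Tcl 8.5 or later for {*}.

```tcl
# Hypothetical helper: arrange for $callback to be called with exactly
# $length bytes from $chan once they have all arrived.
proc expectBytes {chan length callback} {
    set ::pending($chan) ""
    fconfigure $chan -blocking 0 -translation binary
    fileevent $chan readable [list readExact $chan $length $callback]
}

# Readable handler: accumulate partial reads in the global array ::pending.
proc readExact {chan length callback} {
    # Ask only for the bytes that are still missing.
    set missing [expr {$length - [string length $::pending($chan)]}]
    append ::pending($chan) [read $chan $missing]

    if {[string length $::pending($chan)] == $length} {
        # Whole message received: clean up and hand it over.
        set data $::pending($chan)
        unset ::pending($chan)
        fileevent $chan readable {}
        {*}$callback $chan $data
    } elseif {[eof $chan]} {
        # Peer closed early: discard the partial data.
        unset ::pending($chan)
        close $chan
    }
}
```

The two unset calls are the cleanup responsibility mentioned above: if either is forgotten, the partial data stays in the global array after the channel is gone.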

What I eventually came up with was to pass the already read data to
the fileevent handler itself, by redefining the handler on each read
to include the newly read data (see http://wiki.tcl.tk/SCGI for the
result).
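A minimal sketch of that idea, for readers who do not want to follow the link. It is not copied from the wiki page and may differ from it; collect and the callback are hypothetical names, and {*} assumes Tcl 8.5 or later.

```tcl
# Readable handler that carries the data read so far as its last argument.
# Assumes $chan is already non-blocking and in binary translation.
proc collect {chan length callback {sofar ""}} {
    append sofar [read $chan [expr {$length - [string length $sofar]}]]

    if {[string length $sofar] == $length} {
        # Complete: stop listening and deliver the message.
        fileevent $chan readable {}
        {*}$callback $chan $sofar
    } elseif {[eof $chan]} {
        # Closing the channel also removes its handler, and with it
        # the partial data.
        close $chan
    } else {
        # Re-register the handler with the partial data baked in.
        fileevent $chan readable [list collect $chan $length $callback $sofar]
    }
}
```

It would be started with something like fileevent $chan readable [list collect $chan $length onMessage]. Since the partial data lives only in the handler script, there is no global state to clean up.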

I was wondering:
1) Is there a cleaner idiom to handle a scenario like this?
2) Why does [read] return what has been received already, instead of
returning nothing and setting fblocked, if the amount of data
requested is not available?

Note that even though point 2 is documented on the read man page, this
is not the behaviour one would expect after reading the fblocked man
page. If the channel had been blocking, the read call would have
blocked if insufficient data was available (and no EOF was detected).

Mark

Nick Hounsome

Jul 7, 2007, 11:12:29 AM

On Jul 5, 6:25 pm, Mark Janssen <mpc.jans...@gmail.com> wrote:
> While trying to implement an SCGI server in Tcl, I had some trouble
> finding a clean way to use read in an event handler to read a specific
> amount of data.

This really has nothing to do with Tcl. It's just the way that the
underlying sockets work, which in turn is based on the way that TCP
works.

Unfortunately, you can often get away without buffering everything up
if the messages are small, the sender writes the whole message in a
single write, and no gateway or router decides to split things up.
Hence you will see anything up to 90% of network code not handling
split messages properly. People see this code. It seems to work. They
copy it, and soon everyone starts to believe that recv/read on a
socket is supposed to give you what you asked for.

Blocking only affects what happens if there are 0 bytes to read.

Obviously Tcl COULD have done it differently, but it's always easiest
for people to pick up stuff if it is the same as whatever they are
already familiar with, which in the networking world means BSD
sockets.

Mark Janssen

Jul 7, 2007, 2:44:38 PM

I agree Tcl cannot provide a whole message if it has not been
delivered yet because of segmentation in the network. However, Tcl
puts the data in an internal buffer when it arrives and fires off a
readable event. So in that case it would be possible to hold the data
back until what was asked for is complete (which is essentially what a
non-blocking [gets] does for lines).
I can understand why [read] behaves the way it does, if only for
historical reasons. More serious is that the [fblocked] and [chan
pending] commands don't work as documented with [read] and are really
not useful when you want to use [read] to get your data. For the [chan
pending] case I have logged an issue in SF (1749581) which contains
some more info.
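For comparison, the non-blocking [gets] behaviour referred to above can be sketched as follows. onLine and ::lines are hypothetical names used only for this example; the channel is assumed to be in non-blocking mode.

```tcl
# Readable handler for line-oriented input on a non-blocking channel.
proc onLine {chan} {
    if {[gets $chan line] >= 0} {
        # A complete line was available.
        lappend ::lines $line
    } elseif {[eof $chan]} {
        close $chan
    } elseif {[fblocked $chan]} {
        # Only part of a line has arrived. Tcl keeps it in the channel
        # buffer, and the handler fires again when more data comes in.
    }
}
```

Here the partial line never reaches the script, so no bookkeeping is needed on the Tcl side. That is the behaviour I was hoping for from [read] with a byte count.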

Mark