
TIP #287: Add a Commands for Determining Size of Buffered Data


Fredderic

Dec 14, 2006, 3:05:56 AM
I remember a little while back there was a discussion on the DoSability
of [gets], and ways to plug the hole. This TIP being one of them...
I've looked through the TIP index, but didn't see anything matching my
query here;

As I understand it, TIP #287 allows a script to check on the state of
the incoming buffer, and to take action if it appears to be accumulating
too much data (1MB is a rather large URL).

My question is whether there's any plans for a method of [gets] DoS
prevention that doesn't require polling the channels periodically.
Something like an alert threshold, or upper limit on the line length
allowed for a channel? It's difficult to figure out a good poll
interval to set, at which to check all the channels to see if their
input buffers are being flooded. It's also something that a lot of
people wouldn't implement until after they discover that they needed
to, which is not really optimal.

Having a buffer/line controlling option there, say one that triggers an
overflow fileevent once the buffer reaches a certain limit, is much
easier to implement, far less obscure, and hence more likely to
actually be put into practise BEFORE the problem occurs.

At least, that's what I think. ;)


Fredderic

Stephan Kuhagen

Dec 14, 2006, 3:26:58 AM
Fredderic wrote:

> My question is whether there's any plans for a method of [gets] DoS
> prevention that doesn't require polling the channels periodically.
> Something like an alert threshold, or upper limit on the line length
> allowed for a channel?

How about a switch for [gets] that sets the maximum line length that is
allowed to be returned? That would be similar to [read numChars], but with
the convenient behaviour of [gets] in normal cases. Example:

catch {
gets -maxLength 1024 theLine
} error

If [gets] reads more than 1024 chars without finding a newline, it throws an
error. This would not break old code and there is no need to poll.

Regards
Stephan

Andreas Leitgeb

Dec 14, 2006, 8:08:14 AM
Stephan Kuhagen <nos...@domain.tld> wrote:
> Fredderic wrote:
>> My question is whether there's any plans for a method of [gets] DoS
>> prevention that doesn't require polling the channels periodically.
>> Something like an alert threshold, or upper limit on the line length
>> allowed for a channel?
> How about a switch for [gets] that sets the maximum line length that is
> allowed to be returned? That would be similar to [read numChars], but with
> the convenient behaviour of [gets] in normal cases. Example:

> catch {
> gets -maxLength 1024 theLine
> } error

I like the idea of throwing an error (when the limit is exceeded)
very much: after all it has to be distinguishable from a normal
return where the eol appeared right in time.
In C's fgets(), this case is distinguishable because the
(eventual) eol-char is part of the returned string.
In Tcl we need something else.

However, to make it robust, one would still have to
check for the reason of [gets] failing (it could also
have been due to an array-variable given, or for
an invalid channel, or whatever other programming error)

we could also specify it as passing the maxlength as
an optional third argument. This would mean that giving
a maxlen forces a variable-name to also be given. I don't
think this is a problem, because even if gets runs into
a too long line, it's good to still know what has been
read so far, and this information can only be made available
through the variable (not through return-value, in the case
of error-throwing!)

Alternatively, we could change error-handling inside gets such,
that if an error is thrown, the data read so far is marked as
unread, so it could be retrieved with a following [read ...].

Currently, this at least doesn't happen in the case where the
variable specified is an array. (data is read but never assigned)
As of now this isn't a problem, because it only happens for
programming errors, but once it could happen for "foreign errors"
it instantly becomes a problem.

Stephan Kuhagen

Dec 14, 2006, 9:03:42 AM
Andreas Leitgeb wrote:

> However, to make it robust, one would still have to
> check for the reason of [gets] failing (it could also
> have been due to an array-variable given, or for
> an invalid channel, or whatever other programming error)

Of course, but this can be checked by looking at the contents of the
error-variable from [catch].

> we could also specify it as passing the maxlength as
> an optional third argument. This would mean that giving
> a maxlen forces a variable-name to also be given. I don't
> think this is a problem, because even if gets runs into
> a too long line, it's good to still know what has been
> read so far, and this information can only be made available
> through the variable (not through return-value, in the case
> of error-throwing!)

Good point, and makes it even more elegant and easy to write. So it should
be

catch {
gets theLine 1024
} error

> Alternatively, we could change error-handling inside gets such,
> that if an error is thrown, the data read so far is marked as
> unread, so it could be retrieved with a following [read ...].

I would prefer the first solution for two reasons: 1. I think, it's faster,
because it is not necessary to put the data back (if this is possible at
all because of buffering, reading from pipe and such), 2. the
implementation of [gets] can be thought as something like (untested...)

proc gets {channel {vname ""} {maxLen -1}} {
    set count 0
    if {$vname ne ""} {
        upvar 1 $vname data
    }
    set data {}
    while {$maxLen < 0 || $count < $maxLen} {
        set char [read $channel 1]
        # stop at end of line, or when nothing more can be read
        # (EOF, or no data yet on a non-blocking channel)
        if {$char eq "\n" || $char eq ""} break
        append data $char
        incr count
    }
    if {$vname ne ""} {
        return $count
    }
    return $data
}

So I would expect, that if there is some error inside that code, I have at
least the chars read so far.

Regards
Stephan

Andreas Leitgeb

Dec 14, 2006, 10:14:43 AM
Stephan Kuhagen <nos...@domain.tld> wrote:
>> However, to make it robust, one would still have to
>> check for the reason of [gets] failing (it could also
>> have been due to an array-variable given, or for
>> an invalid channel, or whatever other programming error)
> Of course, but this can be checked by looking at the contents of the
> error-variable from [catch].

Yes, sure there is a way, but I see again the tendency
of programmers to avoid the real check, and just assume
that a thrown error implies a too long input line...

Still, this bug is the more harmless one compared
to the DoS-prone current [gets].

> Good point, and makes it even more elegant and easy to write.
> So it should be
> catch {
> gets theLine 1024
> } error

The channel is of course non-optional (gets stdin theLine 1024).
And one also should not ignore catch's result in this context,
or one might run into surprises later :-)

>> Alternatively, we could change error-handling inside gets such,
>> that if an error is thrown, the data read so far is marked as
>> unread, so it could be retrieved with a following [read ...].
> I would prefer the first solution for two reasons: 1. I think, it's faster,
> because it is not necessary to put the data back (if this is possible at
> all because of buffering, reading from pipe and such),

I think that unlike your procedural description, gets really
doesn't "read byte by byte till it finds a eol". I rather think
it reads a block, scans it for eol's and leaves the rest in some
input buffer to be used at next gets/read. If it's that way,
then "unreading" should be all the easier implementable.
If it's not that way (and if putting back into buffer is
really impossible), then the read bytes should at least be
written to the variable (except for invalid varnames), to
prevent their loss.

Darren New

Dec 14, 2006, 3:04:48 PM
Stephan Kuhagen wrote:
> Of course, but this can be checked by looking at the contents of the
> error-variable from [catch].

I would like to suggest that any time people want to distinguish causes
of errors, $errorCode be defined and used for that? Trying to parse
human-readable error message strings is fraught with error, and Tcl
provides a very nice alternative to that.
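
To illustrate the point, here is a minimal sketch (not taken from any post in this thread). It relies on the fact that the built-in [open] sets errorCode to a POSIX list when the operating system reports a failure, so a script can branch on the code rather than on the message text. The path is made up and is only assumed not to exist.

```tcl
# Hypothetical path, chosen only because it should not exist.
set path /no/such/dir/no-such-file.txt

if {[catch {open $path r} result]} {
    # For OS-level failures the core sets errorCode to
    # {POSIX errName message}. Branch on the stable parts,
    # not on the human-readable text in $result.
    set code $::errorCode
    if {[lindex $code 0] eq "POSIX" && [lindex $code 1] eq "ENOENT"} {
        puts "file is missing"
    } else {
        puts "some other failure: $result"
    }
} else {
    close $result
}
```

The same check keeps working if the wording of the message changes between releases, which is the advantage being argued for here.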

--
Darren New / San Diego, CA, USA (PST)
Scruffitarianism - Where T-shirt, jeans,
and a three-day beard are "Sunday Best."

Michael A. Cleverly

Dec 15, 2006, 12:14:37 AM
On Thu, 14 Dec 2006, Fredderic wrote:

> My question is whether there's any plans for a method of [gets] DoS
> prevention that doesn't require polling the channels periodically.

I'm not sure what you mean by "doesn't require polling the channels
periodically"...

There are two different cases:

1) the channel is in blocking mode (in which case [gets] doesn't
return until it has a complete line, period);

or

2) the channel is in non-blocking mode and the event loop is active
(and presumably you've defined a [fileevent readable] callback).

TIP #287 doesn't aim to do anything with blocking input at all. That
would have already been solvable at the script level by a proc that did
something like sit in a while loop doing [read $chan 1] and appending a
character at a time until a newline or some limit was reached.
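
The script-level loop described above might look roughly like the sketch below. The proc name readLimitedLine and its three status words are invented for illustration; it assumes a blocking channel, where an empty [read] means end of file.

```tcl
# readLimitedLine: hypothetical helper, not part of Tcl.
# Reads one character at a time from a blocking channel until a newline,
# end of file, or $limit characters have been collected. The text read so
# far is stored in the caller's variable; the return value is one of
# "line", "eof" or "toolong".
proc readLimitedLine {chan varName limit} {
    upvar 1 $varName data
    set data ""
    while {[string length $data] < $limit} {
        set ch [read $chan 1]
        if {$ch eq ""} {
            # On a blocking channel an empty read means end of file.
            return eof
        }
        if {$ch eq "\n"} {
            return line
        }
        append data $ch
    }
    # Limit reached before a newline was seen. A line of exactly $limit
    # characters is also reported this way; its newline is left unread.
    return toolong
}
```

Reading a character at a time is slow, but the caller always gets back whatever was read before the limit was hit, which is the behaviour asked for earlier in the thread.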

What TIP #287 was intended to help with was the non-blocking fileevent
driven case. And for that you don't need to do any kind of new polling
that you weren't already doing before.

It's easy to miss the fact that a fileevent readable callback is triggered
anytime there is unread data on the channel *even* when the channel is in
line buffering mode and there isn't a complete line available yet. (This
is precisely the reason we have [fblocked]; to after-the-fact distinguish
between [gets] reading a line of length 0 and [gets] returning (or
assigning to our variable) the empty string because there was an
incomplete line in the buffer.)
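
As a sketch of that distinction (the proc name classifyGets and its return values are assumptions made for this example), a readable callback on a channel configured with -blocking 0 could sort the outcome of one [gets] call like this:

```tcl
# classifyGets: hypothetical helper for a non-blocking channel.
# Returns {line text} for a complete line (text may be empty),
# "eof" at end of file, or "partial" when data is buffered but no
# end-of-line has arrived yet.
proc classifyGets {chan} {
    set n [gets $chan line]
    if {$n >= 0} {
        # A complete line, possibly of length 0.
        return [list line $line]
    }
    if {[eof $chan]} {
        return eof
    }
    if {[fblocked $chan]} {
        # [gets] gave up because the line is not complete yet.
        return partial
    }
    return unknown
}
```

Without the [fblocked] test, an empty line and an incomplete line would both leave the variable empty and be impossible to tell apart after the fact.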

So, in your existing fileevent callback you can use [chan pending input
$chan] to peek and see how much unread data Tcl has buffered for you. If
it is above some threshold you take appropriate action. What is an
appropriate action will, of course, depend on the nature of your
application. Some possibilities might include:

1) abort the connection (close the socket);

2) use [read $chan $n_bytes] instead of [gets] to partially drain and
process the input buffer
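
The second option could be sketched as follows. The name drainPending is made up, and the sketch assumes it is called right after [gets] returned -1 on a non-blocking channel, on Tcl 8.5 or later. Note that [chan pending] counts bytes while [read] counts characters, so the two only line up exactly for single-byte data.

```tcl
# drainPending: hypothetical helper (needs Tcl 8.5+ for [chan pending]).
# If more than $limit bytes are buffered without a complete line, pull
# them out with [read] so the input buffer cannot keep growing;
# otherwise return the empty string and leave the channel untouched.
proc drainPending {chan limit} {
    set pending [chan pending input $chan]
    if {$pending <= $limit} {
        return ""
    }
    # Assumes single-byte data, where bytes and characters coincide.
    return [read $chan $pending]
}
```

What to do with the drained chunk (process it, log it, discard it) is application specific, as noted above.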

For example here is a simple "echo" server that doesn't like incomplete
lines > 40 characters. (Naturally it requires tcl 8.5 that has [chan
pending]...)

### Beginning of server.tcl

proc accept {sock peer port} {
    fconfigure $sock -buffering line -blocking 0 -buffersize 40
    fileevent $sock readable [list echo $sock]
    puts "Accepted connection ($sock) from $peer"
}

proc echo {sock} {
    # Have [gets] store what it reads in $line and return the
    # length of the line read (-1 has to be distinguished by
    # calling [eof] or [fblocked], etc.)
    if {[gets $sock line] > -1} then {
        puts "They wrote: $line"
        return [puts $sock "You wrote: $line"]
    }

    # Either we've got an EOF on this channel or
    # there was some data but not a full line
    if {[eof $sock]} then {
        puts "EOF detected"
        return [close $sock]
    }

    # An incomplete line is waiting in the buffer; check
    # and see how much unread data there is
    if {[set qty [chan pending input $sock]] > 40} then {
        puts "Wow, they talk too much! ($qty > 40 !)"
        puts $sock "Sorry, you talk too much! ($qty > 40 !)"
        return [close $sock]
    }

    # There is an incomplete line, but the total amount of data
    # buffered is still <= 40 bytes. The readable fileevent won't
    # fire again until there is either more data or EOF
    puts "There wasn't a full line, but they aren't too chatty yet ($qty)"
    return
}

socket -server accept 1234
vwait forever

### End of server.tcl / Beginning of client.tcl

set sock [socket 127.0.0.1 1234]
fconfigure $sock -buffering none

puts $sock "Hello World"
flush $sock

puts -nonewline $sock [string repeat ! 30]
flush $sock

# with the data flushed the server will have an incomplete line (len=30)
puts $sock " really."

# pause then burst progressively large amounts of data
foreach {qty} {1 5 10 9876 32767 987654321} {
    puts "Trying $qty"
    if {[eof $sock] || [catch {
        puts -nonewline $sock [string repeat . $qty]
    }]} then {
        puts "They shut us down before we could finish writing $qty"
        break
    } else {
        flush $sock
    }
}

### end of client.tcl

The output I get (on stdout of the server) when I run the above demos is:

powerbook:~ michael$ ~/sandbox/bin/tclsh8.5 server.tcl
Accepted connection (sock6) from 127.0.0.1
There wasn't a full line, but they aren't too chatty yet (11)
They wrote: Hello World
There wasn't a full line, but they aren't too chatty yet (30)
There wasn't a full line, but they aren't too chatty yet (38)
They wrote: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! really.
There wasn't a full line, but they aren't too chatty yet (1)
There wasn't a full line, but they aren't too chatty yet (6)
There wasn't a full line, but they aren't too chatty yet (16)
Wow, they talk too much! (4112 > 40 !)

and on the client:

powerbook:~ michael$ tclsh client.tcl
Trying 1
Trying 5
Trying 10
Trying 9876
They shut us down before we could finish writing 9876

Michael

Stephan Kuhagen

Dec 15, 2006, 12:35:00 AM
Darren New wrote:

> I would like to suggest that any time people want to distinguish causes
> of errors, $errorCode be defined and used for that? Trying to parse
> human-readable error message strings is fraught with error, and Tcl
> provides a very nice alternative to that.

Of course you are right. But the problem with errorCode often seems to be
(for me) that it does not add any information to the error-checking. For
example with channels (and so related to this thread):

% gets noSuchChannel theLine
can not find channel named "noSuchChannel"
% puts $errorCode
NONE

Not very helpful... So we must parse $errorInfo anyway, which of course is
error prone and not a really good solution.

Regards
Stephan


Stephan Kuhagen

Dec 15, 2006, 12:45:57 AM
Andreas Leitgeb wrote:

>>> an invalid channel, or whatever other programming error)
>> Of course, but this can be checked by looking at the contents of the
>> error-variable from [catch].
>
> Yes, sure there is a way, but I see again the tendency
> of programmers to avoid the real check, and just assume
> that a thrown error implies a too long input line...
>
> Still, this bug is the more harmless one compared
> to the DoS-prone current [gets].

I think Darren New's suggestion of using (a more expressive) $errorCode would
be a good solution for that. If $errorCode would be extended to have more,
and maybe command specific, codes, it would give us much better error
handling options and simple error checking.

>> Good point, and makes it even more elegant and easy to write.
>> So it should be
>> catch {
>> gets theLine 1024
>> } error
> The channel is of course non-optional (gets stdin theLine 1024).
> And one also should not ignore catch's result in this context,
> or one might run into surprises later :-)

Ah, I used the ReadProgrammerMind-Extended Tclsh, where the channel is
optional, but I should have skipped the [catch] then also, because the
errorInfo automatically materialises in your mind with that... ;-)

>> possible at all because of buffering, reading from pipe and such),
>
> I think that unlike your procedural description, gets really
> doesn't "read byte by byte till it finds a eol". I rather think
> it reads a block, scans it for eol's and leaves the rest in some
> input buffer to be used at next gets/read. If it's that way,
> then "unreading" should be all the easier implementable.

I don't know the implementation of [gets]. It was just to demonstrate, what
I think would be an intuitive behaviour in case of an error.

Regards
Stephan

Andreas Leitgeb

Dec 15, 2006, 6:46:46 AM
Stephan Kuhagen <nos...@domain.tld> wrote:
> Ah, I used the ReadProgrammerMind-Extended Tclsh, where the channel is
> optional, but I should have skipped the [catch] then also, because the
> errorInfo automatically materialises in your mind with that... ;-)

Oh, then *think* about submitting the necessary patches to sf.
(you don't need to actually do it: let your tclsh do it for you :-)

>> I rather think
>> it reads a block, scans it for eol's and leaves the rest in some
>> input buffer to be used at next gets/read.

After rethinking that, it probably wouldn't work that way, still,
because for sufficiently large maxSize's, it might read some
(eol-free) blocks completely, and only later notice the lack of
eol.

So, having gets place up to maxSize already-read bytes into the
variable even for the "no eol in sight"-case, seems to be the
sanest way.

Andreas Leitgeb

Dec 15, 2006, 7:02:10 AM
Michael A. Cleverly <mic...@cleverly.com> wrote:
> For example here is a simple "echo" server that doesn't like incomplete
> lines > 40 characters. (Naturally it requires tcl 8.5 that has [chan
> pending]...)
>
> ### Beginning of server.tcl
> proc echo {sock} {
> if {[gets $sock line] > -1} then { ... }
> if {[eof $sock]} then { ... }
> if {[set qty [chan pending input $sock]] > 40} then { ... }
> }

A big ouch! It doesn't take into account,
that between [gets] and [chan pending] some moderate block of
bytes might become available, perhaps even having an eol right
at start (to finish off the previously incomplete short line).

So the script would "think" that a long-line is waiting,
although the other side sends perfectly valid data.

That isn't even just theoretic, but with a 4096byte buffered
stream, it's not unlikely for a "line" to end just at the start
of the next 4096 bytes.

It's not a solution, it's a very race-condition-prone bad hack.

Michael A. Cleverly

Dec 15, 2006, 11:07:30 AM
On Fri, 15 Dec 2006, Andreas Leitgeb wrote:

> Michael A. Cleverly <mic...@cleverly.com> wrote:
> > For example here is a simple "echo" server that doesn't like incomplete
> > lines > 40 characters. (Naturally it requires tcl 8.5 that has [chan
> > pending]...)
> >
> > ### Beginning of server.tcl
> > proc echo {sock} {
> > if {[gets $sock line] > -1} then { ... }
> > if {[eof $sock]} then { ... }
> > if {[set qty [chan pending input $sock]] > 40} then { ... }
> > }
>
> A big ouch! It doesn't take into account,
> that between [gets] and [chan pending] some moderate block of
> bytes might become available, perhaps even having an eol right
> at start (to finish off the previously incomplete short line).

Actually no.

The implementation calls the (long existing in the core but never exposed
to the script level) Tcl_InputBuffered() function.

Without having returned to the event loop (presuming there isn't an
[update] between your [gets] and [chan pending]) and without another
attempted input operation (another call to [gets] or [read]) then the data
will be buffered at the OS level, but not yet read/seen/known to Tcl.

So you tried to read a line. The *entire* contents of the data available
to Tcl at the time of the [gets] did not contain an end-of-line. So
gets returns a -1 and leaves all the data in its own (Tcl) buffers,
unread.

Then without attempting to read even more from the OS (by asking Tcl to
attempt another input operation on our behalf) and without re-entering the
event loop or dropping back to the event loop we ask Tcl how much data it
had available to it--that it looked through in search of the nonexistent
newline--and then take some application specific action based on the size
it tells us.

> It's not a solution, it's a very race-condition-prone bad hack.

If that were the case then [fblocked] would have always been a very
race-condition-prone bad hack too. But it isn't.

Michael

Neil Madden

Dec 15, 2006, 11:15:07 AM
Darren New wrote:
> Stephan Kuhagen wrote:
>> Of course, but this can be checked by looking at the contents of the
>> error-variable from [catch].
>
> I would like to suggest that any time people want to distinguish causes
> of errors, $errorCode be defined and used for that? Trying to parse
> human-readable error message strings is fraught with error, and Tcl
> provides a very nice alternative to that.

This is useful in your own code, where you can control the format of
errorCode messages. It's a shame that errorCode seems to be so little
used in general (even some built-in Tcl errors produce nothing useful in
errorCode). There's also no standard format for errorCode (beyond being
a list, and even that isn't ensured), which means in practice it is just
another arbitrary string and not necessarily any easier to deal with
than the error message (e.g. is "ARITH DIVZERO {divide by zero}" really
any more use than just "divide by zero"?). The one advantage is that it
is much less likely to change between releases.

I know my own exception handling in Tcl tends to be very imprecise, and
tends not to actually distinguish between different error cases (just
fail/succeed). Often this is because I've already checked the various
error cases that I could actually do something about (e.g. [eof],
[fblocked], [info exists] etc), so when an error is thrown I just log it
or propagate it without bothering to examine it. About the only time I
actually do something with the result of a [catch] it is to distinguish
between error/break/continue codes rather than error types. So, while
producing useful errorCodes should be encouraged, is it fair to say that
by and large, other methods tend to be used for distinguishing between
error conditions?

Also, for those using Tcl 8.5 you can now get at the errorCode and
errorInfo without having to deal with 'orrible global variables:

# -errorcode is only present in $opts when the script raised an error
if {[catch { someCode ... } ret opts]} {
    switch -glob [dict get $opts -errorcode] {
        {ARITH *} { puts "An arithmetic error" }
        NONE      { puts "No error at all!" }
        default   { puts "An unknown error" }
    }
}

-- Neil

Darren New

Dec 15, 2006, 11:58:41 AM
Stephan Kuhagen wrote:
> Not very helpful... So we must parse $errorInfo anyway, which of course is
> error prone and not a really good solution.

But that's my point. That's a bad way to go about it. What happens when
you localize Tcl and errorInfo is now in French?

The right solution is to fix every error to return an errorCode.

In my S3 library, for example, every errorCode is a list that starts
with S3, then the type of error (local, remote, usage, etc), then the
argument that caused the error (so a usage error would have perhaps
-blocking as the third element if you provided "-blocking hello" instead
of a true/false value), and for local errors the fourth element is the
original errorCode, and for remote errors the fourth element is the HTTP
status result, and so on.
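
A layout of that kind can be produced with [return -code error -errorcode]. The sketch below is only an illustration of the idea: MYLIB and setBlocking are invented names, and this is not the S3 library's actual code.

```tcl
# setBlocking: hypothetical command used only to illustrate an errorCode
# of the form {MYLIB usage <argument> <value>}.
proc setBlocking {value} {
    if {![string is boolean -strict $value]} {
        return -code error \
            -errorcode [list MYLIB usage -blocking $value] \
            "expected boolean value for -blocking but got \"$value\""
    }
    return [string is true -strict $value]
}

# Handling side (Tcl 8.5+ for the options dictionary):
if {[catch {setBlocking hello} msg opts]} {
    set code [dict get $opts -errorcode]
    puts "subsystem=[lindex $code 0] type=[lindex $code 1] argument=[lindex $code 2]"
}
# prints: subsystem=MYLIB type=usage argument=-blocking
```

The message stays free to change or be translated, while the handler only looks at the list elements.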

Of course, the errorInfo says something useful to the reader. But if you
want to *handle* the errors, having good errorCodes is important. The
answer "errorCode is currently broken, so we might as well leave it
broken and work around it forever" is a suboptimal solution, I think. If
you're defining a new error, you should specify the errorCode that goes
with it, especially if you know you want to handle it automatically.

Darren New

Dec 15, 2006, 12:04:45 PM
Neil Madden wrote:
> than the error message (e.g. is "ARITH DIVZERO {divide by zero}" really
> any more use than just "divide by zero"?).

It would be if (a) one had localized error messages, or (b) everyone
used them. In this case, you catch an error, it's an ARITH error, so you
know when processing the file it wasn't a "could not open file" error.

> So, while
> producing useful errorCodes should be encouraged, is it fair to say that
> by and large, other methods tend to be used for distinguishing between
> error conditions?

Don't misunderstand. I wasn't saying it's not the best solution for now.
But I just cringe when someone says "Let's add some functionality, and
since the user will often want to distinguish this failure from others,
let's parse the error message, and not consider specifying an
errorCode." I'm simply encouraging all who define new error conditions
to define the errorCode that goes with them.

Andreas Leitgeb

Dec 15, 2006, 12:31:14 PM
Michael A. Cleverly <mic...@cleverly.com> wrote:
> Actually no.

> Without having returned to the event loop (presuming there isn't an
> [update] between your [gets] and [chan pending]) and without another
> attempted input operation (another call to [gets] or [read]) then the data
> will be buffered at the OS level, but not yet read/seen/known to Tcl.

Oh, that of course makes my comment moot.

> If that were the case then [fblocked] would have always been a very
> race-condition-prone bad hack too. But it isn't.

Actually, I never used [fblocked], but then I never
queried non-blocking channels for more than gets' return
value (being >=0 or not) and [eof]. I never came into
the situation of having a use for the information provided
by [fblocked], yet.

Nevertheless, I'd find a maxSize feature for [gets] much more
programmer-friendly than the solution you offered, despite
that your solution works right now, whereas the maxSize
feature would require a TIP to be written, accepted and
implemented. :-)

Larry W. Virden

Dec 15, 2006, 12:36:53 PM

Neil Madden wrote:
>
> This is useful in your own code, where you can control the format of
> errorCode messages. It's a shame that errorCode seems to be so little
> used in general (even some built-in Tcl errors produce nothing useful in
> errorCode).

Of course, when developers encounter such cases, I encourage you to
submit feature requests/bug reports (depending on what you find) so
that such things can be improved.


> There's also no standard format for errorCode (beyond being
> a list, and even that isn't ensured),

Perhaps the Tcl core should implement a standard format for itself (I'm
not suggesting imposing such a thing on foreign extensions). This way,
one can "lead by example" (adding, for instance, use of the standard
format in the sample extension, Tk, and any other "core-like"
extensions,...)

>
> I know my own exception handling in Tcl tends to be very imprecise, and
> tends not to actually distinguish between different error cases (just
> fail/succeed).

One of the things I find myself most frequently changing during debugging
sessions (of programs in just about every language) is error output -
particularly in cases where 6 different error conditions result in the
identical error output. It makes debugging for me, and the developer,
easier if each error condition has some sort of distinguishing
characteristic. Even if the error text itself is identical, at least
some kind of code uniquely identifying which occurrence is appearing is
helpful when trying to figure out what is causing a particular problem.
And, if _I_ have responsibility for the code, I will frequently add a
bit more info to error logs or whatever mechanism is being used to
report problems - often looking at the code and deciding what variables
I might need to see if I had the chance to look......

Ralf Fassel

Dec 15, 2006, 3:25:20 PM
* Neil Madden <n...@cs.nott.ac.uk>

| There's also no standard format for errorCode (beyond being a list,
| and even that isn't ensured),

Waddayamean "even that isn't ensured"? You mean,
lindex $errorCode 0
isn't safe? The docs explicitly say in tclvars(n)

errorCode
--<snip-snip>--
errorCode consists of a Tcl list with one or more elements.

?

| (e.g. is "ARITH DIVZERO {divide by zero}" really any more use than
| just "divide by zero"?). The one advantage is that it is much less
| likely to change between releases.

We use the codes to i18n the messages.

R'

Neil Madden

Dec 15, 2006, 3:34:03 PM
Ralf Fassel wrote:
> * Neil Madden <n...@cs.nott.ac.uk>
> | There's also no standard format for errorCode (beyond being a list,
> | and even that isn't ensured),
>
> Waddayamean "even that isn't ensured"? You mean,
> lindex $errorCode 0
> isn't safe? The docs explicitly say in tclvars(n)
>
> errorCode
> --<snip-snip>--
> errorCode consists of a Tcl list with one or more elements.

Exactly - it isn't safe. Anyone can put anything into the -errorcode and
nothing checks that it is a well formed list. e.g.:

% error dummy "" "{badlist"
dummy
% lindex $errorCode 0
unmatched open brace in list
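
Given that, code that wants to inspect an arbitrary errorCode can guard the list access itself. A minimal sketch, with a made-up helper name:

```tcl
# safeErrorCodeHead: hypothetical helper. Returns the first element of an
# errorCode value, or the empty string if the value is not a well-formed
# Tcl list.
proc safeErrorCodeHead {code} {
    if {[catch {lindex $code 0} head]} {
        return ""
    }
    return $head
}

puts [safeErrorCodeHead {ARITH DIVZERO {divide by zero}}]  ;# prints ARITH
puts [safeErrorCodeHead "{badlist"]                         ;# prints an empty line
```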

-- Neil

Michael A. Cleverly

Dec 15, 2006, 9:49:26 PM
On Fri, 15 Dec 2006, Andreas Leitgeb wrote:

> Nevertheless, I'd find a maxSize feature for [gets] much more
> programmer-friendly than the solution you offered, despite
> that your solution works right now, whereas the maxSize
> feature would require a TIP to be written, accepted and
> implemented. :-)

I wanted to see a solution to the problem in 8.5 and went with the
simplest thing that I knew I'd be able to provide an implementation for in
time. I do think adding some kind of -max switch to [gets] could be
useful for a lot of folks, but I'll happily defer to someone else slightly
more ambitious than myself. :-)

Michael

Ralf Fassel

Dec 16, 2006, 9:33:39 AM
* Neil Madden <n...@cs.nott.ac.uk>

| % error dummy "" "{badlist"
| dummy
| % lindex $errorCode 0
| unmatched open brace in list

Ouch...

But at least the errors which TCL itself throws should be safe?
I mean, everything that errors out from a (non-overridden) TCL
builtin-command?

R'

Cameron Laird

Dec 16, 2006, 11:00:32 AM
In article <yga1wmz...@panther.akutech-local.de>,

'Sounds like a good activity for an eager newcomer:
systematically work with the tcltest crew to ensure
that all errorCodes are validated as lists (at least).

Darren New

Dec 16, 2006, 12:31:55 PM
Cameron Laird wrote:
> 'Sounds like a good activity for an eager newcomer:
> systematically work with the tcltest crew to ensure
> that all errorCodes are validated as lists (at least).

The tcltest page gives an example of checking that the error string is
correct, but no example of how to check that the errorCode is correct. I
suspect adding a convenient method of checking that to tcltest would go
a long way.

Me, I put the body in a call to "expectError" and that routine catches
the error, returning the return code and errorCode as a list. Then I use
a comparison mode that matches the beginning of the errorCode list
against the expected result. I wasn't able to figure out how to check
$errorCode in the comparison function without the extra catch inside the
test. (http://s3.amazonaws.com/darren/TclS3.zip for examples.)
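
A helper of that shape might look like the sketch below. This is a guess at the idea for illustration, not the code from the linked archive, and it uses the Tcl 8.5 options dictionary.

```tcl
# expectError: illustrative sketch only. Runs $script in the caller's
# scope and returns a two-element list: the return code from [catch]
# and the errorCode (empty when the script did not set one).
proc expectError {script} {
    set rc [catch {uplevel 1 $script} msg opts]
    set code {}
    if {[dict exists $opts -errorcode]} {
        set code [dict get $opts -errorcode]
    }
    return [list $rc $code]
}

# Compare only the leading elements of the errorCode, so the
# human-readable tail can change without breaking the test.
set result [expectError {expr {1/0}}]
puts [lindex $result 0]                   ;# prints 1
puts [lrange [lindex $result 1] 0 1]      ;# prints ARITH DIVZERO
```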

Fredderic

Dec 16, 2006, 10:28:38 PM
On Thu, 14 Dec 2006 22:14:37 -0700,
"Michael A. Cleverly" <mic...@cleverly.com> wrote:

> What TIP #287 was intended to help with was the non-blocking fileevent
> driven case. And for that you don't need to do any kind of new
> polling that you weren't already doing before.
>
> It's easy to miss the fact that a fileevent readable callback is
> triggered anytime there is unread data on the channel *even* when the
> channel is in line buffering mode and there isn't a complete line
> available yet. (This is precisely the reason we have [fblocked]; to
> after-the-fact distinguish between [gets] reading a line of length 0
> and [gets] returning (or assigning to our variable) the empty string
> because there was an incomplete line in the buffer.)

It is indeed... I hope an example similar to your 40-byte echo server
will make it into the documentation somewhere nice and obvious... :)


Fredderic

Stephan Kuhagen

Dec 18, 2006, 12:57:20 AM
Darren New wrote:

> Of course, the errorInfo says something useful to the reader. But if you
> want to *handle* the errors, having good errorCodes is important. The
> answer "errorCode is currently broken, so we might as well leave it
> broken and work around it forever" is a suboptimal solution, I think.

Okay, maybe I misunderstood you there before. Of course it would be the best
solution to have useful errorCodes and use them, and I would appreciate,
if there was some strict style guide in the core and for extensions, how
the errorCode _must_ be implemented. What I wrote was just a reaction to
how errorInfo/errorCode currently work.

One more point on an errorCode/errorInfo style guide: what you do
with S3, for example, may be a good solution, but I have the feeling there
has never been any real thought given to a working solution that can be used
as a general rule (and requirement) for generating errorInfo/errorCode. But
there should be something like this for good error handling. If I know S3
and I know I'm working just with it, I can build a nice error handling
scheme. But then I add some more libraries and extensions and have to add
individual error handling for each of them, because everybody either finds
his/her own system of useful and well designed error generation rules or
simply doesn't care.

So you are right: if [gets] or anything else fails, it should be possible (and
considered good style) to check the errorCode and handle the error. But
for that to become good practice, errorCode must be a reliable feature
throughout the whole language.

Regards
Stephan

Darren New

Dec 18, 2006, 1:13:00 AM
Stephan Kuhagen wrote:
> if there was some strict style guide in the core and for extensions, how
> the errorCode _must_ be implemented.

I think the best approach is probably to always use a list, always have
the first element indicate the subsystem returning the error, and have
things near the start of the list more general than things near the end
of the list. I think that's how the core does errorCode now in the
places it does, with (for example) ARITH and POSIX error codes.
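
With a general-to-specific list, handlers can match on a prefix of the errorCode. The sketch below uses [try] with trap clauses, which arrived in Tcl 8.6 and so postdates this thread; describeFailure is a made-up name and the file path is assumed not to exist.

```tcl
# describeFailure: hypothetical helper; requires Tcl 8.6 for [try].
# Each trap pattern is matched against the leading elements of the
# errorCode list, so more specific patterns are listed first.
proc describeFailure {script} {
    try {
        uplevel 1 $script
    } trap {ARITH DIVZERO} {msg} {
        return "division by zero"
    } trap {ARITH} {msg} {
        return "other arithmetic error: $msg"
    } trap {POSIX} {msg opts} {
        return "OS error [lindex [dict get $opts -errorcode] 1]"
    } on error {msg} {
        return "unclassified error: $msg"
    }
    return "ok"
}

puts [describeFailure {expr {1/0}}]                 ;# prints: division by zero
puts [describeFailure {open /no/such/dir/file r}]   ;# prints: OS error ENOENT
puts [describeFailure {error boom}]                 ;# prints: unclassified error: boom
```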

Larry W. Virden

Dec 18, 2006, 7:10:30 AM

Darren New wrote:
> Me, I put the body in a call to "expectError" and that routine catches
> the error, returning the return code and errorCode as a list.

At least that forces errorCode into list-safeness, which should be a
good beginning for standardized checking.
