... and I thought I knew the quoting rules


Helmut Giese

Nov 9, 2007, 3:18:18 AM
Hello out there,
I stumbled over something weird which I need some help to understand.
I am building a checksum by XORing a sequence of bytes like so
---
set cks 0
set hex FF
set cks [format %x [expr 0x$cks ^ 0x$hex]]
puts "cks: [format %02X 0x$cks]"
set hex F0
set cks [format %x [expr {0x$cks ^ 0x$hex}]]
puts "cks: [format %02X 0x$cks]"
---

Anybody looking at the third line will immediately say "you should
always brace the argument(s) to [expr]" - but when I do (in the next
to last line) I get the error
syntax error in expression "0x$cks ^ 0x$hex": extra tokens at end
of expression

Could somebody please enlighten me what is going on here (Tcl 8.4.14
on Windows)?
Thanks and best regards
Helmut Giese

Andreas Leitgeb

Nov 9, 2007, 4:29:59 AM
Helmut Giese <hgi...@ratiosoft.com> wrote:
> set cks 0
> set hex FF
> set cks [format %x [expr 0x$cks ^ 0x$hex]]
> puts "cks: [format %02X 0x$cks]"

While there are other ways to handle hex numbers, which are longer to type
(involving [scan ... %x]) but probably more efficient, this is one of the
situations where expr without braces yields the effect you want and expr
with braces does not.

Anyway, the result for me shows as: "cks: FF", which seems 100%
correct to me.
So you'd have to tell us either what other result you see, or what
makes you think that FF is the wrong result.

"time {expr {[scan $cks %x] ^ [scan $hex %x]}} 100000" takes less than
3/4 of the time that "time {expr 0x$cks ^ 0x$hex} 100000" takes.
If any of these numbers is constant (in a loop), then obtaining its
integer value just once would surely accelerate it further.

Helmut Giese

Nov 9, 2007, 5:41:44 AM
Hi Andreas,

>While there are different ways to use hexnums, which are longer to type
>(involving [scan ... %x]) but probably still more efficient, this is
>one of the situations, where expr without braces yields the effect you
>want, and with braces it wouldn't.
... but why?

The results of the operation and efficiency are (in this example) of
no concern - it is just the curiosity why


set cks [format %x [expr 0x$cks ^ 0x$hex]]

works and


set cks [format %x [expr {0x$cks ^ 0x$hex}]]

doesn't.
Best regards
Helmut Giese

Tobias Hippler

Nov 9, 2007, 5:54:16 AM
Hi there,

another way would be using double quotes around the operators inside the
curly braces:

set hex F0
set cks [format %x [expr {"0x$cks" ^ "0x$hex"}]]
puts "cks: [format %02X 0x$cks]"

The problem occurs because expr first tokenizes its arguments following
the rules of the expr documentation page, and only then performs variable
substitution. In an expr expression, variables are not expected to have
any kind of prefix (like '0x').
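
The three forms discussed so far can be compared side by side. This is only a minimal sketch: the variable names follow the thread, and the [catch] wrapper is an addition so the script keeps running past the failing form.

```tcl
set cks 0
set hex F0

# Unbraced: Tcl substitutes first, so expr is handed the text "0x0 ^ 0xF0".
set unbraced [expr 0x$cks ^ 0x$hex]

# Braced: expr parses the literal text itself and rejects "0x" glued to "$cks".
set failed [catch {expr {0x$cks ^ 0x$hex}} msg]

# Braced with quoted operands: each quoted string is one complete operand,
# and expr substitutes the variables inside it.
set quoted [expr {"0x$cks" ^ "0x$hex"}]

puts "unbraced: $unbraced, braced fails: $failed, quoted: $quoted"
```

The exact wording of the error message in msg may differ between Tcl versions, so the sketch only records that the braced form fails.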

Tobi

Jonathan Bromley

Nov 9, 2007, 6:37:33 AM
On Fri, 09 Nov 2007 11:41:44 +0100,
Helmut Giese wrote:

> it is just the curiosity why
> set cks [format %x [expr 0x$cks ^ 0x$hex]]
>works and
> set cks [format %x [expr {0x$cks ^ 0x$hex}]]
>doesn't.

so, in the first example Tcl substitutes the variables,
and in the second example it's [expr] that is obliged to
do the substitution. I had vaguely assumed that [expr]
would use [subst] to do this, but that's not quite true...

% set c FF
FF
% expr 0x$c ---- Tcl substitutes $c
255 ---- [expr] evaluated "0xFF"
% subst {0x$c}
0xFF ---- as expected
% expr {0x$c} ---- [expr] does the substitutions?
syntax error in expression "0x$c": .....
% subst {"0x$c"}
"0xFF" ---- the quotes are preserved
% expr {"0x$c"} ---- [expr] does the substitutions???
255

so it's clear that [expr] is doing some clever, but
not quite clever enough, stuff. I know that [expr]
does some ingenious guessing, about what mistakes you
have made, which couldn't be done by [subst]:

% expr {a+b}
syntax error [...] variable references require preceding $

and it sometimes has spectacularly counterintuitive effects...

% expr {$c eq FF}
syntax error [...] variable references require preceding $
% expr {$c eq "FF"}
1

Maybe I should go read the source code for [expr] to see
what's really happening.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan...@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.

Gerald W. Lester

Nov 9, 2007, 10:29:18 AM

Helmut, others have addressed the direct question you pose -- I'd like to
ask why you are doing it this way. Why not use the [binary format] command
to convert the string of bytes to a binary string, then [binary scan] to
turn that into a list of integer values, and then just do:

foreach byte $listOfBytes {
    set cka [expr {$cka ^ $byte}]
}
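
One way that suggestion might look in full is sketched below. The input format (an even number of hex digits with no separators) and the variable names are assumptions, and the mask is there because the "c" conversion of [binary scan] yields signed values.

```tcl
# Assumed input: an even number of hex digits with no separators.
set hexString "ABCDEF"

# Pack the hex digits into a 3-byte binary string ...
set bytes [binary format H* $hexString]
# ... and unpack it as a list of signed 8-bit integers.
binary scan $bytes c* listOfBytes

set cks 0
foreach byte $listOfBytes {
    # Mask to 0..255 because "c" produces signed values.
    set cks [expr {$cks ^ ($byte & 0xff)}]
}
puts [format %02X $cks]   ;# prints 89
```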


--
+--------------------------------+---------------------------------------+
| Gerald W. Lester |
|"The man who fights for his ideals is the man who is alive." - Cervantes|
+------------------------------------------------------------------------+

Tom Conner

Nov 9, 2007, 10:42:58 AM

"Helmut Giese" <hgi...@ratiosoft.com> wrote in message
news:dh48j3pku389v5mgi...@4ax.com...

> Hello out there,
> I stumbled over something weird which I need some help to understand.
> I am building a checksum by XORing a sequence of bytes like so
> ---
> set cks 0
> set hex FF
> set cks [format %x [expr 0x$cks ^ 0x$hex]]
> puts "cks: [format %02X 0x$cks]"
> set hex F0
> set cks [format %x [expr {0x$cks ^ 0x$hex}]]
> puts "cks: [format %02X 0x$cks]"
> ---
>
> Anybody looking at the third line will immediately say "you should
> always brace the argument(s) to [expr]" - but when I do (in the next
> to last line) I get the error
> syntax error in expression "0x$cks ^ 0x$hex": extra tokens at end
> of expression
>


I had the exact same issue this week, both XORing and with comparisons
(<=, >=). Very irritating. I ended up prepending 0x onto the var before
the expr command.


Roy Terry

Nov 9, 2007, 11:12:53 AM

A generic issue with expr:
(app) 310 % set a a ;set b a
a
(app) 311 % expr {$a$b eq "ab"} ;# $a$b won't fly
syntax error in expression "$a$b eq "ab"": extra tokens at end of expression
(app) 312 % expr {"$a$b" eq "ab"} ;# A-OK
0

Roy

Glenn Jackman

Nov 9, 2007, 11:15:32 AM
At 2007-11-09 05:54AM, "Tobias Hippler" wrote:
> another way would be using double quotes around the operators inside the
> curly braces:
>
> set hex F0
> set cks [format %x [expr {"0x$cks" ^ "0x$hex"}]]
> puts "cks: [format %02X 0x$cks]"

To be precise (I love to nitpick), you mean:
double quotes around the _operands_

The operator is ^

--
Glenn Jackman
"You can only be young once. But you can always be immature." -- Dave Barry

tom.rmadilo

Nov 9, 2007, 2:08:27 PM
On Nov 9, 12:18 am, Helmut Giese <hgi...@ratiosoft.com> wrote:
> set cks [format %x [expr 0x$cks ^ 0x$hex]]
...

> Anybody looking at the third line will immediately say "you should
> always brace the argument(s) to [expr]" - but when I do (in the next
> to last line) I get the error
> syntax error in expression "0x$cks ^ 0x$hex": extra tokens at end
> of expression
>
> Could somebody please enlighten me what is going on here (Tcl 8.4.14
> on Windows)?

Bracing or not, this is not a rule. If you actually got this to work,
who cares why something else doesn't work, basically this is an
example of pretty brittle code all the way around and is bound to
break sooner or later. Same thing goes with using <=, <, >, >=. Wait,
just be very careful relying on Tcl conversions and comparisons in
something like [expr], including args to [for] or first arg to [if]
and [while]. You need to prepare ahead of time your values, then feed
them to expr. Otherwise you rely on expr to parse things as strings,
then reparse as hex, etc. This is why the unbraced expr works, before
expr gets its argument, it is a string. After expr gets it, the 0x or
ff looks like a variable reference.

The safest way to do this is to convert to integers, base 10:
% set hex_chars ff
ff
% set mask_chars "02"
02
% set a [format %i 0x$hex_chars]
255
% set mask [format %i 0x$mask_chars]
2
% set xor [expr $mask ^ $a]
253
% set result_hex 0x[format %x $xor]
0xfd

Of course, then wrap all of this up into a proc so it can be done in
one line of code anytime you need it.
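
Wrapped into a proc, the steps above might look like the following sketch. The name hexXor is hypothetical, the input is assumed to be plain hex digits, and the expression is braced here, which the transcript above does not do.

```tcl
# hexXor is a hypothetical helper name, not something from the thread.
proc hexXor {hexChars maskChars} {
    # Convert both hex strings to plain base-10 integers first ...
    set a    [format %i 0x$hexChars]
    set mask [format %i 0x$maskChars]
    # ... so the braced expression only ever sees complete operands.
    return 0x[format %x [expr {$mask ^ $a}]]
}

puts [hexXor ff 02]   ;# prints 0xfd
```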

It looks like [binary format/scan] have the same issues: you need to
use the leading 0x to signal the following is hex, so same problems
should show up, well, maybe not because you use a separate command to
format the input, which is what I suggest.

suchenwi

Nov 9, 2007, 6:25:09 PM
I think the main point is that [expr] doesn't do all substitutions or
interpolations as Tcl does, for efficiency. Experiments:
% set a 1; set b 2
2
% expr $a$b
12
% expr {$a$b}
syntax error in expression "$a$b": extra tokens at end of expression
% expr {"$a$b"}
12
% expr {"$a$b" * 3}
36
So quoting can group string parts to be conjoined.
% set op +
+
% expr {$a $op $b}
syntax error in expression "$a $op $b": extra tokens at end of
expression
% expr $a $op $b
3
But this is done only for operands, operators are not substituted by
[expr].
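
If the operator really has to come from a variable, one possible approach is to let Tcl splice in only the operator and leave the operands for [expr] to substitute. This is a sketch, not something proposed in the thread; the list of allowed operators is an assumption, added because splicing arbitrary text into an expression is risky.

```tcl
set a 1; set b 2; set op +

# Only accept operators from a fixed list before splicing one into the expression.
if {[lsearch -exact {+ - * / % & | ^} $op] < 0} {
    error "unsupported operator \"$op\""
}

# Tcl substitutes $op; the escaped \$a and \$b are left for expr to substitute.
set result [expr "\$a $op \$b"]
puts $result   ;# prints 3
```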

Jonathan Bromley

Nov 10, 2007, 7:51:22 AM
On Fri, 09 Nov 2007 15:25:09 -0800, suchenwi
<richard.suchenw...@siemens.com> wrote:

>I think the main point is that [expr] doesn't do all
> substitutions or interpolations as Tcl does, for
> efficiency.

indeed so.

> But [ expr's substitution ] is done only for operands,

> operators are not substituted by [expr].

And what's more, only *complete* operands are $-substituted.
This is not at all obvious but, once you know what you're
looking for, the [expr] man page is lucid enough; and it
provides the get-out clause: you can use "" quoting to
force more complete substitution.

It's going into my list of Tcl "gotchas".

tom.rmadilo

Nov 10, 2007, 11:39:49 AM
On Nov 9, 3:25 pm, suchenwi <richard.suchenwirth-bauersa...@siemens.com> wrote:
> So quoting can group string parts to be conjoined.
> % expr {$a $op $b}
> syntax error in expression "$a $op $b": extra tokens at end of
> expression
> % expr $a $op $b
> 3
> But this is done only for operands, operators are not substituted by
> [expr].

I'm pretty sure that [expr] is tokenizing the expression prior to
substitution. Since there are no patterns which go {a b c} or even {a
b}, you get an error. First [expr] validates the form of the
expression, otherwise it wouldn't know what branch to go into next. {$a$b}
is passed unsubstituted to [expr] and is equivalent to {$a $b},
whereas $a$b is passed as a substituted 12, so it isn't that operators
are not substituted, it is more that you never get to the substitution
step before [expr] figures out the expression is invalid. In addition,
the quotes which surround a part have the effect of telling [expr]
that this is a single token, so the expression can pass the syntax
checking stage:

% set a 1;set b 2
2
% expr $a$b
12
% expr {$a$b}
syntax error in expression "$a$b": extra tokens at end of expression

% expr {$a $b}
syntax error in expression "$a $b": extra tokens at end of expression
% expr {$a}
1
% expr {$a$b * 2}
syntax error in expression "$a$b * 2": extra tokens at end of
expression
% expr {"$a$b" * 2}
24
% expr {"$a$b"}
12
% expr {"a$b"}
a2
% expr {"a$b" * 2}
can't use non-numeric string as operand of "*"
% set a a
a
% expr {"$a$b" * 2}
can't use non-numeric string as operand of "*"

Now we have passed substitution and we are on to the type check/
conversion phase:

% set a 0x
0x
% expr {"$a$b" * 2}
4
% set b f
f
% expr {"$a$b" * 5}
75
% set b ""
% expr {"$a$b" * 5}
can't use integer value too large to represent as operand of "*"

The variety of potential error messages here points to the need to do
some additional checking prior to using expr. For instance, even:
% string is integer $b
1
Use instead:
% string is integer -strict $b
0
An empty string is a very common cause for these strange error
messages which show up, sometimes years after you write your code.

Schelte Bron

Nov 10, 2007, 3:20:14 PM
tom.rmadilo wrote:
> The safest way to do this is to convert to integers, base 10:
> % set hex_chars ff
> ff
> % set mask_chars "02"
> 02
> % set a [format %i 0x$hex_chars]
> 255
> % set mask [format %i 0x$mask_chars]
> 2
Format is not the right tool for this job. This is what scan is
perfect for:

% scan $hex_chars %x a
1
% scan $mask_chars %x mask
1

> % set xor [expr $mask ^ $a]
> 253

Now there's no excuse not to brace that expression.


% set xor [expr {$mask ^ $a}]
253


Schelte
--
set Reply-To [string map {nospam schelte} $header(From)]

tom.rmadilo

Nov 10, 2007, 10:25:03 PM
On Nov 10, 12:20 pm, Schelte Bron <nos...@wanadoo.nl> wrote:
> Format is not the right tool for this job. This is what scan is
> perfect for:
>
> % scan $hex_chars %x a
> 1
> % scan $mask_chars %x mask
> 1
>
> > % set xor [expr $mask ^ $a]
> > 253
>
> Now there's no excuse not to brace that expression.
> % set xor [expr {$mask ^ $a}]
> 253

After my last round of examples above, I agree 100%. The un-braced
expression has the potential to abuse the first round of Tcl
substitution.

I'm far from 100% convinced that scan is either safe or desirable (not
that I think format wonderful). For instance:

% set hex_chars 12g
12g


% scan $hex_chars %x a
1

% puts $a
18

In other words, there is no validation or data integrity here. In some
cases, the input will pass the test, even though it is just a string
that starts with [0-9a-f]. Errors like this could pass through
undetected for a very long time. For both code development and long
term data security, failing fast is very important.

Scan says: find as much of a match as you can find, ignore the rest.
This is why I think that in general it is a very bad idea to combine
string substitution, parsing, formatting, scanning into a single line
of code. Usually the consequences are unimportant, and everyone can
decide for themselves. Personally I try to make a habit of verifying
stuff before use in [expr], [if], [while] or [for]. Not even braced
expressions save you from potential problems, but the error messages
are slightly more informative:

% set a ""
% expr {$a * 5}
can't use empty string as operand of "*"
% expr $a * 5
syntax error in expression " * 5": unexpected operator *

My suggestion is to do something like this:
if {![string is integer -strict $a]} {
    return_and_notify_user_of_bad_a_using_helpful_language
}

99.44% of the time, the [if] condition will evaluate to '0', which is
what you want, but when it doesn't, you get a user error, not
something that the developer/maintainer may get tasked with tracking
down.
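
For the hex strings in this thread, a check along those lines could be sketched as follows. Note the swap: [string is integer] rejects a bare "ff" because it lacks the 0x prefix, so this sketch uses [string is xdigit] instead. The proc name requireHex and the error wording are hypothetical.

```tcl
# requireHex is a hypothetical helper name.
proc requireHex {s} {
    # -strict makes the empty string fail instead of silently passing.
    if {![string is xdigit -strict $s]} {
        error "expected hex digits but got \"$s\""
    }
    return $s
}

foreach candidate {ff 12g {}} {
    if {[catch {requireHex $candidate} msg]} {
        puts "rejected: $msg"
    } else {
        puts "accepted: $candidate"
    }
}
```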

Schelte Bron

Nov 11, 2007, 5:40:55 AM
tom.rmadilo wrote:
> I'm far from 100% convinced that scan is either safe or desirable
> (not that I think format wonderful). For instance:
>
> % set hex_chars 12g
> 12g
> % scan $hex_chars %x a
> 1
> % puts $a
> 18
>
> In other words, there is no validation or data integrity here. In
> some cases, the input will pass the test, even though it is just a
> string
> that starts with [0-9a-f]. Errors like this could pass through
> undetected for a very long time. For both code development and
> long term data security, failing fast is very important.
>
If checking for such a condition is desired I usually do:
if {[scan $hex_chars %x%1s a -] != 1} {
    return_and_notify_user_of_bad_hex_chars_using_helpful_language
}

Schelte.

suchenwi

Nov 11, 2007, 6:44:45 AM
On 11 Nov., 11:40, Schelte Bron <nos...@wanadoo.nl> wrote:
> If checking for such a condition is desired I usually do:
> if {[scan $hex_chars %x%1s a -] != 1} {
> return_and_notify_user_of_bad_hex_chars_using_helpful_language
> }

A simple check for integers, which in Tk brings up the usual "expected
integer but got ..." error dialog, is also
incr myvar 0
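
Outside Tk, the same check can be wrapped in [catch] so the script decides what to do with the failure. A minimal sketch with an assumed sample value; note that this tests for integers as Tcl reads them, so a bare hex string such as ff is rejected as well.

```tcl
set myvar 12g

# incr by 0 leaves a valid integer unchanged and raises an error otherwise.
if {[catch {incr myvar 0} msg]} {
    puts "rejected: $msg"
}
```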

Tobias Hippler

Nov 12, 2007, 3:15:08 AM
Glenn Jackman wrote:
> At 2007-11-09 05:54AM, "Tobias Hippler" wrote:
>> another way would be using double quotes around the operators inside the
>> curly braces:
>>
>> set hex F0
>> set cks [format %x [expr {"0x$cks" ^ "0x$hex"}]]
>> puts "cks: [format %02X 0x$cks]"
>
> To be precise (I love to nitpick), you mean:
> double quotes around the _operands_
>
> The operator is ^
>

you are so right!
thanks for bugfixing this :)

Helmut Giese

Nov 12, 2007, 3:58:23 AM
Please excuse the delay.

Thanks to everybody who explained what is going on here - it did
further my understanding of some of Tcl's subtleties, like "[expr] has
its own rules for parsing, and these are almost - but not quite - like
Tcl's rules".

I also appreciate the remarks about improving the efficiency of the
code I posted. In the context where it was used, I have to admit
that efficiency was my least concern.
We are developing a serial device which speaks a binary protocol, and
in order to test it we needed to provide some input. Inputting hex
data can be done with some monitor tools, but working with any kind of
checksum calls for a special solution.
The following lines are the central part of this "application": The
user enters space separated hex values like 'AB CD EF', and then the
script takes over.
---
gets stdin line
set hexLst [split $line " "]
# in case we have multiple spaces
while { [set idx [lsearch $hexLst ""]] != -1 } {
    set hexLst [lreplace $hexLst $idx $idx]
}
# calculate checksum
set cks 0
foreach hex $hexLst {
    set cks [format %x [expr 0x$cks ^ 0x$hex]]
}
# ... add it
lappend hexLst $cks
# ... and send everything off
foreach hex $hexLst {
    puts -nonewline $fd [format %c 0x$hex]
}
---
Using the '0x$hex' idiom for [expr] seemed like the easiest way.

BTW now the spec has changed: The checksum is now a CRC - oh well,

Thanks again to all of you and best regards
Helmut Giese

Andreas Leitgeb

Nov 12, 2007, 8:30:35 AM
Helmut Giese <hgi...@ratiosoft.com> wrote:
> foreach hex $hexLst {
> set cks [format %x [expr 0x$cks ^ 0x$hex]]
> }
I don't see the reason why cks is converted back and forth to hexadecimal:
set cks 0; foreach hex $hexLst { set cks [expr {$cks ^ [scan $hex %x]}] }

> lappend hexLst $cks
here convert it to hex just once:
lappend hexLst [format %x $cks]

> BTW now the spec has changed: The checksum is now a CRC - oh well,

Whichever way it is calculated, there is no need to convert the
interim results to and from hexadecimal at each iteration.
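
Put together, the integer-only loop might be sketched like this. The proc name xorChecksum is hypothetical, and the input is assumed to be a list of valid hex byte strings.

```tcl
# xorChecksum is a hypothetical helper name.
proc xorChecksum {hexList} {
    set cks 0
    foreach hex $hexList {
        # Parse each hex string once; cks stays an integer throughout.
        scan $hex %x value
        set cks [expr {$cks ^ $value}]
    }
    # Convert to hexadecimal only once, at the end.
    return [format %02X $cks]
}

puts [xorChecksum {AB CD EF}]   ;# prints 89
```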

tom.rmadilo

Nov 12, 2007, 12:03:58 PM
On Nov 12, 5:30 am, Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at>
wrote:

Someone also pointed out to me that an input line like this is already
a tcl list, so you can skip all the splitting and just jump to the
foreach:

set chs 05
set fd stdout

foreach hex [gets stdin] {
    puts -nonewline $fd [format %c [expr 0x$chs ^ 0x$hex]]
}
# for testing:
puts "\ndone"


Kaitzschu

Nov 12, 2007, 12:24:28 PM
On Mon, 12 Nov 2007, tom.rmadilo wrote:

> Someone also pointed out to me that an input line like this is already a
> tcl list, so you can skip all the splitting and just jump to the
> foreach:

Pointed you wrong.

% foreach x [gets stdin] {puts $x}
foo { x
unmatched open brace in list
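
A way to turn an arbitrary input line into words without treating it as a list is to extract the runs of non-whitespace characters. This is a sketch of one option, not something suggested in the thread; it also handles repeated spaces, and the sample lines are made up.

```tcl
# Multiple spaces between values, as a user might type them.
set line "AB  CD   EF"
set hexLst [regexp -all -inline {\S+} $line]
puts [llength $hexLst]   ;# prints 3

# A line that is not a well-formed Tcl list is still split into words.
set words [regexp -all -inline {\S+} "foo \{ x"]
puts [llength $words]    ;# prints 3
```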

--
-Kaitzschu
s="TCL ";while true;do echo -en "\r$s";s=${s:1:${#s}}${s:0:1};sleep .1;done

"Good thing programmers don't build bridges
[the same way as Kaitzschu writes code]."
--John Kelly in comp.lang.tcl

tom.rmadilo

Nov 12, 2007, 1:30:40 PM
On Nov 12, 9:24 am, Kaitzschu <kaitzs...@kaitzschu.cjb.net.nospam.plz.invalid> wrote:
> On Mon, 12 Nov 2007, tom.rmadilo wrote:
> > Someone also pointed out to me that an input line like this is already a
> > tcl list, so you can skip all the splitting and just jump to the
> > foreach:
>
> Pointed you wrong.
>
> % foreach x [gets stdin] {puts $x}
> foo { x
> unmatched open brace in list

The original code would also error out at a later point, mine just
does it faster. The assumption in the example by Helmut is that only
whitespace and hex values between 00 and FF are present. I'm sticking
with these assumptions.

But much worse, I never did a checksum, here is another try:

set chs 0
set fd stdout

foreach hex [gets stdin] {
    set chs [expr $chs ^ 0x$hex]
    puts -nonewline $fd [format %c 0x$hex]
}
# puts checksum
puts -nonewline $fd [format %c $chs]


But it is a good thing that they have decided to move to a better
checksum, because the above checksum could be a whitespace character,
or some other control character.

Ralf Fassel

Nov 13, 2007, 3:25:07 AM
* "tom.rmadilo" <tom.r...@gmail.com>

| But it is a good thing that they have decided to move to a better
| checksum, because the above checksum could be a whitespace
| character, or some other control character.

Isn't that true for any checksum? Usually checksums are printed in
Hex (or some other char-to-integer encoding) anyway, no?

R'

tom.rmadilo

Nov 13, 2007, 10:38:31 AM
On Nov 13, 12:25 am, Ralf Fassel <ralf...@gmx.de> wrote:
> * "tom.rmadilo" <tom.rmad...@gmail.com>

Probably, but here the checksum is integer-to-char, the integer
checksum was converted to a single character. But this would still be
okay if you did a few other things:
1. make sure you send the length of the actual data
2. send the data
3. send the checksum (1 char)
4. okay, that is really one other thing...

In a line buffered system, you could easily lose some whitespace, you
might even end up sending a cr or crlf sequence in the data itself. To
avoid this, you need a length specification so the reader can just
read chars, which are just 8-bit integers. The input hex chars are
assumed safe in this application, with only values 0-f and whitespace.
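
A length-prefixed frame of that kind could be sketched as follows. The layout (one length byte, the data bytes, one XOR checksum byte) and the proc name buildFrame are assumptions for illustration, not the protocol of the actual device, and a single length byte limits the data to 255 bytes.

```tcl
# buildFrame is a hypothetical helper; the frame layout is an assumption.
proc buildFrame {hexList} {
    set cks 0
    set data ""
    foreach hex $hexList {
        scan $hex %x value
        set cks [expr {$cks ^ $value}]
        # "c" stores the low 8 bits of the integer as one byte.
        append data [binary format c $value]
    }
    # Length byte first, then the data, then the checksum byte.
    return [binary format c [llength $hexList]]$data[binary format c $cks]
}

set frame [buildFrame {AB CD EF}]
binary scan $frame H* dump
puts $dump   ;# prints 03abcdef89
```

When writing such a frame to a channel, the channel would presumably need to be put into binary mode first (fconfigure $fd -translation binary) so that no line-ending translation alters the bytes.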
