Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Array defaults vs exception overhead

0 views
Skip to first unread message

Alexandre Ferrieux

unread,
May 10, 2007, 7:14:49 PM5/10/07
to
Hi,

The following idiom happens often in my code:

if {[info exists a($x)]} {do_something $a($x)}

Problem: the above does the hash lookup twice. Argh.
A natural answer is:

catch {do_something $a($x)}

But unfortunately, this is even slower (from 7.6 to 8.4.14) due to the
exception unwinding overhead.
Moreover, it is not strictly equivalent since it loses any exception
from [do_something].
To restore equivalence, we'd need pattern matching on the error...
even slower !

Any better idiom doing just one hash lookup ?

Another approach would be to have "array defaults": an option to
substitute a common value to all undefined slots of an array instead
of raising an exception. This one would also allow things like a blind
"incr a($x)".
I remember an old discussion on this. What's the current status ?

Yet another approach would be more general, and akin to a Bourne Shell
mechanism: "variable defaults".
The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined.
The natural extension to arrays yields

do_something ${a($x):-nope}

(here [do_something] is supposed to handle 'nope' innocuously, at
least faster than a 2nd hash lookup)

Reactions ?

-Alex

Roy Terry

unread,
May 10, 2007, 8:17:21 PM5/10/07
to
"Alexandre Ferrieux" <alexandre...@gmail.com> wrote in message
news:1178838889.5...@o5g2000hsb.googlegroups.com...

> Hi,
>
> The following idiom happens often in my code:
>
> if {[info exists a($x)]} {do_something $a($x)}
>
> Problem: the above does the hash lookup twice. Argh.
> A natural answer is:
>
> catch {do_something $a($x)}
>
> But unfortunately, this is even slower (from 7.6 to 8.4.14) due to the
> exception unwinding overhead.
It sort of depends on what percentage of time you expect the array slot
to be undefined. If slot is usually defined then the catch is overall less
overhead,
otherwise info exist wins.
I have also long wished for a more concise solution to this situation. But
speed is usually not an issue as [info exists] is really quite fast. Still
you
end up coding the array reference twice which can be annoying when
the index is built of 3 or 4 substrings. Sometimes I use a macro to
automatically test and fetch in one call.

I wonder, does the Tcl interp "cache" array lookups so that immediately
following identical look ups are faster? Doubtful.

> Moreover, it is not strictly equivalent since it loses any exception
> from [do_something].
> To restore equivalence, we'd need pattern matching on the error...
> even slower !
>
> Any better idiom doing just one hash lookup ?
>
> Another approach would be to have "array defaults": an option to
> substitute a common value to all undefined slots of an array instead
> of raising an exception. This one would also allow things like a blind
> "incr a($x)".
> I remember an old discussion on this. What's the current status ?

Re incr, that issue is addressed in 8.5 (it auto creates the variable).
I can't see "array defaults" going anywhere, but who knows.

My thought (all of 10 seconds):
info value a(x) defaultValue
Could have info exists baked-in and if non-existent,
just return the supplied default otherwise return the
wanted value.

>
> Yet another approach would be more general, and akin to a Bourne Shell
> mechanism: "variable defaults".
> The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined.
> The natural extension to arrays yields
>
> do_something ${a($x):-nope}

Interesting. However I'd don't think I'd enjoy reading Tcl Core comments on
a TIP of this sort :^(

Roy

Koen Danckaert

unread,
May 11, 2007, 4:21:08 AM5/11/07
to
> The following idiom happens often in my code:
>
> if {[info exists a($x)]} {do_something $a($x)}
>
> Problem: the above does the hash lookup twice. Argh.
> A natural answer is:
>
> catch {do_something $a($x)}
>
> But unfortunately, this is even slower (from 7.6 to 8.4.14) due to the
> exception unwinding overhead.
> Moreover, it is not strictly equivalent since it loses any exception
> from [do_something].
> To restore equivalence, we'd need pattern matching on the error...
> even slower !
>
> Any better idiom doing just one hash lookup ?

upvar 0 a($x) val
if {[info exists val]} {do_something $val}

I've not tested whether it's faster or slower...

-- Koen

Neil Madden

unread,
May 11, 2007, 4:32:36 AM5/11/07
to
Alexandre Ferrieux wrote:
> Hi,
>
> The following idiom happens often in my code:
>
> if {[info exists a($x)]} {do_something $a($x)}
>
> Problem: the above does the hash lookup twice. Argh.

Is this really a problem? If it really is, then you could use [array get]:

foreach {_ val} [array get a $x] { do_something $val }

Alternatively, if you want a default value, then you can do:

do_something [append a($x) ""]
do_something [incr a($x)] ;# 8.5

...


> Yet another approach would be more general, and akin to a Bourne Shell
> mechanism: "variable defaults".
> The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined.
> The natural extension to arrays yields
>
> do_something ${a($x):-nope}

Ewwwww. Yuck! :)

-- Neil

sleb...@gmail.com

unread,
May 11, 2007, 6:20:42 AM5/11/07
to
On May 11, 4:32 pm, Neil Madden <n...@cs.nott.ac.uk> wrote:

> Alexandre Ferrieux wrote:
> > Yet another approach would be more general, and akin to a Bourne Shell
> > mechanism: "variable defaults".
> > The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined.
> > The natural extension to arrays yields
>
> > do_something ${a($x):-nope}
>
> Ewwwww. Yuck! :)
>
> -- Neil

Yucky only if implemented as a syntax extension. It looks clean
implemented in tcl:

proc get {var {default {}}} {
upvar 1 $var v
if {[info exists v]} {
return $v
}
return $default
}

# Usage (as per original example):
do_something [get a($x) -nope]

Alexandre Ferrieux

unread,
May 11, 2007, 7:05:58 AM5/11/07
to
On May 11, 10:32 am, Neil Madden <n...@cs.nott.ac.uk> wrote:
> Alexandre Ferrieux wrote:
> > if {[info exists a($x)]} {do_something $a($x)}
> > Problem: the above does the hash lookup twice. Argh.
>
> Is this really a problem? If it really is, then you could use [array get]:
>
> foreach {_ val} [array get a $x] { do_something $val }

Does [array get a $x] optimize for the case where $x has no
wildcards ?
If yes, you win.
If no, then [array get a $x] will replace a hash lookup by a linear
search.
Which yucks, as you say ;-)

> Alternatively, if you want a default value, then you can do:
>
> do_something [append a($x) ""]
> do_something [incr a($x)] ;# 8.5

I don't want side-effects populating the array ! The above will fill
the hashtable...

> > do_something ${a($x):-nope}

> Ewwwww. Yuck! :)

Do you Yuck at the precise {:-} syntax, or at the fact that it is a
syntax extension ?

-Alex

sleb...@gmail.com

unread,
May 11, 2007, 7:33:27 AM5/11/07
to
On May 11, 7:05 pm, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

Don't know about Neil but for me personally YUCK for both. There is a
very simple rule in Tcl: {} is non-subst quoting. Even when we use it
as a syntax to disambiguate $ substitution like ${x}y the behavior of
{} haven't changed, it's still non-subst quoting. Making it behave any
other way will lead to quoting hell. I've had enough of that from ""
thankyouverymuch, now don't go messing with my {}!

Concerning the fact that it's a syntax extension, we can easily
achieve the same effect, even preserving the format of the syntax,
using a regular proc. Lispers consider doing it with regular proc
"syntax extension". Of course, since it looks so ordinary and easy in
tcl, Tclers don't consider it a "syntax extension". No, instead every
once in a while we get suggestions to introduce even more special
characters to tcl. Perhaps we really should be thinking more like the
Lisp people and do it with procs whenever we feel like "extending" the
syntax. Only in cases where it cannot be implemented as procs/commands
should we consider "real" syntax extension.


Alexandre Ferrieux

unread,
May 11, 2007, 8:04:59 AM5/11/07
to
On May 11, 12:20 pm, "slebet...@yahoo.com" <slebet...@gmail.com>
wrote:

> On May 11, 4:32 pm, Neil Madden <n...@cs.nott.ac.uk> wrote:
>
> > Alexandre Ferrieux wrote:
> > >
> > > do_something ${a($x):-nope}
>
> > Ewwwww. Yuck! :)
>
> Yucky only if implemented as a syntax extension. It looks clean
> implemented in tcl:
> do_something [get a($x) -nope]

The idea of moving out of syntax extensions is good. (I mean, I'd
have preferred the {:-} but it seems /bin/sh has fewer fans than I
expected).

Now, to implement efficiently your [get], even at the C level, we need
to cancel the overhead of exception unwinding. I have no definitive
explanation, but I suspect it is linked with the buildup
of ::errorInfo. So we'd need an internal flag saying "don't build
errorInfo when crawling back from exception". And come to think of
it, this flag could even be put to use at the tcl level:

fastcatch {code} err

--> like [catch], but without building errorInfo.
Of course this flag would be set/reset in nested fashion by the
various [catch] and [fatscatch]:

catch {
f
fastcatch {
g
catch {
h
} ;# here I have errorInfo from h
} ;# here I have no errorInfo
} ;# here I have errorInfo from f but not from g

Reactions ?

-Alex

Kevin Kenny

unread,
May 11, 2007, 9:28:58 AM5/11/07
to
sleb...@yahoo.com wrote:
> Concerning the fact that it's a syntax extension, we can easily
> achieve the same effect, even preserving the format of the syntax,
> using a regular proc. Lispers consider doing it with regular proc
> "syntax extension". Of course, since it looks so ordinary and easy in
> tcl, Tclers don't consider it a "syntax extension". No, instead every
> once in a while we get suggestions to introduce even more special
> characters to tcl. Perhaps we really should be thinking more like the
> Lisp people and do it with procs whenever we feel like "extending" the
> syntax. Only in cases where it cannot be implemented as procs/commands
> should we consider "real" syntax extension.

Right! That's why {*} went in. Interpolation of a list as multiple
arguments to a command was something that couldn't be neatened up
with a proc; anything that was available was really no better than
[eval [linsert $args 0 $command]]... with all the dangers of [eval].

Fear not, most of the TCT understand Lisp. Well, a Smug Lisp
Weenie would say that if we *really* understood Lisp, we'd abandon
Tcl and switch to Lisp. We tend to be more open-minded. (I
can just hear the Smug Lisp Weenies snorting and muttering,
"empty-headed, rather!")

--
73 de ke9tv/2, Kevin

Neil Madden

unread,
May 11, 2007, 3:25:41 PM5/11/07
to
Alexandre Ferrieux wrote:
> On May 11, 10:32 am, Neil Madden <n...@cs.nott.ac.uk> wrote:
>> Alexandre Ferrieux wrote:
>>> if {[info exists a($x)]} {do_something $a($x)}
>>> Problem: the above does the hash lookup twice. Argh.
>> Is this really a problem? If it really is, then you could use [array get]:
>>
>> foreach {_ val} [array get a $x] { do_something $val }
>
> Does [array get a $x] optimize for the case where $x has no
> wildcards ?
> If yes, you win.
> If no, then [array get a $x] will replace a hash lookup by a linear
> search.
> Which yucks, as you say ;-)

Good point. :-S

>> Alternatively, if you want a default value, then you can do:
>>
>> do_something [append a($x) ""]
>> do_something [incr a($x)] ;# 8.5
>
> I don't want side-effects populating the array ! The above will fill
> the hashtable...

Details, details...

>
>>> do_something ${a($x):-nope}
>
>> Ewwwww. Yuck! :)
>
> Do you Yuck at the precise {:-} syntax, or at the fact that it is a
> syntax extension ?

Mostly both. I'm not entirely against syntax extensions (I'd still like
syntax for [expr]...), but I don't think this needs it. I also think
{:-} is particularly yucky :)

There are a number of places where it might be nice to avoid two
hash-lookups:
array ifpresent $arr $key $cmd; # call $cmd with value if present
array default $arr $key $default; # return val or default
array replace $arr $key $value; # return old value
array setifabsent $arr $key $val
array setifpresent $arr $key $val
Along with equivalents for [dict]. However, I'm not sure that the cost
is worth it. A much better solution would be to figure out if the result
of the lookup can be cached in the key Tcl_Obj so that subsequent
lookups are faster. I thought dicts already did this, but I can't see
anything in the code.

Thinking about it, those kinds of atomic operations are also what you'd
want in a thread-shared array. I wonder if the Thread extension contains
any useful code for this?

-- Neil

Stephan Kuhagen

unread,
May 11, 2007, 3:58:04 PM5/11/07
to
Neil Madden wrote:

>>> do_something [append a($x) ""]
>>> do_something [incr a($x)] ;# 8.5
>>
>> I don't want side-effects populating the array ! The above will fill
>> the hashtable...
>
> Details, details...

How about:

set myArray(some) thing
lindex [concat [array get myArray some] default: theThingValue] 1
lindex [concat [array get myArray noSuchIndex] default: theNoneValue] 1

Also a little bit long, but no exception, no array population and no [info
exists]...

Besides this, I also would like to have defaults for none existing array
values, but I also would like to avoid syntax extensions. I'd prefer a
[array] option for this, e.g.

array get -default "theValue" myArray theIndex

should return {theIndex theValue}.

Regards
Stephan

Alexandre Ferrieux

unread,
May 11, 2007, 3:58:52 PM5/11/07
to
On May 11, 9:25 pm, Neil Madden <n...@cs.nott.ac.uk> wrote:
> >> do_something [append a($x) ""]
> >> do_something [incr a($x)] ;# 8.5
>
> > I don't want side-effects populating the array ! The above will fill
> > the hashtable...
>
> Details, details...

Well, if the hit-miss ratio is large, that makes a hefty detail...

> {:-} is particularly yucky :)

Look at that line ! Two smileys fighting one-on-one !

> There are a number of places where it might be nice to avoid two
> hash-lookups:
> array ifpresent $arr $key $cmd; # call $cmd with value if present

> ...


> Along with equivalents for [dict]. However, I'm not sure that the cost
> is worth it.

Then, what about the [fastcatch] in my other post ? A single primitive
allowing to write all
these variants (array and dict as well) in pure Tcl !

> [cached lookups]

Elegant, but hides a few quirks. If you store the key with the pointer
to the collision list element, you get extra housekeeping at deletion
time (at least); if instead you store it with a bucket index, you
still have to scan the collision list...
But I'm not saying this is redhibitory; I'm ready to be convinced.
Which one is simplest to implement: fatscatch or cached lookups ?

-Alex

Uwe Klein

unread,
May 11, 2007, 4:04:14 PM5/11/07
to

the possibility to trace on {element|variable} not found

trace add variable notfound ...

would be a boon anyways .
currently if you want to use a sparse array that is filled on request
you have to trace the complete array.


uwe

0 new messages