8.1 slower than 8.0???


John McLaughlin

May 8, 1999
I was playing with the newly released 8.1 the other day and decided to
take a peek at performance. I honestly expected it to be the same or
perhaps a hair better, but I am surprised to find that it actually
appears to be quite a bit worse!

First, my setup:

Red Hat Linux 5.2
Tcl 8.0.3 (as it comes with Red Hat)
Tcl 8.1 (./configure with no options, then compiled)
450MHz PII

For the simplest things 8.1 seems a hair faster, but as the work
gets more complicated it seems to slow down quite a bit!

                       8.0       8.1
                     ------    -------
Empty Proc             6us       5us
Factorial 10         480us     512us

String Reversal
  of len 128         1.4ms     1.7ms
  of len 4096         41ms     532ms   (yes, that's correct: more than 10x slower?!?)

I tried recompiling 8.1 with -O4, but it didn't change the results at
all. I realize that this is not an exhaustive test, but for the few test
cases *I* have run, 8.1 ends up being slower. Has anyone done research
on 8.1 performance? Is this known, or did I do something wrong?

The other question this begs is: what *is* the performance of 8.1 (in
general) compared to 8.0?

-John

(For reference, my fact & string reversal scripts are below. I really
have not tried to optimize them; I just tossed them together.)

proc fact { n } {
    if { $n == "1" } {
        return 1
    }

    return [ expr $n*[fact [ expr $n - 1 ] ] ]
}


proc tclStrrev s {
    set rev ""
    set length [ string length $s ]

    for { set i [ expr $length - 1 ] } { $i >= 0 } { incr i -1 } {
        append rev [ string index $s $i ]
    }
    return $rev
}
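
For anyone who wants to reproduce numbers like those in the table, a
harness along the following lines should behave the same in both shells.
This is an illustrative sketch, not the script actually used; [time]
reports microseconds per iteration:

proc nop {} {}

# Build the test strings with a loop; [string repeat] doesn't exist in 8.0.
set s128 ""
for { set i 0 } { $i < 16 } { incr i } { append s128 "abcdefgh" }
set s4096 ""
for { set i 0 } { $i < 512 } { incr i } { append s4096 "abcdefgh" }

puts "empty proc:   [time {nop} 10000]"
puts "factorial 10: [time {fact 10} 10000]"
puts "strrev 128:   [time {tclStrrev $s128} 100]"
puts "strrev 4096:  [time {tclStrrev $s4096} 10]"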

Mbaccar

May 9, 1999
Hi

I wrote a substantial script in Tcl (1100+ lines of code I cannot
disclose). This script makes use of all the features of Tcl
(arrays, regexp, strings, lists, etc.).

When I saw the previous note, I ran a quick comparison myself. I get
approximately 2x the performance when I use Tcl 8.0 vs. 8.1:
Tcl8.0:
Start Execution Time: 05/08/1999 23:05:53
End Execution Time: 05/08/1999 23:07:09
(1 minute 16 seconds)

Tcl8.1
Start Execution Time: 05/08/1999 23:03:10
End Execution Time: 05/08/1999 23:05:11
(2 minutes and 1 second)

I am using NT 4.0 on a P233 MMX system.
The executables come directly from the Scriptics
public area (tcl8.0.5 and tcl8.1).

Thanks
Mohamed


Chang LI

May 9, 1999
Mbaccar wrote:
>

Tcl 8.1 is noticeably slower to load than Tcl 8.0, especially Tk 8.1,
and Tcl 8.1 does more initialization. Your result may include the load
time.


> I wrote a substantial script with tcl (1100+ lines
> of code I cannot disclose). This script makes use of all features of tcl
> (array, regexp, string, list, etc...)
>
> When I saw the previous note, I ran a quick comparison myself. I get
> approximately 2X
> the performance when I use tcl8.0 vs. 8.1:
> :

[snip]

>
> Thanks
> Mohamed

--
--------------------------------------------------------------
Chang LI, Neatware
email: cha...@neatware.com
--------------------------------------------------------------

Bruce S. O. Adams

May 9, 1999

Chang LI wrote:

Is it possible that this is just an interim problem? Tcl 8.1 still has a
lot of growing to do, several patch levels' worth at least. Maybe some of
the modules have accidentally been compiled with debugging switched on?
Has anyone had this problem on another platform (i.e. one for which
binary distribution isn't the preferred method)?
Regards,
Bruce A.


Dave LeBlanc

May 9, 1999
A good proportion of the 8.1 slowdown is likely attributable to Unicode.
I believe what you type/load into Tcl in ASCII gets converted to
Unicode and then back to ASCII for output.

Someday we'll have OSes that support Unicode across the board, and this
won't happen anymore.

Dave LeBlanc

John McLaughlin

May 9, 1999
Bruce & Chang,

As I mentioned in my first post, I started from source code and compiled it
under Linux. I was using tclsh interactively from the command line, so I
don't think there are any issues with 'load' time; also, since I ran this
under tclsh, there were no Tk issues involved. (I ran the experiment
numerous times within one tclsh session.)

I also confirmed that both tclsh's used the same libc, libm, libdld, etc.
The only difference was the libtcl they used (8.0 vs 8.1). I did try
playing with compile options (-O vs -O4) but it didn't change the results.

So for *my* case I am pretty comfortable there was nothing obviously
wrong. It just appears as if 8.1 is (in some cases) substantially slower
than 8.0. I can't believe this is by design, and I hope it's just a
defect. (Anyone care to 'quantify' 8.0 vs 8.1?)

-Jm

Bruce S. O. Adams <bruce...@rmc-ltd.com> wrote in message
news:3735A9AA...@rmc-ltd.com...


> Chang LI wrote:
>
> > Mbaccar wrote:
> >
> > Tcl8.1 is noticable slower to load than Tcl8.0. Especially for Tk 8.1.
> > And Tcl8.1 do more initialization. Your result may include the load
> > time.
> >
> > > I wrote a substantial script with tcl (1100+ lines of code I cannot
> > > disclose). This script makes use of all features of tcl
> > > (array, regexp, string, list, etc...)
> > >
> > > When I saw the previous note, I ran a quick comparison myself. I get
> > > approximately 2X the performance when I use tcl8.0 vs. 8.1:
> >
> > [snip]

Bruce Stephens

May 9, 1999
whi...@accessone.com (Dave LeBlanc) writes:

> A good proportion of 8.1 slow down is likely attributable to
> Unicode. I believe what you type/load into tcl in ascii gets
> converted to unicode and then back to ascii for output.

The internal form is supposed to be UTF-8, and the conversion of ASCII
to UTF-8 ought to be pretty fast!

Slowdown could happen because of string operations becoming more
tricky (since a "character" isn't necessarily a single byte any more).

That's plausible, but I've no idea whether it's true or not. Why
doesn't somebody who cares do some profiling?

Bruce Stephens

May 9, 1999
Bruce Stephens <br...@cenderis.demon.co.uk> writes:

> Slowdown could happen because of string operations becoming more
> tricky (since a "character" isn't necessarily a single byte any
> more).
>
> That's plausible, but I've no idea whether it's true or not. Why
> doesn't somebody who cares do some profiling?

OK, I've done some. First, a bit of arithmetic:

# Begin
proc fact {n} {
    if {$n==0} {
        return 1;
    }
    return [expr {$n*[fact [expr {$n-1}]]}]
}

puts [time {fact 10} 10000]
# End

This is slightly slower in tcl8.1 than in tcl8.0 (305 microseconds per
iteration vs 290 microseconds, on a 300MHz AMD K6-2, running Linux
2.2.7-ac2).

Now a string operation:

# Begin
proc reverse {s} {
    set t ""
    set len [string length $s]
    for {incr len -1} {$len>=0} {incr len -1} {
        append t [string index $s $len]
    }
    return $t
}

puts [time {reverse "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz"} 1000]
# End

This is *way* slower in 8.1: 11642 microseconds per iteration vs 3096
microseconds per iteration.

The profile gives it away:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 34.34      1.71     1.71  16696762     0.00     0.00  Tcl_UtfToUniChar
 18.07      2.61     0.90    106000     0.01     0.02  Tcl_NumUtfChars
 16.06      3.41     0.80      2022     0.40     2.45  TclExecuteByteCode
  8.03      3.81     0.40    105000     0.00     0.01  Tcl_UniCharAtIndex
  2.61      3.94     0.13    106002     0.00     0.03  Tcl_StringObjCmd
  2.41      4.06     0.12    105067     0.00     0.00  Tcl_SetVar2Ex
  2.01      4.16     0.10    215141     0.00     0.00  ResetObjResult

So, Tcl_UtfToUniChar gets called a lot! That suggests an
optimisation: store a flag in string objects which says "only ASCII
here", and arrange for some relevant functions to check for this and
use optimised algorithms. (You'd need to invalidate it appropriately,
too.)

Does this seem worthwhile to anyone else? Anybody got profiles from a
real application?

Chang LI

May 9, 1999
Bruce Stephens wrote:
>

> # Begin
> proc reverse {s} {
>     set t ""
>     set len [string length $s]
>     for {incr len -1} {$len>=0} {incr len -1} {
>         append t [string index $s $len]
>     }
>     return $t
> }
>
> puts [time {reverse "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
> abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz"} 1000]
> # End
>

This algorithm does a lot of 16-bit/8-bit conversion.
Why is there no [string set $s $ix $c] command?
Reversing the string by swapping in place might be fast,
and it might avoid the 16/8-bit switching.

Mbaccar

May 10, 1999
Hi
2 minutes, 1 second versus 1 minute, 16 seconds using tclsh (not wish) from
console.
This cannot be attributed to load time (45 seconds load time?). My script runs
from a batch file as follows:

dos-prompt> tclsh80 ..\cvlib.tcl -g -L b2lib.log -o b2lib.v *.lib > b2lib.dbg

and the same for tclsh81; I simply replaced the executable name. There are
no slave interpreters or loadable packages in the script
(pure Tcl).

Thanks
Mohamed

Dave Warner

May 10, 1999
Thanks to all those who contributed information to this thread.

I'll do my own benchmarking and if these results are confirmed:
1. I'll continue using 8.0.4 (hate to lose the new regexp)
2. Why did Scriptics waste our time?

Dave

Jeffrey Hobbs

May 10, 1999, to da...@lucent.com
Dave Warner wrote:
> I'll do my own benchmarking and if these results are confirmed:
> 1. I'll continue using 8.0.4 (hate to lose the new regexp)
> 2. Why did Scriptics waste our time?

I'll be interested in your benchmarking results. If possible,
please compile both locally yourself, so that you know each has
the same opt levels.

Personally, I am not seeing any significant performance degradation.
I'm not trying anything that ever took more than a minute to load
(I have never coded a system that took that long... what is it?),
but the differences to me seem slight.

A few points about 8.1, though. One person mentioned the possible
effect of Unicode. This is not to be underestimated. I don't see
how the conversion stuff can be much better without going to
assembly code, but it is there. There isn't a lot you can do with
strings that doesn't touch them. This hit is going to occur in any
system smart enough to deal with multi-byte chars throughout.

You also mentioned that you use regexp. Is it a lot? In earlier
alphas/betas, there was definite chatter about the loss of performance
with the new regexp features. I think that has been cleaned up some,
but it might not be the speed demon it was in its simpler, pre-Unicode
days. It may have some efficiency tricks still waiting to be tweaked,
but new features and support for Unicode/binary is going to take
something from it.

Tk was also victimized. I'd be interested to hear more about this
curiosity with TK_USE_INPUT_METHODS and its effect on speed. Since
you are on Windows, this shouldn't be a factor, but it was mentioned
on the [incr Tcl] mailing list that removing this on X could result
in factor 10-100 speedup on some widget interactions. It has something
to do with excessive X server interaction caused by the def'n of the
above, but I'm not quite sure of the overall effect that has on the
usability of Tk (on what systems is the above required?).

A further point on Tk in 8.1. It isn't yet fully obj-ified, but 8.1
has intro'd a new object version of the option specs for a widget.
While this is not yet finished, in the few widgets it is used, it
actually provides a significant speed increase. For example, I
converted the ::vu::pie widget and succeeded in reducing a cget from
27 usecs to 14 usecs (consistent).

So there are proven bright points there. I would say that we consider
your app a good point to start profiling. Perhaps there are a few
corner cases that really took a performance hit from 8.0->8.1 that you
happen to be hitting. It would be good for us all to know what those
could be, and especially for Scriptics to know where it might have
missed a bottleneck.

Personally, I've switched over wholesale to 8.1 and am satisfied.
However, there is no doubt that in such a massive upgrade (don't let
the minor version change fool you) there is still room for improvement
(heck, there is always room for improvement, but in this case, we
want to look at the speed improvements).

** Jeffrey Hobbs jeff.hobbs @SPAM acm.org **
** I'm really just a Tcl-bot My opinions are MY opinions **


Bruce S. O. Adams

May 10, 1999

Bruce Stephens wrote:

> OK, I've done some. First, a bit of arithmetic:

[snip]

> This is *way* slower in 8.1: 11642 microseconds per iteration vs 3096
> microseconds per iteration.

[snip]

> So, Tcl_UtfToUniChar gets called a lot! That suggests an
> optimisation: store a flag in string objects which says "only ASCII
> here", and arrange for some relevant functions to check for this and
> use optimised algorithms. (You'd need to invalidate it appropriately,
> too.)
>
> Does this seem worthwhile to anyone else? Anybody got profiles from a
> real application?

Very worthwhile. Sounds like a patch brewing. I would suggest sending
your profiling results on to Scriptics via the bug form in case they miss
this thread. Unicode support is important, but it shouldn't have such a
performance hit. Keep up the good work. Just out of interest, what did
you use to do the profiling? Something in the Linux kernel, or a less
well known option to gcc?
Regards,
Bruce A.


Jeffrey Hobbs

May 10, 1999, to Bruce S. O. Adams
> > Bruce Stephens <br...@cenderis.demon.co.uk> writes:

I couldn't find the original to this post on profiling,
so I'm responding to a response...

> > > Slowdown could happen because of string operations becoming more
> > > tricky (since a "character" isn't necessarily a single byte any
> > > more).
> > >
> > > That's plausible, but I've no idea whether it's true or not. Why
> > > doesn't somebody who cares do some profiling?
> >
> > OK, I've done some. First, a bit of arithmetic:
> >
> > # Begin
> > proc fact {n} {
> >     if {$n==0} {
> >         return 1;
> >     }
> >     return [expr {$n*[fact [expr {$n-1}]]}]
> > }
> >
> > puts [time {fact 10} 10000]
> > # End

OK, I did this, but with slightly different semantics. Instead of
{fact 10}, I did {fact $n} where I knew $n was an int object. In
this case, I am actually getting 8.1 to be consistently faster (by
no more than 5% though). FWIW, this is on an Ultra1, 8.0.4 compiled
-O without shared libs, 8.1.1 (my personal one) compiled -O and with
shared libs.

> > Now a string operation:
> >
> > # Begin
> > proc reverse {s} {
> >     set t ""
> >     set len [string length $s]
> >     for {incr len -1} {$len>=0} {incr len -1} {
> >         append t [string index $s $len]
> >     }
> >     return $t
> > }
> >
> > puts [time {reverse "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
> > abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz"} 1000]

> > This is *way* slower in 8.1: 11642 microseconds per iteration vs 3096
> > microseconds per iteration.

> > Flat profile:
> >
> > Each sample counts as 0.01 seconds.
> >   %   cumulative   self              self     total
> >  time   seconds   seconds     calls  ms/call  ms/call  name
> >  34.34      1.71     1.71  16696762     0.00     0.00  Tcl_UtfToUniChar
> >  18.07      2.61     0.90    106000     0.01     0.02  Tcl_NumUtfChars
> >  16.06      3.41     0.80      2022     0.40     2.45  TclExecuteByteCode
> >   8.03      3.81     0.40    105000     0.00     0.01  Tcl_UniCharAtIndex
> >   2.61      3.94     0.13    106002     0.00     0.03  Tcl_StringObjCmd
> >   2.41      4.06     0.12    105067     0.00     0.00  Tcl_SetVar2Ex
> >   2.01      4.16     0.10    215141     0.00     0.00  ResetObjResult
> >
> > So, Tcl_UtfToUniChar gets called a lot! That suggests an
> > optimisation: store a flag in string objects which says "only ASCII
> > here", and arrange for some relevant functions to check for this and
> > use optimised algorithms. (You'd need to invalidate it appropriately,

Actually, in making my string patch, I thought that perhaps another value
that stored the Unicode string length would be helpful (right now, the
overall byte length is stored). Of course, if the Unicode length equals
the byte length, there are no multibyte chars. This would be helpful, I
think, in numerous areas. However, it would require a bit of effort to
ensure that you don't break things along the way.

As for the above "reverse"... Are you using 8.1.0 pure? I'm wondering
where the excessive number of NumUtfChars calls comes from, as the 8.1.0
string index doesn't use it. Of the above, it should only be in the
[string len]. Anyway, the excessive UtfToUniChar calls have been
removed in the new index method I had in the patch.

In Tcl 8.1.1 (8.1.0 with my string patch), on the above, I get
~6200 usecs for string reverse, whereas 8.0.4 gives me ~3100.
That is a factor of 2, but much better than Bruce experienced above.
Given his profiling, this is due in great part to the changes I
made to STR_INDEX.

And another tidbit. Since I added NumUtfChars to STR_INDEX to support
the end-1 style indices, I added another little optimization to avoid
that if possible (checking the obj type), and now I get ~4100 usecs for
the reverse command in 8.1.1. That's down to <25% performance loss.

So, one problem mostly solved. More importantly, this points out
that there is room for improvement (which is better than saying, "well,
the new stuff prevents better performance"). Examples like Bruce's
profiling above point out simple cases that make things easy to home in
on. Bruce, did you use the standard gprof for that? Also, just for
fairness, you didn't compile only 8.1 with all the debugging and symbols
to get your profiling info, did you? That would definitely cause an
unfriendly hit on the numbers.


Jeffrey Hobbs

May 10, 1999, to br...@cenderis.demon.co.uk
Jeffrey Hobbs wrote:
> > > Bruce Stephens <br...@cenderis.demon.co.uk> writes:

> > > This is *way* slower in 8.1: 11642 microseconds per iteration vs 3096
> > > microseconds per iteration.

After further inspection (trying this against 8.1.0), I am suspecting that
Bruce disadvantaged 8.1 in some way (compiled -g, with symbols and
profiling, or something vs. -O), because in an equal environment here I
can't get near this much skew in the numbers... Bruce?


ldup...@cmocbx.qc.bell.ca

May 10, 1999
Jeffrey Hobbs <Jeffre...@icn.siemens.de> wrote:
> After further inspection (trying this against 8.1.0), I am suspecting that
> Bruce disadvantaged 8.1 in some way (compiled -g, with symbols and
> profiling, or something vs. -O), because in an equal environment here I
> can't get near this much skew in the numbers... Bruce?

Can't tell you how many jokes this comment brings to mind. And they all
deal with living around the northwestern US area. :-)

Laurent

lvi...@cas.org

May 10, 1999

According to Bruce Stephens <br...@cenderis.demon.co.uk>:
:Does this seem worthwhile to anyone else? Anybody got profiles from a
:real application?

If you mean "does the profile information seem worthwhile" - yes.

Thank you Bruce!

--
<URL: mailto:lvi...@cas.org> Quote: Saving the world before bedtime.
<*> O- <URL: http://www.purl.org/NET/lvirden/>
Unless explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.

Chang LI

May 10, 1999
I post a real small program below. The program parses the CGI FORM
string. The results are:

ParseForm Test      8.0 (ms)     8.1 (ms)
--------------------------------------------
      1              2,200        4,400
      2              3,300        7,700
      3              4,400       11,000
--------------------------------------------

The PC is a Pentium 75 running Win95. The small cache size may hurt
performance, but the sad story is that 8.1 is slower than 8.0.
I have not figured out where the bottleneck is, but I think
this program is suitable for a benchmark test.

ParseForm
=================================================================
# *****************************************************************
# ParseForm     parse the form string.
#   aFormInfo   returns the key/value array; same-key values are
#               separated by a '\0' character.
# Description   reads the method from global env(REQUEST_METHOD) and
#               the input string from env(QUERY_STRING); returns the
#               result in the array aFormInfo.
# Return        true if successful, otherwise false.
# *****************************************************************
proc ParseForm {aFormInfo} {
    upvar $aFormInfo sFormData

    set sReqMethod $::env(REQUEST_METHOD)
    switch -exact -- $sReqMethod {
        POST    {set sQuery [read stdin $::env(CONTENT_LENGTH)]}
        GET     {set sQuery $::env(QUERY_STRING)}
        default {return false}
    }

    set pairs [split $sQuery &]
    foreach item $pairs {
        set pair  [split $item =]
        set key   [lindex $pair 0]
        set value [lindex $pair 1]

        regsub -all {\+} $value { } value

        while {[regexp {%[0-9A-Fa-f][0-9A-Fa-f]} $value matched]} {
            scan $matched "%%%x" hex
            set symbol [format %c $hex]
            regsub -all $matched $value $symbol value
        }

        if {[info exists sFormData($key)]} {
            append sFormData($key) "\0" $value
        } else {
            set sFormData($key) $value
        }
    }
    return true
}

proc test {type display} {
    set ::env(REQUEST_METHOD) GET
    switch $type {
        1 {set ::env(QUERY_STRING) \
               {user=David%20Robert&age=35}}
        2 {set ::env(QUERY_STRING) \
               {user=David%20Robert&age=35&user=Frank%20Cook&age=20}}
        3 {set ::env(QUERY_STRING) \
               {user=David%20Robert&age=35&user=Frank%20Cook&age=20&user=Steve%20Monk&age=30}}
    }

    puts [time {ParseForm sForm} 100]
    if {$display} {
        puts "user=$sForm(user)"
        puts "age=$sForm(age)"
    }
}

test 1 0
return
=================================================================

Jeffrey Hobbs

May 10, 1999, to br...@cenderis.demon.co.uk
Jeffrey Hobbs wrote:
> Jeffrey Hobbs wrote:
> > > > Bruce Stephens <br...@cenderis.demon.co.uk> writes:
>
> > > > This is *way* slower in 8.1: 11642 microseconds per iteration vs 3096
> > > > microseconds per iteration.

> ... in an equal environment here I
> can't get near this much skew in the numbers... Bruce?

I just love answering myself. Let me correct the above. I can see
such a skew when the string passed in becomes large. Whereas
string index was constant over the size of the string in 8.0, in 8.1
the larger the index requested, the more time it takes (because you
are no longer grabbing a byte out of a string, but searching down that
string until you know which byte(s) you really meant). That is
definitely something that could benefit from a flag saying whether
the string object is known to contain Unicode or not (or another value
which gives the Unicode length).


Bruce Stephens

May 10, 1999
Jeffrey Hobbs <Jeffre...@icn.siemens.de> writes:

> I just love answering myself. Let me correct the above. I can
> receive such a skew when the string passed in becomes large.
> Whereas the string index was constant over the size of the string in
> 8.0, in 8.1 the large an index is requested, the more time it takes
> (because you are no longer grabbing a byte out of a string, but
> searching down that string until you know which byte(s) you really
> meant). That is definitely something that could benefit from a flag
> saying whether the string object is known to contain unicode or not
> (or another val which gives unicode length).

Yes, that's it. It's to do with long strings. Here's another example
(which I ran with tcl8.0.5 and tcl8.1 (HEAD) fresh from cvs on Sunday,
both with -O):

# Begin
set a "0123456789"
append a $a; append a $a
append a $a; append a $a
append a $a; append a $a
set s ""

set test {string index $s $len}
for {set i 0} {$i<30} {incr i} {
    append s $a
    set len [expr {[string length $s]-1}]
    puts "[string length $s]\t[time $test 100]"
}
# End

This is really obviously constant (with blips, probably not important)
with 8.0:

640 8 microseconds per iteration
1280 8 microseconds per iteration
1920 8 microseconds per iteration
2560 11 microseconds per iteration
3200 8 microseconds per iteration
3840 8 microseconds per iteration
4480 26 microseconds per iteration
5120 8 microseconds per iteration
5760 8 microseconds per iteration
6400 8 microseconds per iteration
7040 8 microseconds per iteration
7680 8 microseconds per iteration
8320 8 microseconds per iteration

and it looks pretty linear (again, give or take) with 8.1:

640 89 microseconds per iteration
1280 183 microseconds per iteration
1920 320 microseconds per iteration
2560 342 microseconds per iteration
3200 443 microseconds per iteration
3840 535 microseconds per iteration
4480 679 microseconds per iteration
5120 718 microseconds per iteration
5760 771 microseconds per iteration
6400 876 microseconds per iteration
7040 1031 microseconds per iteration
7680 1252 microseconds per iteration
8320 1146 microseconds per iteration

Bruce Stephens

May 10, 1999
"Bruce S. O. Adams" <bruce...@rmc-ltd.com> writes:

> Very worthwhile. Sounds like a patch brewing.

Possibly. This is only an artificial bit of code, though. What's
important is how much this affects real code.

> I would suggest sending your profiling results on to scriptics via
> the bugform in case they miss this thread.

I'm sure they're aware of the issues. It must have been considered at
the design stage.

> Just for interest, what did you use to do the profiling? something
> in the linux kernal or a less well known option to gcc?

It may be "less well known", but it shouldn't be---it's well
documented. Compile and link using -pg, then running your program
will produce a "gmon.out" file. Then use gprof to look at it. (I
suggest building using "make CC='gcc -pg'" or something---CFLAGS
doesn't get used when linking.)

This isn't precise, though---the profile can itself be altered by the
profiling, and there are other issues too. Results should be taken
with a healthy grain of salt.

Bruce Stephens

May 10, 1999
Jeffrey Hobbs <Jeffre...@icn.siemens.de> writes:

> After further inspection (trying this against 8.1.0), I am
> suspecting that Bruce disadvantaged 8.1 in some way (compiled -g,
> with symbols and profiling, or something vs. -O), because in an
> equal environment here I can't get near this much skew in the
> numbers... Bruce?

I didn't, honest! (Well, I don't think so, anyway.) I compiled both
with -pg. It's possible that the profiling interfered with 8.1 more
than 8.0, however---I'm not entirely clear of the implementation, so
it could be that -pg adds a cost to function calls, in which case 8.1
would be unfairly penalised compared with 8.0.

lvi...@cas.org

May 10, 1999

According to Dave LeBlanc <whi...@accessone.com>:
:A good proportion of 8.1 slow down is likely attributable to Unicode.
:I believe what you type/load into tcl in ascii gets converted to
:unicode and then back to ascii for output.
:
:Someday we'll have OS's that support Unicode across the board and this
:won't happen anymore.
:
:Dave LeBlanc


I guess I don't see much of any action in the future that is going to
make handling Unicode as fast as handling 7-bit ASCII. Well, I'll take
that back: specialized silicon instructions, assembly-code runtime
functions, etc., perhaps. Other than that, multibyte Unicode is likely
to be slower to handle than single-byte 7-bit ASCII.

It's one of the prices of progress... Perhaps Tcl/Tk needs a
--disable-unicode configure option...

Chin Huang

May 10, 1999
In article <7h77rl$hqj$1...@srv38s4u.cas.org>, <lvi...@cas.org> wrote:
>I guess I don't see much of any action in the future that is going to
>make handling Unicode as fast as handling 7 bit ASCII. Well, I'll take
>that back - specialized silicon instructions , assembly code run time
>functions, etc. perhaps. Other than that, multibyte Unicode is likely
>to be slower to handle than single 7 bit ASCII.

It's slow precisely because Tcl represents strings in UTF-8 format, in
which a character may be represented by 1 to 3 bytes. If you're willing
to trade off storage for speed, a possible solution is to adopt a string
representation where every character is stored in a wchar_t (a wide
character type, which is 16 bits in most C compiler implementations).

Scott Redman

May 10, 1999
Not every character set can be represented in wchar_t, though,
hence the decision to go with UTF-8 (and I think Java uses UTF-8
as well, someone correct me if I'm wrong). There is some extra
copying going on for strings, but there are also some new things
related to thread safety that may be causing some of the slowdown
as well (we may just need some better caching in certain places).

If anyone has speed improvements, please submit them to c.l.t.
and/or the Scriptics bug form. If you have any test cases
that show a particularly bad slowdown, please submit those as
well. We are aware of some of the problems already, and will
be looking into them as soon as we get a chance.

-- Scott

Scott Stanton

May 10, 1999, to Bruce Stephens
Bruce Stephens wrote:

> > I would suggest sending your profiling results on to scriptics via
> > the bugform in case they miss this thread.
>
> I'm sure they're aware of the issues. It must have been considered at
> the design stage.

We are definitely aware of *some* issues, but I doubt if we are aware of all of
them. It never hurts to ensure that the problems are logged in the database.

Also, good benchmarks are quite valuable when it comes time to do performance
tuning on a subsystem. One of the problems we have is that there really isn't
a good performance benchmark suite for Tcl. It's easy to come up with examples
of code that perform badly in any particular version of Tcl. What's harder is
coming up with a representative set of examples that will catch the bottlenecks
that affect a lot of people. We don't want to waste our time tuning bits of
code that aren't executed often enough to be relevant.

Sometimes the issue is one of changing scripts to take advantage of
optimizations that have been made (e.g. in 8.0 lists became very fast). Other
times it's a matter of putting in special optimizations in the implementation
to help with common problems (e.g. regexp caching in 8.1 needs to be reworked a
little bit).

I would encourage people to try to come up with a good set of test cases that
demonstrate real problems with Tcl's performance. We can start a list of known
performance bottlenecks with workarounds and/or patches that alleviate the
problems.

To get the ball rolling, here's one I know of:

1. Regexp

Problem: In 8.0 and earlier versions, compiled regexps were cached in a small
MRU cache within each interpreter. In 8.1 regexps became Tcl objects,
effectively removing the hard limit on the number of compiled regexps that can
be retained. Unfortunately it is common usage to create regexps by
concatenating two or more strings, which effectively eliminates any benefit you
might get from the Tcl_Obj mechanism (you get a new object every time you do a
concatenation).

Workaround: Avoid concatenations in regexp. For example, I often see code
like:

set ws "\[\n\t \]"
regexp "a${ws}*b${ws}*c" $string

Instead, take advantage of the new 8.1 regexp features and use constant
expressions wherever possible:

set pat {a\s*b\s*c}
regexp $pat $string

In general, do whatever you can to avoid losing the object that contains the
compiled expression.
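
One can see the difference directly by timing the two styles; the
following is an illustrative sketch (the test string and iteration
counts are made up, and the absolute numbers will vary):

set str "a  b  c"
set ws "\[\n\t \]"

# The pattern word is rebuilt on every evaluation, so the compiled form
# attached to the previous pattern object is never reused:
puts [time {regexp "a${ws}*b${ws}*c" $str} 1000]

# A constant pattern stays in one object and is compiled only once:
set pat {a\s*b\s*c}
puts [time {regexp $pat $str} 1000]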

Solution: We need to change the 8.1 implementation to introduce a second level
cache similar to the one in 8.0. This will allow us to avoid recompilation in
cases where the same computed string value is passed in, even when it comes
from a different object. We can probably use a thread-local cache, instead of
a per-interp cache to improve sharing and make the regexp code more modular.


If you know of other performance issues with workarounds and/or suggested
fixes, please post them. If anyone would be willing to help collect these, I
can make sure we get them up on the Scriptics resource center.

___________________________________________________________
Scott Stanton                         650-210-0105 tel
Sr. Software Engineer                 650-210-0101 fax
Scriptics Corporation                 scott....@scriptics.com
The Tcl Platform Company              http://www.scriptics.com

Chang LI

May 10, 1999
Scott Redman wrote:
>

Does Tcl 8.1 mix up 1-, 2-, and 3-byte sequences to express strings
internally? If it does, this will cause a great performance drop,
because you will pack and repack the characters. The main problem is
that for long strings it may cause cache misses and reduce performance.
You might speed up the processing by accessing a word (2 bytes) or
dword (4 bytes) at a time through the 32-bit registers.

> Not every character set can be represented in wchar_t, though,
> hence the decision to go with UTF-8 (and I think Java uses UTF-8
> as well, someone correct me if I'm wrong). There is some extra
> copying going on for strings, but there are also some new things
> related to thread safety that may be causing some of the slowdown
> as well (we may just need some better caching in certain places).
>
> If anyone has speed improvements, please submit them to c.l.t.
> and/or the Scriptics bug form. If you have any test cases
> that show a particularly bad slowdown, please submit those as
> well. We are aware of some of the problems already, and will
> be looking into them as soon as we get a chance.
>
> -- Scott
>


Scott Stanton

May 10, 1999, to cha...@neatware.com
Chang LI wrote:
>
> I post a real small program in the following paragraphs.
> The program parses the cgi FORM string.

This is a very interesting test case. It points out an 8.1-specific
performance issue, but it is more a matter of usage than anything that
really needs to be fixed in 8.1. The test script makes heavy use of the
global "env" array to pass information between procedures. This is very
expensive in 8.1 because of changes related to multithreading and fixes
to various bugs in 8.0.

Rewriting the example to use a normal array, I get numbers that are much
closer. On a Pentium II/333 with 64MB, Windows NT 4.0, I get the following
numbers:

              Tcl 8.1 (usec/iter)   Tcl 8.0 (usec/iter)
test 1 0            1211                  1051
test 2 1            1232                  1042
test 3 2            1222                  1051

The conclusion I draw from this is that you should not use the global "env"
array as a general data storage area. There is a lot of mechanism behind "env"
that is needed to keep it in sync with multiple interps and the C environ
array.
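
For concreteness, here is one way such a rewrite might look. This is a
sketch of the idea, not Scott's actual code (ParseForm2 is a made-up
name); the request data is passed in as an ordinary value, so the timed
loop never reads the traced "env" array:

proc ParseForm2 {aFormInfo sQuery} {
    upvar $aFormInfo sFormData
    foreach item [split $sQuery &] {
        set pair  [split $item =]
        set key   [lindex $pair 0]
        set value [lindex $pair 1]
        regsub -all {\+} $value { } value
        # Decode %XX escapes, as in the original ParseForm.
        while {[regexp {%[0-9A-Fa-f][0-9A-Fa-f]} $value matched]} {
            scan $matched "%%%x" hex
            regsub -all $matched $value [format %c $hex] value
        }
        if {[info exists sFormData($key)]} {
            append sFormData($key) "\0" $value
        } else {
            set sFormData($key) $value
        }
    }
    return true
}

# Read env (or stdin) once, outside the timed loop:
set query {user=David%20Robert&age=35}
puts [time {ParseForm2 sForm $query} 100]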

John McLaughlin

May 10, 1999
Scott,

This is an interesting (and somewhat sore) point. The application I use
Tcl in is extremely performance sensitive (though I think my application
might be a bit of an extreme example).

Although it's fine to suggest *changes* in scripts to speed them up (and
we did, with arrays & lists in the 7.6 -> 8.0 transition), in general it's
not 'ok' for us to tell our customers to rewrite scripts to get the speed
back up to where it was. So for now I think 8.1 is out of the picture
(and yes, the 'tclStrrev' script was from real live customer scripts).

Originally we chose Tcl because:

a) It was _so_ easy to integrate
b) You can't 'crash' it (no pointers, automatic garbage collection, etc.)
c) It was 'fast enough'
d) It has a simple syntax for the new user.

Since that time we have pushed Tcl into some pretty speed-sensitive areas
of our application. 8.0 was a nice boost, but 8.1 seems (so far) to be
heading in the wrong direction *for us*. Don't get me wrong: I think
Scriptics is doing a great job and they are trying to make everyone happy,
but I wonder what we, the community, can do to help push the performance
issue (or is performance not an issue for most people?).

What I would *love* to see is a "Tcl Performance Project": something that
would not only assist in how to write fast scripts but also look at
modifications to the core language, improving the on-the-fly compiler,
providing benchmarks to help compare versions, new extensions to improve
performance, etc. There is some good information at
http://216.71.55.6/cgi-bin/wikit/348.html, and tools like Swig can make
improving performance pretty painless. It would be nice to see all of
this in one place (and perhaps a co-ordinated effort).

Yes, I do realize that the words 'performance improvement' will have as
many meanings as there are people reading this message, but we *should*
be able to come up with some goals/ideas that work for a large number of
people.

-John

Scott Stanton wrote:

> Chang LI wrote:
> >
> > I post a real small program in the following paragraphs.
> > The program parses the cgi FORM string.
>
> This is a very interesting test case. It points out an 8.1 specific
> performance issue, but is more of a matter of usage rather than anything that
> really needs to be fixed in 8.1. The test script makes heavy use of the global
> "env" array to pass information between procedures. This is very expensive in
> 8.1 because of changes related to multithreading and fixes to various bugs in
> 8.0.

[snip]


Duncan Barclay

May 10, 1999
In article <3736A0C7...@icn.siemens.de>,
Jeffrey Hobbs <Jeffre...@icn.siemens.de> writes:

> Tk was also victimized. I'd be interested to hear more about this
> curiosity with TK_USE_INPUT_METHODS and its effect on speed. Since
> you are on Windows, this shouldn't be a factor, but it was mentioned
> on the [incr Tcl] mailing list that removing this on X could result
> in factor 10-100 speedup on some widget interactions. It has something
> to do with excessive X server interaction caused by the def'n of the
> above, but I'm not quite sure of the overall effect that has on the
> usability of Tk (on what systems is the above required?).

Jeff,

I thought that we'd come to the conclusion that this was an X
server related problem.

All the people using XFree86 (i.e. myself and the original poster) saw
problems, and those using other (vendor-supplied) servers had no
problems (this might of course be because their servers don't support
the input methods - we didn't check).

Duncan

--
________________________________________________________________________
Duncan Barclay | God smiles upon the little children,
dm...@ragnet.demon.co.uk | the alcoholics, and the permanently stoned.
________________________________________________________________________

Jeffrey Hobbs

May 11, 1999, to Duncan Barclay
Duncan Barclay wrote:
> In article <3736A0C7...@icn.siemens.de>,
> Jeffrey Hobbs <Jeffre...@icn.siemens.de> writes:
> > Tk was also victimized. I'd be interested to hear more about this
> > curiosity with TK_USE_INPUT_METHODS and its effect on speed. Since
> > you are on Windows, this shouldn't be a factor, but it was mentioned
> > on the [incr Tcl] mailing list that removing this on X could result
> > in factor 10-100 speedup on some widget interactions. It has something
> > to do with excessive X server interaction caused by the def'n of the
> > above, but I'm not quite sure of the overall effect that has on the
> > usability of Tk (on what systems is the above required?).

> I thought that we'd come to the conclusion that this was an X
> server related problem.
>
> All the people using XFree86 (i.e. myself and the original poster) saw
> problems and those using other (vendor supplied) servers had no
> problems (this might of course be because their servers don't support
> the input methods - we didn't check).

I would check to see whether the XNQueryInputStyle was defined on those
servers. For Sol2.5.1, I have it in:
/usr/openwin/include/X11/Xlib.h

I know that this will not be in all implementations, but when it is, it
will cause extra X server traffic. Then of course it depends on how you
use it (where the X server is, and where the X client is). If you are
doing thin clients w/o a local X server, you must endure those extra hits.
Someone else here took the test script that was passed around and found it
to be beneficial even for Solaris to remove that (maybe I should double
check how beneficial), but he doesn't access the Solaris server locally
(instead, exports it upstairs).

However, my question was really what is lost if that is then dropped out.
Will certain input devices not work with Tk on Unix, or is it in certain
odd keyboard configurations (like some Oriental language input where you
hit several keys as it builds your character incrementally). Since I've
never strayed that far from the input norm, I don't know whether cutting
that out is safe for normal use.

Jeffrey.Hobbs.vcf

Peter.DeRijk

May 11, 1999
Chin Huang (cth...@interlog.com) wrote:
: In article <7h77rl$hqj$1...@srv38s4u.cas.org>, <lvi...@cas.org> wrote:
: >I guess I don't see much of any action in the future that is going to
: >make handling Unicode as fast as handling 7 bit ASCII. Well, I'll take
: >that back - specialized silicon instructions , assembly code run time
: >functions, etc. perhaps. Other than that, multibyte Unicode is likely
: >to be slower to handle than single 7 bit ASCII.

: It's slow precisely because Tcl represents strings in UTF-8 format, in
: which a character may be represented by 1 to 3 bytes. If you're willing
: to trade-off storage for speed, a possible solution is to adopt a string
: representation where every character is stored in a wchar_t (wide character
: type which is 16-bits in most C compiler implementations).

I don't think that is necessary; some code distinguishing UTF-8 from real
ASCII should do. Someone suggested putting both the "byte" length and the
UTF-8 length in the object; if they are the same, the object contains
raw ASCII, and we can use the fast methods to manipulate the string,
otherwise use the slow method. This way performance is only sacrificed
on strings actually using non-ASCII characters.
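
In script terms, the check being described can be expressed with 8.1's
[string bytelength], which reports the size of the internal UTF-8
representation (a small sketch; isPlainAscii is a made-up name):

proc isPlainAscii {s} {
    # Equal character and byte counts means no multibyte sequences.
    expr {[string length $s] == [string bytelength $s]}
}

puts [isPlainAscii "hello"]        ;# 1
puts [isPlainAscii "h\u00e9llo"]   ;# 0 (the e-acute takes two bytes)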

--
Peter De Rijk der...@uia.ua.ac.be
<a href="http://rrna.uia.ac.be/~peter/personal/peter.html">Peter</a>
To achieve the impossible, one must think the absurd.
to look where everyone else has looked, but to see what no one else has seen.

Donal K. Fellows

May 11, 1999
In article <37376E9F...@scriptics.com>,

Scott Redman <red...@scriptics.com> wrote:
> Not every character set can be represented in wchar_t, though, hence
> the decision to go with UTF-8 (and I think Java uses UTF-8 as well,
> someone correct me if I'm wrong).

Java uses Unicode internally and all characters are 16 bits in size
(the language is defined this way.) UTF-8 is only an encoding used
when converting between byte streams and character streams. (I think
it is the default encoding, but I'm not too sure about that - the
default encoding might be locale-specific instead.)

Just how big is the Chinese character set anyway? (In terms of
defined glyphs, not how many bits the charset is encoded within.)
Does Unicode cover them all?

Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
-- The small advantage of not having California being part of my country would
be overweighed by having California as a heavily-armed rabid weasel on our
borders. -- David Parsons <o r c @ p e l l . p o r t l a n d . o r . u s>

Jean-Claude Wippler

May 11, 1999
Scott,

[B. Stephens suggests adding profile results to the Scriptics bug form]

> Sometimes the issue is one of changing scripts to take advantage of
> optimizations that have been made (e.g. in 8.0 lists became very
> fast). Other times it's a matter of putting in special optimizations
> in the implementation to help with common problems (e.g. regexp
> caching in 8.1 needs to be reworked a little bit).
>
> I would encourage people to try to come up with a good set of test
> cases that demonstrate real problems with Tcl's performance. We can
> start a list of known performance bottlenecks with workarounds and/or
> patches that alleviate the problems.

Donal Fellows did just that on the Tcl'ers Wiki a little while back.
This is a good example of how topics tend to fade from c.l.t - and can
be retained on a more permanent collaborative site:
http://purl.org/thecliff/tcl/wiki/TclPerformance

> To get the ball rolling, here's one I know of:
>
> 1. Regexp

I've taken the liberty to add this to the page mentioned above. If it
gets too long we can move it to a separate 8.1-specific page later.



> If you know of other performance issues with workarounds and/or
> suggested fixes, please post them. If anyone would be willing to
> help collect these, I can make sure we get them up on the Scriptics
> resource center.

I don't want to stand in the way of Scriptics collecting this valuable
information, but in my opinion this is the sort of stuff which can be
collected perfectly on the Tcl'ers wiki. There is no intermediate step,
anyone with a web browser can add and modify information on those pages.

In fact, one could treat the Tcl'ers Wiki as an ad-hoc submission form for
all sorts of things, with information culled from it once someone (at or
outside Scriptics) has time. It's there now, it can be filled and
refined by everyone in the Tcl community now, it can be copied to
another place at any time, and it can be replaced with a pointer once
the information has found a new home - avoiding broken links.

Do I care? Nah. I'm just pointing to a spot on the web which can solve
so many problems when it comes to "let's collect/organize X one day":
http://purl.org/thecliff/tcl/wiki/
And if you don't like the way something looks in there, just change it.

-- Jean-Claude

Donal K. Fellows

May 11, 1999
In article <3737C176...@mailexcite.com>,

John McLaughlin <joh...@mailexcite.com> wrote:
> What I would *love* to see is a "Tcl Performance Project", something
> that would not only assist in how to write fast scripts but also
> look at modifications to the core language, improving the on the fly
> compiler, providing benchmarks to help compare versions, new
> extensions to improve performance etc. There is some good
> information at http://216.71.55.6/cgi-bin/wikit/348.html and tools
> like Swig can make improving performance pretty painless -- It would
> be nice to see all of this in one place (and perhaps a co-ordinated
> effort).
>
> Yes I do realize that the word 'performance improvement' will have
> as many meanings as there are people reading this message but we
> *should* be able to come up with some goals/ideas that work with a
> large number of people.

Think of the web page you quoted as the beginning of thinking more
seriously about doing something about this!

As far as I can see, there are two main classes of things that can be
done:

1) Make the Tcl system faster. This involves increasing the spread
of Tcl_Obj's, the compiler and their ilk throughout the whole Tcl
core. I'm sure there are many possibilities that are not covered
yet here. On-the-fly native code generation is perhaps towards
the apex of this development tree...

2) Document how to make the most of existing facilities and when to
switch to using C. It is not always obvious just how to make
Tcl's existing pips squeak loudest, and even the very highly
experienced are still learning.

The first lends itself to collaborative coding efforts, and the second
lends itself to books. I'm happy to talk more with anyone on the
specifics of either, but I doubt I've got the time and energy to lead
either project. :^(

Christopher Nelson

May 11, 1999
Scott Stanton wrote:
> I would encourage people to try to come up with a good set of test cases that
> demonstrate real problems with Tcl's performance. We can start a list of known
> performance bottlenecks with workarounds and/or patches that alleviate the
> problems.

When HP was developing PA-RISC, they built a fully-instrumented CISC processor
out of discrete logic and invited their customers to bring real world
applications to HP to run on the instrumented processor. They ran millions
(maybe billions) of lines of code through their instrumented processor to get
instruction usage counts so that when they designed the instruction set for the
RISC chip, they *knew* they were implementing the most often used instructions
from real world applications.

Scriptics has an advantage in that the "processor" is in software. If you want
to know what bytecodes or sequences of bytecodes are used most in the real
world, put some profiling tools in a custom interpreter and let it loose on the
world. I volunteer to "set tcl_countExec 1" (or whatever) and run my
applications through it and send you the results.

Chris
--
Rens-se-LEER is a county. RENS-se-ler is a city. R-P-I is a school!

lvi...@cas.org

May 11, 1999

According to Scott Stanton <scott....@Scriptics.com>:
:The conclusion I draw from this is that you should not use the global "env"
:array as a general data storage area. There is a lot of mechanism behind "env"
:that is needed to keep it in sync with multiple interps and the C environ
:array.

It sure would be nice if there were an optimisation so that if only one
interpreter were being used, the overhead would not be invoked...

lvi...@cas.org

May 11, 1999

According to John McLaughlin <joh...@mailexcite.com>:
:This is an interesting (and somewhat sore) point. The application I use
:Tcl in is extremely performance sensitive (however I think my application
:might be

Some of the earliest writings from John Ousterhout that I remember discussed
performance and how tcl scripting was never intended to be fast - but fast
enough to get a job done. If "faster" was needed, then improvements to
the algorithm, or specialized Tcl commands, or finally rewriting the code
in some lower level language would be options for the programmers.

:Although it's fine to suggest *changes* in scripts to speed them up (and we did

Actually, what Scott suggested was not 'changes to speed things up' but
'identifying areas in Tcl that cause performance bottlenecks'. If someone
chooses to try to cram more data thru a bottleneck than will go easily, then
they have the choice of living with slower performance, living with an older
version of tcl (there are people who still regularly use Tcl from 4+ years
ago), or choosing another language.


:with arrays & lists in the 7.6 -> 8.0 transition) In general it's not
:'ok' for us to tell our customers to rewrite scripts to get the speed
:back up to where it was, so for now I think 8.1 is out of the picture
:(and yes the

Seems like a management choice that may have unexpected ramifications ("don't
tell the customers how their code is breaking Tcl fundamental principles...").


:Originally we choose Tcl because:


: a) It was _so_ easy to integrate

And it still should be.

: b) You can't 'crash' it (no pointers, automatic garbage
:collection, etc.)

Well, in general <grin>...

: c) It was 'fast enough'
Against whose criteria?

: d) It has a simple syntax for the new user.
It still has.

:Since that time we have pushed Tcl into some pretty speed sensitive areas
:of our application, 8.0 was a nice boost but 8.1 seems (so far) to be
:heading in the wrong direction *for us*. Don't get me wrong, I think
:Scriptics is

I am uncertain why readers of this newsgroup expect that a brand new
release of Tcl - which John has already stated should have been called
Tcl 9.0 because of the all-pervasiveness of the changes - will have all
the performance tweaks in it that are possible. It's certainly not the
way most products work...


:job and they are trying to make everyone happy but I wonder what we, the
:community, can do to help push the performance issue (or is performance
:not an issue for most people?).

1. Come up with specific scripts showing where Tcl 8.1 runs slower.
2. Be willing to change programs to work well in the new release.
3. Be willing to test new performance changes that people hack
   (a la J. Hobbs' new string patch, etc.) on the various platforms,
   compilers, and environments.

:What I would *love* to see is a "Tcl Performance Project", something that
:would make improving performance pretty painless -- It would be nice to
:see all of this in one place (and perhaps a co-ordinated effort).

Can anyone at Scriptics provide us with a comparative analysis of paying
customers' requests? Do twice as many customers paying for support request
better performance, or new features? Something along those lines.

In my (non-Scriptics-employee) viewpoint, for the non-paying customers,
one of the best things to do is leg work in terms of code changes, new
algorithms, etc. - all the things mentioned in relation to the Tcl
performance project. Once either a) the work (in terms of coding
changes, documentation, test cases, etc.) is available, b) the paying
customers generate enough demand, c) Scriptics is so successful that
they have money to burn, or d) Scriptics finds the performance is
negatively impacting their products, I would guess major performance gains
will garner enough attention to make it into the work assignments.

:Yes I do realize that the word 'performance improvement' will have as many
:meanings as there are people reading this message but we *should* be able
:to come up with some goals/ideas that work with a large number of people.

I think this is very valuable. What are the most important aspects of
performance? It seems like the first step would be to identify all the
aspects of performance that are of interest to users at least in this
forum. Then, benchmarks to measure the performance in each of these
areas could be coded. It would really be nifty if they were coded in such
a way that the benchmark suite could be executed against at least Tcl 7.4,
7.6, 8.0 and 8.1 . That should give people a baseline against which to
work. Then, doing profiling of the latest interpreter to see where time
is currently being spent allows one to narrow down where attention today
should be paid.

I don't know if returning to the 'glory' days of Tcl 6 or early 7 is possible.
But perhaps without corrupting the Tcl internal design some portion of the
performance for scripting can be regained.

Bruce Stephens

May 11, 1999, to Scott Stanton
Scott Stanton <scott....@Scriptics.com> writes:

> Bruce Stephens wrote:
>
> > > I would suggest sending your profiling results on to scriptics via
> > > the bugform in case they miss this thread.
> >
> > I'm sure they're aware of the issues. It must have been considered at
> > the design stage.
>
> We are definitely aware of *some* issues, but I doubt if we are
> aware of all of them. It never hurts to ensure that the problems
> are logged in the database.

I was referring specifically to the string changes. There *must* have
been discussion of the impact of changing to UTF-8. Quite possibly it
happened in full view on comp.lang.tcl---I don't remember, and I'm
offline at the moment so I can't check.

Presumably alternative approaches were considered and rejected. If
this discussion is archived somewhere at Scriptics, maybe you could
stick it on a web page somewhere, and everyone could see the
engineering considerations? Presumably it happened at Sun, though, so
perhaps it's not available. (I'm sure there are good reasons for
things being as they are---I can think of some.)

> Also, good benchmarks are quite valuable when it comes time to do
> performance tuning on a subsystem. One of the problems we have is
> that there really isn't a good performance benchmark suite for Tcl.
> It's easy to come up with examples of code that perform badly in any
> particular version of Tcl. What's harder is coming up with a
> representative set of examples that will catch the bottlenecks that
> affect a lot of people. We don't want to waste our time tuning bits
> of code that aren't executed often enough to be relevant.

Yep. Good benchmarks would be valuable. Unfortunately, I don't have
any. The example I gave of repeatedly getting the last character of a
string nicely shows how {string index $a $n} is linear in $n in 8.1
but constant in 8.0, but how important is that in real programs?
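
For reference, that test was essentially the following (a sketch; the
absolute numbers will of course vary with machine and build):

# Repeatedly fetch the last character of a long string.  In 8.0 the
# lookup is constant time; in 8.1 each call walks the UTF-8 bytes.
set s ""
for {set i 0} {$i < 10000} {incr i} {append s x}
set last [expr {[string length $s] - 1}]
puts [time {string index $s $last} 1000]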

An obvious pretty big Tcl/Tk application is exmh. Maybe you could do
some kind of tests with that, or with bits of it? (Or tclhttpd?
Maybe that's dominated by I/O, though?)

I was thinking of doing a few tests with tkman, but looking at the
code suggests that its optimisations were mostly pre-8.0: it has lots
of expr with unquoted arguments, which is poor style now, and will
have a performance impact. It may well use other things which worked
better in 7.6 than they do in 8.0. It may be a good benchmark, but
changing Tcl to make tkman faster may not be a useful goal, as you
suggest---sometimes changing the script is better.
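
To make the expr point concrete, the difference is between handing expr
a bare expression and a braced one (standard advice since 8.0; the
variables here are only for illustration):

set x 6; set y 7

# Unbraced: Tcl substitutes $x and $y first, so expr must re-parse the
# resulting string on every call and the byte-code compiler cannot
# compile the expression inline.
set z [expr $x * $y + 1]

# Braced: compiled once, variables read directly -- faster, and safe
# even if the values contain characters that look like expr syntax.
set z [expr {$x * $y + 1}]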


Here's an outline for improving the ASCII (7-bit) string handling:
store a flag somewhere for each string which says whether it contains
multibyte characters, and use and update this appropriately in all
operations which create or alter strings. Updating it ought to be
reasonably cheap, I think. (Storing the length of the string in
characters and bytes would provide the same opportunities for
optimisation, but I'm guessing it would be more expensive to keep up
to date; I could be wrong, however.)

That has the disadvantage that there'll be performance
discontinuities: many string operations will be constant in time, but
the first multibyte character will make them linear, and slower.
Depending on the implementation details, it might be possible to add a
multibyte character to a string and then remove it, but the object
might still be considered multibyte (i.e., it might be possible for
scripts to construct 7-bit ASCII strings which Tcl thinks might
contain multibyte characters). That would also be ugly, but maybe
it's possible to arrange that that can't happen.


I'm guessing that there are two distinct uses of strings: one is
predominantly 7-bit ASCII (variable names, command names, widget
names, command options, etc.), and then there's the more obvious uses
where the strings are data (and may well be multibyte).

Unfortunately, the former probably aren't any slower in 8.1 than they
are in 8.0---they're probably quite short, for one thing---and so the
above change probably won't make much difference to them (presuming
that the housekeeping really is negligible). I'm guessing that there
are lots of programs that are going to be dealing with data which is
basically 7-bit---enough so that the change I'm proposing is a win
overall.

A program dealing with Canadian names will have some which contain
accents, but most will probably be 7-bit ASCII. Hmm, actually, these
would be short names anyway---in big chunks of text, there'd mostly be
a single accented character, I'd guess. Darn.

Hmm. OK, my suggestion is probably highly ASCII-centric. Maybe it's
not worth doing---maybe you just need to improve the UTF-8 handling,
or adopt a fixed width character representation---perhaps switched, so
some strings would be 1-byte, some 2, and some more (and lose easy
compatibility with the old C API, which is presumably partly what made
UTF-8 look attractive---you could always map to and from UTF-8,
though, so it's surely not too bad).

Chang LI

unread,
May 11, 1999, 3:00:00 AM5/11/99
to
John McLaughlin wrote:
>
> Since that time we have pushed Tcl into some pretty speed sensitive areas of our
> application, 8.0 was a nice boost but 8.1 seems (so far) to be heading in the
> wrong direction *for us*. Don't get me wrong, I think Scriptics is doing a great
> job and they are trying to make everyone happy but I wonder what we, the
> community, can do to help push the performance issue (or is performance not an
> issue for most people?).
>

I do think performance is an issue. For example, when you write a CGI
program with Tcl and run it on a server, that is a big issue.

Is it possible to have a compile-time option to disable Unicode in
Tcl 8.1? I like Tcl 8.1 with multi-threading and stubs. Most of the time
we will not use Unicode. If we want an international version of a piece
of software, we can produce a special edition. A small performance drop
would be acceptable for 8.1.



> What I would *love* to see is a "Tcl Performance Project", something that would
> not only assist in how to write fast scripts but also look at modifications to the
> core language, improving the on the fly compiler, providing benchmarks to help
> compare versions, new extensions to improve performance etc. There is some good
> information at http://216.71.55.6/cgi-bin/wikit/348.html and tools like Swig can
> make improving performance pretty painless -- It would be nice to see all of this
> in one place (and perhaps a co-ordinated effort).
>

I agree. I think the kernel can be optimized to minimize the gap.

> Yes I do realize that the word 'performance improvement' will have as many
> meanings as there are people reading this message but we *should* be able to come
> up with some goals/ideas that work with a large number of people.
>

> -John

Chang LI

unread,
May 11, 1999, 3:00:00 AM5/11/99
to
Christopher Nelson wrote:
>

> Scriptics has an advantage in that the "processor" is in software. If you want
> to know what bytecodes or sequences of bytecodes are used most in the real
> world, put some profiling tools in a custom interpreter and let it loose on the
> world. I volunteer to "set tcl_countExec 1" (or whatever) and run my
> applications through it and send you the results.
>

That is a good idea. Where are the profiling tools? Is it possible to
write them in pure Tcl?
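
Something crude can be done in pure Tcl by renaming a proc and putting a
counting wrapper in its place. A rough sketch -- the "profwrap" helper
below is made up, and it only copes with simple argument lists:

proc profwrap {name} {
    # Move the real proc aside and install a counting shim.
    rename $name __prof_$name
    proc $name args [format {
        global profCount
        if {![info exists profCount(%1$s)]} {
            set profCount(%1$s) 0
        }
        incr profCount(%1$s)
        return [uplevel 1 __prof_%1$s $args]
    } $name]
}

proc square {x} { return [expr {$x * $x}] }
profwrap square
square 3; square 4
puts "square called $profCount(square) times"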



> Chris
> --
> Rens-se-LEER is a county. RENS-se-ler is a city. R-P-I is a school!

--

Eugene Lee

unread,
May 11, 1999, 3:00:00 AM5/11/99
to
Donal K. Fellows <fell...@cs.man.ac.uk> wrote:
>
>Just how big is the chinese character set anyway? (In terms of
>defined glyphs, not how many bits the charset is encoded within.)
>Does Unicode cover them all?

If I recall from one of my readings, there are over 50,000 unique
Chinese characters (i.e. glyphs). Someone native correct me if I'm
wrong, but I think modern Chinese uses maybe 12,000 characters, and
only 4,000 to 8,000 must be known by the average person to be
considered fluent. Still, Unicode does not cover all of those
characters, nor does it cover the modern subset. It's no wonder
there are great misgivings about it in Asian computer circles and
industries.

--
Eugene Lee
eug...@neosoft.com

Bob Techentin

unread,
May 11, 1999, 3:00:00 AM5/11/99
to
Christopher Nelson wrote:
>
> Scriptics has an advantage in that the "processor" is in software. If you want
> to know what bytecodes or sequences of bytecodes are used most in the real
> world, put some profiling tools in a custom interpreter and let it loose on the
> world. I volunteer to "set tcl_countExec 1" (or whatever) and run my
> applications through it and send you the results.
>

You can get that from a standard tclsh, if you're willing to do a little
text processing on the byte code trace log. From the Tcl performance wiki
page at http://216.71.55.6/cgi-bin/wikit/TclPerformance :

You can find out just what the byte-code compiler is doing by activating
internal tracing. I didn't see this documented anywhere other than in the
source code file tclExecute.c, but you can set the Tcl variable
tcl_traceExec to one of the following values, and internal trace functions
will generate messages on stdout.

0: no execution tracing
1: trace invocations of Tcl procs only
2: trace invocations of all (not compiled away) commands
3: display each instruction executed
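
For example (a sketch; the trace output itself is whatever tclExecute.c
prints to stdout):

# Trace all commands that are not compiled away while a small script
# runs, then switch tracing back off.
set tcl_traceExec 2
proc greet {who} { return "hello, $who" }
greet world
set tcl_traceExec 0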

--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html

Scott Stanton

unread,
May 11, 1999, 3:00:00 AM5/11/99
to Bruce Stephens
Bruce Stephens wrote:

> I was referring specifically to the string changes. There *must* have
> been discussion of the impact of changing to UTF-8. Quite possibly it
> happened in full view on comp.lang.tcl---I don't remember, and I'm
> offline at the moment so I can't check.
>
> Presumably alternative approaches were considered and rejected. If
> this discussion is archived somewhere at Scriptics, maybe you could
> stick it on a web page somewhere, and everyone could see the
> engineering considerations? Presumably it happened at Sun, though, so
> perhaps it's not available. (I'm sure there are good reasons for
> things being as they are---I can think of some.)

All of this happened at Sun, so there is no permanent record of the discussion
that I know of. However, I can summarize the main points.

As I recall it, there were two main concerns with using double-byte
characters: space and incompatibility with stdlib routines. The biggest
advantage of UTF-8 is that ASCII characters are parseable directly from a UTF
string. This makes it very easy to parse Tcl code and path separators without
having to make any changes to the C code. If a byte looks like an ASCII
character, it is guaranteed to be an ASCII character. In practice, you can
scan for characters like "{" or "/" without having to think about multi-byte
character issues. Standard routines like strchr work just fine. This property
significantly simplifies a lot of code.

Another nice property of UTF-8 is that it can cleanly encode binary data as a
normal null-terminated string. This neatly sidesteps many of the binary data
issues that were present in earlier versions of Tcl.

Unfortunately the trade-off is that string indexing becomes O(n) instead of
O(1). The good news is that indexing by character isn't all that common in the
code we've looked at (e.g. exmh). The bad news is that when it hurts, it tends
to hurt a lot. I think there are a number of optimizations we may consider
making to ameliorate the worst cases, but it's going to be an incremental
process as we figure out where the hot spots really are.

> An obvious pretty big Tcl/Tk application is exmh. Maybe you could do
> some kind of tests with that, or with bits of it? (Or tclhttpd?
> Maybe that's dominated by I/O, though?)

We have been doing some tests with exmh, and it is clear there are still some
significant slowdowns with 8.1. I suspect the most serious offender at this
point is the regexp caching behavior. Once we've fixed that, it will be
interesting to see how exmh performs.


> Here's an outline for improving the ASCII (7-bit) string handling:
> store a flag somewhere for each string which says whether it contains
> multibyte characters, and use and update this appropriately in all
> operations which create or alter strings. Updating it ought to be
> reasonably cheap, I think. (Storing the length of the string in
> characters and bytes would provide the same opportunities for
> optimisation, but I'm guessing it would be more expensive to keep up
> to date; I could be wrong, however.)

There is already an expandable string object type that could be extended to
include a character count in addition to the buffer size that it currently
includes. Increasing the base size of a Tcl_Obj would be very costly in terms
of the amount of storage used, so I'd be reluctant to make every object pay the
cost. Keeping the size around will only help with "string length", but not
"string index".

Another possible change I've considered is adding a UnicodeString object type
that keeps the object in double-byte form similar to the way ByteArray keeps
the data in a single-byte form. This would make both indexing and length
computations fast, but would potentially double the storage cost.

> I'm guessing that there are two distinct uses of strings: one is
> predominantly 7-bit ASCII (variable names, command names, widget
> names, command options, etc.), and then there's the more obvious uses
> where the strings are data (and may well be multibyte).

I think you are probably right. However, most of the former type will never
have any indexing or length computations done on them, so the current
representation is probably best. For data strings, the best form will be usage
dependent.

> Hmm. OK, my suggestion is probably highly ASCII-centric. Maybe it's
> not worth doing---maybe you just need to improve the UTF-8 handling,
> or adopt a fixed width character representation---perhaps switched, so
> some strings would be 1-byte, some 2, and some more (and lose easy
> compatibility with the old C API, which is presumably partly what made
> UTF-8 look attractive---you could always map to and from UTF-8,
> though, so it's surely not too bad).

Tcl's current string usage is already pretty ASCII-centric, and the UTF
routines are actually pretty fast. I think our efforts are best spent looking
for ways to change algorithms to avoid recomputing information.

Scott Stanton

unread,
May 11, 1999, 3:00:00 AM5/11/99
to
lvi...@cas.org wrote:

> It sure would be nice if there were an optimisation that if only one interpreter
> were being used, the overhead would not be invoked...

Actually, I think most of the cost comes from the fact that other parts of the
application can change the C environ array behind Tcl's back. In order to
notice those changes, we have to look at the environ array on each read
access. Similarly we need to write through to the C environ array every time,
so other parts of the app can see the changes. Most of the time is going into
those searches/modifications of the environ array. This could probably be
optimized, but I'm not sure it's worth the effort.

I think it's more appropriate to only modify the env array if you really are
trying to do something that should be visible to other libraries in the process
or subprocesses. Otherwise, it will always be faster to use a normal Tcl
array.

Eugene Lee

unread,
May 11, 1999, 3:00:00 AM5/11/99
to
Scott Stanton <scott....@Scriptics.com> wrote:

>Bruce Stephens wrote:
>>
>> Here's an outline for improving the ASCII (7-bit) string handling:
>> store a flag somewhere for each string which says whether it contains
>> multibyte characters, and use and update this appropriately in all
>> operations which create or alter strings. Updating it ought to be
>> reasonably cheap, I think. (Storing the length of the string in
>> characters and bytes would provide the same opportunities for
>> optimisation, but I'm guessing it would be more expensive to keep up
>> to date; I could be wrong, however.)

[...]

>Another possible change I've considered is adding a UnicodeString object type
>that keeps the object in double-byte form similar to the way ByteArray keeps
>the data in a single-byte form. This would make both indexing and length
>computations fast, but would potentially double the storage cost.

If the programmer knew that the strings being checked were definitely
ASCII or ISO-8859, it would make sense to let the programmer choose a
more efficient data type. How about adding a new option to 'string',
e.g. 'string unicode stringarg', that explicitly uses this theoretical
UnicodeString object? Then for code like:

set len [string length [string unicode $foo]]

would it be trivial to switch between the normal algorithms and the
double-byte algorithms?

I understand that Tcl is a very weakly-typed language, and much of its
flexibility comes from this characteristic. So would it be heresy to
suggest letting programmers choose data types for storage, or at least
an interface to provide "hints" to the run-time system that some values
are boolean, integer, double, list, or string?

--
Eugene Lee
Systems Administrator
NeoSoft R&D Department
eug...@neosoft.com

John McLaughlin

unread,
May 11, 1999, 3:00:00 AM5/11/99
to
lvi...@cas.org wrote:

> According to John McLaughlin <joh...@mailexcite.com>:
> :This is an interesting (and somewhat sore) point. The application I use
> :Tcl in is
> :extremely performance sensitive (however I think my application might be
>
> Some of the earliest writings from John Ousterhout that I remember discussed
> performance and how tcl scripting was never intended to be fast - but fast
> enough to get a job done. If "faster" was needed, then improvements to
> the algorithm, or specialized Tcl commands, or finally rewriting the code
> in some lower level language would be options for the programmers.
>

I agree that Tcl doesn't need to be known necessarily for speed. In our application
it started out 'fast enough' and our needs have changed so now it is often 'not
quite fast enough'. This is no failing of Tcl but more a reflection of our needs
changing.

>
> :Although it's fine to suggest *changes* in scripts to speed them up (and we did
>
> Actually, what Scott suggested was not 'changes to speed things up' but
> ....
>
> :with arrays & lists in the 7.6 -> 8.0 transition) In general it's not
> :'ok' for us
> :to tell our customers to rewrite scripts to get the speed back up to where it
> :was, so for now I think 8.1 is out of the picture (and yes the
>
> Seems like a management choice that may have unexpected ramifications ("don't
> tell the customers how their code is breaking Tcl fundamental principles...").

I don't think it's quite this simple (assuming I understand correctly what you are
saying here).

In the 7.6 -> 8.0 transition, arrays basically stayed the same speed (or got perhaps a
bit faster because of the byte-code compiler), but *lists* got much faster, making
them preferable. So it was fine to suggest/recommend/encourage customers to rewrite
scripts to use lists so they would go faster; if they chose *not to*, their
scripts ran at the same speed as in the previous version.

With the 8.0 -> 8.1 transition you see a significant slowdown with some scripts
(many? I wonder what percentage of scripts go slower in 8.1 vs. 8.0). True, by
rewriting them you can (perhaps in some (most?) cases) get the speed close to what
you had previously, but that is much more painful (not to mention a difficult 'sell').

In terms of "not telling the customer how code is breaking tcl fundamental
principals" I don't think there is any question that you would want to share
information on performance as widely as possible but to go back and say "with the
new version of tcl you need to re do the scripts to get close the old performance
because the rules have changed " is an extremely difficult message to deliver
(especially if the reason is for a new feature that they don't see value in)


>
>
> :Originally we choose Tcl because:
> : (many reasons why I like tcl)
>
> :Since that time we have pushed Tcl into some pretty speed sensitive areas of our
> :application, 8.0 was a nice boost but 8.1 seems (so far) to be heading in the
> :wrong direction *for us*. Don't get me wrong, I think Scriptics is
>
> I am uncertain why readers of this newsgroup expect a brand new release of Tcl
> - which John has already stated should have been called Tcl 9.0 because of the
> all pervasiveness of the changes - will have all the performance tweaks in
> it that are possible? It's certainly not the way most products work...

I think the reason that readers of the newsgroups expect all of the performance
tweaks (and indeed expect it to be no slower than the previous version) is because:

a) To my knowledge this has never happened before (a new version getting
slower). In general the speed is either identical (most of the 7's) or *much*
faster (8.0). (Another good question.. Has there ever been a version of Tcl that
was slower than the previous release?)

b) For *this type of product* the normal expectation is that performance, in
general, improves over time. If you upgraded from gcc 2.7 to gcc 2.8 and your
compiled code got slower there would be a similar riot (ok, perhaps not a fair
analogy).

c) It has not been discussed (to my knowledge) before (I did a Dejanews
^h^h^h^h^h^h^h deja.com search and I couldn't find any discussion on this issue).

I think the key word in all of the comments is 'expect', there are certainly no
guarantees or promises but there were _expectations_ that it would be >= 8.0 speed
and those expectations were not managed (to the best of my knowledge) -- something
along the lines of 'Tcl 8.1 is under some situations slower than Tcl 8.0 so if
performance is critical stick with Tcl 8.0' would have helped manage expectations.
(but of course that would have launched a similar series of discussions..)

>
>
> (lots of discussion about how the community can help deleted)


>
> Can anyone at Scriptics give us a comparative picture of paying
> customers' requests? Do twice as many customers paying for support request
> better performance, or new features? Something along those lines.

Good question.

>
> I don't know if returning to the 'glory' days of Tcl 6 or early 7 is possible.
> But perhaps without corrupting the Tcl internal design some portion of the
> performance for scripting can be regained.
>
>

I hope so. I think it's in the best interest of Tcl, Scriptics, and the community to
have the most recently released version be the preferred version. Up until 8.1
that was true, but now if you want speed you stick with 8.0; if you want the new
regexp, threads, or Unicode you go with 8.1. This strikes me as not being healthy
for Tcl in the long term.

Is there a roadmap for Tcl performance? Looking at the Scriptics web page
(http://www.scriptics.com/tclpro/roadmap.html) there is no discussion of improved
performance. I realize Scriptics is not an endless fountain of resources, but
perhaps they could help co-ordinate the activity (assuming this is a big deal; it
certainly is for me, but I may be a 3-sigma type on this issue).

Tcl has long had a bad rap for being slow; 8.0 improved this dramatically (although
that fact is often ignored: do a search on "tcl vs perl" and virtually all
comparisons beat it up for speed reasons, and of course they compare against 7.4 or
some such). As a long-time user (and defender) of Tcl, this just makes it that much
more difficult (of course we can be smug about Unicode and threads, if only I had a
reason to use them...).


-John
joh...@mailexcite.com

Dave Warner

unread,
May 12, 1999, 3:00:00 AM5/12/99
to
To clarify, I hear you suggesting that read-only access to the environment
(as is necessary in CGI scripts, for example) would best be done with
something like
array set localEnv [array get ::env]
and subsequently referencing localEnv. True?

Also, for Tcl to ever be a widely accepted candidate for CGI scripting
applications, performance is critical. Please don't ignore the importance
of this.


--
Dave Warner
Lucent Technologies, Inc.
+1-303-538-1748

Donal K. Fellows

unread,
May 12, 1999, 3:00:00 AM5/12/99
to
In article <373899FE...@mayo.edu>,

Bob Techentin <techenti...@mayo.edu> wrote:
> You can find out just what the byte-code compiler is doing by
> activating internal tracing. I didn't see this documented anywhere
> other than in the source code file tclExecute.c, but you can set the
> Tcl variable tcl_traceExec to one of the following values, and
> internal trace functions will generate messages on stdout.

It's documented in tclvars(n) near the bottom of the manpage.

Peter.DeRijk

unread,
May 12, 1999, 3:00:00 AM5/12/99
to
Scott Stanton (scott....@Scriptics.com) wrote:
: Unfortunately the trade-off is that string indexing becomes O(n) instead of
: O(1). The good news is that indexing by character isn't all that common in the
: code we've looked at (e.g. exmh). The bad news is that when it hurts, it tends
: to hurt a lot. I think there are a number of optimizations we may consider
: making to ameliorate the worst cases, but it's going to be an incremental
: process as we figure out where the hot spots really are.

As I work in molecular biology, string indexing is extremely common in
my code, and usually on very long strings. I am sure there are many other
real world uses that do rely on string processing: I have done a small
check, and the slowdown in Tcl8.1 is so bad that it is completely
unacceptable. I am currently staying at 8.0, and if this is not expected
to improve, I will have to look at other options ...

tcl8.0:
% time {for {set i 1} {$i < 10000} {incr i} {append seq A}}
413689 microseconds per iteration
% time {string range $seq 9000 9010} 100
35 microseconds per iteration

tcl8.1:
% time {for {set i 1} {$i < 10000} {incr i} {append seq A}}
616221 microseconds per iteration
% time {string range $seq 9000 9010} 100
10475 microseconds per iteration

: There is already an expandable string object type that could be extended to
: include a character count in addition to the buffer size that it currently
: includes. Increasing the base size of a Tcl_Obj would be very costly in terms
: of the amount of storage used, so I'd be reluctant to make every object pay the
: cost. Keeping the size around will only help with "string length", but not
: "string index".

It would help with index and range as well: if the character count and the byte
size are the same, there are no multi-byte characters, and plain and fast
indexing can be used.

: Another possible change I've considered is adding a UnicodeString object type
: that keeps the object in double-byte form similar to the way ByteArray keeps
: the data in a single-byte form. This would make both indexing and length
: computations fast, but would potentially double the storage cost.

Another possibility (though it would take a lot of work) would be to
keep the String object the way it was in Tcl 8.0, and
make a new UnicodeString object type that works the way Strings work in
Tcl 8.1. If Unicode characters appear, the String object is converted
to the UnicodeString object; otherwise, the nice and fast old routines
can be used.

: Tcl's current string usage is already pretty ASCII-centric, and the UTF
: routines are actually pretty fast. I think our efforts are best spent looking
: for ways to change algorithms to avoid recomputing information.

I would not call a 300-fold slowdown on a common operation pretty fast.
Especially since I do not need the Unicode ...

--
Peter De Rijk der...@uia.ua.ac.be

<a href="http://rrna.uia.ac.be/~peter/">Peter</a>

Scott Stanton

unread,
May 12, 1999, 3:00:00 AM5/12/99
to
Peter.DeRijk wrote:
> As I work in molecular biology, string indexing is extremely common in
> my code, and usually on very long strings. I am sure there are many other
> real world uses that do rely on string processing: I have done a small
> check, and the slowdown in Tcl8.1 is so bad that it is completely
> unacceptable. I am currently staying at 8.0, and if this is not expected
> to improve, I will have to look at other options ...

For now, you can convert the string to a list using "split $string {}" and then
use "lindex". This is more memory intensive, but should give you linear
indexing behavior. We definitely recognize that this is a problem and we're
trying to figure out the best solution. My comment above was merely that most
applications don't do a lot of string indexing. Clearly the 8.1 behavior is
bad for those applications like yours that use indexing intensively.
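
In other words, something like this (a sketch mirroring the timings
posted above):

# Convert once, index many times: [split $seq {}] builds a list with
# one character per element, and lindex on that list does not have to
# re-scan UTF-8 bytes.
set seq ""
for {set i 0} {$i < 10000} {incr i} {append seq A}
set chars [split $seq {}]
puts [time {lindex $chars 9000} 100]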

In any case, rest assured that I take this problem seriously and am working on
a solution. Hopefully I'll have a fix soon that I can put into a patch
release.

> It would help with index and range as well: if utf size and byte size is
> the same, there are no utf characters, and plain and fast indexing
> can be used.

That's a good point, and I'll definitely look for ways to take advantage of
that fact in the code.

> Another possibility (that would take a lot of work though) would be to
> keep the String object the way it was in Tcl 8.0, and
> making a new UnicodeString object type that works the way Strings work in
> Tcl 8.1. If unicode characters appear, the String object is converted
> to the UnicodeString object, otherwise, the nice and fast old routines
> can be used.

We can't really keep the String object the way it was in 8.0 because the data
is fundamentally stored in UTF-8 representation now. To do anything else will
involve new object types, either to store double-byte characters, or to keep
more information about the multi-byte characters.

> : Tcl's current string usage is already pretty ASCII-centric, and the UTF
> : routines are actually pretty fast. I think our efforts are best spent looking
> : for ways to change algorithms to avoid recomputing information.
>

> I would not call a 300 fold slowdown on a common operation pretty fast.
> Especially since I do not need the Unicode ...

While the slowdown is caused by switching to UTF representation, it doesn't
have anything to do with the speed of the Utf routines. The problem is an O(n)
instead of an O(1) algorithm.

Scott Stanton

unread,
May 12, 1999, 3:00:00 AM5/12/99
to
John McLaughlin wrote:

> I think the key word in all of the comments is 'expect', there are certainly no
> guarantees or promises but there were _expectations_ that it would be >= 8.0 speed
> and those expectations were not managed (to the best of my knowledge) -- something
> along the lines of 'Tcl 8.1 is under some situations slower than Tcl 8.0 so if
> performance is critical stick with Tcl 8.0' would have helped manage expectations.
> (but of course that would have launched a similar series of discussions..)

To be fair, some things in 8.1 are faster than 8.0 and many things are the
same. For example, execution of global code is faster in 8.1 because we skip
compilation in cases where the code will only be invoked once. 8.1 is also
more aggressive about sharing literals to help reduce memory overhead and reduce
the number of times that the same conversion has to be performed. It's not a
strict rule that 8.1 is slower than 8.0. The new regexp code is quite a bit
faster than 8.0 for many types of pattern, not to mention being *much* more
powerful. Unfortunately the caching problems are making these improvements
hard to see in many cases.

Given the huge amount of change involved in internationalization, replacing the
regexp engine and making Tcl thread-safe, I think it's fair to expect there to
be a shakedown period where we identify the new bottlenecks and find
solutions. We aren't gifted with perfect knowledge of how people use Tcl and
where all the problems will be, so it takes time to identify and fix things.
The first step is to get things working, then tune performance. Now that most
things are working, it's time to start tuning.

> I hope so. I think it's in the best interest of Tcl, Scriptics, and the community to
> have the most recently released version be the preferred version. Up until 8.1
> that was true, but now if you want speed you stick with 8.0; if you want the new
> regexp, threads, or Unicode you go with 8.1. This strikes me as not being healthy
> for Tcl in the long term.

Agreed. And since Scriptics uses Tcl extensively, we'll be just as motivated
as you to smooth out the rough edges. We will definitely be working on fixes
to the problems as they are identified. Currently on our list to fix are
string indexing and regexp caching. What everyone here can do to help is to
continue to identify and report bottlenecks that seem like good candidates for
optimization. This will both help to identify problems we may not be aware of
and it will help to prioritize the problems that we do know about.

Scott Stanton

unread,
May 12, 1999, 3:00:00 AM5/12/99
to
Dave Warner wrote:
>
> To clarify, I hear you suggesting that read only access to the environment
> (as is necessary in CGI scripts, for example) would best be done with
> something like
> array set localEnv [array get ::env]
> and subsequently referencing localEnv. True?

That probably depends on how often you expect to access the values. If you are
checking values at startup, it's reasonable to just access the array directly.
But what you suggest will improve performance if you repeatedly access a value
in a loop. An alternative implementation is to copy the array value into a
local variable before you enter the loop. This will improve performance even
for normal arrays, since you'll avoid a hash table lookup.
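
Concretely, the two patterns look like this (a sketch; HOME is just an
example variable):

# One-time snapshot of the environment into a plain Tcl array:
array set localEnv [array get ::env]

# Hoisting a single value out of a loop avoids both the env
# read-through and the per-iteration hash table lookup:
set home $::env(HOME)
for {set i 0} {$i < 1000} {incr i} {
    # ... use $home here instead of $::env(HOME) ...
}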

Bruce Stephens

unread,
May 12, 1999, 3:00:00 AM5/12/99
to
Scott Stanton <scott....@Scriptics.com> writes:

> Peter.DeRijk wrote:
> > As I work in molecular biology, string indexing is extremely common in
> > my code, and usually on very long strings. I am sure there are many other
> > real world uses that do rely on string processing: I have done a small
> > check, and the slowdown in Tcl8.1 is so bad that it is completely
> > unacceptable. I am currently staying at 8.0, and if this is not expected
> > to improve, I will have to look at other options ...
>
> For now, you can convert the string to a list using "split $string
> {}" and then use "lindex". This is more memory intensive, but should
> give you linear indexing behavior. We definitely recognize that
> this is a problem and we're trying to figure out the best solution.
> My comment above was merely that most applications don't do a lot of
> string indexing. Clearly the 8.1 behavior is bad for those
> applications like yours that use indexing intensively.

Not linear, constant!

In any case, overall I think your comments clear things up. Strings
are mostly still OK in 8.1. For some things, turning them into lists
will be worthwhile, and sometime in the future, there'll be a
Unicode-string object type, which will be a bit bigger but whose operations
will be constant again.

There's probably still some optimisation to be done on the existing
implementation, but the plan sounds OK to me.

Chang LI

unread,
May 13, 1999, 3:00:00 AM5/13/99
to
Regexp is a kernel command which I used in the previous test program.
The following program tests regexp:

set u {%[0-9A-Fa-f][0-9A-Fa-f]}
set v abcdefghijklmnopqrstuvwxyz
puts [time {regexp $u $v m} 1000]

The result is

8.1 (usec) 8.0 (usec) 8.1/8.0
---------------------------------------
110 50 2.2

It shows that regexp in 8.1 is roughly twice as slow as 8.0 for
this particular expression. Optimizing the regexp engine
may greatly improve the performance.

Andreas Kupries

unread,
May 14, 1999, 3:00:00 AM5/14/99
to

Scott Stanton <scott....@Scriptics.com> writes:


> John McLaughlin wrote:

>> I think the key word in all of the comments is 'expect', there are
>> certainly no guarantees or promises but there were _expectations_
>> that it would be >= 8.0 speed and those expectations were not
>> managed (to the best of my knowledge) -- something along the lines
>> of 'Tcl 8.1 is under some situations slower than Tcl 8.0 so if
>> performance is critical stick with Tcl 8.0' would have helped
>> manage expectations. (but of course that would have launched a
>> similar series of discussions..)

> To be fair, some things in 8.1 are faster than 8.0 and many things
> are the same. For example, execution of global code is faster in
> 8.1 because we skip compilation in cases where the code will only be
> invoked once.

I believe that this is only true if the global code does not contain
(nested) loops. For the top-level parser the loop body is a single
word which wi