speed

John Kelly

Sep 10, 2007, 1:45:21 PM
I'm using a /usr/local, compiled from source, Tcl 8.4.15 on linux.

Comparing execution speed, I found Perl to be 10 times faster than
Tcl. I don't know much about Tcl yet, I'm just getting started with
it.

Is there a way to make Tcl go faster?


Tcl:

>package require Tcl 8.4
>
>set tx 1
>set limit 10000000
>
>while {$tx < $limit} {
> set tx [expr {$tx + 1}]
>}
>
>puts "tx = $tx"


>tx = 10000000
>
>real 0m56.871s
>user 0m56.858s
>sys 0m0.009s

Perl:

>use strict;
>use warnings;
>use IO::Handle;
>use integer;
>
>STDOUT -> autoflush (1);
>STDERR -> autoflush (1);
>
>my ($tx, $limit);
>
>$tx = 1;
>$limit = 10000000;
>
>while ($tx < $limit) {
> $tx = $tx + 1;
>}
>
>print "tx = $tx\n";
>
>1;


>tx = 10000000
>
>real 0m5.408s
>user 0m5.401s
>sys 0m0.006s

--
Internet service
http://www.isp2dial.com/

Michael Schlenker

Sep 10, 2007, 1:59:52 PM
John Kelly schrieb:

> I'm using a /usr/local, compiled from source, Tcl 8.4.15 on linux.
>
> Comparing execution speed, I found Perl to be 10 times faster than
> Tcl. I don't know much about Tcl yet, I'm just getting started with
> it.
>
> Is there a way to make Tcl go faster?
Use the tricks of your platform.

>
>
> Tcl:
>
>> package require Tcl 8.4
>>
>> set tx 1
>> set limit 10000000
>>
>> while {$tx < $limit} {
>> set tx [expr {$tx + 1}]
>> }
>>
>> puts "tx = $tx"
>

This is more idiomatic (and probably faster) as:

package require Tcl 8.4

proc mainloop {tx limit} {
    while {$tx < $limit} {
        incr tx
    }
    puts $tx
}
mainloop 1 1000000

or maybe even:

proc mainloop {tx limit} {
    for {} {$tx < $limit} {incr tx} {}
    puts $tx
}
mainloop 1 1000000

There is a lot of advice about performance on the Tcler's Wiki
(http://wiki.tcl.tk/348, for example),
and with microbenchmarks like this one you basically measure nothing useful.

Your code uses two inefficient things:
1. no procs, means no bytecompiling (a little bit simplified)
2. set tx [expr {$tx +1}] instead of {incr tx}

For some slightly better examples look at:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=tcl&lang2=perl

Yes, Perl might be faster in many things, but speed isn't usually the
issue when you use a scripting language. The interesting stuff with Tcl
isn't usually performance; it's more things like cross-platform
portability, basically sane Unicode/encoding support, and some other nice
things like high flexibility for writing DSLs.

Michael

Bryan Oakley

Sep 10, 2007, 1:59:29 PM
John Kelly wrote:
> I'm using a /usr/local, compiled from source, Tcl 8.4.15 on linux.
>
> Comparing execution speed, I found Perl to be 10 times faster than
> Tcl. I don't know much about Tcl yet, I'm just getting started with
> it.
>
> Is there a way to make Tcl go faster?
>

When you put code inside of a proc it gets byte-compiled and is faster.
When you put code outside of a proc it doesn't get byte-compiled and it
is slower.

In the case of your code, I get maybe 14 seconds of real clock time as
you coded it. Placing the code inside a proc drops that down to less
than 3 seconds.
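
A rough way to see the difference yourself is to time the same loop both ways with the built-in [time] command. This is only a sketch: the names globalScript and countInProc are made up for the example, the limit is smaller than in the original test so it finishes quickly, and [time] may not evaluate its script exactly the way a sourced file is evaluated, so treat the numbers as a rough guide.

```tcl
# Smaller limit than the original benchmark, so the comparison runs quickly.
set limit 1000000

# The original loop, kept as a script to be run outside any proc.
# (globalScript is a hypothetical name used only in this sketch.)
set globalScript {
    set tx 1
    while {$tx < $limit} {
        set tx [expr {$tx + 1}]
    }
}

# The same loop inside a proc; tx and limit are proc-local here.
# (countInProc is also a hypothetical name.)
proc countInProc {limit} {
    set tx 1
    while {$tx < $limit} {
        set tx [expr {$tx + 1}]
    }
    return $tx
}

# [time script count] reports the average microseconds per iteration.
puts "outside a proc: [time $globalScript 1]"
puts "inside a proc:  [time {countInProc $limit} 1]"
```

The absolute numbers depend on the machine and the Tcl version; only the ratio between the two lines is interesting.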

--
Bryan Oakley
http://www.tclscripting.com

John Kelly

Sep 10, 2007, 2:27:29 PM
On Mon, 10 Sep 2007 19:59:52 +0200, Michael Schlenker
<sch...@uni-oldenburg.de> wrote:

>Yes, Perl might be faster in many things, but speed isn't usually the
>issue when you use a scripting language

Right. I just want to have a rough idea of what to expect before
investing time in learning YAL.


>Your code uses two inefficient things:
>1. no procs, means no bytecompiling (a little bit simplified)

Thanks for the "proc" speed tip.

John Kelly

Sep 10, 2007, 2:27:30 PM
On Mon, 10 Sep 2007 17:59:29 GMT, Bryan Oakley
<oak...@bardo.clearlight.com> wrote:

>When you put code inside of a proc it gets byte-compiled and is faster.
>When you put code outside of a proc it doesn't get byte-compiled and it
>is slower.

OK. Thanks for that info, Bryan.

ZB

Sep 10, 2007, 3:37:03 PM
Dnia 10.09.2007 Bryan Oakley <oak...@bardo.clearlight.com> napisał/a:

> When you put code inside of a proc it gets byte-compiled and is faster.
> When you put code outside of a proc it doesn't get byte-compiled and it
> is slower.
>
> In the case of your code, I get maybe 14 seconds of real clock time as
> you coded it. Placing the code inside a proc drops that down to less
> than 3 seconds.

It's amazing - I'm getting *much, much* better acceleration:


#proc countIt {} {


set tx 1
set limit 10000000

while {$tx < $limit} {
incr tx
#}

#countIt

- with the proc commented out, like above, it takes 94 seconds
- ...but when the comments are removed - it drops to 6 (yes, six) seconds(!)
--
ZB

ZB

Sep 10, 2007, 3:43:59 PM
Dnia 10.09.2007 ZB <zbREMOVE_THIS@AND_THISispid.com.pl> napisał/a:

> }
> #}

Missed one parenthesis, when typing. ;]
--
ZB

sleb...@gmail.com

Sep 10, 2007, 9:48:46 PM

Bitmover's implementation of a tcl interpreter, L, can do byte-
compilation of code outside procs. Any reason why this hasn't been
done/considered/backported to tcl yet?

Note: Yes, L is a language, but it is also a full-fledged tcl
interpreter.

miguel

Sep 10, 2007, 10:04:54 PM

Independently of L: yes, that could be done. Easily. But code that runs
just once is faster to interpret than compile.

Note that when you have a loop outside a proc body, the loop body
actually *does* get bytecompiled - BUT all variable accesses are by
name. Within proc bodies local variables are accessed by indexing into a
table (much faster). The indexed access to variables is not
(economically) feasible outside of proc bodies.

So: even though I have not looked at L, I am not sure that compiling
everything is an overall win in terms of performance. It definitely
would make the core simpler though, and this may have been the motivation.
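
The variable-access point can be illustrated without leaving proc bodies at all. In this sketch (countLocal and countByName are hypothetical names), the first proc uses a plain local variable, while the second refers to a global through the qualified name ::tx on every access. The assumption, based on the description above, is that the qualified form is resolved by name rather than through the indexed local table; the exact cost depends on the Tcl version.

```tcl
# Uses a proc-local variable, which the compiler can place in the
# proc's indexed variable table.
proc countLocal {limit} {
    set tx 1
    while {$tx < $limit} {
        incr tx
    }
    return $tx
}

# Uses the global ::tx via its qualified name on every access, so
# (by assumption) each access is a lookup by name.
proc countByName {limit} {
    set ::tx 1
    while {$::tx < $limit} {
        incr ::tx
    }
    return $::tx
}

puts "local variable:  [time {countLocal 1000000} 1]"
puts "qualified name:  [time {countByName 1000000} 1]"
```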

Jeff Hobbs

Sep 11, 2007, 5:19:50 PM
to miguel
miguel wrote:
>>>> Is there a way to make Tcl go faster?
>>> When you put code inside of a proc it gets byte-compiled and is faster.
>>> When you put code outside of a proc it doesn't get byte-compiled and it
>>> is slower.
>>>
>>> In the case of your code, I get maybe 14 seconds of real clock time as
>>> you coded it. Placing the code inside a proc drops that down to less
>>> than 3 seconds.
>>
>> Bitmover's implementation of a tcl interpreter, L, can do byte-
>> compilation of code outside procs. Any reason why this hasn't been
>> done/considered/backported to tcl yet?
>
> Independently of L: yes, that could be done. Easily. But code that runs
> just once is faster to interpret than compile.

Not so fast with that statement without a ton of caveats ...

> Note that when you have a loop outside a proc body, the loop body
> actually *does* get bytecompiled - BUT all variable accesses are by
> name. Within proc bodies local variables are accessed by indexing into a
> table (much faster). The indexed access to variables is not
> (economically) feasible outside of proc bodies.
>
> So: even though I have not looked at L, I am not sure that compiling
> everything is an overall win in terms of performance. It definitely
> would make the core simpler though, and this may have been the motivation.

I actually did the modifications that allowed global compile. I am not
convinced of the oft-repeated but never really well analyzed statement
"code that runs just once is faster to interpret than compile". In
fact, I no longer believe it for most "standard" code.

Furthermore, Larry and I have discussed a bit what it could mean to only
have a compile side (fully parsed and bytecoded) Tcl. You could
possibly excise a large chunk of the classic Tcl codebase by moving to
that. I would like to discuss this further at the Tcl conference.

Jeff

Gerald W. Lester

Sep 11, 2007, 5:50:14 PM
Jeff Hobbs wrote:
>...

> Furthermore, Larry and I have discussed a bit what it could mean to only
> have a compile side (fully parsed and bytecoded) Tcl. You could
> possibly excise a large chunk of the classic Tcl codebase by moving to
> that. I would like to discuss this further at the Tcl conference.

Is that a BOF request or do you want a session time slot?


--
+--------------------------------+---------------------------------------+
| Gerald W. Lester |
|"The man who fights for his ideals is the man who is alive." - Cervantes|
+------------------------------------------------------------------------+

Donal K. Fellows

Sep 11, 2007, 6:36:25 PM
Gerald W. Lester wrote:
> Is that a BOF request or do you want a session time slot?

Sounds more like a Beer Request to me. :-)

Donal.

sleb...@gmail.com

Sep 11, 2007, 6:36:44 PM
On Sep 12, 5:19 am, Jeff Hobbs <je...@activestate.com> wrote:
> miguel wrote:
> >>>> Is there a way to make Tcl go faster?
> >>> When you put code inside of a proc it gets byte-compiled and is faster.
> >>> When you put code outside of a proc it doesn't get byte-compiled and it
> >>> is slower.
>
> >>> In the case of your code, I get maybe 14 seconds of real clock time as
> >>> you coded it. Placing the code inside a proc drops that down to less
> >>> than 3 seconds.
>
> >> Bitmover's implementation of a tcl interpreter, L, can do byte-
> >> compilation of code outside procs. Any reason why this hasn't been
> >> done/considered/backported to tcl yet?
>
> > Independently of L: yes, that could be done. Easily. But code that runs
> > just once is faster to interpret than compile.
>
> Not so fast with that statement without a ton of caveats ...
>
> > Note that when you have a loop outside a proc body, the loop body
> > actually *does* get bytecompiled - BUT all variable accesses are by
> > name. Within proc bodies local variables are accessed by indexing into a
> > table (much faster). The indexed access to variables is not
> > (economically) feasible outside of proc bodies.

Why is this so? Couldn't we simply treat the whole script as one big
"main" proc? Is this because we *don't* bytecompile outside of proc
bodies? Is this to simplify support for interactive mode?

> > So: even though I have not looked at L, I am not sure that compiling
> > everything is an overall win in terms of performance. It definitely
> > would make the core simpler though, and this may have been the motivation.
>
> I actually did the modifications that allowed global compile. I am not
> convinced of the oft-repeated but never really well analyzed statement
> "code that runs just once is faster to interpret than compile". In
> fact, I no longer believe it for most "standard" code.
>
> Furthermore, Larry and I have discussed a bit what it could mean to only
> have a compile side (fully parsed and bytecoded) Tcl. You could
> possibly excise a large chunk of the classic Tcl codebase by moving to
> that. I would like to discuss this further at the Tcl conference.
>

Now for the next question. Is it possible to bytecompile upleveled
code? It's a shame that one of Tcl's most interesting features, the
ability to create custom control structures, is also one that
considerably slows down your code.

miguel

Sep 11, 2007, 8:18:03 PM
sleb...@yahoo.com wrote:
> On Sep 12, 5:19 am, Jeff Hobbs <je...@activestate.com> wrote:
>> miguel wrote:
>>> Independently of L: yes, that could be done. Easily. But code that runs
>>> just once is faster to interpret than compile.
>> Not so fast with that statement without a ton of caveats ...
>>
>>> Note that when you have a loop outside a proc body, the loop body
>>> actually *does* get bytecompiled - BUT all variable accesses are by
>>> name. Within proc bodies local variables are accessed by indexing into a
>>> table (much faster). The indexed access to variables is not
>>> (economically) feasible outside of proc bodies.
>
> Why is this so? Couldn't we simply treat the whole script as one big
> "main" proc? Is this because we *don't* bytecompile outside of proc
> bodies? Is this to simplify support for interactive mode?

Not really. The reasons are:

1. COMPILATION
In order to interpret you have to
  (a) parse each command
  (b) dispatch the command
The compiled approach requires:
  (a) parse each command
  (b) compile the parsed command
  (c) dispatch each command (some but not all will be much faster to
      dispatch from compiled code)
The cost of compiling, including creating and then cleaning up the
relevant structs, overwhelms the extra cost of the slow dispatch *for
code that runs just once*. Or so I thought (Jeff may correct me here).

But please do note that loop bodies ARE bytecompiled (they are assumed
not to run just once). They pay the price of ...

2. VARIABLE ACCESS
Proc bodies run in their own environment; during compilation all local
variables are identified (close enough), and a table of variables is
created. Variables are then accessed at runtime by indexing into this
table.
The compiler needs to access each variable occurrence by name, just
as straight interpretation does. Building up the variable table
therefore already pays the slow access cost, and that cost can't be
recovered by faster access later on if the code is used just once.


> Now for the next question. Is it possible to bytecompile upleveled
> code? It's a shame that one of Tcl's most interesting features, to be
> able to create custom control structures, is also one that
> considerably slows down your code.

Currently: not directly, though there are ways to get that using 'if 1' tricks.

Going forward: I think so, yes. It will still suffer from slower
variable access than builtin code, as it needs to access a foreign
variable table too (the one in the uplevel), and/or to link the truly
local variables to the uplevel ones. You need a macro system to get the
full speed - ie, the 'uplevel' needs to occur at the time the uplevel
proc is compiled. Some preliminary thoughts on both macros and better
compilation of uplevel code exist somewhere in my mind, some of them
even found their way to experimental code somewhere in my HD.

None yet that I deemed worthy enough. The difficult thing is to ensure
that the compilation survives in cached form long enough to pay for the
cost of compilation, but not so long as to become incorrect.
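
For readers who have not seen it, the 'if 1' trick mentioned above is usually written along these lines. This is a sketch, not code from the core: repeat and demo are hypothetical names, and whether the trick actually helps depends on the Tcl version. The idea is that the body is handed to [if] in the caller's scope, and [if] evaluates its body as a script that can be byte-compiled and cached, instead of having [uplevel] evaluate the body directly.

```tcl
# Hypothetical custom control structure: run $body $count times in the
# caller's scope. Handling of break/continue/return is left out.
proc repeat {count body} {
    for {set i 0} {$i < $count} {incr i} {
        # Evaluate "if 1 $body" one level up, so that [if] (rather than
        # [uplevel] itself) is the command that evaluates the body.
        uplevel 1 [list if 1 $body]
    }
}

proc demo {} {
    set total 0
    repeat 5 {incr total 2}
    return $total
}

puts [demo]   ;# prints 10
```

The variables in the body still belong to the caller, which is the slower "foreign variable table" access described above.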

Ron Fox

Sep 12, 2007, 5:34:04 AM
Code sprint code sprint!!!

Joe English

Sep 12, 2007, 3:07:19 PM
Miguel Sofer wrote:
> [...]

>The cost of compiling, including creating and then cleaning up the
>relevant structs, overwhelms the extra cost of the slow dispatch *for
>code that runs just once*. Or so I thought (Jeff may correct me here).

And in any case, for code that runs just once it
doesn't really matter how fast it runs.

(Within reason, of course -- if parse+compile+execute was, say,
an order of magnitude slower than parse+interpret, then it
might make a difference; but Tcl's "compile" phase is more
than Fast Enough.)


--Joe English

John Kelly

Sep 12, 2007, 7:34:10 PM
BTW,

I repeated my speed test, tcl vs. perl, but this time with the loop in
a proc as suggested to improve tcl speed.

10,000,000 iterations:

perl 5.4 seconds
tcl 7.3 seconds (with my original expr)
tcl 3.6 seconds (with incr instead of expr)

Putting the tcl loop in a proc made a big difference. In the first
test, perl was 10 times faster than tcl. But now, they're in the same
ballpark.
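
For anyone who wants to repeat the expr vs. incr comparison, something like the following should do. It is a sketch, not the exact script behind the numbers above: countExpr and countIncr are names made up here, and the limit is reduced to 1,000,000 to keep the run short.

```tcl
# Loop in a proc, incrementing with expr as in the original test.
proc countExpr {limit} {
    set tx 1
    while {$tx < $limit} {
        set tx [expr {$tx + 1}]
    }
    return $tx
}

# Same loop, incrementing with incr instead.
proc countIncr {limit} {
    set tx 1
    while {$tx < $limit} {
        incr tx
    }
    return $tx
}

# [time] reports the average microseconds per iteration for each variant.
foreach cmd {countExpr countIncr} {
    puts "$cmd: [time {$cmd 1000000} 1]"
}
```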

Now I feel better about tcl. I like using the event loop with non
blocking I/O. I wonder if perl has any equivalent to the tcl event
loop.
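
For context, this is roughly what event-driven, non-blocking I/O looks like in Tcl. It is a minimal sketch: onAccept and onReadable are hypothetical handler names, and a loopback socket on 127.0.0.1 stands in for whatever channel you actually read from.

```tcl
# Called by the event loop when a client connects.
proc onAccept {chan addr port} {
    fconfigure $chan -blocking 0 -buffering line
    fileevent $chan readable [list onReadable $chan]
}

# Called by the event loop whenever $chan has data (or hits EOF).
proc onReadable {chan} {
    if {[gets $chan line] >= 0} {
        set ::received $line
        close $chan
    } elseif {[eof $chan]} {
        close $chan
    }
}

# Port 0 lets the system pick a free port; read it back with -sockname.
set server [socket -server onAccept -myaddr 127.0.0.1 0]
set port [lindex [fconfigure $server -sockname] 2]

set client [socket 127.0.0.1 $port]
fconfigure $client -buffering line
puts $client "hello from the client"

# Run the event loop until the handler has stored a line.
vwait ::received
puts "server got: $received"

close $client
close $server
```

Nothing blocks on the read side: the script sits in vwait, and the handlers run only when the channel has something to deliver.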

Cameron Laird

Sep 13, 2007, 2:52:38 PM
In article <o1tge39tv7br866vn...@4ax.com>,
John Kelly <j...@isp2dial.com> wrote:
.
.
.

>blocking I/O. I wonder if perl has any equivalent to the tcl event
>loop.
.
.
.
<URL: http://poe.perl.org >. Also <URL:
http://download.fedora.redhat.com/pub/fedora/linux/extras/6/i386/repoview/perl-Event.html >,
apparently, but I don't understand that one.