Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Speeding up list operations?

8 views
Skip to first unread message

jose-v

unread,
Oct 22, 2008, 5:50:05 AM10/22/08
to
Hi all,

I faced a striking result today: I've found an amazing way to speed up
list appending.

I have a procedure that gathers a lot of data in a list to analyze it
later in other procedures, as an intermediate storage stage:

proc storeData { data v s } {
variable priv
lappend priv(allData,$s,$v) $data
}

Each $data is a long string (about 1000 characters), and I can have
thousands of them (let's say 2000). So at the end the list
priv(allData,$s,$v) is quite a long one.

'priv' is an array that contains all kind of information about this
module, including this list to store the objects data. (In this case
you can consider 'v' and 's' to be both = 0).

Because I have a lot of data, just storing it takes some time, so I
added today one debugging line to count the number of objects added to
the list and possibly generate some feedback elsewhere:

proc storeData { data v s } {
variable priv
lappend priv(allData,$s,$v) $data
incr priv(storedCnt,$s,$v)
}

where priv(storedCnt,$s,$v) was initialized to zero before the first
call to this procedure.

Well: just by doing this the whole process became faster in a factor
of about 3 !!!

I cannot explain or understand it, so maybe you can illuminate me. Why
touching one element of the same 'priv' array will accelerate the
[lappend] in the long element?

I have another way to accelerate the thing even more (another factor
of ~2, making the thing about 6 times faster than originally):

proc storeData { data v s } {
variable priv
puts [ time [ list lappend priv(allData,$s,$v) $data ] ]
}

Yes! Even printing stuff to stdout is not slower: the time command
accelerates the whole thing so much that I can afford it. (It is
[time] who does it, not [list], I've checked it).

So my initial assumption (make this storage procedure short and
minimalistic, do not generate feedback, to make it faster) is
completely wrong: things are faster if I manage to find out the
correct feedback methods!

Do you understand this? Learning what slows down my initial procedure
(so simple!) can help me in programming better code!

This is tcl 8.5.4, compiled without multithreading, running under
MacOS X 10.5:
os: Darwin, platform: unix, machine: i386, osVersion: 9.5.0, wordSize:
8, byteOrder: littleEndian

Donal K. Fellows

unread,
Oct 22, 2008, 6:21:23 AM10/22/08
to
jose-v wrote:
> I faced a striking result today: I've found an amazing way to speed up
> list appending.
[...]

> Do you understand this? Learning what slows down my initial procedure
> (so simple!) can help me in programming better code!

I'm suspicious. If you don't need [storeData] to return anything, try
putting a [return] (with no arguments) at the end of it.

Donal.

di L

unread,
Oct 22, 2008, 6:24:59 AM10/22/08
to
> I'm suspicious. If you don't need [storeData] to return anything, try
> putting a [return] (with no arguments) at the end of it.

I see! The procedure was originally returning the whole list...

It makes a lot of sense, thanks a lot. I will test your suggestion,
but it looks terribly obvious now.

(Still, why does [time] accelerate the procedure even more?)

di L

unread,
Oct 22, 2008, 6:30:34 AM10/22/08
to
> (Still, why does [time] accelerate the procedure even more?)

Well, it is probably not [time], but the fact that I don't touch
'priv' anymore in this alternative.

Thanks, eyeopener!

j.

Alexandre Ferrieux

unread,
Oct 22, 2008, 8:09:32 AM10/22/08
to
On Oct 22, 12:21 pm, "Donal K. Fellows" <donal.k.fell...@man.ac.uk>
wrote:

Yes, this will decr the refcount earlier. But why does it matter so
much for speed ? The dismantling of the list is still outside the proc
(and the timing) since it is stored in a namespace var... What am I
missing ?

-Alex

Ralf Fassel

unread,
Oct 22, 2008, 8:12:14 AM10/22/08
to
* di L <jos...@gmail.com>

| Well, it is probably not [time], but the fact that I don't touch
| 'priv' anymore in this alternative.

Most probably the fact that you dont return the list when usingh
'time'. If you have more than one data element to append at a time,
then instead of repeated calls to storeData it could be useful to
provide a list-form of the proc, something like:

proc storeDataList { datalist v s } {
variable priv
eval lappend [list priv(allData,$s,$v)] $datalist
# tcl 8.5: lappend priv(allData,$s,$v) {*}$datalist
# probably also: incr somecount [llength $datalist]
return
}

HTH
R'

Donal K. Fellows

unread,
Oct 22, 2008, 8:28:50 AM10/22/08
to
On 22 Oct, 11:24, di L <jose...@gmail.com> wrote:
> > I'm suspicious. If you don't need [storeData] to return anything, try
> > putting a [return] (with no arguments) at the end of it.
>
> I see! The procedure was originally returning the whole list...

Yes, and the list was (probably) then being manipulated into printable
form, which is where the real cost came from.

> (Still, why does [time] accelerate the procedure even more?)

At a guess, compilation effects.

Donal.

0 new messages