memory usage in tcl

Vincent Wartelle

May 21, 2001, 11:56:30 AM

Bob Techentin suggested that I post a summary about memory usage in Tcl.

Can you (the Tcl community) help me check this summary, to see whether it's
- true?
- worth "wikifying"?

Thanks,

Vincent.


MEMORY COSTS WITH TCL


My provisional conclusions and where they come from.


A. OBJECTIVE CONCLUSIONS

on tcl 8.3, 32 bit machines only

each tcl object =
any new string, number, date, value --> 24 bytes + content size
any pre-existing string, data, value --> 4 bytes (pointer only)

content size =
depends on encoding and data type:
one or two bytes per char for string values
may be 0 for a number (integer/double), if it is never used as a string
(the number is then stored inside the core tcl object itself)

each variable =
48 bytes + "content size" of the name + "tcl object size" of the content

each hash key entry =
48 bytes + "content size" of the key + "tcl object size" of the value

each list entry =
4 bytes + "tcl object size" of the content
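The figures above can be turned into a small estimator. This is only a sketch of the model as stated in this summary (Tcl 8.3, 32-bit); the proc names objCost and varCost are made up for illustration, and string content is simply counted as one byte per character.

```tcl
# Rough estimator for the figures in section A (Tcl 8.3, 32-bit only).
# The proc names are hypothetical; they are not part of Tcl.

# Cost of a value: 4 bytes if it already exists (pointer only),
# otherwise a 24-byte Tcl object plus its content.
proc objCost {contentBytes shared} {
    if {$shared} {
        return 4
    }
    return [expr {24 + $contentBytes}]
}

# Cost of a variable or of an array (hash) entry:
# 48 bytes, plus the name or key, plus the cost of the value.
proc varCost {nameBytes contentBytes shared} {
    return [expr {48 + $nameBytes + [objCost $contentBytes $shared]}]
}

# A 4-character key holding an already existing value.
puts "shared value:   [varCost 4 0 1] bytes"
# The same key holding a new 9-character string.
puts "unshared value: [varCost 4 9 0] bytes"
```

The estimate ignores allocator overhead and the string terminator, so measured numbers will differ by a few bytes per entry.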

B. SUBJECTIVE CONCLUSIONS

- When using TCL, don't emulate pointer mechanisms. Copy the complete data
when needed. TCL will replace redundant data by pointers.

- Each different "thing" in a tcl program will cost 24 bytes.

- Variables and hash-tables are costly:
52 bytes overhead for each variable,
52 bytes overhead for each hash table key

- Lists are not costly:
4 bytes overhead for each element.
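A minimal sketch of the first point: assigning a value to a second variable does not duplicate the data, because both variables refer to the same reference-counted object, and a private copy is made only when one of them is modified. Note that this sharing happens when a value is copied; it is an assumption of this example, not something measured above, that independently built strings with equal content are not merged.

```tcl
# Both variables refer to one shared list object after the assignment.
set original [list alpha beta gamma]
set copy $original

# Modifying the copy triggers copy-on-write; the original is untouched.
lappend copy delta

puts "original: $original"
puts "copy:     $copy"
```

So passing or storing the "complete data" is cheap as long as it is not modified.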


C. INFORMATION FROM NEWSGROUPS

1. excerpt from

http://groups.google.com/groups?hl=fr&lr=&safe=off&ic=1&th=38a358d1b5875c48,17&seekm=39509312.55CF52D1%40hursley.ibm.com#p

> On a 32 bit machine where alignment is 4 byte boundary and the types have the
> following sizes,
> long 4 bytes
> int 4 bytes
> char * 4 bytes
> double 8 bytes
> void * 4 bytes
> sizeof (Tcl_Obj) = 4 + 4 + 4 + 4 + MAX (4, 8, 4, 4 + 4)
> = 24 bytes

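The same arithmetic, written out as a short script. The field names in the comments (refCount, bytes, length, typePtr, internalRep) are taken from the Tcl_Obj declaration in tcl.h as far as I know it; the sizes are the 32-bit ones from the excerpt, and maxOf is a helper defined here only for the example.

```tcl
# Largest of the given numbers (helper for this example only).
proc maxOf {args} {
    set m [lindex $args 0]
    foreach v $args {
        if {$v > $m} {
            set m $v
        }
    }
    return $m
}

# Fixed part: refCount (int), bytes (char *), length (int), typePtr (pointer).
set header [expr {4 + 4 + 4 + 4}]

# internalRep is a union, so it is as large as its largest member:
# long, double, void *, or a pair of pointers.
set union [maxOf 4 8 4 [expr {4 + 4}]]

set tclObjSize [expr {$header + $union}]
puts "sizeof(Tcl_Obj) = $tclObjSize bytes"
```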

2. excerpt from

http://groups.google.com/groups?hl=fr&lr=&safe=off&ic=1&th=8ae402debdf9a73f,5&seekm=3B014995.9F833320%40mayo.edu#p

>> [experiment shows that...] approximately 54 bytes for each key. [...]


> Well, it takes a certain amount of space to store the hash entry (four
> words plus the size of the key; median about 20 bytes in your case on
> a 32-bit machine) and more to store the variable (each entry in an array
> is an independent variable that can support its own traces, etc.) which
> adds another 8 words or 32 bytes. This gives about 52 bytes per array
> member; pretty close to what you report...


D. MY EXPERIMENTS

tclsh 8.3.2 with TCL_MEM_DEBUG on Windows Millennium - 32 bit machine

1. hashtable with empty values

% memory info
current bytes allocated 152681
...

% for {set i 0} {$i < 10000 } { incr i } {
set t($i) ""
}
% memory info
current bytes allocated 698453
...

698453 - 152681 = 545772

approx 54 bytes per key.
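The per-key figure here and in the experiments below is computed the same way each time. A sketch of that computation, assuming the text printed by "memory info" is available as a string (the memory command only exists in builds compiled with memory debugging); the proc names currentBytes and bytesPerEntry are invented for this example.

```tcl
# Extract the "current bytes allocated" figure from memory info text.
proc currentBytes {infoText} {
    if {![regexp {current bytes allocated\s+(\d+)} $infoText -> bytes]} {
        error "no 'current bytes allocated' line found"
    }
    return $bytes
}

# Average growth per created entry between two snapshots.
proc bytesPerEntry {before after count} {
    return [expr {double($after - $before) / $count}]
}

# Figures reported in experiment 1.
set before [currentBytes "current bytes allocated 152681"]
set after  [currentBytes "current bytes allocated 698453"]
puts [format "%.1f bytes per key" [bytesPerEntry $before $after 10000]]
```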


2. hashtable with constant value

% memory info
current bytes allocated 152550
...

% for {set i 0} {$i < 10000 } { incr i } {
set t($i) "abcd"
}
% memory info
current bytes allocated 698363
...

698363 - 152550 = 545813

approx 54 bytes per key.


3. hashtable with variable value

% memory info
current bytes allocated 152550
...

% for {set i 0} {$i < 10000 } { incr i } {
set t($i) "abcd_$i"
}

% memory info
current bytes allocated 1037220
...

1037220 - 152550 = 884670

approx 89 bytes per key.

4. empty global variables


% memory info
current bytes allocated 152550
...

% for { set i 1 } { $i <= 10000 } { incr i } {
set ::a[set i] ""
}
% memory info
current bytes allocated 729761
...

729761 - 152550 = 577211

approx 57 bytes per variable

5. global variables with the same value


% memory info
...
current bytes allocated 152550

% for { set i 1 } { $i <= 10000 } { incr i } {
set ::a[set i] "abcd"
}
% memory info
...
current bytes allocated 708202

708202 - 152550 = 555652

approx. 55 bytes per variable.

6. global variables with different values

% memory info
...
current bytes allocated 152550

% for { set i 1 } { $i <= 10000 } { incr i } {
set ::a[set i] "abcd_$i"
}
% memory info
...
current bytes allocated 1047070

1047070 - 152550 = 894520

approx 89 bytes per variable.


7. empty list entries

% memory info
...
maximum bytes allocated 152550

% for {set i 1 } { $i <= 10000 } { incr i } {
lappend l ""
}
% memory info
...
current bytes allocated 202179

202179 - 152550 = 49629

approx 5 bytes per list entry.

8. identical list entries

% memory info
...
current bytes allocated 152550

% for {set i 1 } { $i <= 10000 } { incr i } {
lappend ::l "abcd"
}
% memory info
...
current bytes allocated 202215

202215 - 152550 = 49665

approx 5 bytes per list entry.


9. different list entries


% memory info
...
current bytes allocated 152550

% for {set i 1 } { $i <= 10000 } { incr i } {
lappend ::l "abcd_$i"
}
% memory info
...
current bytes allocated 541083

541083 - 152550 = 388533

approx 39 bytes per list entry.

Chang LI

May 21, 2001, 5:07:40 PM

Vincent Wartelle wrote in message <3B093AAE...@hotmail.com>...

>
>B. SUBJECTIVE CONCLUSIONS
>
>- When using TCL, don't emulate pointer mechanisms. Copy the complete
>data when needed. TCL will replace redundant data by pointers.
>
>- Each different "thing" in a tcl program will cost 24 bytes
>
>- Variables and hash-tables are costly:
> 52 bytes overhead for each variable,
> 52 bytes overhead for each hash table key
>

I have noticed a slowdown when using arrays. If setting an element of an
array means copying 52 bytes, that is really slow. Would it be possible to
add an nType item to the Tcl_Obj to speed up the processing?

Chang LI
Neatware


Jeff Hobbs

May 21, 2001, 7:51:10 PM
to Vincent Wartelle
Vincent Wartelle wrote:
> Can you (the tcl community) help me to check this summary, to see
> whether it's
> - true ?
> - worth "wikifying" ?

This is definitely worth wikifying. Some comments...

> each tcl object =
> any new string, number, date, value --> 24 bytes + content size
> any pre-existing string, data, value --> 4 bytes (pointer only)
>
> content size =
> depends on encoding and data type :
> one or two bytes per char for string values
> may be 0 for a number (integer/double), if it is never used as a
> string
> (therefore included in the core tcl object)

UTF-8 can go up to 3 bytes per char for the 2-byte unicode that
Tcl uses internally. Also, content size can be greater for
UnicodeString objects, List objects, ... that all malloc some
extra space for their internal reps.
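As a small illustration of the UTF-8 point, the byte count of a character can be checked by encoding it and measuring the result. This is a sketch only; utf8Bytes is a name chosen for the example, not a Tcl command.

```tcl
# Number of bytes a string occupies once encoded as UTF-8.
proc utf8Bytes {s} {
    return [string length [encoding convertto utf-8 $s]]
}

# One ASCII letter, one Latin-1 letter, one character above U+07FF.
foreach {label ch} [list "letter a" a "e acute (U+00E9)" \u00e9 "euro sign (U+20AC)" \u20ac] {
    puts "$label: [utf8Bytes $ch] byte(s)"
}
```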

Very nice overall review, BTW. Do note that things will change
between versions (the current CVS may already provide somewhat
different numbers to what your empirical data shows).

--
Jeff Hobbs The Tcl Guy
Senior Developer http://www.ActiveState.com/
Tcl Support and Productivity Solutions

Richard.Suchenwirth

May 22, 2001, 5:15:32 AM
> * From: Vincent Wartelle <vwar...@hotmail.com>

> Bob Techentin suggested to post a summary about memory usage in TCL.
> Can you (the tcl community) help me to check this summary, to see
> whether it's
> - true ?
> - worth "wikifying" ?
[...]

I have just put it there, see http://mini.net/cgi-bin/wikit/1617.html
Now I have to fill some lines so Mailgate does not reject this post
(it requires the message to be longer than the quotes)...
Feel free to edit and polish that page ;-)
--
Schoene Gruesse/best regards, Richard Suchenwirth - +49-7531-86 2703
Siemens Dematic AG, PA RC D2, Buecklestr.1-5, 78467 Konstanz,Germany
Personal opinions expressed only unless explicitly stated otherwise.

--
Posted from sneak.kst.siemens.de [193.98.144.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Andreas Kupries

May 22, 2001, 4:04:54 PM

Richard.S...@kst.siemens.de ("Richard.Suchenwirth") writes:

> > * From: Vincent Wartelle <vwar...@hotmail.com>
> > Bob Techentin suggested to post a summary about memory usage in TCL.
> > Can you (the tcl community) help me to check this summary, to see
> > whether it's
> > - true ?
> > - worth "wikifying" ?
> [...]
>
> I have just put it there, see http://mini.net/cgi-bin/wikit/1617.html
> Now I have to fill some lines so Mailgate does not reject this post
> (it requires the message to be longer than the quotes)...

My ISP has the same restriction. If I am unsure about the ratio I
simply indent (part of) the quoted material by a space to get around
this. That helps. See above.

--
Sincerely,
Andreas Kupries <a.ku...@westend.com>
Developer @ <http://www.activestate.com/>
Private <http://www.purl.org/NET/akupries/>
-------------------------------------------------------------------------------
