Will TCL release the variables automatically when the procedure is finished?

bingo

Sep 12, 2008, 1:33:46 AM

Hi, All:
I'm new to TCL. Can somebody tell me whether TCL releases
the variables when a procedure finishes?
I got an "OUT OF MEMORY" error while running my code, so I
thought unreleased variables might be causing it.
Thanks in advance.
Bin

miguel

Sep 12, 2008, 1:36:37 AM

Variables which are local to a proc body are freed when the proc returns, yes.
Namespace variables live forever (or until you unset them explicitly)
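
To make the difference concrete, here is a minimal sketch. The namespace `demo`, the proc `demo::step` and the variable names are made up for illustration; nothing here comes from NAMD.

```tcl
namespace eval demo {
    variable counter 0            ;# namespace variable: persists between calls
}

proc demo::step {} {
    variable counter              ;# link to the namespace variable
    set scratch [string repeat x 1000]   ;# local: released when the proc returns
    incr counter
    return [string length $scratch]
}

demo::step
demo::step
puts [info exists ::demo::counter]   ;# 1
puts $::demo::counter                ;# 2
puts [info exists scratch]           ;# 0, the local is gone after the return

unset ::demo::counter                ;# namespace variables must be unset by hand
puts [info exists ::demo::counter]   ;# 0
```

So a loop that unsets every local at the end of a proc does nothing that the return would not do anyway.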

bingo

Sep 12, 2008, 1:45:44 AM

Thanks a lot for your quick response.
I guess there might be other things causing the error.

bingo

Sep 12, 2008, 1:51:32 AM

Hi, miguel:
Can you also tell me how and when I should use the functions
"TCL_Finalize" and "TCL_Exit"?
When I tried to use them, I got "wrong # args".
Thanks a lot.
Bin

Alexandre Ferrieux

Sep 12, 2008, 2:22:20 AM

Can you describe the constraints that led you to embedding Tcl ?
Since you're just starting with the Tcl language, it might be wise to
postpone your encounter with the C interface for some time. In some
cases the need for C vanishes completely in between.

-Alex

bingo

Sep 12, 2008, 2:35:45 AM

On Sep 11, 11:22 pm, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

Hi,
I'm using a computational package called NAMD, which allows
users to add simple utilities without modifying the source code.
Unfortunately, TCL is the only language it offers for this job.

Bin

Alexandre Ferrieux

Sep 12, 2008, 2:45:54 AM

OK, thanks for the clarification. If NAMD is the one I guess
(molecular dynamics, optimized to make the silicon scream), then
indeed removing the C part is out of the question ;-)

However it might still be wise to take a bit of perspective on the
coupling between the two.
What is your Tcl part bringing there ? GUI (Tk) ? Automation ? Output
formatting ?
Depending on the answer, alternatives to embedding exist that would
largely reduce the complexity (and bug risk): (1) Tcl in an external
child process, reached through sockets or pipes; (2) Tcl as an outer
shell, spawning the C one as a child; (3) Tcl as an outer shell, using
the facilities of the C part exposed in an extension; (4) Tcl as an
outer shell, using the facilities of the C part in an unmodified DLL
wrapped by ffidl.

-Alex

Donal K. Fellows

Sep 12, 2008, 5:13:39 AM

Alexandre Ferrieux wrote:
> OK, thanks for the clarification. If NAMD is the one I guess
> (molecular dynamics, optimized to make the silicon scream), then
> indeed removing the C part is out of the question ;-)

Knowing how these sorts of apps are, it could well be a Fortran part
that is impossible to remove. :-) But that's fine; Tcl works just fine
with Fortran too (and a good Tcler always wants to use the right tool
for the job).

Donal.

bingo

Sep 12, 2008, 12:26:29 PM

On Sep 11, 11:45 pm, Alexandre Ferrieux <alexandre.ferri...@gmail.com>

You are right, NAMD is optimized C++ code, and I think the relationship
between TCL and the C code is (3), as you mentioned. The thing is, I don't
want to mess with their C source code, which is quite complicated.

Alexandre Ferrieux

Sep 12, 2008, 1:11:22 PM

On Sep 12, 6:26 pm, bingo <zhn...@gmail.com> wrote:
>
> > Depending on the answer, alternatives to embedding exist that would
> > largely reduce the complexity (and bug risk): (1) Tcl in an external
> > child process, reached through sockets or pipes; (2) Tcl as an outer
> > shell, spawning the C one as a child; (3) Tcl as an outer shell, using
> > the facilities of the C part exposed in an extension; (4) Tcl as an
> > outer shell, using the facilities of the C part in an unmodified DLL
> > wrapped by ffidl.
>

> You are right, NAMD is an optimized C++ code, and I think the relation
> of TCL and C code is (3) as you mentioned. The thing is I don't want
> to screw up with their C source code, which is quite complicate.

Do you mean "the relationship ... is" or "should be" ?
What exactly is given, what has yet to be decided by you ?
If (3) is already implemented, what's left for you to write ?
And why are you talking about Tcl_Finalize and Tcl_Exit ? These are
certainly *not* meant to be called from an extension. They are rather
the signature of "embedded Tcl", meaning the C app is the outer shell,
"hosting" at least one Tcl interpreter, and deciding when to call
*into* Tcl. I would call that solution (0), which is a deadly trap to
many a beginner, unless it is forced upon him by the situation. I am
still struggling to decipher whether it applies to you...

-Alex

bingo

Sep 12, 2008, 1:29:26 PM

On Sep 12, 10:11 am, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

Hi, Alex:
Thanks a lot for your response.
Actually, I'm not really clear about the relationship between the C
code and the TCL script. I guessed (3) because of the way it works:
(1) I can write some input files using TCL.
(2) NAMD will parse the input file and do the simulation.
(3) Another interesting thing is that you can define a "proc" using TCL
and ask NAMD to call that proc each step to get information.
Now the problem is that I get an "OUT OF MEMORY" error while running
the proc I defined in TCL (and I can see the memory increasing each
step). At first I thought the variables in the proc I defined were not
being freed. But I tried to unset them explicitly, and the problem
remained. Then I started to google, getting a little desperate, and
that's where I found the function "TCL_Finalize", which I don't
understand at all.
Any suggestions about the error will be greatly appreciated.
Thanks again.
Bin

Christian Gollwitzer

Sep 12, 2008, 1:52:41 PM

bingo schrieb:

> (1) I can write some input files using TCL.
> (2) NAMD will parse the input file and do the simulation.

This means that indeed NAMD embeds TCL, not the other way round

> (3) Another interesting thing is you can define a "proc" using TCL,
> and ask NAMD to call that proc each step and get information.
> Now the problem is that I got "OUT OF MEMORY" error while running
> the proc I defined in TCL(and I can see the memory increasing each
> step).

Does it happen when you run a trivial proc like

proc trivial {} {
puts Hi
}

? If so, blame the developer of NAMD. If not, post a minimal crashing
Tcl proc here. What you read about Tcl_Finalize only confused the
people who know Tcl well; it's a function that NAMD has to use and
probably does, buried in the C code, which you should not touch for now
unless you know what you are doing.

Christian

bingo

Sep 12, 2008, 2:54:27 PM

> Does it happen when you run a trivial proc like
>
> proc trivial {} {
> puts Hi
>
> }
>

No, it doesn't crash this way.

I think the problem might be that I'm doing too much floating-point math
using TCL.
This is the code I'm using to calculate derivatives using the chain rule
(kind of tedious but simple).

Also, I read this online: "there is a bug in the tcl interpreter (not
namd-specific) that shows up occasionally (and randomly) when doing
parallel runs. It only occurs on floating point math, and is relatively
rare." Is it true?

Thanks a lot.

# ===================================== #
# procedure to calculate the tedious #
# derivative #
# Wed Sep 10 21:44:23 PDT 2008 #
# ===================================== #
#chain_rule
proc derv {a b c d} {

global S Grad_S
#
#
# a, b, c, d : 4 vectors
# a, b : 2 points on line 1
# c, d : 2 points on line 2

# setup common vectors
set r1 $a
set r2 $c
set r12 [vecsub $r2 $r1] ;# r2-r1

set e1 [vecsub $b $a] ;# b - a
set e2 [vecsub $d $c] ;# d - c
set amc [vecscale $r12 -1] ;# a - c
set abc [vecadd [vecscale $e1 -1] $amc] ;# 2a -b -c
set acd [vecadd $amc $e2] ;# a -2c +d

# setup a few common dot products
set r12r12 [vecdot $r12 $r12] ;# r12 . r12
set r12e1 [vecdot $r12 $e1] ;# r12 . e1
set r12e2 [vecdot $r12 $e2] ;# r12 . e2
set e1e2 [vecdot $e1 $e2] ;# e1 . e2
set e1e1 [vecdot $e1 $e1] ;# e1 . e1
set e2e2 [vecdot $e2 $e2] ;# e2 . e2

# numerator of lamda
#a
set tmpa1 [vecscale $e2 $e1e2]
set tmpa2 [vecscale $e2 $r12e2]
set tmpa3 [vecscale $abc $e2e2]
set DL1Da [vecadd $tmpa1 $tmpa2 $tmpa3]
#b
set tmpb1 [vecscale $tmpa2 -1]
set tmpb2 [vecscale $amc [expr - $e2e2]]
set DL1Db [vecadd $tmpb1 $tmpb2]
#c
set tmpc1 [vecscale $e2 [expr -2 * $r12e1]]
set tmpc2 [vecscale $acd [expr - $e1e2]]
set tmpc3 [vecscale $e1 $r12e2]
set tmpc4 [vecscale $e1 $e2e2]
set DL1Dc [vecadd $tmpc1 $tmpc2 $tmpc3 $tmpc4]
#d
set tmpd1 [vecscale $tmpc1 -1]
set tmpd2 [vecscale $amc $e1e2]
set tmpd3 [vecscale $tmpc3 -1]
set DL1Dd [vecadd $tmpd1 $tmpd2 $tmpd3]

# numerator of mu
#a
set tmpMa1 [vecscale $e2 [expr - $e1e1]]
set tmpMa2 [vecscale $e2 $r12e1]
set tmpMa3 [vecscale $abc [expr - $e1e2]]
set tmpMa4 [vecscale $tmpc3 -2]
set DM1Da [vecadd $tmpMa1 $tmpMa2 $tmpMa3 $tmpMa4]
#b
set tmpMb1 [vecscale $tmpMa4 -1]
set tmpMb2 [vecscale $tmpMa2 -1]
set tmpMb3 $tmpd2
set DM1Db [vecadd $tmpMb1 $tmpMb2 $tmpMb3]
#c
set tmpMc1 [vecscale $acd $e1e1]
set tmpMc2 [vecscale $e1 $r12e1]
set tmpMc3 [vecscale $e1 [expr - $e1e2]]
set DM1Dc [vecadd $tmpMc1 $tmpMc2 $tmpMc3]
#d
set tmpMd1 [vecscale $amc [expr - $e1e1]]
set tmpMd2 [vecscale $tmpMc2 -1]
set DM1Dd [vecadd $tmpMd1 $tmpMd2]

# denominator of both lamda and mu
#a
set tmp1 [vecscale $tmpa1 2]
set tmp2 [vecscale $tmpc4 -2]
set DL2Da [vecadd $tmp1 $tmp2]
#b
set DL2Db [vecscale $DL2Da -1]
#c
set tmp3 [vecscale $tmpMa1 2]
set tmp4 [vecscale $tmpMc3 -2]
set DL2Dc [vecadd $tmp3 $tmp4]
#d
set DL2Dd [vecscale $DL2Dc -1]

# r12^2
set Dr12r12Da [vecscale $amc 2]
set Dr12r12Dc [vecscale $Dr12r12Da -1]

# r12e2
set Dr12e2Da [vecscale $e2 -1]
set Dr12e2Dc $acd
set Dr12e2Dd [vecscale $amc -1]

# r12e1
set Dr12e1Da $abc
set Dr12e1Db $Dr12e2Dd
set Dr12e1Dc $e1

# e1e2
set De1e2Da $Dr12e2Da
set De1e2Db $e2
set De1e2Dc [vecscale $e1 -1]
set De1e2Dd $e1
# e1e1
set De1e1Da [vecscale $e1 -2]
set De1e1Db [vecscale $e1 2]
# e2e2
set De2e2Dc [vecscale $e2 -2]
set De2e2Dd [vecscale $e2 2]

# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% #
# start to calculate the derivatives #
# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% #

set L1 [expr $r12e1*$e2e2 - $r12e2*$e1e2]
set L2 [expr $e1e1*$e2e2 - pow($e1e2,2)]
set lamda [expr $L1 / $L2]
set M1 [expr $r12e2*$e1e1 - $r12e1*$e1e2]
set mu [expr - $M1 / $L2]

set Grad_L1 [list $DL1Da $DL1Db $DL1Dc $DL1Dd]
set Grad_L2 [list $DL2Da $DL2Db $DL2Dc $DL2Dd]
set Grad_M1 [list $DM1Da $DM1Db $DM1Dc $DM1Dd]

set invL2 [expr 1. / $L2]
set L22 [expr pow($L2, 2)]
set L1L22 [expr $L1 / $L22]
set M1L22 [expr $M1 / $L22]

set Grad_lamda {}
set Grad_mu {}
for {set i 0} {$i <=3} {incr i} {

set DL1 [lindex $Grad_L1 $i]
set DM1 [lindex $Grad_M1 $i]
set DL2 [lindex $Grad_L2 $i]
lappend Grad_lamda [vecsub [vecscale $DL1 $invL2] [vecscale $DL2 $L1L22]]
lappend Grad_mu [vecsub [vecscale $DL2 $M1L22] [vecscale $DM1 $invL2]]

}

set zero {0 0 0}
set Grad_r12r12 [list $Dr12r12Da $zero $Dr12r12Dc $zero]
set Grad_e1e1 [list $De1e1Da $De1e1Db $zero $zero]
set Grad_e2e2 [list $zero $zero $De2e2Dc $De2e2Dd]
set Grad_e1e2 [list $De1e2Da $De1e2Db $De1e2Dc $De1e2Dd]
set Grad_r12e1 [list $Dr12e1Da $Dr12e1Db $Dr12e1Dc $zero]
set Grad_r12e2 [list $Dr12e2Da $zero $Dr12e2Dc $Dr12e2Dd]

set mu2 [expr pow($mu,2)]
set lamda2 [expr pow($lamda, 2)]
set S2 [expr $r12r12 + $mu2*$e2e2 + $lamda2*$e1e1 + 2*$mu*$r12e2 \
-2*$lamda*$r12e1 - 2*$lamda*$mu*$e1e2]
set S [expr sqrt($S2)]
# calculate the gradient of S
set Grad_S {}
set inv2s [expr 1./2./$S]

for {set i 0} {$i <= 3} {incr i} {

set Dmu [lindex $Grad_mu $i]
set Dlamda [lindex $Grad_lamda $i]
set De1e1 [lindex $Grad_e1e1 $i]
set De2e2 [lindex $Grad_e2e2 $i]
set De1e2 [lindex $Grad_e1e2 $i]
set Dr12e1 [lindex $Grad_r12e1 $i]
set Dr12e2 [lindex $Grad_r12e2 $i]

set part1 [lindex $Grad_r12r12 $i]
set part2 [vecadd [vecscale $Dmu [expr 2*$mu*$e2e2]] [vecscale $De2e2 $mu2] ]
set part3 [vecadd [vecscale $Dlamda [expr 2*$lamda*$e1e1]] \
[vecscale $De1e1 $lamda2] ]
set part4 [vecadd [vecscale $Dmu [expr 2*$r12e2]] \
[vecscale $Dr12e2 [expr 2*$mu]] ]
set part5 [vecadd [vecscale $Dlamda [expr -2*$r12e1]] \
[vecscale $Dr12e1 [expr -2*$lamda]] ]
set part6 [vecadd [vecscale $Dlamda [expr -2*$mu*$e1e2]] \
[vecscale $Dmu [expr -2*$lamda*$e1e2]] \
[vecscale $De1e2 [expr -2*$lamda*$mu]] ]
set Grad_S2 [vecadd $part1 $part2 $part3 $part4 $part5 $part6]
lappend Grad_S [vecscale $Grad_S2 $inv2s]
}

puts $S
puts $Grad_S

foreach var [info locals] {
unset $var
}

} ;# end of procedure

Uwe Klein

Sep 12, 2008, 3:12:56 PM

>
> } ;# end of procedure

bingo wrote:
>>Does it happen when you run a trivial proc like
>>
>>proc trivial {} {
>> puts Hi
>>
>>}
>>
>
>
> No, it doesn't crash this way.
>
> I think the problem might be that I'm doing too much float point math
> using TCL.
> This is the code I'm using to calculate derivatives using chain rule.
> (kind of tedious but simple).
>
> Also, I read this "there is a bug in the tcl interpreter (not
> namd-specific) that shows up occasionally (and randomly) when doing
> parallel runs. It only occurs on floating point math, and is
> relatively
> rare. " online. Is it true?
>
> Thanks a lot.
>
> # ===================================== #
> # procedure to calculate the tedious #
> # derivative #
> # Wed Sep 10 21:44:23 PDT 2008 #
> # ===================================== #
> #chain_rule
> proc derv {a b c d} {
>

no other globals or vars in global scope?
> global S Grad_S

you should not need this:

> foreach var [info locals] {
> unset $var
> }

How many invocations can you do before you get OOM?

You could make variants of the little test proc [trivial],
each using just _one_ of the NAMD-provided procs.

That might give an idea of which one gobbles up memory.

uwe
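
A rough sketch of such a probe follows. Everything in it is an assumption for illustration: `rss_kb` and `probe` are made-up helper names, the memory reading relies on the Linux-only file /proc/self/status, and the `vecscale` defined here is only a pure-Tcl stand-in so the snippet runs outside NAMD (inside NAMD the real command is used instead).

```tcl
# Hypothetical stand-in so the sketch runs outside NAMD.
if {[llength [info commands vecscale]] == 0} {
    proc vecscale {v s} {
        set out {}
        foreach x $v { lappend out [expr {$x * $s}] }
        return $out
    }
}

# Resident set size in kB (Linux only); -1 where /proc is unavailable.
proc rss_kb {} {
    if {[catch {open /proc/self/status r} f]} { return -1 }
    set text [read $f]
    close $f
    if {[regexp {VmRSS:\s+(\d+)} $text -> kb]} { return $kb }
    return -1
}

# Run one script n times and return how much the process grew, in kB.
proc probe {n script} {
    set before [rss_kb]
    for {set i 0} {$i < $n} {incr i} {
        uplevel 1 $script
    }
    return [expr {[rss_kb] - $before}]
}

puts "vecscale: [probe 10000 {vecscale {1 2 3} 2.0}] kB"
```

Repeating the last line with one command at a time (vecadd, vecsub, vecdot, ...) should show which of them makes the process grow steadily.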

Alexandre Ferrieux

Sep 12, 2008, 3:46:29 PM

On Sep 12, 8:54 pm, bingo <zhn...@gmail.com> wrote:
>
> I think the problem might be that I'm doing too much float point math
> using TCL.

Floating-point per se won't harm you much on the memory scale.
However, looking at your code, a likely culprit would be the NAMD
implementation of primitives like [vecscale] and [vecadd], which you
call very frequently, and which allocate a vector or list for their
return value. If I were to guess, I'd say one of these fails to
decrement properly the refcount of these returned vectors, which leads
to systematic leaks.

Please tell me which version of NAMD you're using, I'll take a peek
inside.

-Alex

Robert Heller

Sep 12, 2008, 3:59:59 PM

At Fri, 12 Sep 2008 11:54:27 -0700 (PDT) bingo <zhn...@gmail.com> wrote:

>
>
> > Does it happen when you run a trivial proc like
> >
> > proc trivial {} {
> >    puts Hi
> >
> > }
> >
>
> No, it doesn't crash this way.
>
> I think the problem might be that I'm doing too much float point math
> using TCL.
> This is the code I'm using to calculate derivatives using chain rule.
> (kind of tedious but simple).
>
> Also, I read this "there is a bug in the tcl interpreter (not
> namd-specific) that shows up occasionally (and randomly) when doing
> parallel runs. It only occurs on floating point math, and is
> relatively
> rare. " online. Is it true?
>
> Thanks a lot.
>

> # ===================================== #
> # procedure to calculate the tedious #
> # derivative #
> # Wed Sep 10 21:44:23 PDT 2008 #
> # ===================================== #

> #chain_rule
> proc derv {a b c d} {
>
> global S Grad_S

Is there some reason these are global?

> #
> #
> # a, b, c, d : 4 vectors
> # a, b : 2 points on line 1
> # c, d : 2 points on line 2
>
> # setup common vectors
> set r1 $a
> set r2 $c
> set r12 [vecsub $r2 $r1] ;# r2-r1

What does vecsub do? Does it allocate some special data object? Is the
object ever 'garbage collected'? (Ditto for all of the other vec*
functions.) Do you need to do something like a 'vecfree' call?

You *should* be using braces with expr:

set L1 [expr {$r12e1*$e2e2 - $r12e2*$e1e2}]

although this is somewhat of a nitpick.
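
A small sketch of what the braces change, using made-up sample values (only the variable names r12e1, e2e2, r12e2 and e1e2 come from the posted proc; `a` and `v` are hypothetical). Without braces, Tcl first substitutes the variables into a string and expr then parses that string, so the expression is substituted twice and cannot be byte-compiled as a whole.

```tcl
set r12e1 1.5
set e2e2  2.0
set r12e2 0.5
set e1e2  4.0

# Unbraced: variables are pasted into a string that expr must re-parse.
set L1_unbraced [expr $r12e1*$e2e2 - $r12e2*$e1e2]
# Braced: expr reads the variables itself; no string round trip.
set L1_braced   [expr {$r12e1*$e2e2 - $r12e2*$e1e2}]
puts "$L1_unbraced $L1_braced"   ;# 1.0 1.0

# The double substitution becomes visible when a value looks like Tcl code.
set a 2
set v {$a}
puts [expr $v]     ;# 2   (the text of v was substituted a second time)
puts [expr {$v}]   ;# $a  (v is used as a plain value)
```

For purely numeric values both forms give the same answer, which is why this is a nitpick as far as the memory problem goes.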

> set Grad_L1 [list $DL1Da $DL1Db $DL1Dc $DL1Dd]
> set Grad_L2 [list $DL2Da $DL2Db $DL2Dc $DL2Dd]
> set Grad_M1 [list $DM1Da $DM1Db $DM1Dc $DM1Dd]
>
> set invL2 [expr 1. / $L2]
> set L22 [expr pow($L2, 2)]
> set L1L22 [expr $L1 / $L22]
> set M1L22 [expr $M1 / $L22]
>
> set Grad_lamda {}
> set Grad_mu {}

> for {set i 0} {$i <=3} {incr i} {

>
> set DL1 [lindex $Grad_L1 $i]
> set DM1 [lindex $Grad_M1 $i]
> set DL2 [lindex $Grad_L2 $i]
> lappend Grad_lamda [vecsub [vecscale $DL1 $invL2] [vecscale $DL2 $L1L22]]
> lappend Grad_mu [vecsub [vecscale $DL2 $M1L22] [vecscale $DM1 $invL2]]
>
> }
>
> set zero {0 0 0}
> set Grad_r12r12 [list $Dr12r12Da $zero $Dr12r12Dc $zero]
> set Grad_e1e1 [list $De1e1Da $De1e1Db $zero $zero]
> set Grad_e2e2 [list $zero $zero $De2e2Dc $De2e2Dd]
> set Grad_e1e2 [list $De1e2Da $De1e2Db $De1e2Dc $De1e2Dd]
> set Grad_r12e1 [list $Dr12e1Da $Dr12e1Db $Dr12e1Dc $zero]
> set Grad_r12e2 [list $Dr12e2Da $zero $Dr12e2Dc $Dr12e2Dd]
>
> set mu2 [expr pow($mu,2)]
> set lamda2 [expr pow($lamda, 2)]
> set S2 [expr $r12r12 + $mu2*$e2e2 + $lamda2*$e1e1 + 2*$mu*$r12e2 \
> -2*$lamda*$r12e1 - 2*$lamda*$mu*$e1e2]
> set S [expr sqrt($S2)]
> # calculate the gradient of S
> set Grad_S {}
> set inv2s [expr 1./2./$S]
>

> for {set i 0} {$i <= 3} {incr i} {

--
Robert Heller -- Get the Deepwoods Software FireFox Toolbar!
Deepwoods Software -- Linux Installation and Administration
http://www.deepsoft.com/ -- Web Hosting, with CGI and Database
hel...@deepsoft.com -- Contract Programming: C/C++, Tcl/Tk

Cameron Laird

Sep 12, 2008, 3:14:09 PM

In article <da2dec01-9d15-42d9...@c22g2000prc.googlegroups.com>,

bingo <zhn...@gmail.com> wrote:
>
>> Does it happen when you run a trivial proc like
>>
>> proc trivial {} {
>> puts Hi
>>
>> }
>>
>
>No, it doesn't crash this way.
>
>I think the problem might be that I'm doing too much float point math
>using TCL.
>This is the code I'm using to calculate derivatives using chain rule.
>(kind of tedious but simple).
>
>Also, I read this "there is a bug in the tcl interpreter (not
>namd-specific) that shows up occasionally (and randomly) when doing
>parallel runs. It only occurs on floating point math, and is
>relatively
>rare. " online. Is it true?

.
.
.
It certainly would surprise me; I assume a few of the Core
specialists will look at <URL:
http://ftp.ks.uiuc.edu/Research/namd/mailing_list/namd-l/6196.html >
and comment.

In general, it is NOT possible to "do too much float point math
using Tcl": there are very large, very serious Tcl applications
that have been doing abundant floating-point calculations
continuously for *years*.

Cameron Laird

Sep 12, 2008, 3:08:40 PM

In article <f33b8421-3dbd-4c05...@s20g2000prd.googlegroups.com>,
bingo <zhn...@gmail.com> wrote:
.
.
.

> Thanks a lot for your response.
> Actually, I'm really not that clear about the relationship between C
>code and TCL script. I'm guess (3) because the way how it works is:
> (1) I can write some input files using TCL.
> (2) NAMD will parse the input file and do the simulation.
> (3) Another interesting thing is you can define a "proc" using TCL,
>and ask NAMD to call that proc each step and get information.
> Now the problem is that I got "OUT OF MEMORY" error while running
>the proc I defined in TCL(and I can see the memory increasing each
>step). At first I thought it could be that the variables in the proc I
>defined are not freed. But I tried to unset them explicitly, and still
>the same problem. Then I start to google and being a little desperate,
>that's where I find the function "TCL_Finalize", which I don't
>understand at all.
> Any suggestion about the error will be greatly appreciated.

.
.
.
"Working too hard" is my label for this situation.

I understand the sense of what you've done. It's quite far
from the simplest possible, though.

Put away the source code at the level of TCL_Finalize().
You have done well to note that "memory increas[es] each
step". Is the [proc] you defined small enough that you
can reasonably write it here? If not, make it small!--
that is, simplify the system down to something minimal
that exhibits the bad memory behavior, but is small
enough to be understandable.

It's hard for a simple, small proc to run out of memory
without something obvious to correct.

We can get through this.

Cameron Laird

Sep 12, 2008, 3:19:20 PM

In article <da2dec01-9d15-42d9...@c22g2000prc.googlegroups.com>,
bingo <zhn...@gmail.com> wrote:
>

>> Does it happen when you run a trivial proc like
>>
>> proc trivial {} {
>> puts Hi
>>
>> }
>>
>
>No, it doesn't crash this way.
>
>I think the problem might be that I'm doing too much float point math
>using TCL.
>This is the code I'm using to calculate derivatives using chain rule.
>(kind of tedious but simple).

.
.
.

> set Grad_S2 [vecadd $part1 $part2 $part3 $part4 $part5 $part6]
> lappend Grad_S [vecscale $Grad_S2 $inv2s]
> }
>
> puts $S
> puts $Grad_S
>
> foreach var [info locals] {
> unset $var
> }
>
>} ;# end of procedure

Does the NAMD manual counsel you to [unset] local variables?
Does the NAMD manual have anything to say on the subject of
"memory", and specifically, does its index include that rubric?

I suspect NAMD has documented something--something about usage
or semantics--that you haven't noticed.

How does it happen that you aren't talking with the NAMD
specialists about this?

bingo

Sep 12, 2008, 4:16:40 PM

Hi, All,
Thanks so much for all your replies!! You guys are really awesome!!

(1):
The only reason I'm using the "global S Grad_S" is I don't want to
unset them while executing:

foreach var [info locals] {
unset $var
}

(2): I'm now using NAMD 2.6.
vecadd : vector addition
vecsub : vector subtraction
vecscale : scale a vector with a scalar
vecdot : vector dot product
These functions are all implemented in NAMD in C++, I think, and I
haven't taken a look at how they work. (Sorry, I should do this.)

(3)

>If I were to guess, I'd say one of these fails to
>decrement properly the refcount of these returned vectors, which leads
>to systematic leaks.

This is really a good point.

>Please tell me which version of NAMD you're using, I'll take a peek
>inside.

Thank you so much for doing this.

Bin

miguel

Sep 12, 2008, 5:29:02 PM

Cameron Laird wrote:
>> Also, I read this "there is a bug in the tcl interpreter (not
>> namd-specific) that shows up occasionally (and randomly) when doing
>> parallel runs. It only occurs on floating point math, and is
>> relatively
>> rare. " online. Is it true?

I do not recall ever seeing a ticket describing this bug, nor code that would
expose it. I wonder where the "not namd-specific" comes from too - are there
other refs in the net?

> It certainly would surprise me; I assume a few of the Core
> specialists will look at <URL:
> http://ftp.ks.uiuc.edu/Research/namd/mailing_list/namd-l/6196.html >
> and comment.
>
> In general, it is NOT possible to "do too much float point math
> using Tcl": there are very large, very serious Tcl applications
> that have been doing abundant floating-point calculations
> continuously for *years*.

If I were a betting man, I'd wager that namd is violating Tcl's threading
principle: "an interp may only be called from the thread that created it".

What makes me suspect something along those lines is: "(it) shows up
occasionally (and randomly) when doing parallel runs". Also "As long as tcl
doesn't do any floating point math in your script (ie, it's all handled using
calls to the compiled extension), this crash can't happen". Sure: you are then
not allocating Tcl_Objs or memory through Tcl ...

Violation of the principle will do nasty things in the mem allocator: leaks,
crashes, volcanos ... anything can happen.

Or else it really is a threading bug in Tcl. No way to know without more info.

As to the rest of that comment:

Corrupted computations are possible. Dismissing this and "accept it as an
occasional run-crasher, and just restart from the latest checkpoint" is a bit
naïve, even if the reporter says "I've verified (through pain of lots of print
statements) that even when the crash does occur in tclforces, all of the
calculations were correct". Not being able to observe the effects of a race
condition when one is looking is known not to be a proof of their non-existence.
What do you know, the print statements themselves may be enough to diminish
their likelihood massively.

So, namd people: can we have a proper bug report? "Until the underlying tcl
implementation is fixed" will take forever without that.

Miguel

bingo

Sep 12, 2008, 6:19:46 PM

Hi, All
I just checked the function "vecscale" in the NAMD source file
"TclCommands.C" and found something suspicious.
It seems to me that the variables (scalar and vector) defined on line
171 are not freed.
Will this be a problem?
But when I tried to modify it as shown in the code (Bin Edit), the code
just failed to run.
Any suggestions?
Thanks a lot.
Bin

/* ====================================== */

134 // Function: vecscale
135 // Returns: scalar * vector or vector * scalar
136 // speedup is 1228/225 = 5.5 fold
137 int proc_vecscale(ClientData, Tcl_Interp *interp, int argc,
138                   char *argv[])
139 {
140   if (argc == 1) {
141     Tcl_SetResult(interp,"no value given for parameter \"c\" to \"vecscale\"",TCL_VOLATILE);
142     return TCL_ERROR;
143   }
144
145   if (argc == 2) {
146     Tcl_SetResult(interp,"no value given for parameter \"v\" to \"vecscale\"",TCL_VOLATILE);
147     return TCL_ERROR;
148   }
149   if (argc != 3) {
150     Tcl_SetResult(interp,"called \"vecscale\" with too many arguments",TCL_VOLATILE);
151     return TCL_ERROR;
152   }
153
154   int num1, num2;
155   char **data1, **data2;
156   if (Tcl_SplitList(interp, argv[1], &num1, &data1) != TCL_OK) {
157     return TCL_ERROR;
158   }
159   if (Tcl_SplitList(interp, argv[2], &num2, &data2) != TCL_OK) {
160     Tcl_Free((char*) data1);
161     return TCL_ERROR;
162   }
163   int result = TCL_OK;
164   if (num1 == 0 || num2 == 0) {
165     result = TCL_ERROR;
166     Tcl_SetResult(interp,"vecscale: parameters must have data",TCL_VOLATILE);
167   } else if (num1 != 1 && num2 != 1) {
168     result = TCL_ERROR;
169     Tcl_SetResult(interp,"vecscale: one parameter must be a scalar value",TCL_VOLATILE);
170   } else {

/* ========= suspicious =========== */
171     char *scalar, **vector;

172     int num;
173     if (num1 == 1) {
174       scalar = data1[0];
175       vector = data2;
176       num = num2;
177     } else {
178       scalar = data2[0];
179       vector = data1;
180       num = num1;
181     }
182     char s[TCL_DOUBLE_SPACE];
183     double val1, val2;
184     if (Tcl_GetDouble(interp, scalar, &val1) != TCL_OK) {
185       result = TCL_ERROR;
186     } else {
187       for (int i=0; i<num; i++) {
188         if (Tcl_GetDouble(interp, vector[i], &val2) != TCL_OK) {
189           Tcl_SetResult(interp,"vecscale: vector contains a non-number",TCL_VOLATILE);
190           result = TCL_ERROR;
191           break;
192         }
193         Tcl_PrintDouble(interp, val1 * val2, s);
194         Tcl_AppendElement(interp, s);
195       }
196     }
/* ============================ */
197     // Bin Edit
198     //Tcl_Free((char*) scalar);
199     Tcl_Free((char*) vector);
201     // Bin Edit
/* ============================ */
202   }
203   Tcl_Free((char*) data1);
204   Tcl_Free((char*) data2);
205   return result;
206 }

Robert Heller

Sep 12, 2008, 9:04:53 PM

No memory is allocated above! data1 and data2 were allocated in the
Tcl_SplitList()s above (lines 156 and 159) and are freed below in lines
203 and 204. The above code just moves pointers around and your
Tcl_Free()s at lines 198 & 199 are unneeded (and are in fact harmful).

It appears (from the above code sample) that NAMD's 'vec*' functions
allocate Tcl lists -- this is safe. Tcl will properly reclaim the
memory used by these lists when their refcount drops to 0.

Cameron Laird

Sep 13, 2008, 6:50:44 PM

In article <ab685931-861c-423a...@z6g2000pre.googlegroups.com>,

bingo <zhn...@gmail.com> wrote:
>Hi, All,
> Thanks so much for all your replies!! You guys are really awesome!!
>
>(1):
> The only reason I'm using the "global S Grad_S" is I don't want to
>unset them while executing:
> foreach var [info locals] {
> unset $var
> }

.
.
.
While my personal belief is that you'll eventually dispose of
the "... unset $var" entirely, in the meantime, you're quite
free to modify it to

    foreach var [info locals] {
        switch -- $var {
            S -
            Grad_S {}
            default {
                unset $var
            }
        }
    }

Alexandre Ferrieux

Sep 14, 2008, 5:38:46 PM

On Sep 12, 10:16 pm, bingo <zhn...@gmail.com> wrote:
>
> >If I were to guess, I'd say one of these fails to
> >decrement properly the refcount of these returned vectors, which leads
> >to systematic leaks.
>
> This is really a good point.

OK, I think I got it.

Short answer: [vecadd] with more than 2 args leaks memory.

Long answer: see function proc_vecadd() in TclCommands.C:
(error cases elided as {...})

  for (int term=2; term < argc; term++) {
    if (Tcl_SplitList(interp, argv[term], &num2, &data) != TCL_OK) {...}
    if (num != num2) {...}
    for (i=0; i<num; i++) {
      double df;
      if (Tcl_GetDouble(interp, data[i], &df) != TCL_OK) {...}
      sum[i] += df;
    }
  }

As you can see, each turn of the outer loop uses Tcl_SplitList to
allocate "data" (an array of char*), and fails to free it. A single
Tcl_Free(data) at the end of the func frees just the last of them. As
a consequence, the 2-arg variant doesn't leak. That may explain why it
was not spotted earlier.

Here is the fix: just move the "Tcl_Free((char *)data)" from the end
of the function to just before the closing brace of the "term" loop.
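
Until a fixed NAMD build is available, a script-level workaround might be to feed [vecadd] only two vectors at a time, since by the analysis above the 2-arg form does not leak. This is only a sketch: `vecadd_pairwise` is a made-up helper name, and the `vecadd` defined here is a pure-Tcl stand-in so the snippet runs outside NAMD (inside NAMD the real command is used instead).

```tcl
# Hypothetical stand-in so the sketch runs outside NAMD.
if {[llength [info commands vecadd]] == 0} {
    proc vecadd {u v} {
        set out {}
        foreach x $u y $v { lappend out [expr {$x + $y}] }
        return $out
    }
}

# Sum any number of vectors while only ever calling vecadd with two arguments.
proc vecadd_pairwise {first args} {
    set sum $first
    foreach v $args {
        set sum [vecadd $sum $v]
    }
    return $sum
}

puts [vecadd_pairwise {1 2 3} {10 20 30} {100 200 300}]   ;# 111 222 333
```

In the posted proc this would mean replacing calls such as [vecadd $tmpa1 $tmpa2 $tmpa3] with [vecadd_pairwise $tmpa1 $tmpa2 $tmpa3].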

As a side note, I'm a bit surprised to see all these vector-
manipulation routines implemented against the old string-based API to
Tcl. This means they are very slow, spending most of their time
converting between strings and doubles. Worse, the inner product is
written in Tcl ! At the same time, considering what NAMD does, I can
imagine that those individual-vector primitives exposed to Tcl are
*not* the ones really used for the bulk of the parallelized
computation... But still it may surprise some users who may ignore
this "don't use for big stuff" spirit !

Could you please propagate this to NAMD developers ?

-Alex

beaker...@gmail.com

Sep 14, 2008, 6:16:21 PM

On Sep 14, 4:38 pm, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:
>

> Here is the fix: just move the "Tcl_Free((char *)data)" from the end
> of the function to just before the closing brace of the "term" loop.

Thanks for pointing this out! I'll forward this to the head namd
developer (I'm a grad student in the group that maintains NAMD).

>
> As a side note, I'm a bit surprised to see all these vector-
> manipulation routines implemented against the old string-based API to
> Tcl. This means they are very slow, spending most of their time
> converting between strings and doubles. Worse, the inner product is
> written in Tcl ! At the same time, considering what NAMD does, I can
> imagine that those individual-vector primitives exposed to Tcl are
> *not* the ones really used for the bulk of the parallelized
> computation... But still it may surprise some users who may ignore
> this "don't use for big stuff" spirit !

This is an interesting point, and I'll also point it out to the
developers. Just for your reference, your guess that these vector
primitives are not used for the bulk of calculations is correct. The
tcl interface (tclforces) that this is a part of is designed to allow
users to add very simple additional forces to their simulations, which
are primarily done in the NAMD core (which is all C++ and quite
heavily tuned); the example that started this thread is not a typical
use case for the tclforces interface. Still, there's no reason to have
a suboptimal implementation of these primitives; thanks for noting it.

-Peter

bingo

Sep 21, 2008, 12:21:28 PM

On Sep 14, 2:38 pm, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

Sorry for the late reply, I was out of town in the last week.
Anyway, thank you so much for your kind help. I will definitely send
this to the NAMD developers.

-Bin

cde...@speakeasy.net

Sep 23, 2008, 12:39:38 AM

bingo wrote:
> Hi, All
> I just checked the function "vecscale" in NAMD source file:
> "TclCommands.C" and found something suspicious.
> It seems to me that the variables (scalar and vector) defined on line
> 171 are not freed.
> Will this be problem?

No. scalar and vector are automatic C variables that go out of scope at
the end of the block. The Tcl variables that are created (and pointed to
by scalar and vector) are data1 and data2. Those appear to be freed
correctly at the end of the function.

> But when I try to modify it as in the code (Bin Edit), then the code
> just failed running.
> Any suggestions?
> Thanks a lot.
> Bin

[... deleted ...]

this is the end of scope for scalar and vector

> 196 }
> /* ============================ */
> 197 // Bin Edit
> 198 //Tcl_Free((char*) scalar);
> 199 Tcl_Free((char*) vector);
> 201 // Bin Edit

these should cause compiler errors.

Alexandre Ferrieux

Sep 23, 2008, 5:08:18 AM

On Sep 23, 6:39 am, "cden...@speakeasy.net" <cden...@speakeasy.net>
wrote:

The bug was found, see

http://groups.google.com/group/comp.lang.tcl/tree/browse_frm/thread/8859715ec6df90c2/f65fb960aa947e54?rnum=11&_done=%2Fgroup%2Fcomp.lang.tcl%2Fbrowse_frm%2Fthread%2F8859715ec6df90c2%2Fc57681d3d6ed87c6%3F#doc_10ad4c96d554bca7

-Alex