
Tcl Double Free Memory Corruption with custom Tcl_Obj* and "scripting error"


sam appleton

May 13, 2011, 1:31:01 AM
Hi

I am flummoxed by this bug and could use some help.
Our application requires returning lots of small objects back and
forth between a C++ application and
a TCL shell. We use SWIG, but we also use a custom "Tcl_Obj*" based
interface for many of the
very-commonly used object types for efficiency reasons.

We defined a new type in TCL using

Tcli_ObjectList_t = (Tcl_ObjType*) Tcl_Alloc(sizeof(Tcl_ObjType));

I then create the four internal procs: freeIntRepProc,
dupIntRepProc, updateStringProc and setFromAnyProc.
For setFromAnyProc for this custom datatype, I just have it return
TCL_ERROR, since we cannot
set this datatype from an arbitrary string representation.
Internally, we use the (obj)->internalRep.otherValuePtr
to represent the pointer to the C++ "ObjectList" object.

The problem comes when the user enters something that is an error. We
get a glibc double free segfault, and
I traced this to TCL attempting to free the same object twice. The
user error is something like


set x [get_objects yyyyy]xxx

The get_objects is a registered command that returns one of the custom
TCL types Tcli_ObjectList_t.
It returns a valid value, but the user has entered something incorrect
and the resulting "object" is
turned into a string and assigned to x. The original "object" is
immediately garbage collected, since it
goes out of scope immediately. But subsequently, the variable "x" goes
out of scope, and "some object"
gets freed - which glibc's the software.

This problem has cost me so much time I'd pay someone who knows how to
resolve it.
If anyone out there knows something, I would much appreciate a reply.

Best

Sam Appleton
Ausdia Inc.
650 242 2908

Alexandre Ferrieux

May 13, 2011, 3:15:30 AM

A first line of attack on the problem would be to check that your
refcounting is okay. Since you can use the script level, I'd suggest
trying:

puts [::tcl::unsupported::representation [get_objects yyyyy]]

You should get something like

value is a ObjectList with a refcount of 1, object pointer at
0x8ce61e0, internal representation 0x8f090cee:0x3feaed54, no string
representation.

Can you check this ?

-Alex

Donal K. Fellows

May 13, 2011, 3:56:56 AM
On May 13, 6:31 am, sam appleton <sam.s.apple...@gmail.com> wrote:
> I am flummoxed by this bug and could use some help.
> Our application requires returning lots of small objects back and
> forth between a C++ application and a TCL shell. We use SWIG, but
> we also use a custom "Tcl_Obj*" based interface for many of the
> very-commonly used object types for efficiency reasons.

Fair enough.

> We defined a new type in TCL using
>
> Tcli_ObjectList_t = (Tcl_ObjType*) Tcl_Alloc(sizeof(Tcl_ObjType));

As a side note, the design of the Tcl_Obj system pretty much assumes
that a Tcl_ObjType is allocated at compile time (i.e., that you've
declared the structure value as 'static const') and that you never
ever modify it.
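
A minimal sketch of that advice, for illustration only. So that it builds
without tcl.h, it uses hypothetical stand-ins (DemoObj, DemoObjType,
DEMO_ERROR) whose field names follow Tcl_ObjType but whose signatures are
simplified; GetObjectListType and DemoDescriptorCheck are likewise made up
for the example. The point is only that the descriptor is a single
'static const' value rather than something obtained from Tcl_Alloc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so the sketch builds without tcl.h; the field
 * names follow the real Tcl_ObjType, the signatures are simplified. */
struct DemoObj;                        /* stands in for Tcl_Obj */
typedef struct DemoObjType {           /* stands in for Tcl_ObjType */
    const char *name;
    void (*freeIntRepProc)(struct DemoObj *objPtr);
    void (*dupIntRepProc)(struct DemoObj *srcPtr, struct DemoObj *dupPtr);
    void (*updateStringProc)(struct DemoObj *objPtr);
    int  (*setFromAnyProc)(void *interp, struct DemoObj *objPtr);
} DemoObjType;

enum { DEMO_OK = 0, DEMO_ERROR = 1 };  /* mirror TCL_OK / TCL_ERROR */

/* The type cannot be built from an arbitrary string, so always refuse. */
static int ObjectListSetFromAny(void *interp, struct DemoObj *objPtr)
{
    (void)interp;
    (void)objPtr;
    return DEMO_ERROR;
}

/* One immutable descriptor with static storage duration: it is never
 * allocated, freed, or modified at run time. The other three procs are
 * left NULL here only to keep the sketch short. */
static const DemoObjType objectListType = {
    "ObjectList",
    NULL,                  /* freeIntRepProc */
    NULL,                  /* dupIntRepProc */
    NULL,                  /* updateStringProc */
    ObjectListSetFromAny
};

/* Every caller sees the same descriptor address for the program's life. */
const DemoObjType *GetObjectListType(void)
{
    return &objectListType;
}

/* Usage check: the descriptor is shared, and string conversion fails. */
int DemoDescriptorCheck(void)
{
    const DemoObjType *a = GetObjectListType();
    const DemoObjType *b = GetObjectListType();
    assert(a == b);
    return a->setFromAnyProc(NULL, NULL);
}
```

In real code the struct would be Tcl_ObjType from tcl.h and the procs would
take the real Tcl argument types.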

> The problem comes when the user enters something that is an error.
> We get a glibc double free segfault, and I traced this to TCL
> attempting to free the same object twice. The user error is something
> like
>
> set x [get_objects yyyyy]xxx

At a guess, it's an issue in your freeIntRepProc. That should be
called during the processing of the above, at which point you should
ensure that the internal representation of the object is completely
cleared. That includes setting the typePtr field on your custom
Tcl_Obj value back to NULL.
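
To illustrate what "completely cleared" could look like, here is a rough
sketch under stated assumptions. DemoObj and DemoObjType are hypothetical,
simplified stand-ins for Tcl_Obj and Tcl_ObjType so the code builds without
tcl.h; malloc/free stands in for creating and deleting the C++ ObjectList;
DemoFreeIntRep only models the kind of "is there a typed internal rep?"
check that generic code performs, and is not Tcl's actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins so the sketch builds without tcl.h. */
struct DemoObj;
typedef struct DemoObjType {
    const char *name;
    void (*freeIntRepProc)(struct DemoObj *objPtr);
} DemoObjType;

typedef struct DemoObj {
    const DemoObjType *typePtr;    /* NULL means "no internal rep" */
    void *otherValuePtr;           /* models internalRep.otherValuePtr */
} DemoObj;

static int payloadFrees = 0;       /* counts real releases, for the demo */

static void ObjectListFreeIntRep(DemoObj *objPtr)
{
    free(objPtr->otherValuePtr);   /* release the payload */
    payloadFrees++;
    objPtr->otherValuePtr = NULL;  /* leave no dangling pointer behind */
    objPtr->typePtr = NULL;        /* object no longer claims our type */
}

static const DemoObjType objectListType = {
    "ObjectList",
    ObjectListFreeIntRep
};

/* Models how generic code decides whether to call the type's free proc. */
static void DemoFreeIntRep(DemoObj *objPtr)
{
    if (objPtr->typePtr != NULL && objPtr->typePtr->freeIntRepProc != NULL) {
        objPtr->typePtr->freeIntRepProc(objPtr);
    }
}

/* Asks for the internal rep to be freed twice and reports how many times
 * the payload was actually released. */
int DemoDoubleFreeRequest(void)
{
    DemoObj obj = { &objectListType, malloc(16) };
    payloadFrees = 0;
    assert(obj.otherValuePtr != NULL);
    DemoFreeIntRep(&obj);  /* e.g. the value is turned into a plain string */
    DemoFreeIntRep(&obj);  /* e.g. the object is finally destroyed: skipped */
    return payloadFrees;
}
```

Because the first call resets typePtr, the second request finds no typed
internal rep and the payload is released only once. If typePtr were left
pointing at the descriptor, the second request would reach the free proc
again.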

Donal.

Alexandre Ferrieux

May 13, 2011, 4:30:57 AM
On 13 May, 09:15, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

Also, since we are 9 timezones apart, a few more things that could
help you (apologies if you find them obvious).
A more accurate description of what happens in

set x [get_objects yyyyy]xxx

would be:

- [get_objects] returns a refcount-1 Tcl_Obj with type ObjectList
(and possibly null string rep, depends on your code)

- to prepare appending "xxx" to it as a string, Tcl computes its
string rep if needed (with updateStringProc)

- the appending itself happens on an unshared value (refcount==1)
so is done in-place, hence the internal rep is freed (freeIntRepProc),
the type becomes null (pure string), and the appending is done on the
string rep computed above.

- the variable x still points to the very same Tcl_Obj, which is
now a pure string, so has no remaining relationship with your custom
type

- later, when x is unset, the Tcl_Obj reaches refcount 0 and is
freed, i.e., returned to the free pool.
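
The sequence above can be modelled in a few lines of C. This is only a toy
model under stated assumptions: DemoObj and DemoObjType are hypothetical
stand-ins for Tcl_Obj and Tcl_ObjType, the helper names are made up, the
string rep "objlist" is invented, and the real Tcl core is considerably
more involved. It is meant as something to compare a single-stepping
session against, not as a description of Tcl internals.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins so the model builds without tcl.h. */
typedef struct DemoObj DemoObj;
typedef struct DemoObjType {
    void (*freeIntRepProc)(DemoObj *objPtr);
    void (*updateStringProc)(DemoObj *objPtr);
} DemoObjType;

struct DemoObj {
    int refCount;
    char *bytes;                  /* string rep, NULL if not yet computed */
    const DemoObjType *typePtr;   /* NULL for a pure string */
    void *otherValuePtr;          /* models internalRep.otherValuePtr */
};

static int intRepFrees = 0;       /* how often the payload was released */

static void ListFreeIntRep(DemoObj *o)
{
    free(o->otherValuePtr);
    o->otherValuePtr = NULL;
    o->typePtr = NULL;
    intRepFrees++;
}

static void ListUpdateString(DemoObj *o)
{
    const char *text = "objlist"; /* invented string rep for the demo */
    o->bytes = malloc(strlen(text) + 1);
    assert(o->bytes != NULL);
    strcpy(o->bytes, text);
}

static const DemoObjType listType = { ListFreeIntRep, ListUpdateString };

/* Step 1: the command returns a typed object with no string rep. */
static DemoObj *NewListObj(void)
{
    DemoObj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refCount = 1;
    o->bytes = NULL;
    o->typePtr = &listType;
    o->otherValuePtr = malloc(8);
    assert(o->otherValuePtr != NULL);
    return o;
}

/* Steps 2 and 3: compute the string rep, drop the internal rep, then
 * append in place because the object is unshared. */
static void AppendInPlace(DemoObj *o, const char *suffix)
{
    size_t oldLen;
    char *grown;
    assert(o->refCount == 1);
    if (o->bytes == NULL && o->typePtr != NULL) {
        o->typePtr->updateStringProc(o);
    }
    if (o->typePtr != NULL && o->typePtr->freeIntRepProc != NULL) {
        o->typePtr->freeIntRepProc(o);
    }
    o->typePtr = NULL;            /* now a pure string */
    assert(o->bytes != NULL);
    oldLen = strlen(o->bytes);
    grown = realloc(o->bytes, oldLen + strlen(suffix) + 1);
    assert(grown != NULL);
    memcpy(grown + oldLen, suffix, strlen(suffix) + 1);
    o->bytes = grown;
}

/* Step 5: the last reference goes away and the object is destroyed. */
static void DecrRefCount(DemoObj *o)
{
    if (--o->refCount > 0) {
        return;
    }
    if (o->typePtr != NULL && o->typePtr->freeIntRepProc != NULL) {
        o->typePtr->freeIntRepProc(o);
    }
    free(o->bytes);
    free(o);
}

/* Runs the whole sequence, copies the final string into out, and returns
 * the number of times the payload was released. */
int DemoLifecycle(char *out, size_t outSize)
{
    DemoObj *x;
    intRepFrees = 0;
    x = NewListObj();             /* step 1 */
    AppendInPlace(x, "xxx");      /* steps 2 and 3 */
    snprintf(out, outSize, "%s", x->bytes);  /* step 4: x is that object */
    DecrRefCount(x);              /* step 5 */
    return intRepFrees;
}
```

In this model the payload is released once, during the append, and the
final destruction only returns the string and the object itself.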

Could you single-step through your code and find at which point things
diverge ?

-Alex

sam appleton

May 13, 2011, 12:49:19 PM
On May 13, 12:56 am, "Donal K. Fellows"


Donal - that was it! When I set the freed object's typePtr to NULL, the
whole thing goes away, or at least fails gracefully and lets the user
know.

I owe you a big one - thanks so much!

Best

Sam Appleton
