Fix for memory corruption using deepcopy with BigInt/BigFloat

169 views
Skip to first unread message

Scott Jones

unread,
Jun 8, 2016, 9:34:48 AM6/8/16
to julia-dev
I have the issue #16667 fixed for BigInt and BigFloat types, which is where this problem has already been seen in the wild.
Although it is not a general fix for deepcopy of types with finalizers, it should be performant, and it would be trivial to backport to v0.4.x (for v0.4.6).

If somebody can merge this in, and put it on the backport list for v0.4.x, that would be great. (it modifies one line in base/deepcopy.jl and adds tests in test/copy.jl)

Thanks in advance.

Scott Jones

unread,
Jun 8, 2016, 10:05:21 AM6/8/16
to julia-dev
Note also, I think these examples point out that it the idea raised in #16667 of simply making a copy and registering the finalizer would not work.
There are very different situations:

1) You have a type like BigFloat/BigInt, where they are considered immutable, like the string types (by convention, at least, unfortunately, there is no way of ensuring that in the language).
     For those, you simply want to return the same object (as I did in this fix).
     These could possibly be handled by having a trait set for these types (just have a method that normally returns false, but can be specialized for types like BigFloat, BigInt, etc. to return true)
     When true, deepcopy would simply return the same object.

2) You have mutable types, that point to some memory not managed by Julia.
    Since there is no way of knowing just how to correctly copy that memory, each type will need to override deepcopy in some fashion on it's own, and register a new finalizer for the copy).
    (For BigFloats, if you really needed to make a separate copy, you need to make sure you fill in the precision of the original (and not use the current precision), fill in the sign and exponent, initialize it with the correct number of limbs,
     and then copy the limbs from the source to the copy, and you have to calculate the number of limbs from the precision)

Stefan Karpinski

unread,
Jun 8, 2016, 10:12:22 AM6/8/16
to juli...@googlegroups.com

Scott Jones

unread,
Jun 8, 2016, 10:17:32 AM6/8/16
to julia-dev
Here is the analysis of why this ends up causing a problem (but not always, and not always immediately, because Julia frequently goes a very long time before doing GC and calling finalizers)

Which this change, the finalizer is only called once, when there are no more pointers to the object, which is exactly what should happen.

The corruption that occurred before was because `deepcopy` made a new Julia object (without a finalizer), but retaining the pointer to the memory controlled by gmp.

If the original is then GCed, the finalizer gets called, and the copy is left with a dangling pointer, and once that memory gets reallocated and overwritten, then you will start to get

failures in the mpfr code.

If the copy is GCed first, then nothing will happen, the bullet is dodged.

Scott Jones

unread,
Jun 8, 2016, 10:18:09 AM6/8/16
to julia-dev
Thanks very much, Stefan!

On Wednesday, June 8, 2016 at 10:12:22 AM UTC-4, Stefan Karpinski wrote:

Scott Jones

unread,
Jun 8, 2016, 1:16:58 PM6/8/16
to julia-dev
At https://github.com/JuliaLang/julia/pull/16826, @yuyichao had these comments (which I am not able to respond to there):

This means that if the old one is finalized the new one is too, which is not what it is supposed to do.

and

And just to be clear, I mean calling finalize explicitly. 

I don't think that matters at all in this case.  It is essentially the same as modifying the .data vector of a string.
Julia does not give you any way of preventing it, however, by convention, it would never be done (as strings, like Numbers, are treated as being immutable). 

Similarly, you can write into the memory pointed to by the "d" field in a BigFloat or BigInt, and cause Julia to fail, but nobody worries about that, because it would violate the convention
that they are immutable to do so.

-Scott

Jeffrey Sarnoff

unread,
Jun 9, 2016, 12:12:12 AM6/9/16
to julia-dev
This sounds like something that, elsewhere, was resolved with the introduction of weak references:
deepcopying an entity that carries finalization replicates the entirety of its contentful state and status (e.g. finalization registration);
weakcopying an entity that carries finalization generates a shadow pointer-to-it that did not increment the gc reference count,
when it became garbage, the shadow pointer could be gc()ed as an immediate object without causing the pointed-to to be collected.
Shadow pointers were byte encoded to be processed as 'retrieve me over there' rather than 
   the unshadowed 'this here' or its pointed-to form, 'that there'.
Like "to recross is to cross", to reshadow is to shadow. -- [q.v. G. Spencer Brown, The Laws of Form]

Scott Jones

unread,
Jun 9, 2016, 7:58:46 AM6/9/16
to julia-dev
In these cases, I don't think weak references would help, because you don't want the reference in the copy to become invalid if the original is GCed.
For types that act immutable (same as strings are treated in Julia), I think that returning the object itself is valid.
Having to make valid copies (i.e. make sure the limbs are allocated in gmp, and register a finalizer) would also be a big performance hit when dealing with BigInts/BigFloats.

For types that are seen to be mutable, then I think you really need to make sure that you extend the deepcopy_internal method.
I also think it would be nice if deepcopy_internal would give an error for any type containing Ptr{} that didn't have deepcopy_internal extended to copy it correctly.

Jeffrey Sarnoff

unread,
Jun 9, 2016, 8:22:35 AM6/9/16
to julia-dev
My weak references, until they become disassociated or otherwise inaccessible garbage, preclude the gc()ing of their referents -- until that referent is in an unshadowed state.

Not disagreeing.. elaborating about an experimental gc.

Scott Jones

unread,
Jun 9, 2016, 8:25:19 AM6/9/16
to julia-dev
Ah, OK.  I thought from what you said "weakcopying an entity that carries finalization generates a shadow pointer-to-it that did not increment the gc reference count" would mean that
the original might get GCed.   I've also been trying to find any documentation on Julia's unexported WeakRef type, with no luck so far.
Reply all
Reply to author
Forward
0 new messages