I'm still fighting my bug while trying to interface OCaml with the
CMU bdd library (in C).
Following Xavier's comments, I've protected my values (with the CAMLparamXX
and CAMLlocalXX macros).
Actual code is available at:
1. I still get a segfault; however, the gdb backtrace has changed:
#0 0x80655b5 in mark_slice ()
#1 0x80659eb in major_collection_slice ()
#2 0x8065fd3 in minor_collection ()
#3 0x8063859 in interprete ()
#4 0x8065043 in caml_main ()
#5 0x805ab45 in main ()
#6 0x2ab457e2 in __libc_start_main () from /lib/libc.so.6
It is still a GC bug; however, the problem no longer arises while our
C interface is allocating CAML memory. So I think the bug has moved (or
this is another bug).
2. More interestingly, when I compile this code with OCaml 2.99 on the
sun4 architecture, this bug no longer occurs. Instead I get:
Fatal error: uncaught exception Invalid_argument("Array.get")
So it is possible that some Invalid_argument exceptions are
not properly caught in OCaml 2.04.
For the CAML team: I was not able to produce a small/simple example that
can trigger this bug. Only my gas-plant program seems to
3. I've found an interesting message on Damien Doligez's site:
This message says that you can get strange CAML GC-related
problems when deallocating C structures. However, I think my code
falls under case 2 of Manuel's proposed fixes (i.e. box every C pointer
inside an abstract CAML block). So, in my opinion, my bug is not
related to this problem. But, as I am paranoid now :), I've set
pointers to 1 after deallocating them. This did not fix the bug.
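To illustrate the "set pointers to 1 after deallocating" precaution, here is a minimal, self-contained sketch. It is a toy model, not code from the actual interface: the names toy_value, toy_box, TOY_FREED and the helper functions are all hypothetical. The only OCaml fact it relies on is that a word with its low bit set is an immediate integer rather than a pointer, so a collector that met the value 1 would not try to follow it.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for OCaml's `value`: one machine word. */
typedef intptr_t toy_value;

/* In OCaml, a word with the low bit set is an immediate integer,
   so the collector never treats it as a pointer. */
static int toy_is_immediate(toy_value v)
{
    return (v & 1) != 0;
}

/* Marker written into the box once the C structure has been freed. */
#define TOY_FREED ((toy_value)1)

/* Hypothetical "abstract block" holding one boxed C pointer. */
typedef struct {
    toy_value c_ptr;
} toy_box;

static void toy_box_init(toy_box *b, size_t size)
{
    b->c_ptr = (toy_value)malloc(size);
}

/* Frees the C structure once; returns 1 if it freed something,
   0 if the box had already been released. */
static int toy_box_release(toy_box *b)
{
    if (b->c_ptr == TOY_FREED)
        return 0;
    free((void *)b->c_ptr);
    b->c_ptr = TOY_FREED;
    return 1;
}
```

A second call to toy_box_release on the same box is a harmless no-op, which is the main practical benefit of the marker.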
Anyway, if somebody has any advice on how to track down this bug, I'll
gladly accept it.
David -- once happy in the Caml-only world
David....@irisa.fr -- http://www.irisa.fr/prive/dmentre/
Opinions expressed here are only mine.
Hm, I have just taken a look at the code. I do not want to be too fast with
my suggestion (I have not tried it), but I am pretty sure that the
following might be the bug:
You use "Store_field" throughout the code to assign pointers to fields in
structures which were allocated using "alloc_final".
I once had a similar bug in my PCRE-library, but Gerd Stolpmann was so kind
to send me the patch and explain the problem. Here is his translated
explanation (it seems reasonable):
- after "alloc_small" the fields have to be initialized with
"Field(var, n) = ...", not with "Store_field". The last version writes
(with some bad luck) the address of the field into a list of addresses
which have to be moved in case of a minor GC.
- The fields of "alloc_final" are not considered by the GC. Therefore,
they, too, have to be written to using "Field(var, n)" (or you may
cast them to a normal C-struct). "Store_field" has, again, unexpected
side effects.
Since you are lucky and have used access macros throughout the code, you
can quickly test my suggestion by changing them.
I hope that helps!
In case this is really the bug (probably), I'd suggest a revision of the
C-interface-documentation. At least to me it was not obvious that
"Store_field" leads to such additional, unexpected behaviour.
Good luck squeezing the bug,
Markus Mottl, mo...@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl
You were right, Markus. Using the Field macro directly fixed my bug.
Markus Mottl <mo...@miss.wu-wien.ac.at> writes:
> You use "Store_field" throughout the code to assign pointers to fields in
> structures which were allocated using "alloc_final".
> I once had a similar bug in my PCRE-library, but Gerd Stolpmann was so kind
> to send me the patch and explain the problem. Here is his translated
> explanation (it seems reasonable):
> - after "alloc_small" the fields have to be initialized with
> "Field(var, n) = ...", not with "Store_field". The last version writes
> (with some bad luck) the address of the field into a list of addresses
> which have to be moved in case of a minor GC.
> - The fields of "alloc_final" are not considered by the GC. Therefore,
> they, too, have to be written to using "Field(var, n)" (or you may
> cast them to a normal C-struct). "Store_field" has, again, unexpected
> side effects.
The explanation (or a guess ;) :
1. a memory block is allocated with alloc_final, therefore this block's
internals should not be considered by the GC.
2. I use the Store_field macro to update the block's content.
3. However, this macro calls modify (a function defined in
byterun/memory.c), which in turn calls the Modify macro (defined in
byterun/memory.h). As Markus said, this macro adds the address
given as argument to a list of memory addresses (ref_table_ptr)
that should be examined by the GC at collection time.
4. So, we have a GC-opaque memory block whose field addresses have
been added to a GC to-examine-later list. Therefore, at GC time:
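As a rough illustration of steps 1 to 4, here is a toy model in plain C. It is not the runtime's code: toy_value, toy_ref_table, toy_store_field and the other names are hypothetical, and the real Modify macro only records an address under certain conditions (hence the "with some bad luck" above), whereas this sketch records it unconditionally. It only shows the difference between a plain write, as with Field(v, n) = w, and a write that also remembers the field's address, as with Store_field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for OCaml's `value`: one machine word. */
typedef intptr_t toy_value;

/* Toy version of the list of field addresses the GC examines later. */
#define TOY_REF_TABLE_MAX 16
static toy_value *toy_ref_table[TOY_REF_TABLE_MAX];
static size_t toy_ref_count = 0;

/* Plain write, comparable to `Field(v, n) = w`: nothing is recorded. */
static void toy_field_assign(toy_value *block, size_t n, toy_value w)
{
    block[n] = w;
}

/* Write with a barrier, comparable to `Store_field(v, n, w)`:
   the field is written AND its address is remembered. */
static void toy_store_field(toy_value *block, size_t n, toy_value w)
{
    block[n] = w;
    if (toy_ref_count < TOY_REF_TABLE_MAX)
        toy_ref_table[toy_ref_count++] = &block[n];
}

/* Counts the remembered fields whose content looks like a pointer
   (low bit clear); a collector would try to follow each of them. */
static size_t toy_fields_gc_would_follow(void)
{
    size_t i, n = 0;
    for (i = 0; i < toy_ref_count; i++)
        if ((*toy_ref_table[i] & 1) == 0)
            n++;
    return n;
}

/* Stores the address of a C object into a block that is supposed to be
   opaque to the GC, then reports how many fields the GC would follow. */
static size_t toy_scenario(int use_barrier)
{
    static int c_object;              /* lives outside any Caml heap */
    static toy_value opaque_block[1]; /* models the alloc_final block */

    toy_ref_count = 0;
    if (use_barrier)
        toy_store_field(opaque_block, 0, (toy_value)&c_object);
    else
        toy_field_assign(opaque_block, 0, (toy_value)&c_object);
    return toy_fields_gc_would_follow();
}
```

In this model, toy_scenario(1) leaves one foreign C pointer queued for the collector, while toy_scenario(0) leaves none.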
> In case this is really the bug (probably), I'd suggest a revision of the
> C-interface-documentation. At least to me it was not obvious that
> "Store_field" leads to such additional, unexpected behaviour.
I also subscribe to this documentation revision. I also volunteer, if
needed, to review/rewrite the doc part related to Interfacing C with
> Good luck squeezing the bug,
I've squeezed it, with your help. :)
Once again, many many thanks,
I guess that the confusion about what really happens, as explained in
"3" above, comes from "rule 6" in the documentation of the C-interface:
Direct assignment to a field of a block, as in
Field(v, n) = w;
is safe only if v is a block newly allocated by alloc_small; that
is, if no allocation took place between the allocation of v and the
assignment to the field. In all other cases, never assign directly.
The phrases "safe only" and "in all other cases, never assign directly" leave
the impression that the "Field" macro is a bit "evil" and should be avoided,
possibly by using this nice "Store_field" macro. However, it is not only
"safe" to use "Field" in this case; it seems (?) that this is the only way
to do it correctly. Furthermore, I did not find any documentation on
correctly placing values into blocks created with "alloc_final", which
seems to be pretty similar to "alloc_small" in this respect.
The only information I found concerning "alloc_small" which appears to
indicate correct usage is:
alloc_small(n, t) returns a fresh small block of size n <=
Max_young_wosize words, with tag t. If this block is a structured block
(i.e. if t < No_scan_tag), then the fields of the block (initially
containing garbage) must be initialized with legal values (using direct
assignment to the fields of the block) before the next allocation.
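A minimal sketch of what this paragraph asks for, again as a toy model rather than real runtime code: toy_alloc_small, toy_make_block and toy_all_fields_legal are hypothetical names, and malloc stands in for the allocator. The one OCaml fact used is that Val_unit is the immediate word 1. The point is that every field of the fresh block receives a legal value by direct assignment before anything else can allocate.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for OCaml's `value`: one machine word. */
typedef intptr_t toy_value;

/* In OCaml, Val_unit is the immediate integer 0, encoded as the word 1. */
#define TOY_VAL_UNIT ((toy_value)1)

/* Toy allocator: like the documented alloc_small, it returns a block
   whose fields initially contain garbage. */
static toy_value *toy_alloc_small(size_t n)
{
    return malloc(n * sizeof(toy_value));
}

/* Allocates a block and immediately gives every field a legal value by
   direct assignment, comparable to `Field(v, i) = Val_unit`. */
static toy_value *toy_make_block(size_t n)
{
    toy_value *b = toy_alloc_small(n);
    size_t i;

    if (b == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        b[i] = TOY_VAL_UNIT;
    return b;
}

/* Returns 1 if every field holds an immediate (low bit set), i.e. a
   value a collector could scan without following a stray pointer. */
static int toy_all_fields_legal(const toy_value *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if ((b[i] & 1) == 0)
            return 0;
    return 1;
}
```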
The phrase "using direct assignment to the fields" is obviously meant
as a hint to use the "Field" macro. But because most people don't know that
"Store_field" not only assigns but also does other, unexpected things,
this information probably does not help...
>I also subscribe to this documentation revision. I also volunteer, if
>needed, to review/rewrite the doc part related to Interfacing C with
We are always very grateful for any contribution to the system and
especially the docs. I've added a few notes in the doc based on your
story (as found in my mailbox when I came back from vacation), but I
guess what we really need is a better example to show all the features
of the interface.