[Caml-list] Marshal, closures, bytecode and native compilers


Damien Pous

Feb 1, 2007, 1:49:52 PM
to caml...@inria.fr

I found some strange difference between the native and bytecode
compilers, when Marshaling functional values:

[damien@mostha]$ cat lift.ml
let r = ref 0
let f =
  fun () -> incr r; print_int !r; print_newline ()
let () = match Sys.argv.(1) with
  | "w" -> Marshal.to_channel stdout f [Marshal.Closures]
  | "r" ->
      let g = (Marshal.from_channel stdin : unit -> unit) in
      g (); f ()
  | _ -> assert false

[damien@mostha]$ ocamlc lift.ml; ( ./a.out w | ./a.out r )
[damien@mostha]$ ocamlopt lift.ml; ( ./a.out w | ./a.out r )
[damien@mostha]$ ocamlc -version

In the bytecode version, the reference [r] gets marshaled along with
[f] so that the calls [f()] and [g()] respectively affect the initial
reference of the reader, and the (fresh) marshaled reference.

In the native version, by contrast, it seems that [f] is not
`closed': its code address is directly sent, and the call [g()]
affects the initial reference of the reader.

For my needs, I definitely prefer the second answer (only the address
is sent). However, if I move the declaration of the reference inside
the definition of [f], both compilers agree on the first answer: the
reference is marshaled.

[damien@mostha]$ cat refs.ml
let f =
  let r = ref 0 in
  fun () -> incr r; print_int !r; print_newline ()
let () = match Sys.argv.(1) with
  | "w" -> Marshal.to_channel stdout f [Marshal.Closures]
  | "r" ->
      let g = (Marshal.from_channel stdin : unit -> unit) in
      g (); f ()
  | _ -> assert false

[damien@mostha]$ ocamlc refs.ml; ( ./a.out w | ./a.out r )
[damien@mostha]$ ocamlopt refs.ml; ( ./a.out w | ./a.out r )

More than the different behaviour of ocamlc and ocamlopt on "lift.ml",
I am quite surprised that ocamlopt does not give the same results on
"refs.ml" and "lift.ml": the second is just a `lambda-lifting' of the
first one!

Here come my questions:
- How can one predict how deep a functional value will be marshaled?
- Is there a way to enforce the second behaviour, where the reference is
not marshalled (ocamlopt lift.ml)?

Thanks a lot,

Caml-list mailing list. Subscription management:
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Jacques Garrigue

Feb 1, 2007, 9:19:05 PM
to Damie...@ens-lyon.fr
From: Damien Pous <Damie...@ens-lyon.fr>

Interesting phenomenon. According to the usual definition of closure,
the correct solution is probably the bytecode one. But this definition
seems hardly applicable in practice, since it would also mean bringing
all dependencies with you. This is not the case even with the bytecode
version. For instance if you move "let r = ref 0" to r.ml, and replace
the first line of your program by "open R", you get the same behaviour
as for native code.

So as a first approximation, the real specification is: local
variables are transmitted with the closure, but global ones are not.
The trouble being that the definition of global is different for
bytecode and native code. With bytecode, definitions from the same
module are local, while they are global for native code.
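The "local" half of this rule can be checked inside a single process. The following is only a sketch: it replaces the two-process pipe with a round trip through Marshal.to_string and Marshal.from_string, and the names make_counter and roundtrip are made up for the illustration. The reference is created inside the function that builds the closure, so it should travel with the closure under both compilers.

```ocaml
(* Hypothetical helper: builds a closure whose counter is a local variable. *)
let make_counter () : unit -> int =
  let r = ref 0 in
  fun () -> incr r; !r

(* Hypothetical helper: in-process stand-in for writing to and reading
   from a channel. Marshal.Closures is needed for functional values. *)
let roundtrip (f : unit -> int) : unit -> int =
  Marshal.from_string (Marshal.to_string f [Marshal.Closures]) 0

let () =
  let f = make_counter () in
  let first = f () in            (* f's counter is now 1 *)
  let g = roundtrip f in         (* g receives its own copy of the counter *)
  let from_g = g () in           (* advances the copy only *)
  let from_f = f () in           (* advances the original only *)
  Printf.printf "f: %d, g: %d, f: %d\n" first from_g from_f
```

After the round trip the two closures count independently, which is the behaviour described for variables that are transmitted with the closure.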

Moreover, I believe that, through optimizations, variables that look
local may turn out to be global.

I'm not sure what would be the right fix.
A more complete specification would be a good idea.
A flag to disable optimizations would be rather costly.

For now, a rule of thumb would be:

* if you want your variable to be handled as global, even in bytecode,
either receive it as parameter (after marshalling) or put it in
another compilation unit.

* if you want your variable to be handled as local, even in native
code, then define or redefine it locally inside your function.

let r = ref 0
let f =
  let r = r in
  fun () -> incr r; print_int !r; print_newline ()

For the time being this seems to work.

Maybe it is better just to assume that you should not mix closure
marshalling with mutable variables. In either case, the semantics
seems fishy. It seems more reasonable to make such functions receive
their mutable state explicitly, and choose either to send it
(obtaining a "fork" behaviour) or not.
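A minimal sketch of that last suggestion, under the same simplification of an in-process round trip instead of a pipe; step and roundtrip_step are hypothetical names. The function captures no mutable state, so what gets shared or copied is decided by the caller rather than by the compiler.

```ocaml
(* The counter is an explicit argument instead of a captured variable. *)
let step (r : int ref) : int =
  incr r;
  !r

(* Hypothetical helper: in-process stand-in for the channel. *)
let roundtrip_step (f : int ref -> int) : int ref -> int =
  Marshal.from_string (Marshal.to_string f [Marshal.Closures]) 0

let () =
  let g = roundtrip_step step in
  let shared = ref 0 in
  (* State not sent: the received function acts on the reader's reference. *)
  let a = step shared in
  let b = g shared in
  (* State sent: marshalling the reference as plain data gives a "fork". *)
  let forked : int ref = Marshal.from_string (Marshal.to_string shared []) 0 in
  let c = g forked in
  Printf.printf "shared: %d then %d, forked: %d, shared still: %d\n"
    a b c !shared
```

Here the reference is marshalled as ordinary data without the Closures flag, so the choice between sharing and forking no longer depends on how each compiler classifies the variable.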

Jacques Garrigue
