[Caml-list] Function application implementation

Tom

Apr 25, 2007, 6:39:34 PM

to Caml List

I have a question about how function applications are compiled. In
particular, how does the program "know" the difference between f and g in:

let f w x y z = w + x + y - z

let g w x y =
  let m = ref (w + x + y) in
  function z -> !m - z

let () = print_int (f 1 2 3 4)
let () = print_int (g 1 2 3 4)

According to my logic, the first call should compile to something like
push 1
push 2
push 3
push 4
call function_f
and the second like
push 1
push 2
push 3
call function_g // the return value, a new function closure, is (for example) in register A
push 4
call A

But obviously, this cannot be the case, as the functions aren't determined
in advance at all call sites (in a functional language). How does the
compiler deal with that?

(I am simplifying here, disregarding closure construction and
deconstruction, along with environments. However, even the reduced example
still causes me problems.)
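
To make the difficulty concrete, here is a minimal sketch. The helper apply4 is hypothetical and only there for illustration. Both f and g have the type int -> int -> int -> int -> int, so a call site that receives either one as an argument has no static way to tell which shape it has.

```ocaml
let f w x y z = w + x + y - z

let g w x y =
  let m = ref (w + x + y) in
  fun z -> !m - z

(* apply4 is a hypothetical helper: inside it, h could be f, g, or any
   other function of the same type, so the "push 4 arguments and call"
   plan and the "push 3, call, push 1, call" plan cannot be chosen here. *)
let apply4 (h : int -> int -> int -> int -> int) = h 1 2 3 4

let () = Printf.printf "%d %d\n" (apply4 f) (apply4 g)
```

Both calls give the same result, even though f takes four parameters at once and g takes three and returns a closure.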

- Tom

skaller

Apr 25, 2007, 11:14:36 PM

to Tom

On Thu, 2007-04-26 at 00:38 +0200, Tom wrote:
> I have a question about how are function applications compiled. In
> particular, how does the program "know" the difference between f and g
> in:

> But obviously, this cannot be the case, as the functions aren't
> determined in advance at all call sites (in a functional language).
> How does the compiler deal with that?

It knows the type of the function expression, and that is all
that is required. Incidentally, OCaml evaluates right to left. So

f x y z

will be roughly:

push (eval z)
push (eval y)
push (eval x)
push (eval f)
apply
apply
apply

I guess this simple stack protocol explains why it evaluates right
to left rather than left to right.

The real compiler will of course do optimisations not shown above.
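
The evaluation order can be observed with a small experiment. This is only a sketch: trace and events are hypothetical names. The OCaml manual leaves the evaluation order of arguments unspecified, so the order printed depends on the compiler and should not be relied on.

```ocaml
(* events records argument names in the order they are evaluated. *)
let events = ref []

(* trace is a hypothetical helper: it logs a name, then returns the value. *)
let trace name v =
  events := name :: !events;
  v

let add3 x y z = x + y + z

let total = add3 (trace "x" 1) (trace "y" 2) (trace "z" 3)

(* events was built by prepending, so reverse it to get evaluation order. *)
let order = List.rev !events

let () =
  Printf.printf "total = %d, evaluation order: %s\n"
    total (String.concat " " order)
```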

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Tom

Apr 26, 2007, 4:55:18 AM

to skaller

On 26/04/07, skaller <ska...@users.sourceforge.net> wrote:
>
> It knows the type of the function expression, and that is all
> that is required. Incidentally Ocaml evaluates right to left. So
>
> f x y z
>
> will be roughly:
>
> push (eval z)
> push (eval y)
> push (eval x)
> push (eval f)
> apply
> apply
> apply

But that doesn't explain how each apply knows what to do: either
build a new closure (in the case above, the first two applies) or
actually call the code (the third apply).

- Tom

skaller

Apr 26, 2007, 5:20:35 AM

to Tom

push (eval f) calculates the expression f,
which results in a closure. Apply, with the stack:

closure f <-- top
value 1
...

calculates

apply(closure f, value 1)

That is how functions are called. In practice, a compiler may do
optimisations.
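
As a rough illustration of this model, here is a toy sketch in OCaml. The names value, apply and run are made up for the example and are not taken from any real compiler. Each apply simply runs the closure on one argument; whether the result is another closure or a final value is decided by the closure's own code, not by apply.

```ocaml
(* Toy model of the one-argument "apply" protocol. *)
type value =
  | Int of int
  | Closure of (value -> value)

let to_int = function
  | Int n -> n
  | Closure _ -> failwith "expected an int"

(* apply never inspects what the closure will return: it just runs it. *)
let apply fn arg =
  match fn with
  | Closure code -> code arg
  | Int _ -> failwith "cannot apply an int"

(* f w x y z = w + x + y - z, written as nested one-argument closures. *)
let f =
  Closure (fun w -> Closure (fun x -> Closure (fun y -> Closure (fun z ->
    Int (to_int w + to_int x + to_int y - to_int z)))))

(* Model "push z; push y; push x; push w; push f; apply; apply; ..." with an
   explicit stack whose head is the top. *)
let rec run stack =
  match stack with
  | [result] -> result
  | fn :: arg :: rest -> run (apply fn arg :: rest)
  | [] -> failwith "empty stack"

let result = to_int (run [f; Int 1; Int 2; Int 3; Int 4])
let () = Printf.printf "%d\n" result
```

The first three applies each return a new closure and the fourth returns the integer, yet apply itself is the same operation every time.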

In the Felix compiler for example, in the expression:

apply(f,e)

if the subexpression f is a simple function constant, then the compiler
can inline the function. Otherwise, a closure has to be formed. In Felix
this means instantiating a C++ class (the function f) to make a closure
(an object of the class). In Felix the actual C++ used is:

(new f(environment)) -> apply (e)

In other words, compilers generally look for the optimisations that
become possible when a direct call is detected; inlining is one such
optimisation.

The actual sequence I gave above may not be how the OCaml compiler
organises it. The point is that the model is built so that it does not
need to make the distinction you're asking about; making it is just an
optimisation.

Xavier Leroy

Apr 26, 2007, 5:26:40 AM

to Tom

> It knows the type of the function expression, and that is all
> that is required. Incidentally Ocaml evaluates right to left. So
>
> f x y z
>
> will be roughly:
>
> push (eval z)
> push (eval y)
> push (eval x)
> push (eval f)
> apply
> apply
> apply
>
>
> But that doesn't explain how does each apply know what to do, either to
> build a new closure (in the case above, the first two applies) or to
> actually call the code (the third apply).

The generated abstract machine code is more like:

push (eval z)
push (eval y)
push (eval x)
push (eval f)

apply 3 (* number of arguments provided *)

"apply" doesn't do anything clever, it just enters the code of the
called function f. It's the code of f that determines what to do
with the arguments provided on the stack.

More details can be found in one of my talks:
http://gallium.inria.fr/~xleroy/talks/zam-kazam05.pdf
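
The idea that the callee decides can be sketched with a toy model in OCaml. This is an illustration under simplifying assumptions, not the actual abstract machine: the names Fun, enter and split_at are hypothetical, and arguments are passed as a list rather than on a stack. The caller hands over all the arguments it has; the function compares that count with its own arity and either builds a partial application, runs its body, or runs its body and passes the leftover arguments to the result.

```ocaml
(* Toy model: a function value carries its arity and its body. *)
type value =
  | Int of int
  | Fun of int * (value list -> value)

let to_int = function
  | Int n -> n
  | Fun _ -> failwith "expected an int"

(* Split a list into its first n elements and the rest. *)
let rec split_at n xs =
  match n, xs with
  | 0, _ -> ([], xs)
  | _, [] -> ([], [])
  | _, x :: tl ->
    let (front, back) = split_at (n - 1) tl in
    (x :: front, back)

(* "apply n": enter the function with every argument the caller supplied. *)
let rec enter fn args =
  match fn, args with
  | _, [] -> fn
  | Int _, _ -> failwith "cannot apply an int"
  | Fun (arity, body), _ ->
    let supplied = List.length args in
    if supplied < arity then
      (* Too few arguments: remember them in a new closure. *)
      Fun (arity - supplied, fun more -> body (args @ more))
    else
      (* Enough arguments: run the body, then pass any extras to its result. *)
      let (mine, extra) = split_at arity args in
      enter (body mine) extra

(* f takes four arguments at once. *)
let f =
  Fun (4, function
    | [w; x; y; z] -> Int (to_int w + to_int x + to_int y - to_int z)
    | _ -> failwith "f expects 4 arguments")

(* g takes three arguments and returns a one-argument function. *)
let g =
  Fun (3, function
    | [w; x; y] ->
      let m = ref (to_int w + to_int x + to_int y) in
      Fun (1, function
        | [z] -> Int (!m - to_int z)
        | _ -> failwith "expects 1 argument")
    | _ -> failwith "g expects 3 arguments")

let four = [Int 1; Int 2; Int 3; Int 4]
let r_f = to_int (enter f four)
let r_g = to_int (enter g four)
let r_partial = to_int (enter (enter f [Int 1; Int 2]) [Int 3; Int 4])

let () = Printf.printf "%d %d %d\n" r_f r_g r_partial
```

In this sketch the call sites for f and g are identical (enter with four arguments); the difference between them lives entirely in the arity stored with each function.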

- Xavier Leroy