Why no tail call optimization

1,970 views
Skip to first unread message

Dale

unread,
Aug 2, 2010, 5:09:37 PM8/2/10
to Clojure
The JVM has an unconditional goto opcode and the ability to re-bind
function parameters, so why no tail-call optimization? Thanks.

Dale

Frederick Polgardy

unread,
Aug 2, 2010, 9:58:46 PM8/2/10
to clo...@googlegroups.com
It means that the JVM doesn't look at method calls and figure out that they're in tail call position and optimize them. You can hand-write code that performs a goto in a tight loop (like recur does), but means you can't assume that method calls in general will be tail call optimized.

-Fred

--
Science answers questions; philosophy questions answers.

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Kevin Downey

unread,
Aug 2, 2010, 10:37:44 PM8/2/10
to clo...@googlegroups.com
the jvm goto's only can jump around inside method bodies, so it is a
very restricted goto

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

--
And what is good, Phaedrus,
And what is not good—
Need we ask anyone to tell us these things?

Dan Kersten

unread,
Aug 2, 2010, 10:16:10 PM8/2/10
to Clojure
Why can't the clojure bytecode compiler hand-perform this like
functional languages do when compiling to native code? Is it to keep
the clojure compiler fast (for dynamic runtime compilation), since
performing tail call optimisation presumably requires a bunch of extra
checks and more complex code generation? Perhaps this could be done on
AOT compilation?

Wilson MacGyver

unread,
Aug 3, 2010, 12:54:39 AM8/3/10
to clo...@googlegroups.com
as Rich Hickey stated

question: Is it fundamentally impossible to do TCO on JVM due to
current JVM lack of primitives to do so? Would TCO ever be possible on
the JVM without a new JVM design?
rhickey: TCO is easy if you are an interpreter - see SISC Scheme.
Using Java's call stack, the JVM would have to provide it. There are
no fundamental technical difficulties, but potential issues for the
security model, which uses the call stack to ensure privileges.

--
Omnem crede diem tibi diluxisse supremum.

Michał Marczyk

unread,
Aug 3, 2010, 7:31:03 AM8/3/10
to clo...@googlegroups.com
On 3 August 2010 04:16, Dan Kersten <dker...@gmail.com> wrote:
> Why can't the clojure bytecode compiler hand-perform this like
> functional languages do when compiling to native code?

Because bytecode attempting to manipulate the stack and jump around
(unrestricted goto-like) in other ways than through the usual JVM
method call mechanisms would not pass verification (the first step of
the bytecode loading process on the JVM).

> Is it to keep
> the clojure compiler fast (for dynamic runtime compilation), since
> performing tail call optimisation presumably requires a bunch of extra
> checks and more complex code generation? Perhaps this could be done on
> AOT compilation?

TCO adds no complexity at all when the generated object code handles
its own stack, subroutine calling conventions etc. A compiler
targeting native code has the option of simply not storing a new
return address on the stack when compiling a tail call (if the
arguments to the current subrouting where passed in registers; if they
were passed in a stack frame, it can simply be popped, with the return
address saved and reused for the tail call). On the JVM it is
impossible, because the generated object code (JVM bytecode) is not
permitted to do this sort of thing.

Interestingly, [Erjang][1] (a port of Erlang to the JVM) apparently
performs TCO while claiming to stay "reasonably fast". The gimmick
involved is apparently a particularly smart implementation of
trampolining. Read more about it [here][2]. I have a hunch that this
is a no-go for Clojure (partly because Clojure tends to insist on
staying close to the platform -- which has significant benefits -- and
partly because I'm not sure if this kind of thing wouldn't disagree
with Clojure's extremely dynamic nature... I haven't thought this
through that well, though, so maybe this is nonsense). At any rate, it
is interesting.

Sincerely,
Michał


[1] git://github.com/krestenkrab/erjang
[2] http://wiki.github.com/krestenkrab/erjang/how-erjang-compiles-tail-recursion

Daniel Kersten

unread,
Aug 3, 2010, 2:19:42 AM8/3/10
to clo...@googlegroups.com
Can one not detect that a recursive call is a tail call and then transform the AST so that its iterative instead - ie, not use the stack besides for initial setup of local variables (which then get reused in each recursive tail-call). Isn't this how its done in native compiled languages with TCO? How is this different from generating bytecode for iterative loops in imperative languages, or from what recur does? Alternatively, why can't the tail call be detected and converted into recur? I'm guessing that the problem is detecting tal calls - but why; speed of dynamic compilation? Something else?

Obviously I'm missing something fundamental here - can somebody explain to me what it is?

Thanks!
Daniel Kersten.
Leveraging dynamic paradigms since the synergies of 1985.

Rich Hickey

unread,
Aug 3, 2010, 8:39:07 AM8/3/10
to Clojure


On Aug 3, 2:19 am, Daniel Kersten <dkers...@gmail.com> wrote:
> Can one not detect that a recursive call is a tail call and then transform
> the AST so that its iterative instead - ie, not use the stack besides for
> initial setup of local variables (which then get reused in each recursive
> tail-call). Isn't this how its done in native compiled languages with TCO?
> How is this different from generating bytecode for iterative loops in
> imperative languages, or from what recur does? Alternatively, why can't the
> tail call be detected and converted into recur? I'm guessing that the
> problem is detecting tal calls - but why; speed of dynamic compilation?
> Something else?
>
> Obviously I'm missing something fundamental here - can somebody explain to
> me what it is?
>

When speaking about general TCO, we are not just talking about
recursive self-calls, but also tail calls to other functions. Full TCO
in the latter case is not possible on the JVM at present whilst
preserving Java calling conventions (i.e without interpreting or
inserting a trampoline etc).

While making self tail-calls into jumps would be easy (after all,
that's what recur does), doing so implicitly would create the wrong
expectations for those coming from, e.g. Scheme, which has full TCO.
So, instead we have an explicit recur construct.

Essentially it boils down to the difference between a mere
optimization and a semantic promise. Until I can make it a promise,
I'd rather not have partial TCO.

Some people even prefer 'recur' to the redundant restatement of the
function name. In addition, recur can enforce tail-call position.

Rich

Peter Schuller

unread,
Aug 3, 2010, 8:57:14 AM8/3/10
to clo...@googlegroups.com
> Interestingly, [Erjang][1] (a port of Erlang to the JVM) apparently
> performs TCO while claiming to stay "reasonably fast". The gimmick

I have never done extensive benchmarking of clojure, but given the
frequent mentions of use of '-server' in order to achieve specific
performance goals, I get the impression clojure (i.e. Rich) definitely
wants to take advantage of all the optimizations the JIT can offer now
and in the future. Trampoline-based TCO would, as far as I can tell,
always defeat the JIT's notion of a call site - unless the JIT is made
to understand the trampoline (but then are we getting close to full
TCO support anyway? I dunno, I'm definitely not an expert on this
topic...). So, I would expect that those cases which are aggressively
optimized by JIT:ing, such as eliminating method lookups by inline
caching in light loops, would suffer potentially very extreme
performance impacts which aren't fixed by just avoiding allocation as
is mentioned in the erjang wiki page. Or is this over-stating the
problem?

--
/ Peter Schuller

Dale

unread,
Aug 3, 2010, 3:56:45 PM8/3/10
to Clojure
> When speaking about general TCO, we are not just talking about
> recursive self-calls, but also tail calls to other functions. Full TCO
> in the latter case is not possible on the JVM at present whilst
> preserving Java calling conventions (i.e without interpreting or
> inserting a trampoline etc).

Understood. Thanks!

Daniel Kersten

unread,
Aug 3, 2010, 6:28:01 PM8/3/10
to clo...@googlegroups.com
Thanks for the replies, that certainly clarified things!

On 3 August 2010 13:39, Rich Hickey <richh...@gmail.com> wrote:


On Aug 3, 2:19 am, Daniel Kersten <dkers...@gmail.com> wrote:
> Can one not detect that a recursive call is a tail call and then transform
> the AST so that its iterative instead - ie, not use the stack besides for
> initial setup of local variables (which then get reused in each recursive
> tail-call). Isn't this how its done in native compiled languages with TCO?
> How is this different from generating bytecode for iterative loops in
> imperative languages, or from what recur does? Alternatively, why can't the
> tail call be detected and converted into recur? I'm guessing that the
> problem is detecting tal calls - but why; speed of dynamic compilation?
> Something else?
>
> Obviously I'm missing something fundamental here - can somebody explain to
> me what it is?
>

When speaking about general TCO, we are not just talking about
recursive self-calls, but also tail calls to other functions. Full TCO
in the latter case is not possible on the JVM at present whilst
preserving Java calling conventions (i.e without interpreting or
inserting a trampoline etc).

Ah, this where my confusion was then.
Self-calls aren't the problem at all, since they can be compiled how recur is, but tail-calls to other functions cannot be due to the JVM's calling conventions. I understand now, thanks for the explanation.
 

While making self tail-calls into jumps would be easy (after all,
that's what recur does), doing so implicitly would create the wrong
expectations for those coming from, e.g. Scheme, which has full TCO.
So, instead we have an explicit recur construct. 

Essentially it boils down to the difference between a mere
optimization and a semantic promise. Until I can make it a promise, I'd rather not have partial TCO. 

To me, it is really only an optimization and I'm very much in the group who likes the explicit recur statement, since it conveys intent. Therefore I'd be happy with the partial optimization, though, honestly, not having it doesn't bother me at all.
 
Some people even prefer 'recur' to the redundant restatement of the
function name. In addition, recur can enforce tail-call position.

Rich

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en



--
Daniel Kersten.

Mark Engelberg

unread,
Aug 4, 2010, 12:36:37 AM8/4/10
to clo...@googlegroups.com
Some people even prefer 'recur' to the redundant restatement of the
function name. In addition, recur can enforce tail-call position.

Rich



Because recur only takes you back to the innermost loop, sometimes I miss the ability to jump back to some outer loop (or the overall function call).  Hypothetically, could Clojure support a way to name a specific loop, and use a version of recur that lets you jump back to a given named outer loop (like Scheme's named let), or is this outside the scope of what Java's goto permits?
Reply all
Reply to author
Forward
0 new messages