An observation about tail calls

Showing 1-1 of 1 messages
An observation about tail calls Florian Weimer 10/9/11 3:33 AM
Go lacks proper tail calls: calls in tail positions grow the stack, at
least in the standard implementation.  This prevents the direct
implementation of state machine as a collection of functions
recursively calling each other.  Surprisingly, the usual workaround, a
mini-interpreter, makes it straightforward to execute the state
machine incrementally, and not just to completion, without relying on
additional language features such as coroutines.

The lexer in the template package encodes part of its state in the
instruction pointer.  To work around the lack of tail calls, it has to
use a mini-interpreter to avoid unbounded stack growth, something like

        type stateFn func(*lexer) stateFn
        // ...
        var state stateFn
        // ...
        for state != nil {
                state = state(l)

According to Rob Pike, "Lexical Scanning in Go,"
<>, this more or
less how the lexer started.  The loop above is a standard way to
emulate proper tail calls on top of systems which lack them.  It's
been used by GHC with the C backend, and various functional language
implementations on the JVM.

Curiously, the stateFn type cannot be expressed in Standard ML, even
when the option type in the function result type is made explicit.  It
appears to be a counterexample to the widespread notion that something
which cannot be typed in Standard ML is not worth typing at all,

To use this mini-interpreter, the actions have to be changed from tail

        func lexLeftDelim(l *lexer) {
                if strings.HasPrefix(l.input[l.pos:], l.leftDelim+leftComment) {
                        return lexComment(l)
                l.pos += len(l.leftDelim)
                return lexInsideAction(l)

to the formulation used in the current Go sources:

        func lexLeftDelim(l *lexer) stateFn {
                if strings.HasPrefix(l.input[l.pos:], l.leftDelim+leftComment) {
                        return lexComment
                l.pos += len(l.leftDelim)
                return lexInsideAction

Such manual tail call elimination has the drawback that it is somewhat
inefficient: the indirect call state(l) is likely to be mispredicted.
The original calls were direct calls.  So it is tempting to ask for
direct language support for proper tail calls.

But there's a small bit of magic in this code fragment: the call to
lexer.emit(itemType).  This is effectively a coroutine yield,
transferring a result out of the state machine loop to the caller of
the lexer.  It could just append the item to an array, but then the
lexer would be incremental anymore.  The coroutine yield is emulated
in Go with a goroutine and a channel.  The goroutine executes the
mini-interpreter, and the channel carries the lexer items to the main
program.  See the slides (20--22) for details.

However, this doesn't work very well in Go because goroutines do more
than mere coroutines (they provide parallelism) and have to be
disallowed during package initialization.  This is the time when you
want to parse templates to get early errors, not just when the
template is actually used.

But it turns out that the mini-interpreter can be extended easily to
do away with the need for the goroutine:

        for {
                select {
                case item := <-l.items:
                        return item
                        l.state = l.state(l)

The lexer actions do not need any changes. They can continue to use
the existing lexer.emit() method.  To a certain degree, this appears
to be an accident.  In the earlier versions, it would not have been a
problem to loop around calls to emit method, but with the new
mini-interpreter, such loops would need unbounded buffering.  A fixed
number of calls to the emit method from a single action is acceptable,

Using a channel and a conditional receive is not essential here, it is
merely a convenience.  See Nigel Tao, "replace the lexer's item
channel with an inline ringbuffer", <>.

Apart from the curiously infinite type stateFn (which can be encoded
in some way in most languages which lack direct support to express
it), the entire construction does not need any special language
features.  Not even closures are required, bare function values are

On the other hand, this approach is impossible if direct tail calls
are used instead of a mini-interpreter.  There is just no place at
which the loop can be broken.  The direct tail call could be replaced
with a tail call to a function which checks if there is an emitted
token, suspending the loop in that case.  But this would just
reimplement the mini-interpreter loop given above using recursion, so
it is not a real improvement.

To me, this little anecdote suggests that tail calls are not as
essential as they might seem in a programming language.  State
machines which can be executed step-by-step have much wider use than
those that run until they reach some end state.  Lack of proper tail
calls can be an unexpected virtue.

Permanent URL: <>