Comparison of CPS, ANF, and SSA?

485 views
Skip to first unread message

Bradd W. Szonye

unread,
Jan 14, 2005, 5:28:28 AM1/14/05
to
I learned something new this week: While reading up on continuation-
passing-style (CPS) interpreters and intermediate languages, I
discovered some literature on A-normal form (ANF) and static single
assignment (SSA) representations. I've been trying to gather more
information on the latter two forms, since they seem more promising than
CPS overall. However, my searches have been frustrating so far,
especially for ANF (which has very Google-unfriendly search terms).

I'd appreciate it if somebody could compare and contrast these
representations, answer some questions, or at least point me to good
articles and bibliographies on the topic.

I'm particularly interested in two questions:

Are there any compelling reasons to use a CPS intermediate form to
represent Scheme code, except for tutorial purposes?

How do major Scheme features like lexical closures and first-class
continuations interact with SSA representations? It seems to me that the
"non-local" aspects of those features might make it more difficult to
compute and reason about definitions and uses, but I'm not sure.
--
Bradd W. Szonye
http://www.szonye.com/bradd

Bradd W. Szonye

unread,
Jan 14, 2005, 5:51:07 AM1/14/05
to
Bradd W. Szonye wrote:
> I'd appreciate it if somebody could compare and contrast these
> representations, answer some questions, or at least point me to good
> articles and bibliographies on the topic.

I forgot to mention that I've already read Amr Sabry's thesis, which is
actually how I learned about A-normal form. I'm just having a hard time
finding information beyond that. Ideally, I'd like something readable to
an engineer; while I can follow the gist of Sabry's thesis, most of the
proofs and lambda-calculus stuff is over my head.

I've also read a bit of the SSA literature, including several general
overviews and the Rice paper on semi-pruned form. I found all that very
comprehensible -- that's about the level of mathematical sophistication
I'm comfortable with. The problem with the SSA stuff is that it all
seems oriented toward Algol-family languages, and I don't have a strong
enough intuition or understanding to see how it works with Lisp-family
languages.

Thanks!

Jim Bender

unread,
Jan 14, 2005, 9:16:17 AM1/14/05
to
In message <slrncuf7mc.u...@szonye.com>
"Bradd W. Szonye" <bradd...@szonye.com> wrote:
>I learned something new this week: While reading up on continuation-
>passing-style (CPS) interpreters and intermediate languages, I
>discovered some literature on A-normal form (ANF) and static single
>assignment (SSA) representations. I've been trying to gather more
>information on the latter two forms, since they seem more promising than
>CPS overall. However, my searches have been frustrating so far,
>especially for ANF (which has very Google-unfriendly search terms).

I would make the following the following literature suggestions:

Dave Tarditi's dissertation (for its use of an ANF-style representation
and the optimizations performed; this is perhaps the best "engineers
point of view" on ANF that I know of):
David Tarditi. Design and Implementation of Code Optimiziations for a
Type-Directed Compiler for Standard ML. PhD thesis, School of
Computer Science, Carnegie Mellon University, December 1996
http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-108.ps

A Correspondence between Continuation-Passing Style and Static
Single Assignment. (IR'95, published as ACM SIGPLAN Notices,
3(30), March 1995)
http://mumble.net/~kelsey/papers/cps-ssa.ps.gz

SSA is Functional Programming. Andrew W. Appel, ACM SIGPLAN
Notices v. 33, no. 4, pp. 17-20, April 1998.
http://www.cs.princeton.edu/~appel/papers/ssafun.pdf

Optimizing nested loops using local CPS conversion
John Reppy, In Higher-order and Symbolic Computation, 2002
http://moby.cs.uchicago.edu/papers/2002/hosc-final.pdf

Jim Bender


Joe Marshall

unread,
Jan 14, 2005, 9:37:10 AM1/14/05
to
"Bradd W. Szonye" <bradd...@szonye.com> writes:

> Bradd W. Szonye wrote:
>> I'd appreciate it if somebody could compare and contrast these
>> representations, answer some questions, or at least point me to good
>> articles and bibliographies on the topic.
>
> I forgot to mention that I've already read Amr Sabry's thesis, which is
> actually how I learned about A-normal form. I'm just having a hard time
> finding information beyond that. Ideally, I'd like something readable to
> an engineer; while I can follow the gist of Sabry's thesis, most of the
> proofs and lambda-calculus stuff is over my head.

Try this one:

<http://citeseer.ist.psu.edu/flanagan-essence.html>
Cormac Flanagan, Amr Sabry, Bruce F. Duba, Matthias Felleisen
Proceedings ACM SIGPLAN 1993 Conf. on Programming Language Design and
Implementation, PLDI'93, Albuquerque, NM, USA, 23--25 June 1993

and browse CiteSeer.


I was recently researching the same thing and discovered that nearly
anything technical has "a normal form" and Google doesn't key on the
all-important hyphen between `a' and `normal'. However it is also
referred to as `administrative normal form' and `let normal form'.

> Are there any compelling reasons to use a CPS intermediate form to
> represent Scheme code, except for tutorial purposes?

Maybe. Some Java-based Scheme systems use CPS conversion in order
to get continuations and tail-recursion. Since the final form will be
CPS, the intermediate form might as well be, too.

But for `normal' compilation, it appears that ANF is superior.

Matthias Blume

unread,
Jan 14, 2005, 10:41:37 AM1/14/05
to
Jim Bender <j...@benderweb.net> writes:

<plug>

One might also have a look at a paper co-authored by Lal George
and myself regarding the language Nova and its compiler:

Taming the IXP Network Processor. Lal George and Matthias Blume. In
Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language
Design and Implementation (PLDI '03), San Diego, California. June
9-11, 2003.

Our compiler used CPS as its intermediate form because in the setting
of this particular language/processor combination we found that CPS
had none of the drawbacks usually attributed to it.
(The paper does not go into much detail as far as CPS is concerned, though.)

</plug>

Matthias

Jens Axel Søgaard

unread,
Jan 14, 2005, 11:02:35 AM1/14/05
to
Joe Marshall wrote:

> "Bradd W. Szonye" <bradd...@szonye.com> writes:
>>Bradd W. Szonye wrote:
>>
>>>I'd appreciate it if somebody could compare and contrast these
>>>representations, answer some questions, or at least point me to good
>>>articles and bibliographies on the topic.
>>
>>I forgot to mention that I've already read Amr Sabry's thesis, which is
>>actually how I learned about A-normal form. I'm just having a hard time
>>finding information beyond that. Ideally, I'd like something readable to
>>an engineer; while I can follow the gist of Sabry's thesis, most of the
>>proofs and lambda-calculus stuff is over my head.
>
>
> Try this one:
>
> <http://citeseer.ist.psu.edu/flanagan-essence.html>
> Cormac Flanagan, Amr Sabry, Bruce F. Duba, Matthias Felleisen
> Proceedings ACM SIGPLAN 1993 Conf. on Programming Language Design and
> Implementation, PLDI'93, Albuquerque, NM, USA, 23--25 June 1993
>
> and browse CiteSeer.

When "The Essence of Compiling with Continuations" was reprinted in
the "20 years of SIGPLAN", the authors wrote a retrospective:

<http://www.ccs.neu.edu/scheme/pubs/pldi-fsdf.ps>,

which is packed with references.

For actual source code look at:

<http://schemecookbook.org/view/Cookbook/PearlANormalForm>
<http://download.plt-scheme.org/scheme/plt/collects/compiler/private/anorm.ss>

--
Jens Axel Søgaard

/Jens Axel Søgaard

Bradd W. Szonye

unread,
Jan 15, 2005, 4:40:27 AM1/15/05
to
Joe Marshall wrote:
> I was recently researching the same thing and discovered that nearly
> anything technical has "a normal form" and Google doesn't key on the
> all-important hyphen between `a' and `normal'. However it is also
> referred to as `administrative normal form' and `let normal form'.

Yeah, I ran into the same problem.

>> Are there any compelling reasons to use a CPS intermediate form to
>> represent Scheme code, except for tutorial purposes?

> Maybe. Some Java-based Scheme systems use CPS conversion in order to
> get continuations and tail-recursion. Since the final form will be
> CPS, the intermediate form might as well be, too.

That makes sense. However, I'm more interested in ground-up interpreters
and compilers, so I don't think that's important to me.

> But for `normal' compilation, it appears that ANF is superior.

That's the impression I got.

Thanks to everyone for the references. I may have more questions after
I've had time to digest it all.

Bradd W. Szonye

unread,
Jan 15, 2005, 6:05:23 AM1/15/05
to
Jim Bender wrote:
> A Correspondence between Continuation-Passing Style and Static
> Single Assignment. (IR'95, published as ACM SIGPLAN Notices,
> 3(30), March 1995)
> http://mumble.net/~kelsey/papers/cps-ssa.ps.gz

This is the only paper I've found that mentions both call/cc and SSA,
and it simply notes that call/cc is one of the CPS forms that can't
generally be translated to SSA. Does this imply that SSA is only useful
for certain subsets of Scheme code (i.e., procedures that can be proven
to have no non-local jumps)?

I'm a bit puzzled by this, because even C and C++ have non-local jumps
(e.g., longjmp and exception handling). Therefore, SSA compilers must
have some way of coping with them. Assuming that techniques do exist for
representing non-local jumps in SSA representations, is there hope for
using them in a Scheme compiler? Or is SSA useful only for certain
procedures?

Jens Axel Søgaard

unread,
Jan 15, 2005, 7:04:10 AM1/15/05
to
Bradd W. Szonye wrote:

> Thanks to everyone for the references. I may have more questions after
> I've had time to digest it all.

This is a short, very readable article of Appel, which relates SSA to
lexical scope in functional programs.

<http://www.cs.princeton.edu/~appel/papers/ssafun.pdf>

--
Jens Axel Søgaard

Jens Axel Søgaard

unread,
Jan 15, 2005, 7:06:24 AM1/15/05
to
Bradd W. Szonye wrote:

> However, my searches have been frustrating so far,
> especially for ANF (which has very Google-unfriendly search terms).

Here is the trick, search for:

administrative normal form static single assignment

--
Jens Axel Søgaard

Jens Axel Søgaard

unread,
Jan 15, 2005, 7:15:22 AM1/15/05
to
Bradd W. Szonye wrote:

> Are there any compelling reasons to use a CPS intermediate form to
> represent Scheme code, except for tutorial purposes?

John Reppy has some thoughts about this in the introduction to:

<http://moby.cs.uchicago.edu/papers/2002/hosc-final.pdf>

The paper describes how to use a local CPS conversion
on a direct style intermediary representation in order to
optimize nested loops. This suggests that one can benefit
from using both ANF- and CPS-techniques in the same compiler.

--
Jens Axel Søgaard

George Neuner

unread,
Jan 19, 2005, 5:21:34 PM1/19/05
to

SSA doesn't address non-local jumps differently. Any control transfer
can be accomodated by inserting the appropriate phi function to join
data definitions on separate entry paths.

The class of upward-only continuations ( including longjmps and
exceptions ) is relatively easy to handle. Because the code path is
essentially being abandoned, locals within it don't need to be
considered on exit. The join function(s) at the continuation point
need only consider data external to the abandoned path which may have
been side effected (including any persisting data which may have been
newly created).

I've never tried to use SSA with general continuations but I suspect
it is not a very good fit. I expect that each continuation path would
have to be treated as if it were nested recursions or loops ...
working on the assumption that coroutines could have modified data
defined below the entry point unless an examination of the code could
prove no lower entry point. Does anyone who has actually worked with
this care to comment?

George
--
for email reply remove "/" from address

William D Clinger

unread,
Jan 21, 2005, 11:05:27 PM1/21/05
to
George Neuner wrote:
> I've never tried to use SSA with general continuations
> but I suspect it is not a very good fit. I expect that each
> continuation path would have to be treated as if it were
> nested recursions or loops ...

Sounds about right. Basically, every return point has to
be treated almost as though it were a lambda expression,
because the continuation (i.e. the closure of the return
point) can be captured and called multiple times. This
means you can't move side effects or some allocation
effects across a return point. You can still move pure
terminating computations across a return point, however.

SSA forms must deal with similar restrictions. You can't
move arbitrary side effects across a call site, because
the effect might be visible to the callee.

I think of A-normal form as a normal form for DAGs, SSA
as a normal form for loops, and CPS as a normal form for
interprocedural control flow. My Twobit compiler uses
A-normal form together with several interprocedural
optimizations that don't require CPS, and doesn't do any
loop optimization per se at all.

Will

Reply all
Reply to author
Forward
0 new messages