I learned something new this week: While reading up on continuation- passing-style (CPS) interpreters and intermediate languages, I discovered some literature on A-normal form (ANF) and static single assignment (SSA) representations. I've been trying to gather more information on the latter two forms, since they seem more promising than CPS overall. However, my searches have been frustrating so far, especially for ANF (which has very Google-unfriendly search terms).
I'd appreciate it if somebody could compare and contrast these representations, answer some questions, or at least point me to good articles and bibliographies on the topic.
I'm particularly interested in two questions:
Are there any compelling reasons to use a CPS intermediate form to represent Scheme code, except for tutorial purposes?
How do major Scheme features like lexical closures and first-class continuations interact with SSA representations? It seems to me that the "non-local" aspects of those features might make it more difficult to compute and reason about definitions and uses, but I'm not sure. -- Bradd W. Szonye http://www.szonye.com/bradd
Bradd W. Szonye wrote: > I'd appreciate it if somebody could compare and contrast these > representations, answer some questions, or at least point me to good > articles and bibliographies on the topic.
I forgot to mention that I've already read Amr Sabry's thesis, which is actually how I learned about A-normal form. I'm just having a hard time finding information beyond that. Ideally, I'd like something readable to an engineer; while I can follow the gist of Sabry's thesis, most of the proofs and lambda-calculus stuff is over my head.
I've also read a bit of the SSA literature, including several general overviews and the Rice paper on semi-pruned form. I found all that very comprehensible -- that's about the level of mathematical sophistication I'm comfortable with. The problem with the SSA stuff is that it all seems oriented toward Algol-family languages, and I don't have a strong enough intuition or understanding to see how it works with Lisp-family languages.
In message <slrncuf7mc.unt.bradd+n...@szonye.com> "Bradd W. Szonye" <bradd+n...@szonye.com> wrote:
>I learned something new this week: While reading up on continuation- >passing-style (CPS) interpreters and intermediate languages, I >discovered some literature on A-normal form (ANF) and static single >assignment (SSA) representations. I've been trying to gather more >information on the latter two forms, since they seem more promising than >CPS overall. However, my searches have been frustrating so far, >especially for ANF (which has very Google-unfriendly search terms).
I would make the following the following literature suggestions:
Dave Tarditi's dissertation (for its use of an ANF-style representation and the optimizations performed; this is perhaps the best "engineers point of view" on ANF that I know of): David Tarditi. Design and Implementation of Code Optimiziations for a Type-Directed Compiler for Standard ML. PhD thesis, School of Computer Science, Carnegie Mellon University, December 1996 http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-108.ps
A Correspondence between Continuation-Passing Style and Static Single Assignment. (IR'95, published as ACM SIGPLAN Notices, 3(30), March 1995) http://mumble.net/~kelsey/papers/cps-ssa.ps.gz
> Bradd W. Szonye wrote: >> I'd appreciate it if somebody could compare and contrast these >> representations, answer some questions, or at least point me to good >> articles and bibliographies on the topic.
> I forgot to mention that I've already read Amr Sabry's thesis, which is > actually how I learned about A-normal form. I'm just having a hard time > finding information beyond that. Ideally, I'd like something readable to > an engineer; while I can follow the gist of Sabry's thesis, most of the > proofs and lambda-calculus stuff is over my head.
Try this one:
<http://citeseer.ist.psu.edu/flanagan-essence.html> Cormac Flanagan, Amr Sabry, Bruce F. Duba, Matthias Felleisen Proceedings ACM SIGPLAN 1993 Conf. on Programming Language Design and Implementation, PLDI'93, Albuquerque, NM, USA, 23--25 June 1993
and browse CiteSeer.
I was recently researching the same thing and discovered that nearly anything technical has "a normal form" and Google doesn't key on the all-important hyphen between `a' and `normal'. However it is also referred to as `administrative normal form' and `let normal form'.
> Are there any compelling reasons to use a CPS intermediate form to > represent Scheme code, except for tutorial purposes?
Maybe. Some Java-based Scheme systems use CPS conversion in order to get continuations and tail-recursion. Since the final form will be CPS, the intermediate form might as well be, too.
But for `normal' compilation, it appears that ANF is superior.
Jim Bender <j...@benderweb.net> writes: > In message <slrncuf7mc.unt.bradd+n...@szonye.com> > "Bradd W. Szonye" <bradd+n...@szonye.com> wrote: >>I learned something new this week: While reading up on continuation- >>passing-style (CPS) interpreters and intermediate languages, I >>discovered some literature on A-normal form (ANF) and static single >>assignment (SSA) representations. I've been trying to gather more >>information on the latter two forms, since they seem more promising than >>CPS overall. However, my searches have been frustrating so far, >>especially for ANF (which has very Google-unfriendly search terms).
> I would make the following the following literature suggestions:
> Dave Tarditi's dissertation (for its use of an ANF-style representation > and the optimizations performed; this is perhaps the best "engineers > point of view" on ANF that I know of): > David Tarditi. Design and Implementation of Code Optimiziations for a > Type-Directed Compiler for Standard ML. PhD thesis, School of > Computer Science, Carnegie Mellon University, December 1996 > http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-108.ps
> A Correspondence between Continuation-Passing Style and Static > Single Assignment. (IR'95, published as ACM SIGPLAN Notices, > 3(30), March 1995) > http://mumble.net/~kelsey/papers/cps-ssa.ps.gz
One might also have a look at a paper co-authored by Lal George and myself regarding the language Nova and its compiler:
Taming the IXP Network Processor. Lal George and Matthias Blume. In Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation (PLDI '03), San Diego, California. June 9-11, 2003.
Our compiler used CPS as its intermediate form because in the setting of this particular language/processor combination we found that CPS had none of the drawbacks usually attributed to it. (The paper does not go into much detail as far as CPS is concerned, though.)
Joe Marshall wrote: > "Bradd W. Szonye" <bradd+n...@szonye.com> writes: >>Bradd W. Szonye wrote:
>>>I'd appreciate it if somebody could compare and contrast these >>>representations, answer some questions, or at least point me to good >>>articles and bibliographies on the topic.
>>I forgot to mention that I've already read Amr Sabry's thesis, which is >>actually how I learned about A-normal form. I'm just having a hard time >>finding information beyond that. Ideally, I'd like something readable to >>an engineer; while I can follow the gist of Sabry's thesis, most of the >>proofs and lambda-calculus stuff is over my head.
> Try this one:
> <http://citeseer.ist.psu.edu/flanagan-essence.html> > Cormac Flanagan, Amr Sabry, Bruce F. Duba, Matthias Felleisen > Proceedings ACM SIGPLAN 1993 Conf. on Programming Language Design and > Implementation, PLDI'93, Albuquerque, NM, USA, 23--25 June 1993
> and browse CiteSeer.
When "The Essence of Compiling with Continuations" was reprinted in the "20 years of SIGPLAN", the authors wrote a retrospective:
Joe Marshall wrote: > I was recently researching the same thing and discovered that nearly > anything technical has "a normal form" and Google doesn't key on the > all-important hyphen between `a' and `normal'. However it is also > referred to as `administrative normal form' and `let normal form'.
Yeah, I ran into the same problem.
>> Are there any compelling reasons to use a CPS intermediate form to >> represent Scheme code, except for tutorial purposes? > Maybe. Some Java-based Scheme systems use CPS conversion in order to > get continuations and tail-recursion. Since the final form will be > CPS, the intermediate form might as well be, too.
That makes sense. However, I'm more interested in ground-up interpreters and compilers, so I don't think that's important to me.
> But for `normal' compilation, it appears that ANF is superior.
That's the impression I got.
Thanks to everyone for the references. I may have more questions after I've had time to digest it all. -- Bradd W. Szonye http://www.szonye.com/bradd
Jim Bender wrote: > A Correspondence between Continuation-Passing Style and Static > Single Assignment. (IR'95, published as ACM SIGPLAN Notices, > 3(30), March 1995) > http://mumble.net/~kelsey/papers/cps-ssa.ps.gz
This is the only paper I've found that mentions both call/cc and SSA, and it simply notes that call/cc is one of the CPS forms that can't generally be translated to SSA. Does this imply that SSA is only useful for certain subsets of Scheme code (i.e., procedures that can be proven to have no non-local jumps)?
I'm a bit puzzled by this, because even C and C++ have non-local jumps (e.g., longjmp and exception handling). Therefore, SSA compilers must have some way of coping with them. Assuming that techniques do exist for representing non-local jumps in SSA representations, is there hope for using them in a Scheme compiler? Or is SSA useful only for certain procedures? -- Bradd W. Szonye http://www.szonye.com/bradd
The paper describes how to use a local CPS conversion on a direct style intermediary representation in order to optimize nested loops. This suggests that one can benefit from using both ANF- and CPS-techniques in the same compiler.
<bradd+n...@szonye.com> wrote: >Jim Bender wrote: >> A Correspondence between Continuation-Passing Style and Static >> Single Assignment. (IR'95, published as ACM SIGPLAN Notices, >> 3(30), March 1995) >> http://mumble.net/~kelsey/papers/cps-ssa.ps.gz
>This is the only paper I've found that mentions both call/cc and SSA, >and it simply notes that call/cc is one of the CPS forms that can't >generally be translated to SSA. Does this imply that SSA is only useful >for certain subsets of Scheme code (i.e., procedures that can be proven >to have no non-local jumps)?
>I'm a bit puzzled by this, because even C and C++ have non-local jumps >(e.g., longjmp and exception handling). Therefore, SSA compilers must >have some way of coping with them. Assuming that techniques do exist for >representing non-local jumps in SSA representations, is there hope for >using them in a Scheme compiler? Or is SSA useful only for certain >procedures?
SSA doesn't address non-local jumps differently. Any control transfer can be accomodated by inserting the appropriate phi function to join data definitions on separate entry paths.
The class of upward-only continuations ( including longjmps and exceptions ) is relatively easy to handle. Because the code path is essentially being abandoned, locals within it don't need to be considered on exit. The join function(s) at the continuation point need only consider data external to the abandoned path which may have been side effected (including any persisting data which may have been newly created).
I've never tried to use SSA with general continuations but I suspect it is not a very good fit. I expect that each continuation path would have to be treated as if it were nested recursions or loops ... working on the assumption that coroutines could have modified data defined below the entry point unless an examination of the code could prove no lower entry point. Does anyone who has actually worked with this care to comment?
George Neuner wrote: > I've never tried to use SSA with general continuations > but I suspect it is not a very good fit. I expect that each > continuation path would have to be treated as if it were > nested recursions or loops ...
Sounds about right. Basically, every return point has to be treated almost as though it were a lambda expression, because the continuation (i.e. the closure of the return point) can be captured and called multiple times. This means you can't move side effects or some allocation effects across a return point. You can still move pure terminating computations across a return point, however.
SSA forms must deal with similar restrictions. You can't move arbitrary side effects across a call site, because the effect might be visible to the callee.
I think of A-normal form as a normal form for DAGs, SSA as a normal form for loops, and CPS as a normal form for interprocedural control flow. My Twobit compiler uses A-normal form together with several interprocedural optimizations that don't require CPS, and doesn't do any loop optimization per se at all.