ro...@corman.net (Roger Corman) writes: > >Yes. But the whole point a CL is that it must aid the programmer. I > >am sorry for you. You are in the compiler business :)
> >The tail recursive definition is far more readable and concise.
> The two functions state-1 and state-2 are identical, practically (other than > state-2 in my example is correct and the original was not). Certainly you cannot > say they are less concise.
> Then the only difference is foo(), which contains a simple loop and one which > will be identical for every finite state machine you build--you could easily > encapsulate the loop in a function or a macro.
> I won't debate "readable" because that is so subjective, but I think "far more > readable and concise" is an overstatement in any case. The original poster said > there was "no simple way", and in my opinion this is pretty simple.
> Back to the recursive definition, this only requires tail call merging if you > support valid terminals which have the number of state transitions greater than > the stack can support. That does not seem that common to me (I have never had > cause to implement such a FSM).
> I realize this was just an example, and that mutually recursive functions are > useful and solve real problems, and can exceed the stack depth. In these cases > having to resort to iterative approaches can conceivably be a burden. Just not, > in my opinion, in this case.
In this specific case no. However, as soon as you scale up a little bit (but not too far: you are probably better off with more complex data structures representing FSMs, should they become big.) the tail-recursive definitions do win.
Anyway, I have been following this whole thread and I convinced myself that tail-call merging is not easy, and that is your problem :) since there is some demand for it.
Having said so, I will happily continue programming following the simple rule: first get it right, then get it fast. If in the process I have to eliminate tail-calls in favor of looping constructs, too bad.
Cheers
-- Marco Antoniotti ======================================================== NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488 719 Broadway 12th Floor fax +1 - 212 - 995 4122 New York, NY 10003, USA http://bioinformatics.cat.nyu.edu "Hello New York! We'll do what we can!" Bill Murray in `Ghostbusters'.
Marco Antoniotti <marc...@cs.nyu.edu> writes: > Having said so, I will happily continue programming following the > simple rule: first get it right, then get it fast. If in the process > I have to eliminate tail-calls in favor of looping constructs, too > bad.
While I'd still like to see some sort of standardization of tail-call merging (and no, I haven't had a chance to start working on implementing the appropriate reflection in SBCL, yet), I agree with this for practical purposes. Using my lisp systems, I can get tail-call merging, so it's not a big problem. From my experience with e-lisp, I'm pretty sure it's not too difficult to write your own optimizer, particularly since you know what sort of tail calls you're after. So if hand eliminating the calls is too much of a PITA, it's not too bad to automate it yourself. It's still a PITA, but not too bad.
-- /|_ .-----------------------. ,' .\ / | No to Imperialist war | ,--' _,' | Wage class war! | / / `-----------------------' ( -. | | ) | (`-. '--.) `. )----'
ana...@earthlink.net (Andy Freeman) writes: > Bruce Hoult <br...@hoult.org> wrote in message <news:bruce-E95C11.10344117102001@news.paradise.net.nz>... > > Since C compilers don't themselves currently do tail call optimization, > > a compiler targetting C (as d2c does) can't use C functions to implement > > source language functions if you want tail call optimization. d2c > > normally *does* use C functions to implement Dylan functions. QED.
> I'm somewhat surprised that a serious implementation of a scheme-like > language targetting C would use C functions that way. I'd have > expected something with explicit continuation passing, as I did.
> In my application, the CPS transformation made certain analysis > unnecessary and let me handle the complete source language. If > I'd used C functions in the "natural way", I'd have had to do a > lot of analysis and still wouldn't have handled everything.
> I found that C compilers are really good at straight-forward > CPS-style C, so the suggested (by colleagues) inefficiency > wasn't an issue. In my case, the CPS-style C was occasionally > faster, and never slower, than the "natural" C.
Could you elaborate on this a bit? I'm curious, because I'm currently working on a lisp-to-c++ system (called CL-80, for "the easy 80% of CL"). I'm feeling very much torn in implementation decisions, because the major purpose of the system is to integrate extremely tightly with the C++ code (including its kind-of-weird multiprocessing system, semi-automatic memory management system, and weird custom object system), so every time I run into a performance-vs.-tight-integration decision, I go with integration (otherwise I'd just use an existing system). But it is piquing (or re-piquing) my interest in lisp(like)-to-c(like) language compilation techniques.
> I'd guess that there is some chance that a better C implementation > would help here, but it would be hard for me to know for sure without > doing the work.
I have the luxury of knowing what compiler my code will be compiled with (g++ 2.95), but unfortunately its documentation is pretty lousy. So when deciding if it would be more painful to do something myself, or to ensure that the c++ compiler can do it, I also have to factor in the experimentation needed to figure out *what* g++ can do, and precisely when. Which means that, at the moment, I'm producing pretty lousy code, both at the c++ and assembly levels :-( -- I can fix it later, if it's not too much work, but it would be nice if my c++ compiler made it easier for me.
-- /|_ .-----------------------. ,' .\ / | No to Imperialist war | ,--' _,' | Wage class war! | / / `-----------------------' ( -. | | ) | (`-. '--.) `. )----'
In article <3bcdb42c.1034288...@news.callatg.com>, ro...@corman.net
(Roger Corman) wrote: > >One Scheme-to-C compiler, "Chicken", uses a different technique, due to > >Cheney, which also translates Scheme function calls into ordinary C
Aaaarrrgghhh!! Brain fart. Due to Baker, of course!! oops.
> >function calls, but checks for stack overflow at the start of each > >function. When the stack overflows the code does a longjump() to a top > >level driver function, thus clearing the stack. In Chicken the C stack > >also serves as the nursery generation for the garbage collector.
> Chicken sounds very interesting. I am interested to find out more > about that. It sounds like a very creative implementation.
In article <8bbd9ac3.0110170824.6647e...@posting.google.com>,
ana...@earthlink.net (Andy Freeman) wrote: > Bruce Hoult <br...@hoult.org> wrote in message > <news:bruce-E95C11.10344117102001@news.paradise.net.nz>... > > Since C compilers don't themselves currently do tail call optimization, > > a compiler targetting C (as d2c does) can't use C functions to > > implement > > source language functions if you want tail call optimization. d2c > > normally *does* use C functions to implement Dylan functions. QED.
> I'm somewhat surprised that a serious implementation of a scheme-like > language targetting C would use C functions that way. I'd have > expected something with explicit continuation passing, as I did.
You'd have to take that up with Scott Fahlman and his colleagues since the decision was made long before the project left CMU.
> I found that C compilers are really good at straight-forward > CPS-style C, so the suggested (by colleagues) inefficiency > wasn't an issue. In my case, the CPS-style C was occasionally > faster, and never slower, than the "natural" C.
How and when do you unwind the stack?
> > Doesn't imply that the general case is hard though.
> The claim was that it was trivial to add tail-call merging to > an arbitrary implementation. We've now seen that the analysis to > determine whether it's appropriate is non-trivial. d2c demonstrates > that there are reasonable implementation choices that one might make > which make tail-call merging "difficult". (Corman made the same > claim, but we don't seem to believe him for some reason, perhaps > because he didn't know enough to do the right thing.)
It depends on what control you have over what stages of the compilation process. Generating C takes away a lot of the options that you have if you're generating your own assembly-language.
There are compilers which go through C and achieve tail-call optimization by introducing a filter program after the C compiler and before the assembler, doing it essentially as a kind of peephole optimization. Which appears to be entirely adequate.
> I'd guess that there is some chance that a better C implementation > would help here, but it would be hard for me to know for sure without > doing the work.
Targetting C-- instead of C is a likely future direction for the d2c compiler.
<t...@rtp.ericsson.se> wrote: > >>>>> "Bruce" == Bruce Hoult <br...@hoult.org> writes:
> Bruce> In article <4nr8s3i74k....@rtp.ericsson.se>, Raymond Toy > Bruce> <t...@rtp.ericsson.se> wrote:
> >> >>>>> "Bruce" == Bruce Hoult <br...@hoult.org> writes:
> Bruce> Since C compilers don't themselves currently do tail call > >> optimization,
> >> I thought gcc could do tail-call optimization in certain > >> situations?
> Bruce> d2c generates portable C, not whatever gcc happens to accept.
> If gcc doesn't accept portable C, what does it accept?
Lots of things that aren't portable C and which we therefore can't use even though they might be quite useful.
> I'm not claiming gcc will tail-call optimize your generated C code. > But you said "C compilers don't...do tail call optimization". I know > gcc (and Solaris cc as well) can do some tail-call optimization. I > checked.
I didn't mean that no C compiler does tail call optimization, but that you can't rely on it. In fact, if gcc does it in some casess (and I know that it makes some such claim), they do not appear to be any of the cases which we generate.
If there was an abundance of people working on d2c then it might be worth chasing it up as a gcc special case, but we don't have such an abundance and we need things to work on compilers such as CodeWarrior and VC++ which are not gcc.
Bruce Hoult <br...@hoult.org> wrote in message <news:bruce-B5DA12.17254418102001@news.paradise.net.nz>... > In article <8bbd9ac3.0110170824.6647e...@posting.google.com>, > ana...@earthlink.net (Andy Freeman) wrote: > > I found that C compilers are really good at straight-forward > > CPS-style C, so the suggested (by colleagues) inefficiency > > wasn't an issue. In my case, the CPS-style C was occasionally > > faster, and never slower, than the "natural" C.
> How and when do you unwind the stack?
Which stack?
The generated C was not recursive - I could have used Fortran IV (or whatever fortran has statically allocated activation records). (Yes - I'm assuming that C libraries for i/o, math, and memory allocation aren't recursive - I think that it's safe to assume that they're nicely bounded.)
The control "stack" for the source language was linked continuations, managed (created, returned, passed around, destroyed) by said generated C. I did optimize that management for self-tail call situations, but it isn't clear that that optimization was important. (It probably made a measurable difference in performance, but performance was so good that I probably should have spent that time on other projects. Unfortunately, that optimization was one of the first things that I did and it was designed in so deep that it would have been work to take it out so that I could do the obvious experiment. I don't know if it would have been less work to graft it on afterwards.)
Yes, there was a "top" level routine, not unlike what Corman suggested for the FSM problem. This routine accepted returned continuations from the generated C code and used them to determine what to call next. ("Top" is in quotes because it was actually involved for every continuation generated. It was at the top of the implementation's control stack, but it was used at almost every intermediate level in the application's control scheme. Remember, the generated C was not recursive.) The generated C did NOT call continuations; it was called with such continuations (by "top"), and it generated them (and returned them to "top") as necessary.
If you want to think of it in Steele's "goto with arguments terms, all interesting gotos were implemented by creating and returning a continuation to the "top", which then called the appropriate C routine. "Determining what to do next" was implemented as a call through a function pointer, much as Corman did in his example.
> It depends on what control you have over what stages of the compilation > process. Generating C takes away a lot of the options that you have if > you're generating your own assembly-language.
Sorry, but I don't see that. (I do see how portable C makes it unlikely that you'll generate specialized instructions, because they're often supported by the manufacturer's C compiler as special subroutines, but those subr calls can be generated so that the "right" compiler will do the right thing.)
When a C construct doesn't do exactly what you want, you can either change what you want or you can use another construct. (I didn't use C's loop constructs at all for my language's loops.) Managing continuations "by hand" seems extreme, but freed me from the limitations of C's calling model. It took a lot less time to write that management code (including the optimization) than it would have to write the analysis code that it made unnecessary.
One might argue that I used C as a glorified generic assembler, one that does register allocation, instruction generation, and some interesting name management. I'm not sure what the point of such an argument would be.
> > It depends on what control you have over what stages of the compilation > > process. Generating C takes away a lot of the options that you have if > > you're generating your own assembly-language.
ana...@earthlink.net (Andy Freeman) writes: > Sorry, but I don't see that. (I do see how portable C makes it > unlikely that you'll generate specialized instructions, because they're > often supported by the manufacturer's C compiler as special subroutines, > but those subr calls can be generated so that the "right" compiler > will do the right thing.)
What about things like stack frame management and calling conventions? You have no control over these in C. You can't get at the carry bit or the `direction' bit in the x86.
(vis-a-vis the stack frame management: Since you are using Baker's trick, it really doesn't matter what C does with the stack frame because you never really use it. However, the C code still manages this useless record. If you had control at the assembly code level, you could elide all the useless saving of return addresses, previous frame and stack pointers, etc.)
In article <8bbd9ac3.0110180715.32cea...@posting.google.com>,
ana...@earthlink.net (Andy Freeman) wrote: > Bruce Hoult <br...@hoult.org> wrote in message > <news:bruce-B5DA12.17254418102001@news.paradise.net.nz>... > > In article <8bbd9ac3.0110170824.6647e...@posting.google.com>, > > ana...@earthlink.net (Andy Freeman) wrote: > > > I found that C compilers are really good at straight-forward > > > CPS-style C, so the suggested (by colleagues) inefficiency > > > wasn't an issue. In my case, the CPS-style C was occasionally > > > faster, and never slower, than the "natural" C.
> > How and when do you unwind the stack?
> Which stack?
The C stack.
> Yes, there was a "top" level routine, not unlike what Corman suggested > for the FSM problem. This routine accepted returned continuations > from the generated C code and used them to determine what to call next.
Right. This is the standard Scheme-to-C technique, used by quite a number of implementations e.g. Gambit-C.
> ("Top" is in quotes because it was actually involved for every > continuation generated.
Well I assume you actually merge the continuations in basic blocks into a single C function, rather than returning to the driver for *every* expression...
> > It depends on what control you have over what stages of the compilation > > process. Generating C takes away a lot of the options that you have if > > you're generating your own assembly-language.
> Sorry, but I don't see that. (I do see how portable C makes it > unlikely that you'll generate specialized instructions, because they're > often supported by the manufacturer's C compiler as special subroutines, > but those subr calls can be generated so that the "right" compiler > will do the right thing.)
Well, you're losing lots of things with this scheme. What could be a simple GOTO in a native compiler becomes a multi-valued function return -- you have to somehow pass back something identifying the next continuation, plus its arguments, and C isn't as good at returning multiple values as it is at passing multiple arguments -- and then the driver needs to extract the continuation and call it (which will be a couple more instructions than a simple GOTO). Basically the C compiler is generating code to give you facilities that you then don't use, and you're having to implement your own version as WELL as the C ones.
A compiler such as Chicken that actually uses C function calls normally gets all sorts of speed advantages through being able to use direct function calls instead of indirect, use the C argument passing mechanism to pass multiple return values, and so forth. The C compiler generates a little extra code for function return that is never actually executed but that is a small price in space and no price in speed, except in secondary effects through cache being wasted (and it's reducable through non-standard "noreturn" compiler directives). The cost is having to check for stack overflow at the start of each function and extra code to recover from that -- basically the technique that your implementation uses, but in Chicken it is seldom used and can be shared or interpreted or table-driven with negligable speed penalty.
Mapping source functions directly to C functions as d2c does is not a viable technique for Scheme because it doesn't allow call/cc but Dylan doesn't have that. It also certainly makes tail-call optimization harder and d2c will probably *never* support tail-call optimization between top-level functions.
Interestingly, the commercial "Functional Developer" doesn't either, even though it generates machine code directly. So, in both Gwydion "d2c" and Functional Developer this silly code...
define method countto(i, t) if (i = 0) t else countto(i - 1, t + 1) end end;
countto(1000000, 0);
... produces a stack overflow, while this code ...
define method countto(i) local method foo(i, t) if (i = 0) t else foo(i - 1, t + 1) end end; foo(i, 0); end;
countto(1000000, 0);
... works fine in both implementations. The concensus in the Dylan community seems to be that tail-call optimization is required, but not for top-level functions.
Another example:
define method classify(n :: <integer) local method even?(i) if (i = 0) "even" else odd?(i - 1) end end, method odd?(i) if (i = 0) "odd" else even?(i - 1) end end; even?(n); end;
classify(123456789);
This works fine in Functional Developer. At the moment it doesn't work in d2c (stack overflow). It should. Dylan programmers have a right to *expect* this code to be optimized, and I plan to ensure that d2c fulfils that expectation.
Dylan of course isn't CL, and expectations differ in the two communities. As these examples show, Dylan isn't Scheme either. In many respects -- including the question of tail-call optimization -- it lies somewhere between the two.
j...@itasoftware.com wrote in message <news:r8s155vp.fsf@itasoftware.com>... > > Bruce Hoult <br...@hoult.org> wrote in message <news:bruce-B5DA12.17254418102001@news.paradise.net.nz>... > > > It depends on what control you have over what stages of the compilation > > > process. Generating C takes away a lot of the options that you have if > > > you're generating your own assembly-language.
> ana...@earthlink.net (Andy Freeman) writes: > > Sorry, but I don't see that. (I do see how portable C makes it > > unlikely that you'll generate specialized instructions, because they're > > often supported by the manufacturer's C compiler as special subroutines, > > but those subr calls can be generated so that the "right" compiler > > will do the right thing.)
> What about things like stack frame management and calling conventions?
mu. One's implementation strategy determines whether those questions are relevant. (I can't get excited about the x86 overflow bit - I don't know why I'd ever care about it outside of library routines, and I'm perfectly willing to write them in assembler when they're an issue.)
> (vis-a-vis the stack frame management: Since you are using Baker's > trick, it really doesn't matter what C does with the stack frame > because you never really use it. However, the C code still manages > this useless record. If you had control at the assembly code level, > you could elide all the useless saving of return addresses, previous > frame and stack pointers, etc.)
I don't know about other archs, but fairly old gcc on sparcs was reasonably good about killing unused instructions. The generated code tended to be leaf procedures, so return addresses weren't an issue and I didn't see anything else in the assembler output worth worrying about.
Now, one can argue that I paid extra because I didn't have decent register allocation. However, I don't buy it.
With modern (wide and deep) machines, instructions are almost free. Stalls are the thing to minimize. My "top" routine's indirect calls are worth worrying about - C's stack frames aren't. (The next architecture advances will reduce the cost of indirect calls; returns stopped being an issue 5 years ago.)
For lisps and schemes, library functions take a signficant amount of time so one would expect that overhead to be negligible. In my case, there weren't any library functions to hide behind, but the observed overhead was negligible. (It was almost unmeasurable.)
BTW - My project was a skunk works project. The "real project", had lots of resources, time, and smart people who worried about such "overhead", but was never finished, and was slower for the things it did do.
Of course, one can use this approach to generate assembler directly if you really need the last 0.2% performance. Few of us are in a position where doing so makes economic sense. Almost all of us are in a position where algorithms and efficient development matter more.
Bruce Hoult <br...@hoult.org> wrote in message <news:bruce-2A6ACB.13111919102001@news.paradise.net.nz>... > Well I assume you actually merge the continuations in basic blocks into > a single C function, rather than returning to the driver for *every* > expression...
Right. I didn't generate continuations for open coded basic blocks or sequences of library calls.
However, my tests for "negligible" overhead used programs that used continuations every few simple statements. Since these statements were usually implemented with C that became a few sparc instructions, the overhead got its chance to shine. On more realistic code, continuations were far less common.
> Well, you're losing lots of things with this scheme. What could be a > simple GOTO in a native compiler becomes a multi-valued function return > -- you have to somehow pass back something identifying the next > continuation, plus its arguments, and C isn't as good at returning > multiple values as it is at passing multiple arguments -- and then the > driver needs to extract the continuation and call it (which will be a > couple more instructions than a simple GOTO).
Hmm. Since almost all of my source language's gotos became C gotos, "have to" seems a bit strong.
My source language didn't have multiple return values, but if it had, I'd have used the same scheme for returning values that I used for passing them.
I found that the dumb thing (source vars stored in structs) was almost as fast as the "smart thing" (trying to use C args or caching struct fields in C vars and relying on register allocation). Why? Modern machines are more sensitive to stalls than instruction counts. Indexed loads/stores are "good enough" if you're doing much with the values.
> Basically the C compiler > is generating code to give you facilities that you then don't use, and > you're having to implement your own version as WELL as the C ones.
Yup. I found that the "pain" of rolling my own was far less than the pain of trying to use the "similar, but not close enough" facilities provided by C.
I think of C's calling mechanism as being like the VAX "call" instruction. That instruction does a lot of things, and when you need those things, it's the fast way to get them. However, when there's a mismatch, ....
> non-standard "noreturn" compiler directives). The cost is having to > check for stack overflow at the start of each function and extra code to > recover from that -- basically the technique that your implementation > uses, but in Chicken it is seldom used and can be shared or interpreted > or table-driven with negligable speed penalty.
I don't understand Chicken, but I don't have any code to check for "stack overflow". I might run out of space for continuations, but that's a different issue.
In article <8bbd9ac3.0110190744.79cb0...@posting.google.com>,
ana...@earthlink.net (Andy Freeman) wrote: > > non-standard "noreturn" compiler directives). The cost is having to > > check for stack overflow at the start of each function and extra code > > to > > recover from that -- basically the technique that your implementation > > uses, but in Chicken it is seldom used and can be shared or interpreted > > or table-driven with negligable speed penalty.
> I don't understand Chicken, but I don't have any code to check for > "stack overflow". I might run out of space for continuations, but > that's a different issue.
Right.
When you return to the top level for every continuation you don't need a check for stack overflow. Chicken OTOH doesn't return to the top level every time. It uses normal C function calls to call the next continuation and then when it runs out of stack space (generally set to about the size of the machine's L1 cache seems to be fastest) it uses a longjump() to dispose of the entire stack at once and return to a top level driver loop like yours.
Thus on a typical machine which passes C function args in registers (PowerPC, SPARC, MIPS, Alpha...) the overhead to call a continuation is:
Chicken:
- get the args into the right registers (usually free) - normal, direct, C function call - check stack pointer against a global stack limit variable
Trampoline (e.g. yours):
- save the args and continuation address into a struct - C function return returning the struct (?by reference, or is it a global?) - indirect function call - pull the args out of the struct
On something like the x86 which has few registers and normally passes arguments on the stack yours is probably pretty good. The various x86 vendors now throw huge resources at load/store units and accessing L1 cache nearly as fast as registers.
On a RISC, though, your technique will I think really hurt -- it's going to result in most code being serialized on the load/store unit.
The indirect call in yours will hurt bad on all machines, and probably worse and worse as time goes on.
Chicken also, incidentally, uses the C stack as the nursery generation of the heap. Heap allocations are done with either calloc() or else as C local variables. When stack overflow occurs and it is about to be unwound (using longjump()) a garbage collection is done, transferring any live data from the stack to the long-term heap.
Roger Corman wrote in message <3bcdb42c.1034288...@news.callatg.com>...
>>One Scheme-to-C compiler, "Chicken", uses a different technique, due to >>Cheney, which also translates Scheme function calls into ordinary C >>function calls, but checks for stack overflow at the start of each >>function. When the stack overflows the code does a longjump() to a top >>level driver function, thus clearing the stack. In Chicken the C stack >>also serves as the nursery generation for the garbage collector.
>Chicken sounds very interesting. I am interested to find out more about that. >It sounds like a very creative implementation.
Well, thanks! But the idea is entirely Henry Baker's. Chicken is just the first implementation that uses this approach, AFAIK.
Converting to CPS and translating this straight to C solves a number of problems quite elegantly, because you get things like (efficient) first class continuations, multiple values and tail-recursion (in the sense that a tail-recursive function doesn't run out of memory, even if it allocates continuation frames). But I should also note that there are some caveats (as usual):
1. This system allocates storage at a frightning pace. The generational garbage collection helps, but this should be kept in mind. 2. Straightforward CPS-converted code is horribly inefficient. A large part of the highlevel-optimizations Chicken performs are used to simplify the code back to a more efficient form. 3. As Bruce already points out there are many stack-checks to be done. Continuation-procedures (those introduced by the CPS conversion) normally don't need those, but you still get a lot of checking overhead.
Direct-style translation into C (as systems like Bigloo, Stalin or KCL and it's children) will normally generate more efficient code, but of course have awful problems with call/cc and/or tail-rec. The trampoline (driver-loop) approach that is used for example in Gambit is somehow in between those two extremes: I made the experience that it is sometimes able to outperform direct style compilers in fixnum-/unsafe-mode, but is often slower than Chicken when the declarations are removed.
To demonstrate the translation concept, I will give an example. Consider the following code:
Now if we compile this with "-benchmark-mode", then we get the following (comments inserted by me):
static void f14(int c,C_word t0,C_word t1,C_word t2){ C_word tmp; C_word t3; C_word t4; C_word t5; C_word ab[3],*a=ab; /* Check stack */ if(!C_stack_probe(&a)){ /* Save current continuation and do a minor GC */ C_adjust_stack(3); C_rescue(t0,2); C_rescue(t1,1); C_rescue(t2,0); C_reclaim(tr3,f14);} /* We land here if stack is OK, or GC is done */ if(C_truep((C_word)C_i_nullp(t2))){ t3=t1; ((C_proc2)(void*)(*((C_word*)t3+1)))(2,t3,C_fix(0));} else{ /* Allocate continuation closure */ t3=(*a=C_CLOSURE_TYPE|2,a[1]=(C_word)f28,a[2]=t1,tmp=(C_word)a,a+=3,tmp); t4=(C_word)C_slot(t2,C_fix(1)); /* Fetch global variable "len" */ t5=*((C_word*)lf[0]+1); /* Invoke "len"s procedure (in tail position, since this is CPS) */ ((C_proc3)(void*)(*((C_word*)t5+1)))(3,t5,t3,t4);}}
In article <3bd0cbd8$0$30612$9b622...@news.freenet.de>, "felix"
<felixundd...@freenet.de> wrote: > Direct-style translation into C (as systems like Bigloo, Stalin or KCL > and it's > children) will normally generate more efficient code, but of course have > awful problems > with call/cc and/or tail-rec. The trampoline (driver-loop) approach > that is used for example in Gambit is somehow in between those two > extremes: I made the experience that it is sometimes able to outperform > direct style compilers in fixnum-/unsafe-mode, but is often slower than > Chicken when the declarations are removed.
Hi Felix,
Something I meant to mention before but didn't is the ability to use standard C tools such as gdb and gprof (debugging and code profiling, for the non-Un*x types).
Direct-style translation to C makes it pretty straightforward to use standard tools. Trampoline-style such as in Gambit will surely totally distroy the usefulness of gprof -- the time in each function might not be too bad, but the call tree stuff will be useless.
How do you get on with Chicken? I imagine it's somewhere in between.
On Sat, 20 Oct 2001 02:44:29 +0200, "felix" <felixundd...@freenet.de> wrote:
>Roger Corman wrote in message <3bcdb42c.1034288...@news.callatg.com>...
>>>One Scheme-to-C compiler, "Chicken", uses a different technique, due to >>>Cheney, which also translates Scheme function calls into ordinary C >>>function calls, but checks for stack overflow at the start of each >>>function. When the stack overflows the code does a longjump() to a top >>>level driver function, thus clearing the stack. In Chicken the C stack >>>also serves as the nursery generation for the garbage collector.
>>Chicken sounds very interesting. I am interested to find out more about that. >>It sounds like a very creative implementation.
>Well, thanks! But the idea is entirely Henry Baker's. Chicken is just the >first implementation that uses this approach, AFAIK.
I sure owe a lot to Henry Baker--all of his great papers and ideas.
I would think of Chicken as a stack-less architecture, which takes advantage of C stack management code generation to implement the first generation of the memory manager. I understand ML is also stack-less (is that right?). I think that is very interesting, and understand from Baker's papers that you can get very good performance with such a system. I especially like how it so blows the minds of most programmers.
>3. As Bruce already points out there are many stack-checks to be done. > Continuation-procedures (those introduced by the CPS conversion) normally > don't need those, but you still get a lot of checking overhead.
Although to be entirely platform-independent the stack checks are required. However, I would think on a particular platform you could get rid of them by instead using the virtual memory system. You set the page above the stack (or nursery, in this case) to read-only, and then catch and handle the exception. This should help things, shouldn't it? The Win32 platform does this with the stack automatically, so code never needs to check for stack overflow.
I haven't followed this thread, so please excuse me for intruding with something that is perhaps completely irrelevant or already covered.
I was thinking a about possible syntax for explicitly requesting tail-call merging. Isn't return-from almost that operator? Every return-from used to exit a function is in effect a tail call for that function (?). Maybe return-from could be extended to communicate the programmer's intention when the result-form is a function call?
Something along the lines of
return-from name result &key tail-call-merge if-cannot-merge
Where tail-call-merge is a boolean, and if-cannot-merge one of :error, :warn, or :ignore.
* Frode Vatvedt Fjeld <fro...@acm.org> | I was thinking a about possible syntax for explicitly requesting | tail-call merging. Isn't return-from almost that operator? Every | return-from used to exit a function is in effect a tail call for that | function (?). Maybe return-from could be extended to communicate the | programmer's intention when the result-form is a function call?
Since you could wrap the entire function body in a return-from, I think this would not be helpful. Figuring out whether a function call is in a tail-call-mergable "position" is not helped by such use of this form.
/// -- Norway is now run by a priest from the fundamentalist Christian People's Party, the fifth largest party representing one eighth of the electorate. -- The purpose of computing is insight, not numbers. -- Richard Hamming
Erik Naggum <e...@naggum.net> writes: > Since you could wrap the entire function body in a return-from, I > think this would not be helpful. Figuring out whether a function > call is in a tail-call-mergable "position" is not helped by such use > of this form.
I think you misunderstood me. I'll try to spell out what I meant more clearly.
(defun foo (x y) (when x (return-from foo (bar))) (if y (baz) (return-from foo (bax))))
Here, all of the forms (bar), (baz) and (bax) are in a tail-call-mergable position, but only (bar) and (bax) explicitly so.
Now, my suggestion was that one can rewrite your typical (tail-) recursive function:
(defun member (item list) (cond ((null list) nil) ((eq item (car list)) list) (t (return-from member (member item (cdr list)) :tail-call-merge t :if-cannot-merge :error))))
..and be sure your member function either runs in constant space or compilation will fail. Or the compiler will emit a warning if that is what you specify. Of course, if the result-form is anything besides a function call, tail-call-merging will fail.
> (defun foo (x y) > (when x > (return-from foo (bar))) > (if y (baz) > (return-from foo (bax))))
> Here, all of the forms (bar), (baz) and (bax) are in a > tail-call-mergable position, but only (bar) and (bax) explicitly so.
As we discussed, if there is a closure over the block FOO, then it will veto the ability of these other things to merge. I therefore don't like the term "tail-call[-mergable] position" because it suggests that position is what determines a tail call. It does not. What is or is not a tail call is determined BOTH by position AND by other semantics.
> ..and be sure your member function either runs in constant space or > compilation will fail. Or the compiler will emit a warning if that is > what you specify. Of course, if the result-form is anything besides a > function call, tail-call-merging will fail.
I won't even bother to ask about (defun member (item list &optional (tail-merge t)) ... (return-from member (member item (cdr list)) :tail-call-merge tail-merge ...))
My personal feeling is that the language should either define a reasonable semantics or not, but that keyword arguments controlling the very lowest level of language sementics is both unsightly and a waste of time. There are myriad design decisions in the language that might want such things, and the result of actually using such a style would be too gross to look at.
> [..] What is or is not a tail call is determined BOTH by position > AND by other semantics.
Of course.
> I won't even bother to ask about > (defun member (item list &optional (tail-merge t)) > ... (return-from member (member item (cdr list)) > :tail-call-merge tail-merge ...))
I didn't really consider this, but my intention was that the keyword arguments would have to be constants. Not to be evaluated, I suppose.
> My personal feeling is that the language should either define a > reasonable semantics or not, but that keyword arguments controlling > the very lowest level of language sementics is both unsightly and a > waste of time.
I think it's ugly too. Maybe &optionals would be a slight improvement.
> There are myriad design decisions in the language that might want > such things, and the result of actually using such a style would be > too gross to look at.
* Erik Naggum | Since you could wrap the entire function body in a return-from, I think | this would not be helpful. Figuring out whether a function call is in a | tail-call-mergable "position" is not helped by such use of this form.
* Frode Vatvedt Fjeld | I think you misunderstood me. I'll try to spell out what I meant more | clearly. | | (defun foo (x y) | (when x | (return-from foo (bar))) | (if y (baz) | (return-from foo (bax))))
Well, _I_ think I understand your proposal better than you do yourself. :) Your proposal changes and improves exactly nothing, but just adds a false impression of something that still has be determined the same way it was before it was introduced.
(defun foo (x y) (return-from foo (if x (bar) (if y (baz) (bax))))))
| Here, all of the forms (bar), (baz) and (bax) are in a tail-call-mergable | position, but only (bar) and (bax) explicitly so.
I do not agree with this. Figuring out whether a function call is tail-call-mergeable is generally not possible through a trivial visual inspection. In the above form, all of the function calls are pretty obviously the last call the function will make, and if you cannot see whether a function call will yield the return value or not, it certainly will not help to have a piece of wrapper code around it that might _lie_.
| Now, my suggestion was that one can rewrite your typical (tail-) | recursive function: [...] to this: [...] ..and be sure your member | function either runs in constant space or compilation will fail.
My argument against your proposal is that you could wrap the entire function in this newfangled return-from form and not reduce the problem one iota.
| Or the compiler will emit a warning if that is what you specify. Of | course, if the result-form is anything besides a function call, | tail-call-merging will fail.
Now, this is getting so specific it looks like a _real_ kluge. How about if someone provided a compiler-macro for that function? How about a real macro? Would you really want so much of your code to fail compilation even though nothing had materially changed? In other words, if you had a form like you propose, the only place I can see it being really useful is at the _outermost_ level, like I have demonstrated above, _not_ at the innermost, terminal function call that it _might_ apply to.
/// -- Norway is now run by a priest from the fundamentalist Christian People's Party, the fifth largest party representing one eighth of the electorate. -- The purpose of computing is insight, not numbers. -- Richard Hamming
Erik Naggum <e...@naggum.net> writes: > Well, _I_ think I understand your proposal better than you do > yourself. :)
Maybe so.. :) but please consider it an "idea" rather than "proposal".
> Your proposal changes and improves exactly nothing, but just adds a > false impression of something that still has be determined the same > way it was before it was introduced.
> (defun foo (x y) > (return-from foo > (if x > (bar) > (if y > (baz) > (bax))))))
I actually don't see how this example relates to my .. uhm.. idea in any way. What is it meant to show? I wasn't aiming to change how compilers determine what are tail-call-mergable function calls, but to provide a mechanisms for programmers to express "I want this function call to be 'merged'".
> [..] it certainly will not help to have a piece of wrapper code > around it that might _lie_.
I don't understand. How can return-from lie? It's guaranteed to terminate the block/function, so long as no sub-form of the result-form performs a non-local exit (in which case control never reaches the function application anyway, so it's not a problem).
> My argument against your proposal is that you could wrap the entire > function in this newfangled return-from form and not reduce the > problem one iota.
You must be thinking of some other problem than I am.
> Now, this is getting so specific it looks like a _real_ kluge.
It's certainly kludgy.
> How about if someone provided a compiler-macro for that function? > How about a real macro?
What if? The compiler must deal with macros and compiler-macros regardless, and I don't see how they come into play here.
* Frode Vatvedt Fjeld <fro...@acm.org> | What if? The compiler must deal with macros and compiler-macros | regardless, and I don't see how they come into play here.
The questions I have tried in vain to raise are: How much knowledge do you require to be able to use your idea productively? What does it mean to employ your suggested form around another form?
Common Lisp does not, in fact, provide you with any information how a particular form will be compiled. A function call may be inlined, which you may defeat with your desire for a tail-call-merging request, it may be turned into several function calls, most anything can happen between source and machine code. Therefore, if your idea is not able to encompass an entire function's body, it is completely useless, because it may _have_ to encompass an entire function's body even if you think you are giving it a function call. If you really want to tell your compiler that an _actual_ function call should be tail-call merged, you first have to know if your compiler will actually generate a non-merged function call that _could_ be a merged tail call.
In other words, it is an idea that fits languages that do not do anything "interesting" with the code they compile. Instead of providing something useful, such as important information for your compiler, you are instead limiting your compiler's ability to do interesting things with your code.
If you want to do something useful by changing the language, I suggest finding/inventing ways to tell your compiler that you promise never to change the definition of some functions, classes, methods, whatever. That kind of information would be tremendously useful to the compiler, because it can make assumptions that it cannot make today and which necessarily means less opportunity for some kinds of optimization.
/// -- Norway is now run by a priest from the fundamentalist Christian People's Party, the fifth largest party representing one eighth of the electorate. -- The purpose of computing is insight, not numbers. -- Richard Hamming
In article <3212616172232...@naggum.net>, Erik Naggum <e...@naggum.net> wrote:
> Common Lisp does not, in fact, provide you with any information > how a particular form will be compiled. A function call may be > inlined, which you may defeat with your desire for a tail-call- > merging request
But inlining is one method of implementing the goals of tail-call merging.
> If you want to do something useful by changing the language, I > suggest finding/inventing ways to tell your compiler that you > promise never to change the definition of some functions, > classes, methods, whatever. That kind of information would be > tremendously useful to the compiler, because it can make > assumptions that it cannot make today and which necessarily > means less opportunity for some kinds of optimization.
I agree. That is exactly what Dylan does with its "sealing" declarations.
> > Well, _I_ think I understand your proposal better than you do > > yourself. :)
> Maybe so.. :) but please consider it an "idea" rather than "proposal".
> > Your proposal changes and improves exactly nothing, but just adds a > > false impression of something that still has be determined the same > > way it was before it was introduced.
> I actually don't see how this example relates to my .. uhm.. idea in > any way. What is it meant to show? I wasn't aiming to change how > compilers determine what are tail-call-mergable function calls, but to > provide a mechanisms for programmers to express "I want this function > call to be 'merged'".
But it doesn't say that.
(defun bar (x y) (funcall x (1+ y))) (defun foo (y) (let ((x #'(lambda (x) (return-from foo x)))) ;<-- first RETURN-FROM (return-from foo ;<-- second RETURN-FROM (bar x y))))
You might think the second RETURN-FROM was requesting tail-calling, but it simply isn't. First, it's in no more of a tail call position than it was if you omitted the RETURN-FROM. Second, it cannot be tail called because there is pending state between the RETURN-FROM and the BLOCK that cannot be disposed of until BAR has finished executing.
If all you mean by "request" is that it should be done "if possible", then I would say that to an advocate of aggressive tail calling, EVERY form is a request for tail calling "if possible". Some things are not able to be tail called, and if saying that's enough to undo the request, then you're asking the same of every form.
> > [..] it certainly will not help to have a piece of wrapper code > > around it that might _lie_.
> I don't understand. How can return-from lie? It's guaranteed to > terminate the block/function,
It's not a guarantee that there is no intervening state inhibiting correct and immediate return. See example above.