Here's my thoughts for those that are interested. (Haven't written the
Skulpt code yet, just trying to figure out what to write...)
-----
In the back of my mind I was hoping to use 'switch' to deal with
generators, so that something like:
def f(n):
num = 0
while num < n:
yield num
num += 1
would translate into something like this in C (of course with
iteration protocol, parameter capture, closure over variables, etc.
added):
typedef struct State
{
int location;
int num;
} State;
int f(State* state, int n)
{
switch (state->location)
{
case 0:
state->num = 0;
while (state->num < n)
{
state->location = 1;
return state->num;
case 1:
state->num += 1;
}
state->location = -1;
}
return -1;
}
The sneaky part here being that "case 1" is inside the body of the
while loop which lets us resume control at the correct location while
emitting pretty 'normal' code otherwise.
Sadly, I guess due to jihad against goto, it seems this form of
control isn't valid in JS. (There's also no goto with which to emulate
this, of course).
This makes things quite a lot more cumbersome, or exposes me as a
closet C programmer at the very least. Instead, we need to transform
the generated code much more heavily. Without 'sneaky' switchs and
just using straight gotos we want something like:
int f2(State* state, int n)
{
if (state->location == 0) goto state0;
if (state->location == 1) goto state1;
return -1;
state0:
state->num = 0;
while (state->num < n)
{
state->location = 1;
return state->num;
state1:
state->num += 1;
}
state->location = -1;
return -1;
}
Now, we can do some nasty transformation on this to remove the gotos:
First, get rid of 'state0'. Eliminate the first forward goto by
skipping what it skips if it's true using an if (this is eliminating a
forward-jumping-goto at the same nesting level):
int f3(State* state, int n)
{
if (!(state->location == 0))
{
if (state->location == 1) goto state1;
return -1;
}
state->num = 0;
while (state->num < n)
{
state->location = 1;
return state->num;
state1:
state->num += 1;
}
state->location = -1;
return -1;
}
Finally, get rid of 'state1'. First, we have to move the goto out of
its containing 'if', and then we have to move it back 'into' the while
loop. This is accomplished by modifying the loop condition, and the
statements above the state1 label to get execution to the correct
location. Then, we make sure to reset the condition so the next around
the loop, the while condition will be executed normally.
int f4(State* state, int n)
{
int _t0 = 0;
if (!(state->location == 0))
{
_t0 = state->location == 1;
if (!_t0)
{
return -1;
}
}
if (!_t0)
{
state->num = 0;
}
while (_t0 || state->num < n)
{
if (!_t0)
{
state->location = 1;
return state->num;
}
_t0 = 0;
state->num += 1;
}
state->location = -1;
return -1;
}
Ugh!
This works in general, I think, but there's a lot of cases to handle
(moving inwards into loops and ifs, outwards, etc.). It's certainly
more complicated than I'd like, and makes me yearn for goto or
generators to be widely deployed in JS.
It's also going to be hairy to test which worries me.
The transformation from goto's to if/while/switch is well known from
the olden days when researchers seem to have been hell-bent on
removing every trace of anything that looked like goto (see f.e.
http://www.sable.mcgill.ca/~hendren/ftp/erosa/thesis.ps.gz) but it's
kind of messy. This is a somewhat reduced version because we know the
gotos are really just a 'switch' at the top of the function to get to
the correct location, but the inwards transformations are still fairly
involved.
There's still the issue of rewriting locals to access state data
variables rather than being local accesses, but I don't think
that will be too hard.
Anyway, if anyone made it this far, I'd appreciate hearing any better
ideas before I start mashing up the compiler.
scott
> What I've been doing so far in js with mootools to replicate the kind
> of stuff I can achieve in python with yields is to use events in
> classes.
>
> Can't we use a registry of function triggers that would send data to
> events on yield, and on the other part (the caller, which iterates on
> the yield result) something like "wait for event reply" (without being
> blocking, maybe transparent callbacks ?) ? (and of course we should
> yield some kind of "end of data" event that is equivalent to the
> python raise to tell that the iteration is over)
I'm not sure I understand what you're getting at. Generally callbacks
seem like they could work, but I'm not quite sure *how* they'd work.
The two main requirements are that:
1) the iterator object that's returned from a generator function has
to be "drive"able from a .next() method (that's the part I don't see
how to do with callbacks), and
2) there has to be a mechanical translation from python to javascript
of course because it's a compiler that's doing it, not a person :)
Could you give an example of the method you're proposing?
scott
I think what you describe is what I thought I could do initially (and
works well in C), but I don't think it's possible in javascript. The
problem is that there's no way to jump back "inside" the loops or if
statements on the next execution.
For example, in "Simple loop generator" in your pdf, you can't jump to
"0,1" in the JS code.
In order to get "in" to those loops, the control flow has to be
modified so that it gets there naturally on the next time the function
is called. That's what all my crazy goto elimination code was about in
the long previous email.
It's basically adding extra booleans as required at each if/loop so
that the flow can be controlled when the function is re-entered based
on the previous saved location. Once it's fallen through based on
those booleans, the booleans are all turned "off" and execution
continues normally until a return.
Clear as mud?
scott
I think we're converging.
I was convinced at first, but I don't think it's quite workable like
that. For example, consider compiling:
def f(n):
i = 0
while i < n:
yield i
i = 100
yield i
i += 1
(Ignoring that it's stupid code...)
If that was compiled with a switch, it wouldn't get back into the
while loop if n was < 100, after the first yield, on reentry, it would
switch to just before the while, but wouldn't enter the while, so
wouldn't be able to get to the second yield. That's why all those
booleans are required to be added to the flow.
Does that make sense, or am I missing something obvious now? :)
scott
Yup, I think that works now. I wonder about some perverse cases with
nested loops now though once we start adding conditions.
while i < n:
while i < n:
yield i
i = 10
yield i
The outer loop would also need to allow re-entry based on the
placeholder of loops inside of it. Not too bad for something like
above, just another "|| s2 != 0". Is there a case where that gets more
complicated than just or'ing in all children? I can't think of one off
the top of my head, but I'm not sure.
Supporting 'break' becomes tricky too. Currently break is compiled to
a JS break, but that'll break out of the switch rather than the while.
Interestingly 'continue' seems like it'll be fine though.
Anyway, I think I might start trying to implement this now and see how
things go. Doing the actual transformation in the compiler is going to
be a whole separate problem!
Thanks for your feedback in thinking through this.
scott
Oh, you're right, the outer loop would need to allow re-entry based on
inner loop placeholders.
The alternative is to use a separate variable (ie w1), rather than the
placeholder, for each while loop, set to 1 as soon as the while loop
is entered and back to 0 just before it is left. I think this is back
to what you were writing about originally...we have convergence!
s0 = 0;
s1 = 0;
w1 = 0;
function f(n)
{
switch(s0)
{
case(0):
i = 0;
s0 = 1;
case(1):
while (i < n || w1 == 1)
{
w1 = 1; // signify we're now in the loop
switch(s1)
{
case(0):
s1 = 1;
return i;
case(1):
i = 100;
s1 = 2;
return i;
case(2):
i += 1;
s1 = 0; // reset.
}
w0 = 0; // about to leave loop so reset loop holder
}
}
// mark generator as finished
}
This does prevent ever-lengthening while loop conditions caused by
tracking any inner loop placeholders, no matter how much nesting
occurs.
> Supporting 'break' becomes tricky too. Currently break is compiled to
> a JS break, but that'll break out of the switch rather than the while.
> Interestingly 'continue' seems like it'll be fine though.
>
> Anyway, I think I might start trying to implement this now and see how
> things go. Doing the actual transformation in the compiler is going to
> be a whole separate problem!
Good luck!
Justyn.
I'm not clear on the problem you're pointing to here. The shadowing of
'vars'? Or the depth of the callstack? The depth will (I think) be the
same python's (or possibly * a constant factor if there needs to be
some helper trampolining going on)
The calling discipline is the same as real python's (as I understand
it at least): the function becomes a constructor for
iterator-generator, which has a next() method that returns the next
value in the iteration.
The sneaky switch business is just to save the location of where the
yield happened during the previous call. During compilation,
function-local references are modified to reference into a state
object that's part of the iterator-generator. Function arguments are
closed over during the initial call.
That said, I've only just started the implementation so far, and not
tested it much at all, so there could easily be some Big Problem
that's not come up yet.
scott