I'm not sure what the point of passing in parameters to the
compilation is. (Not that I don't see the point of having changeable
settings for compilers, but that's something separate.) The interface
is simple on purpose -- in most cases either there *are* no
parameters possible (Perl's eval and its equivalent in other
languages) or there's no reasonable way to know what the parameters
are (Perl's eval evaluating code of a different language). The syntax
just isn't there to have them, and is really unlikely to ever
materialize, so there's little point in putting in parameters to the
compilation. In those cases where the programmer may know what to
change, they can tweak any external knobs the compiler module might
have programmatically.
The whole "name for the function I'm compiling" thing isn't an issue
either, or at least it shouldn't be. The code being compiled is
implicitly a subroutine -- you don't have to have code that reads:
.sub foo_1234423_some_random_text
.
.
.
.end
and go look for 'foo_1234423_some_random_text' in a namespace
somewhere. Just leave out the .sub/.end (they should be implied) and
the returned PMC is a sub PMC for your nicely anonymous sub. Which is
fine, and as it should be.
--
Dan
--------------------------------------it's like this-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
...now. That can be changed.
>The PIR
>compiler needs compilation units. If the compiler is PASM, it'll compile
>whatever is fed to it.
We can have an implied compilation unit if things are properly set
up. I think that's not unreasonable if we can work out the situations
under which it's appropriate.
>And C<compile> is a synonym for C<invoke> now, but the latter implies a
>branching operation, while the former doesn't.
Right, but that's just a quirk of implementation. It could change
>> The PIR
>> compiler needs compilation units. If the compiler is PASM, it'll compile
>> whatever is fed to it.
> We can have an implied compilation unit if things are properly set up. I
> think that's not unreasonable if we can work out the situations under
> which it's appropriate.
I don't see a good reason to change the current behavior. The PIR
compiler takes compilation units. A lispish compiler could take a form.
It's like feeding the latter stuff without the enclosing parentheses.
leo
What, besides me saying "Change the current behaviour"? The inability
to compile and return truly anonymous subs in PIR is, by itself,
enough to warrant the change.
Still runs into the issue of not returning a sub PMC to use.
I can see not wanting to burden the PIR parser with figuring out
whether a chunk of PIR it gets is 'normal' source with subs and all,
or an anonymous sub. Having an alternate PIR parser that just wraps a
".sub/.end" around the code and returns the PMC for the resulting sub
wouldn't be out of line.
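
A minimal sketch of that wrapper idea, written in Python rather than PIR
and meant only as an analogy: compile_anonymous is a hypothetical helper,
not part of Parrot. It wraps a bare body in a function definition,
compiles it in a throwaway namespace, and hands back only the function
object, so no name needs to be looked up afterwards.

```python
import textwrap


def compile_anonymous(body: str):
    """Wrap a bare code body in a function and return the function object.

    Hypothetical helper: models the "implied .sub/.end" idea in Python,
    not Parrot's actual compiler interface.
    """
    source = "def _anon():\n" + textwrap.indent(textwrap.dedent(body), "    ")
    scratch = {}  # private namespace, discarded after compilation
    exec(compile(source, "<anon>", "exec"), scratch)
    return scratch["_anon"]  # the caller only ever sees the object


first = compile_anonymous("return 6 * 7")
second = compile_anonymous("return 'hello'")
print(first(), second())
```

Both results were compiled under the same placeholder name, yet they do
not collide, because that name never leaves the scratch namespace.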
The AST processor is going to have to deal with this -- there are a
lot of languages, including perl, that are going to want to throw
unmarked compilation units, and nested compilation units, at it.
>What else is needed for anonymous subs? How do we get at the
>subroutine object, if the anon sub is compiled statically?
If it's compiled statically you can't get to it, so they're kinda
useless there. (Unless we want to get into PMC constants, in which
case I could see them being useful.)
[ passing arguments to compilers ]
> ... So why does the compile op exist?
Your concerns are all valid. The compiler interface needs extension as
well as some cleanup. This is true for compilers written in C (loadable
as shared libs) and for compilers written in PASM/PIR.
The explicit call of Parrot_runops... in C<compile> was a short-term
hack to get a compiler implemented as a Parrot_Sub running. I think that
removing that stuff and always calling ->invoke should already do it.
You can subclass the internal Compiler PMC, overload __invoke, and pass
in whatever is needed.
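
As a rough illustration of the subclass-and-override approach, here is a
Python analogy. The Compiler and ConfigurableCompiler classes are
hypothetical stand-ins, and __call__ plays the role of the overloaded
__invoke; none of this is Parrot's real API.

```python
class Compiler:
    """Stand-in for a registered compiler object (hypothetical)."""

    def __call__(self, source):
        return compile(source, "<eval>", "eval")


class ConfigurableCompiler(Compiler):
    """Overrides invocation to accept extra per-compile settings."""

    def __call__(self, source, **options):
        # Any option the caller passes arrives here; only one is used.
        filename = options.get("filename", "<eval>")
        return compile(source, filename, "eval")


plain = Compiler()
tuned = ConfigurableCompiler()
code = tuned("1 + 2", filename="<regex-closure>")
print(eval(plain("1 + 2")), eval(code), code.co_filename)
```

The base interface stays a single call with the source text, while the
subclass decides which extra arguments it understands.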
The C<compile> opcode itself isn't more than a wrapper around a function
or method call, a visible signal that something different is happening
here. OTOH it's not really needed. It's just C<invoke> - or
could/should be.
leo
I understand that perspective, but I guess I'm thinking about embedded
compilers somewhat differently. For example, consider a regex compiler.
It needs to be able to compile embedded code in whatever the host
language is. In fact, it needs to be able to switch back and forth
freely between the regex compile and the host language compile, and the
compilation of the inner language might need to be tailored to fit into
whatever the regex compiler needs. Maybe that's as simple as saying
"don't provide a main()", in which case it can be done by having two
C<compreg> registration strings for the same language. But you have to
get the name of that language into the regex compiler in the first
place. (Ok, you might be able to avoid that in this particular case by
making the regex compiler into a coroutine, but I don't want to get too
caught up in one particular example.)
And the compiler needs to be reentrant, for the cases where the language
within the regex rule invokes another regex match. I mention that only
to say that you can't just set properties on the PMC returned from
C<compreg>, because that PMC will be shared during reentrant calls. You
could always clone it and then configure it, I suppose.
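
The clone-then-configure option could look roughly like this Python
sketch. The registry dict stands in for C<compreg>, and HostCompiler,
compile_nested and the emit_main option are all hypothetical names
chosen for illustration.

```python
import copy


class HostCompiler:
    """Hypothetical compiler object shared through a registry."""

    def __init__(self):
        self.options = {"emit_main": True}

    def compile(self, source):
        # Return the source plus a snapshot of the options used.
        return (source, dict(self.options))


registry = {"hostlang": HostCompiler()}  # plays the role of compreg


def compile_nested(language, source, **overrides):
    shared = registry[language]
    private = copy.deepcopy(shared)  # configure a clone, not the shared one
    private.options.update(overrides)
    return private.compile(source)


inner = compile_nested("hostlang", "x + 1", emit_main=False)
print(inner, registry["hostlang"].options)
```

Because each nested compile works on its own copy, a reentrant call that
fetches the same registered compiler still sees the default settings.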
Anyway, I'm just trying to come up with situations where compilers need
to know more than just the language they're compiling, and especially
cases where you want different configuration for every compile. Another
example of this would be if your regex syntax involved binding
hypothetical variables (or something similar), and the inner language
needed to know at compile time which variables had been defined at that
point.
I'm sure I could come up with workarounds for all of these issues, but I
was expecting that much of the usefulness of Parrot would be in mixing
together (and nesting) several languages in one program, and it seems
like in many cases nested compiles are going to need to communicate
nontrivial amounts of information.
I'm okay with things if the answer is "don't do that" -- meaning if you
need complex cases like this, then forget about C<compile> and do
everything with straight subroutines or whatever else -- but I would
like to understand the intent of the C<compile> op better so I can
forget about trying to make my stuff fit into its mold if what I'm doing
is just different.
> The whole "name for the function I'm compiling" thing isn't an issue
> either, or at least it shouldn't be. The code being compiled is
> implicitly a subroutine -- you don't have to have code that reads:
>
> .sub foo_1234423_some_random_text
> .
> .
> .
> .end
>
> and go look for 'foo_1234423_some_random_text' in a namespace
> somewhere. Just leave out the .sub/.end (they should be implied) and
> the returned PMC is a sub PMC for your nicely anonymous sub. Which is
> fine, and as it should be.
That would work fine for me. The current state also works ok since
overriding is allowed, but it feels wrong to construct a sub with a
specific name and then disavow all knowledge of that name even though
it's been registered in some global table. Leo's @ANON implementation of
your scheme works great for me (I have no problem wrapping that around
my code.) All this does raise the question of garbage collection for
packfile objects; is there any? Both my current day job project and (I'm
guessing) mod_perl hope the answer is "yes". :-)
> ... Leo's @ANON implementation of
> your scheme works great for me (I have no problem wrapping that around
> my code.) All this does raise the question of garbage collection for
> packfile objects; is there any?
Not yet. We basically have two kinds of dynamically compiled code:
1) loaded modules - persistent code used until end of program
2) evaled "statements" - volatile code, maybe used once only
But the current implementation doesn't know about that difference. The
compiled code is always appended to the list of code segments. There is
no interface yet to manipulate packfile segments.
We ultimately need a packfile PMC that is the owner of packfile segments.
If that PMC goes out of scope the compiled code structures can be freed.
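
The ownership idea can be sketched in Python, purely as an analogy. The
Packfile class and the freed list are hypothetical, and weakref.finalize
stands in for whatever destruction hook a real packfile PMC would use.

```python
import gc
import weakref

freed = []  # records which segments were released, for illustration


class Packfile:
    """Hypothetical owner of the code segments produced by one compile."""

    def __init__(self, name, segments):
        self.name = name
        self.segments = list(segments)
        # Release the segments when the owner itself is collected.
        weakref.finalize(self, freed.extend, self.segments)


evaled = Packfile("eval_1", ["bytecode_1", "constants_1"])
del evaled    # last reference dropped
gc.collect()  # CPython frees on del already; this covers other runtimes
print(freed)
```

Nothing outside the owner has to track the segments; dropping the owner
is what triggers their release.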
This packfile PMC would also largely eliminate the difference between 1)
and 2), all the more so once there is an interface for appending the
newly compiled code to existing code segments, so that you can e.g. dump
the combined code to disk.
But it would still be useful to differentiate between 1) and 2). For 1)
we could do global constant folding (if a constant already exists in the
main constant table just use it, or, if not, append to the main constant
table).
For 2) a distinct constant table is needed.
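
The two treatments of constants might be modelled like this in Python.
ConstantTable, load_module and eval_statement are hypothetical names, and
the sketch ignores how real packfile constant tables are laid out.

```python
class ConstantTable:
    """Hypothetical constant table: maps each constant to a stable index."""

    def __init__(self):
        self.values = []
        self._index = {}

    def intern(self, value):
        # Reuse the existing slot if the constant is already present.
        if value not in self._index:
            self._index[value] = len(self.values)
            self.values.append(value)
        return self._index[value]


main_table = ConstantTable()


def load_module(constants):
    """Case 1: persistent code folds its constants into the main table."""
    return main_table, [main_table.intern(c) for c in constants]


def eval_statement(constants):
    """Case 2: volatile code gets its own table, droppable with the code."""
    table = ConstantTable()
    return table, [table.intern(c) for c in constants]


_, a = load_module(["hello", 42])
_, b = load_module([42, "world"])
private, c = eval_statement([42, "tmp"])
print(a, b, c, main_table.values, private.values)
```

The second module reuses the slot for 42 from the first, while the
evaled statement leaves the main table untouched.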
leo