> 1: Right now, would it be possible for parrot only to install its signal
> handlers when it starts the runloop?
> (given that ponie isn't using the runloop)
Currently Parrot installs just one handler (SIGHUP) for testing only.
See src/events.c. So old issues WRT ponie regression tests should be
solved.
> 2: Long term can parrot interwork nicely with existing signal handlers
> by storing the function pointer returned by signal() when it installs
> its handler, and calling that function when its signal handler is called?
Shouldn't be too hard, yes.
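A minimal sketch of the chaining idea from question 2, assuming plain ISO C signal(). All names here (vm_install_handler, vm_signal_handler, host_handler) are hypothetical and are not Parrot's actual code; host_handler only stands in for whatever the embedding application had installed first.

```c
#include <assert.h>
#include <signal.h>

typedef void (*handler_fn)(int);

/* Handler that was in place before the VM installed its own. */
static handler_fn previous_handler = SIG_DFL;
static volatile sig_atomic_t vm_signal_seen = 0;
static volatile sig_atomic_t host_signal_seen = 0;

/* Stands in for a handler the embedding application installed earlier. */
static void host_handler(int signo)
{
    (void)signo;
    host_signal_seen = 1;
}

/* The VM's handler: note the signal, then forward it to the handler
 * that was installed before, if that was a real function. */
static void vm_signal_handler(int signo)
{
    vm_signal_seen = 1;
    if (previous_handler != SIG_DFL && previous_handler != SIG_IGN)
        previous_handler(signo);
}

/* Install the VM's handler and remember the old one.
 * Returns 0 on success, -1 if signal() failed. */
static int vm_install_handler(int signo)
{
    handler_fn old;
    assert(signo > 0);
    old = signal(signo, vm_signal_handler);
    if (old == SIG_ERR)
        return -1;
    previous_handler = old;
    return 0;
}
```

If the application calls signal(SIGTERM, host_handler) first and vm_install_handler(SIGTERM) afterwards, a later raise(SIGTERM) reaches both handlers. A real implementation would more likely use sigaction(), whose semantics are better defined across platforms than those of signal().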
> IIRC ponie isn't playing nicely with how parrot does embedding. I'd like
> parrot's embedding to be as "clean" as perl's.
A lot of interface functions are still missing.
> IIRC parrot is currently assuming that we give it the address of a local
> variable (for the stack top) before doing anything else, which is making
> a big assumption about the calling locations of various parrot functions
> within the code of the embedding program.
The stack top has to be beyond any variables Parrot might see. When the
stack (and the CPU registers) are traced during DOD, all possible
pointers to PMCs and Buffers must lie in the range between the stack
top and the current stack pointer.
> ... Would it be possible for parrot to
> provide an embedder's interface to all the (exported) functions that
> checks whether the stack top pointer is set, and if not (ie NULL) it
> pulls the address of a local variable in it
This doesn't work:
{
    PMC *x = pmc_new(..);
    {
        some_parrot_func();
    }
}
C<x> would be outside of the visible range of stack items. The braces do
of course indicate stack frames.
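To illustrate why the range matters, here is a rough sketch of a conservative scan over a range of words. The Arena type and scan_words() are hypothetical stand-ins, not Parrot's DOD code; a real collector would mark each object it finds rather than count hits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the memory that PMC headers live in. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} Arena;

/* Walk every word in [lo, hi) and count those whose value falls
 * inside the arena, i.e. those that look like pointers to objects. */
static size_t scan_words(const uintptr_t *lo, const uintptr_t *hi,
                         const Arena *arena)
{
    size_t hits = 0;
    assert(lo <= hi);
    for (const uintptr_t *p = lo; p < hi; p++) {
        if (*p >= arena->start && *p < arena->end)
            hits++;
    }
    return hits;
}
```

If the word holding x sits outside [lo, hi), because the stack top was recorded in the inner frame, the scan never sees it and the object looks dead. Widening the range to include the outer frame makes the same scan find it.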
> IIRC there only interface to call into parrot bytecode subs currently
No. There are a lot:
void Parrot_runops_fromc(Parrot_Interp, PMC *sub);
void Parrot_runops_fromc_save(Parrot_Interp, PMC *sub);
void* Parrot_runops_fromc_args(Parrot_Interp, PMC *sub, const char *sig, ...);
void* Parrot_runops_fromc_args_save(Parrot_Interp, PMC *, const char *, ...);
INTVAL Parrot_runops_fromc_args_save_reti(Parrot_Interp, PMC *,
const char *, ...);
FLOATVAL Parrot_runops_fromc_args_save_retf(Parrot_Interp, PMC *,
const char *, ...);
plus a bunch more to run method functions.
see perldoc -F src/interpreter.c
> ... Can the NCI code work for calling into parrot,
> as well as calling out?
NCI calls C functions from byte code. Calling PASM/PIR from C is done
with the above functions. And there is PASM->C->PASM via callback functions.
> Also IIRC parrot is still creating globally visible symbols that aren't
> well prefixed, so may cause linker fun. But that's not really a ponie
> issue.
Yep. A lot of them. What's perl5 doing about those?
> Sorry if I'm behind the times on parrot, and some of what I remember is
> wrong.
No problem.
> Nicholas Clark
leo
> And instead the perl regression tests that use SIGHUP fail.
> (There are also tests on SIGUSR1, but not SIGUSR2, it seems)
Argh. I grepped through all perl 5.8.0 tests and didn't find SIGHUP, so
I thought it was usable. AFAIK SIGUSR2 is not usable due to linuxthreads.
But anyway, can we just grab different signals for Parrot and Ponie
testing? AFAIK if a test runs with SIGINT or SIGUSR1, SIGHUP will work
too.
>> C<x> would be outside of the visible range of stack items. The braces do
>> of course indicate stack frames.
> This is true. And yes you need to set a stack top if you're expecting the
> stack walking to find things you own. But I'm thinking about the case where
> the embedding app registers all its PMCs immediately, and so doesn't expect
> (or want) to be stack walked.
Each PMC allocation can trigger DOD and/or GC. If you never free any PMC
until program termination, you can turn off both DOD and GC. I doubt
that this is a reasonable scenario, though. And if you "unuse" PMCs and
DOD is off, you'll quickly run out of resources because recycling can't
be done then.
If you really don't want to rely on stack walking, you have to register
each PMC with C<Parrot_register_pmc()>. This function uses a hash for
remembering the PMC's address. So it's not really fast.
> I doubt that ponie will be the only app wanting to take control of the
> lifetime of all its PMC acting as references into the parrot VM
Well, Parrot does DOD/GC. It'll not start refcounting ... But I really
don't think that this is an issue. You just need to start your app like
this:
int main(void) {
    void *stack_top;
    return real_main(&stack_top); // pass &stack_top to parrot
}
then every PMC is for sure covered by the stack walking code. And there
is of course the scheme described in t/src/basic_3 with
Parrot_run_native(), which already does The Right Thing(tm).
[ global syms ]
>> Yep. A lot of. What's perl5 doing against those?
> Prefixing all the linkable functions as Perl_ or PerlIO_ etc
Ugly. Can't the linker hide it?
> Nicholas Clark
leo
> Well, Parrot does DOD/GC. It'll not start refcounting ... But I really
> don't think, that this is an issue. You just need to start your app like
> this:
>
> main() {}
> void *stack_top;
> return real_main(stack_top); // pass &stack_top to parrot
> }
If I understand this correctly, then this is far more restrictive than the
perl5 embedding interface. For example you can't "hide" an embedded parrot
interpreter as an implementation detail inside a library that you link
against existing (unchanged) C code.
It's possibly not an issue long term for ponie (as a stand alone interpreter),
except that it would mean that ponie couldn't be embedded in the way that
perl5 currently can be embedded.
Nicholas Clark
> Nicholas Clark <ni...@ccl4.org> wrote:
> > This is true. And yes you need to set a stack top if you're expecting the
> > stack walking to find things you own. But I'm thinking about the case where
> > the embedding app registers all its PMCs immediately, and so doesn't expect
> > (or want) to be stack walked.
>
> Each PMC allocation can trigger DOD and or GC. If you never free any PMC
> until program termination, you can turn off both DOD and GC. I doubt
> that this is a reasonable scenario, though. And if you "unuse" PMCs and
> DOD is off, you'll quickly run out of resources because recycling can't
> be done then.
>
> If you really don't want to rely on stack walking, you have to register
> each PMC with C<Parrot_register_pmc()>. This function uses a hash for
> remembering the PMC's address. So it's not really fast.
But I suspect that there will be embedders of parrot who want to do this -
register only 1 or 2 PMCs that they use to hold configuration or state, and
hold onto these outside the runloop. They'd then make calls into the runloop
to get stuff done. Effectively this allows the embedding app to provide
subroutines/functions that appear to the caller to be written in C, but
actually have a parrot implementation. But the implementation language is
an implementation detail.
> > Prefixing all the linkable functions as Perl_ or PerlIO_ etc
>
> Ugly. Can't the linker hide it?
Probably yes on most platforms, but the usual problem for perl is that it
has to remain horribly portable, and it has to be portable to platforms
that aren't currently available for any of the porters to test on.
So fun things like Crays (where sizeof(short) is 8).
Nicholas Clark
>> register each PMC with C<Parrot_register_pmc()>.
^^^^^^^^^^^^^^^^^^^^^^^^
> But I suspect that there will be embedders of parrot who want to do this -
> register only 1 or 2 PMCs
The answer is above. When the run loop is entered for the first time, the
stack top is automatically set to the run loop's level. That's it. All
PMCs that are created before that have to be registered manually. Or - if
you are really creating just a few PMCs - turn off DOD/GC. Parrot
currently does this in the compiler.
>> Ugly. Can't the linker hide it?
> Probably yes on most platforms, but the usual problem for perl is that it
> has to remain horribly portable,
Ok, ok. We'll have to prefix all globally visible symbols.
> Nicholas Clark
leo
> If I understand this correctly, then this is far more restrictive than the
> perl5 embedding interface.
This is *one* way to start things up. You can start the app with
Parrot_run_native. You can set the stack top. You can register each PMC
with dod_register. You can stuff all PMCs that wouldn't be visible into a
PMC array ... But one of these policies, or a suitable mixture, has
to be chosen.
Like it or not, DOD/GC has different impacts on the embedder. The above
rules are simple. There is no "when the PMC isn't used any more decrement a
refcount" and "when you do that and that then increment a refcount" or
some such like in XS. This is really simple. Simplest is to just set the
top of stack.
> Nicholas Clark
leo
>
>> ... Would it be possible for parrot to
>> provide an embedder's interface to all the (exported) functions that
>> checks whether the stack top pointer is set, and if not (ie NULL) it
>> pulls the address of a local variable in it
>
> This doesn't work:
>
> {
> PMC *x = pmc_new(..);
> {
> some_parrot_func();
> }
> }
>
> C<x> would be outside of the visible range of stack items. The braces
> do
> of course indicate stack frames.
Since in this case I am outside of parrot and have chosen to use the
interface, I had better use register_pmc, and if I did, then this scheme
would work?
Arthur
> Like it or not DOD/GC has different impacts on the embedder. Above
> rules
> are simple. There is no "when the PMC isn't used any more decrement a
> refcount" and "when you do that and that then icnrement a refcount" or
> some such like in XS. This is really simple. Simplest is to just set
> the
> top of stack.
>
I am now going to be impolite.
THERE ARE CASES WHERE YOU CAN NOT SET A TOP OF STACK, FOR EXAMPLE IF
YOU ARE WRITING A PLUGIN TO A BINARY ONLY APPLICATION LIKE INTERNET
EXPLORER OR WRITING AN APACHE2 SHARED LIBRARY THAT IS SUPPOSED TO WORK
WITH PRE COMPILED BINARIES, NOT TO MENTION A LOT OF APPLICATIONS THAT
MIGHT WANT TO EMBED PARROT AS AN OPTION MIGHT FEEL IT IS A TAD FUCKING
UNCLEAN TO RUN THEIR ENTIRE APPLICATION THROUGH PARROT (THINK
OPENOFFICE)
I am amazed by the fact that parrot seems determined to redo the same
mistakes perl5 did.
Arthur
Meh...
Leo: There are some embedding applications where it's simply not
possible to get the top of the stack. For example, let's say I want to
write a Parrot::Interp module for Perl 5 (on a non-Ponie core):
my $i=new Parrot::Interp;
my $argv=$i->new_pmc('PerlArray');
$argv->push($i->new_pmc('PerlString')->set_string('foo'));
$i->load_bytecode("foo.pbc");
$i->run_bytecode($argv);
Now, theoretically Parrot::Interp::new should capture the top of the C
stack, but there's no way it could do so. If it captured an auto
variable in its own body, that variable might not even be part of the
stack by the time run_bytecode is invoked.
Having said that, the PMC registration technique ought to be good enough
for this particular application.
Arthur: Embedding Parrot will never be quite as simple conceptually as
embedding Perl. The garbage collection system ensures that. Even so,
there does need to be a way to embed Parrot without having it take over
your program--and it appears that PMC registration and other alternative
methods of dealing with the GC will do that. There's no need to disable
the GC outside of a runloop, and in fact I could easily imagine someone
using Parrot buffers and the GC system without using the runloop itself
as a convenient memory management system for an application otherwise
written in straight C. (Not to mention that Parrot I/O and strings
should be a lot nicer than the straight C equivalents...)
Parrot must be embeddable in virtually any environment Perl can be.
That doesn't mean it has to be as easy, but it has to be possible. If
it isn't, we might as well give up on the embedding interface altogether.
--
Brent "Dax" Royal-Gordon <br...@brentdax.com>
Perl and Parrot hacker
Oceania has always been at war with Eastasia.
> THERE ARE CASES
Arthur, please let's quietly talk about possible issues.
Many libraries that you want to use, demand that you call
"The_lib_init(bla)". This isn't inappropriate, it's a rule. (dot).
Parrot is GC based. (dot).
This imposes different semantics for embedders. I've listed four
different very simple ways to not get your PMC collected too early.
GC and refcounting are different schemes to achieve the same thing. You
know that. But nevertheless you have to follow these GC specific rules.
leo
> Meh...
> Leo: There are some embedding applications where it's simply not
> possible to get the top of the stack.
Not possible, or some of ... just don't like that ;)
> write a Parrot::Interp module for Perl 5 (on a non-Ponie core):
> my $i=new Parrot::Interp;
> my $argv=$i->new_pmc('PerlArray');
If there is such an interface, it's responsible for anchoring the PMC.
This is *one* simple rule.
Shit: really. I don't get it. Please read (again):
$ perldoc perlguts
/increment
... and does not increment
... do not increment the reference count
... As a side effect, it increments
... has been incremented to two.
... If it is not the same as the sv
argument, the reference count of the obj object is
incremented. If it is the same, or if the how argument is
PERL_MAGIC_arylen, or if it is a NULL pointer, then obj is
merely stored, without the reference count being
incremented.
*That could make me cry*
Think different,
have fun,
leo
> Arthur Bergman <s...@nanisky.com> wrote:
>
>> THERE ARE CASES
>
> Arthur, please let's quietly talk about possible issues.
>
> Many libraries that you want to use, demand that you call
> "The_lib_init(bla)". This isn't inappropriate, it's a rule. (dot).
> Parrot is GC based. (dot).
>
Yes, but they don't demand that at the top level. By demanding that at
the top level you cut out all non-open-source applications with a
plugin-based API. If this is your goal then I am going to stop playing
right now.
> This imposes different semantics for embedders. I've listed four
> different very simple ways to not get your PMC collected to early.
>
> GC and refcounting are different schemes to achieve the same thing. You
> know that. But nethertheless you have to follow these GC specific
> rules.
>
Leo, I am not an idiot, please do not treat me like one. I fail to see
how the register/unregister PMC issue is semantically different from a
reference count.
All I want to do is.
1) create a parrot interpreter
2) create some pmcs
3) call some code inside parrot with those pmcs
Now I am fine registering those PMCs that I create and unregister them
afterwards, but inside the call to parrot everything should behave as
normal! Currently there is no easy way to do this. The obvious answer
seems to be to have the embedding interface set the top of stack in
each embedding function if it is not set. This would do the right thing
and make it easy to embed parrot.
Arthur
> ... The obvious answer
> seems to be to have the embedding interface set the top of stack in
> each embedding function if it is not set. This would do the right thing
> and make it easy to embed parrot.
No. I've already posted this example:
{
    PMC *some = pmc_new(...);
    {
        PMC *another = pmc_new(...);
    }
    // some may be dead here
}
The braces denote stack frames.
> Arthur
leo
No you have not; you have pasted something that has no relevance
whatsoever to what I am saying unless you give it some more verbosity.
Yes, I know this is the case, which is why I have to do
{
    a = pmc_new(...);
    register_pmc(a);
    {
        b = pmc_new(...);
        register_pmc(b);
    }
    /* a should still be alive here */
}
By leaving out the needed function you do not convey any response to my
proposal that parrot embedding functions which detect that no stacktop
is set should set it before calling onwards into parrot. Yes, if no
stacktop is set in my code I need to carefully use register_pmc, but I
know that already and it is something I agree to do.
Arthur
>
>> All I want to do is.
>
>> 1) create a parrot interpreter
>> 2) create some pmcs
>> 3) call some code inside parrot with those pmcs
>
> I've now added a missing init function that sets the stack top:
>
> Parrot_init_stacktop(Interp*, void*);
>
> This function can be used as a replacement for Parrot_init(). I hope
> that simplifies step 1)
No, it entirely misses the point, every time I call into parrot the
place I called parrot_init_stacktop might not, or most likely will not
be in my current stack.
Is the stacktop per interpreter?
What then would be needed is a set_stacktop(Interp*, void*) function.
Arthur
> On 2 May 2004, at 11:47, Leopold Toetsch wrote:
>> Parrot_init_stacktop(Interp*, void*);
>>
>> This function can be used as a replacement for Parrot_init(). I hope
>> that simplifies step 1)
> No, it entirely misses the point, every time I call into parrot the
> place I called parrot_init_stacktop might not, or most likely will not
> be in my current stack.
Can't you call that somewhere in an outer frame? E.g. where you create
the interpreter.
> Is the stacktop per interpreter?
Yes. But DOD/GC with multiple interpreters/threads isn't really
investigated, especially when there are shared PMCs.
> What then would be needed is a set_stacktop(Interp*, void*) function.
You can use the above function for exactly this. If the interpreter is
already initialized it skips that part and just sets the stack top. But
calling Parrot_init_stacktop() in the same stack frame where automatic PMC
variables are located might or might not work, because you don't know
the order of automatic variables in the stack frame.
> Arthur
leo
>>> that simplifies step 1)
>
>> No, it entirely misses the point, every time I call into parrot the
>> place I called parrot_init_stacktop might not, or most likely will not
>> be in my current stack.
>
> Can't you call that somewhere in an outer frame? E.g. where you create
> the interpreter.
No, because I might be creating the interpreter in a callback from the
application, and then access that interpreter in ANOTHER callback from
the application.
Arthur
You are speaking of a usage like "perldoc perlembed" here, I presume.
For old code that is totally unaware that it's running on Parrot, you'll
have to anchor (dod_register) returned PMCs. dod_register() works like
REFCNT_inc(), i.e. you can register a PMC multiple times. Unregistering is
like REFCNT_dec(), except there is no immediate destruction if the
register count reaches zero.
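The registration semantics described here can be sketched roughly as follows. This is not Parrot's implementation: the thread says the real registry uses a hash, while this sketch swaps in a small fixed table with a linear search to stay short, and every name in it (RootRegistry, registry_register, and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROOTS 16

/* Hypothetical registry of objects the collector must treat as live. */
typedef struct {
    const void *obj[MAX_ROOTS];
    unsigned    count[MAX_ROOTS];   /* 0 marks a free slot */
} RootRegistry;

/* Return the slot holding obj, or -1 if it is not registered. */
static int registry_find(const RootRegistry *r, const void *obj)
{
    for (int i = 0; i < MAX_ROOTS; i++)
        if (r->count[i] > 0 && r->obj[i] == obj)
            return i;
    return -1;
}

/* Register obj, or bump its count if it is already registered.
 * Returns 0 on success, -1 if the table is full. */
static int registry_register(RootRegistry *r, const void *obj)
{
    int i;
    assert(obj != NULL);
    i = registry_find(r, obj);
    if (i >= 0) {
        r->count[i]++;
        return 0;
    }
    for (i = 0; i < MAX_ROOTS; i++) {
        if (r->count[i] == 0) {
            r->obj[i] = obj;
            r->count[i] = 1;
            return 0;
        }
    }
    return -1;
}

/* Drop one registration. Nothing is destroyed here: at count zero the
 * object merely stops being a root for the next collection run. */
static void registry_unregister(RootRegistry *r, const void *obj)
{
    int i = registry_find(r, obj);
    if (i >= 0)
        r->count[i]--;
}

/* Non-zero while obj has at least one outstanding registration. */
static int registry_is_rooted(const RootRegistry *r, const void *obj)
{
    return registry_find(r, obj) >= 0;
}
```

Registering the same object twice and unregistering it once leaves it rooted; only the second unregister makes it collectable, and even then nothing happens until the collector next runs.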
Such an embedded usage might also need a flag for entering the run loop,
so that the stack top is always set at the run loop stack frame.
> Arthur
leo
IGNORE PERL5 IGNORE PERLEMBED, IGNORE ANY FUCKING REFERENCE TO PERL5,
ALSO USE THE EMBEDDING API NAMES NOT THE INTERNAL ONES
Let us say I want to embed parrot in application x. Application x is
binary only and has a plugin API with three callbacks:
app_x_init_plugin
app_x_run_plugin
app_x_destroy_plugin
When the app loads the plugin, it calls app_x_init_plugin, in this
routine I load parrot using Parrot_init(Parrot_Interp); I return from
my init function and the program continues to run, ANY STACK AUTO
VARIABLE CAPTURED IN THIS FRAME IS FROM NOW ON USELESS
Then when it runs and wants to use me it calls app_x_run_plugin. In
here I set up some PMCs, REGISTER THEM using Parrot_register_pmc, then
call into parrot using for example void Parrot_runcode(Parrot_Interp,
int argc, char *argv[]); Now my argument is that currently this does not
work because the stacktop is not set! So what Parrot_runcode should do is
{
    if (!parrot->stacktop) {
        set_stacktop
    }
    Parrot_runcode_real(...)
}
So for the embedder it is transparent if the stacktop is set or not.
Arthur
> ... , then call into
> parrot using for example void Parrot_runcode(Parrot_Interp, int argc,
> char *argv[]); Now my argument is that currently this does not work
> because the stacktop is not set! so what Parrot_runcode should do is in
> {
> if(!parrot->stacktop) {
> set_stacktop
When you call C<app_x_run_plugin> from different locations, this
wouldn't work. So the correct sequence is:
app_x_run_plugin(...) {
    void *stk;
    Parrot_init_stacktop(interp, &stk);
    Parrot_call(interp, the_sub, ...);
}
> Arthur
leo
>> because the stacktop is not set! so what Parrot_runcode should do is
>> in
>
>> {
>> if(!parrot->stacktop) {
>> set_stacktop
>
> When you call C<app_x_run_plugin> from different locations, this
> wouldn't work. So the correct sequence is:
>
> app_x_run_plugin(...) {
> void stk;
> Parrot_init_stacktop(interp, &stk);
> Parrot_call(interp, the_sub, ...);
> }
>
>> Arthur
>>
Not if you clear the stacktop if you set it in Parrot_call, I thought
this was obvious so I left it out, but here we go.
parrot_call {
    if (stacktop) {
        parrot_call_real
    } else {
        set_stacktop
        parrot_call_real
        unset_stacktop
    }
}
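Spelled out in C, the wrapper proposed above might look something like this. MockInterp, mock_call() and mock_call_real() are hypothetical stand-ins; nothing here is Parrot's real interpreter structure or embedding API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the interpreter. */
typedef struct {
    void *stack_top;   /* NULL means "not set" */
    int   calls;       /* counts entries into the mock VM */
} MockInterp;

/* Stands in for the real entry point. A real one would run bytecode
 * and might trigger a stack-walking collection along the way. */
static void mock_call_real(MockInterp *interp)
{
    assert(interp->stack_top != NULL);
    interp->calls++;
}

/* Entry-point wrapper: borrow the address of a local as the stack top
 * when none is set, and clear it again on the way out. */
static void mock_call(MockInterp *interp)
{
    if (interp->stack_top != NULL) {
        mock_call_real(interp);
    } else {
        void *marker = NULL;
        interp->stack_top = &marker;
        mock_call_real(interp);
        interp->stack_top = NULL;
    }
}
```

With this shape the embedder never sees the stack top: a call made with none set gets a temporary one covering everything below the wrapper, and a call made with one already set leaves it untouched. PMCs created in frames above the wrapper are still outside the scanned range, so they would have to be registered, which is what the proposal already accepts.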
ICU somehow manages (optionally?) to suffix all of its symbols with the
version number, automagically (meaning, they don't look that way in the
source or the headers). We should figure out how they are doing that,
and see if we can use the same approach. I don't know if it's linker
magic, or compiler magic, or magic magic.
JEff
> ICU somehow manages (optionally?) to suffix all it its symbols with the
> version number, automagically (meaning, they don't look that way in the
> source or the headers).
urename.h is responsible for this. All the symbols get a version suffix.
> JEff
leo
The Linux kernel does this as well with compiler/header/preprocessor magic.
Yes, it's optional. The --disable-renaming configure option can be used
to turn this feature off. This feature is used so that multiple
versions of ICU can be used within the same application. This can be
important when you want to have multiple collation versions, or in a
huge application where part of it uses the current ICU while the rest
(maybe from third party vendors) can continue to use older versions of ICU.
This magic is done with a perl script called
icu/source/tools/genren/genren.pl, which uses nm on Linux. It's run
before every release of ICU, and the resulting urename.h is checked in.
George
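For readers unfamiliar with the trick, a renaming header of this general kind can be sketched with the preprocessor alone. The macro names, the _2_8 suffix and mylib_add() below are made up for illustration and are not taken from ICU's urename.h.

```c
/* Hypothetical renaming header fragment. */
#define MYLIB_SUFFIX _2_8
#define MYLIB_PASTE(name, suffix)  name##suffix
#define MYLIB_PASTE2(name, suffix) MYLIB_PASTE(name, suffix)
#define MYLIB_RENAME(name)         MYLIB_PASTE2(name, MYLIB_SUFFIX)

/* One such line per exported symbol, typically generated by a script. */
#define mylib_add MYLIB_RENAME(mylib_add)

/* Headers and source keep using the plain name ... */
int mylib_add(int a, int b);

/* ... but the symbol the compiler emits is mylib_add_2_8. */
int mylib_add(int a, int b)
{
    return a + b;
}
```

Callers that include the header write mylib_add(2, 3) and are silently linked against mylib_add_2_8, so two library versions with different suffixes can coexist in one process. The extra MYLIB_PASTE2 level is there so that MYLIB_SUFFIX is expanded before the tokens are pasted.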