
[perl #20315] [PATCH] eval


Leopold Toetsch

Jan 15, 2003, 6:19:34 AM
to perl6-i...@perl.org, bugs-bi...@netlabs.develooper.com
Leopold Toetsch (via RT) wrote:

> # New Ticket Created by Leopold Toetsch
> # Please include the string: [perl #20315]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=20315 >
>
>
> Attached is a first try towards eval.


I have now eval/compile also running inside imcc. There are two
registered compregs: "PASM" and "PIR" aka .imc, e.g.

set S0, 'set S1, "42\n"'
concat S0, "\n"
concat S0, "print S1\nend\n"
compile P1, S0, "PASM"
print "\n"
end

I still would like to have some design advice.

1)
The call function to the compiler/assembler is kept as a NCI. Better
would be a subclass of NCI (Compiler.pmc or so), which provides

invoke_keyed(key, next)

This would look up the compreg "key" and prepare the registers for
calling the compiler function.
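
A minimal sketch of the lookup-then-call idea, in plain C rather than as a real PMC vtable method. Every name here (sk_registry, sk_compreg_add, sk_invoke_keyed, the stub compilers) is hypothetical and not part of Parrot; a real invoke_keyed() would work on PMCs and set up registers instead of passing a C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compiler entry point: takes source text, returns a status. */
typedef int (*sk_compile_fn)(const char *source);

struct sk_compreg {
    const char   *type;   /* e.g. "PASM" or "PIR" */
    sk_compile_fn fn;
};

#define SK_MAX_COMPREGS 8

struct sk_registry {
    struct sk_compreg slots[SK_MAX_COMPREGS];
    size_t            used;
};

/* Register a compiler under a type name; returns 0 on success, -1 if full. */
static int sk_compreg_add(struct sk_registry *r, const char *type,
                          sk_compile_fn fn)
{
    assert(r != NULL && type != NULL && fn != NULL);
    if (r->used == SK_MAX_COMPREGS)
        return -1;
    r->slots[r->used].type = type;
    r->slots[r->used].fn   = fn;
    r->used++;
    return 0;
}

/* Look up "key" and call that compiler; -1 means no such compreg. */
static int sk_invoke_keyed(const struct sk_registry *r, const char *key,
                           const char *source)
{
    size_t i;
    for (i = 0; i < r->used; i++)
        if (strcmp(r->slots[i].type, key) == 0)
            return r->slots[i].fn(source);
    return -1;
}

/* Stand-ins for real compilers: they only report which one ran. */
static int sk_stub_pasm(const char *source) { (void)source; return 1; }
static int sk_stub_pir(const char *source)  { (void)source; return 2; }
```

Registering the two stubs under "PASM" and "PIR" and calling sk_invoke_keyed() with either key dispatches to the matching stub; an unknown key such as "Ook" yields -1.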

I think such a vtable method would also be handy for the OO stuff:

callmethod P1, "foo"

which would translate to an invoke_keyed() on the object. For methods
known at compile time, the HL could spit out

callmethod P1, n

which would then be invoke_keyed_int().

2) The return value of the compile ops should be a pointer to a bytecode
segment, already in the interpreter and ready for calling.
So what should a "code_segment_PMC" look like, and how should the
structure in the packfile be defined?
Just an array of code pointers containing byte_code and byte_code_size?
The code_segment_PMC would probably be a [subclass of]? Sub.pmc, which
can then be invoked for actually evalling the code.

Comments welcome,
leo

Jerome Quelin

Jan 15, 2003, 12:29:49 PM
to Leopold Toetsch, perl6-i...@perl.org
Leopold Toetsch wrote:
> 1)
> The call function to the compiler/assembler is kept as a NCI. Better
> would be a subclass of NCI (Compiler.pmc or so), which provides
> invoke_keyed(key, next)

Hmm, I don't know what a NCI is. Where (which files) can I find
information about them?


Jerome
--
jqu...@mongueurs.net

Dan Sugalski

Jan 15, 2003, 11:31:14 AM
to perl6-i...@perl.org, bugs-bi...@netlabs.develooper.com
At 8:27 PM +0000 1/14/03, Leopold Toetsch (via RT) wrote:
># New Ticket Created by Leopold Toetsch
># Please include the string: [perl #20315]
># in the subject line of all future correspondence about this issue.
># <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=20315 >
>
>
>Attached is a first try towards eval.
>
>- interpreter has a new data member Parrot_compreg_hash
>- parrot registers the PASM1 type i.e. what PDB_eval can parse
>- the new B<compile> opcode (ab)uses nci to build a function for calling
>PDB_eval
>- nci is extended (jit/i386 only), to understand an 'I' param as interpreter
>- the string is evaled immediately, we don't have multiple byte code
>segments yet
>
>No registers, which nci uses, are preserved, no error checking and so
>on, but works ;-)

Yow, Cool! We *have* to get IMCC built into parrot now.

>Some questions arise here:
>- Should the B<compreg> opcode also have a form with a label to build
>PASM compilers, ook?

I think you've confused me here. (No, I'm wrong--you've definitely
confused me here) More explanation?

>- is using the NCI interface ok for evals purpose?

Sure. We can rejig it later if we need to, but I expect most
compilers will involve a trip into C code, so that's fine.

>- how should a byte code segment (PMC) look like?

Ah, the big question. I'm not quite sure yet--let's try and work that
out while I'm churning over objects.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Leopold Toetsch

Jan 15, 2003, 2:12:36 PM
to Jerome Quelin, perl6-i...@perl.org

Native Call Interface. See nci.c, build_nativecall.pl, classes/nci.pmc,
the docs and Dan's announcement on the list.

> Jerome

HTH
leo

Leopold Toetsch

Jan 15, 2003, 4:16:13 PM
to Dan Sugalski, perl6-i...@perl.org, bugs-bi...@netlabs.develooper.com
Dan Sugalski wrote:

> At 8:27 PM +0000 1/14/03, Leopold Toetsch (via RT) wrote:

> Yow, Cool! We *have* to get IMCC built into parrot now.


You do get this wrong - always ;-)

imcc = parrot + assemble.pl - pre-processor + PIR-assembler +
optimizer/10 #yet & now & already

With the help of assemble.pl -E (preprocess only, spitting out PASM)
imcc runs *all* parrot tests.

>> - Should the B<compreg> opcode also have a form with a label to build
>> PASM compilers, ook?

> I think you've confused me here. (No, I'm wrong--you've definitely
> confused me here) More explanation?


I was thinking of assembler/compilers implemented in PASM, as
languages/ook/ook.pasm is a "ook" compiler. It could register a compreg
with type "Ook", which uses the registered "PASM" compiler to run ook
code ;-)


>> - is using the NCI interface ok for evals purpose?

> Sure. We can rejig it later if we need to, but I expect most compilers
> will involve a trip into C code, so that's fine.


Ok, then this needs some tweaking. I used the signature 'I' for pushing a
Parrot_Interp. The return value still needs work.


>> - how should a byte code segment (PMC) look like?

> Ah, the big question. I'm not quite sure yet--let's try and work that
> out while I'm churning over objects.


I had a closer look at struct PackFile. I think we have several
possibilities to actually eval()/invoke() the code, depending on the HL:
- new interpreter, nothing shared (unlikely)
- new interpreter, context shared - meaning also constants
- same interpreter, everything shared

So it seems, that for multiple code segments, we'll have to take the
PackFile_ConstTable out of the structure and include
file/line/debug/whatever information. This would look like:

packfile aka interpreter->code:
- constants
- code_segment[]
- byte_code
- byte_code_size
- [ more of current packfile ]
- filename
- lines[]
- [ more additional stuff ]
- prederefed_code
- jit_info (jitted code)
- fixups

The return value of B<compile> would then be a pointer to such a
code_segment.

BTW a PackFile_Constant should be a union IMHO, currently each type has
its own storage.
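
A minimal sketch of the difference, assuming three constant kinds. The type and member names (sk_constant, sk_constant_separate, and so on) are invented for illustration and are not the actual PackFile_Constant definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant kinds; the tags are illustrative only. */
typedef enum { SK_CONST_INT, SK_CONST_NUM, SK_CONST_STR } sk_const_kind;

/* Union layout: a tag plus one shared slot, so the size is set by the
 * largest member rather than by the sum of all members. */
typedef struct {
    sk_const_kind kind;
    union {
        long        integer;
        double      number;
        const char *string;
    } u;
} sk_constant;

/* "Each type has its own storage" layout, for comparison. */
typedef struct {
    sk_const_kind kind;
    long          integer;
    double        number;
    const char   *string;
} sk_constant_separate;

/* Constructors set the tag and the matching union member together. */
static sk_constant sk_make_int(long v)
{
    sk_constant c;
    c.kind = SK_CONST_INT;
    c.u.integer = v;
    return c;
}

static sk_constant sk_make_num(double v)
{
    sk_constant c;
    c.kind = SK_CONST_NUM;
    c.u.number = v;
    return c;
}

static sk_constant sk_make_str(const char *v)
{
    sk_constant c;
    c.kind = SK_CONST_STR;
    c.u.string = v;
    return c;
}
```

The reader of a constant has to check kind before touching u, which is the price paid for the smaller per-constant footprint.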

leo

Leopold Toetsch

Jan 17, 2003, 4:30:13 AM
to Dan Sugalski, perl6-i...@perl.org, boem...@physik.uni-kl.de
Leopold Toetsch wrote:

> So it seems, that for multiple code segments, we'll have to take the
> PackFile_ConstTable out of the structure and include
> file/line/debug/whatever information. This would look like:
>
> packfile aka interpreter->code:
> - constants
> - code_segment[]
> - byte_code
> - byte_code_size
> - [ more of current packfile ]
> - filename
> - lines[]
> - [ more additional stuff ]
> - prederefed_code
> - jit_info (jitted code)
> - fixups


The debug stuff should have its own segment, linked to the corresponding
code segment.


> The return value of B<compile> would then be a pointer to such a
> code_segment.
>
> BTW a PackFile_Constant should be a union IMHO, currently each type has
> its own storage.


As a first step towards multiple code segments I applied Juergen
Boemmels' patch #18056 locally. With a little rediffing and tweaking, this
runs all tests, and imcc's packout() for writing pbc files works too.
Only packdump is broken, but it didn't do very much anyway.

So people, please have a look at this patch and its description.
With this patch in place, it shouldn't be too hard to have multiple
code segments: for now only in memory, and later, by extending the PBC
file format, on disk as well.


leo

Leopold Toetsch

Jan 18, 2003, 7:52:42 AM
to Dan Sugalski, perl6-i...@perl.org, Juergen Boemmels
Dan Sugalski wrote:

> At 8:27 PM +0000 1/14/03, Leopold Toetsch (via RT) wrote:

>> - how should a byte code segment (PMC) look like?

> Ah, the big question. I'm not quite sure yet--let's try and work that
> out while I'm churning over objects.

I have it ready.
It's based on the packfile patch #18056 by Juergen Boemmels. On top of
this patch, it was quite easy to implement multiple code segments.

A code segment is this:

struct PackFile_ByteCode {
    struct PackFile_Segment base;    /* name, byte_count ... */
    opcode_t *code;
    void **prederef_code;            /* The predereferenced code */
    void *jit_info;                  /* JITs data */
    struct PackFile_ByteCode *prev;  /* was executed previous */
};

And there are some additional functions to deal with code segments:

struct PackFile_ByteCode *Parrot_new_eval_cs(struct Parrot_Interp *);
struct PackFile_ByteCode *Parrot_switch_to_cs(struct Parrot_Interp *,
                                              struct PackFile_ByteCode *);
/* pop and destroy cs */
void Parrot_pop_cs(struct Parrot_Interp *);

The eval code segs are named EVAL_{$n}.
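
A rough sketch of how such a prev-linked stack of code segments could behave. This is not the patch itself: the sk_ types and functions are simplified stand-ins invented here, and details such as the switch function returning the previously active segment are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a code segment: only a name and the back link. */
struct sk_segment {
    char               name[32];  /* "EVAL_1", "EVAL_2", ... */
    struct sk_segment *prev;      /* segment that was running before this one */
};

struct sk_interp {
    struct sk_segment *cur;         /* currently executing segment */
    int                eval_count;  /* eval segments created so far */
};

/* Allocate a fresh, empty eval segment named EVAL_<n>; NULL on failure. */
static struct sk_segment *sk_new_eval_cs(struct sk_interp *in)
{
    struct sk_segment *seg = calloc(1, sizeof *seg);
    if (seg == NULL)
        return NULL;
    in->eval_count++;
    snprintf(seg->name, sizeof seg->name, "EVAL_%d", in->eval_count);
    return seg;
}

/* Make seg current, remembering the old segment in seg->prev.
 * Returns the segment that was current before the switch. */
static struct sk_segment *sk_switch_to_cs(struct sk_interp *in,
                                          struct sk_segment *seg)
{
    struct sk_segment *old = in->cur;
    assert(seg != NULL);
    seg->prev = old;
    in->cur = seg;
    return old;
}

/* Pop the current segment, resume the previous one, free the popped one.
 * Only valid for segments that came from sk_new_eval_cs(). */
static void sk_pop_cs(struct sk_interp *in)
{
    struct sk_segment *dead = in->cur;
    assert(dead != NULL);
    in->cur = dead->prev;
    free(dead);
}
```

Creating an eval segment, switching to it, and popping it again leaves the interpreter on the segment it started with, which is the property an eval call relies on.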

Further changes:
- a PackFile_Constant is now a union, which saves a lot of space in mem.
- eval (invoke) can run with *all* run cores, including trace
- reading packfiles is ready for multiple code segments in bytecode.
- code is ready for debug information, I'll first do it in imcc, which
could generate file/line info on the fly. Next would then be to extend
the PBC format.


diffstat of patch:

classes/compiler.pmc | 24 -
classes/eval.pmc | 16
debug.c | 35 +-
include/parrot/debug.h | 2
include/parrot/exceptions.h | 1
include/parrot/interpreter.h | 2
include/parrot/packfile.h | 104 ++++++
interpreter.c | 6
jit.c | 2
jit/i386/jit_emit.h | 6
languages/imcc/parser_util.c | 24 -
languages/imcc/pbc.c | 10
lib/Parrot/OpTrans/C.pm | 6
lib/Parrot/OpTrans/CGoto.pm | 6
lib/Parrot/OpTrans/Compiled.pm | 6
ops2c.pl | 3
packdump.c | 12
packfile.c | 688 ++<snip..>+++++++++++++++---
packout.c | 106 ++----
trace.c | 9
20 files changed, 891 insertions(+), 177 deletions(-)

A lot of files are touched because of the PackFile_Constant change.

Does anyone want to have a look at the patch, or should I put it in?

leo

Leopold Toetsch

Jan 20, 2003, 6:08:29 PM
to Dan Sugalski, perl6-i...@perl.org
Leopold Toetsch wrote:

> I have it ready.
> It's based on the packfile patch #18056 by Juergen Boemmels. On top of
> this patch, it was quite easy to implement multiple code segments.

And yet another follow-up from me.

Here is a proposal for inter code segment jumps:

The assembler (imcc) can recognize when a branch instruction goes to a
different code segment.

For such a branch, imcc generates this opcode sequence:

inter_cs
if i, ic # or whatever

The branch location "ic" is the index[1] into the fixuptable, which
contains codesegment/offset pairs.

The inter_cs instruction looks like:

opcode_t *cur = cur_opcode;    /* remember current */
                               /* = CUR_OPCODE */
opcode_t *pc;
DO_OP(pc, interpreter);        /* pc is new pc now */
if (pc_is_outof_bounds) {      /* branch taken */
    index = (pc - code_start) / sizeof(opcode_t) & ~0x80000000;
    interp->resume_code_seg = fixup_table[index].code_seg;
    interp->resume_offset = fixup_table[index].offs;
                               /* = restart OFFSET(x) */
    return 0;                  /* do a resume in new cs */
}
cur_opcode = pc;               /* branch not taken */
                               /* = goto ADDRESS(pc); */

[1] The branch offsets have the high bit set, to allow recognition
of out_of_bounds.
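
The high-bit convention from [1] can be sketched on its own, assuming 32-bit branch operands. The helper names and the sk_fixup layout below are hypothetical; only the idea of a fixup table holding code segment/offset pairs comes from the proposal above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed marker: a set top bit means "this operand is a fixup index". */
#define SK_NONLOCAL_BIT UINT32_C(0x80000000)

/* One fixup table entry: target code segment and offset inside it. */
struct sk_fixup {
    int      code_seg;
    uint32_t offs;
};

/* Tag a fixup index so it can be told apart from a local offset. */
static uint32_t sk_encode_nonlocal(uint32_t index)
{
    assert((index & SK_NONLOCAL_BIT) == 0);
    return index | SK_NONLOCAL_BIT;
}

static int sk_is_nonlocal(uint32_t operand)
{
    return (operand & SK_NONLOCAL_BIT) != 0;
}

/* Resolve a branch operand against the fixup table.
 * Returns 0 for a local branch (out untouched), 1 for a non-local branch
 * (out filled in), and -1 if the index is outside the table. */
static int sk_resolve(const struct sk_fixup *table, uint32_t n,
                      uint32_t operand, struct sk_fixup *out)
{
    uint32_t index;
    if (!sk_is_nonlocal(operand))
        return 0;
    index = operand & ~SK_NONLOCAL_BIT;
    if (index >= n)
        return -1;
    *out = table[index];
    return 1;
}
```

With a two-entry table, a plain operand such as 7 resolves as local, while sk_encode_nonlocal(1) resolves to the second entry's segment and offset.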

Help translating this to OpsFile macros is welcome, as are comments,
as always.

leo

Leopold Toetsch

Jan 20, 2003, 5:58:44 PM
to Dan Sugalski, perl6-i...@perl.org
Leopold Toetsch wrote:

> I have it ready.


> - code is ready for debug information, I'll first do it in imcc, which
> could generate file/line info on the fly. Next would then be to extend
> the PBC format.


And this is working too in imcc, including gdb-stepping into evaled code
segments.


> Does anyone want to have a look at the patch, or should I put it in?


Tomorrow is dead time, I really don't like to have 30 files different to
CVS :)


> leo

me2

Jason Gloudon

Jan 21, 2003, 12:27:12 PM
to Leopold Toetsch, Dan Sugalski, perl6-i...@perl.org
On Tue, Jan 21, 2003 at 12:08:29AM +0100, Leopold Toetsch wrote:

> Here is a proposal for inter code segment jumps:
>
> The assembler (imcc) can recognize when a branch ins goes to a different
> code segment.
>
> For such a branch, imcc generates this opcode seqence:
>
> inter_cs
> if i, ic # or whatever

Why do we need branches to go to different code segments ? I think the
expectation has been that control transfers between segments would have their
own op, because separate code segments would generally coincide with subs,
closures or at least blocks, that have specific entry points.

Maybe Dan could give us a hint about the closure/block/byte code segment
relationship.

--
Jason

Leopold Toetsch

Jan 21, 2003, 2:21:42 PM
to Jason Gloudon, Dan Sugalski, perl6-i...@perl.org
Jason Gloudon wrote:

> On Tue, Jan 21, 2003 at 12:08:29AM +0100, Leopold Toetsch wrote:
>
>
>>Here is a proposal for inter code segment jumps:
>>
>>The assembler (imcc) can recognize when a branch ins goes to a different
>>code segment.
>>
>>For such a branch, imcc generates this opcode seqence:
>>
>> inter_cs
>> if i, ic # or whatever

> Why do we need branches to go to different code segments ?


Because of this nasty little piece of code:
t/syn/eval_3.imc:

# #!/usr/bin/perl -w
# my $i= 5;
# LAB:
# $i++;
# eval("goto LAB if ($i==6)");
# print "$i\n";
#
# 7
#####

.sub _test
    I1 = 5
    $S0 = ".sub _e\nif I1 == 6 goto LAB\nend\n.end\n"
    compreg P2, "PIR"
    compile P0, P2, $S0
LAB:
    inc I1
    invoke
    print I1
    print "\n"
    end
.end

> ... I think the
> expectation has been that control transfers between segments would have their
> own op, because separate code segments would generally coincide with subs,
> closures or at least blocks, that have specific entry points.


The problem is that the "invoke" calls a different code segment, which
eventually branches back.
Parsing the eval code into the same code segment is not possible: we
can't expand code segments, or let's say not easily, except (if it is
possible at all) by invalidating prederef and JIT code and restarting.
IMHO this is too expensive a road to go down.

The proposed "inter_cs" op is such a special op, but suitable for all
branches. E.g. "inter_cs ; bsr _somewhere"


> Maybe Dan could give us a hint about the closure/block/byte code segment
> relationship.

I just can tell you from imcc's POV. Everything compiled in one sequence
(e.g. all subs of all files) could be one code segment. As soon as you
start running the code, and you want to compile again, produced bytecode
has to go into a different code segment. Of course, it could be more
fine grained, but not less.

leo

Jason Gloudon

Jan 21, 2003, 9:56:57 PM
to Leopold Toetsch, perl6-i...@perl.org
On Tue, Jan 21, 2003 at 08:21:42PM +0100, Leopold Toetsch wrote:

> >>For such a branch, imcc generates this opcode seqence:
> >>
> >> inter_cs
> >> if i, ic # or whatever
>
> >Why do we need branches to go to different code segments ?
>
>
> Because of this nasty piece of little code:
> t/syn/eval_3.imc:
>
> # #!/usr/bin/perl -w
> # my $i= 5;
> # LAB:
> # $i++;
> # eval("goto LAB if ($i==6)");

Ok. Having inter_cs call DO_OP just seems more involved than it has to be.
How about a single self-contained inter-segment jump instruction?

Since the compiler knows when a branch is non-local it can always break a
non-local conditional branch into a conditional local branch to a non-local
branch instruction.

For example

if i, nonlocal

... not taken

can be expressed as

if i, TAKEN

... not taken
...

TAKEN: inter_jump nonlocal

--
Jason

Leopold Toetsch

Jan 22, 2003, 3:43:06 AM
to Jason Gloudon, perl6-i...@perl.org
Jason Gloudon wrote:

> On Tue, Jan 21, 2003 at 08:21:42PM +0100, Leopold Toetsch wrote:
>
>
>># #!/usr/bin/perl -w
>># my $i= 5;
>># LAB:
>># $i++;
>># eval("goto LAB if ($i==6)");
>>
>
> Ok. Having inter_cs call DO_OP just seems more involved than it has to be.


Yep.


> How about a single self-contained inter-segment jump instruction.
>
> Since the compiler knows when a branch is non-local it can always break a
> non-local conditional branch into a conditional local branch to a non-local
> branch instruction.


This would mean rewriting the branch target to point to a location
after the end of the current sub (or end of program).

if i, non_local1

would become

if i, taken1
...

end/ret # whatever
taken1: inter_cs_jump non_local1
...

Yep. This seems much simpler. I'll try this approach.
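
The rewrite can be sketched as a small pass over an instruction array. This is only an illustration under stated assumptions: the sk_ins representation, the op names and sk_split_nonlocal() are invented here and do not reflect imcc's real data structures, and the input is assumed to end with end/ret so that execution never falls through into the appended jumps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified instruction set. */
enum sk_op { SK_IF, SK_OTHER, SK_END, SK_INTER_JUMP };

struct sk_ins {
    enum sk_op op;
    int        target;    /* local: instruction index; non-local: fixup id */
    int        nonlocal;  /* 1 if target names another code segment */
};

/* Turn every non-local conditional branch into a local branch to an
 * inter-segment jump appended after the existing code.
 * code has room for cap instructions, of which len are in use.
 * Returns the new length, or -1 (code untouched) if cap is too small. */
static int sk_split_nonlocal(struct sk_ins *code, int len, int cap)
{
    int i, needed = 0, out = len;

    /* First pass: count, so a failure leaves the code unmodified. */
    for (i = 0; i < len; i++)
        if (code[i].op == SK_IF && code[i].nonlocal)
            needed++;
    if (len + needed > cap)
        return -1;

    /* Second pass: append one jump per branch and retarget the branch. */
    for (i = 0; i < len; i++) {
        if (code[i].op == SK_IF && code[i].nonlocal) {
            code[out].op       = SK_INTER_JUMP;
            code[out].target   = code[i].target;  /* keep the fixup id */
            code[out].nonlocal = 1;
            code[i].target   = out;               /* now a local branch */
            code[i].nonlocal = 0;
            out++;
        }
    }
    return out;
}
```

For a three-instruction sub whose first instruction is a non-local "if", the pass appends one jump at index 3 and retargets the "if" to it, so only the appended instruction needs to know about other code segments.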

Thanks for your input,
leo
