Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Tue, 11 Feb 2003 10:49:14 +0100
Local: Tues, Feb 11 2003 4:49 am
Subject: Re: [CVS ci] CGP - CGoto Prederefed runloop

Leopold Toetsch wrote:
> Nicholas Clark wrote:
>> Inside a cgoto core have 1 extra op - enter JITted section.
> Or go the other way round: Run from JIT. If there is a sequence of non
> JITable ops, convert these to a CGP section, which returns to JIT when
> finished. This would save a lot of function calls to jit_normal_op.

I have thought about this, with a little help from ddd:

1)
opcode_t *
cgp_core(opcode_t *cur_op, struct Parrot_Interp *interpreter)
{
#ifdef __GNUC__
     register opcode_t *cur_opcode asm ("esi") = cur_op;
#else
     opcode_t *cur_opcode = cur_op;
#endif

Even compiled without optimization, this produces almost the same code
quality and speed as -O3.

The cur_opcode is in %esi, all operand access is done like in the posted
-O3 example:
$ parrot -P mops.pbc
82.019517 M op/s
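
For readers unfamiliar with the technique, here is a minimal sketch of a
computed-goto, prederefed run loop. It is not Parrot's cgp_core: the
three-op instruction set and the names toy_core, toy_prederef and
toy_run_loop are made up for illustration, and it relies on the
GCC/Clang "labels as values" extension. Prederefing replaces each opcode
number with the address of its label, so dispatch is a single indirect
jump through cur_opcode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-op instruction set, for illustration only. */
enum { OP_END, OP_ADD_IMM, OP_DEC_JNZ, OP_COUNT };

typedef intptr_t cell_t;  /* holds either an integer or a label address */

/* Number of cells (opcode plus operands) each op occupies. */
static const size_t toy_width[OP_COUNT] = { 1, 3, 3 };

/*
 * Runs a prederefed program. Called with code == NULL it only returns
 * the table of label addresses, so the caller can prederef a program.
 */
static void **toy_core(cell_t *code, cell_t *regs)
{
    static void *labels[OP_COUNT] = { &&op_end, &&op_add_imm, &&op_dec_jnz };
    cell_t *cur_opcode = code;

    if (code == NULL)
        return labels;

    goto *(void *)*cur_opcode;            /* dispatch the first op */

op_add_imm:                               /* add_imm reg, value */
    regs[cur_opcode[1]] += cur_opcode[2];
    cur_opcode += 3;
    goto *(void *)*cur_opcode;

op_dec_jnz:                               /* dec_jnz reg, offset */
    if (--regs[cur_opcode[1]] != 0)
        cur_opcode += cur_opcode[2];      /* relative branch */
    else
        cur_opcode += 3;
    goto *(void *)*cur_opcode;

op_end:
    return NULL;
}

/* Replaces every opcode number with the address of its label. */
static void toy_prederef(cell_t *code, size_t len)
{
    void **labels = toy_core(NULL, NULL);
    size_t i = 0;

    while (i < len) {
        cell_t op = code[i];
        code[i] = (cell_t)labels[op];
        i += toy_width[op];
    }
}

/* Adds 2 to regs[0] once per iteration, count times (count must be >= 1). */
static cell_t toy_run_loop(cell_t count)
{
    cell_t regs[2] = { 0, count };
    cell_t code[] = {
        OP_ADD_IMM, 0, 2,      /* regs[0] += 2 */
        OP_DEC_JNZ, 1, -3,     /* if (--regs[1] != 0) jump back to the add */
        OP_END
    };

    toy_prederef(code, sizeof code / sizeof code[0]);
    toy_core(code, regs);
    return regs[0];
}
```

Calling toy_run_loop(5) runs the add five times. The sketch leaves
register pinning to the compiler; the asm ("esi") declaration above is
what lets emitted JIT code know where cur_opcode lives.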

2) There is one new opcode:

    B<jmp_to_eip inconst INT>

The argument is the native_ptr in the JIT code: the address to return to
after a section of non-JITed code.

    goto **(cur_opcode + 1);

The address is filled in by the JIT emit functions.

3) Parrot_jit_begin() emits code to call cgp_core (thus setting up the
same stack frame as cgp_core) and to B<jmp_to_eip> back to the address
after the function call.

4) When there is a sequence of more than one non-JITed op,
Parrot_jit_normal_op emits code to calculate %esi (the *cur_opcode) in
the prederefed jump table and *jumps* there. The section ends with the
jmp_to_eip instruction described above. This implies that after a
non-JITed section, the JITed section must be at least two opcodes long,
to leave room to fill in this jump.
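
The control flow of 2) to 4) can be imitated in plain C, as a rough
sketch only. Real JIT output is machine code living outside cgp_core;
here ordinary C statements inside one function stand in for the JITed
parts, because C cannot jump to a label in another function. The name
hybrid_run and the ops op_double and op_inc are hypothetical. The
"native" code computes cur_opcode and jumps into the prederefed section,
which ends in jmp_to_eip, whose operand was patched with the native
resume address.

```c
#include <stddef.h>

/* Toy stand-in for a JITed run with one embedded non-JITed section. */
static long hybrid_run(long x)
{
    void *section[4];      /* prederefed code of the non-JITed section */
    void **cur_opcode;
    long acc = x;

    /* "JIT emit time": fill in the section and patch the return address. */
    section[0] = &&op_double;
    section[1] = &&op_inc;
    section[2] = &&op_jmp_to_eip;
    section[3] = &&native_resume;   /* operand of jmp_to_eip */

    /* "JITed" code, first part. */
    acc += 10;

    /* Enter the non-JITed section: compute cur_opcode and jump there. */
    cur_opcode = section;
    goto **cur_opcode;

op_double:
    acc *= 2;
    cur_opcode += 1;
    goto **cur_opcode;

op_inc:
    acc += 1;
    cur_opcode += 1;
    goto **cur_opcode;

op_jmp_to_eip:
    /* Same expression as in 2): jump to the address held in the operand. */
    goto **(cur_opcode + 1);

native_resume:
    /* "JITed" code, second part. */
    return acc - 3;
}
```

No function call or return happens between the native parts and the
prederefed ops; every transition is a jump, which is the point of the
scheme.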

5) Non-JITed branches do not fit very nicely in this scheme, but there
are several possible ways to handle these:
- make all branches JITted
- generate another core (cgp_jit_core), which does the right thing
- always emit code by Parrot_jit_cpcf_op() for the last opcode in the
section (cgp_core would only be used if the non-JITed section is >= 3
instructions)
- if both ends of the branch are non-JITed sections, do nothing: just
stay in the cgp_core and patch the ends of these branches to return to
JIT (brrr)
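
The length thresholds mentioned in 4) and 5) amount to scanning the
opcode stream for runs of non-JITed ops. A small sketch, assuming a
hypothetical per-op flag array (the names count_cgp_sections and
example_sections are not part of Parrot): it counts how many runs are
long enough to be handed to cgp_core for a given minimum length.

```c
#include <stddef.h>

/*
 * Counts runs of consecutive non-JITable ops that are at least min_run
 * ops long. jitable[i] is nonzero if op i has a JIT implementation.
 */
static size_t count_cgp_sections(const int *jitable, size_t n, size_t min_run)
{
    size_t sections = 0;
    size_t run = 0;

    for (size_t i = 0; i <= n; i++) {
        if (i < n && !jitable[i]) {
            run++;                        /* still inside a non-JITed run */
        } else {
            if (run > 0 && run >= min_run)
                sections++;               /* run just ended and qualifies */
            run = 0;
        }
    }
    return sections;
}

/* Example stream with non-JITed runs of length 2, 1 and 3. */
static size_t example_sections(size_t min_run)
{
    static const int jitable[] = { 1, 0, 0, 1, 0, 1, 0, 0, 0 };

    return count_cgp_sections(jitable, sizeof jitable / sizeof jitable[0],
                              min_run);
}
```

With a minimum of 2, as in 4), two of the three runs qualify; with a
minimum of 3, as in the Parrot_jit_cpcf_op() variant, only the last one
does.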

6) and finally the prederefed B<end> opcode gets jumped to by code
emitted from Parrot_end_jit, to clean up the cgp_core stack frame and
return from JIT.

So we would have no function call overhead and would get the best
performance by combining the two fastest run cores.

This approach would of course need some architecture/compiler specific
hacks, but JIT is such a hack anyway. OTOH it is almost totally
encapsulated in the architecture jit file, so it *can* be implemented
but there is no need to do so.

Comments welcome,
leo


 