Prederefed run cores
Leopold Toetsch  
Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Thu, 28 Oct 2004 11:13:51 +0200
Local: Thurs, Oct 28 2004 5:13 am
Subject: Prederefed run cores
With the indirect register addressing all prederefed run cores
(Prederefed, CGP, Switch) are currently not functional, as these run
cores have absolute addresses in the prederefed code.

I see two ways to fix it:

1) use frame pointer relative addressing:
    + prederefed code is usable by different threads too
    - ~4 times increase in code size of core_ops_*.{c,o} [1]

2) Re-prederef on function calls, if frame pointer differs
    + no impact on code size
    - needs precise code length of functions
    - threads need distinct prederefed code
    - possibly slower than 1)

Comments welcome,
leo

[1] due to absolute addressing a constant argument and a register
argument have the same code, set_i_ic and set_i_i are the same.
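
Option 1) above can be sketched roughly as follows. This is a minimal illustration, not Parrot's actual code: the `Frame` layout, `PrederefAdd`, and the helper names are all hypothetical. The prederefed operands are byte offsets into a register frame instead of absolute pointers, so the same prederefed code can run against any frame, including one owned by another thread.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;

/* Hypothetical register frame: one per sub invocation or thread. */
typedef struct {
    INTVAL int_reg[4];
} Frame;

/* Byte offset of integer register n inside a frame. */
static size_t reg_offset(int n)
{
    return offsetof(Frame, int_reg) + (size_t)n * sizeof(INTVAL);
}

/* Resolve a prederefed operand against the current frame. */
static INTVAL *reg_at(Frame *frame, size_t offset)
{
    return (INTVAL *)((char *)frame + offset);
}

/* "add I0, I1, I2" stored as three frame-relative offsets. */
typedef struct {
    size_t dst, lhs, rhs;
} PrederefAdd;

static void run_add(const PrederefAdd *op, Frame *frame)
{
    *reg_at(frame, op->dst) = *reg_at(frame, op->lhs) + *reg_at(frame, op->rhs);
}

/* Run one prederefed op, unchanged, against two different frames. */
static INTVAL demo_shared_code(INTVAL a, INTVAL b, INTVAL c, INTVAL d)
{
    PrederefAdd op = { reg_offset(0), reg_offset(1), reg_offset(2) };
    Frame f1 = { {0, a, b, 0} };
    Frame f2 = { {0, c, d, 0} };

    run_add(&op, &f1);
    run_add(&op, &f2);
    assert(f1.int_reg[0] == a + b);
    return f2.int_reg[0];
}
```

With absolute addresses, `op` would have pointed into `f1` only, and running it for `f2` would have required prederefing the code again, which is option 2).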


Dan Sugalski  
Newsgroups: perl.perl6.internals
From: d...@sidhe.org (Dan Sugalski)
Date: Thu, 28 Oct 2004 08:54:56 -0400
Local: Thurs, Oct 28 2004 8:54 am
Subject: Re: Prederefed run cores
At 11:13 AM +0200 10/28/04, Leopold Toetsch wrote:

>With the indirect register addressing all prederefed run cores
>(Prederefed, CGP, Switch) are currently not functional, as these run
>cores have absolute addresses in the prederefed code.

>I see two ways to fix it:

>1) use frame pointer relative addressing:
>    + prederefed code is usable by different threads too
>    - ~4 times increase in code size of core_ops_*.{c,o} [1]

>2) Re-prederef on function calls, if frame pointer differs
>    + no impact on code size
>    - needs precise code length of functions
>    - threads need distinct prederefed code
>    - possibly slower than 1)

Or 3) Toss the prederef stuff entirely.
--
                                Dan

--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                         have teddy bears and even
                                       teddy bears get drunk


Duraid Madina  
Newsgroups: perl.perl6.internals
From: dur...@octopus.com.au (Duraid Madina)
Date: Thu, 28 Oct 2004 22:27:32 +0900
Local: Thurs, Oct 28 2004 9:27 am
Subject: Re: Prederefed run cores

Dan Sugalski wrote:
> Or 3) Toss the prederef stuff entirely.

Which might not be quite as bad as it sounds: on at least one "strange
platform" (IA64 HP-UX) the native C compiler gets the switch core
running faster than the prederef core! (!)

        Duraid


Leopold Toetsch  
Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Thu, 28 Oct 2004 17:36:43 +0200
Local: Thurs, Oct 28 2004 11:36 am
Subject: Re: Prederefed run cores

Dan Sugalski wrote:
> At 11:13 AM +0200 10/28/04, Leopold Toetsch wrote:

>> 1) use frame pointer relative addressing:
>>    + prederefed code is usable by different threads too
>>    - ~4 times increase in code size of core_ops_*.{c,o} [1]

>> 2) Re-prederef on function calls, if frame pointer differs
>>    + no impact on code size
>>    - needs precise code length of functions
>>    - threads need distinct prederefed code
>>    - possibly slower than 1)

> Or 3) Toss the prederef stuff entirely.

Well, the prederefed function core (parrot -P) is for sure not
necessary. That still leaves CGP and the switched core, which are
prederefed too. CGP is by far the fastest run core for JIT-less
architectures, if CGoto is available. The switched core can of course
run w/o prederef too.

But one thing is nice about prederef: it's by far the simplest way to
create a safe run core that verifies opcode arguments. This could of
course be done w/o predereferencing afterwards, but while you are
checking function args, predereferencing them costs almost nothing.
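
As a rough sketch of that idea (assumptions: the `ArgKind` signature table, the register count, and `prederef_checked` are all made up for illustration and are not Parrot's API), a prederef pass that already walks every operand can validate it in the same loop:

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;
enum { NUM_INT_REGS = 32 };   /* assumed register count */

/* Hypothetical operand kinds for a toy op signature. */
typedef enum { ARG_INT_REG, ARG_INT_CONST } ArgKind;

/* Translate raw operands into prederefed form, checking each one.
 * Register numbers become byte offsets; constants are copied unchanged.
 * Returns 0 on success, -1 if a register number is out of range. */
static int prederef_checked(const ArgKind *sig, const INTVAL *raw,
                            size_t nargs, INTVAL *out)
{
    assert(sig && raw && out);
    for (size_t i = 0; i < nargs; i++) {
        if (sig[i] == ARG_INT_REG) {
            if (raw[i] < 0 || raw[i] >= NUM_INT_REGS)
                return -1;                      /* reject bad bytecode */
            out[i] = raw[i] * (INTVAL)sizeof(INTVAL);
        } else {
            out[i] = raw[i];
        }
    }
    return 0;
}
```

The range check adds one comparison per register operand to a loop that has to visit that operand anyway, which is the "almost zero cost" argument.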

Using option 1) above isn't really complicated. The problem we have is
code size and opcode count, which is a problem with the CGoto core too.

Not too long ago I proposed tossing all opcode variants with
constants and leaving just:

   set I, Ic
   set N, Nc
   set S, Sc

Immediate constants aren't really that useful with RISC CPUs. You might
have a look at e.g. jit/arm/jit_emit.h:459 ff.

leo


Leopold Toetsch  
Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Thu, 28 Oct 2004 17:01:11 +0200
Local: Thurs, Oct 28 2004 11:01 am
Subject: Re: Prederefed run cores

Duraid Madina wrote:
> Dan Sugalski wrote:

>> Or 3) Toss the prederef stuff entirely.

> Which might not be quite as bad as it sounds: on at least one "strange
> platform" (IA64 HP-UX) the native C compiler gets the switch core
> running faster than the prederef core! (!)

Err, the switched core *is* a prederefed core.

>     Duraid

leo

Leopold Toetsch  
Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Mon, 01 Nov 2004 11:12:50 +0100
Local: Mon, Nov 1 2004 5:12 am
Subject: Re: Prederefed run cores

Leopold Toetsch wrote:
> Dan Sugalski wrote:

>> At 11:13 AM +0200 10/28/04, Leopold Toetsch wrote:

>>> 1) use frame pointer relative addressing:
>>>    + prederefed code is usable by different threads too
>>>    - ~4 times increase in code size of core_ops_*.{c,o} [1]

I've now committed this case 1) as a fix for prederefed run cores. It's
unoptimized currently. make fulltest is passing again here.

>> Or 3) Toss the prederef stuff entirely.

> Well, the prederefed function core (parrot -P) is for sure not
> necessary.

Patches welcome to remove the plain prederefed function core
F<ops/core_ops_prederef.*>. F<lib/Parrot/OpTrans/CPrederef.pm> is still
needed as an abstract base class of CGP.pm and CSwitch.pm but can be
cleaned up too.

I'd still like to keep the CGP and CSwitch run cores. The latter as the
safe run core with argument checking and as a fallback if CGOTO isn't
available on a platform. The former as an extension for JIT to run
non-JITted opcodes, similar to the current JIT_CGP stuff on i386 but in
a more general way:

For a sequence of non-JITted opcodes: create a copy of the byte-code of
these non-JITted opcodes and append one opcode that returns to JIT. Then
fill it with the CORE_ops_prederef__ opcode. Generate code to call this
piece of code via cgp_core().

leo


Dan Sugalski  
Newsgroups: perl.perl6.internals
From: d...@sidhe.org (Dan Sugalski)
Date: Mon, 1 Nov 2004 08:57:46 -0500
Local: Mon, Nov 1 2004 8:57 am
Subject: Re: Prederefed run cores
At 11:12 AM +0100 11/1/04, Leopold Toetsch wrote:

While I want to keep the switch core, I'm still not seeing the need
for prederef with it. I'm presuming this crept in at some point and
just needs un-creeping?
--
                                Dan

--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                         have teddy bears and even
                                       teddy bears get drunk


Leopold Toetsch  
Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Mon, 1 Nov 2004 15:41:15 +0100
Local: Mon, Nov 1 2004 9:41 am
Subject: Re: Prederefed run cores

Dan Sugalski <d...@sidhe.org> wrote:
> While I want to keep the switch core, I'm still not seeing the need
> for prederef with it. I'm presuming this crept in at some point and
> just needs un-creeping?

Using prederef for switch has one advantage: it's a bit faster. Before
the indirect register addressing it had another one: it took only a
quarter of the code size, because the constant and register variants
collapsed into one switch case.
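
To illustrate why the variants collapsed (a toy sketch only; `set_i_any` and the array layout are hypothetical, not Parrot source): with absolute addressing every prederefed operand is just a pointer, whether it points into the register file or into the constant table, so one handler body serves both `set_i_i` and `set_i_ic`.

```c
#include <assert.h>

typedef long INTVAL;

/* With absolute addressing, prederef resolves every operand to a pointer:
 * a register operand points into the register file, a constant operand
 * points into the constant table. One handler serves both variants. */
static void set_i_any(void **args)
{
    *(INTVAL *)args[0] = *(const INTVAL *)args[1];
}

static INTVAL demo_collapsed(INTVAL reg_value, INTVAL const_value)
{
    INTVAL regs[2]   = { 0, reg_value };
    INTVAL consts[1] = { const_value };

    void *set_i_i[2]  = { &regs[0], &regs[1] };    /* set I0, I1      */
    void *set_i_ic[2] = { &regs[0], &consts[0] };  /* set I0, <const> */

    set_i_any(set_i_i);
    assert(regs[0] == reg_value);
    set_i_any(set_i_ic);
    return regs[0];
}
```

Once register operands become offsets that must be added to a frame pointer while constants stay as they are, the two variants need different handler bodies again, which is where the code size increase comes from.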

There is of course no need to prederef the switched core.

Maybe benchmarking the two variants yields a final answer.

leo


Leopold Toetsch  
Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Mon, 1 Nov 2004 16:13:00 +0100
Subject: Re: Prederefed run cores

Dan Sugalski <d...@sidhe.org> wrote:
> Or 3) Toss the prederef stuff entirely.

And here is why I want to keep the CGP core:

  sub_i_i_i

    0x81bbef0 <cgp_core+33488>:   mov    0x4(%esi),%ecx
    0x81bbef3 <cgp_core+33491>:   mov    0x8(%esi),%edx
    0x81bbef6 <cgp_core+33494>:   mov    0xc(%esi),%eax
    0x81bbef9 <cgp_core+33497>:   add    $0x10,%esi
    0x81bbefc <cgp_core+33500>:   mov    (%eax,%edi,1),%eax
    0x81bbeff <cgp_core+33503>:   mov    (%edx,%edi,1),%edx
    0x81bbf02 <cgp_core+33506>:   sub    %eax,%edx
    0x81bbf04 <cgp_core+33508>:   mov    %edx,(%ecx,%edi,1)
    0x81bbf07 <cgp_core+33511>:   jmp    *(%esi)

  if_i_ic

    0x81b4152 <cgp_core+1330>:    mov    0x4(%esi),%eax
    0x81b4155 <cgp_core+1333>:    cmpl   $0x0,(%eax,%edi,1)
    0x81b4159 <cgp_core+1337>:    je     0x81b4167 <cgp_core+1351>
    0x81b415b <cgp_core+1339>:    mov    0x8(%esi),%eax
    0x81b415e <cgp_core+1342>:    mov    (%eax),%eax
    0x81b4160 <cgp_core+1344>:    shl    $0x2,%eax
    0x81b4163 <cgp_core+1347>:    add    %eax,%esi
    0x81b4165 <cgp_core+1349>:    jmp    *(%esi)
    0x81b4167 <cgp_core+1351>:    add    $0xc,%esi
    0x81b416a <cgp_core+1354>:    jmp    *(%esi)

%esi ... cur_opcode
%edi ... register frame pointer

A register access takes only 2 CPU instructions:

mov 8(%esi), %edx    # cur_opcode[2], i.e. offset of REG_INT(x)
mov (%edx, %edi, 1), %edx  # get *(base + offset)

That's all.
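
For readers who prefer C to disassembly, here is a toy model of what a handler like sub_i_i_i might look like at source level. It is a sketch under assumptions, not the generated core_ops code: it uses the GNU C labels-as-values extension (so it needs GCC or Clang), and `run_sub_demo`, the code layout, and the `op_end` terminator are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t INTVAL;

/* Toy CGP-style core. Each op is a handler address followed by the
 * byte offsets of its register operands. */
static INTVAL run_sub_demo(INTVAL a, INTVAL b)
{
    INTVAL regs[3] = { 0, a, b };
    char *frame = (char *)regs;          /* plays the role of %edi */
    void *code[] = {
        &&op_sub_i_i_i,                  /* sub I0, I1, I2 */
        (void *)(uintptr_t)(0 * sizeof(INTVAL)),
        (void *)(uintptr_t)(1 * sizeof(INTVAL)),
        (void *)(uintptr_t)(2 * sizeof(INTVAL)),
        &&op_end
    };
    void **cur_opcode = code;            /* plays the role of %esi */

    goto **cur_opcode;

op_sub_i_i_i:
    /* Per register operand: load the offset, then access base + offset. */
    *(INTVAL *)(frame + (uintptr_t)cur_opcode[1]) =
        *(INTVAL *)(frame + (uintptr_t)cur_opcode[2]) -
        *(INTVAL *)(frame + (uintptr_t)cur_opcode[3]);
    cur_opcode += 4;
    goto **cur_opcode;                   /* corresponds to jmp *(%esi) */

op_end:
    assert(cur_opcode == &code[4]);      /* stopped on the terminator */
    return regs[0];
}
```

Calling `run_sub_demo(10, 3)` runs one subtraction through the threaded dispatch and returns the result register.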

$ ./parrot -C mops.pasm
Iterations:    100000000
Estimated ops: 200000000
Elapsed time:  2.156002
M op/s:        92.764291

That's an Athlon 800 - 8.5 CPU instructions per Parrot instruction.

leo


Leopold Toetsch  
Newsgroups: perl.perl6.internals
From: l...@toetsch.at (Leopold Toetsch)
Date: Mon, 1 Nov 2004 17:26:51 +0100
Local: Mon, Nov 1 2004 11:26 am
Subject: Re: Prederefed run cores
FWIW the CGP sub_i_i_i opcode on the PowerBook

0x001048d4 <cgp_core+35652>:    lwz     r0,8(r30)
0x001048d8 <cgp_core+35656>:    lwz     r2,12(r30)
0x001048dc <cgp_core+35660>:    lwzx    r0,r27,r0
0x001048e0 <cgp_core+35664>:    lwzx    r2,r27,r2
0x001048e4 <cgp_core+35668>:    lwz     r9,4(r30)
0x001048e8 <cgp_core+35672>:    subf    r0,r2,r0
0x001048ec <cgp_core+35676>:    stwx    r0,r27,r9
0x001048f0 <cgp_core+35680>:    lwzu    r2,16(r30)
0x001048f4 <cgp_core+35684>:    mtctr   r2
0x001048f8 <cgp_core+35688>:    bctr

Only slightly longer, because of the branch sequence, but still quite
compact.

leo

