
[RfC] A New PMC Layout


Leopold Toetsch

May 22, 2003, 3:37:57 PM
to P6I, Dan Sugalski
First, two quotes from history:

*) I'd like to get access to pmc elements done via macros as well--I
think it's time to split PMCs into pieces, since we're definitely
trashing the cache in the common case with things we don't commonly
need. (Like the metadata, thread, and GC stuff) PMCs are 40 bytes on
my PPC system. Too much cache fluff.
-- Dan in "Change to the calling convention, and other fallout ..."

Small PMC (SPMC), a half sized PMC has double performance in stress.pasm
-- /me (ages ago)

So here is a short proposal for a new PMC layout:

1) SPMC - Small PMC
- holds integers, doubles and string-pointers[1]
- 16 byte size on i386 ( = sizeof(PObj) + sizeof(*vtable) )

2) PMC - our current PMC
- holds aggregates, Sub's and so on, everything that doesn't fit into a SPMC

3) EPMC (extended PMC) = SPMC->PMC
i.e. a small PMC with appended properties or SYNC info for multithreading,
which are kept in a normal PMC
- we allocate a new PMC located in an unmanaged smallobject pool
- SPMC (cache) data are moved to this PMC piece
- pmc->u.pmc_val points to this PMC piece
- the vtable entries of the SPMC take care of accessing data[2]
- vtable of the PMC could be used e.g. for PIO or who knows

[1] currently ("for hysterical raisins" (Dan)) a string is pointed to by
PMC->data. As a prerequisite, the string would first move to
PMC->u.string_val.

[2] some thoughts (Dan) WRT vtable layout would help here. Also the
objects patch (Dan) could be useful :-)
The Extended PMC thus would need a different vtable than the SPMC.

Comments welcome,
leo

PS: the SPMC (16 bytes) could have a still smaller optimized version
(MPMC - Mini PMC, 12 bytes, for integers and string/other ptrs only),
but this would make morphing an Int to a Num more expensive. It would
still be faster in the normal case, though, since such morphing can
mostly be avoided - all the more when types are known in advance.

(all sizes 32bit i386 of course)
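The three layouts above can be sketched as C structs. This is only an illustration: apart from vtable, flags, and u.pmc_val (which appear in the thread), the field names are guesses, not Parrot's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct VTABLE VTABLE;       /* opaque here */

/* the value "cache" */
typedef union UnionVal {
    int         int_val;
    double      num_val;
    void       *string_val;         /* STRING * in Parrot */
    struct PMC *pmc_val;            /* EPMC: points at the full-PMC piece */
} UnionVal;

/* 1) SPMC: just enough for scalars - header, vtable, value */
typedef struct SPMC {
    unsigned long flags;            /* simplified stand-in for the PObj header */
    VTABLE       *vtable;
    UnionVal      u;
} SPMC;

/* 2) PMC: everything that doesn't fit into an SPMC */
typedef struct PMC {
    SPMC  base;
    void *data;                     /* aggregate payload */
    void *metadata;                 /* properties */
    void *synchronize;              /* SYNC info for threading */
} PMC;

/* 3) EPMC: an SPMC whose u.pmc_val points at a PMC piece holding the
 *    moved-out cache data plus properties/SYNC. */
```

With 4-byte flags and pointers, this SPMC comes out at 16 bytes on 32-bit i386, matching the size quoted in the proposal.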

Dan Sugalski

May 22, 2003, 11:30:16 PM
to Leopold Toetsch, P6I
At 9:37 PM +0200 5/22/03, Leopold Toetsch wrote:
>First 2 cites from history:
>
>*) I'd like to get access to pmc elements done via macros as well--I
>think it's time to split PMCs into pieces, since we're definitely
>trashing the cache in the common case with things we don't commonly
>need. (Like the metadata, thread, and GC stuff) PMCs are 40 bytes on
>my PPC system. Too much cache fluff.
> -- Dan in "Change to the calling convention, and other fallout ..."
>
>Small PMC (SPMC), a half sized PMC has double performance in stress.pasm
> -- /me (ages ago)

Ah, this again. :)

I think we've gone over this enough in the past to draw a few conclusions:

1) We can't have PMCs that change size
2) We can't count on any particular PMC staying any particular type
3) We like compact arenas
4) We like cache-friendly structures

Part of the reason for the PMC/buffer split originally was to have a
smaller, less functional part that did just enough to perform its
function without having the baggage of a full variable, including
synchronization, GC pointers, and suchlike things.

Now, it'd be nice to have things denser. I'm pretty sure that, after
thrashing this over a few times, the only viable solution we came up
with was to shuffle off the 'extra' bits into an auxiliary structure
and embed a pointer to that structure in the main PMC body. That way
we get the cache density we're after.


>So here is a short proposal for a new PMC layout:
>
>1) SPMC - Small PMC
>- holds integers, doubles and string-pointers[1]
>- 16 byte size on i386 ( = sizeof(PObj) + sizeof(*vtable) )
>
>2) PMC - our current PMC
>- holds aggregates, Sub's and so on, everything that doesn't fit into a SPMC

I don't think this is worth it, given the possibility for PMCs to
morph. Either we don't bother with #1, or we thump things so it's
essentially what the Buffer was originally. (Which isn't far off
what it is now) But that's not really a PMC, so it doesn't really
count. A roundabout way of saying No to #1, I guess. :)

>3) EPMC (extended PMC) = SPMC->PMC
>i.e a small PMC with appended properties or SYNC for multi
>threading, which are kept in a normal PMC
>- we allocate a new PMC located in an unmanaged smallobject pool
>- SPMC (cache) data are moved to this PMC piece
>- pmc->u.pmc_val points to this PMC piece
>- the vtable entries of the SPMC take care of accessing data[2]
>- vtable of the PMC could be used e.g. for PIO or who knows

I don't think it's worth keeping the extended information in another
PMC, since I don't think it's worth having PMCs that have this
information and PMCs that don't. Getting it out of the main PMC
structure gets us the cache density, while guaranteeing that it
exists lets us skip a null pointer check every time we want to
use it.

In the extended info, I expect we should put at the very least the
property hash pointer and the sync info. The main structure should
have the pointer to the extended info (I know--Duh! :), the vtable
pointer, the cache info, the flags, and the data pointer. We could, I
suppose, toss the cache data or move it to the extended struct, in
which case the base PMC would be 16 bytes. If we ordered it vtable,
flags, data pointer, extended pointer we'd probably pair things up
well for cache access, since the vtable and flags will probably be
accessed together. (Though pairing the vtable and data pointer might
be better)
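That ordering could be sketched like this, assuming a guaranteed (never-NULL) extension structure; the name PMC_EXT and its fields are hypothetical, not Parrot's actual definitions:

```c
#include <assert.h>
#include <stddef.h>

typedef struct VTABLE VTABLE;

/* hypothetical extension struct, always allocated */
typedef struct PMC_EXT {
    void *property_hash;
    void *synchronize;              /* sync info */
    /* possibly the GC info and/or the value cache, if evicted */
} PMC_EXT;

/* 16 bytes on 32-bit i386 if the cache moves out */
typedef struct PMC {
    VTABLE       *vtable;           /* vtable + flags likely accessed together */
    unsigned long flags;
    void         *data;
    PMC_EXT      *ext;              /* guaranteed non-NULL: no check needed */
} PMC;
```

Because ext is guaranteed to exist, code can dereference pmc->ext->property_hash without a NULL test, which is the benefit claimed above.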

I'm not sure if the GC info should go in the main or extended struct.
We could leave it in the extended struct, which could let us do some
interesting tricks if the extended info lined up with the primary PMC
info. (Basically we'd have two parallel arrays, and PMC #1 always had
a pointer to extended info #1) We'd still want the pointer for speed,
but that guarantee could let a few other things make some assumptions.

>[2] some thoughts (Dan) WRT vtable layout would help here. Also the
>objects patch (Dan) could be useful :-)
>The Extended PMC thus would need a different vtable than the SPMC.

This, I don't want. I'd much rather have a constant vtable, so we
could make some validity assumptions in code.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Leopold Toetsch

May 23, 2003, 11:38:57 AM
to Dan Sugalski, P6I
Dan Sugalski wrote:


> In the extended info, I expect we should put at the very least the
> property hash pointer and the sync info. The main structure should have
> the pointer to the extended info (I know--Duh! :), the vtable pointer,
> the cache info, the flags, and the data pointer. We could, I suppose,
> toss the cache data or move it to the extended struct, in which case the
> base PMC would be 16 bytes.


But that would put the cache, i.e. int_val, num_val, string_val, into the
extended info, which defeats the cache density: the vtable and the accessed
data would then live in different areas. The data member is used far less
than the cache (for all scalars).

I can only see cache, flags, vtable and ptr to the extended structure
(20 bytes) as an alternative.


> I'm not sure if the GC info should go in the main or extended struct.


Extended, we need it mainly for aggregates.

> ... We could leave it in the extended struct, which could let us do some
> interesting tricks if the extended info lined up with the primary PMC
> info.


I don't see a reason to allocate an extended structure when it is
not used.


>> The Extended PMC thus would need a different vtable than the SPMC.
>
> This, I don't want. I'd much rather have a constant vtable, so we could
> make some validity assumptions in code.


Ok, then I'll try this:

- SPMC = current SPMC[1] + ptr to extended info
- the extended info holds GC_ptr, properties, SYNC, data
- it gets allocated on demand for scalars and always for other classes


[1] see [PATCH] spmc from today
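The on-demand allocation in the last point might look like this; PMC_EXT and pmc_ext are made-up names for illustration, not Parrot's API:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct VTABLE VTABLE;

/* hypothetical extension piece: GC_ptr, properties, SYNC, data */
typedef struct PMC_EXT {
    void *gc_ptr;
    void *properties;
    void *synchronize;
    void *data;
} PMC_EXT;

typedef struct SPMC {
    unsigned  flags;
    VTABLE   *vtable;
    double    cache;                /* stand-in for the int/num/string union */
    PMC_EXT  *ext;                  /* NULL until a scalar needs it */
} SPMC;

/* Scalars get the extension lazily; aggregates and other classes
 * would allocate it unconditionally at creation time. */
static PMC_EXT *pmc_ext(SPMC *p)
{
    if (p->ext == NULL)
        p->ext = calloc(1, sizeof(PMC_EXT));
    return p->ext;
}
```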


leo


Mitchell N Charity

May 27, 2003, 10:30:59 AM
to perl6-i...@perl.org
>Small PMC (SPMC), a half sized PMC has double performance in stress.pasm
[...]
Ah, this again. :)

Perhaps it is time to get "multiple gc regimes can coexist" working?

Though having this capability is IMHO a Right Thing(tm) long-term,
I had been thinking of it as a task for a later time. But having it
now would allow simple pmcs, small pmcs, a draft generational collector,
and whatever other wizzy schemes (bit-masked array of unboxed floats?)
someone wants to play with. Have an idea? - Just implement the
standard PMC and GC access api. So proposed collectors can join the
main code branch, and real experience can guide their evolution and adoption.
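Such a common access api might be, say, a per-regime table of function pointers. This is purely hypothetical - a sketch of the idea, not anything Parrot actually had:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: each coexisting collector fills in one of these. */
typedef struct GC_Regime {
    const char *name;
    void *(*allocate)(size_t size);
    void  (*mark)(void *obj);       /* called on live objects in a DOD run */
    void  (*collect)(void);
} GC_Regime;

/* Each arena (or PMC header) would then carry a GC_Regime pointer;
 * following it is exactly the extra test/indirection this costs. */
```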

In addition to complexity cost, downsides include an extra test/indirection
so multiple PMC layouts can coexist.

Mitchell

Leopold Toetsch

May 28, 2003, 6:15:48 AM
to mcha...@vendian.org, perl6-i...@perl.org
Mitchell N Charity <mcha...@vendian.org> wrote:

> Perhaps it is time to get "multiple gc regimes can coexist" working?

Sounds good, but AFAIK it doesn't work - or isn't practical. I can only
imagine having some #defines in place to switch/test different
schemes, as is currently done for the LEA allocator.

> In addition to complexity cost, downsides include an extra test/indirection
> so multiple PMC layouts can coexist.

That's the problem. When you touch each PMC during a DOD run, you defeat
e.g. my current approach with separated DOD flags, where PMC memory is
not touched.

> Mitchell

leo

Alin Iacob

May 29, 2003, 10:28:19 AM
to perl6-i...@perl.org
Leopold Toetsch wrote:
> Mitchell N Charity <mcha...@vendian.org> wrote:
>
> > Perhaps it is time to get "multiple gc regimes can coexist" working?
>
> Sounds good, but AFAIK doesn't work - or isn't practical. I can only
> imagine to have some #defines in place, to switch/test different
> schemes, as currently LEA allocator.

Intel developed the Open Runtime Platform which has multiple coexisting JITs
and GCs:
http://www.intel.com/technology/itj/2003/volume07issue01/art01_orp/p01_abstract.htm
http://www.intel.com/technology/itj/2003/volume07issue01/art01_orp/vol7iss1_art01.pdf

> > In addition to complexity cost, downsides include an extra test/indirection
> > so multiple PMC layouts can coexist.
>
> That's the problem. When you touch each PMC during a DOD run, you defeat
> e.g. my current approach with separated DOD flags, where PMC memory is
> not touched.
>
> > Mitchell
>
> leo
>

Alin
