Re: Proposal for a new PMC layout and more

Dan Sugalski

Sep 1, 2004, 12:26:34 PM
to Leopold Toetsch, Perl 6 Internals
At 5:17 PM +0200 9/1/04, Leopold Toetsch wrote:
>Below is a pod document describing some IMHO worthwhile changes. I
>hope I didn't miss some issues that could inhibit the implementation.

Interesting. But... no. Things are the way they are on purpose -- a
lot of thought, a not-inconsiderable amount of pain, and a lot of
harsh experience went into precursor designs, the current design, and
the current implementation.

We're going to leave it as-is.
--
Dan

--------------------------------------it's like this-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Aaron Sherman

Sep 1, 2004, 12:32:49 PM
to Leopold Toetsch, Perl 6 Internals
On Wed, 2004-09-01 at 11:17, Leopold Toetsch wrote:

> Comments welcome,

Honestly, much of this goes beyond my meager understanding of Parrot
internals, but I've read it, and most of it seems reasonable. Just one
point where you may not have considered a logical alternative:

> =head2 2.6. Morphing Undefs
>
> Currently all binary (and other) opcodes need an existing destination
> PMC. The normal sequence a compiler emits is something like this:
>
> $P0 = new Undef
> $P0 = a + b

Since you've lopped a lot of space off of PMCs, Undefs could be made
large enough to fit a basic buffer PMC (3 words). In that case, they
could always be upgraded in-place to integer PMCs, float PMCs, very
simple objects, references and buffers. Everything else would need to go
through a copy-upgrade.

The trade-off is that all PMCs would be 3 words unless special code was
emitted that avoided this for smaller (integer, float, reference) PMCs.

I'm not saying that this is a BETTER plan, just an idea to think about
and a different set of trade-offs.

--
781-324-3772
a...@ajs.com
http://www.ajs.com/~ajs

Nicholas Clark

Sep 1, 2004, 12:44:43 PM
to Leopold Toetsch, Perl 6 Internals
On Wed, Sep 01, 2004 at 05:17:55PM +0200, Leopold Toetsch wrote:

> PMCs are using too much memory (5 words for an Integer PMC, 11 + n
> words plus two indirection for an object with n attributes). The
> reduction of IIRC 9 words to the current 5 words almost doubled
> execution speed for not too small amounts of allocated PMCs.

I would be much happier if we got to a functionally complete implementation
of parrot with stable, useful APIs first.

And then put effort into optimising the implementation behind the scenes.
Based on more complete knowledge of how things performed on code generated
from real language compilers.

It may well turn out that your proposals make sense then, as well as now.
But I feel what's holding things up is not lack of speed, but lack of
completeness.

Nicholas Clark

Steve Fink

Sep 1, 2004, 1:28:42 PM
to Leopold Toetsch, Perl 6 Internals
On Sep-01, Leopold Toetsch wrote:
> Below is a pod document describing some IMHO worthwhile changes. I hope
> I didn't miss some issues that could inhibit the implementation.

Overall, I like it, although I'm sure I haven't thought of all of the
repercussions.

The one part that concerns me is the loss of the flags -- flags just
seem generally useful for a number of things. In the limit, each flag
could be replaced by an equivalent vtable entry that just returned true
or false, but that will only work for rarely-used flags due to the extra
levels of indirection. I suppose we could also have a large class of
PMCs that contained a flag word, and only the primitive PMCs would lack
it, but then the flags cannot be used without knowing the type of PMC.

It all comes down to the specific current and future uses of flags.
You've dealt with the GC flags; what about the rest?

The proposal would also expand the size of the vtable by a bit due to
the string vtable stuff. I don't know how much that is, percentage-wise.
And I suppose that increase is dwarfed by the decrease due to
eliminating the S variants. (Although that's another part of the
proposal that makes me nervous -- will MMD really take care of all of
the places where we care that we're going to a string, specifically,
rather than any other random PMC type? Strings are a pretty widespread
concept throughout the code base, and this is the only highly
user-visible part of the change.)

I also view the proposal as being comprised of several fairly
independent pieces. Something like:

* Merging PMCs and Buffers
* Merging STRINGs and PMCs
* Removing GC-related flags and moving them to GC implementations
* Removing the rest of the flags
* Using Null instead of Undef
* Moving "extra" stuff to before the PMC pointer
* Using Refs to expand PMCs
* Using DOD to remove the Ref indirection
* Shrinking the base PMC size

...and whatever else I forgot. Not all of these are dependent on each
other, and could be implemented separately. And some are only dependent
in the sense that you'll make space or time performance worse until you
make the rest of the related changes. You could call those
design-dependent, rather than implementation-dependent.

Dan Sugalski

Sep 1, 2004, 1:30:56 PM
to Leopold Toetsch, Perl 6 Internals
At 5:17 PM +0200 9/1/04, Leopold Toetsch wrote:
>Below is a pod document describing some IMHO worthwhile changes. I
>hope I didn't miss some issues that could inhibit the implementation.

Okay, the "No" warrants more explanation.

First off, the current structure of PMCs, Buffers, and Strings is
definitely a mess, what with the multiple nested structs, semi-shared
data, and weird smallobject overlap. A lot of stuff that is, in
retrospect, crap has been layered on, so if this gets beaten up and
cleaned out I won't mind in the least.

The PMC scheme -- where PMCs are an immovable header with a vtable
slot, cache slot, and flag slot -- stays. It's this way on purpose,
and matches normal usage patterns (nicely efficiently) for perl 5 as
well as (oddly) most python and ruby usage. (Where there's a
preponderance of low-level types)

Buffers and strings are special-purpose constructs, or at least they
*should* be. They're segregated off for GC purposes. While they could
be unified with PMCs, I don't want them to be. They've specific,
special purposes, and as such they're staying the way they are.

Strings, FWIW, are *not* a perl 6 specific thing. The current string
design is sufficient, and *will* be used, for perl 5, python, and
ruby, as well as any other language that wants to live on parrot and
handle string data. While there's stuff to be added still, there's no
reason that I can see to mess with them.

Finally, Nicholas is right -- this is messing around with stuff that
already works. We're better off working on things that don't exist
yet, and leave this to later.

If you want, we can hash out the changes to sub calling (with the
swapping interpreter structs we've been arguing over), moving the
return continuation/calling object/called sub into the interp
structure, and fixing up the JIT and exception handling stuff to deal
with it. That, at least, will be visible to bytecode programs and
worth getting done.

Leopold Toetsch

Sep 1, 2004, 2:10:38 PM
to Steve Fink, perl6-i...@perl.org
Steve Fink <st...@fink.com> wrote:
> On Sep-01, Leopold Toetsch wrote:
>> Below is a pod document describing some IMHO worthwhile changes. I hope
>> I didn't miss some issues that could inhibit the implementation.

> Overall, I like it, although I'm sure I haven't thought of all of the
> repercussions.

> The one part that concerns me is the loss of the flags -- flags just
> seem generally useful for a number of things. In the limit, each flag
> could be replaced by an equivalent vtable entry that just returned true
> or false,

I'm not thinking about vtable entries returning a flag bit. E.g. the
presence of PObj_custom_mark_FLAG could as well be tested as:

if (pmc->vtable->mark) // != NULL

Generally speaking, the vtable holds most of the information that is
needed for one kind of PMC. More specialized PMCs can have their private
flags (for example a Key PMC). But they are normally not needed. An
Integer or Float PMC doesn't need any flags to perform its operation.
The proposed scheme doesn't, of course, forbid private flags in the PMC's
data section. But a lot of PMCs just don't need any flags.
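
A minimal sketch of that idea in C follows. The types here (`VTABLE`,
`PMC`, the `marked` field) and the helper `mark_if_needed` are
simplified stand-ins invented for illustration; they are not Parrot's
actual definitions. The point is only that a NULL `mark` slot can carry
the same information as a per-object "custom mark" flag bit:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified types; NOT Parrot's real layout. */
typedef struct PMC PMC;

typedef struct VTABLE {
    void (*mark)(PMC *self);    /* NULL when the type needs no custom mark */
} VTABLE;

struct PMC {
    const VTABLE *vtable;
    long          int_val;
    int           marked;       /* exists only to make the demo observable */
};

/* A type that owns other objects supplies a mark routine. */
static void array_mark(PMC *self) { self->marked = 1; }

static const VTABLE integer_vtable = { NULL };        /* nothing to mark */
static const VTABLE array_vtable   = { array_mark };  /* custom marking  */

/* Returns 1 if a custom mark routine ran, 0 otherwise.
 * The slot test stands in for checking a flag such as
 * PObj_custom_mark_FLAG on the object itself. */
static int mark_if_needed(PMC *pmc)
{
    assert(pmc != NULL && pmc->vtable != NULL);
    if (pmc->vtable->mark) {
        pmc->vtable->mark(pmc);
        return 1;
    }
    return 0;
}
```

In this sketch the decision lives once per type in the vtable, so an
Integer-like object carries no flag word at all.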

> The proposal would also expand the size of the vtable by a bit due to
> the string vtable stuff.

No. The vtable would very likely shrink. 46 vtable (or MMD) entries are
currently used by STRING* operations. These would be just PMC
operations, which we have anyway.

> ... will MMD really take care of all of
> the places where we care that we're going to a string, specifically,
> rather than any other random PMC type?

MMDs have to deal with that anyway. We have String PMCs. The vtables or
MMD functions that currently take STRING* ought to be optimized
shortcuts for STRING* arguments. But if a String PMC is passed, The
Right Thing still has to happen.

> I also view the proposal as being comprised of several fairly
> independent pieces. Something like:

* allow/allocate variable sized PMCs

Then -- yes.

> * Merging PMCs and Buffers
> * Merging STRINGs and PMCs

That's the same thing, mostly.

> * Removing GC-related flags and moving them to GC implementations

We already have that. But it's not hidden or encapsulated.

> * Removing the rest of the flags

Yep.

> * Using Null instead of Undef

No. Undef is a totally different thing. There is no change here. The
Null PMC catches program errors (like using a C NULL pointer). The Undef
is just a placeholder that morphs to any other type.
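
The distinction can be sketched in C roughly as below. Everything in
this block is an assumption made for illustration: the two-field `PMC`,
the single `set_integer` slot, the error-code convention, and the helper
`pmc_assign_int` are hypothetical and do not reproduce Parrot's real
vtable or exception handling. Assigning to an Undef swaps its vtable in
place, while the same operation on a Null reports an error:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; NOT Parrot's real definitions. */
typedef struct PMC PMC;

typedef struct VTABLE {
    const char *name;
    int (*set_integer)(PMC *self, long value);  /* 0 = ok, -1 = error */
} VTABLE;

struct PMC {
    const VTABLE *vtable;
    long          int_val;
};

static int integer_set(PMC *self, long v) { self->int_val = v; return 0; }

/* Null: any use is a program error, like dereferencing a C NULL. */
static int null_set(PMC *self, long v) { (void)self; (void)v; return -1; }

static int undef_set(PMC *self, long v);   /* defined below */

static const VTABLE integer_vtable = { "Integer", integer_set };
static const VTABLE null_vtable    = { "Null",    null_set    };
static const VTABLE undef_vtable   = { "Undef",   undef_set   };

/* Undef: a placeholder that morphs into the type being assigned. */
static int undef_set(PMC *self, long v)
{
    self->vtable = &integer_vtable;             /* morph in place */
    return self->vtable->set_integer(self, v);
}

/* Dispatch an integer assignment through whatever vtable is installed. */
static int pmc_assign_int(PMC *p, long v)
{
    assert(p != NULL && p->vtable != NULL);
    return p->vtable->set_integer(p, v);
}
```

After one assignment the former Undef behaves as an Integer; the Null
stays a Null and keeps rejecting operations.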

> * Moving "extra" stuff to before the PMC pointer
> * Using Refs to expand PMCs
> * Using DOD to remove the Ref indirection
> * Shrinking the base PMC size

Yep. That is related, though.

> ... And some are only dependent
> in the sense that you'll make space or time performance worse until you
> make the rest of the related changes. You could call those
> design-dependent, rather than implementation-dependent.

Yes.

leo

Leopold Toetsch

Sep 1, 2004, 2:50:50 PM
to Dan Sugalski, perl6-i...@perl.org
Dan Sugalski <d...@sidhe.org> wrote:
> At 5:17 PM +0200 9/1/04, Leopold Toetsch wrote:
>>Below is a pod document describing some IMHO worthwhile changes. I
>>hope I didn't miss some issues that could inhibit the implementation.

> Okay, the "No" warrants more explanation.

Thanks.

> First off, the current structure of PMCs, Buffers, and Strings is
> definitely a mess, what with the multiple nested structs, semi-shared
> data, and weird smallobject overlap. A lot of stuff that is, in
> retrospect, crap has been layered on, so if this gets beaten up and
> cleaned out I won't mind in the least.

Well, that's what the proposal is for: cleaning up and unifying the
existing mess.

> The PMC scheme -- where PMCs are an immovable header with a vtable
> slot, cache slot, and flag slot -- stays. It's this way on purpose,
> and matches normal usage patterns (nicely efficiently) for perl 5 as
> well as (oddly) most python and ruby usage. (Where there's a
> preponderance of low-level types)

I don't see that it "matches normal usage patterns". Just the opposite.
The current PMC structure doesn't easily support e.g. Python's "all is
an object" POV. An Integer needs just 2 words of information and no
more. The rest (3 words) is just wasted. No interpreter I've looked at
has fixed-size objects. OTOH aggregates have artificial helper
structures to store needed information. A variable-sized PMC covers all
of that and eliminates every indirection in accessing these data. I
don't see efficiency in the current scheme either, neither in execution
time nor in memory usage.
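
As a rough illustration of the size argument, here is a sketch in C of
a variable-sized object: one vtable word followed directly by the
payload, using a C99 flexible array member. The names (`VarPMC`,
`payload_words`, `pmc_size`, `pmc_new`) are hypothetical and the layout
is an assumption for illustration, not the layout from the pod document
or from Parrot itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type descriptor: the fixed payload size lives in the
 * vtable, so individual objects need no size or flag word. */
typedef struct VTABLE {
    const char *name;
    size_t      payload_words;
} VTABLE;

/* Hypothetical variable-sized object: header word plus inline payload. */
typedef struct VarPMC {
    const VTABLE *vtable;
    intptr_t      data[];       /* C99 flexible array member */
} VarPMC;

static const VTABLE integer_vtable = { "Integer", 1 };  /* one value word */
static const VTABLE object_vtable  = { "Object",  0 };  /* attributes only */

/* Bytes needed for an object of this type with extra_words more payload. */
static size_t pmc_size(const VTABLE *vt, size_t extra_words)
{
    assert(vt != NULL);
    return sizeof(VarPMC) + (vt->payload_words + extra_words) * sizeof(intptr_t);
}

/* Allocate a zeroed object; returns NULL if allocation fails. */
static VarPMC *pmc_new(const VTABLE *vt, size_t extra_words)
{
    VarPMC *p = calloc(1, pmc_size(vt, extra_words));
    if (p != NULL)
        p->vtable = vt;
    return p;
}
```

Under these assumptions an Integer occupies two words (vtable plus
value), and an object with n attributes reads `obj->data[i]` directly
instead of following a pointer to a separately allocated buffer.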

> Buffers and strings are special-purpose constructs, or at least they
> *should* be. They're segregated off for GC purposes.

Buffers and strings are different because the current PMC structure
doesn't allow or support an arbitrary object layout. This lack of
functionality creates the need for Buffers, which leads to more
indirection in accessing a PMC's data and more overhead during GC.

Unification into one coherent object model just simplifies all that
stuff.

> ... While they could
> be unified with PMCs, I don't want them to be. They've specific,
> special purposes, and as such they're staying the way they are.

A PMC is specific enough. The vtable makes it special. The vtable
defines the functionality of that very object. There isn't any real
difference between a Buffer structure and an array-ish PMC. Both hold
some amount of data. But we currently treat these two totally
differently for no good reason.

> Strings, FWIW, are *not* a perl 6 specific thing. The current string
> design is sufficient, and *will* be used, for perl 5, python, and
> ruby, as well as any other language that wants to live on parrot and
> handle string data. While there's stuff to be added still, there's no
> reason that I can see to mess with them.

Well, the current need for a distinct STRING type arises just because of
a lack in PMCs to deal with strings. E.g.

$P0 = a_func_returning_a_string()
$S0 = $P0
$I0 = length $S0

That's the way to get at the length of a string. Python doesn't have a
notion of a STRING, and I'm not aware of anything like that in Perl 5
either. So functions return objects, aka PMCs. Almost all operations
deal with PMCs only. This is my experience coming from the Pie-thon
quest, but not from that alone.

The need for STRING opcodes and vtable/MMD functions just comes from a
lack of functionality in PMCs. Unifying or just having String PMCs
eliminates this lack and almost one third of opcodes.

STRING operations aren't the fastest anyway. I don't see any reason not
to just provide all of these in PMCs (which we need anyway) and
eliminate the duplication with a distinct type.

Native integers and numbers do warrant the specialization. Processors and
JIT can support these types natively. Nothing comparable can be done
with STRING*s. These are just overhead and code duplication currently.

> Finally, Nicholas is right -- this is messing around with stuff that
> already works. We're better off working on things that don't exist
> yet, and leave this to later.

That's of course true. There is a lot of stuff that needs to be done and
should be done before reworking internals deeply. OTOH a lot of
currently todo stuff could immediately be done much more easily.

We need some more PMCs e.g. to manage packfiles and code segments. The
rigid structure of the fixed-sized PMCs is always a PITA when
implementing new objects.

Unicode string vtables are another issue, though I don't know whether
some or all vtable slots are usable for string operations. But we
already have e.g. concatenate or bitwise string vtables.

> If you want, we can hash out the changes to sub calling (with the
> swapping interpreter structs we've been arguing over), moving the
> return continuation/calling object/called sub into the interp
> structure,

Of course, yes. That thread is BTW lacking another answer: what's the
difference between your derived proposal and mine?

> ... and fixing up the JIT and exception handling stuff to deal
> with it. That, at least, will be visible to bytecode programs and
> worth getting done.

Yep.

leo