pmc_type

Nicholas Clark

Oct 26, 2004, 12:51:29 PM

to perl6-i...@perl.org

pmc_type is documented and implemented as follows:

/*

=item C<INTVAL
pmc_type(Parrot_Interp interp, STRING *name)>

Returns the PMC type for C<name>.

=cut

*/

INTVAL
pmc_type(Parrot_Interp interp, STRING *name)
{
    HashBucket *bucket;
    PMC *classname_hash = interp->class_hash;

    bucket = hash_get_bucket(interp, PMC_struct_val(classname_hash), name);
    if (bucket)
        return PMC_int_val((PMC*) bucket->value);
    return Parrot_get_datatype_enum(interp, name);
}

On IRC I asked:

17:44 <@Nicholas> can the type returned by Parrot_PMC_typenum for a PMC ever be
zero?
17:45 <@Dan> I don't think so, no
17:46 <@Dan> Take that back -- a PMC of type 'default' can have a type of 0
17:46 <@Dan> But you shouldn't ever have one, as they're not really
instantiatable

In which case:

1: What does pmc_type return if it fails to find a PMC?
2: If that answer is 0, is it safe to document that 0 is a failure return,
which happens to map to the PMC type for "default", but that, as they can't be
instantiated, looking up "default" is "not supported" (or words to that
effect)?

Nicholas Clark

Stéphane Payrard

Oct 26, 2004, 1:17:54 PM

to perl6-i...@perl.org

[snipped]

> 1: What does pmc_type return if it fails to find a PMC?
> 2: If that answer is 0, is it safe to document that 0 is a failure return,
> which happens to map to the PMC type for "default", but as they can't be
> instantiated looking up "default" is "not supported" (or words to that
> effect)
>
>
> Nicholas Clark
>

A related but different issue is that abstract pmcs (like Scalar
and PerlScalar) have no pmc_type. I understand that a pmc_type is
an offset into the table of pmc vtables and that we probably cannot
have holes in that table. Nevertheless it would be nice for these
abstract pmcs to have a pmc_type, say for base-type pmc checking in
imcc or some related tools.

I think one of the constraints is that the pmc type numbers must
not overlap the values in PARROT_DATA_TYPES.

--
stef

Leopold Toetsch

Oct 27, 2004, 6:12:32 AM

to Nicholas Clark, perl6-i...@perl.org

Nicholas Clark <ni...@ccl4.org> wrote:
> pmc_type is documented and implemented as follows:

>=item C<INTVAL
> pmc_type(Parrot_Interp interp, STRING *name)>

> On IRC I asked:

> 17:44 <@Nicholas> can the type returned by Parrot_PMC_typenum for a PMC ever be
> zero?
> 17:45 <@Dan> I don't think so, no
> 17:46 <@Dan> Take that back -- a PMC of type 'default' can have a type of 0

C<enum_type_undef> aka 0 is returned for unknown types.

> 1: What does pmc_type return if it fails to find a PMC?

$I0 = find_type "no_such" # $I0 := 0

> 2: If that answer is 0, is it safe to document that 0 is a failure return,
> which happens to map to the PMC type for "default", but as they can't be
> instantiated looking up "default" is "not supported" (or words to that
> effect)

Yep, that's missing.
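
To make the convention concrete, here is a minimal sketch of a two-stage name lookup that reserves 0 as the failure return. It only mirrors the shape of pmc_type above (class table first, then the native datatypes). The tables, the type numbers and the names `lookup_type` and `TYPE_UNDEF` are invented for illustration and are not Parrot's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type numbers; 0 is reserved as the "unknown type" result,
 * playing the role of enum_type_undef. */
enum { TYPE_UNDEF = 0 };

struct type_entry {
    const char *name;
    int         type;
};

/* Illustrative tables, not Parrot's real ones. */
static const struct type_entry class_types[] = {
    { "PerlInt",    1 },
    { "PerlString", 2 },
};
static const struct type_entry native_types[] = {
    { "int",   -1 },
    { "float", -2 },
};

/* Linear search standing in for the hash lookup; TYPE_UNDEF on a miss. */
static int
search_table(const struct type_entry *table, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].type;
    return TYPE_UNDEF;
}

/* Class table first, then native types.  Returns TYPE_UNDEF (0) when
 * the name is in neither table. */
int
lookup_type(const char *name)
{
    int type;

    assert(name != NULL);
    type = search_table(class_types,
                        sizeof class_types / sizeof class_types[0], name);
    if (type != TYPE_UNDEF)
        return type;
    return search_table(native_types,
                        sizeof native_types / sizeof native_types[0], name);
}
```

A caller would test the result against `TYPE_UNDEF` before using it. With this convention a type that really is numbered 0 cannot be told apart from a failed lookup, which is exactly the caveat the documentation would need to state.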

> Nicholas Clark

leo

Leopold Toetsch

Oct 27, 2004, 6:19:22 AM

to st...@payrard.net, perl6-i...@perl.org

Stéphane Payrard <st...@payrard.net> wrote:

> A related but different issue is that abstract pmcs like (Scalar
> and PerlScalar) have no pmc_type. I understand that pmc_type are
> an offset in the table of pmc vtables and that we probably cannot
> have holes in that table. Nevertheless it would be nice for these
> abstract pmcs to have a pmc_type; say for base type pmc checking in
> imcc or some related tools.

Isn't really needed:

$ find t -name '*.t' | xargs grep -w isa
...
t/pmc/objects.t: isa I0, P1, "scalar"
...

Having a type enum for these abstract types would imply installing a
vtable filled with methods that catch errors.

> I think one of the constraint is that the pmc type numbers must
> not overlap the values in PARROT_DATA_TYPES.

If you mean the struct _data_types data_types[] list of "native" types,
yes - they don't overlap, these are all negative numbers.

> --
> stef

leo

Stéphane Payrard

Oct 27, 2004, 8:09:43 AM

to perl6-i...@perl.org

On Wed, Oct 27, 2004 at 12:19:22PM +0200, Leopold Toetsch wrote:
> Stéphane Payrard <st...@payrard.net> wrote:
>
> > A related but different issue is that abstract pmcs like (Scalar
> > and PerlScalar) have no pmc_type. I understand that pmc_type are
> > an offset in the table of pmc vtables and that we probably cannot
> > have holes in that table. Nevertheless it would be nice for these
> > abstract pmcs to have a pmc_type; say for base type pmc checking in
> > imcc or some related tools.
>
> Isn't really needed:
>
> $ find t -name '*.t' | xargs grep -w isa
> ...
> t/pmc/objects.t: isa I0, P1, "scalar"
> ...
>
> Having a type enum for these abstract types would imply to install a
> vtable, filled with methods that catch errors.

I never proposed the installation of vtables for these types.
But to avoid creating holes in the table of vtables,
there should be an integer range reserved for these abstract types.
Being outside the range for regular pmc types, they would not
need any vtable.
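
A rough sketch of what such a reserved range could look like follows. All numbers, names and the class hierarchy here are assumptions made up for the example (in particular that PerlInt derives from PerlScalar, which derives from Scalar); nothing in it is taken from core_pmcs.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical numbering: concrete types index the table of vtables,
 * abstract types live in a reserved range above it and have no vtable. */
enum {
    TYPE_NONE       = 0,
    TYPE_PERLINT    = 1,
    TYPE_PERLNUM    = 2,
    CONCRETE_MAX    = 3,                 /* one past the last concrete type */
    ABSTRACT_BASE   = 1000,
    TYPE_SCALAR     = ABSTRACT_BASE,
    TYPE_PERLSCALAR = ABSTRACT_BASE + 1,
    ABSTRACT_MAX    = ABSTRACT_BASE + 2  /* one past the last abstract type */
};

/* Parent of each type, split into two arrays so neither has holes. */
static const int concrete_parent[CONCRETE_MAX] = {
    TYPE_NONE, TYPE_PERLSCALAR, TYPE_PERLSCALAR
};
static const int abstract_parent[ABSTRACT_MAX - ABSTRACT_BASE] = {
    TYPE_NONE,      /* Scalar has no parent */
    TYPE_SCALAR     /* PerlScalar derives from Scalar */
};

static int
parent_of(int type)
{
    if (type > TYPE_NONE && type < CONCRETE_MAX)
        return concrete_parent[type];
    if (type >= ABSTRACT_BASE && type < ABSTRACT_MAX)
        return abstract_parent[type - ABSTRACT_BASE];
    return TYPE_NONE;
}

/* Only concrete types need a slot in the table of vtables. */
int
type_needs_vtable(int type)
{
    return type > TYPE_NONE && type < CONCRETE_MAX;
}

/* Walks the parent chain; returns 1 if type is, or derives from, ancestor. */
int
type_isa(int type, int ancestor)
{
    assert(ancestor != TYPE_NONE);
    while (type != TYPE_NONE) {
        if (type == ancestor)
            return 1;
        type = parent_of(type);
    }
    return 0;
}
```

A compile-time check in a tool like imcc could then call `type_isa(TYPE_PERLINT, TYPE_SCALAR)` without Scalar ever occupying a vtable slot.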

>
> > I think one of the constraint is that the pmc type numbers must
> > not overlap the values in PARROT_DATA_TYPES.
>
> If you mean the struct _data_types data_types[] list of "native" types,
> yes - they don't overlap, these are all negative numbers.
>
> > --
> > stef
>
> leo
>

--
stef

Leopold Toetsch

Oct 27, 2004, 12:24:59 PM

to st...@payrard.net, perl6-i...@perl.org

Stéphane Payrard <st...@payrard.net> wrote:
> On Wed, Oct 27, 2004 at 12:19:22PM +0200, Leopold Toetsch wrote:

>> Having a type enum for these abstract types would imply to install a
>> vtable, filled with methods that catch errors.

> I never proposed the installation of vtables for these types.

Well, then a type enum for abstract types is probably not useful anyway.

> there should be an integer range reserved for these abstract types.
> Being outside the range for regular pmc type they would not
> need any vtable.

And then?

leo

Stéphane Payrard

Oct 27, 2004, 2:26:56 PM

to perl6-i...@perl.org

That would allow typechecking to be implemented in imcc.

.sym Scalar a
a = new .PerlInt # ok. PerlInt is derived from Scalar

--
stef

>
> leo
>

Stéphane Payrard

Oct 27, 2004, 7:06:42 PM

to perl6-i...@perl.org

On Wed, Oct 27, 2004 at 01:19:29PM -0600, Luke Palmer wrote:
> Stéphane Payrard writes:
> > That would allow to implement typechecking in imcc.
> >
> > .sym Scalar a
> > a = new .PerlInt # ok. Perlint is derived from Scalar
>

> Ugh, yeah, but what does that buy you? In dynamic languages pure
> derivational typechecking is very close to useless. The reason C++[1]
> has pure derivational semantics is because of implementation. The
> vtable functions have the same relative address, so you can use a
> derived object interchangeably. In a language where methods are looked
> up by name, such strictures are more often over-restrictive than
> helpful.

So, still thinking (too?) statically, I should probably worry
about what a pmc "does" instead of what it is derived from. In
that case, I would not have to worry about abstract pmc classes.
So the abstract classes missing from the enum in
core_pmcs.h, which prompted this subthread, become a moot point.

>
> Anyway, that's just my rant. If such a thing is to be in imcc, it
> _must_ be optional without loss of feature. I have a quibble with the
> automatic typechecking of .param variables for the same reason.

Declaring a symreg just as a pmc would do that anyway.

>
> Luke
>
> [1] And the reason Java has it is because C++ did. Great design work,
> guys.
>

--
stef

Paolo Molaro

Oct 28, 2004, 2:24:33 PM

to perl6-i...@perl.org

On 10/27/04 Luke Palmer wrote:
> Stéphane Payrard writes:

> > That would allow to implement typechecking in imcc.
> >
> > .sym Scalar a
> > a = new .PerlInt # ok. Perlint is derived from Scalar
>

> Ugh, yeah, but what does that buy you? In dynamic languages pure
> derivational typechecking is very close to useless. The reason C++[1]
> has pure derivational semantics is because of implementation. The
> vtable functions have the same relative address, so you can use a
> derived object interchangeably. In a language where methods are looked
> up by name, such strictures are more often over-restrictive than
> helpful.

Actually, if I were to write a perl runtime for parrot, mono or
even the JVM I'd experiment with the same pattern. I guess it could
be applied to a python implementation, too.
You would assign small integer IDs to the names of the methods
and build a vtable indexed by the id. In most cases the method name
is known at compile time, so you know the id and you can get
the method with a simple load from the vtable. This is much faster
than a hash table lookup (I hinted at this in my old RFC for perl6).
Of course the table would be sparse, especially in pathological
programs, so you could have a limit, like 100 entries or less
with IDs bigger than that using a different lookup (binary search
on an array, for example). There are a number of optimizations that
can be done to reduce the vtable size, but I'm not sure this would
matter in parrot as long as bytecode values are as big as C ints:-)
Maybe someone has time to write a script and run it on a bunch of
perl programs and report how many different method names are usually
created. Of course it also depends how much the hash lookup will cost
wrt the total cost of a subroutine call...
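
The scheme described above might be sketched roughly as follows: a dense array for small method IDs, plus a sorted overflow array searched with bsearch for the rest. The struct layout, the names, and the tiny `DENSE_LIMIT` (the text suggests something like 100) are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*method_fn)(int);

enum { DENSE_LIMIT = 4 };   /* kept tiny here; the text suggests ~100 */

struct overflow_entry {
    int       id;
    method_fn fn;
};

struct method_table {
    method_fn dense[DENSE_LIMIT];           /* ids below the limit: one load */
    const struct overflow_entry *overflow;  /* larger ids, sorted by id */
    size_t n_overflow;
};

/* bsearch comparator: key is an int id, elem is an overflow_entry. */
static int
cmp_entry(const void *key, const void *elem)
{
    int id = *(const int *)key;
    const struct overflow_entry *e = elem;

    return (id > e->id) - (id < e->id);
}

/* Returns the method for id, or NULL if the class has no such method. */
method_fn
find_method_by_id(const struct method_table *t, int id)
{
    const struct overflow_entry *e;

    assert(t != NULL && id >= 0);
    if (id < DENSE_LIMIT)
        return t->dense[id];
    if (t->n_overflow == 0)
        return NULL;
    e = bsearch(&id, t->overflow, t->n_overflow,
                sizeof *t->overflow, cmp_entry);
    return e ? e->fn : NULL;
}

/* Two toy methods for trying the table out. */
int twice(int x)  { return 2 * x; }
int negate(int x) { return -x; }
```

When the method name is known at compile time, the compiler would emit the constant id, so the common case is the single array load in the first branch.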

lupus

--
-----------------------------------------------------------------
lu...@debian.org debian/rules
lu...@ximian.com Monkeys do it better

Leopold Toetsch

Oct 29, 2004, 4:12:43 AM

to Paolo Molaro, perl6-i...@perl.org

Paolo Molaro <lu...@debian.org> wrote:
> On 10/27/04 Luke Palmer wrote:
>>
>> Ugh, yeah, but what does that buy you? In dynamic languages pure
>> derivational typechecking is very close to useless.

> Actually, if I were to write a perl runtime for parrot, mono or

> even the JVM I'd experiment with the same pattern.

For the latter two yes, but as Luke has outlined that doesn't really
help for languages where methods are changing under the hood.

> You would assign small integer IDs to the names of the methods
> and build a vtable indexed by the id.

Well, we already got a nice method cache, which makes lookup a
vtable-like operation, i.e. an array lookup. But that's runtime only
(and it needs invalidation still). So actually just the first method
lookup is a hash operation.

> ... There are a number of optimizations that

> can be done to reduce the vtable size, but I'm not sure this would
> matter in parrot as long as bytecode values are as big as C ints:-)

That ought to come ;) Cachegrind shows no problem with opcode fetch and,
you know, when it's compiled to JIT, bytecode size doesn't matter anyway.
We just avoid the opcode and operand decoding.

> lupus

leo

Paolo Molaro

Oct 29, 2004, 10:26:18 AM

to perl6-i...@perl.org

On 10/29/04 Leopold Toetsch wrote:
> >> Ugh, yeah, but what does that buy you? In dynamic languages pure
> >> derivational typechecking is very close to useless.
>
> > Actually, if I were to write a perl runtime for parrot, mono or
> > even the JVM I'd experiment with the same pattern.
>
> For the latter two yes, but as Luke has outlined that doesn't really
> help for languages where methods are changing under the hood.

If a method changes you just replace the pointer in the vtable
to point to the new method implementation. Invalidation is the
same, you just replace it with a method that gives the
"method not found" error/exception.

> > You would assign small integer IDs to the names of the methods
> > and build a vtable indexed by the id.
>
> Well, we already got a nice method cache, which makes lookup a
> vtable-like operation, i.e. an array lookup. But that's runtime only
> (and it needs invalidation still). So actually just the first method
> lookup is a hash operation.

And where is it cached and how? Take (sorry, still perl5 syntax:-):

    foreach $i (@list) {
        $i->method ();
    }

With the vtable idea, the low-level operations are (in pseudo-C):

    vtable = $i->vtable;                // just a memory dereference
    code = vtable[method_constant_id];  // another mem deref
    run_code(code);

From your description it seems it would look like:

    vtable = $i->vtable;
    code = vtable->method_lookup("method"); // C function call
    run_code(code);

Note that $i may be of different type for each loop iteration.
Even a cached lookup is going to be slower than a simple memory
dereference. Of course this only matters if the lookup is
actually a bottleneck of your function call speed.

> > matter in parrot as long as bytecode values are as big as C ints:-)
>
> That ought to come ;) Cachegrind shows no problem with opcode fetch and
> you know, when it's compiled to JIT bytecode size doesn't matter anyway.
> We just avoid the opcode and operand decoding.

If you use a JIT, decode overhead is already very small:-) AFAIK, alpha
is the only interesting architecture that doesn't do byte access (at least
on older processors) and so it may be a little inefficient there. But I think
you should optimize for the common case. On my machine going through a byte
opcode array is faster than an int one by about 15% (more if level 2 cache
or mem is needed to hold it). The only issue is when you need to load int
values that don't fit in a byte, but those are not as common as register
numbers, which currently take a whole int in your bytecode and could just
use a byte.
Anyway, the two approaches may also balance out if the opcodes are
in ro memory. The issue is that in perl, for example, so much is
supposed to happen at runtime, because the 'use' operator changes the
compiling environment, so you actually need to compile at runtime in many
cases, not only eval. That means emitting parrot bytecode in memory and
this bytecode is per-process, so it increases memory usage and eventually
swapping activity. As you say, since you jit, this memory is wasted, since
it goes unused soon after it is written.
Another issue is disk-load time: when you have small test apps it doesn't
matter, but when you start having bigger apps it might (even mmapping
has its cost, if you need a larger working set to load bytecodes).

BTW, in the computed goto code, make the array of address labels const:
it helps reduce the rw working set, at least when parrot is built as an
executable.
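
As an illustration of the const label array, here is a small self-contained dispatch loop. It is a toy interpreter with made-up opcodes, not Parrot's computed goto core, and it relies on the GCC "labels as values" extension (also accepted by Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Made-up opcodes for a one-register toy machine. */
enum { OP_HALT, OP_INC, OP_DOUBLE, OP_COUNT };

/* Runs a bytecode string until OP_HALT and returns the accumulator.
 * Sketch only: opcodes are not bounds-checked. */
long
run(const unsigned char *code)
{
    /* Both the pointers and the array itself are const, so the
     * compiler is free to keep the table out of writable data. */
    static const void *const labels[OP_COUNT] = {
        &&do_halt, &&do_inc, &&do_double
    };
    long acc = 0;

    assert(code != NULL);
#define NEXT() goto *labels[*code++]
    NEXT();

do_inc:
    acc += 1;
    NEXT();
do_double:
    acc *= 2;
    NEXT();
do_halt:
    return acc;
#undef NEXT
}
```

Whether the table really ends up in a read-only section depends on the toolchain and on how the code is built, which matches the caveat above about building as an executable.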

Leopold Toetsch

Oct 29, 2004, 11:19:35 AM

to Paolo Molaro, perl6-i...@perl.org

Paolo Molaro <lu...@debian.org> wrote:
> On 10/29/04 Leopold Toetsch wrote:

>> Well, we already got a nice method cache, which makes lookup a
>> vtable-like operation, i.e. an array lookup. But that's runtime only
>> (and it needs invalidation still). So actually just the first method
>> lookup is a hash operation.

> And where is it cached and how?

In src/objects.c:1077 ff. It's indexed by some bits of the address of the
method's name.
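
For readers without the source at hand, a cache of this general kind could look like the sketch below. It is not the code in src/objects.c. It assumes that method names are interned so that equal names share one address, and it uses a single direct-mapped table keyed by name address and type number; both choices, and all names, are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CACHE_BITS = 6, CACHE_SIZE = 1 << CACHE_BITS };

typedef int (*method_fn)(int);

struct cache_entry {
    const char *name;   /* interned method name, compared by address */
    int         type;   /* type number of the invocant's class */
    method_fn   fn;
};

static struct cache_entry cache[CACHE_SIZE];

/* Drop the low alignment bits of the address, mix in the type,
 * and keep CACHE_BITS bits as the slot index. */
static size_t
slot_for(const char *name, int type)
{
    uintptr_t bits = ((uintptr_t)name >> 3) ^ (uintptr_t)type;

    return (size_t)(bits & (CACHE_SIZE - 1));
}

/* Returns the cached method, or NULL on a miss (caller does the hash lookup). */
method_fn
cache_get(const char *name, int type)
{
    const struct cache_entry *e = &cache[slot_for(name, type)];

    return (e->name == name && e->type == type) ? e->fn : NULL;
}

/* Stores a lookup result, overwriting whatever shared the slot. */
void
cache_put(const char *name, int type, method_fn fn)
{
    struct cache_entry *e = &cache[slot_for(name, type)];

    assert(name != NULL);
    e->name = name;
    e->type = type;
    e->fn   = fn;
}

/* Crude invalidation: forget everything when any method changes. */
void
cache_invalidate_all(void)
{
    size_t i;

    for (i = 0; i < CACHE_SIZE; i++)
        cache[i].name = NULL;
}

/* Toy method for trying the cache out. */
int add_one(int x) { return x + 1; }
```

Only the first call for a given name and type pays for the hash lookup; later calls cost an index computation and two comparisons, which is still more work than the plain array load of the vtable approach.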

> From your description it seems it would look like:
> vtable = $i->vtable;
> code = vtable->method_lookup ("method"); // C function call
> run_code (code);

The method_lookup doesn't have a vtable indirection. And having a direct
array lookup doesn't really scale. So the actual code is a bit more
complicated (and in no way optimized).

> Note that $i may be of different type for each loop iteration.

Sure.

> Even a cached lookup is going to be slower than a simple memory
> dereference. Of course this only matters if the lookup is
> actually a bottleneck of your function call speed.

Function calling is currently being changed. So I don't know yet.

>> > matter in parrot as long as bytecode values are as big as C ints:-)

[ more about op size ]

Well, that isn't an issue currently, AFAIK. More time consuming things
need optimizations first.

> BTW, in the computed goto code, make the array of address labels const:
> it helps reducing the rw working set at least when parrot is built as an
> executable.

Ah, yep. Thanks for the tip.

> lupus

leo

Nicholas Clark

Oct 30, 2004, 4:40:28 PM

to Leopold Toetsch, Paolo Molaro, perl6-i...@perl.org

On Fri, Oct 29, 2004 at 05:19:35PM +0200, Leopold Toetsch wrote:

> The method_lookup doesn't have a vtable indirection. And having a direct
> array lookup doesn't really scale. So the actual code is a bit more
> complicated (and in no way optimized).

Something that just struck me reading this whole thread - can we take
advantage of the lazy approach the prederef cores use? Prime all the slots
in the vtable array with pointers to the same function that actually does
the real work of the method lookup. So we take the lookup hit only when
we call the method. This would also mean that cache invalidation is easy -
if the method is changed (or deleted) just unconditionally reset all the
slots that refer to it back to pointers to the initial lookup function.

However this way we'd suffer on copy on write shared memory pages becoming
unshared if something forked, so (massive premature optimisation alarm bells
ring here) maybe it would also be useful to have an API hook to call to
say "stop being lazy; do whatever work is needed to optimise the situation
for being a parent process that will spawn many children".

Nicholas Clark

Leopold Toetsch

Oct 31, 2004, 5:16:59 AM

to Nicholas Clark, perl6-i...@perl.org

Nicholas Clark wrote:
> On Fri, Oct 29, 2004 at 05:19:35PM +0200, Leopold Toetsch wrote:
>
>
>>The method_lookup doesn't have a vtable indirection. And having a direct
>>array lookup doesn't really scale. So the actual code is a bit more
>>complicated (and in no way optimized).
>
>
> Something that just struck me reading this whole thread - can we take
> advantage of the lazy approach the prederef cores use? Prime all the slots
> in the vtable array with pointers to the same function that actually does
> the real work of the method lookup. So we take the lookup hit only when
> we call the method.

First, while Paolo named the thing vtable, that term collides with
pmc->vtable. So let's call the thing method_table. Second, I have to revise
my sentence above: method lookup *is* a vtable call. We have:

method_pmc = object->vtable->find_method(... "the_method_str")

The find_method calls the method lookup, which is a hash lookup in the
object's class, and likely some more hash lookups in the class's parents.

So back to your idea. We'd need a list of the top N methods of all
objects and generate a sparse array of index <-> method_str mappings per
object primed with the actual lookup function. For all top N methods,
we'd replace the keyed find_method with an indexed find_method.

Then after a successful lookup, we could replace the lookup-function
with a function that returns the real method. Sounds doable, yes.
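
A minimal sketch of that lazy method_table, under several assumptions: the list of top N names is fixed at build time, a linear search through the class and its parents stands in for the hash lookups, and every name below (`klass`, `find_method_indexed`, `klass_prime` and so on) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { N_TOP = 2 };                       /* the "top N" method names */
static const char *const top_names[N_TOP] = { "get", "set" };

typedef int (*method_fn)(int);

struct klass;
typedef method_fn (*resolver_fn)(struct klass *, int idx);

struct named_method {
    const char *name;
    method_fn   fn;
};

struct klass {
    const struct klass *parent;
    const struct named_method *methods;   /* stand-in for the class's hash */
    size_t      n_methods;
    resolver_fn resolve[N_TOP];           /* the method_table slots */
    method_fn   resolved[N_TOP];          /* results of successful lookups */
    int         slow_lookups;             /* counts slow-path calls (demo) */
};

/* Keyed lookup: this class first, then its parents. */
static method_fn
find_by_name(const struct klass *k, const char *name)
{
    for (; k != NULL; k = k->parent) {
        size_t i;

        for (i = 0; i < k->n_methods; i++)
            if (strcmp(k->methods[i].name, name) == 0)
                return k->methods[i].fn;
    }
    return NULL;
}

/* Installed after a successful lookup: just returns the real method. */
static method_fn
fast_resolve(struct klass *k, int idx)
{
    return k->resolved[idx];
}

/* Initial slot content: does the real lookup, then swaps itself out. */
static method_fn
slow_resolve(struct klass *k, int idx)
{
    method_fn fn = find_by_name(k, top_names[idx]);

    k->slow_lookups++;
    if (fn != NULL) {
        k->resolved[idx] = fn;
        k->resolve[idx]  = fast_resolve;
    }
    return fn;
}

/* Primes every slot with the lookup function; also serves as invalidation. */
void
klass_prime(struct klass *k)
{
    int i;

    assert(k != NULL);
    for (i = 0; i < N_TOP; i++) {
        k->resolve[i]  = slow_resolve;
        k->resolved[i] = NULL;
    }
}

/* The indexed find_method: one indirect call through the slot. */
method_fn
find_method_indexed(struct klass *k, int idx)
{
    assert(k != NULL && idx >= 0 && idx < N_TOP);
    return k->resolve[idx](k, idx);
}

/* Toy method for trying the table out. */
int get_impl(int x) { return x; }
```

Invalidation here simply re-primes one class. A real implementation would also have to reset the slots of every subclass that resolved the changed method through its parents, and writing to the slots is what would unshare copy-on-write pages after a fork.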

leo