17:44 <@Nicholas> can the type returned by Parrot_PMC_typenum for a PMC ever be zero?
17:45 <@Dan> I don't think so, no
17:46 <@Dan> Take that back -- a PMC of type 'default' can have a type of 0
17:46 <@Dan> But you shouldn't ever have one, as they're not really instantiatable
In which case:
1: What does pmc_type return if it fails to find a PMC?
2: If that answer is 0, is it safe to document that 0 is a failure return, which happens to map to the PMC type for "default", but as they can't be instantiated, looking up "default" is "not supported" (or words to that effect)?
> 1: What does pmc_type return if it fails to find a PMC?
> 2: If that answer is 0, is it safe to document that 0 is a failure return, which happens to map to the PMC type for "default", but as they can't be instantiated looking up "default" is "not supported" (or words to that effect)
> Nicholas Clark
A related but different issue is that abstract pmcs (like Scalar and PerlScalar) have no pmc_type. I understand that a pmc_type is an offset into the table of pmc vtables and that we probably cannot have holes in that table. Nevertheless it would be nice for these abstract pmcs to have a pmc_type, say for base-type pmc checking in imcc or some related tools.
I think one of the constraints is that the pmc type numbers must not overlap the values in PARROT_DATA_TYPES.
Nicholas Clark <n...@ccl4.org> wrote:
> pmc_type is documented and implemented as follows:
> =item C<INTVAL pmc_type(Parrot_Interp interp, STRING *name)>
> On IRC I asked:
> 17:44 <@Nicholas> can the type returned by Parrot_PMC_typenum for a PMC ever be zero?
> 17:45 <@Dan> I don't think so, no
> 17:46 <@Dan> Take that back -- a PMC of type 'default' can have a type of 0
C<enum_type_undef> aka 0 is returned for unknown types.
> 1: What does pmc_type return if it fails to find a PMC?
$I0 = find_type "no_such" # $I0 := 0
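The "0 means unknown" convention can be sketched in C. This is only an illustration, not Parrot's implementation: the table contents, the type numbers, and the names `tn_pmc_type` and `TN_TYPE_UNDEF` are all made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for enum_type_undef: 0 means "no such type". */
enum { TN_TYPE_UNDEF = 0 };

struct tn_entry {
    const char *name;
    int         type;   /* positive numbers only; 0 is reserved for failure */
};

/* Illustrative table; these names and numbers are assumptions. */
static const struct tn_entry tn_types[] = {
    { "PerlInt",    1 },
    { "PerlString", 2 },
};

/* Return the type number for name, or TN_TYPE_UNDEF if it is unknown. */
int tn_pmc_type(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof tn_types / sizeof tn_types[0]; i++) {
        if (strcmp(tn_types[i].name, name) == 0)
            return tn_types[i].type;
    }
    return TN_TYPE_UNDEF;
}
```

With this shape, `tn_pmc_type("no_such")` yields 0, and a caller can treat 0 as failure as long as no instantiable type is ever registered under that number.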
> 2: If that answer is 0, is it safe to document that 0 is a failure return, which happens to map to the PMC type for "default", but as they can't be instantiated looking up "default" is "not supported" (or words to that effect)
Stéphane Payrard <s...@payrard.net> wrote:
> A related but different issue is that abstract pmcs (like Scalar and PerlScalar) have no pmc_type. I understand that a pmc_type is an offset into the table of pmc vtables and that we probably cannot have holes in that table. Nevertheless it would be nice for these abstract pmcs to have a pmc_type, say for base-type pmc checking in imcc or some related tools.
Isn't really needed:
  $ find t -name '*.t' | xargs grep -w isa
  ...
  t/pmc/objects.t: isa I0, P1, "scalar"
  ...
Having a type enum for these abstract types would imply installing a vtable filled with methods that catch errors.
> I think one of the constraints is that the pmc type numbers must not overlap the values in PARROT_DATA_TYPES.
If you mean the struct _data_types data_types[] list of "native" types, yes - they don't overlap, these are all negative numbers.
On Wed, Oct 27, 2004 at 12:19:22PM +0200, Leopold Toetsch wrote:
> Stéphane Payrard <s...@payrard.net> wrote:
> > A related but different issue is that abstract pmcs (like Scalar and PerlScalar) have no pmc_type. I understand that a pmc_type is an offset into the table of pmc vtables and that we probably cannot have holes in that table. Nevertheless it would be nice for these abstract pmcs to have a pmc_type, say for base-type pmc checking in imcc or some related tools.
> Isn't really needed:
>   $ find t -name '*.t' | xargs grep -w isa
>   ...
>   t/pmc/objects.t: isa I0, P1, "scalar"
>   ...
> Having a type enum for these abstract types would imply installing a vtable filled with methods that catch errors.
I never proposed the installation of vtables for these types. But to avoid creating holes in the table of vtables, there should be an integer range reserved for these abstract types. Being outside the range for regular pmc types, they would not need any vtable.
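A minimal C sketch of the reserved-range idea, under stated assumptions: the constants `AR_MAX_REGULAR` and `AR_ABSTRACT_BASE`, the struct, and both function names are hypothetical and chosen only to show how abstract type numbers could sit outside the dense vtable table.

```c
#include <stddef.h>

#define AR_MAX_REGULAR   256     /* hypothetical size of the dense vtable table */
#define AR_ABSTRACT_BASE 10000   /* hypothetical start of the reserved range */

struct ar_vtable { const char *whoami; };

/* Dense table: one slot per regular type number, so no holes are needed. */
static struct ar_vtable *ar_vtables[AR_MAX_REGULAR];

/* Abstract types live above the dense table and never index into it. */
int ar_is_abstract(int type)
{
    return type >= AR_ABSTRACT_BASE;
}

/* Return the vtable for a regular type, or NULL for abstract,
 * native (negative), undefined (0), or out-of-range type numbers. */
struct ar_vtable *ar_vtable_for(int type)
{
    if (type <= 0 || type >= AR_MAX_REGULAR)
        return NULL;
    return ar_vtables[type];
}
```

A tool such as imcc could then compare abstract type numbers without ever touching the vtable table; whether that check is worth having is exactly what the rest of the thread debates.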
> > I think one of the constraints is that the pmc type numbers must not overlap the values in PARROT_DATA_TYPES.
> If you mean the struct _data_types data_types[] list of "native" types, yes - they don't overlap, these are all negative numbers.
Stéphane Payrard <s...@payrard.net> wrote:
> On Wed, Oct 27, 2004 at 12:19:22PM +0200, Leopold Toetsch wrote:
>> Having a type enum for these abstract types would imply installing a vtable filled with methods that catch errors.
> I never proposed the installation of vtables for these types.
Well, then a type enum for abstract types is probably not useful anyway.
> there should be an integer range reserved for these abstract types. Being outside the range for regular pmc types, they would not need any vtable.
On Wed, Oct 27, 2004 at 06:24:59PM +0200, Leopold Toetsch wrote:
> Stéphane Payrard <s...@payrard.net> wrote:
> > On Wed, Oct 27, 2004 at 12:19:22PM +0200, Leopold Toetsch wrote:
> >> Having a type enum for these abstract types would imply installing a vtable filled with methods that catch errors.
> > I never proposed the installation of vtables for these types.
> Well, then a type enum for abstract types is probably not useful anyway.
> > there should be an integer range reserved for these abstract types. Being outside the range for regular pmc types, they would not need any vtable.
> And then?
That would allow typechecking to be implemented in imcc.
  .sym Scalar a
  a = new .PerlInt # ok. PerlInt is derived from Scalar
On Wed, Oct 27, 2004 at 01:19:29PM -0600, Luke Palmer wrote:
> Stéphane Payrard writes:
> > That would allow typechecking to be implemented in imcc.
> >   .sym Scalar a
> >   a = new .PerlInt # ok. PerlInt is derived from Scalar
> Ugh, yeah, but what does that buy you? In dynamic languages pure derivational typechecking is very close to useless. The reason C++[1] has pure derivational semantics is because of implementation. The vtable functions have the same relative address, so you can use a derived object interchangeably. In a language where methods are looked up by name, such strictures are more often over-restrictive than helpful.
So, still thinking (too?) statically, I should probably worry about what a pmc "does" instead of what it is derived from. In that case, I would not have to worry about abstract pmc classes. So the abstract classes missing from the enum in core_pmcs.h, which prompted that subthread, become a moot point.
> Anyway, that's just my rant. If such a thing is to be in imcc, it _must_ be optional without loss of feature. I have a quibble with the automatic typechecking of .param variables for the same reason.
Declaring a symreg just as a pmc would do that anyway.
> Luke
> [1] And the reason Java has it is because C++ did. Great design work, guys.
> Stéphane Payrard writes:
> > That would allow typechecking to be implemented in imcc.
> >   .sym Scalar a
> >   a = new .PerlInt # ok. PerlInt is derived from Scalar
> Ugh, yeah, but what does that buy you? In dynamic languages pure derivational typechecking is very close to useless. The reason C++[1] has pure derivational semantics is because of implementation. The vtable functions have the same relative address, so you can use a derived object interchangeably. In a language where methods are looked up by name, such strictures are more often over-restrictive than helpful.
Actually, if I were to write a perl runtime for parrot, mono or even the JVM I'd experiment with the same pattern. I guess it could be applied to a python implementation, too. You would assign small integer IDs to the names of the methods and build a vtable indexed by the id. In most cases the method name is known at compile time, so you know the id and you can get the method with a simple load from the vtable. This is much faster than a hash table lookup (I hinted at this in my old RFC for perl6).

Of course the table would be sparse, especially in pathological programs, so you could have a limit, like 100 entries or less, with IDs bigger than that using a different lookup (binary search on an array, for example). There are a number of optimizations that can be done to reduce the vtable size, but I'm not sure this would matter in parrot as long as bytecode values are as big as C ints:-)

Maybe someone has time to write a script and run it on a bunch of perl programs and report how many different method names are usually created. Of course it also depends on how much the hash lookup will cost wrt the total cost of a subroutine call...
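The scheme above can be sketched in C as a table indexed by method id, with a sorted overflow array for ids past the limit. Everything here is an assumption made for illustration: the names (`mt_class`, `mt_find`, and so on), the method signature, and the fixed limit of 100 taken from the "100 entries or less" remark.

```c
#include <stddef.h>

#define MT_FAST_SLOTS 100   /* the "100 entries or less" limit from the text */

typedef int (*mt_method)(int arg);

struct mt_slow_entry {
    int       id;     /* method id >= MT_FAST_SLOTS */
    mt_method code;
};

struct mt_class {
    mt_method                   fast[MT_FAST_SLOTS]; /* indexed by method id */
    const struct mt_slow_entry *slow;                /* sorted by id, ascending */
    size_t                      n_slow;
};

/* Find the code for a method id: a plain array load for small ids,
 * a binary search over the sorted overflow array otherwise.
 * Returns NULL when the class has no such method. */
mt_method mt_find(const struct mt_class *cls, int id)
{
    size_t lo = 0, hi = cls->n_slow;

    if (id >= 0 && id < MT_FAST_SLOTS)
        return cls->fast[id];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cls->slow[mid].id == id)
            return cls->slow[mid].code;
        if (cls->slow[mid].id < id)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Two example method bodies so the table has something to point at. */
int mt_double(int x) { return 2 * x; }
int mt_negate(int x) { return -x; }
```

A compiler that knows the method name at compile time would emit the id as a constant, so the common case is the single load from `fast[id]`; only rarely used ids pay for the binary search.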
lupus
-- ----------------------------------------------------------------- lu...@debian.org debian/rules lu...@ximian.com Monkeys do it better
Paolo Molaro <lu...@debian.org> wrote:
> On 10/27/04 Luke Palmer wrote:
>> Ugh, yeah, but what does that buy you? In dynamic languages pure derivational typechecking is very close to useless.
> Actually, if I were to write a perl runtime for parrot, mono or even the JVM I'd experiment with the same pattern.
For the latter two yes, but as Luke has outlined that doesn't really help for languages where methods are changing under the hood.
> You would assign small integer IDs to the names of the methods and build a vtable indexed by the id.
Well, we already got a nice method cache, which makes lookup a vtable-like operation, i.e. an array lookup. But that's runtime only (and it needs invalidation still). So actually just the first method lookup is a hash operation.
> ... There are a number of optimizations that can be done to reduce the vtable size, but I'm not sure this would matter in parrot as long as bytecode values are as big as C ints:-)
That ought to come ;) Cachegrind shows no problem with opcode fetch and, you know, when it's JIT-compiled, bytecode size doesn't matter anyway. We just avoid the opcode and operand decoding.
> >> Ugh, yeah, but what does that buy you? In dynamic languages pure derivational typechecking is very close to useless.
> > Actually, if I were to write a perl runtime for parrot, mono or even the JVM I'd experiment with the same pattern.
> For the latter two yes, but as Luke has outlined that doesn't really help for languages where methods are changing under the hood.
If a method changes you just replace the pointer in the vtable to point to the new method implementation. Invalidation is the same, you just replace it with a method that gives the "method not found" error/exception.
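As a rough C sketch of that argument: redefining a method is one pointer store, and removing it stores a stub that reports "method not found". The names (`sw_class`, `sw_set_method`, etc.), the fixed slot count, and the `-1` sentinel that stands in for an exception are all assumptions for the example.

```c
#include <stddef.h>

#define SW_N_METHODS 8   /* hypothetical number of method ids */

typedef int (*sw_method)(int arg);

struct sw_class {
    sw_method slots[SW_N_METHODS];   /* indexed by method id */
};

/* Stub installed for missing methods; a real runtime would throw here. */
int sw_method_not_found(int arg)
{
    (void)arg;
    return -1;   /* sentinel standing in for an error/exception */
}

/* Start with every slot pointing at the "not found" stub. */
void sw_class_init(struct sw_class *cls)
{
    size_t i;
    for (i = 0; i < SW_N_METHODS; i++)
        cls->slots[i] = sw_method_not_found;
}

/* Defining or redefining a method is a single pointer store. */
void sw_set_method(struct sw_class *cls, int id, sw_method code)
{
    cls->slots[id] = code;
}

/* Removing a method puts the stub back. */
void sw_remove_method(struct sw_class *cls, int id)
{
    cls->slots[id] = sw_method_not_found;
}

/* The call site: an indexed load followed by an indirect call. */
int sw_call(const struct sw_class *cls, int id, int arg)
{
    return cls->slots[id](arg);
}

/* Example method bodies. */
int sw_add_one(int x) { return x + 1; }
int sw_add_two(int x) { return x + 2; }
```

Because call sites never cache the function pointer themselves, replacing `slots[id]` takes effect on the very next call; the sketch ignores inheritance, where a change in a parent class would have to be propagated to subclasses' tables.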
> > You would assign small integer IDs to the names of the methods and build a vtable indexed by the id.
> Well, we already got a nice method cache, which makes lookup a vtable-like operation, i.e. an array lookup. But that's runtime only (and it needs invalidation still). So actually just the first method lookup is a hash operation.
And where is it cached and how? Take (sorry, still perl5 syntax:-):
  foreach $i (@list) {
      $i->method ();
  }

With the vtable idea, the low-level operations are (in pseudo-C):

  vtable = $i->vtable;                 // just a memory dereference
  code = vtable [method-constant-id];  // another mem deref
  run_code (code);

From your description it seems it would look like:

  vtable = $i->vtable;
  code = vtable->method_lookup ("method");  // C function call
  run_code (code);
Note that $i may be of different type for each loop iteration. Even a cached lookup is going to be slower than a simple memory dereference. Of course this only matters if the lookup is actually a bottleneck of your function call speed.
> > matter in parrot as long as bytecode values are as big as C ints:-)
> That ought to come ;) Cachegrind shows no problem with opcode fetch and you know, when it's compiled to JIT bytecode size doesn't matter anyway. We just avoid the opcode and operand decoding.
If you use a JIT, decode overhead is already very small:-) AFAIK, alpha is the only interesting architecture that doesn't do byte access (at least on older processors), so it may be a little inefficient there. But I think you should optimize for the common case. On my machine going through a byte opcode array is faster than an int one by about 15% (more if level 2 cache or mem is needed to hold it). The only issue is when you need to load int values that don't fit in a byte, but those are not as common as register numbers, which currently take a whole int in your bytecode and could just use a byte. Anyway, the two approaches may also balance out if the opcodes are in ro memory.

The issue is that in perl, for example, so much is supposed to happen at runtime: the 'use' operator changes the compiling environment, so you actually need to compile at runtime in many cases, not only for eval. That means emitting parrot bytecode in memory, and this bytecode is per-process, so it increases memory usage and eventually swapping activity. As you say, since you jit, this memory is wasted, since it goes unused soon after it is written. Another issue is disk-load time: when you have small test apps it doesn't matter, but when you start having bigger apps it might (even mmapping has its cost, if you need a larger working set to load bytecodes).

BTW, in the computed goto code, make the array of address labels const: it helps reduce the rw working set, at least when parrot is built as an executable.
lupus
Paolo Molaro <lu...@debian.org> wrote:
> On 10/29/04 Leopold Toetsch wrote:
>> Well, we already got a nice method cache, which makes lookup a vtable-like operation, i.e. an array lookup. But that's runtime only (and it needs invalidation still). So actually just the first method lookup is a hash operation.
> And where is it cached and how?
in src/objects.c:1077 ff. It's indexed by some bits of the method's name address.
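For readers without the source at hand, here is a loose C sketch of a cache "indexed by some bits of the method's name address". It is not the code in src/objects.c: the cache size, the shift, the one-cache-per-class layout, and every `mc_` name are assumptions, and it relies on method names being interned so that equal names share one address.

```c
#include <stddef.h>
#include <stdint.h>

#define MC_BITS 6
#define MC_SIZE (1u << MC_BITS)

typedef int (*mc_method)(int arg);

struct mc_entry {
    const char *name;   /* interned method name; compared by address */
    mc_method   code;
};

struct mc_cache {
    struct mc_entry entries[MC_SIZE];
    unsigned long   slow_lookups;   /* counts how often the slow path ran */
};

/* Pick a slot from a few bits of the name's address. The low bits are
 * dropped because allocations are usually aligned. */
size_t mc_index(const char *name)
{
    return (size_t)(((uintptr_t)name >> 3) & (MC_SIZE - 1));
}

/* Look up a method, consulting the cache first. slow is the fallback,
 * for example a hash lookup through the class and its parents. */
mc_method mc_find(struct mc_cache *cache, const char *name,
                  mc_method (*slow)(const char *name))
{
    struct mc_entry *e = &cache->entries[mc_index(name)];

    if (e->name == name)
        return e->code;            /* hit: one array load and a compare */

    cache->slow_lookups++;
    e->name = name;                /* miss: fill (or overwrite) the slot */
    e->code = slow(name);
    return e->code;
}

/* Drop everything, e.g. after a method was redefined. */
void mc_invalidate(struct mc_cache *cache)
{
    size_t i;
    for (i = 0; i < MC_SIZE; i++) {
        cache->entries[i].name = NULL;
        cache->entries[i].code = NULL;
    }
}

/* Example method and slow-path lookup used to exercise the cache. */
int mc_demo_method(int x) { return x * 2; }
mc_method mc_demo_slow(const char *name) { (void)name; return mc_demo_method; }
```

Only the first call for a given name takes the slow path; later calls cost an index computation and a pointer compare, which is still more work than the plain indexed load Paolo describes.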
> From your description it seems it would look like:
>   vtable = $i->vtable;
>   code = vtable->method_lookup ("method"); // C function call
>   run_code (code);
The method_lookup doesn't have a vtable indirection. And having a direct array lookup doesn't really scale. So the actual code is a bit more complicated (and in no way optimized).
> Note that $i may be of different type for each loop iteration.
Sure.
> Even a cached lookup is going to be slower than a simple memory dereference. Of course this only matters if the lookup is actually a bottleneck of your function call speed.
Function calling is currently being changed. So I don't know yet.
>> > matter in parrot as long as bytecode values are as big as C ints:-)
[ more about op size ]
Well, that isn't an issue currently, AFAIK. More time consuming things need optimizations first.
> BTW, in the computed goto code, make the array of address labels const: it helps reduce the rw working set, at least when parrot is built as an executable.
On Fri, Oct 29, 2004 at 05:19:35PM +0200, Leopold Toetsch wrote:
> The method_lookup doesn't have a vtable indirection. And having a direct array lookup doesn't really scale. So the actual code is a bit more complicated (and in no way optimized).
Something that just struck me reading this whole thread - can we take advantage of the lazy approach the prederef cores use? Prime all the slots in the vtable array with pointers to the same function that actually does the real work of the method lookup. So we take the lookup hit only when we call the method. This would also mean that cache invalidation is easy - if the method is changed (or deleted) just unconditionally reset all the slots that refer to it back to pointers to the initial lookup function.
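The lazy scheme can be sketched in C as follows. This is a single-class illustration under stated assumptions: the `lz_` names, the method signature (which passes the class and method id so the resolver knows which slot to patch), and the `real` array standing in for the actual hash lookup are all made up for the example.

```c
#include <stddef.h>

#define LZ_SLOTS 4   /* hypothetical number of method ids */

struct lz_class;
typedef int (*lz_method)(struct lz_class *cls, int id, int arg);

struct lz_class {
    lz_method     slots[LZ_SLOTS];   /* what call sites jump through */
    lz_method     real[LZ_SLOTS];    /* stands in for the slow hash lookup */
    unsigned long lookups;           /* how many times the slow path ran */
};

/* The shared "do the real work" function every slot starts out with.
 * It resolves the method, patches the slot, then forwards the call.
 * Handling of a missing method (real[id] == NULL) is omitted here. */
int lz_resolve(struct lz_class *cls, int id, int arg)
{
    cls->lookups++;
    cls->slots[id] = cls->real[id];   /* later calls skip the lookup */
    return cls->slots[id](cls, id, arg);
}

/* Prime (or re-prime) every slot with the lazy resolver. */
void lz_reset_all(struct lz_class *cls)
{
    size_t i;
    for (i = 0; i < LZ_SLOTS; i++)
        cls->slots[i] = lz_resolve;
}

/* Changing a method: update the slow table and reset just that slot. */
void lz_redefine(struct lz_class *cls, int id, lz_method code)
{
    cls->real[id] = code;
    cls->slots[id] = lz_resolve;
}

/* The call site always goes through the slot. */
int lz_call(struct lz_class *cls, int id, int arg)
{
    return cls->slots[id](cls, id, arg);
}

/* Example method bodies. */
int lz_square(struct lz_class *cls, int id, int arg)
{
    (void)cls; (void)id;
    return arg * arg;
}

int lz_negate(struct lz_class *cls, int id, int arg)
{
    (void)cls; (void)id;
    return -arg;
}
```

The first call through a slot pays for the lookup and every later call is a direct indirect call. With inheritance, `lz_redefine` would also have to reset the matching slot in every class that resolved to the old method, which is the "reset all the slots that refer to it" step.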
However this way we'd suffer on copy on write shared memory pages becoming unshared if something forked, so (massive premature optimisation alarm bells ring here) maybe it would also be useful to have an API hook to call to say "stop being lazy; do whatever work is needed to optimise the situation for being a parent process that will spawn many children".
Nicholas Clark wrote:
> On Fri, Oct 29, 2004 at 05:19:35PM +0200, Leopold Toetsch wrote:
>>The method_lookup doesn't have a vtable indirection. And having a direct array lookup doesn't really scale. So the actual code is a bit more complicated (and in no way optimized).
> Something that just struck me reading this whole thread - can we take advantage of the lazy approach the prederef cores use? Prime all the slots in the vtable array with pointers to the same function that actually does the real work of the method lookup. So we take the lookup hit only when we call the method.
First, while Paolo named the thing vtable, that term collides with pmc->vtable. So let's call the thing method_table. Second, I have to revise my above sentence: method lookup *is* a vtable call. We have:
The find_method calls the method lookup, which is a hash lookup in the object's class, and likely some more hash lookups in the class's parents.
So back to your idea. We'd need a list of the top N methods of all objects and generate a sparse array of index <-> method_str mappings per object primed with the actual lookup function. For all top N methods, we'd replace the keyed find_method with an indexed find_method.
Then after a successful lookup, we could replace the lookup-function with a function that returns the real method. Sounds doable, yes.