[svn:parrot-pdd] r14774 - in trunk: . docs/pdds/clip

l...@cvs.perl.org

Sep 27, 2006, 3:57:23 PM
to perl6-i...@perl.org
Author: leo
Date: Wed Sep 27 12:57:22 2006
New Revision: 14774

Added:
trunk/docs/pdds/clip/pddXX_cstruct.pod (contents, props changed)
trunk/docs/pdds/clip/pddXX_pmc.pod (contents, props changed)

Changes in other areas also in this revision:
Modified:
trunk/MANIFEST

Log:
add 2 new design docs - see also mail

Added: trunk/docs/pdds/clip/pddXX_cstruct.pod
==============================================================================
--- (empty file)
+++ trunk/docs/pdds/clip/pddXX_cstruct.pod Wed Sep 27 12:57:22 2006
@@ -0,0 +1,324 @@
+=head1 TITLE
+
+C Structure Class
+
+=head1 STATUS
+
+Proposal.
+
+=head1 AUTHOR
+
+Leopold Toetsch
+
+=head1 ABSTRACT
+
+The ParrotClass PMC is the default implementation (and the meta class)
+of Parrot's HLL classes. It provides attribute access and (TODO)
+introspection of attribute names. It also handles method
+dispatch and inheritance.
+
+C structures used throughout Parrot (PMCs) and user-visible C
+structures provided by the C<{Un,}ManagedStruct> PMC don't have this
+flexibility.
+
+The proposed C<CStruct> PMC tries to bridge this gap.
+
+=head1 DESCRIPTION
+
+The C<CStruct> PMC is the class PMC of classes, which are not
+based on PMC-only attributes but on the general case of a C structure.
+That is, the C<CStruct> is actually the parent class of
+C<ParrotClass>, which is a PMC-only special case. And it is the
+theoretical ancestor class of all PMCs (including itself :).
+
+The relationship of C<CStruct> to other PMCs is like this:
+
+ PASM/PIR code C code
+ Class ParrotClass CStruct
+ Object ParrotObject *ManagedStruct
+ (other PMCs)
+
+That is, it is the missing piece of already existing PMCs. The current
+*ManagedStruct PMCs are providing the class and object functionality in
+one and the same PMC (as BTW all other existing PMCs are doing). But
+this totally prevents proper inheritance and reusability of such PMCs.
+
+The C<CStruct> class provides the necessary abstract backings to get
+rid of current limitations.
+
+=head1 SYNTAX BITS
+
+=head2 Constructing a CStruct
+
+A typical C structure:
+
+ struct foo {
+ int a;
+ char b;
+ };
+
+could be created in PIR with:
+
+ cs = subclass 'CStruct', 'foo' # or maybe cs = new_c_class 'foo'
+ addattribute cs, 'a'
+ addattribute cs, 'b'
+
+The semantics of a C structure are the same as of a Parrot Class.
+But we need the types of the attributes too:
+
+Handwavingly TBD 1)
+
+with ad-hoc existing syntax:
+
+ .include "datatypes.pasm"
+ cs['a'] = .DATATYPE_INT
+ cs['b'] = .DATATYPE_CHAR
+
+Handwavingly TBD 2)
+
+with new variants of the C<addattribute> opcode:
+
+ addattribute cs, 'a', .DATATYPE_INT
+ addattribute cs, 'b', .DATATYPE_CHAR
+
+Probably desired and with not much effort TBD 3):
+
+ addattribute(s) cs, <<'DEF'
+ int a;
+ char b;
+ DEF
+
+The possible plural in the opcode name would match semantics, but it is not
+necessary. The syntax is just using Parrot's here documents to define
+all the attributes and types.
+
+ addattribute(s) cs, <<'DEF'
+ int "a";
+ char "b";
+ DEF
+
+The generalization of arbitrary attribute names would of course be
+possible too, but isn't likely needed.
+
+=head2 Syntax variant
+
+ cs = subclass 'CStruct', <<'DEF'
+ struct foo {
+ int a;
+ char b;
+ };
+ DEF
+
+I.e. create all in one big step.
+
+=head2 Object creation and attribute usage
+
+This is straightforward and conforms to current ParrotObjects:
+
+ o = new 'foo' # a ManagedStruct instance
+ setattribute o, 'a', 4711
+ setattribute o, 'b', 22
+ ...
+
+The only needed extension would be C<{get,set}attribute> variants with
+natural types.
+
+Even (with nice to have IMCC syntax sugar):
+
+ o.a = 4711 # setattribute
+ o.b = 22
+ $I0 = o.a # getattribute
+
+=head2 Nested Structures
+
+ foo_cs = subclass 'CStruct', 'foo'
+ addattribute(s) foo_cs, <<'DEF'
+ int a;
+ char b;
+ DEF
+ bar_cs = subclass 'CStruct', 'bar'
+ addattribute(s) bar_cs, <<'DEF'
+ double x;
+ foo foo; # the foo class is already defined
+ foo *fptr;
+ DEF
+ o = new 'bar'
+ setattribute o, 'x', 3.14
+ setattribute o, ['foo'; 'a'], 4711 # o.foo.a = 4711
+ setattribute o, ['fptr'; 'b'], 255
+
+Attribute access is similar to current *ManagedStruct's hash syntax
+but with a syntax matching ParrotObjects.
+
+=head2 Array Structure Elements
+
+ foo_cs = subclass 'CStruct', 'foo'
+ addattribute(s) foo_cs, <<'DEF'
+ int a;
+ char b[100];
+ DEF
+
+=head2 Possible future extensions
+
+ cs = subclass 'CStruct', 'todo'
+ addattribute(s) cs, <<'DEF'
+ union { # union keyword
+ int a;
+ double b;
+ } u;
+ char b[100] :ro; # attributes like r/o
+ DEF
+
+=head2 Managed vs. Unmanaged Structs
+
+The term "managed" in current structure usage defines the owner of the
+structure memory. C<ManagedStruct> means that parrot is the owner of
+the memory and that GC will eventually free the structure memory. This
+is typically used when C structures are created in parrot and passed
+into external C code.
+
+C<UnManagedStruct> means that there's some external owner of the
+structure memory. Such structures are typically return results of
+external code.
+
+E.g.:
+
+ $P0 = some_c_func() # UnManagedStruct result
+ assign $P0, foo_cs # assign a structure class to it
+
+ o = new 'foo' # ManagedStruct instance
+ setattribute o, 'a', 100
+ setattribute o, ['b'; 99], 255 # set last elem
+
+=head1 RATIONALE
+
+Parrot as the planned interpreter glue language should have access to
+all possible C libraries and structures. It has to abstract the
+low-level bindings in a HLL independent way and should still be able to
+communicate all information "upstairs" to the HLL users.
+
+But it's not HLL usage only, parrot itself is already suffering from
+lack of abstraction at PMC level.
+
+=head2 Inheritance
+
+I've implemented an OO-ified HTTP server named F<httpd2.pir>. The
+C<HTTP::Connection> class ought to be a subclass of C<ParrotIO> (we
+don't have a base socket class, but ParrotIO would do it for now).
+This kind of inheritance isn't possible. The implementation is now a
+connection B<hasa> ParrotIO, instead of B<isa>. It of course loses
+all inheritance that way, which leads to delegation code and
+workarounds.
+
+The same workarounds are all over the F<SDL/*> classes. There are
+I<layout> helpers and raw structure accessors and what not. Please
+read the code. It's really not a problem of the implementation (which is
+totally fine); it's just the lack of usability of Parrot when it comes
+to native structures (or PMCs).
+
+All these experiments to use a C structure or a PMC as base class
+end with a C<has> relationship instead of the natural C<isa>.
+Any useful OO-ish abstraction is lost, which leads to clumsy code,
+and - no - implementing interfaces/traits/mixins can't help here, as
+these are all based on the abstraction described here.
+
+=head2 Inheritance and attribute access
+
+This proposal alone doesn't solve all inheritance problems. It also
+requires that the memory layout of PMCs and ParrotObjects deriving from
+PMCs be the same. E.g.
+
+ cl = subclass 'Integer', 'MyInt'
+
+The C<int_val> attribute of the core C<Integer> type is located in the
+C<cache> union of the PMC. The integer item in the subclass is hanging
+off the C<data> array of attributes and worse it is a PMC too, not a
+natural int. This not only causes additional indirections (see
+F<deleg_pmc.pmc>) but also negatively impacts C<Integer> PMCs, as all
+access to the C<int_val> has to be indirected through C<get_integer()>
+or C<set_integer_native()> to be able to deal with subclassed integers.
+
+Again the implementation of above is: MyInt B<hasa> Integer, instead of
+the desired B<isa> int_val.
+
+With the abstraction of a C<CStruct> describing the C<Integer> PMC and
+with differently sized PMCs, we can create an object layout, where the
+C<int_val> attribute of C<Integer> and C<MyInt> are at the same
+location and of the same type.
+
+Given this (internal) definition of the C<Integer> PMC:
+
+ intpmc_cl = subclass 'CStruct', 'Integer'
+ addattribute(s) intpmc_cl, <<'DEF'
+ INTVAL int_val; # PMC internals are hidden
+ DEF
+
+we can transparently subclass it as C<MyInt>, as all the needed
+information is present in the C<CStruct intpmc_cl> class.
+
+=head2 Introspection, PMCs and more
+
+ cc = subclass 'CStruct', 'Complex'
+ addattribute(s) cc, <<'DEF'
+ FLOATVAL re;
+ FLOATVAL im;
+ DEF
+
+This is the (hypothetical) description of a C<Complex> PMC class. An
+equivalent syntax can be translated by the PMC compiler to achieve the
+same result.
+
+This definition of the attributes of that PMC automagically provides
+access to all the information stored in the PMC. All such access is
+currently hand-crafted in F<complex.pmc>. Not only could this
+accessor code be abandoned (and unified with common syntax), but
+all possible classes inheriting from that PMC could use this
+information.
+
+=head1 Implementation
+
+C<CStruct> is basically yet another PMC and can be implemented and put
+to functionality without any interference with existing code. It is
+also orthogonal with possible PMC layout changes.
+
+The internals of C<CStruct> can largely reuse code from F<src/objects.c>
+to deal with inheritance or object instantiation. The main difference
+is that each attribute additionally has a type attached to it and,
+consequently, that the attribute offsets are calculated differently
+depending on type, alignment, and padding. These calculations are
+already done in F<unmanagedstruct.pmc>.
+
+C<CStruct> classes can be attached to existing PMCs gradually (and by
+far not all PMCs need that abstract backing). But think e.g. of the
+C<Sub> PMC. Attaching a C<CStruct> to it would instantly give access
+to all its attributes and vastly simplify introspection.
+
+Only the final step ("Inheritance and attribute access") needs all
+parts to play together.
+
+=head1 All together now
+
+=over
+
+=item Differently sized PMCs
+
+Provide the flexible PMC layout.
+
+=item CStruct classes
+
+Are describing the structure of PMCs (or any C structure).
+
+=item R/O vtables
+
+Prohibit modification of readonly PMCs like the C<Sub> PMC. These are
+already coded within the C<STM> project.
+
+=back
+
+=head1 SEE ALSO
+
+pddXX_pmc.pod (proposal for a flexible PMC layout)
+
+=cut
+
+vim: expandtab sw=2 tw=70:

Added: trunk/docs/pdds/clip/pddXX_pmc.pod
==============================================================================
--- (empty file)
+++ trunk/docs/pdds/clip/pddXX_pmc.pod Wed Sep 27 12:57:22 2006
@@ -0,0 +1,295 @@
+# Copyright (C) 2001-2006, The Perl Foundation.
+# $Id:$
+
+=head1 NAME
+
+docs/pdds/pddXX_pmc.pod - PMCs
+
+=head1 STATUS
+
+Proposal
+
+=head1 VERSION
+
+$Revision:$
+
+=head1 ABSTRACT
+
+This document defines Parrot Magic Cookies (PMCs).
+
+[[ maybe rename PMC to PO (Parrot Objects) or such to reduce confusion
+with Perl5's PMC (compiled .pm files). ]]
+
+=head1 TERMINOLOGY
+
+This document uses C<OPMC>, when speaking of "old" PMCs of Parrot
+Version 0.4.6 or less. C<PMC> is the new layout as proposed in this
+document.
+
+An C<attribute> is otherwise also known as a C<field> or structure
+element, but I'm using C<attribute> here because the difference of
+PMCs and Parrot Objects should be minimized.
+
+=head1 DESCRIPTION
+
+PMCs are Parrot's low-level objects implemented in C. PMCs are small
+non-resizable variable-sized structures. The C<PMC> itself is the
+common part of all PMCs. The per-PMC payload holds PMC-specific
+attributes.
+
+=head2 PMC Layout
+
+ +---------------+
+ | vtable | # common PMC attribute
+ +---------------+
+ | flags | # common PMC attribute
+ +---------------+
+ | attrib_1 | # "user" defined part
+ | ... |
+ +---------------+
+
+ #define THE_PMC \
+ struct _vtable* vtable; \
+ UINTVAL flags
+
+An Integer Value PMC could be defined with:
+
+ struct VInt_PMC {
+ THE_PMC;
+ INTVAL val;
+ };
+
+Such PMC definitions are typically private to the F<.pmc> files. All
+access to PMCs shall be through VTABLE functions or methods. OTOH some
+widely used PMCs might export their attributes for public use and are
+then part of the Parrot API.
+
+A typical C<VInt> vtable function would look like this:
+
+ INTVAL get_integer() {
+ VInt_PMC *me = (VInt_PMC*)SELF; /* [1] */
+ return me->val;
+ }
+
+The C<OPMC> can be defined in terms of a PMC by rearranging the
+structure elements.
+
+[1] The PMC compiler could provide this line automagically and define
+a convenience variable C<ME> similar to the current C<SELF>.
+
+=head2 PMC creation
+
+PMCs are created via C<VTABLE_new> or variants of C<_new>. It's up to
+the PMC to initialize its attributes. C<new> is a class method, i.e.
+it's called with the PMC's C<class> as C<SELF>.
+
+ PMC* new() {
+ VInt_PMC *me = new_bufferlike_header(INTERP, sizeof(VInt_PMC));
+ me->val = 0;
+ return (PMC*)me;
+ }
+
+[[ rename C<new_bufferlike_header> to something more meaningful ]]
+
+=head3 Optimization
+
+The vtable can provide a pointer to the sized header pool to possibly
+speed up allocation.
+
+=head3 OPMC vs. PMC creation
+
+PMCs with a non-default C<new> method are PMCs. The old scheme via
+C<pmc_new> and C<VTABLE_init> provides a fallback for creating C<OPMCs>.
+
+=head1 Additional PMC attributes
+
+=head2 pmc->_next_for_GC / opmc->pmc_ext->_next_for_GC
+
+All PMCs that refer to other PMCs have a 3rd mandatory attribute
+C<_next_for_GC>, used for garbage collection. The presence of this
+attribute is signaled by the flag bit C<PObj_is_PMC_EXT_FLAG>.
+
+ +---------------+
+ | vtable |
+ +---------------+
+ | flags |
+ +---------------+
+ | _next_for_GC |
+ +---------------+
+ | ... |
+ +---------------+
+
+=head2 Properties opmc->pmc_ext->_metadata
+
+PMCs do not support properties universally. If properties are still
+desired, these can be implemented in one of the following ways:
+
+[[ TODO define something canonical ]]
+
+=head3 Per PMC type
+
+Each PMC that wants this extra hash can just provide an attribute for
+it and implement the property vtable functions.
+
+=head3 interpreter->prop_hash
+
+This will be a Hash, indexed by the PMC's address, containing the property
+Hash. An additional flag can be provided, if such a property hash
+exists for a PMC. During collection of a PMC, this hash is invalidated
+too.
+
+=head3 PropRef
+
+A transparent C<Ref> PMC can point to a structure holding the original
+PMC and the property Hash.
+
+=head2 Locking or opmc->pmc_ext->_synchronize
+
+PMCs do not support locking universally. Creating sharable PMCs at
+runtime (from standard PMCs) is again done by transparent Refs like
+C<SharedRef> or C<STMRef>.
+
+=head2 Shared PMCs
+
+If needed, we can define shared PMCs by allocating the C<_Sync>
+structure in front of the PMC:
+
+ +---------------+
+ | struct |
+ | _Sync |
+ +---------------+ <--- pmc points here
+ | vtable |
+ +---------------+
+ | flags |
+ +---------------+
+
+This works, of course, only if PMCs are created as C<shared> in the
+first place. The presence of the C<_Sync> structure is stated by a PMC
+flag bit.
+
+=head2 PMCs and morphing
+
+PMCs (like current OPMCs), which may morph themselves, and thereby
+change their vtable and the meaning of their attributes shall use a
+union of the desired attributes, e.g.:
+
+ struct Integer_PMC {
+ THE_PMC;
+ union {
+ INTVAL int_val;
+ FLOATVAL num_val;
+ STRING *str_val;
+ } u;
+ };
+
+=head1 ATTACHMENTS
+
+(none)
+
+=head1 REFERENCES
+
+ TODO pdd02_vtables.pod
+ TODO pddXX_interfaces.pod
+ TODO pddXX_classes.pod
+ TODO pddXX_objects.pod
+ TODO pddXX_cstruct.pod [2]
+
+[2] PMCs need a class object that defines their attributes to properly
+allow subclassing. The attribute definition is held by a C<CStruct>
+PMC, the meta class of all Parrot PMCs. It's a list of attribute
+names, their types, and possibly the offsets in the PMC structure. See
+also the C<UnManagedStruct> PMC.
+
+=head1 RATIONALE
+
+Current OPMCs are too rigid: mostly either too small or too big. A lot
+of information is hanging off secondary malloced structures like
+C<PMC_sub> in the C<Sub> OPMC.
+
+But more importantly, OPMCs don't properly allow subclassing. E.g.
+
+ cl = subclass 'Hash', 'PGE::Match'
+
+is currently done by creating a C<ParrotClass>. When instantiated, there
+is a "hidden" C<__value> element as first attribute, which is a
+pointer to the hash parent PMC. This is creating internal structures
+which aren't compatible, because the object attributes are differently
+arranged. That is, above subclassing is mainly: C<PGE::Match> I<hasa>
+C<Hash> instead of I<isa>, when it comes to attribute relationship.
+
+This limitation prevents further implementation of already (at least
+partially) documented APIs, like the C<Compiler> one.
+
+A C<Compiler> object is either a Parrot C<Sub> like C<PGE::P6Regex> or
+a builtin, that is, an C<NCI> compiler like C<PIR>. But C<Sub> and C<NCI>
+objects are so different that even currently needed attributes aren't
+consistently arranged (e.g. C<multi_sig> or C<namespace_stash>).
+Creating proper compiler objects like C<PIR_Compiler> with common and
+needed C<Compiler> attributes isn't possible now.
+
+[[ Well, with another one or two indirections all can be implemented,
+but that's just adding to code complexity. ]]
+
+Please note that the mentioned C<Compiler> PMC is just one of many
+PMCs that exhibit the same problem.
+
+In combination with a proper metaclass for PMCs, PMCs and "real"
+Parrot objects should be able to work together seamlessly.
+
+=head1 PERFORMANCE CONSIDERATIONS
+
+Due to reduced memory consumption and reduced allocation of secondary
+helper structures, this change will very likely speed up Parrot
+slightly to moderately. No negative performance impact is
+foreseeable due to these changes.
+
+=head1 IMPLEMENTATION
+
+Parrot core needs very few changes to be able to deal with
+differently sized PMCs. All the GC infrastructure is already there for
+C<Buffer_like> objects, which are managed in sized header pools.
+
+Changing PMCs to the new scheme can be done as needed and isn't
+mandatory.
+
+C<OPMCs> attribute access is currently already done through C macros,
+like C<PMC_int_val> or C<PMC_struct_val>. These macros can cast the
+passed pointer to C<(OPMC*)>, so that all these C<OPMCs> will still be
+working. New PMCs shall use explicit and more verbose attribute names,
+which don't collide with present C<OPMC> attributes.
+
+There'll be no implications to existing PASM or PIR code nor to
+existing dynamic PMCs.
+
+=head1 ALTERNATIVE PMC IMPLEMENTATIONS
+
+I've already laid out a PMC scheme optimized for generational garbage
+collection in another document (F<PMC.pod>). That PMC layout also uses
+differently but fixed sized user parts of PMCs, but these are
+subject to one more indirection. If we see the need for optimized GC,
+this PMC scheme can still be implemented. We could probably make
+provisions so that such a change is not too intrusive by cleverly using
+the PMC compiler and/or C macros like the proposed C<ME> in [1] above.
+
+The payload of PMCs in this scheme is hanging off a C<pmc_body>
+pointer:
+
+ +---------------+
+ | pmc_body | --> body (buffer) memory
+ +---------------+
+ | vtable |
+ +---------------+
+ | flags |
+ +---------------+
+
+The implementation of C<PMC> might take into consideration that the PMC
+layout could change further.
+
+=cut
+
+__END__
+Local Variables:
+ fill-column:78
+End:
+
+vim: expandtab sw=2 tw=70:

Jonathan Worthington

Sep 27, 2006, 7:30:43 PM
to perl6-i...@perl.org
Hi,

Some first thoughts that come to mind after reading leo's two proposals.

> +A typical C structure:
> +
> + struct foo {
> + int a;
> + char b;
> + };
> +
> +could be created in PIR with:
> +
> + cs = subclass 'CStruct', 'foo' # or maybe cs = new_c_class 'foo'
> + addattribute cs, 'a'
> + addattribute cs, 'b'
> +
> +The semantics of a C structure are the same as of a Parrot Class.
> +But we need the types of the attributes too:
> +
> +Handwavingly TBD 1)
> +
> +with ad-hoc existing syntax:
> +
> + .include "datatypes.pasm"
> + cs['a'] = .DATATYPE_INT
> + cs['b'] = .DATATYPE_CHAR
> +
>

This (and the addattribute for native types) is one thing that would
certainly simplify code generation for the .Net translator by
eliminating various boxing and unboxing code that I emit now. I imagine
it will help with other languages too.

> +Handwavingly TBD 2)
> +
> +with new variants of the C<addattribute> opcode:
> +
> + addattribute cs, 'a', .DATATYPE_INT
> + addattribute cs, 'b', .DATATYPE_CHAR
>

Certainly preferable to syntax 1.

> +Probably desired and with not much effort TBD 3):
> +
> + addattribute(s) cs, <<'DEF'
> + int a;
> + char b;
> + DEF
>

I'm not so keen on this part of the proposal. It means the CStruct PMC
needs to parse the above syntax (but at least that also means no
additions to PIR parsing to support this, but the previous two
suggestions did not either).

I think if we could "magically" have the .DATATYPE_INT constants
existing without needing to .include them the previous syntax (number 2)
would be preferable. It compiles down to just a sequence of bytecode
instructions then, rather than a constants table entry for the string.
But more importantly, all syntax checking is done at PIR compile time,
whereas the string describing the struct elements and types would not be
parsed until runtime so typos in type names or general syntax errors
aren't detected until then.

> +The generalization of arbitrary attribute names would of course be
> +possible too, but isn't likely needed.
>

Unsure what this means - please clarify this a bit.

> +=head2 Syntax variant
> +
> + cs = subclass 'CStruct', <<'DEF'
> + struct foo {
> + int a;
> + char b;
> + };
> + DEF
> +
> +I.e. create all in one big step.
>

Same issues as above.

> +=head2 Object creation and attribute usage
> +
> +This is straight forward and conforming to current ParrotObjects:
> +
> + o = new 'foo' # a ManagedStruct instance
> + setattribute o, 'a', 4711
> + setattribute o, 'b', 22
> + ...
> +
> +The only needed extension would be C<{get,set}attribute> variants with
> +natural types.
>

This is the real place, of course, where the .Net translator (and I
think other compilers) will save on spitting out box/unbox code.

> +=head2 Nested Structures
> +
> + foo_cs = subclass 'CStruct', 'foo'
> + addattribute(s) foo_cs, <<'DEF'
> + int a;
> + char b;
> + DEF
> + bar_cs = subclass 'CStruct', 'bar'
> + addattribute(s) bar_cs, <<'DEF'
> + double x;
> + foo foo; # the foo class is already defined
>

May I suggest change second foo there to something else? I know it's the
attribute name, but it made me scratch my head to check something odd
wasn't going on.


> + foo *fptr;
> + DEF
> + o = new 'bar'
> + setattribute o, 'x', 3.14
> + setattribute o, ['foo'; 'a'], 4711 # o.foo.a = 4711
> + setattribute o, ['fptr'; 'b'], 255
>

Can you describe the semantics of foo vs *foo (or *fptr as it appears
in the above code) more clearly? I guess it's just that in one case
the foo structure is a part of the bar one, and in the other case it's a
pointer to it, like in C? But please don't rely too much on knowledge of
C semantics when describing Parrot ones.

> +=head2 Array Structures Elements
> +
> + foo_cs = subclass 'CStruct', 'foo'
> + addattribute(s) foo_cs, <<'DEF'
> + int a;
> + char b[100];
> + DEF
>

With bounds checking on accesses to b, right?

> +=head2 Managed vs. Unmanaged Structs
> +
> +The term "managed" in current structure usage defines the owner of the
> +structure memory. C<ManagedStruct> means that parrot is the owner of
> +the memory and that GC will eventually free the structure memory. This
> +is typically used when C structures are created in parrot and passed
> +into external C code.
> +
> +C<UnManagedStruct> means that there's some external owner of the
> +structure memory. Such structures are typically return results of
> +external code.
> +
>

I think for safety reasons we will later want to have some way of only
letting approved code that uses unmanagedstructs run, as with them
anyone can segfault the VM in no time at all...but that's for a security
PDD or something.

> +This proposal alone doesn't solve all inheritance problems. It is also
> +needed that the memory layout of PMCs and ParrotObjects deriving from
> +PMCs is the same. E.g.
> +
> + cl = subclass 'Integer', 'MyInt'
> +

> ...


> +
> +With the abstraction of a C<CStruct> describing the C<Integer> PMC and
> +with differently sized PMCs, we can create an object layout, where the
> +C<int_val> attribute of C<Integer> and C<MyInt> are at the same
> +location and of the same type.
> +
> +Given this (internal) definition of the C<Integer> PMC:
> +
> + intpmc_cl = subclass 'CStruct', 'Integer'
> + addattribute(s) intpmc_cl, <<'DEF'
> + INTVAL int_val; # PMC internals are hidden
> + DEF
> +
> +we can transparently subclass it as C<MyInt>, as all the needed
> +information is present in the C<CStruct intpmc_cl> class.
> +
>

Maybe a side issue, but how do you propose dealing with languages that
allow:

class A {
private int x;
...
}

class B is A {
private int x; /* Parent's x not visible, but name is the same. */
...
}

Where methods in A will access the x defined in A and methods in B will
access the x defined in B?

> +=head1 Implementation
> +
> +C<CStruct> is basically yet another PMC and can be implemented and put
> +to functionality without any interference with existing code. It is
> +also orthogonal with possible PMC layout changes.
> +
> +The internals of C<CStruct> can vastly reuse code from F<src/objects.c>
> +to deal with inheritance or object instantiation. The main difference
> +is that attributes have additionally a type attached to it and
> +consequently that the attribute offsets are calculated differently
> +depending on type, alignment, and padding. These calculations are
> +already done in F<unmanagedstruct.pmc>.
>

I am curious how this hurts our portability. Alignment and padding can
differ somewhat between platforms. And don't optimizing compilers
sometimes re-order struct elements for better packing? Yes, there will
(should!) be flags to disable that of course, but what burden are we
putting on people porting Parrot?

(Put another way: how portable is the UnmanagedStruct PMC?)

> +C<CStruct> classes can be attached to existing PMCs gradually (and by
> +far not all PMCs need that abstract backing). But think e.g. of the
> +C<Sub> PMC. Attaching a C<CStruct> to it would instantly give access
> +to all its attributes and vastly simplify introspection.
>

A Good Thing. Also we will want an interface to get hold of the
attribute names and types...

> +=head1 Additional PMC attributes
> +
> +=head2 pmc->_next_for_GC / opmc->pmc_ext->_next_for_GC
> +
> +All PMCs that refer to other PMCs have a 3rd mandatory attribute
> +C<_next_for_GC>, used for garbage collection. The presence of this
> +attribute is signaled by the flag bit C<PObj_is_PMC_EXT_FLAG>.
> +
> + +---------------+
> + | vtable |
> + +---------------+
> + | flags |
> + +---------------+
> + | _next_for_GC |
> + +---------------+
> + | ... |
> + +---------------+
> +
>

Another side-thought - if we know the types of the things in the
attributes of the PMC, can we not auto-generate the mark code for any we
know are PMC* or STRING*?

> +=head2 Locking or opmc->pmc_ext->_synchronize
> +
> +PMCs do not support locking universally. Creating sharable PMCs at
> +runtime (from standard PMCs) is again done by transparent Refs like
> +C<SharedRef> or C<STMRef>.
> +
> +=head2 Shared PMCs
> +
> +If needed, we can define shared PMCs by allocating the C<_Sync>
> +structure in front of the PMC:
> +
> + +---------------+
> + | struct |
> + | _Sync |
> + +---------------+ <--- pmc points here
> + | vtable |
> + +---------------+
> + | flags |
> + +---------------+
> +
> +This works of course only, if PMCs are created as C<shared> in the
> +first place. The presence of the C<_Sync> structure is stated by a PMC
> +flag bit.
>

Why put it before the location that is pointed to? That seems confusing
to me, and inconsistent with the next_for_gc entry that is placed after
the flags rather than before the PMC starts. Plus I imagine it
complicates de-allocation - you have to check the flag and subtract
sizeof(struct _Sync) if it's set...

That's "all" that comes to me right now. ;-)

Thanks,

Jonathan

Leopold Toetsch

Sep 28, 2006, 5:06:38 AM
to perl6-i...@perl.org
On Thursday, 28 September 2006 01:30, Jonathan Worthington wrote:
> Hi,
>
> Some first thoughts that come to mind after reading leo's two proposals.

> > +with new variants of the C<addattribute> opcode:


> > +
> > + addattribute cs, 'a', .DATATYPE_INT
> > + addattribute cs, 'b', .DATATYPE_CHAR
>
> Certainly preferable to syntax 1.
>
> > +Probably desired and with not much effort TBD 3):
> > +
> > + addattribute(s) cs, <<'DEF'
> > + int a;
> > + char b;
> > + DEF

> But more importantly, all syntax checking is done at PIR compile time,


> whereas the string describing the struct elements and types would not be
> parsed until runtime so typos in type names or general syntax errors
> aren't detected until then.

That's true. Well, the idea is of course to be able to paste arbitrary C
structures into the PIR and be done with it. (I can imagine that in the
long-run class parsing and construction will be done at BEGIN or IMMEDIATE
time. This might also be done with some external parser and not in C code.)

> > +The generalization of arbitrary attribute names would of course be
> > +possible too, but isn't likely needed.

pdd updated - I meant quoted attr names.

> > + addattribute(s) bar_cs, <<'DEF'
> > + double x;
> > + foo foo; # the foo class is already defined
>
> May I suggest change second foo there to something else?

Done.

> ... But please don't rely too much on knowledge of


> C semantics when describing Parrot ones.

Yep. OTOH we are dealing with C structures here. Pointers to structs vs.
contained structs are a very common construct in C code.
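
The difference between the two attribute kinds can be sketched in plain C. This is only an illustration: the names nested_foo, bar, inner and both helper functions are hypothetical and not part of the proposal. A contained struct lives inside the memory of bar, while a pointer attribute refers to memory owned elsewhere and may be unset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C equivalents of the 'foo' and 'bar' classes. */
struct nested_foo {
    int a;
    char b;
};

struct bar {
    double x;
    struct nested_foo inner;  /* contained: part of bar's own memory */
    struct nested_foo *fptr;  /* pointer: refers to memory owned elsewhere */
};

/* Roughly: setattribute o, ['inner'; 'a'], value */
void bar_set_inner_a(struct bar *o, int value) {
    assert(o != NULL);
    o->inner.a = value;  /* always valid, no extra indirection */
}

/* Roughly: setattribute o, ['fptr'; 'b'], value
 * Returns 0 on success, -1 if the pointer attribute is unset. */
int bar_set_fptr_b(struct bar *o, char value) {
    assert(o != NULL);
    if (o->fptr == NULL)
        return -1;
    o->fptr->b = value;
    return 0;
}
```

The pointer case needs a check (or a defined failure mode) that the contained case never needs, which is the main semantic difference a Parrot-level description would have to spell out.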

> > +=head2 Array Structures Elements

> With bounds checking on accesses to b, right?

Yep. pdd updated.
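
A bounds-checked setter for the array element could look roughly like this in C. It is a sketch under assumptions: the struct name foo_array, the function name and the -1 error return are made up for illustration, and a real implementation would presumably throw a Parrot exception instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C layout of the class with an array attribute. */
struct foo_array {
    int a;
    char b[100];
};

/* Roughly: setattribute o, ['b'; index], value
 * Returns 0 on success, -1 if index is outside the array. */
int foo_array_set_b(struct foo_array *s, long index, char value) {
    assert(s != NULL);
    size_t count = sizeof s->b / sizeof s->b[0];
    if (index < 0 || (size_t)index >= count)
        return -1;
    s->b[index] = value;
    return 0;
}
```

The element count comes from the declared array type, so the check needs no extra bookkeeping beyond what the attribute definition already carries.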

> I think for safety reasons we will later want to have some way of only
> letting approved code that uses unmanagedstructs run, as with them
> anyone can segfault the VM in no time at all...but that's for a security
> PDD or something.

Indeed.

> Maybe a side issue, but how do you propose dealing with languages that
> allow:
>
> class A {
> private int x;
> ...
> }
>
> class B is A {
> private int x; /* Parent's x not visible, but name is the same. */
> ...
> }
>
> Where methods in A will access the x defined in A and methods in B will
> access the x defined in B?

Parrot allows that already. See also t/pmc/objects_17.pir

> > +consequently that the attribute offsets are calculated differently
> > +depending on type, alignment, and padding. These calculations are
> > +already done in F<unmanagedstruct.pmc>.
>
> I am curious how this hurts our portability. Alignment and padding can
> differ somewhat between platforms.

Sure. Pointer size also differs. But as the offsets are always calculated on
the very same platform this doesn't really matter.
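
As a rough illustration of why this works, here is a minimal C sketch of such an offset calculation. It is not the code from unmanagedstruct.pmc; attr_type, align_up and layout_struct are hypothetical names. It assumes the usual ABI rule that each member is placed at the next multiple of its alignment and that the struct size is rounded up to the largest member alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute descriptor: size and alignment of one member. */
typedef struct {
    size_t size;
    size_t align;
} attr_type;

/* Round offset up to the next multiple of align. */
size_t align_up(size_t offset, size_t align) {
    assert(align > 0);
    return (offset + align - 1) / align * align;
}

/* Fill offsets[i] for each attribute in declaration order and
 * return the total struct size, including tail padding. */
size_t layout_struct(const attr_type *attrs, size_t n, size_t *offsets) {
    size_t offset = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < n; i++) {
        offset = align_up(offset, attrs[i].align);
        offsets[i] = offset;
        offset += attrs[i].size;
        if (attrs[i].align > max_align)
            max_align = attrs[i].align;
    }
    return align_up(offset, max_align);
}

/* The example structure from the proposal, for comparison. */
struct foo {
    int a;
    char b;
};
```

Because sizes and alignments are taken from the compiler on the running platform (sizeof and _Alignof in C11), the computed offsets can be compared against offsetof(struct foo, a) and offsetof(struct foo, b) there, whatever the pointer size or endianness.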

> And don't optimizing compilers
> sometimes re-order struct elements for better packing?

Not as far as I know. This would also cause troubles for C code accessing any
C library.

> (Put another way: how portable is the UnmanagedStruct PMC?)

E.g. the SDL code (using {Un,}ManagedStruct a lot) runs on 32 & 64 bit
machines, LE or BE.

> A Good Thing. Also we will want an interface to get hold of the
> attribute names and types...

Yep.

> Another side-thought - if we know the types of the things in the
> attributes of the PMC, can we not auto-generate the mark code for any we
> know are PMC* or STRING*?

Yes. We do that already for ParrotObjects/Classes.
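
For typed attributes the same idea could be sketched in C as a generic walk over attribute descriptors. Everything here is an assumption made for illustration: attr_kind, attr_desc, demo_obj, mark_attrs and the callback signature are hypothetical and not Parrot's actual GC interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute kinds; only the pointer kinds need marking. */
typedef enum { ATTR_INT, ATTR_FLOAT, ATTR_PMC_PTR, ATTR_STRING_PTR } attr_kind;

typedef struct {
    const char *name;
    attr_kind kind;
    size_t offset;
} attr_desc;

/* Hypothetical object; void* stands in for PMC* and STRING*. */
typedef struct {
    long count;
    void *child;
    double ratio;
    void *label;
} demo_obj;

const attr_desc demo_attrs[] = {
    { "count", ATTR_INT,        offsetof(demo_obj, count) },
    { "child", ATTR_PMC_PTR,    offsetof(demo_obj, child) },
    { "ratio", ATTR_FLOAT,      offsetof(demo_obj, ratio) },
    { "label", ATTR_STRING_PTR, offsetof(demo_obj, label) },
};

typedef void (*mark_fn)(void *referenced, void *ctx);

/* Example callback: counts how many references were marked. */
void count_mark(void *referenced, void *ctx) {
    (void)referenced;
    ++*(size_t *)ctx;
}

/* Call mark() for every non-NULL pointer attribute of obj and
 * return the number of references visited. */
size_t mark_attrs(const void *obj, const attr_desc *attrs, size_t n,
                  mark_fn mark, void *ctx) {
    assert(obj != NULL && mark != NULL);
    size_t marked = 0;
    for (size_t i = 0; i < n; i++) {
        if (attrs[i].kind != ATTR_PMC_PTR && attrs[i].kind != ATTR_STRING_PTR)
            continue;
        void *ref = *(void *const *)((const char *)obj + attrs[i].offset);
        if (ref != NULL) {
            mark(ref, ctx);
            marked++;
        }
    }
    return marked;
}
```

With such a table, a per-PMC mark function no longer has to be hand-written; the attribute definition alone says which slots hold references.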

> > +=head2 Locking or opmc->pmc_ext->_synchronize
>

> Why put it before the location that is pointed to? That seems confusing
> to me, and inconsistent with the next_for_gc entry that is placed after
> the flags rather than before the PMC starts.

Well, the idea is to have the very same struct layout, whether the PMC is
shared or not.

> Plus I imagine it
> complicates de-allocation - you have to check the flag and subtract
> sizeof(struct _Sync) if it's set...

Sure. OTOH it's simplifying attribute access.
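
The trade-off can be made concrete with a small C sketch. The types and names below (sync_hdr, int_pmc, FLAG_SHARED and the pmc_* helpers) are hypothetical stand-ins for the structures in the proposal, and the prefix is padded to the maximum alignment as an assumption so the PMC behind it stays properly aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins: sync_hdr plays the role of struct _Sync. */
typedef struct { int lock_count; } sync_hdr;
typedef struct { void *vtable; unsigned long flags; long val; } int_pmc;

#define FLAG_SHARED 1UL  /* assumed flag bit: a sync_hdr precedes the PMC */

/* Prefix size, rounded up so the PMC behind it stays suitably aligned. */
size_t sync_prefix_size(void) {
    size_t a = _Alignof(max_align_t);
    return (sizeof(sync_hdr) + a - 1) / a * a;
}

/* Allocate a PMC; shared ones get a sync_hdr in front of the returned pointer. */
int_pmc *pmc_alloc(int shared) {
    size_t prefix = shared ? sync_prefix_size() : 0;
    char *block = calloc(1, prefix + sizeof(int_pmc));
    if (block == NULL)
        return NULL;
    int_pmc *pmc = (int_pmc *)(block + prefix);
    pmc->flags = shared ? FLAG_SHARED : 0;
    return pmc;
}

/* Locate the sync header; only valid when FLAG_SHARED is set. */
sync_hdr *pmc_sync(int_pmc *pmc) {
    assert(pmc->flags & FLAG_SHARED);
    return (sync_hdr *)((char *)pmc - sync_prefix_size());
}

/* Deallocation has to check the flag and step back over the prefix. */
void pmc_free(int_pmc *pmc) {
    size_t prefix = (pmc->flags & FLAG_SHARED) ? sync_prefix_size() : 0;
    free((char *)pmc - prefix);
}
```

Attribute access such as pmc->val is identical for shared and unshared PMCs; only pmc_sync and pmc_free need to know about the prefix, which is the extra step raised in the question.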

> That's "all" that comes to me right now. ;-)

Thanks for all your comments and suggestions.

> Thanks,
>
> Jonathan

leo
