After the changes introduced by pdd21, I'm lost as to how 
to deal with classname conflicts when multiple HLL namespaces 
are involved.  I have a very real example from PGE, but bear 
with me as I present some background.  Also, note that I've
simplified a few details here for illustration, so if
you compare this to the actual PGE code you'll notice some
(insignificant) differences.
Background - before pdd21
-------------------------
When PGE was first implemented, everything in Parrot tended to
go in a global shared namespace (or at least that's all I knew about).
Therefore, to avoid namespace conflicts, I wrote PGE with "PGE::"
prefixes in all of its classnames, giving us classes like:
    PGE::Match    - base class for Match objects
    PGE::Grammar  - base class for Grammar objects
    PGE::Exp      - base class for nodes in the regex AST
The PGE::Exp class is itself subclassed into different node
types representing literals, groups, anchors, quantifiers, closures,
etc.  The current PGE code has these expression node subclasses named
with a prefix of "PGE::Exp::", but they probably should have been
named without the Exp:: portion, so for this discussion the AST classes are:
    PGE::Literal  - literal expressions
    PGE::Group    - non-capturing group
    PGE::CGroup   - capturing group
    PGE::Subrule  - subrule call
    PGE::Closure  - embedded closure
Here's code to create PGE::Exp and its subclasses:
    .sub __onload :load
        $P0 = newclass 'PGE::Exp'
        $P0 = subclass 'PGE::Exp', 'PGE::Literal'
        $P0 = subclass 'PGE::Exp', 'PGE::Group'
        $P0 = subclass 'PGE::Exp', 'PGE::CGroup'
        $P0 = subclass 'PGE::Exp', 'PGE::Subrule'
        $P0 = subclass 'PGE::Exp', 'PGE::Closure'
        # ...
    .end
Okay, so far so good.  Now let's move into the world
postulated by pdd21_namespaces...
After pdd21
-----------
According to pdd21, each HLL gets its own hll_namespace.
PGE is really a form of HLL compiler, so it should have
its own hll_namespace, instead of using parrot's hll namespace:
    .HLL 'pge', ''
Now then, the 'PGE::' prefixes on the classnames were
just implementation artifacts of working in a globally
flat namespace -- as a high-level language PGE really
ought to be referring to its classes as 'Match',
'Exp', 'Literal', etc.  So, if we're in the PGE HLL,
we ought to be able to drop the 'PGE::' prefix from
our classnames and namespaces.
So, here's the revised version of the code to create
the classes:
    .HLL 'pge', ''
    .sub __onload :load
        $P0 = newclass 'Exp'
        $P0 = subclass 'Exp', 'Literal'
        $P0 = subclass 'Exp', 'Group'
        $P0 = subclass 'Exp', 'CGroup'
        $P0 = subclass 'Exp', 'Subrule'
        $P0 = subclass 'Exp', 'Closure'
        # ...
    .end
This code fails when run from parrot, because Parrot seemingly
already has a class named 'Closure':
    $ ./parrot ns.pir
    Class Closure already registered!
    current instr.: '__onload' pc 19 (ns.pir:9)
    $
So, this brings me to my question:  What is the official 
"best practice" pattern for HLLs to create their own classes
such that we avoid naming conflicts with existing classes
in Parrot and other HLLs?
----
Anticipating the answer that the classname arguments to C<subclass>
should be namespace keys instead of strings, as in:
    $P0 = subclass [ 'pge'; 'Exp' ], [ 'pge'; 'Closure' ]
what namespace directive do we use to define the methods 
for the Closure class?
Thanks in advance, and apologies if I've overlooked the obvious.
Pm
I don't know that that's necessarily the case, but it's definitely an
option. You can just as easily argue that it's a library.
> Now then, the 'PGE::' prefixes on the classnames were
> just implementation artifacts of working in a globally
> flat namespace -- as a high-level language PGE really
> ought to be referring to its classes as 'Match',
> 'Exp', 'Literal', etc.  So, if we're in the PGE HLL,
> we ought to be able to drop the 'PGE::' prefix from
> our classnames and namespaces.
>
> So, here's the revised version of the code to create
> the classes:
>
>     .HLL 'pge', ''
>
>     .sub __onload :load
>         $P0 = newclass 'Exp'
>         $P0 = subclass 'Exp', 'Literal'
>         $P0 = subclass 'Exp', 'Group'
>         $P0 = subclass 'Exp', 'CGroup'
>         $P0 = subclass 'Exp', 'Subrule'
>         $P0 = subclass 'Exp', 'Closure'
>         # ...
>     .end
>
> This code fails when run from parrot, because Parrot seemingly
> already has a class named 'Closure':
>
>     $ ./parrot ns.pir
>     Class Closure already registered!
>     current instr.: '__onload' pc 19 (ns.pir:9)
>     $
>
> So, this brings me to my question:  What is the official
> "best practice" pattern for HLLs to create their own classes
> such that we avoid naming conflicts with existing classes
> in Parrot and other HLLs?
This is unspecced. ATM, all classes go into the 'parrot' HLL. This is
a relic of the past and I think it needs to change. I'm pretty sure
that HLL classes will have to go into the HLL's root namespace (this
needs to happen anyway to prevent namespace pollution). That leaves us
with the question of how to differentiate core PMCs from HLL PMCs. I'm
not sure how to handle that, but that's what a spec is for.
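To make the difference concrete, here is a small Python model (not Parrot code) of the two arrangements: one shared table of class names, which is roughly the behavior described above, versus one table per HLL with core classes under 'parrot'. The names FlatRegistry, PerHLLRegistry and ClassAlreadyRegistered are made up for this sketch.

```python
class ClassAlreadyRegistered(Exception):
    """Raised when a class name is already taken in the relevant table."""


class FlatRegistry:
    """One shared table of class names; the HLL argument is ignored."""

    def __init__(self, core=()):
        self.names = set(core)

    def register(self, hll, name):
        if name in self.names:
            raise ClassAlreadyRegistered(f"Class {name} already registered!")
        self.names.add(name)


class PerHLLRegistry:
    """One table per HLL; core classes are assumed to live under 'parrot'."""

    def __init__(self, core=()):
        self.tables = {"parrot": set(core)}

    def register(self, hll, name):
        table = self.tables.setdefault(hll, set())
        if name in table:
            raise ClassAlreadyRegistered(
                f"Class {name} already registered in {hll}!"
            )
        table.add(name)
```

With a core class 'Closure' preloaded, FlatRegistry rejects register('pge', 'Closure'), while PerHLLRegistry accepts it and only complains about a second 'Closure' inside the same HLL.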
We discussed some of this briefly at the OSCON hackathon, when we
talked about changing the class internals so that a Class isa
Namespace. That discussion hasn't led to any changes yet as Chip has
been kidnapped by his Real Life (tm).
I think the object model needs a thorough going over in general -- for
the reasons above and because it's an unproven system. I'm not
convinced that it will handle all of Perl 6's needs as is. No serious
OO language has been implemented yet on Parrot; everything up to this
point has been either procedural or functional.
--
Matt Diephouse
http://matt.diephouse.com
Agreed, but I think my questions equally apply to something
like .HLL 'perl6'.  
In PGE's case, if we simply want to treat it as a library for now,
in the [ 'parrot'; 'PGE'; ... ] namespace, I think we could do that
for a while.  But with perl6 and other languages joining parrot
soon, I'm not sure it's something we should postpone for too much
longer.
> > So, this brings me to my question:  What is the official
> > "best practice" pattern for HLLs to create their own classes
> > such that we avoid naming conflicts with existing classes
> > in Parrot and other HLLs?
> 
> This is unspecced. ATM, all classes go into the 'parrot' HLL. This is
> a relic of the past and I think it needs to change. I'm pretty sure
> that HLL classes will have to go into the HLL's root namespace (this
> needs to happen anyway to prevent namespace pollution). That leaves us
> with the question of how to differentiate core PMCs from HLL PMCs. I'm
> not sure how to handle that, but that's what a spec is for.
Why is the differentiation necessary -- wouldn't "core PMCs" simply
be part of the 'parrot' HLL?
> We discussed some of this briefly at the OSCON hackathon, when we
> talked about changing the class internals so that a Class isa
> Namespace. That discussion hasn't led to any changes yet as Chip has
> been kidnapped by his Real Life (tm).
I'm afraid I wasn't able to keep up with all of the details and
implications of that discussion at the hackathon.  I'll be glad
to chime in where I can, but I still don't understand some of
the details.
Thanks,
Pm
That's the place to put them. But how do you make the core PMCs
visible to the compiler and not to the user? I expect the Perl 6
compiler will want to use a ResizablePMCArray in places without making
it a builtin Perl 6 class. But how can you if there's only one new
opcode?
Perhaps this will be clearer if I demonstrate with code. I imagine
that this Perl 6:
    my $obj = Perl6Object.new()
will translate to something like this PIR:
    .lex '$obj', $P0
    $P0 = new 'Perl6Object' # do Perl6 classes have sigils?
    $P0.INIT()
But that means if the user writes this Perl 6:
    my $obj = ResizablePMCArray.new()
this PIR will be generated:
    .lex '$obj', $P0
    # oh no! this isn't an actual Perl6 class - it's namespace pollution!
    $P0 = new 'ResizablePMCArray'
    $P0.INIT()
We need to somehow differentiate between Perl6Object and
ResizablePMCArray. Especially given the possibility that the user will
write this:
    class ResizablePMCArray { ... }
Does that break the compiler when it tries to create a
ResizablePMCArray to use internally? Or die because there's already a
ResizablePMCArray class? Remember that no matter how much name
mangling you do in this case, there's probably a language that doesn't
want to do any.
This isn't too much different from using keyed class names like
['pge'; 'Closure'] like you guessed in your first email. But this
places classes next to their namespaces, which is a good thing. But we
probably do need keyed class names to support this:
    class Foo::Bar { ... }
HTH,
Two off-the-top-of-my-head possibilities:
1.  Reference core PMCs by their .ClassName constants as opposed 
    to their stringified names.  Then stringified names are _always_
    hll classes.
2.  Provide an opcode that allows us to look up class names
    in other HLLs; i.e., allow the equivalent of things like
        $I0 = find_type [ 'parrot'; 'String' ]
        $P0 = new $I0
        $I0 = find_type [ 'pge'; 'Match' ]
        $P1 = new $I0
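As a rough Python model of option 2 (not Parrot code): a table that hands out a type number per keyed name, so the same short name in two HLLs gets two distinct types. TypeTable and its methods are hypothetical names for this sketch only; they only mimic the shape of the find_type / new pair above.

```python
class TypeTable:
    """Maps keyed class names such as ('parrot', 'String') to type numbers."""

    def __init__(self):
        self._ids = {}    # key tuple -> type number
        self._keys = []   # type number -> key tuple

    def add(self, key):
        key = tuple(key)
        if key in self._ids:
            raise ValueError(f"type {key!r} already registered")
        self._ids[key] = len(self._keys)
        self._keys.append(key)
        return self._ids[key]

    def find_type(self, key):
        # Raises KeyError if the keyed name is unknown.
        return self._ids[tuple(key)]

    def new(self, type_id):
        # Stand-in for object creation: just record which type was asked for.
        return {"type": self._keys[type_id]}
```

Here find_type(['parrot', 'String']) and find_type(['pge', 'String']) would return different numbers, so neither HLL has to mangle its class names.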
> Perhaps this will be clearer if I demonstrate with code.  I imagine
> that this Perl 6:
> 
>    my $obj = Perl6Object.new()
> 
> will translate to something like this PIR:
> 
>    .lex '$obj', $P0
>    $P0 = new 'Perl6Object' # do Perl6 classes have sigils?
>    $P0.INIT()
(If Perl6 classes have sigils, it's probably '::', just like package names.)
Actually, now that you mention this, perhaps it would end up
being more along the lines of:
    .lex '$obj', $P0                  # declare lexically scoped '$obj'
    $P1 = find_name 'Perl6Object'     # find class for 'Perl6Object'
    $P0 = $P1.'new'()                 # send 'new' message to Perl6Object
and then the 'new' method of the 'Perl6Object' class (likely inherited
from a base 'Class' type in Perl6 hll-space) takes care of finding 
the correct Parrot object type, calling Parrot's C<new> opcode with
that type, invoking INIT, and returning the resulting object to be 
placed in $P0.  
> But that means if the user writes this Perl 6:
> 
>    my $obj = ResizablePMCArray.new()
> 
> this PIR will be generated:
> 
>    .lex '$obj', $P0
>    # oh no! this isn't an actual Perl6 class - it's namespace pollution!
>    $P0 = new 'ResizablePMCArray'
>    $P0.INIT()
>
> We need to somehow differentiate between Perl6Object and
> ResizablePMCArray. Especially given the possibility that the user will
> write this:
> 
>    class ResizablePMCArray { ... }
There aren't any barewords in Perl 6, so all bare classnames have
to be predeclared in order to get past the compiler, and then it's
fairly certain we're talking about a Perl 6 class and not a Parrot
class.  
I suspect that if the Perl 6 programmer really wants to be using 
the Parrot ResizablePMCArray, it will need to be imported into 
the perl6 hll_namespace somehow, or otherwise given enough details
so that perl6's 'ResizablePMCArray' class object knows that it's
the Parrot class and not the Perl6 one.
> This isn't too much different from using keyed class names like
> ['pge'; 'Closure'] like you guessed in your first email. But this
> places classes next to their namespaces, which is a good thing. But we
> probably do need keyed class names to support this:
> 
>    class Foo::Bar { ... }
I'm expecting that both PGE and perl6 will be translating names
like "Foo::Bar" into an array of [ 'Foo'; 'Bar' ], which is then looked
up relative to the current namespace.
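A minimal Python sketch of that translation and relative lookup, with namespaces modeled as nested dicts. This is an illustration only; split_name and lookup are made-up helper names, not anything PGE or Parrot provides.

```python
def split_name(name):
    """Turn 'Foo::Bar' into ['Foo', 'Bar']."""
    return name.split("::")


def lookup(namespace, parts):
    """Walk `parts` downward from `namespace` (a nested dict).

    Raises KeyError if any component is missing.
    """
    node = namespace
    for part in parts:
        node = node[part]
    return node


# The "current namespace" here is the root of a pretend 'perl6' HLL.
perl6_root = {"Foo": {"Bar": "class Foo::Bar"}}
found = lookup(perl6_root, split_name("Foo::Bar"))
```

Because the walk starts from whatever namespace is current, the same "Foo::Bar" can resolve differently in different HLLs without any prefix in the name itself.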
All of which might seem to indicate that 'class is a namespace' is
the right approach, or at least that perl6 will be modeling it that way.
Thanks, Matt -- this is turning into a really helpful and useful
discussion, at least for me.
Pm
It comes down to a question of whether Perl 6 grammars are a high-level 
language. Debatable, so I'd go with whichever is easiest to work with 
both within PGE, and in code that uses PGE.
Within PGE, it comes down to whether you have to prefix every access to 
a PGE module with the PGE namespace, or whether you can use the .HLL 
directive to set a "default".
Outside PGE, it's a question of whether you can access the module 
directly or have to take extra steps to reach it as a module outside 
your current HLL. Or, if we say you can only directly access namespaces 
within your current HLL, then it's a question of whether you can access 
PGE modules in the 'parrot' HLL (so in the general case you only have to 
work with two HLLs: your own and 'parrot'), or whether you have to work 
with an arbitrary number of different HLLs to access core modules like 
PGE and TGE. With this in mind, I lean toward putting PGE in the 
'parrot' HLL.
But, agreed, the namespace pollution problem needs to be solved either way.
Aye, if a class is defined in an HLL namespace, it shouldn't also exist 
in the 'parrot' namespace. I'd call this a bug, 'subclass' should 
respect the current namespace (which should be set by the .HLL directive).
> We discussed some of this briefly at the OSCON hackathon, when we
> talked about changing the class internals so that a Class isa
> Namespace. That discussion hasn't led to any changes yet as Chip has
> been kidnapped by his Real Life (tm).
That's still a possibility, but it may also end up as Class is linked to 
a Namespace. (Anonymous classes have no namespace, but may be associated 
with a namespace at runtime.)
> I think the object model needs a thorough going over in general
Yup. It's on the list right after I/O, threads, and events.
> -- for
> the reasons above and because it's an unproven system. I'm not
> convinced that it will handle all of Perl 6's needs as is. No serious
> OO language has been implemented yet on Parrot; everything up to this
> point has been either procedural or functional.
Ruby is a serious OO language, but it's not finished yet. For that 
matter, Perl 6 is partially implemented. But, I entirely agree on the 
core point that pushing these languages forward will help push Parrot 
forward.
Allison
If we have a strict separation between the HLL namespace and the Parrot
namespace (and I think we should), then the only way to instantiate a core
Parrot class/PMC from within an HLL is to first retrieve the 'parrot'
namespace, and preferably through the typed interface. Speculatively:
    $P0 = get_root_namespace ['parrot']
    $P1 = $P0.find_class('ResizablePMCArray')
    $P2 = new $P1
    $P2.INIT()
How Perl 6 (or some other HLL) chooses to distinguish loading a module 
written in the same HLL from loading a module written in a different HLL 
is an open question. It will need some syntax. One earlier proposal 
suggested separating the HLL from the rest of the name with a single 
colon ('python:NLTK-Lite::Parse::LamdaCalculus').
Allison
I think we have to keep in mind here that there will be a *lot* of
hand-written code that needs to create PMCs from the Parrot core. I
don't want to have to use the above snippet in all my hand written
code; it adds a lot of bulk and is a huge pain.
Patrick threw out the idea of letting .Type constants refer to core
PMCs. That's a reasonable idea, I think. It lets me create them easily
and doesn't get in the way of HLL classes. And I don't think there's
any way to get those constants to work with anything but core PMCs
anyway.
> How Perl 6 (or some other HLL) chooses to distinguish loading a module
> written in the same HLL from loading a module written in a different HLL
> is an open question. It will need some syntax. One earlier proposal
> suggested separating the HLL from the rest of the name with a single
> colon ('python:NLTK-Lite::Parse::LamdaCalculus').
This is included in PDD21. Perl 6 will strip off the language, split
the module name and end up with a string ("python") and an array
(['NLTK-Lite', 'Parse', 'LamdaCalculus']). It can use the string to
load the correct compiler (this is still unimplemented, by the way).
The compiler object it gets will take the array and load the
appropriate library (this is also unimplemented atm).
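The split itself is simple; a Python sketch of it follows, assuming the language prefix is everything before the first single colon and that a name with no such prefix belongs to the current HLL. split_module_name is a hypothetical helper, not part of Parrot or Perl 6.

```python
import re

# A language prefix is text followed by one ':' that is not part of '::'.
_PREFIX = re.compile(r"^([^:]+):(?!:)(.*)$")


def split_module_name(qualified, default_lang=None):
    """Split 'lang:A::B::C' into (lang, ['A', 'B', 'C']).

    Names without a single-colon prefix get `default_lang`.
    """
    match = _PREFIX.match(qualified)
    if match:
        lang, rest = match.group(1), match.group(2)
    else:
        lang, rest = default_lang, qualified
    return lang, rest.split("::")
```

The negative lookahead keeps a plain "Foo::Bar" from being misread as language "Foo"; only a lone colon counts as the language separator.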
Perl 6 could presumably install the class into its own HLL, which
makes instantiation easy.
Don't forget static core PMCs vs. dynamic core PMCs: pretty sure you can't
use the . notation safely on the dynpmcs.
    .HLL 'pge', ''
implies the toplevel namespace ['pge']. The C<newclass 'Exp'> is therefore
created as ['pge';'Exp']. But you are then creating the subclass under an
existing (because unqualified) 'Closure' name.
IMHO this should look like this:
  .HLL 'pge', ''
  ...
  cl = newclass 'Exp'     # ['pge'; 'Exp']
  ...
  .namespace ['Exp']      # ['pge'; 'Exp']
  ...
  scl = subclass 'Exp', ['Exp'; 'Closure']  # ['pge'; 'Exp'; 'Closure']
  ...
leo
et ceterum censeo ... that .HLL and namespaces should be orthogonal concepts
I strongly disagree.  I don't think that a subclass should have to
be named as a sub-namespace of its parent class.
Put another way, if Num isa Object, and Int isa Num,
does that mean that I would have to do...?
    .HLL 'perl6', ''
    $P0 = newclass 'Object'
    $P1 = subclass 'Object', ['Object'; 'Num']
    $P2 = subclass ['Object'; 'Num'], ['Object'; 'Num'; 'Int']
Normally I would expect 'Object', 'Int', and 'Num' to have their
own top-level namespaces within the HLL namespace, and not require
classnames to always include the list of parent classes.
Pm
> I strongly disagree.  I don't think that a subclass should have to
> be named as a sub-namespace of its parent class.
Namespace and classes are currently totally orthogonal. You are declaring a 
subclass (not a sub-namespace) with all the implications for naming it.
There was some discussion re unifying namespace and class 'namespaces' but it 
stalled.
The "class isa NameSpace" thingy is still undecided.
leo
Okay, I'll rephrase to avoid the classname/namespace confusion(*):
I don't think that a subclass' name should have to include the
names of its parent classes.  From your earlier message:
On Sat, Oct 21, 2006 at 07:10:21PM +0200, Leopold Toetsch wrote:
> IMHO this should look like this:
>
>   .HLL 'pge', ''
>   ...
>   cl = newclass 'Exp'     # ['pge'; 'Exp']
>   ...
>   .namespace ['Exp']      # ['pge'; 'Exp']
>   ...
>   scl = subclass 'Exp', ['Exp'; 'Closure']  # ['pge'; 'Exp'; 'Closure']
>   ...
It's the ['Exp'; 'Closure'] that bothers me here -- I don't think
that a subclass should have to include the name of its parent in
the class name.  It should be:
    scl = subclass 'Exp', 'Closure' # ['pge'; 'Closure']
However, writing either this or
    scl = subclass 'Exp', ['Closure'] # ['pge'; 'Closure']
gives me the "class Closure already registered" error that
started this thread.
-----
(*):  AFAICT, it's also not true that classnames and namespaces 
are "currently totally orthogonal", since the class' methods
have to be placed in a namespace that matches the classname.
So, a class named [ 'Exp'; 'Closure' ] must place its methods
in a [ 'Exp'; 'Closure' ] namespace.  
Pm
Would it be a good idea to start collecting requirements together from 
different language implementors so that when the time comes to work on 
the OO PDD, there is already a good description of what it needs to do?  
If so, I'm happy to make a start on a first cut and maintain it (e.g. 
accept patches to it from anyone who wants to contribute but doesn't 
have a commit bit).
Jonathan
> Would it be a good idea to start collecting requirements together from
> different language implementors so that when the time comes to work on
> the OO PDD, there is already a good description of what it needs to do?
> If so, I'm happy to make a start on a first cut and maintain it (e.g.
> accept patches to it from anyone who wants to contribute but doesn't
> have a commit bit).
Please do. The docs/pdds/clip/ directory exists for this.
-- c
I'll be very happy to see this and contribute where I can.
For my immediate/near-term future needs, I'm reasonably happy
with Parrot's existing implementation, with the exception that
classnames in HLLs seem to conflict with Parrot's pre-existing
classnames (and perhaps those of other HLLs).
Pm
I do see your point, of course, but the implementation differs. I'll try to
summarize the internals in more detail:
1) a class hasa namespace
This means that namespace names and class names are fully independent.
2) The newclass/subclass calls above actually do this:
(with names abbreviated for line-length's sake)
  opcode / directive             # Namespace          Class
  ---------------------------------------------------------------   
  .HLL 'p', ''                   # 'p'  (or ['p'])    --- (1)
  cl = newclass 'E'              # ['p'; 'E']         'E'  
  scl = subclass 'E', ['E'; 'C'] # ['p'; 'E'; 'C']    ['E'; 'C']
3) when a class is created, the code in (2) tries to find a matching namespace
in the current namespace, then in the HLL namespace; otherwise a new namespace
is created.
4) Summary - if you don't qualify the 'Closure' it just collides with the 
existing class of that name - that's it.
(1) no effect
(2) src/objects.c:577 ff
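The lookup order in point 3 can be modeled in a few lines of Python (a sketch of the described order only, not a transcription of src/objects.c). Namespaces are plain dicts, namespace_for_class is a made-up name, and placing a newly created namespace in the HLL namespace is an assumption of this sketch; the description above does not say where it goes.

```python
def namespace_for_class(current_ns, hll_ns, name):
    """Find the namespace to attach to a new class called `name`.

    Order: the current namespace, then the HLL namespace, and
    otherwise create a fresh one.
    """
    if name in current_ns:
        return current_ns[name]
    if name in hll_ns:
        return hll_ns[name]
    # Assumption: the new namespace is created under the HLL namespace.
    new_ns = {}
    hll_ns[name] = new_ns
    return new_ns
```

Note that this only models how a class finds its namespace; the "already registered" collision in point 4 comes from the separate class-name table, which this sketch leaves out.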
leo
More specifically: If you have any questions related to a PDD in clip, 
please add them to a QUESTIONS section at the end of the PDD. For 
requirements, use REQUIREMENTS. Neither of these sections will live in 
the final version of the PDD, so it's a flag for me to process the 
discussion. (And it's enormously easier to roll the discussion into the 
PDD when it's collected together like that than scattered across several 
months of email. Especially considering how terrible Thunderbird's 
full-text searching is.)
Allison
HLL classnames should live in the symbol table (i.e. namespace), not in 
Parrot's internal class registry. Yes, this means PIR/POST will need 
different syntax for instantiating objects from HLL classes. But the 
syntax to create lexical and global variables is different than the 
syntax to create Parrot's internal named temporary variables. They're 
fundamentally different things, so different syntax is sensible.
The syntax for instantiating an object from an HLL class (that only 
lives in a namespace) should be quick and easy.
All of Parrot's internal classes will be accessible via the 'parrot' HLL 
namespace (though at times only virtually), so we don't necessarily have 
to have syntax that deals directly with the registry. But there's enough 
code that instantiates from type numbers to make it worth keeping that 
as an option.
------
Okay, so that's what I want. Discussion: What does it break and are the 
trade-offs worth it? How deeply ingrained is the Parrot class registry? 
('interpreter->class_hash' is pretty thoroughly sprinkled through the 
code when looking up types (curiously, the 'class_hash' is a 
'enum_type_NameSpace', but a different instance of it than 
'interpreter->root_namespace').)
How costly would it be to have lookups performed on the root_namespace 
instead of the class_hash? If it's too costly, name-mangling the HLL 
namespace name into the class_hash is a possibility, but an ugly one.
Allison
What would be *really* great, though, is if implementers of other 
languages that do OO stuff could contribute their needs to this section. 
If you would rather send a patch than ci (or don't have a commit bit),
just send it along to the list and I'll make sure it's applied.
Thanks!
Jonathan
> OK, so I've added a REQUIREMENTS section to the objects PDD now and 
> filled it out with some (hopefully most) of what Perl 6 and .Net need as 
> a start. 
Thanks Jonathan, it's a great start!
Allison