Okay, as promised, here's the object proposal for parrot. (And yes, it was finished by 5PM EST--if you got it later, it just means I lingered over coffee in a blissfully wireless-free coffee shop down the street from my apartment)
Objects, as far as I see it, have the following properties:
1) They have runtime-assignable properties
2) They have reference semantics
3) They belong to a class
4) The class they belong to may be the child of one or more parent classes
5) You can call methods on them
6) They have a set of attributes
7) They may inherit attributes from their parent class(es)
8) Attributes are mostly private to the declaring class
9) They don't have a value outside their attributes
10) Parent classes can only be in the inheritance hierarchy once.
11) They are all orange
There are probably other core object properties that I've missed. This would be the time to mention them.
Now, as far as the above go, here's what we're going to need to add, and what I'm proposing.
#1 is provided by core services already. We do need to work out whether properties get propagated to the object or stay with the reference part, but as property assignment's delegated to the PMC itself we can do this with no core changes.
#2 can be done with a reference PMC type that delegates most of its operations to its referent. This isn't much (if any) different from regular references, and just needs a PMC class for it. No biggie.
#3 Since each class has a PMC type, we just need to associate the PMC type with the data a class needs. Also pretty much no big deal, and something we already have facilities for.
#4 is a matter either for method dispatch, which is handled by the PMC's vtable method entry, or for attribute construction, which we'll deal with in a little bit.
#5 We can already do this, via the method entry in the PMC's vtable. We need to better define how it works, since I've been unreasonably fuzzy about it. I'll do that in a little, hopefully with a JIT-friendly spin.
#6 Attributes. These are just a set of slots with names associated with them. Objects are essentially an attribute chunk wrapper with some metadata. I'll describe them after the end of the list
#7 This means that we have to deal with a parent class looking for attributes inside an object of a child class.
#8 means we may potentially deal with multiple attributes of the same name in an object, in separate parent classes
#9 Yes, this is different from perl 5's object model. That's fine, as it won't impact perl 5 object code running on us. (Think about it for a second, given what we already have in place... :)
#10 We do MI, but we don't instantiate a class' attributes multiple times if it's in the hierarchy for a class more than once. If it is, the leftmost instance is real, the rest are virtual
#11 This goes without saying, of course
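One way to picture #10 is a depth-first, left-to-right walk over the parents in which only the first visit to a class allocates attribute slots. The following is a minimal sketch under that assumption; `Class`, `Layout`, `layout_build` and `layout_offset` are hypothetical names invented for illustration, and the walk order is a guess rather than anything the proposal fixes.

```c
#include <assert.h>

#define MAX_PARENTS 4
#define MAX_CLASSES 32

/* Hypothetical class record: attribute count plus ordered parents. */
typedef struct Class {
    const char *name;
    int n_attrs;
    int n_parents;
    const struct Class *parents[MAX_PARENTS];
} Class;

/* Result of the walk: each class appears once, with its base slot. */
typedef struct Layout {
    int n;
    const Class *classes[MAX_CLASSES];
    int offsets[MAX_CLASSES];
    int total_slots;
} Layout;

/* Base slot of c's attributes in the layout, or -1 if c is not in it. */
int layout_offset(const Layout *l, const Class *c) {
    for (int i = 0; i < l->n; i++)
        if (l->classes[i] == c)
            return l->offsets[i];
    return -1;
}

/* The first (leftmost) visit is "real" and allocates slots;
 * any later visit to the same class is "virtual" and allocates nothing. */
static void layout_add(Layout *l, const Class *c) {
    if (layout_offset(l, c) >= 0)
        return;
    assert(l->n < MAX_CLASSES);
    l->classes[l->n] = c;
    l->offsets[l->n] = l->total_slots;
    l->n++;
    l->total_slots += c->n_attrs;
    for (int i = 0; i < c->n_parents; i++)
        layout_add(l, c->parents[i]);
}

/* Build the attribute layout for objects of class c. */
Layout layout_build(const Class *c) {
    Layout l = {0};
    layout_add(&l, c);
    return l;
}
```

With a diamond (D inherits B and C, both of which inherit A), A's attributes get a single block of slots, reached through B, and the second path through C reuses it.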
So, with these things in place, what do we need? I see the current scheme needing extension in four places
1) We need callmeth and jmpmeth opcodes
2) The method call vtable entry needs to be better defined.
3) We need to define what the heck attributes are and how they work for an object
4) We need to define how attributes work for classes
The call/jmpmeth opcodes either make a returnable or non-returnable method call. They fetch the function pointer from the object PMC and either dispatch to it or save the current state and jsr to it. Note that you can't jmpmeth into a C-implemented method, so no tail calling into those without some wrapper opcodes. The registers still need to be set up appropriately, just like with a regular sub call.
The find_method vtable entry should die, and be replaced with a plain method entry. This should return either the address of the start of the method's bytecode, or NULL. The NULL return is for those cases where the method actually executed via native code, and thus doesn't have to go anywhere. If an address is returned it's expected that the engine will immediately dispatch to that spot, obeying parrot's calling conventions.
For object attributes, we're going to use an Attr PMC. Its data pointer points to an attribute buffer, which is just an array. Each attribute takes up a single slot in the array. By default attributes must be GC-able entities, but if someone objects and is willing to pay the speed penalty they can have an Attr subclass that has a custom GC routine. (I'm not convinced that the memory win outweighs the speed loss, hence the default restriction)
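A rough sketch of that shape in C, assuming nothing about Parrot's real structures: the `PMC` struct here is a toy stand-in, and `attr_new`, `attr_get`, `attr_set` and `attr_free` are made-up helper names. Each slot holds a pointer to another (GC-able) entity, matching the default restriction.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a PMC; not Parrot's actual layout. */
typedef struct PMC {
    void *data;  /* for an Attr PMC: points at its AttrData */
    long  ival;  /* toy payload so slots can hold something visible */
} PMC;

/* The attribute buffer: just an array, one slot per attribute. */
typedef struct AttrData {
    size_t n_slots;
    PMC  **slots;
} AttrData;

/* Create an Attr-style object with n_slots empty (NULL) slots. */
PMC *attr_new(size_t n_slots) {
    PMC *obj = calloc(1, sizeof *obj);
    AttrData *buf = calloc(1, sizeof *buf);
    PMC **slots = calloc(n_slots ? n_slots : 1, sizeof *slots);
    if (!obj || !buf || !slots) {
        free(obj);
        free(buf);
        free(slots);
        return NULL;
    }
    buf->n_slots = n_slots;
    buf->slots = slots;
    obj->data = buf;
    return obj;
}

/* Store value in a slot; returns 0 on success, -1 if out of range. */
int attr_set(PMC *obj, size_t slot, PMC *value) {
    AttrData *buf = obj->data;
    assert(buf != NULL);
    if (slot >= buf->n_slots)
        return -1;
    buf->slots[slot] = value;
    return 0;
}

/* Fetch a slot; returns NULL for an empty or out-of-range slot. */
PMC *attr_get(const PMC *obj, size_t slot) {
    const AttrData *buf = obj->data;
    assert(buf != NULL);
    return slot < buf->n_slots ? buf->slots[slot] : NULL;
}

/* Release the wrapper and its buffer (not the referenced values). */
void attr_free(PMC *obj) {
    AttrData *buf = obj->data;
    free(buf->slots);
    free(buf);
    free(obj);
}
```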
Classes are a type of object, and the attributes in classes keep track of the important meta-information for objects. For example, one of the class attributes is a hash that has a name->slot mapping table so we don't have to duplicate it for each object.
We have the option of either building in attribute lookup code to the interpreter, or adding in a few vtable methods. I'm more inclined to go with interpreter support, as we've got more than enough vtable entries as it is, but I can be convinced otherwise. If we *do* add in vtable support I think we can safely skip the keyed versions, since the only things that should be getting attribute info are the methods for a class, and by the time you're in there you have to have a real object, not an index into an array or something.
The structures:
*) Attr PMCs have an array off their data pointer.
*) Classes are variants of Attr PMCs. They have the following guaranteed attributes in slots:
0) Hash with class name to attribute offset
1) Hash with attribute name to attribute offset (relative to the offset found from the hash in slot 0, generally known at compile time, but introspection is nice)
2) Integer number of attributes this class has
3) Notification array
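The two hashes in slots 0 and 1 suggest a two-step lookup: find the declaring class's base offset in the object's class, then add the attribute's relative offset from the declaring class. A small sketch of that arithmetic follows, with linear tables standing in for hashes; `NameOffset`, `ClassMeta` and `attr_slot` are hypothetical names, not part of the proposal.

```c
#include <assert.h>
#include <string.h>

/* One name -> offset pair; an array of these stands in for a hash. */
typedef struct NameOffset {
    const char *name;
    int offset;
} NameOffset;

/* Hypothetical view of the guaranteed class slots 0 to 2. */
typedef struct ClassMeta {
    const char *name;
    const NameOffset *class_offsets; /* slot 0: class name -> base offset */
    int n_class_offsets;
    const NameOffset *attr_offsets;  /* slot 1: own attr name -> relative offset */
    int n_attr_offsets;
    int n_attrs;                     /* slot 2: attributes declared here */
} ClassMeta;

/* Linear search standing in for a hash lookup; -1 when absent. */
static int lookup(const NameOffset *table, int n, const char *name) {
    for (int i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].offset;
    return -1;
}

/* Slot of declaring_class's attribute `attr` inside an object whose
 * class is object_class, or -1 if either lookup fails. */
int attr_slot(const ClassMeta *object_class,
              const ClassMeta *declaring_class,
              const char *attr) {
    assert(object_class && declaring_class && attr);
    int base = lookup(object_class->class_offsets,
                      object_class->n_class_offsets,
                      declaring_class->name);
    int rel = lookup(declaring_class->attr_offsets,
                     declaring_class->n_attr_offsets,
                     attr);
    if (base < 0 || rel < 0)
        return -1;
    return base + rel;
}
```

Because the attribute name is resolved against the declaring class, a parent and a child can each have an attribute called `x` without colliding, which is the situation #8 describes.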
Finally, for perl 5 compatibility, we don't have to add anything special to the interpreter engine. Having an appropriate method vtable entry for perl 5 classes is all it takes, and that's no big deal. (As perl 5 classes are just hashes/arrays/scalars with a special method call entry)
I'm not 100% sure how to handle delegation, so for right now we're just punting. No automatic delegation. This *will* be fixed, but we can hash it out after this proposal has been ravaged appropriately.
So.... have at it. What have I forgotten, what's wrong, and who's got questions on how this works? (I can put together examples, but this is pretty long as it is, and I think it's reasonably self-explanatory. Besides, assembly language isn't generally the best way to demonstrate anything... :)
-- Dan
--------------------------------------"it's like this"------------------- Dan Sugalski even samurai d...@sidhe.org have teddy bears and even teddy bears get drunk
On Thu, 9 Jan 2003 16:40:20 -0500, Dan Sugalski wrote:
> #10 We do MI, but we don't instantiate a class' attributes multiple
> times if its in the hierarchy for a class more than once. If it is, the
> leftmost instance is real, the rest are virtual
This will mean we can't support Eiffel, which allows repeated real inheritance of the same class. It does this by allowing renaming at the feature level (that's attributes and methods in perl) when inheritance is declared. Repeated inherited features are shared if they keep the same name (and they really are the same feature), or split if they don't.
-- Peter Haworth p...@edison.ioppublishing.com Warning! Your Operating System is out of date! It can be replaced by a more memory abusive Operating System. Please contact your System Vendor for details.
> On Thu, 9 Jan 2003 16:40:20 -0500, Dan Sugalski wrote:
>> #10 We do MI, but we don't instantiate a class' attributes multiple
>> times if its in the hierarchy for a class more than once. If it is,
>> the leftmost instance is real, the rest are virtual
My only question here is: What is leftmost? Is that the same as "closest to the actual class we're looking at" or is it "first time it appears in the inheritance structure, and thus the furthest from the relevant class" ? (or something else entirely?)
> This will mean we can't support Eiffel, which allows repeated real > inheritance of the same class. It does this by allowing renaming at the > feature level (thats attributes and methods in perl) when inheritance is > declared. Repeated inherited features are shared if they keep the same > name (and they really are the same feature), or split if they don't.
I'll admit to never having gotten to looking at eiffel, just hearing about it from some other folks ...
But what is the point of explicitly inheriting from the same class multiple times? I mean, sure, if it's in the inheritance tree multiple times, fine, but then you ignore most of them generally; what benefit/use comes from having it actually be in the tree multiple times as distinct entities?
I'm just wondering there ...
But if it's renaming the structure anyway, wouldn't it still be possible with the single-MI structure that dan proposed? as in, if B inherits from A and then C inherits from A and B directly (and assuming there's a need to separately retain the individual inheritance directions), wouldn't the compiler then say that B inherits from A and C inherits from A2 and B, to retain them both in the parrot?
--attriel
(I could, of course, be horribly wrong, had I stated a firm opinion rather than requests for more information :)
>On Thu, 9 Jan 2003 16:40:20 -0500, Dan Sugalski wrote:
>> #10 We do MI, but we don't instantiate a class' attributes multiple
>> times if its in the hierarchy for a class more than once. If it is, the
>> leftmost instance is real, the rest are virtual
>This will mean we can't support Eiffel
Nope. :)
What it means is that the proposed base object system won't work for eiffel. There are very few proposed core changes to support this object system, and if you think about it the expressed program-level semantics are sufficient for eiffel (I think), it's just the behind-the-curtain bits that aren't. And there's nothing to say that a theoretical eiffel implementation couldn't just have a different Attr-style object. (Eiffel's classes, IIRC, are compile-time fixed so it can do the necessary code cloning and renaming magic to make it all work. I suppose we could too, but it's a lot more work to do it since things can change at runtime in perl/python/ruby classes) -- Dan
> > On Thu, 9 Jan 2003 16:40:20 -0500, Dan Sugalski wrote:
>>> #10 We do MI, but we don't instantiate a class' attributes multiple
>>> times if its in the hierarchy for a class more than once. If it is,
>>> the leftmost instance is real, the rest are virtual
>My only question here is: What is leftmost? Is that the same as "closest >to the actual class we're looking at" or is it "first time it appears in >the inheritance structure, and thus the furthest from the relevant class" >? (or something else entirely?)
For attributes it doesn't matter. For methods, well, I'm not sure whether a leftmost depth-first insertion is right or inserting it somewhere else is better. I expect Damian Has The Answer. (Or, if I remember from the last time I asked, several answers of which there wasn't a clear Best Answer)
> > This will mean we can't support Eiffel, which allows repeated real >> inheritance of the same class. It does this by allowing renaming at the >> feature level (thats attributes and methods in perl) when inheritance is >> declared. Repeated inherited features are shared if they keep the same >> name (and they really are the same feature), or split if they don't.
>I'll admit to never having gotten to looking at eiffel, just hearing about >it from some other folks ...
>But what is the point of explicitly inheriting from the same class >multiple times? I mean, sure, if it's in the inheritance tree multiple >times, fine, but then you ignore most of them generally; what benefit/use >comes from having it actually be in the tree multiple times as distinct >entities?
If you do redispatching of method calls it makes sense, since if there are enough redispatches you may end up redispatching to one class' method more than once because it's in the tree more than once, in which case you'd want to get the correct set of attributes, as each instance of the class in the tree should have a separate set of attributes in that case.
>(I could, of course, be horribly wrong, had I stated a firm opinion rather >than requests for more information :)
Ah, go for the bold statement. People are more likely to refute an error that way. :)
Also, as I pointed out a little while ago, this is the proposal for the base perl6/ruby/python object scheme, but there's nothing that forbids transparent interoperability with a different object scheme. -- Dan
Dan Sugalski wrote:
> and who's got
> questions on how this works? (I can put together examples, but this
> is pretty long as it is, and I think it's reasonably
> self-explanatory. Besides, assembly language isn't generally the best
> way to demonstrate anything... :)
Well, as far as I'm concerned, an assembly snippet would be nice. But I can as well wait for implementation and study what will be relevant to objects in the t/ directory.
Indeed, once you wrote some Parrot assembly code to support a^Htwo stupid^Wesoteric languages, Parrot assembly is quite a nice and easy way to see how theory behaves in real life... :o)
On Friday, January 10, 2003, at 11:49 AM, Dan Sugalski wrote:
> At 1:37 PM +0000 1/10/03, Peter Haworth wrote:
>> This will mean we can't support Eiffel
> Nope. :)
> What it means is that the proposed base object system won't work for > eiffel.
Actually, if you really want Eiffel to compile to Parrot, it might be interesting to work on getting ANSI C to compile to Parrot first, since most Eiffel compilers use compilation to C as an intermediate step.
You might lose the ability to reuse an Eiffel-created object with other languages, but if you're using Eiffel, you're probably not going to be happy with object-orientation in any language that doesn't strictly enforce DBC anyway. ;-)
Here are some examples from Object Oriented Software Construction (Second Edition), Chapter 15 (Multiple Inheritance):
* Simple multiple inheritance:
    class PLANE ...
    class ASSET ...
    class COMPANY_PLANE inherit
        PLANE
        ASSET
    ...
or
    class TREE [G] ...    -- Parametric Polymorphism
    class RECTANGLE ...
    class WINDOW inherit
        TREE[WINDOW]
        RECTANGLE
    ...
* Renaming:
    class WINDOW inherit
        TREE [WINDOW]
            rename
                child as subwindow,
                is_leaf as is_terminal,
                root as screen,
                arity as child_count, ...
            end
        RECTANGLE
    ...
* Page 548, "Unobtrusive repeated inheritance":
Cases of repeated inheritance [...], with duplicated features as well as shared ones, do occur in practice, but not frequently. They are not for beginners; only after you have reached a good level of sophistication and practice in object technology should you encounter any need for them.
If you are writing a straightforward application and end up using repeated inheritance, you are probably making things more complicated than you need to.
* Redundant inheritance:
    class A ...
    class B inherit A ...
    class D inherit
        B
        A
    ...    -- Forgetting that B inherits from A
In Eiffel, the default sharing semantics for multiple inheritance means this misstep doesn't cause weird things to happen.
* Other weirdness
    class DRIVER ...    -- violation count, address, etc.
    class US_DRIVER inherit DRIVER ...
    class FR_DRIVER inherit DRIVER ...    -- Ah, France!
    class US_FR_DRIVER inherit
        US_DRIVER
            rename violations as us_violations, ...
        FR_DRIVER
            rename violations as fr_violations.
PLEASE NOTE: I'm not a fan of this example. But, it comes from the book. I'd be more likely to model this as DRIVER has-a SET of LICENSEs keyed-by AUTHORITY, where the LICENSE has stuff like licensed address and violation count, etc. But, then, my thinking and modeling habits tend toward the dynamic, and Eiffel tends toward the static. The implications of continuing the pattern of this example in the face of a larger set of authorities (countries) are, well, explosive (speaking combinatorially). In the face of a dynamic set of authorities, it's unworkable.
Anyway, I know that the Eiffel libraries make plenty of use of Eiffel's inheritance and assertion mechanisms. I don't know how often these more complicated situations arise in practice. The point is, Eiffel does have these mechanisms defined and they are expected to be available, and possibly required just to build mundane applications that use the standard library.
Regards,
-- Gregor
"attriel" <attr...@d20boards.net> 01/10/2003 10:37 AM Please respond to attriel
To: <perl6-intern...@perl.org> cc: Subject: Re: Objects, finally (try 1)
> Actually, if you really want Eiffel to compile to Parrot, it might be > interesting to work on getting ANSI C to compile to Parrot first, since > most Eiffel compilers use compilation to C as an intermediate step.
This won't be too much of a stretch .... We already have an ANSI C compiler working well with DotGNU pnet.. (with an IL output plugin)...
<log of="#dotgnu" on="22-10-2002">
[11:31] <rhysw> Dan: to go off on a different tangent now - C support
[11:31] <rhysw> Dan: compiling C to Parrot, that is
[11:31] <Dan> Ah, that. Yeah, definitely doable. It'll be rather slow, though
[11:31] <Dan> Our function call overhead's rather large compared to what C needs
[11:32] <Dan> Still, I find the thought of C with native continuations rather interesting. Scary, but interesting
[11:32] <rhysw> Dan: I was more thinking of the memory layout issues. C code is very particular about struct layout, array representation, etc. I didn't see any opcodes that would allow one to do "pull an int32 out of offset N from this pointer".
[11:33] <Dan> C's not at all particular about struct layout, unless they changed the standard.
[11:33] <Dan> Still, you can do them either with struct PMCs, whcih'd be slowish, or with the pack/unpack opcodes, which I bet are insufficiently documetned
[11:34] Action: Dan apparently can't type this evening
[11:35] <Dan> Still, the packed structures need more thoght. Hrm.
[11:35] <rhysw> Dan: I suppose a better question would be "is supporting C a goal, or would it just be a cool hack but otherwise uninteresting?"
[11:36] <rhysw> because, as you say, it wouldn't be terribly efficient ...
[11:36] <Dan> Neither, really. It's interesting in the sense that it'd let people use code that they otherwise couldn't, if they don't have a C compiler for.
[11:36] <Dan> But it's definitely not a primary goal
[11:36] <Dan> Consider it both mildly interesting and mildly bemusing :)
[11:37] <fitzix> Dan: It could make it very useful as a power tool
[11:37] <Kyeran> Could someone toss up the Parrot URL so I can find out what it is? :)
[11:37] <fitzix> Dan: sort of a "swiss army knife" kind of thing
[11:37] <Dan> True, but I'm not willing to lose sight of the primary goal for it.
</log>
Was our last conversation about Parrot & C compilers ..
We'll be adding a Parrot codegen to the compiler backend as soon as the Objects are set in stone.... So it should be possible to adapt the C compiler (with some codegen tweaking) into a C compiler targeting Parrot.
Gopal -- The difference between insanity and genius is measured by success
>>... Besides, assembly language isn't generally the best >>way to demonstrate anything... :) > Indeed, once you wrote some Parrot assembly code to support a^Htwo > stupid^Wesoteric languages, Parrot assembly is quite a nice and easy > way to see how theory behaves in real life... :o)
Good point. The implementation of $thing does show how usable it really is. These totally unneeded^Wsuperfluous^Wfine languages brought parrot & imcc a lot further.
On Sat, Jan 11, 2003 at 10:12:42AM +0530, Gopal V wrote:
> If memory serves me right, Chris Dutton wrote:
> > Actually, if you really want Eiffel to compile to Parrot, it might be
> > interesting to work on getting ANSI C to compile to Parrot first, since
> > most Eiffel compilers use compilation to C as an intermediate step.
> [11:32] <rhysw> Dan: I was more thinking of the memory layout issues. C code is very particular about struct layout, array representation, etc. I didn't see any opcodes that would allow one to do "pull an int32 out of offset N from this pointer".
> [11:33] <Dan> C's not at all particular about struct layout, unless they changed the standard.
> [11:33] <Dan> Still, you can do them either with struct PMCs, whcih'd be slowish, or with the pack/unpack opcodes, which I bet are insufficiently documetned
Probably this has all been worked out by now, but Rhys and Dan are coming at it from different angles. C isn't fussy, but the ABI for a platform is very fussy. I presume Rhys is thinking about compiling C code to parrot, and then linking through to native C code (such as the native standard C library) via parrot. Dan's assuming that C code we compile never wants to call out to the platform. If I remember the terms correctly, Dan's thinking about free standing C implementations. Effectively that would give Inline::C without needing a C compiler, but you'd not be able to call out from your C code.
What Rhys is thinking about is potentially far more interesting, as it would allow the perl6 version of Inline::C to wrap external libraries supplied only as objects and headers, without needing a C compiler on the machine. It's also harder, partly because such a system would need to know the ABI for each platform you wanted to do this on.
But if anyone has tuits, could we have a z-code interpreter first please? Or better still, a unified, fast, assembler? (with a pony?)
> fussy. I presume Rhys is thinking about compiling C code to parrot, and then > linking through to native C code (such as the native standard C library) via > parrot.
Nope ... At least for our .NET platform stuff, we are planning to compile glibc into IL so that the "native ABI" is accessed only via the engine. Most pieces of glibc depend on a few platform functions to run (most notably some in unistd.h) ... Most of the other pieces of glibc can be used directly, like the printf formatting code or file functions, once these underlying posix calls are in place... this is consistent with the design philosophy of glibc, which has been source portability rather than binary.
So get your own glibc-managed.dll according to long,int, and char sizes :)
> without needing a C compiler, but you'd not be able to call out from your > C code.
We face a similar situation, except that we have PInvoke (like your NCI), which allows you to define PInvoke methods from Managed C (I'd prefer the term Micro-Managed :)
But the following does work for some functions :)
extern int puts(const char *s) __attribute__((__pinvoke__("libc.so.6")));
mm... ugly !
Rhys has also been really cool about the data stored via C ... So legacy code which used fixed size files (otherwise called "records") will be useful. This allows us to declare 8bit characters and strings of those and all the stuff we're used to with C like unions ... (C# has 16bit chars, and strings are UTF8 encoded , IIRC) ...
In short, we will keep similar layouts in memory for structs and unions as far as possible ... (unions without offset based access to memory boggles my mind....)
So even with all the type-safety of IL we can run the following code ...
float a=3.14;
int b=*((int*)(&(a)));
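As far as standard C goes, reading a float through an int pointer like this runs afoul of the aliasing rules, so a compiler is not obliged to do what the snippet intends. A minimal sketch of the usual well-defined alternative copies the bytes instead; `float_bits` is a hypothetical helper name, and the code assumes `float` is 32 bits wide, which is common but not guaranteed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float without pointer casts. */
uint32_t float_bits(float f) {
    uint32_t bits;
    /* Assumption: float and uint32_t have the same size. */
    assert(sizeof bits == sizeof f);
    /* memcpy copies the object representation byte for byte. */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform `float_bits(1.0f)` would be 0x3F800000; on anything else the value differs, which is the ABI fussiness mentioned earlier in the thread.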
> What Rhys is thinking about is potentially far more interesting, as it would
What Rhys is using right now is a custom stdlib which is called pnetC and was just released ... People *really* curious about what Rhys is doing (and what those half-a-million lines are doing in DotGNU) should try the Portable.net C compiler (http://dotgnu.org/downloads/pnet/) .. We have mirrors on each gnu mirror as /projects/dotgnu ... (since we're likely to be slashdotted to death soon....)
Very, Very curiously ... I can call C# methods from Managed C, but not the other way around :) .. and I can call native methods from Managed C , but that's dreadfully unportable like you said ...
I'm really working only to get C# compiled to Parrot .... other language frontends in development like Java or C or JScript can wait a looong time.
Gopal -- The difference between insanity and genius is measured by success
On Thu, Jan 09, 2003 at 04:40:20PM -0500, Dan Sugalski wrote:
> The find_method vtable entry should die, and be replaced with a plain
> method entry. This should return either the address of the start of
> the method's bytecode, or NULL. The NULL return is for those cases
> where the method actually executed via native code, and thus doesn't
> have to go anywhere. If an address is returned it's expected that the
> engine will immediately dispatch to that spot, obeying parrot's
> calling conventions.
What about the case where the object doesn't have the method you're asking for? You seem to be using NULL to mean something other than "not found", so does that mean not found is an exception?
And if NULL is returned it is expected that the method has already been called? If so, there doesn't seem to be any way to find out if a PMC possesses (modulo AUTOLOAD) a method, without the danger of it being called.
Will there be anything built in at parrot level like Perl's AUTOLOAD system? Or will that have to be done explicitly by the perl6 code generator wrapping methods in a routine that catches the "not found" exception, and attempts to use AUTOLOAD? [and whatever multimatch despatch system perl6 will be using to find the "best" method]
On Sat, Jan 11, 2003 at 06:34:56PM +0530, Gopal V wrote:
> If memory serves me right, Nicholas Clark wrote:
> > fussy. I presume Rhys is thinking about compiling C code to parrot, and then
> > linking through to native C code (such as the native standard C library) via
> > parrot.
> Nope ... At least for our .NET platform stuff ,we are planning to compile
> glibc into IL so that the "native ABI" is accessed only via the engine.
Oh right. As you commented on IRC, I am very wrong.
> Rhys has also been really cool about the data stored via C ... So legacy > code which used fixed size files (otherwise called "records") will be useful. > This allows us to declare 8bit characters and strings of those and all the > stuff we're used to with C like unions ... (C# has 16bit chars, and strings > are UTF8 encoded , IIRC) ...
That doesn't sound right. But if it is right, then it sounds very wrong.
(Translation: Are you sure about your terms, because what you describe sounds wonky. Hence if they are using UTF8 but with 16 bit chars, that feels like a silly design decision to me. Perl 5 performance is not enjoying a variable length encoding, but using an 8 bit encoding in 8 bit chars at least makes it small in memory.)
> So even with all the type-safety of IL we can run the following code ...
> float a=3.14;
> int b=*((int*)(&(a)));
Ooh. So what happens if I try to run:
char *a = 0;
*a++;
:-)
Does the VM just "segfault" the failing thread, rather than all threads in a process?
> What Rhys is using right now is a custom stdlib which is called pnetC and > was just released ... People *really* curious about what Rhys is doing > (and what those half-a-million lines are doing in DotGNU) should try the > Portable.net C compiler (http://dotgnu.org/downloads/pnet/) .. We have > mirrors on each gnu mirror as /projects/dotgnu ... (since we're likely to > be slashdotted to death soon....) > I'm really working only to get C# compiled to Parrot .... other language > frontends in development like Java or C or JScript can wait a looong time.
Hmm. So if DotGNU has a C to Parrot compiler, then we just compile the perl5 source code down to Parrot bytecode, et voilà, we have a perl implementation. I do hope no-one wanted it to go fast. :-) [then again, I wonder how the parrot JIT would cope]
So Rhys is mad:
> The difference between insanity and genius is measured by success
> That doesn't sound right. But if it is right, then it sounds very wrong.
> (Translation: Are you sure about your terms, because what you describe sounds > wonky. Hence if they are using UTF8 but with 16 bit chars, that feels like a > silly design decision to me. Perl 5 performance is not enjoying a variable > length encoding, but using an 8 bit encoding in 8 bit chars at least makes > it small in memory.)
I mean they're using UTF8 encoded strings ... and chars are 16 bit ... which means that reading chars from strings is wonky (from all I remember, ILString* uses int16[] for storing stuff... but the files do use UTF8).. Actually I think the ILImage functions in Portable.net abstract out the UTF8 reading and return int16* arrays ...
Hope that makes more sense :)
> Ooh. So what happens if I try to run:
> char *a = 0;
> *a++;
Assuming you meant (*a)++; ... coz *a++; is optimised away into just a++; by GCC ..
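The difference comes from precedence: postfix `++` binds tighter than unary `*`, so `*a++` is `*(a++)`. A small sketch of the two readings follows; `bump_pointer` and `bump_pointee` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* *p++ parses as *(p++): the pointer advances, and the dereferenced
 * value is discarded, so a compiler is free to drop the load. */
char *bump_pointer(char *p) {
    assert(p != NULL);
    (void)*p++;
    return p;
}

/* (*p)++ increments the char that p points at; p itself is unchanged. */
char *bump_pointee(char *p) {
    assert(p != NULL);
    (*p)++;
    return p;
}
```

Only the second form has to touch memory through the pointer, which is why it is the one that would trip over a null `a`.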
> Does the VM just "segfault" the failing thread, rather than all threads in > a process?
How does this do for an answer ?
Uncaught exception: System.NullReferenceException: The value 'null' was found where an instance of an object was required
    at <Module>.main() in segfault.c:3
    at <Module>..start(String[])
Bye Bye segfaults, hello exceptions ...
> Hmm. So if DotGNU has a C to Parrot compiler, then we just compile the perl5
> source code down to Parrot bytecode, et voilà, we have a perl implementation.
This is assuming we can get all the perl5 dependencies compiled to Parrot without any issues ... (in fact this might be very,very hard... considering the fact that parrot codegen is still commented out for most of portable.net).
> I do hope no-one wanted it to go fast. :-) > [then again, I wonder how the parrot JIT would cope]
Which is why nobody is really interested in doing this :-) ... we're building the C compiler for fun ...
> So Rhys is mad:
> > The difference between insanity and genius is measured by success
> I hope he falls on the right side of the divide.
That's *my* sig ... and I'm attempting a cross-over :)
Gopal -- The difference between insanity and genius is measured by success
>On Thu, Jan 09, 2003 at 04:40:20PM -0500, Dan Sugalski wrote: >> The find_method vtable entry should die, and be replaced with a plain >> method entry. This should return either the address of the start of >> the method's bytecode, or NULL. The NULL return is for those cases >> where the method actually executed via native code, and thus doesn't >> have to go anywhere. If an address is returned it's expected that the >> engine will immediately dispatch to that spot, obeying parrot's >> calling conventions.
>What about the case where the object doesn't have the method you're asking >for? You seem to be using NULL to mean something other than "not found", >so does that mean not found is an exception?
Sorry. The assumption is that one of three things happens:
1) A value is returned, which is the address of the parrot code to dispatch to
2) A NULL is returned, which indicates the method call has been made and the interpreter can proceed to the next instruction in the stream
3) An exception is thrown, indicating that the method couldn't be called.
>And if NULL is returned it is expected that the method has already been >called? If so, there doesn't seem to be any way to find out if a PMC >possesses (modulo AUTOLOAD) a method, without the danger of it being called.
If we don't have a can in the vtable, then we need to fix that. :)
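For illustration, the three outcomes and a side-effect-free "can" query might be modelled like this in C. This is only a sketch under assumptions: every name here (`call_result`, `method_entry`, `call_method`, `can_method`, `bump`) is hypothetical, the method table is a plain array rather than a vtable, and the exception is represented by a status code rather than a real throw.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef size_t bytecode_addr;          /* hypothetical: offset into a bytecode stream */

enum call_status {
    CALL_DISPATCH,    /* 1) an address came back: jump to it */
    CALL_DONE,        /* 2) NULL came back: native code already ran, carry on */
    CALL_EXCEPTION    /* 3) the method could not be called */
};

struct call_result {
    enum call_status status;
    bytecode_addr target;              /* meaningful only for CALL_DISPATCH */
};

struct method_entry {
    const char *name;
    bytecode_addr bytecode;            /* used when native is NULL */
    void (*native)(int *state);        /* non-NULL for C-implemented methods */
};

static const struct method_entry *lookup_entry(const struct method_entry *t,
                                               size_t n, const char *name)
{
    size_t i;
    assert(t != NULL && name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(t[i].name, name) == 0) return &t[i];
    }
    return NULL;
}

/* "can": reports whether the method exists without running anything. */
static int can_method(const struct method_entry *t, size_t n, const char *name)
{
    return lookup_entry(t, n, name) != NULL;
}

/* The "method" entry: may run native code as a side effect. */
static struct call_result call_method(const struct method_entry *t, size_t n,
                                      const char *name, int *state)
{
    struct call_result r = { CALL_EXCEPTION, 0 };
    const struct method_entry *e = lookup_entry(t, n, name);
    if (e == NULL) return r;
    if (e->native != NULL) {
        e->native(state);
        r.status = CALL_DONE;
    } else {
        r.status = CALL_DISPATCH;
        r.target = e->bytecode;
    }
    return r;
}

/* Example native method: bumps a counter. */
static void bump(int *state) { (*state)++; }
```

Keeping the lookup separate from the call is what lets `can_method` answer the "does it have this method" question without the danger of the method running.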
>Will there be anything built in at parrot level like Perl's AUTOLOAD system? >Or will that have to be done explicitly by the perl6 code generator wrapping >methods in a routine that catches the "not found" exception, and attempts to >use AUTOLOAD? [and whatever multimatch despatch system perl6 will be using to >find the "best" method]
Parrot will have an AUTOLOAD-style fallback mechanism available to it. I'll add that to the design todo list for the edited version of objects. -- Dan
--------------------------------------"it's like this"------------------- Dan Sugalski even samurai d...@sidhe.org have teddy bears and even teddy bears get drunk
>Dan Sugalski wrote: >> and who's got >> questions on how this works? (I can put together examples, but this >> is pretty long as it is, and I think it's reasonably >> self-explanatory. Besides, assembly language isn't generally the best >> way to demonstrate anything... :)
>Well, as far as I'm concerned, an assembly snippet would be nice. >But I can as well wait for implementation and study what will be >relevant to objects in the t/ directory.
I'll put together some of what I'm thinking of next week with the first rev of the object spec. -- Dan
--------------------------------------"it's like this"------------------- Dan Sugalski even samurai d...@sidhe.org have teddy bears and even teddy bears get drunk
> > This allows us to declare 8bit characters and strings of those and all the > > stuff we're used to with C like unions ... (C# has 16bit chars, and strings > > are UTF8 encoded , IIRC) ...
> That doesn't sound right. But if it is right, then it sounds very wrong.
> (Translation: Are you sure about your terms, because what you describe sounds > wonky. Hence if they are using UTF8 but with 16 bit chars, that feels like a > silly design decision to me. Perl 5 performance is not enjoying a variable > length encoding, but using an 8 bit encoding in 8 bit chars at least makes > it small in memory.)
The CLR runtimes use 16 bit chars and UTF16-encoded strings (at least as far as it's visible to the 'user' programs).
lupus
-- ----------------------------------------------------------------- lu...@debian.org debian/rules lu...@ximian.com Monkeys do it better
> The CLR runtimes use 16 bit chars and UTF16-encoded strings (at least as > far as it's visible to the 'user' programs).
10 23.2.3 #Strings heap 11 The stream of bytes pointed to by a "#Strings" header is the physical representation of the logical string heap. 13 but parts that are reachable from a table shall contain a valid null terminated UTF8 string. When the #String
So I think the runtime does a UTF16 conversion, I suppose ... So you're right: all C# programs get UTF16 strings to work with, but that's not the way they're stored in meta-data. (The JVM also has UTF8 strings and 16 bit chars, so it's not really a big issue :)
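To make the "UTF8 on disk, 16 bit chars at runtime" conversion concrete, here is a rough sketch of decoding UTF-8 into UTF-16 code units. It assumes well-formed input and is not taken from Portable.net or any CLR; the function name `utf8_to_utf16` is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a NUL-terminated UTF-8 string into UTF-16 code units.
 * Assumes well-formed input; a real decoder must reject malformed sequences.
 * Returns the number of 16-bit units written, or (size_t)-1 if out is too small.
 */
static size_t utf8_to_utf16(const unsigned char *in, uint16_t *out, size_t cap)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    while (*in) {
        uint32_t cp;
        int extra;                       /* continuation bytes still to read */
        if (*in < 0x80)      { cp = *in;        extra = 0; }
        else if (*in < 0xE0) { cp = *in & 0x1F; extra = 1; }
        else if (*in < 0xF0) { cp = *in & 0x0F; extra = 2; }
        else                 { cp = *in & 0x07; extra = 3; }
        in++;
        while (extra-- > 0 && *in) {
            cp = (cp << 6) | (uint32_t)(*in++ & 0x3F);
        }
        if (cp >= 0x10000) {             /* outside the BMP: surrogate pair */
            if (n + 2 > cap) return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 > cap) return (size_t)-1;
            out[n++] = (uint16_t)cp;
        }
    }
    return n;
}
```

Note that one character can take one to four bytes on the UTF-8 side and one or two 16-bit units on the UTF-16 side, so neither form gives constant-time indexing by character in general.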
But coming back to parrot ... I don't think parrot uses UTF8 (from what I could gather it seems to be all ASCII ?) ... Or is UTF8 hiding in somewhere ?...
Gopal -- The difference between insanity and genius is measured by success
> If memory serves me right, Paolo Molaro wrote: > > The CLR runtimes use 16 bit chars and UTF16-encoded strings (at least as > > far as it's visible to the 'user' programs).
> 10 23.2.3 #Strings heap > 11 The stream of bytes pointed to by a "#Strings" header is the physical representation of the logical string heap. > 13 but parts that are reachable from a table shall contain a valid null terminated UTF8 string. When the #String
The #Strings heap doesn't contain strings for programs that run in the CLR (unlike the #US, or user string, heap, which contains the strings in UTF-16 encoding). What matters, though, is the encoding of the String class at runtime, and that is defined to be UTF-16; what encoding it has on disk has absolutely no importance (even though that encoding is still UTF-16).
lupus
-- ----------------------------------------------------------------- lu...@debian.org debian/rules lu...@ximian.com Monkeys do it better
Gopal V: # But coming back to parrot ... I don't think parrot uses UTF8 # (from what I could gather it seems to be all ASCII ?) ... Or # is UTF8 hiding in # somewhere ?...
Parrot will have a "default string type" that's build-specific, so that e.g. Asian nations can have whatever the most popular encoding is in their country. The "default default string type" will be utf8, but it's currently ASCII because Unicode Is Hard.
"If you want to propagate an outrageously evil idea, your conclusion must be brazenly clear, but your proof unintelligible." --Ayn Rand, explaining how today's philosophies came to be
>But coming back to parrot ... I don't think parrot uses UTF8 (from what >I could gather it seems to be all ASCII ?) ... Or is UTF8 hiding in >somewhere ?...
Unicode is hiding in the ICU directory, which we need to get integrated. We'll probably be mostly UTF16, only because that's what ICU uses and there's no good reason to reinvent the wheel again. All three Unicode encodings (UTF-8, UTF-16 and UTF-32) and their endian variants will be supported, as will a variety of other encodings and character sets. -- Dan
--------------------------------------"it's like this"------------------- Dan Sugalski even samurai d...@sidhe.org have teddy bears and even teddy bears get drunk
>Gopal V: ># But coming back to parrot ... I don't think parrot uses UTF8 ># (from what I could gather it seems to be all ASCII ?) ... Or ># is UTF8 hiding in ># somewhere ?...
>Parrot will have a "default string type" that's build-specific, so that >e.g. Asian nations can have whatever the most popular encoding is in >their country. The "default default string type" will be utf8, but it's >currently ASCII because Unicode Is Hard.
Well... default may well be latin-1 or plain ASCII, because Unicode Is Unnecessary. :) Well, most of the time at least. Unicode, of course, will be available, but if the data coming in is ASCII or Latin-1, re-encoding's a bit of a waste of time. -- Dan
--------------------------------------"it's like this"------------------- Dan Sugalski even samurai d...@sidhe.org have teddy bears and even teddy bears get drunk
On Fri, 10 Jan 2003 11:49:14 -0500, Dan Sugalski wrote: > At 1:37 PM +0000 1/10/03, Peter Haworth wrote: > >On Thu, 9 Jan 2003 16:40:20 -0500, Dan Sugalski wrote: > >> #10 We do MI, but we don't instantiate a class' attributes multiple > >> times if its in the hierarchy for a class more than once. If it is, > >> the leftmost instance is real, the rest are virtual
> >This will mean we can't support Eiffel
> Nope. :)
I realised this soon after leaving for the weekend, thus leaving myself looking stupid for an extended period of time :-)
> Eiffel's classes, IIRC, are compile-time fixed so it can do the necessary > code cloning and renaming magic to make it all work.
Exactly. At the implementation level you end up directly inheriting once from the offending class, and reimplementing some/all of the features for the repeated inheritance, either directly in the derived class, or in a specially constructed modified copy of the base class which is only used for inheritance by the derived class.
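By contrast, the #10 rule quoted above (the leftmost instance is real, the rest are virtual) can be sketched as a layout pass that gives each class one attribute block per object, however many times it shows up in the hierarchy. This is a guess at one workable scheme, not the proposal's actual algorithm: the depth-first, left-to-right order and all the names (`klass`, `layout`, `layout_add`, `layout_offset`) are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARENTS 4
#define MAX_CLASSES 16

/* Hypothetical class record: a name, its own attribute count, ordered parents. */
struct klass {
    const char *name;
    int n_attrs;
    int n_parents;
    const struct klass *parents[MAX_PARENTS];
};

/* Attribute layout for one object: where each class's attribute block starts. */
struct layout {
    int n_classes;
    const struct klass *classes[MAX_CLASSES];
    int offsets[MAX_CLASSES];
    int total_attrs;
};

/* Returns the block offset for k, or -1 if k has no storage yet. */
static int layout_offset(const struct layout *l, const struct klass *k)
{
    int i;
    for (i = 0; i < l->n_classes; i++) {
        if (l->classes[i] == k) return l->offsets[i];
    }
    return -1;
}

/* Depth-first, left-to-right walk; a class met a second time is "virtual"
 * and gets no further storage. */
static void layout_add(struct layout *l, const struct klass *k)
{
    int i;
    if (layout_offset(l, k) >= 0) return;
    for (i = 0; i < k->n_parents; i++) {
        layout_add(l, k->parents[i]);
    }
    assert(l->n_classes < MAX_CLASSES);
    l->classes[l->n_classes] = k;
    l->offsets[l->n_classes] = l->total_attrs;
    l->n_classes++;
    l->total_attrs += k->n_attrs;
}
```

For a diamond where D inherits from B and C, and both inherit from A, A's attributes are laid out once, on the path through B. Eiffel-style repeated inheritance would need a second, renamed copy of A's attributes, which this scheme deliberately never creates.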
I'm not an Eiffel programmer either, but I have read OOSC, so I know enough to make me dangerous.
-- Peter Haworth p...@edison.ioppublishing.com "An IRC channel, in ERROR?! On Undernet no less?! THE DEUCE YOU SAY!! Next thing you're going to tell me the commentary on Slashdot isn't totally impartial!" -- Michael G Schwern
> -----Original Message----- > From: Dan Sugalski [mailto:d...@sidhe.org]
[snip]
> Objects, as far as I see it, have the following properties:
> 1) They have runtime-assignable properties
Terminology question: what is the difference between a property and an attribute? Perhaps the answer could go in the glossary.
[snip]
> #3 Since each class has a PMC type, we just need to associate the PMC > type with the data a class needs. Also pretty much no big deal, and > something we already have facilities for.
So, while there may be exceptions, generally all classes will be instances of the Class PMC, true?
[snip]
> The call/jmpmeth opcodes either make a returnable or non-returnable > method call. They fetch the function pointer from the object PMC and > either dispatch to it or save the current state and jsr to it. Note > that you can't jmpmeth into a C-implemented method, so no tail > calling into those without some wrapper opcodes. The registers still > need to be set up appropriately, just like with a regular sub call.
So the call opcode takes a method name or offset and calls a vtable method to find the method and then invokes it?
> The find_method vtable entry should die, and be replaced with a plain > method entry. This should return either the address of the start of > the method's bytecode, or NULL. The NULL return is for those cases > where the method actually executed via native code, and thus doesn't > have to go anywhere. If an address is returned it's expected that the > engine will immediately dispatch to that spot, obeying parrot's > calling conventions.
Not sure what this means. Does it mean that there is a method named "find_method", accessed something like
call Px, Py, "find_method"
which I can then call to find the method, or am I off?
[snip]
> The structures:
> *) Attr PMCs have an array off their data pointer.
> *) Classes are variants of Attr PMCs. They have the following > guaranteed attributes in slots:
> 0) Hash with class name to attribute offset
I am not sure what this means. Don't we already have the class and its attributes if we are accessing these slots?
> 1) Hash with attribute name to attribute offset (relative to the > offset found from the hash in slot 0, generally known at > compile time, but introspection is nice) > 2) Integer number of attributes this class has > 3) Notification array
Do we store pointers to parent classes in one of these slots? Also, can I access slots like:
set Px, Py[1] # store the name to offset hash in Px
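One possible reading of slots 0 and 1, sketched in C: slot 0 of the object's class says where each class's block of attributes starts inside the object, and slot 1 of the declaring class gives the attribute's position within that block. This is only an interpretation of the quoted list. The hashes are replaced by small linear tables, and the names (`name_offset`, `class_info`, `attr_slot`) are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct name_offset { const char *name; int offset; };

/* Class metadata loosely mirroring the guaranteed slots in the quoted list. */
struct class_info {
    const struct name_offset *class_offsets;  /* slot 0: class name -> block start */
    size_t n_classes;
    const struct name_offset *attr_offsets;   /* slot 1: attr name -> offset in block */
    size_t n_attrs;                           /* slot 2: attributes this class declares */
    /* slot 3, the notification array, is left out of this sketch */
};

static int lookup(const struct name_offset *t, size_t n, const char *name)
{
    size_t i;
    assert(t != NULL && name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(t[i].name, name) == 0) return t[i].offset;
    }
    return -1;
}

/* Absolute attribute slot inside an object of obj_class, for an attribute
 * declared by the class named declaring_name. Returns -1 if either lookup fails. */
static int attr_slot(const struct class_info *obj_class,
                     const struct class_info *declaring,
                     const char *declaring_name, const char *attr)
{
    int base = lookup(obj_class->class_offsets, obj_class->n_classes, declaring_name);
    int rel  = lookup(declaring->attr_offsets, declaring->n_attrs, attr);
    return (base < 0 || rel < 0) ? -1 : base + rel;
}
```

Under this reading, slot 0 is needed because the same parent class can start at a different offset in each child class, so a method compiled for the parent knows only the relative offset from slot 1.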
[snip]
So to sum up, we need the following PMCs:
pmclass Ref { data is a pointer to an object }
pmclass Attr { data is an array of attributes }
pmclass Class extends Attr { }
pmclass Object { this was not explained, but I guess it at least has a reference to a Class and field data ??? }