I'm Rhys Weatherley, the author of Portable.NET, which is part
of the DotGNU project. (Put down that flame thrower! I come
in peace. :-) )
DotGNU is currently reaching out to other projects in the OSS/FS
world to see how we can help you and how you might be able to
help us. One of the projects that we are looking into is
compiling C# to Parrot bytecode, so that perhaps people in
the Parrot community may in turn be interested in helping us
complete our compiler and system library.
The Portable.NET C# compiler, cscc, is very extensive, and is
capable of generating output for multiple bytecode formats (IL
and JVM are currently supported, more or less).
I'm aware of Cola, but I know from bitter experience that the
hardest part of C# is parsing and semantic analysis, which
we've already got working. So we may be able to help you on
that front. We also have the beginnings of a C->bytecode
compiler.
So much for the high level stuff. Getting technical ...
I'm a bit confused as to how one creates a user-defined class
in Parrot, and makes virtual method calls, accesses fields,
and what-not. I can't seem to find a good example (Cola does
non-virtual methods only at present).
Is there a convention for which registers must be saved
across a call and which can be clobbered? Using the
saveall/restoreall convention and passing all values on
the stack doesn't seem terribly efficient. But maybe I'm
missing something? Is the JIT smart enough to optimize
away unnecessary copies?
What is the size of the "int" type? Will it always be 32 bit
or is it "whatever is best for the machine"? And how do
I perform a "sign extend" operation?
Is there some means to store and access auxiliary data in
a Parrot bytecode file? I might need this to store metadata
for supporting C# reflection.
We in DotGNU should be able to bang out C#->Parrot fairly
quickly, if we can resolve the above issues.
Cheers,
Rhys.
http://www.southern-storm.com.au/portable_net.html
http://www.dotgnu.org/
> The Portable.NET C# compiler, cscc, is very extensive, and is
> capable of generating output for multiple bytecode formats (IL
> and JVM are currently supported, more or less).
Have a look at imcc, which is our high level assembler. imcc does
register allocation and (currently little) optimization. perl6 produces
IMCC code. imcc can also run the code or write PBC files.
> I'm a bit confused as to how one creates a user-defined class
> in Parrot, and makes virtual method calls, accesses fields,
> and what-not.
Not yet.
> Is there a convention for which registers must be saved
> across a call and which can be clobbered?
docs/pdds/pdd03_calling_conventions.pod
> ... Is the JIT smart enough to optimize
> away unnecessary copies?
AFAIK no; optimization should be done one level up, so all run cores
would benefit.
> What is the size of the "int" type? Will it always be 32 bit
> or is it "whatever is best for the machine"?
It's a Configure option.
> And how do
> I perform a "sign extend" operation?
We currently have only one INTVAL, it should promote to BIGINT.
> Is there some means to store and access auxiliary data in
> a Parrot bytecode file? I might need this to store metadata
> for supporting C# reflection.
There was some discussion WRT extending the bytecode format not too long
ago: see the thread with subject "RFC: static line number information".
> Cheers,
>
> Rhys.
leo
Hey. Don't worry--we're not worried about DotGNU. On the other hand,
if you said you were on the actual .Net development group, it would
probably be toastin' time. :^)
# I'm a bit confused as to how one creates a user-defined class
# in Parrot, and makes virtual method calls, accesses fields,
# and what-not. I can't seem to find a good example (Cola does
# non-virtual methods only at present).
You don't, at least not yet. Eventually, there'll be an Object PMC
(which can be inherited from to make PerlObject, CsObject, etc.); to
retrieve a method, you'd probably call something like
getmethod(methname) on the Object, then push your arguments onto the
stack and call invoke() on the Sub (PerlSub, CsMethod (no subs,
right?)...) PMC getmethod() gave you. For a multimethod subroutine, the
Sub PMC would look at the types of the arguments on the stack and choose
which routine to invoke; for a normal subroutine, it would just invoke
the code. Presumably, the C# compiler could optimize that further.
Oh, by the way, that's all speculation until Dan Sugalski specs it out.
# Is there a convention for which registers must be saved
# across a call and which can be clobbered? Using the
# saveall/restoreall convention and passing all values on the
# stack doesn't seem terribly efficient. But maybe I'm missing
# something? Is the JIT smart enough to optimize away
# unnecessary copies?
There's a PDD on that, but IIRC it's out of date.
# What is the size of the "int" type? Will it always be 32 bit
# or is it "whatever is best for the machine"? And how do I
It's always at least 32 bits, but is selectable at Configure-time. On most
systems it'll be 32, but on a few it'll be 64.
# Is there some means to store and access auxiliary data in
# a Parrot bytecode file? I might need this to store metadata
# for supporting C# reflection.
You would probably attach this to the CsObject and CsMethod PMCs.
# We in DotGNU should be able to bang out C#->Parrot fairly
# quickly, if we can resolve the above issues.
Please remember that Parrot is very much a work in progress. We've done
a lot in the one year, one month and one week (really!) since we
released, but we still have a lot to do before it's really usable for
full languages.
--Brent Dax <bren...@cpan.org>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)
Wire telegraph is a kind of a very, very long cat. You pull his tail in
New York and his head is meowing in Los Angeles. And radio operates
exactly the same way. The only difference is that there is no cat.
--Albert Einstein (explaining radio)
> The Portable.NET C# compiler, cscc, is very extensive, and is
> capable of generating output for multiple bytecode formats (IL
> and JVM are currently supported, more or less).
Oh, excellent. If you're already targeting both then it should be
fairly easy to target Parrot too (still-to-be-developed features
notwithstanding). This is quite interesting indeed, especially if you have
a good test suite ;-) I'll try and have a look at it over the weekend.
Leon
--
Leon Brocard.............................http://www.astray.com/
scribot.................................http://www.scribot.com/
.... Dyslexics of the world, untie!
> Have a look at imcc, which is our high level assembler. imcc does
> register allocation and (currently little) optimization. perl6 produces
> IMCC code. imcc can also run the code or write PBC files.
Yes, I saw that. I haven't yet decided whether to generate pasm
or imcc directly from cscc. I did have some problems getting
"test_spilling.imc" to work. Is this a known issue?
> > What is the size of the "int" type? Will it always be 32 bit
> > or is it "whatever is best for the machine"?
>
> It's a Configure option.
That may be a bit of a problem, as C# (and Java for that matter)
is very particular about the sizes for its types, and the
behaviour of operations. e.g.
int x = (0x80000000 + 0x80000000) / 2;
This will give 0 on a 32-bit system, but 0x80000000 on a 64-bit.
Note: I'm not criticising Parrot's choice to use native integers.
It just would be nice for a compiler to be able to say "this type
must be n bits because I say so!".
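
The width dependence described above can be reproduced outside Parrot. The following is only an illustrative sketch in Python, not Parrot or C# code: Python integers are unbounded, so a register of a given width is emulated here with a mask, and the helper names `wrap_unsigned` and `average_of_two` are made up for this example. The values are treated as unsigned for simplicity.

```python
def wrap_unsigned(value, bits):
    """Reduce value modulo 2**bits, as a fixed-width unsigned register would."""
    return value & ((1 << bits) - 1)


def average_of_two(a, b, bits):
    """Add in an emulated register of the given width, then halve the result."""
    total = wrap_unsigned(a + b, bits)
    return total // 2


# Same expression, two emulated register widths.
narrow = average_of_two(0x80000000, 0x80000000, 32)  # the sum wraps to 0
wide = average_of_two(0x80000000, 0x80000000, 64)    # the sum fits, no wrap
print(hex(narrow), hex(wide))
```

With a 32-bit register the carry out of bit 31 is lost before the division, which is why the two widths disagree.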
> > And how do
> > I perform a "sign extend" operation?
>
> We currently have only one INTVAL, it should promote to BIGINT.
That's not quite what I was after. Here is an example:
int x = ...;
int y = (short)x;
The value of x is truncated to 16 bits, and then sign-extended
to int. I'm looking for something like the "conv.i2" instruction
in IL, or "i2s" in JVM.
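
For readers unfamiliar with the operation being asked for, here is a minimal sketch of "truncate, then sign-extend" in Python. It is not a Parrot opcode; `sign_extend` is a hypothetical helper name, and the arithmetic simply follows the two's-complement description given above.

```python
def sign_extend(value, bits):
    """Keep the low `bits` bits of value and reinterpret them as two's complement."""
    mask = (1 << bits) - 1
    low = value & mask              # truncate to the narrow width
    sign_bit = 1 << (bits - 1)
    # If the top bit of the narrow value is set, the result is negative.
    return low - (1 << bits) if low & sign_bit else low


# Emulates "int y = (short)x;" for a few values of x.
for x in (0x12345, 0x18000, 70000, -1):
    print(x, "->", sign_extend(x, 16))
```

The same helper with `bits=8` would cover the "int -> byte" case.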
> > Is there some means to store and access auxillary data in
> > a Parrot bytecode file? I might need this to store metadata
> > for supporting C# reflection.
>
> There was some discussion WRT extending the bytecode format not too long
> ago: see the thread with subject "RFC: static line number information".
I have a suggestion: allow compilers to embed arbitrary extra
"named sections" in the bytecode format. From the point of view
of the Parrot VM, it's just binary data to be ignored. But the
program can do an "open_section" or something to read the data
as a stream. Then it is up to the compiler and the associated
language libraries as to how the data is interpreted. Just a
suggestion.
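
To make the "named sections" idea concrete, here is a rough sketch in Python. The layout is invented purely for illustration (a 2-byte name length, the name, a 4-byte data length, then the data) and is not the Parrot bytecode format; `pack_sections` is a hypothetical helper, and `open_section` only borrows the name floated in the suggestion above.

```python
import io
import struct


def pack_sections(sections):
    """Serialize a {name: bytes} mapping into one blob of named sections."""
    out = bytearray()
    for name, data in sections.items():
        encoded = name.encode("utf-8")
        out += struct.pack("<H", len(encoded)) + encoded
        out += struct.pack("<I", len(data)) + data
    return bytes(out)


def open_section(blob, wanted):
    """Return a readable stream over the named section, or None if absent."""
    offset = 0
    while offset < len(blob):
        (name_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (data_len,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        if name == wanted:
            return io.BytesIO(blob[offset:offset + data_len])
        offset += data_len          # skip sections nobody asked for
    return None


blob = pack_sections({"code": b"\x00\x01", "csharp.metadata": b"class Foo"})
print(open_section(blob, "csharp.metadata").read())
```

A reader that does not recognise a section name just skips over it by length, which is what lets the VM ignore the data while a language library interprets it.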
Cheers,
Rhys.
> # I'm a bit confused as to how one creates a user-defined class
> # in Parrot, and makes virtual method calls, accesses fields,
> # and what-not. I can't seem to find a good example (Cola does
> # non-virtual methods only at present).
>
> You don't, at least not yet. Eventually, there'll be an Object PMC
> (which can be inherited from to make PerlObject, CsObject, etc.); to
> retrieve a method, you'd probably call something like
> getmethod(methname) on the Object, then push your arguments onto the
> stack and call invoke() on the Sub (PerlSub, CsMethod (no subs,
> right?)...) PMC getmethod() gave you.
Seems reasonable. For the time being, I can jury-rig something
with PerlHash PMCs, and plug in the "real deal" when it is available.
> For a multimethod subroutine, the
> Sub PMC would look at the types of the arguments on the stack and choose
> which routine to invoke; for a normal subroutine, it would just invoke
> the code. Presumably, the C# compiler could optimize that further.
My compiler already knows which variant to call, so multimethods
wouldn't be much use to me (they might be for other languages).
It would be easier for me to do getmethod("int foo(Object)")
instead of getmethod("foo") and a multimethod invoke().
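
As a rough illustration of the hash-based stopgap and the signature-keyed lookup described above, here is a sketch in Python. Nothing here is Parrot API: `SketchClass` is a made-up stand-in for the speculative Object PMC, and the signature strings are just dictionary keys chosen by the compiler.

```python
class SketchClass:
    """Stand-in for a class object: a method table keyed by full signature."""

    def __init__(self, methods, parent=None):
        # Start from the parent's table so inherited methods stay reachable,
        # then let this class's entries override them (a virtual override).
        self.methods = dict(parent.methods) if parent else {}
        self.methods.update(methods)

    def getmethod(self, signature):
        # The compiler has already picked the overload, so lookup is a plain
        # hash fetch with no run-time inspection of argument types.
        return self.methods[signature]


base = SketchClass({
    "int foo(Object)": lambda arg: 1,
    "int foo(String)": lambda arg: 2,
})
derived = SketchClass({"int foo(Object)": lambda arg: 10}, parent=base)

print(derived.getmethod("int foo(Object)")(None))  # overridden in derived
print(derived.getmethod("int foo(String)")("x"))   # inherited from base
```

Keying on the whole signature moves overload resolution to compile time, leaving only the virtual lookup to be done at run time.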
> Oh, by the way, that's all speculation until Dan Sugalski specs it out.
Of course.
> # We in DotGNU should be able to bang out C#->Parrot fairly
> # quickly, if we can resolve the above issues.
>
> Please remember that Parrot is very much a work in progress. We've done
> a lot in the one year, one month and one week (really!) since we
> released, but we still have a lot to do before it's really usable for
> full languages.
I understand, as Portable.NET is also a work in progress. My hope
is that the two projects can help each other. Certainly, cscc should
be able to "torture the Parrot" from a non-scripting angle, which
will be invaluable for helping it evolve stronger feathers. :-)
Cheers,
Rhys.
One conceivable way to do that is basically have the hypothetical
CSInt and CSShort PMC classes know and care how many bits they're
supposed to be, and handle overflow internally.
If you need an op to do it, you can always provide a custom one. I think.
--
fga is frequently given answers... the best are "Date::Calc", "use a hash",
and "yes, it's in CPAN" or Data::Dumper or mySQL or "check your permissions"
or NO Fmh THAT'S WRONG or "You can't. crypt is one-way" or "yes, i'm single"
or "I think that's a faq." or substr! or "use split" or "man perlre" - #perl
> Leopold Toetsch wrote:
>
> > > What is the size of the "int" type? Will it always be 32 bit
> > > or is it "whatever is best for the machine"?
> >
> > It's a Configure option.
>
> That may be a bit of a problem, as C# (and Java for that matter)
> is very particular about the sizes for its types, and the
> behaviour of operations. e.g.
>
> int x = (0x80000000 + 0x80000000) / 2;
>
> This will give 0 on a 32-bit system, but 0x80000000 on a 64-bit.
>
> Note: I'm not criticising Parrot's choice to use native integers.
> It just would be nice for a compiler to be able to say "this type
> must be n bits because I say so!".
While that sounds reasonable to want to do, it is, alas, very very hard.
One significant underlying problem is that, on some platforms where we
want Parrot to run (e.g. some Crays) there simply *is no 32-bit type*.
short, int, and long are all 64 bits. I have no idea what C# or Java
would do on such a platform.
--
Andy Dougherty doug...@lafayette.edu
> DotGNU is currently reaching out to other projects in the OSS/FS
> world to see how we can help you and how you might be able to
> help us.
It looks like the DotGNU weekly IRC meeting will be discussing
Parrot. Could be interesting:
http://www.dotgnu.org/pipermail/developers/2002-October/008345.html
Leon
Interesting indeed. I'll try and make the later meeting. (I'll give a
shot at the earlier meeting, but that's either 5AM or 6AM (I can't
recall) which is awfully early for a non-morning person like me... :)
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
The author of that mail needs to learn the difference between GMT and UTC:
> Note: the times are UTC, not GMT. That is, they do not include
> daylight savings adjustments during the northern summer. If you
> have a Unix system, then the command "date; date -u" will give
> you your current time in both local and UTC, allowing you to
> determine when the next meeting will occur.
GMT does not have daylight savings adjustments.
Graham.
> Leopold Toetsch wrote:
[ imcc ]
> Yes, I saw that. I haven't yet decided whether to generate pasm
> or imcc directly from cscc. I did have some problems getting
> "test_spilling.imc" to work. Is this a known issue?
Now yes ;-) Last cleanup changes did break the spilling code. I've got
it running again, though there are still minor issues.
I'll commit it in a minute.
> int x = (0x80000000 + 0x80000000) / 2;
>
> This will give 0 on a 32-bit system, but 0x80000000 on a 64-bit.
No, this would be done in your integer class.
> Note: I'm not criticising Parrot's choice to use native integers.
Native integers are for HL languages that know the limitations, or for
fast packed arrays and such.
All the rest (including sign extension) could go in extra classes.
Though, on popular demand, there could be some opcodes for e.g.
int<->short conversions.
[ bytecode ]
> I have a suggestion: allow compilers to embed arbitrary extra
> "named sections" in the bytecode format.
Yes, that was my proposal too.
> Cheers,
>
> Rhys.
leo
Yep, that's definitely a good way to do that. In many cases I think
it'll be the right way.
>If you need an op to do it, you can always provide a custom one. I think.
Alternately, we can have ops that act on I regs as 8, 16, 32 or,
maybe, 64 bit quantities. Dealing with I regs as 64 bit quantities on
a 32 bit system is somewhat problematic, as there are interesting
architectural issues. (Do you double up registers? If so, what about
systems that don't need it? Or do you keep a second set of
half-registers squirrelled away somewhere? Which is an interesting
problem in cache coherency, not to mention a hack.)
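
To show what "doubling up registers" costs, here is a sketch in Python of a 64-bit add carried out on two 32-bit halves. It is an illustration only, not Parrot code; `split64`, `join64` and `add64_paired` are hypothetical names, and each half is masked to 32 bits to emulate a 32-bit register.

```python
MASK32 = 0xFFFFFFFF


def split64(value):
    """Split a 64-bit quantity into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32


def join64(lo, hi):
    """Recombine (low, high) halves into one 64-bit quantity."""
    return (hi << 32) | lo


def add64_paired(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values held as register pairs, propagating the carry."""
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32                   # 1 if the low halves overflowed
    lo = lo_sum & MASK32
    hi = (a_hi + b_hi + carry) & MASK32    # carry out of the high half is dropped
    return lo, hi


a, b = 0x00000001FFFFFFFF, 1
result = join64(*add64_paired(*split64(a), *split64(b)))
print(hex(result))
```

Even the simplest operation needs two adds and a carry, which is the kind of overhead the paragraph above is weighing.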
> It looks like the DotGNU weekly IRC meeting will be discussing
> Parrot. Could be interesting:
It was quite interesting. I managed to make it to the early one and
Dan to the later one. An "annotated and abridged chatlog" is available:
http://dotgnu.info/pipermail/developers/2002-October/008380.html
It looks like we're going to need 8,16,32,64 bit types...
Leon
Interesting read. Dan skimmed over this, but what do .NET (and JVM) do
for floating point numbers?
Are we still targeting a middle ground for C? (Enough to be able to
parse and handle structs natively, and possibly even make calls
natively?)
--
Bryan C. Warnock
bwarnock@(gtemail.net|raba.com)
A condensed summary of the IRC meetings has been posted at:
http://www.dotgnu.org/pipermail/developers/2002-October/008380.html
The full logs are also available from http://ajmitch.dhis.org/dotgnu/
Gopal
P.S. Here's my hello world to Parrot: a small example compiler that does IL
and *now* Parrot. http://symonds.net/~gopalv82/code/expr_compiler.tgz
Check out the treecc AST operation management.
--
The difference between insanity and genius is measured by success
IL (Ecma-335)
--------------
4.1.1 Floating Point
The floating point feature set consists of the user-visible
floating-point data types float32 and float64, and
support for an internal representation of floating-point numbers.
Float32 -- Single
Float64 -- Double
And, IIRC, the same for the JVM as well?
> Are we still targeting a middle ground for C? (Enough to be able to
> parse and handle structs natively, and possibly even make calls
> natively?)
Hmm... would you be thinking of something like PInvoke in C#?
(viz a lot like JNI, but marshalls/unmarshalls args automatically,
and we've managed to wrap parts of X11 with it).
Or are you thinking about compiling C to Parrot?
<Dan> : "Consider it both mildly interesting and mildly bemusing :)"
Gopal
> Interesting read. Dan skimmed over this, but what do .NET (and JVM) do
> for floating point numbers?
The CLI has three floating point types, of which 2 are visible
to C# and a third is used by the engine. These are "float32",
"float64", and "native float". The first two have specific
sizes, and are expected to be IEEE compatible.
The third is the native floating point representation for the
underlying machine, which is assumed to have precision greater
than or equal to float64. The actual size is unspecified by
the standard, and it is OK to set "native float == float64".
Floating point operations are always performed in "native float",
and then down-graded to float32 or float64 prior to being stored
in a local or object field.
The long and the short of it is that Parrot's FLOATVAL is good
enough for C#'s purposes, with the minor requirement of a
"double -> float" conversion opcode. Which is essentially a
"truncate and then re-extend to FLOATVAL", similar to the
"int -> short" and "int -> byte" cases.
Cheers,
Rhys.
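
The "double -> float" conversion described in the message above (narrow to 32-bit precision, then widen back) can be sketched in Python, whose floats are IEEE doubles. This is an illustration, not a Parrot opcode: `conv_float32` is a hypothetical name, and the round trip goes through the standard `struct` module's 4-byte float format.

```python
import struct


def conv_float32(value):
    """Round a double to float32 precision and widen it back to a double."""
    # Note: struct raises OverflowError for magnitudes outside float32 range.
    return struct.unpack("<f", struct.pack("<f", value))[0]


exact = conv_float32(0.5)    # 0.5 is representable in float32, so unchanged
rounded = conv_float32(0.1)  # 0.1 is not, so precision is lost
print(exact, rounded, rounded == 0.1)
```

This mirrors the integer case: compute at the wider native precision, then normalise to the narrower type only when the language requires it.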
> Interesting read. Dan skimmed over this, but what do .NET (and JVM) do
> for floating point numbers?
For the JVM:
http://java.sun.com/docs/books/vmspec/2nd-edition/html/Concepts.doc.html#19511
"The floating-point types are float and double, which are conceptually
associated with the 32-bit single-precision and 64-bit
double-precision IEEE 754 values and operations as specified in IEEE
Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard
754-1985 (IEEE, New York)."
More details at:
http://java.sun.com/docs/books/vmspec/2nd-edition/html/Concepts.doc.html#33377
HTH, Leon
I think so. I'm going to add in some conversion ops for the shorter
float forms, and for the partial-sized integers. I'm unsure at the
moment whether I want to commit to full 64 bit integers in I
registers. On the one hand it means a lot more can be done at the low
level, on the other it means things are going to be potentially slow
and emulated on non-64 bit int platforms. Plus it'll waste a fair
amount of L1 cache for no purpose most of the time.
> > Are we still targeting a middle ground for C? (Enough to be able to
> > parse and handle structs natively, and possibly even make calls
> > natively?)
>
> Hmm... would you be thinking of something like PInvoke in C#?
> (viz a lot like JNI, but marshalls/unmarshalls args automatically,
> and we've managed to wrap parts of X11 with it).
Yeah, I want to do that with parrot, being able to on the fly
generate low-level call frames and call into native routines without
having to explicitly generate and compile C code for it.
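
As an analogy only (this is Python's standard `ctypes` module, not anything in Parrot), the following sketch shows the style of native call being described: the signature is declared at run time and the call frame is built on the fly, with no generated or compiled C glue. It assumes a POSIX system, where `ctypes.CDLL(None)` exposes symbols already loaded into the process, including the C library's `strlen`.

```python
import ctypes

# Assumption: POSIX. CDLL(None) opens the running process itself, which
# already has the C library loaded.
libc = ctypes.CDLL(None)

# Describe the native signature at run time: size_t strlen(const char *).
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

# ctypes marshals the bytes object into a char pointer and makes the call.
length = strlen(b"parrot")
print(length)
```

The argument marshalling here is automatic in the same sense as the PInvoke comparison quoted above.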
It's a portability problem, to be sure, but on the other hand if
we're going to have a JIT most places we're already getting much
grubbier.
> I think so. I'm going to add in some conversion ops for the shorter
> float forms, and for the partial-sized integers. I'm unsure at the
> moment whether I want to commit to full 64 bit integers in I
> registers. On the one hand it means a lot more can be done at the low
> level, on the other it means things are going to be potentially slow
> and emulated on non-64 bit int platforms. Plus it'll waste a fair
> amount of L1 cache for no purpose most of the time.
64-bit integers (both signed and unsigned) are rare enough in C#
(and Java) code that accessing them via PMC operations will not
be a huge burden. Putting them in registers is not necessary
on my account.
As to your other message, where you suggest making INTVALs 64-bit
all of the time, I really don't like that proposal. It makes the
common case inefficient. You'll also go mad trying to implement
64-bit multiplication and division in the JIT ( :-) ). If you
make it a PMC operation, you can let the compiler do the work.
There are other places in Parrot that assume that an INTVAL is
the same size as a native pointer (e.g. set_addr). Mandating a
fixed size might break these assumptions.
My humble suggestion is to do this:
INTVAL is guaranteed to be 32 bits or higher in size.
FLOATVAL is guaranteed to be 64 bits or higher in precision.
Int64 is added as a new PMC type.
Then use conversion opcodes to re-normalize results to smaller
fixed sizes where necessary. Languages that care about sizes
must do explicit conversion. Other languages can be free-form.
It will be a little annoying to do this in the C# compiler:
add I0, I1, I2
conv_int32 I0, I0
However, the JIT can optimize away the "conv_int32" on a 32-bit
platform, so it isn't really an issue. But it will be an issue
if you make everything 64-bit.
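
The claim that "conv_int32" is redundant on a 32-bit platform but necessary on a wider one can be checked with a small emulation in Python. This is a sketch under stated assumptions: `to_signed` and `add_then_conv` are hypothetical helpers, `intval_bits` stands for whatever width INTVAL was configured with, and "conv_int32" is the opcode name proposed above rather than an existing Parrot op.

```python
def to_signed(value, bits):
    """Wrap value to `bits` bits and interpret it as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def add_then_conv(a, b, intval_bits):
    """Emulate 'add I0, I1, I2' followed by 'conv_int32 I0, I0'."""
    raw = to_signed(a + b, intval_bits)  # what the add leaves in I0
    normalized = to_signed(raw, 32)      # what conv_int32 would produce
    return raw, normalized


for bits in (32, 64):
    raw, normalized = add_then_conv(0x7FFFFFFF, 1, bits)
    print(bits, raw, normalized)
```

When INTVAL is 32 bits wide the two values agree, so a JIT could drop the conversion; with a 64-bit INTVAL they differ and the conversion does real work.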
BTW, I've yet to come across a compiler that doesn't have some
way to use 64-bit integers. The names vary: "long" on some,
"long long" on others, and the obnoxious "__int64" on Windows.
But that's just a configuration problem.
Cheers,
Rhys.
P.S. Don't forget the unsigned types. :-)