[RFD] Symbol naming and imcc2


Melvin Smith

Feb 11, 2004, 12:26:16 PM
to perl6-i...@perl.org
RFD = Request For Discussion ;)


There has been much discussion on IRC concerning
symbol names.

The request, mainly, is for imcc to handle sigil characters
from other languages which basically equates to exposing
a lot to imcc from the high-level language. I won't
argue how much of that is good or bad; I'd rather just try to
make imcc as friendly as possible.

The state of things:

1) Declared symbols can be handled pretty easily with any character
we want to support; imcc just has to track it. It just so happens
that we don't allow many non-alpha characters at this time.

2) $ is currently used to denote a symbolic register ($I[0-9]+ is an int
register), which is not pre-declared. It just pops up in the instruction
stream and imcc assigns a register.

It is possible that we can stick with $ for temporaries, but make imcc
check symbol tables first, and allow people to declare symbols with $
as well. This would solve some issues but might make for some
confusing looking code.

In reality it would not really be that confusing if you don't name your
variables with the same convention as temporaries, but who can guarantee
that?

.local PerlString $str
.local PerlString $I123
$I123 = "help"
$str = $I123
$I124 = 1 + 2
$I125 = $I124 * 3
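
The "check symbol tables first" lookup order can be sketched roughly as follows. This is Python rather than imcc's own code, and the names resolve, declared and TEMP_RE are hypothetical, used only for illustration:

```python
import re

# Hypothetical pattern for symbolic temporaries such as $I12, $N3, $S0, $P7.
TEMP_RE = re.compile(r"^\$([INSP])(\d+)$")

def resolve(name, declared):
    """Classify a name, consulting the declared-symbol table first."""
    if name in declared:
        # A .local declaration wins, even if the name looks like a temporary.
        return ("local", declared[name])
    m = TEMP_RE.match(name)
    if m:
        # Undeclared and shaped like $I123: a temporary in register set m.group(1).
        return ("temp", m.group(1))
    raise KeyError(f"undeclared symbol: {name}")

# Mirrors the two .local lines in the example above.
declared = {"$str": "PerlString", "$I123": "PerlString"}
```

With this table, $I123 resolves as a declared PerlString local, while the undeclared $I124 falls through to the temporary rule, which is exactly the ambiguity a reader of the code has to keep in mind.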

The $ no longer stands out as a temporary, so maybe we choose a different
character for temp registers (I have suggested using a period C<.>)

.local PerlString $str
.local PerlString $I123
$I123 = "help"
$str = $I123
.i124 = 1 + 2
.i125 = .i124 * 3


Another option is to use quotes for symbols with sigils, but since most
of our code will end up coming from Perl6, that won't be optimizing
for the common case.

Finally, I for one, support name mangling. It's arguable how much high
level compilers should expose to the back-end compiler. I think, though,
that most people prefer to be able to debug PIR code and see the
original symbols, and I sympathize.

This is just an example to stimulate discussion. I'd like to hear all sides
before making any decisions.

(And remember namespaces when considering solutions)

-Melvin


Matt Fowles

Feb 11, 2004, 3:04:53 PM
to Melvin Smith, perl6-i...@perl.org
All~

I don't like the leading C<.> option, what about having a leading _ for
temporaries instead and allowing any non-space, non-operator character
in symbol names, so _$foo would be a valid temp. This has the advantage
of not conflicting with symbolic registers.

The other option is to force registers to use a sigil that does not
occur in perl such as #. Then all of the "normal" sigils (&, %, @, and
$) would be available for use in variable names.

Matt

Jonathan Worthington

Feb 11, 2004, 6:48:12 PM
to Matt_...@softhome.net, Melvin Smith, perl6-i...@perl.org
Hi,

> I don't like the leading C<.> option, what about having a leading _ for
> temporaries instead and allowing any non-space, non-operator character
> in symbol names, so _$foo would be a valid temp. This has the advantage
> of not conflicting with symbolic registers.
>

But couldn't it potentially conflict with labels, which also start with an
underscore right now, IIRC? I agree with the idea that non-alphanumeric
characters should be allowed in symbol names, such as $, though. [1]

> The other option is to force registers to use a sigil that does not
> occur in perl such as #

# is currently used for comments. Of course, we can change this (and the
underscore situation above if I'm right about it) if we really
wanted...but...

> Then all of the "normal" sigils (&, %, @, and $) would be available for
> use in variable names.

It isn't just Perl we're dealing with. Other languages could potentially
have other sigils. e.g. _ and #. Other languages have no sigils, in which
case name mangling is certainly needed as a variable called I2, for example,
would cause all kinds of "fun" if not mangled.

I would go with the idea of having a sigil that is placed before all local
variables, and another (different!) sigil for registers (of the IMCC-handled
type). Anything without one of those is a direct register access. Or a
syntax error. Clean, simple rules. What the sigils are is relatively
immaterial if what is placed after them (for locals, not registers) can
contain non-alphanumeric stuff. And whatever sigils a language wants can
be put there. This way, name mangling can be "avoided" - though arguably
we're defining a syntax that "auto-mangles". :-)
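
A rough sketch of those rules in Python, assuming "!" for locals and "$" for IMCC-handled registers. Both characters are placeholders picked for illustration, and classify is a hypothetical helper, not anything in IMCC:

```python
import re

LOCAL_SIGIL = "!"   # hypothetical choice; the actual character is left open
TEMP_SIGIL = "$"    # hypothetical choice for IMCC-allocated registers
DIRECT_RE = re.compile(r"[INSP]\d+")  # direct register access, e.g. I2 or P5

def classify(token):
    """Apply the three rules: local sigil, register sigil, or direct register."""
    if token.startswith(LOCAL_SIGIL) and len(token) > 1:
        # Everything after the sigil is the language's own name, sigils included.
        return ("local", token[1:])
    if token.startswith(TEMP_SIGIL) and DIRECT_RE.fullmatch(token[1:]):
        return ("temp", token[1:])
    if DIRECT_RE.fullmatch(token):
        return ("register", token)
    # Anything without one of the sigils that is not a register is an error.
    raise SyntaxError(f"unrecognized operand: {token}")
```

Under these assumptions a source variable that happens to be called I2 is written !I2 and can no longer be mistaken for the register I2.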

Jonathan

[1] (C|S)hould we potentially provide a "quoting" mechanism, e.g. for
languages that want variable names containing characters that are not
allowed due to IMCC syntax rules? Or is it up to the compiler to emit
"compliant" names? And I'm too scared of unicode to mention unicode
variable names.


Pete Lomax

Feb 11, 2004, 6:59:54 PM
to perl6-i...@perl.org
On Wed, 11 Feb 2004 15:04:53 -0500, Matt Fowles
<Matt_...@softhome.net> wrote:

>All~
>
>I don't like the leading C<.> option, what about having a leading _ for

I don't care. Really, I don't care. I kinda like $, but I don't care.
I currently get by just with $[I.N.S.P]nnn symbolic temporaries
because I set a flag to use them. At the switch of a flag I can emit
code using _XX_<original_var_name>_nnnn, where XX is some helpful info
regards type, <original_var_name> is, well, the original var name, and
nnnn is a four-digit number I made up to make me fairly happy that
imcc won't confuse the local integer i in one routine with the local
integer i in another routine, since it is obvious to me that IMCC
cannot possibly cope with different scope rules for languages left
right and sundry.
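
For illustration, a mangling scheme of that shape might look like the following Python sketch. The Mangler class, the "LI" tag and the simple counter are assumptions made for the example, not the actual generator:

```python
class Mangler:
    """Builds names of the form _XX_<original_var_name>_nnnn."""

    def __init__(self):
        self._next = 1  # hypothetical source of the four-digit suffix

    def mangle(self, type_tag, original_name):
        if len(type_tag) != 2:
            raise ValueError("type tag must be two characters")
        n = self._next
        self._next += 1
        # The zero-padded counter keeps same-named locals in different
        # routines from colliding.
        return f"_{type_tag}_{original_name}_{n:04d}"
```

Two routines that each declare a local integer i would then get distinct names, and the original name stays readable in the middle of the mangled one.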

Personally, I think you should change $ to . if and only if it helps
perl (which is not my bag). The rest of us, in the words of Dan, can
cope: a little whining is acceptable, if somewhat unbecoming ;-)

Luke Palmer

Feb 11, 2004, 7:02:49 PM
to Jonathan Worthington, Matt_...@softhome.net, Melvin Smith, perl6-i...@perl.org
Jonathan Worthington writes:
> I would go with the idea of having a sigil that is placed before all local
> variables, and another (different!) sigil for registers (of the IMCC-handled
> type). Anything without one of those is a direct register access. Or a
> syntax error. Clean, simple rules. What the sigils are is relatively
> immaterial if what is placed after them (for locals, not registers) can
> contain non-alphanumeric stuff. And whatever sigils a language wants can
> be put there. This way, name mangling can be "avoided" - though arguably
> we're defining a syntax that "auto-mangles". :-)

Hooray! That is precisely what sigils are for. No use in making a
"variable" sigil that shares its name with a register sigil.

On the other hand, we could define four sigils and do away with the
$S358 syntax, like so:

?foo # I register
+foo # N register
~foo # S register
$foo # P register

(I took the most Perl6ish representative sigils I could think of...
doesn't matter what they are, really)

> Jonathan
>
> [1] (C|S)hould we potentially provide a "quoting" mechanism, e.g. for
> languages that want variable names containing characters that are not
> allowed due to IMCC syntax rules? Or is it up to the compiler to emit
> "compliant" names? And I'm too scared of unicode to mention unicode
> variable names.

I can envisage something like:

$'%foo' = new PerlHash

But is that all that more readable than:

$Hfoo = new PerlHash

I would say we should definitely go with something like this if
registers held more permanent values. But in writing my own compilers,
I've found that most of my register naming comes from prefixing a
constant string to an incremented counter. Lexical pads already let
you do this, and those are the ones that need to.

Basically, I think the current system works fine, but it would be nice
to namespace locals somehow.

Luke

Leopold Toetsch

Feb 12, 2004, 1:54:03 AM
to Melvin Smith, perl6-i...@perl.org
Melvin Smith <mrjol...@mindspring.com> wrote:

> Another option is to use quotes for symbols with sigils,

And we have to cope with unicode finally. So I'd vote for that
alternative. *But* as code normally comes out of a compiler and there
may be many different compilers, we can't deal with arbitrary symbols,
because, we don't know the scoping rules of these compilers.

We can only deal with mangled symbol names.

my $i;
{ my $i ; }

> (And remember namespaces when considering solutions)

Yes. We need IMHO something like:

- .lexical <type> name
- .lexical <type> name, '$unmangled_orig_name'
- .global ...

where C<name> is a mangled symbol name like now or even C<$P\d+>. We
have to know whether the symbol is a temporary or not for spilling.
Lexicals and globals have their store in the lex pad or in the stash, so
for spilling we don't have to store these variables; we only need to
refetch, where we now fetch from the spill array.

The .lexical and .global directives should use the appropriate lexical
or global opcodes to deal with these symbols.

The unmangled name is just for diagnostics and will be stored in a
different packfile segment.
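
The spilling distinction can be sketched like this in Python. Symbol, spill_plan and the action strings are hypothetical and only illustrate the decision, not the register allocator itself:

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str            # mangled name used in the instruction stream
    kind: str            # "temp", "lexical", or "global"
    orig_name: str = ""  # unmangled name, kept only for diagnostics

def spill_plan(sym):
    """Return the (save, restore) actions needed if sym loses its register."""
    if sym.kind == "temp":
        # Temporaries have no other home, so they go through the spill array.
        return ("store to spill array", "fetch from spill array")
    if sym.kind == "lexical":
        # Already stored in the lex pad: nothing to save, just refetch.
        return (None, "refetch from lex pad")
    if sym.kind == "global":
        # Already stored in the stash: nothing to save, just refetch.
        return (None, "refetch from stash")
    raise ValueError(f"unknown symbol kind: {sym.kind}")
```

The point of the sketch is only that the save step disappears for anything declared with .lexical or .global, which is why the declaration kind has to be visible.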

> -Melvin

leo

Dan Sugalski

Feb 17, 2004, 3:15:35 PM
to Melvin Smith, perl6-i...@perl.org
At 12:26 PM -0500 2/11/04, Melvin Smith wrote:
>
>The request, mainly, is for imcc to handle sigil characters
>from other languages which basically equates to exposing
>a lot to imcc from the high-level language.

If you're looking for a "How do I use $foo in my imcc code?" then I
have one of two answers:

1) You don't, doofus. Go fetch it out of the symbol table by name, with

var1 = global [foo; bar] "$foo"

or

var1 = local "$foo"

2) .alias is your friend!

.alias some_nice_symbol global [foo;bar] "$foo"
.alias some_other_symbol local "$bar"

Either way, I don't think IMCC should have to deal with language
symbols explicitly.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                        have teddy bears and even
                                      teddy bears get drunk

Leopold Toetsch

Feb 17, 2004, 4:41:39 PM
to Dan Sugalski, Melvin Smith, perl6-i...@perl.org
Dan Sugalski wrote:

> Either way, I don't think IMCC should have to deal with language symbols
> explicitly.

That's true. But still we need to know *what are* language symbols.
I've stated several times that for the spilling code it's essential to
know if a symbol already has a store in either lexicals or globals, so
that we can just cut down the life range of such symbols if spilling is
needed.

leo

Dan Sugalski

Feb 17, 2004, 4:58:55 PM
to Leopold Toetsch, Melvin Smith, perl6-i...@perl.org

Right, hence the option to either use global/local (or something like
that) to load into safely named things, or adding in .alias to rename
them to something safe.

Or, I suppose, we could go and move IMCC over to being AST-driven,
in which case it turns into a simple text->AST mapping problem... :)
