Hyphens vs. Underscores

Daniel Brockman

Nov 16, 2005, 10:05:30 PM

to perl6-l...@perl.org

I'm not a Lisp weenie. However, I have always preferred
hyphens over underscores, and I have always preferred
identifiers that use delimiters over camel-cased ones.

I just think `foo-bar-baz' looks better than `foo_bar_baz'.
Maybe it's the "lexical connotation" of hyphens from natural
language (it joins words), or maybe it's that hyphens fall
more or less on the center of gravity of an identifier,
whereas underscores usually fall beneath the baseline,
making it a purely typographical preference.

Maybe it's that I like Lisp, Dylan, CSS, XSLT, or some other
language in which hyphens are predominant. But I don't
think that's it, because I like even more languages in which
underscores are predominant. It's simply that I like the
look of the identifiers better in Lisp.

I realize that lots of people feel the same, only with the
exact opposite viewpoint. It's purely a matter of taste.

My brother just walked past my monitor, glanced through the
above text and exclaimed, "Yeah, if Perl 6 allowed hyphens
in identifiers, it would be, like, the ultimate language!"

I don't mean to quote my beloved brother as an authority on
this matter (quite to the contrary, I don't think there are
any authorities, since it's a matter of taste). I just mean
to illustrate that people do feel strongly about this issue.
(And I don't need to tell you that aesthetic aspects play a
serious role in deciding, if not whether a person is going
to end up liking the language, then at least what their
first impression is going to be.)

So what is my suggestion? Obviously disallowing underscores
and instead allowing hyphens would just replace one problem
with an even worse problem (not only would there still be
people who don't like hyphens, but it would alienate a large
portion of the user base). Therefore, to make the language
more appealing (i.e., less alienating) to the group of
people I am going to refer to as "Lispers", it's obvious
that both characters should be allowed.

But simply adding hyphen as another character allowed in
identifiers would only superficially solve the problem.
To be sure, it would cause a new group of people, Lispers,
to migrate to Perl 6, and might even cause a few long-time
users (any closet Lispers around?) to switch to hyphens.
But it would also doom these two camps to forever remain at
war with one another.

This is because if you like underscores, you aren't going to
want the libraries you use to have hyphens all over the place.
For one thing, it would be plain hell to have to remember
which packages used underscores and which used hyphens.

Therefore, my suggestion is to allow both characters, but
have them mean the same thing. This concept caused some
confusion on IRC, so let me provide a screenshot:

sub foo-bar ($x) { say $x }
foo_bar(42); # says 42

(image courtesy of luqui on #perl6)

If you think about it, this makes sense. Just as you are
not forced to use the same indentation style as the authors
of the libraries you use, you should be free to use whichever
word-joiner (subtle hint not intended) you prefer.
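
The intended semantics can be pictured with a small sketch, written in Python rather than Perl 6 because the syntax above is only a proposal. It assumes a symbol table that folds hyphens to underscores on every store and lookup; the class name `SynonymTable` is made up for illustration and is not part of any Perl 6 design.

```python
class SynonymTable(dict):
    """Symbol table in which '-' and '_' name the same entry (illustrative only)."""

    @staticmethod
    def _fold(name):
        # Every name is stored in its underscore form.
        return name.replace("-", "_")

    def __setitem__(self, name, value):
        super().__setitem__(self._fold(name), value)

    def __getitem__(self, name):
        return super().__getitem__(self._fold(name))

    def __contains__(self, name):
        return super().__contains__(self._fold(name))


subs = SynonymTable()
subs["foo-bar"] = lambda x: x       # defined with a hyphen
result = subs["foo_bar"](42)        # called with an underscore
```

Only one spelling is ever stored, so the two camps share a single entry instead of having to agree on a style.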

I did not invent this solution. "For convenience," as the
modprobe(1) man page puts it, "there is no difference
between _ and - in module names." As another example,
GObject, the GTK+ object system, uses it to allow

g_object_set (foo, "bar-baz") and

g_object_set (foo, "bar_baz")

as synonymous, which is particularly convenient since there
are so many different language bindings for GObject.
(I should --- since it is advantageous to my case --- point
out that GObject uses hyphens internally.) There are probably
other examples (if you can think of any, please tell).

Anyway, if we can agree --- without considering the syntactical
implications --- that it is desirable to make these characters
synonymous, we have to look at the implications next.

The first thing that comes to mind is that some people write
binary minus without surrounding whitespace. Were this
proposal to be accepted, those people would do best to
change habits, lest they be bitten when they try to subtract
the return value of a subroutine call from something.

What about unary minus? I propose the following:

-foo-bar === -(foo-bar) === -(foo_bar)
_foo-bar === _foo-bar === _foo_bar

That is, hyphen and underscore are synonymous in identifiers,
but an initial hyphen is not taken to be part of the identifier.
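
A minimal tokenizer sketch of that rule, in Python. The regular expressions are assumptions, not the Perl 6 grammar: a hyphen counts as part of an identifier only when it sits between two name segments, so a leading hyphen (and, in this sketch, a trailing one) is left as an operator.

```python
import re

# An identifier starts with a letter or underscore; hyphens may appear only
# between two such segments, never at the start or the end.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*(?:-[A-Za-z_][A-Za-z0-9_]*)*"
TOKEN = re.compile(rf"\s*({IDENT}|\S)")

def tokenize(src):
    """Split src into tokens, folding hyphens inside identifiers to underscores."""
    tokens = []
    for match in TOKEN.finditer(src):
        tok = match.group(1)
        if re.fullmatch(IDENT, tok):
            tok = tok.replace("-", "_")
        tokens.append(tok)
    return tokens

print(tokenize("-foo-bar"))   # ['-', 'foo_bar']
print(tokenize("_foo-bar"))   # ['_foo_bar']
```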

I'm not sure about postfix unary minus, however. You could
argue the case both ways (so please do that).

My gut feeling is that any postfix unary minus is doomed to
be confusing either way, so it might not matter.

Oh, by the way, I've been using hyphens in my Ruby
identifiers for some time now, and have not run into any
serious problems. The problems I have run into are related
to either the modified Ruby mode for Emacs, or to the kludgy
implementation --- I wrote a preprocessor that parses code
looking for identifiers. You can find that here:

<http://www.brockman.se/software/hyphen-ruby/>

Finally, I realize that this is a religious issue. I hope
that I have not stepped on anybody's toes, and I humbly ask
that you approach this discussion with an open mind.

Kind regards,

--
Daniel Brockman <dan...@brockman.se>

Sebastian

Nov 16, 2005, 11:08:57 PM

to Daniel Brockman, perl6-l...@perl.org

I like hyphens. They're easier to type and help
prevent_me_from_Doing_This and generating errors because of case
sensitivity.

On the other hand, consistency of appearance may be a problem for some
people. I often associate code with the way it looks on screen, not
necessarily with what it does or says. Looking for
some_code_like_this() in a place that uses some-code-like-this() might
be troublesome.

- sebastian

Daniel Brockman

Nov 16, 2005, 11:31:27 PM

to perl6-l...@perl.org

Sebastian,

> I like hyphens. They're easier to type and help
> prevent_me_from_Doing_This and generating errors because
> of case sensitivity.
>
> On the other hand, consistency of appearance may be a
> problem for some people. I often associate code with the
> way it looks on screen, not necessarily with what it does
> or says. Looking for some_code_like_this() in a place
> that uses some-code-like-this() might be troublesome.

I think that is a valid concern, but I don't think it is
very troublesome. I don't think it takes long for your eyes
to adapt when switching between hyphens and underscores.

I would certainly agree, however, that mixing two styles in
a single file or, to a lesser extent, a single source tree,
would be troublesome. Not so much for the human readability,
but for the automatic searchability.

If you are standing on this piece of code

sub foo-bar ($a, $b) { say "whatever, I don't care" }

and perform a search for `foo-bar', you probably are not
going to expect this code a few hundred lines down:

foo_bar("this example doesn't have a theme")

This is a very valid concern, but the problem will not arise
unless people start mixing these two styles --- something
which is very obviously not a good idea.
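
For what it is worth, a search tool can paper over mixed styles. The following Python sketch builds a pattern that accepts either joiner; the helper name `either_style` and the sample source text are invented for the example.

```python
import re

def either_style(name):
    """Build a regex matching name with '-' and '_' interchangeable."""
    parts = re.split(r"[-_]", name)
    body = "[-_]".join(re.escape(p) for p in parts)
    # Lookarounds keep the match from starting or ending inside a longer identifier.
    return re.compile(rf"(?<![\w-]){body}(?![\w-])")

source = '''sub foo-bar ($a, $b) { say "whatever" }
foo_bar("this example doesn't have a theme")
foo_bar_baz()
'''
hits = [m.group(0) for m in either_style("foo-bar").finditer(source)]
print(hits)   # ['foo-bar', 'foo_bar']
```

This helps only when the search tool is under your control; a plain text search for `foo-bar' still misses the underscore spelling.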

Besides, another couple of hundred lines down, you might
(but you probably won't) find the following code:

eval ("foo", "bar").join("_")

In the end, this is a "suit yourself" kind of problem.

--
Daniel Brockman <dan...@brockman.se>

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Chromatic

Nov 16, 2005, 11:49:29 PM

to Daniel Brockman, perl6-l...@perl.org

On Thu, 2005-11-17 at 05:31 +0100, Daniel Brockman wrote:

> This is a very valid concern, but the problem will not arise
> unless people start mixing these two styles --- something
> which is very obviously not a good idea.

That doesn't mean that people will avoid it, by accident or on purpose.
It's a serious concern worth more consideration than "just don't do it!"

-- c

Brent 'Dax' Royal-Gordon

Nov 17, 2005, 12:12:06 AM

to Daniel Brockman, perl6-l...@perl.org

Daniel Brockman <dan...@brockman.se> wrote:
> So what is my suggestion? Obviously disallowing underscores
> and instead allowing hyphens would just replace one problem
> with an even worse problem (not only would there still be
> people who don't like hyphens, but it would alienate a large
> portion of the user base). Therefore, to make the language
> more appealing (i.e., less alienating) to the group of
> people I am going to refer to as "Lispers", it's obvious
> that both characters should be allowed.

I see a few syntactic problems with this idea: the subtraction and
negation operators you already mentioned, but also the fact that
dashes are already used in package names to indicate version and
author (`class Foo::Bar-2.10-cpan:BRENTDAX;`). I suspect that both of
these problems will be more troublesome than you might guess.

But there's a philosophical problem too. This proposal is an instance
of what I'm going to call the "dimmer switch problem".

In American cars at least, virtually every control on a car's
dashboard has an immediate purpose and utility. There are steering
controls (i.e. the wheel), cruise control settings, controls for
exterior lights, controls for the radio and air conditioner, and so
on; all of these need to be within easy reach of the driver, either
because he needs them to safely operate the car or because he's likely
to want to twiddle them while he's driving.

And then there's the dimmer switch, used to control the brightness of
the dashboard's lighting. This is not something the driver often
changes, and it's not crucial to the car's operation. A driver will
adjust it once if he bothers adjusting it at all. It's there solely
because different people have different preferences for the
brightness, and there's nowhere else to put it.

Perl has a lot of different ways of doing things. But if you examine
the design, you'll realize that they aren't mere cosmetic
differences--each form lends itself to different tasks. For example,
`for` and `map` are functionally equivalent, but implementing a
Schwartzian transform is much easier with `map`, and a large loop body
is much easier to visually parse with `for`.

A lot of the suggestions I see for Perl 6 are dimmer switches; they
add an option or two to better suit someone's tastes but don't add any
power to the language. This is hardly the first case; the suggestion
a long time ago to use backtick as a subscript operator comes to mind,
but there have been many others.

Car designers, of course, are stuck with the dimmer switch: they do
need to provide some way to provide this feature to their customers,
and there are only so many ways to do it with a physical piece of
plastic and vinyl. Language designers are luckier, though, and Perl 6
is better than most.

This feature can be added as a grammar-modifying pragma. If you want
the hyphen, simply type something like `use hyphens;` and you can use
hyphenated identifiers in the surrounding scope. And unlike Ruby,
this will be easy to do unambiguously: just override the Perl 6
grammar's identifier rule. All the edge cases will be resolved by the
longest token principle, so `foo-bar-baz` will be an identifier.

--
Brent 'Dax' Royal-Gordon <br...@brentdax.com>
Perl and Parrot hacker

Daniel Brockman

Nov 17, 2005, 1:27:56 AM

to perl6-l...@perl.org

Thank you for your considerate reply, Brent.

> I see a few syntactic problems with this idea: the subtraction and
> negation operators you already mentioned,

Did I miss any problems related to those?

> but also the fact that dashes are already used in package names to
> indicate version and author (`class Foo::Bar-2.10-cpan:BRENTDAX;`).

Hmm, I did not consider that.

> I suspect that both of these problems will be more troublesome than
> you might guess.
>
> But there's a philosophical problem too. This proposal is an instance
> of what I'm going to call the "dimmer switch problem".

[...]

> Perl has a lot of different ways of doing things. But if you examine
> the design, you'll realize that they aren't mere cosmetic
> differences--each form lends itself to different tasks.

Yet you have the choice of where to put your braces, even
though the braces don't lend themselves to different tasks
depending on whether you put them on a new line or not.

No sane person would put their braces in different places in
different parts of their code, so why don't we just say,
"from now on, you must use brace style X"?

But I see your point. If Perl started adding tons of
syntactic dimmer switches, that would certainly be a wrong
turn for TMTOWTDI. (Luckily, Perl 6 has so many hidden
switches that you could probably play with them forever and
never get bored.)

> A lot of the suggestions I see for Perl 6 are dimmer switches; they
> add an option or two to better suit someone's tastes but don't add any
> power to the language.

I might be far too humble to try to think of anything that
could possibly add more power to such an enormously
powerful beast of a language. If not, then at least I know
far too little about the language.

Is Perl 6 really in such a desperate need of new and more
powerful features that issues of convenience are irrelevant?

> This is hardly the first case; the suggestion a long time
> ago to use backtick as a subscript operator comes to mind,
> but there have been many others.

No offense to whoever made that suggestion, but I think
there are far more people out there with a developed taste
for hyphenated identifiers than there are people with a
thing for using backticks as subscript operators.

Do you see the difference? I'm trying to cater to an
actually existing and in many cases strong preference.

> Car designers, of course, are stuck with the dimmer switch: they do
> need to provide some way to provide this feature to their customers

Do they, really? Can't they just settle on a standard
dimmer setting that works well enough for everyone?

> This feature can be added as a grammar-modifying pragma. If you want
> the hyphen, simply type something like `use hyphens;` and you can use
> hyphenated identifiers in the surrounding scope. And unlike Ruby,
> this will be easy to do unambiguously: just override the Perl 6
> grammar's identifier rule. All the edge cases will be resolved by the
> longest token principle, so `foo-bar-baz` will be an identifier.

Yes, it's very comforting to know that even if Perl 6 won't
have this feature built in, it will be so amazingly easy to
implement in a beautifully clean way.

But what about class Foo::Bar-2.10-cpan:BRENTDAX?

--
Daniel Brockman <dan...@brockman.se>

Chromatic

Nov 17, 2005, 1:40:58 AM

to Daniel Brockman, perl6-l...@perl.org

On Thu, 2005-11-17 at 07:27 +0100, Daniel Brockman wrote:

> Yet you have the choice of where to put your braces, even
> though the braces don't lend themselves to different tasks
> depending on whether you put them on a new line or not.

You *don't* have the choice to use different types of braces, though --
at least not by default.

> Is Perl 6 really in such a desperate need of new and more
> powerful features that issues of convenience are irrelevant?

I see the proposal to treat - and _ as identical in identifiers as a
feature almost as useful as making identifiers case-insensitive.
Heteronymity seems too dangerous to encourage by supporting as a
default.

-- c

Daniel Brockman

Nov 17, 2005, 2:52:30 AM

to perl6-l...@perl.org

chromatic <chro...@wgz.org> writes:

>> Yet you have the choice of where to put your braces, even
>> though the braces don't lend themselves to different tasks
>> depending on whether you put them on a new line or not.
>
> You *don't* have the choice to use different types of
> braces, though -- at least not by default.

Right, but no one is asking for that. You also don't have
the choice of writing your code backwards, but no one is
asking for that either. The choice of using hyphens instead
of underscores is neither universally undesired nor absurd.

>> Is Perl 6 really in such a desperate need of new and more
>> powerful features that issues of convenience are irrelevant?
>
> I see the proposal to treat - and _ as identical in identifiers as a
> feature almost as useful as making identifiers case-insensitive.

It might not be as useful --- after all, it just lets you
raise those low-hanging bars in your names a few pixels ---
but I think it is less problematic. I do think that case
insensitivity is a desirable characteristic, but I am not
sure how feasible it would be in the case of Perl 6.

For example, it's good that BUILD and OUTER and all the
other uppercased special things are distinctly named.
If they were to be distinct under case insensitivity,
they would need some sigil or something.

> Heteronymity seems too dangerous to encourage by
> supporting as a default.

You may be right about this. I would be happy if the
standard distribution came with a package that enabled the
hyphenated identifiers syntax in the lexical block:

use hyphenated_identifiers;

Hopefully the name of that package won't actually have
any underscores in it.

--
Daniel Brockman <dan...@brockman.se>

Uri Guttman

Nov 17, 2005, 4:23:42 AM

to Daniel Brockman, perl6-l...@perl.org

>>>>> "DB" == Daniel Brockman <dan...@brockman.se> writes:

DB> You may be right about this. I would be happy if the
DB> standard distribution came with a package that enabled the
DB> hyphenated identifiers syntax in the lexical block:

DB> use hyphenated_identifiers;

DB> Hopefully the name of that package won't actually have
DB> any underscores in it.

this idea would need to be worked out in much greater detail. there are
many different identifiers in perl. would all of them be subject to this
change? how would a global work if some other module referred to it using
underscores but your module used hyphens? would your pragma just do a
compile time translation of - to _ when inside identifiers? what about
in eval or symrefs? would the translation be done at runtime then? how
could that be handled in a lexical way if the foo-name is passed to another
module which hadn't used the pragma? or would all symbol lookups just
tr/-/_/ beforehand? but that can't be easily controlled in a lexical
scope.

and i know how you feel about wanting - vs. _ but i have it the other
way around. i much prefer _ since it makes the words more readable as _
sorta disappears in the baseline. but then i hate lisp too so that
influences me a trifle. :)

but the sickest thing i have done is to remap _ to - and back inside
emacs. this was so typing -> is done with both keys shifted and i typed
that too often. also that made writing foo_bar easier. so my brain has
to swap them when inside emacs vs everywhere else. makes for some odd
homicidal twitching sometimes (especially when anyone else dares to type
into my emacs :).

anyhow, my main point is that IMO this has too many problems with both
syntax and unknown semantics that are sure to make some large fraction
of us very mad. perl has its style and that is _ for word
separation. the evil studly caps is used for module names only (where it
does seem to work better than _ would. or maybe we are just so used to
it by now). trying to change that in a scoped way will only cause pain
somewhere else.

uri

--
Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org

Daniel Brockman

Nov 17, 2005, 4:56:27 AM

to perl6-l...@perl.org

Uri Guttman <u...@stemsystems.com> writes:

> this idea would need to be worked out in much greater detail. there are
> many different identifiers in perl. would all of them be subject to this
> change? how would a global work if some other module referred to it using
> underscores but your module used hyphens? would your pragma just do a
> compile time translation of - to _ when inside identifiers?

Yes, that's what it would do.

(Actually, my Ruby preprocessor removes hyphens it finds in
identifiers that start with an uppercase character and
contain at least one lowercase character, to allow Foo-Bar
as a synonym for FooBar.)
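
The translation just described can be sketched as follows. This is Python, not the actual Ruby preprocessor, and the function name `normalize` is hypothetical; the treatment of all-uppercase names follows from the description above rather than from the preprocessor's source.

```python
def normalize(ident):
    """Map a hyphenated identifier to its plain spelling."""
    if ident[:1].isupper() and any(c.islower() for c in ident):
        # Class-like name: Foo-Bar becomes FooBar.
        return ident.replace("-", "")
    # Anything else: foo-bar becomes foo_bar, FOO-BAR becomes FOO_BAR.
    return ident.replace("-", "_")

print(normalize("foo-bar"))   # foo_bar
print(normalize("Foo-Bar"))   # FooBar
```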

> what about in eval or symrefs? would the translation be done at runtime
> then? how could that be handled in a lexical way if the foo-name is
> passed to another module which hadn't used the pragma? or would all
> symbol lookups just tr/-/_/ beforehand? but that can't be easily
> controlled in a lexical scope.

That problem is not specific to this feature. For any package
that changes the syntax, you can ask "what about eval?"

So... what *about* eval? :-)

> and i know how you feel about wanting - vs. _ but i have it the other
> way around. i much prefer _ since it makes the words more readable as _
> sorta disappears in the baseline. but then i hate lisp too so that
> influences me a trifle. :)

I guess it depends on whether you see it as a separator or a joiner.
You could say that hyphens join words, but underscores separate them.

> but the sickest thing i have done is to remap _ to - and back inside
> emacs. this was so typing -> is done with both keys shifted and i typed
> that too often. also that made writing foo_bar easier.

Hmm, that's an interesting hack. :-)

> so my brain has to swap them when inside emacs vs everywhere else.
> makes for some odd homicidal twitching sometimes (especially when anyone
> else dares to type into my emacs :).

Ha! Like you're ever going to go anywhere outside of Emacs...

> anyhow, my main point is that IMO this has too many problems with both
> syntax and unknown semantics that are sure to make some large fraction
> of us very mad. perl has its style and that is _ for word separation.

I agree, of course, that if this can't be done well, then it
shouldn't be done at all. Well, at least then it shouldn't
be put in the standard distribution. :-)

> the evil studly caps is used for module names only (where it does seem
> to work better than _ would. or maybe we are just so used to it by now).

I've come to prefer Foo-Bar-Baz for class and module names.
Having used that style for a while, I don't see any good
reason to write them smashed together.

> trying to change that in a scoped way will only cause pain
> somewhere else.

If so, then that is a symptom of a wider problem. I mean,
wasn't Perl 6 supposed to make this kind of hack a breeze?

--
Daniel Brockman <dan...@brockman.se>

Daniel Hulme

Nov 17, 2005, 4:07:29 AM

to perl6-l...@perl.org

> No sane person would put their braces in different places in
> different parts of their code, so why don't we just say,
> "from now on, you must use brace style X"?

Have you never seen code that's been worked on by several people with
differing tastes in brace positioning and no coding standard? Have you
never copied a chunk of code from one place to another 'temporarily'
without adjusting the indentation (though, admittedly, most editors
will do it for you)?

People *do* mix brace and indentation styles in the same file, and it is
ugly when it happens, but it doesn't change the meaning of the program.

--
Stop the infinite loop, I want to get off! http://surreal.istic.org/
Paraphernalia/Never hides your broken bones,/ And I don't know why you'd
want to try:/ It's plain to see you're on your own. -- Paul Simon
The documentation that can be written is not the true documentation.

Jan Dubois

Nov 17, 2005, 2:51:19 AM

to Daniel Brockman, perl6-l...@perl.org

On Wed, 16 Nov 2005, Daniel Brockman wrote:
> No offense to whoever made that suggestion, but I think there are far
> more people out there with a developed taste for hyphenated
> identifiers than there are people with a thing for using backticks as
> subscript operators.
>
> Do you see the difference? I'm trying to cater to an actually existing
> and in many cases strong preference.

No offense either, but if you are suggesting that

@a[$i-1] + @a[$i+1]

should be interpreted as

@a[$i_1] + @a[$i+1]

then I think it is pretty obvious why this is a really bad idea.
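
The ambiguity can be made concrete with two candidate identifier rules, sketched here as Python regular expressions. Both rules are assumptions for illustration; neither comes from the Perl 6 grammar, and the second is a restriction nobody in this thread has proposed.

```python
import re

# Rule A: a hyphen may be followed by any identifier character, digits included.
GREEDY = re.compile(r"\$[A-Za-z_]\w*(?:-\w+)*")
# Rule B: a hyphen continues the name only when a letter or underscore follows.
STRICT = re.compile(r"\$[A-Za-z_]\w*(?:-[A-Za-z_]\w*)*")

def first_variable(rule, src):
    """Return the first variable name that rule finds in src."""
    return rule.search(src).group(0)

expr = "@a[$i-1] + @a[$i+1]"
greedy_name = first_variable(GREEDY, expr)   # "$i-1": the subtraction is swallowed
strict_name = first_variable(STRICT, expr)   # "$i": the subtraction survives
```

Under rule B, `$i-1` stays a subtraction, but `$i-j` written without spaces would still be read as a single name.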

Cheers,
-Jan

Daniel Brockman

Nov 17, 2005, 3:11:37 AM

to Jan Dubois, perl6-l...@perl.org

Jan,

> No offense either, but if you are suggesting that
>
> @a[$i-1] + @a[$i+1]
>
> should be interpreted as
>
> @a[$i_1] + @a[$i+1]
>
> then I think it is pretty obvious why this is a really bad idea.

That's a very good example. I think I'm going to have to
change my mind and agree that it should not be the default.
(Some will say I should have thought this through before
making the suggestion, but the thing is that I *did* think
it through. Just goes to show how poorly I think.)

Further, the package that makes hyphens identical to
underscores should probably warn about $i+1 and $i*1.

I would still use that package, and your example doesn't
bother me personally. I would never write $i-1 or $i+1.
But I wouldn't want to be the one to have to reply to all
the complaints about the unintuitive meaning of

@a[$i-1] + @a[$i+1].

--
Daniel Brockman <dan...@brockman.se>

Larry Wall

Nov 17, 2005, 11:32:41 AM

to perl6-l...@perl.org

On Thu, Nov 17, 2005 at 10:56:27AM +0100, Daniel Brockman wrote:
: That problem is not specific to this feature. For any package
: that changes the syntax, you can ask "what about eval?"
:
: So... what *about* eval? :-)

Always parses with the parser in effect at that point, the same one you'd
get if you asked for $?PARSER.

: > trying to change that in a scoped way will only cause pain
: > somewhere else.
:
: If so, then that is a symptom of a wider problem. I mean,
: wasn't Perl 6 supposed to make this kind of hack a breeze?

Sure, but you still have to deal with the consequences of your choices.
You can still be tried for murder even if you only ever kill people
on Tuesday. Perl 6 is just trying to make it easier for you to kill
people on Tuesday without accidentally killing people on Saturday.

Larry

Robin Redeker

Nov 20, 2005, 9:48:07 AM

to perl6-l...@perl.org

On Thu, Nov 17, 2005 at 04:05:30AM +0100, Daniel Brockman wrote:
> That is, hyphen and underscore are synonymous in identifiers,
> but an initial hyphen is not taken to be part of the identifier.
>

Why not make this feature generic and define equivalence classes for
equivalent characters in an identifier?

This would introduce a very interesting mathematical feature to Perl6.

Something else comes to mind here, namely wildcard identifiers:

foo*bar (12, "foo");

(maybe with a different syntax, i'm not sure yet).
From the matching functions/subroutines/whatever the one with a
matching signature could be called.

A warning or some exception should be thrown if multiple signatures
match.
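
A rough Python sketch of that dispatch, assuming shell-style wildcards and a plain dictionary standing in for the symbol table. The helper `call_wildcard` and the sample functions are invented for the example, and this version also raises when nothing matches.

```python
import fnmatch
import inspect

def call_wildcard(pattern, namespace, *args):
    """Call the single function in namespace whose name matches pattern
    and whose signature accepts args; raise if none or several qualify."""
    matches = []
    for name, func in namespace.items():
        if not fnmatch.fnmatchcase(name, pattern):
            continue
        try:
            inspect.signature(func).bind(*args)
        except TypeError:
            continue  # name matches, but the arguments do not fit
        matches.append(name)
    if len(matches) != 1:
        raise LookupError(f"{pattern!r} matched {sorted(matches)}")
    return namespace[matches[0]](*args)

def foo_bar(n, s):
    return f"{s}:{n}"

def foo_baz_bar(n):
    return n

table = {"foo_bar": foo_bar, "foo_baz_bar": foo_baz_bar, "other": lambda: None}
result = call_wildcard("foo*bar", table, 12, "foo")   # only foo_bar takes two arguments
```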

Maybe these features could also be implemented via a syntax hook in the
parser:

use wildcard_identifiers;

or
use equivalence_identifiers;

There are no other languages i know where this is possible,
but i would find it quite useful, and i know others that
would like a feature like this.

Wildcard identifiers might also go hand in hand with module versioning:
selecting _any_ version of a module. But i haven't paid much
attention to the syntax/semantics of that.

greetings,
Robin Redeker

--
el...@x-paste.de / ro...@nethype.de / r.re...@gmail.com
Robin Redeker