Problems of Symbol Congestion in Computer Languages


Xah Lee

Feb 16, 2011, 5:07:56 PM
might be interesting.

〈Problems of Symbol Congestion in Computer Languages (ASCII Jam;
Unicode; Fortress)〉
http://xahlee.org/comp/comp_lang_unicode.html

--------------------------------------------------
Problems of Symbol Congestion in Computer Languages (ASCII Jam;
Unicode; Fortress)

Xah Lee, 2011-02-05, 2011-02-15

The vast majority of computer languages use ASCII as their character
set. This means they jam a multitude of operators into about 20 symbols.
Often, a symbol has multiple meanings depending on context. Also, a
sequence of chars is used as a single symbol as a workaround for the
lack of symbols. Even languages that use Unicode as their char set (e.g.
Java, XML) often still use the ~20 ASCII symbols for all their
operators. The only exceptions i know of are Mathematica, Fortress,
APL. This page gives some examples of problems created by symbol
congestion.

-------------------------------
Symbol Congestion Workarounds

--------------------
Multiple Meanings of a Symbol

Here are some common examples of a symbol that has multiple meanings
depending on context:

In Java, [ ] is used when creating an array (e.g. new int[3]), also as
a delimiter for getting an element of an array, also as part of the
syntax for declaring an array type.

In Java and many other langs, ( ) is used for expression grouping,
also as delimiter for arguments of a function call, also as delimiters
for parameters of a function's declaration.

In Perl and many other langs, : is used as a separator in a ternary
expression e.g. (test ? "yes" : "no"), also doubled as the namespace
separator :: (e.g. use Data::Dumper;).

In a URL, / is used as the path separator, but also as part of the
marker that follows the protocol (://). e.g.
http://example.org/comp/unicode.html

In Python and many others, < is used as the “less than” comparison
operator, but also as an alignment flag in its “format” method, also as
a delimiter of named groups in regex, and also as one of the chars in
other operators that are made of 2 or more chars, e.g.: << <= <<= <>.
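
To make the Python case concrete, here is a small sketch (Python 3; the
sample values and variable names are made up for illustration, they are
not from the article) in which the same < character plays four
different roles:

```python
import re

# 1. "<" as the less-than comparison operator
is_less = 3 < 5

# 2. "<" as the left-alignment flag in a format spec
padded = "{:<6}".format("ab")

# 3. "<" as part of the named-group delimiter in a regex
match = re.match(r"(?P<year>\d{4})", "2011-02-16")
year = match.group("year")

# 4. "<" as one char of a multi-char operator
shifted = 1 << 4
at_most = 3 <= 3

print(is_less, repr(padded), year, shifted, at_most)
```

Which meaning applies depends entirely on where the character sits:
inside an expression, inside a format string, or inside a regex.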

--------------------
Examples of Multi-Char Operators

Here are some common examples of operators that are made of multiple
characters: || && == <= != ** += *= := ++ -- :: // /* (* …
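
As an illustration of how a language treats such a char sequence as one
symbol, the sketch below asks Python's own tokenizer (the standard
tokenize module) which operator tokens it sees. The helper name
operator_tokens and the sample line are assumptions made up for this
example.

```python
import io
import tokenize


def operator_tokens(source):
    """Return the operator tokens the Python tokenizer finds in source."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.OP
    ]


# Each multi-char operator comes back as a single token, not as
# separate "<", "<", "=" characters.
ops = operator_tokens("x <<= 2 ** 3 != y // 4\n")
print(ops)
```

The tokenizer takes the longest match, which is why "<<=" is never read
as "<" followed by "<=".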

-------------------------------
Fortress & Unicode

The language designer Guy Steele recently gave a very interesting
talk. See: Guy Steele on Parallel Programing. In it, he showed code
snippets of his language Fortress, which freely uses Unicode as
operators.

For example, list delimiters are not the typical curly bracket {1,2,3}
or square bracket [1,2,3], but the unicode angle bracket ⟨1,2,3⟩.
(See: Matching Brackets in Unicode.) It also uses the circle plus ⊕ as
operator. (See: Math Symbols in Unicode.)

-------------------------------
Problems of Symbol Congestion

I really appreciate such use of unicode. The tradition of sticking to
the 95 printable chars of 1960s ASCII is extremely limiting. It creates
complex problems manifested in:

* String Escape mechanism (C's backslash \n, \/, …, widely
adopted.)
* Complex delimiters for strings. (Python's triple quotes, perl's
variable delimiters q() q[] q{} m//, and heredoc. See: Strings
in Perl and Python ◇ Heredoc mechanism in PHP and Perl.)
* Crazy leaning toothpicks syndrome, especially bad in emacs
regex.
* Complexities in character representation (See: Emacs's Key
Notations Explained (/r, ^M, C-m, RET, <return>, M-, meta) ◇ HTML
entities problems. See: HTML Entities, Ampersand, Unicode, Semantics.)
* URL Percent Encoding problems and complexities: Javascript
Encode URL, Escape String

All these problems occur because we are jamming so many meanings into
about 20 symbols in ASCII.
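
A few of the items above can be reproduced in a handful of lines. This
is only a sketch in Python 3 using the standard library (re,
urllib.parse.quote, html.escape); the sample strings are made up for
illustration.

```python
import html
import re
from urllib.parse import quote

# A Windows-style path containing one literal backslash: C:\temp
path = "C:\\temp"

# Leaning toothpicks: the regex needs two backslashes to match one,
# and a normal string literal needs each of those escaped again.
toothpicks = "\\\\"
raw_form = r"\\"          # the same pattern written as a raw string
same_pattern = toothpicks == raw_form
found = re.search(toothpicks, path) is not None

# URL percent-encoding: space, "&" and "/" cannot appear literally here
encoded = quote("a b&c/d", safe="")

# HTML entities: "<" and "&" are markup, so plain text must escape them
escaped = html.escape("a < b & c")

print(same_pattern, found, encoded, escaped)
```

In each case a char that already has a structural meaning has to be
escaped before it can stand for itself.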

See also:

* Computer Language Design: Strings Syntax
* HTML6: Your JSON and SXML Simplified

Most of today's languages do not support unicode in function or
variable names, so you can forget about using unicode in variable
names (e.g. α=3) or function names (e.g. “lambda” as “λ” or “function”
as “ƒ”), or defining your own operators (e.g. “⊕”).

However, there are a few languages i know that do support unicode in
function or variable names. Some of these allow you to define your own
operators. However, they may not allow unicode for the operator
symbol. See: Unicode Support in Ruby, Perl, Python, javascript, Java,
Emacs Lisp, Mathematica.
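
As one data point, and speaking only about Python 3 (this is not taken
from the linked page), letters from other scripts are accepted in names
while math symbols are not. A minimal sketch:

```python
# Letters from other scripts are valid identifiers in Python 3.
α = 3
λ = lambda x: x + 1   # the keyword "lambda" itself cannot be renamed
result = λ(α)

# Math symbols such as the circled plus are not letters, so they are
# rejected as names, and Python offers no way to define new operator
# symbols.
alpha_ok = "α".isidentifier()
oplus_ok = "⊕".isidentifier()

print(result, alpha_ok, oplus_ok)
```

So this is a language of the kind described above: unicode names work,
but a user-defined ⊕ operator does not.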

Xah


Cthun

Feb 17, 2011, 9:40:00 PM
On 17/02/2011 9:11 PM, rantingrick wrote:
. On Feb 16, 4:07 pm, Xah Lee<xah...@gmail.com> wrote:
.> Vast majority of computer languages use ASCII as its character set.
.> This means, it jams multitude of operators into about 20 symbols.
.> Often, a symbol has multiple meanings depending on context.
.
. I think in theory the idea of using Unicode chars is good, however in
. reality the implementation would be a nightmare! A wise man once
. said: "The road to hell is paved in good intentions". ;-)
.
. If we consider all the boundaries that exist between current
. (programming) languages (syntax, IDE's, paradigms, etc) then we will
. realize that adding *more* symbols does not help, no, it actually
. hinders! And Since Unicode is just a hodgepodge encoding of many
. regional (natural) languages --of which we have too many already in
. this world!

What does your aversion to cultural diversity have to do with Lisp,
rantingrick? Gee, I do hope you're not a racist, rantingrick.

. -- proliferating Unicode symbols in source code only serves
. to further complicate our lives with even *more* multiplicity!
.
. Those of us on the *inside* know that Unicode is nothing more than a
. poor attempt to monkey patch multiplicity. And that statement barely
. scratches the surface of an underlying disease that plagues all of
. human civilization. The root cause is selfishness, which *then*
. propagates up and manifests itself as multiplicity in our everyday
. lives. It starts as the simple selfish notion of "me" against "other"
. and then extrapolates exponentially into the collective of "we"
. against "others".
.
. This type of grouping --or selfish typecasting if you will-- is
. impeding the further evolution of homo sapiens. Actually we are
. moving at a snail's pace when we could be moving at the speed of light!
. We *should* be evolving as a genetic algorithm but instead we are the
. ignorant slaves of our own collective selfishness reduced to naive and
. completely random implementations of bozosort!

What does that have to do with Lisp, rantingrick?

. Now don't misunderstand all of this as meaning "multiplicity is bad",
. because i am not suggesting any such thing! On the contrary,
. multiplicity is VERY important in emerging problem domains. Before
. such a domain is understood by the collective unconscious we need
. options (multiplicity!) from which to choose. However, once a
. "collective understanding" is reached we must rein in the
. multiplicity or it will become yet another millstone around our
. evolutionary necks, slowing our evolution.

Classic illogic. Evolution depends upon diversity as grist for the mill
of selection, rantingrick. A genetically homogeneous population cannot
undergo allele frequency shifts, rantingrock.

. But multiplicity is just the very beginning of a downward spiral of
. devolution. Once you allow multiplicity to become the sport of
. Entropy, it may be too late for recovery! Entropy leads to shock
. (logical disorder) which then leads to stagnation (no logical order at
. all!). At this point we lose all forward momentum in our evolution.
. And why? Because of nothing more than self gratifying SELFISHNESS.
.
. Anyone with half a brain understands the metric system is far superior
. (on many levels) than any of the other units of measurement. However
. again we have failed to rein in the multiplicity and so entropy has
. run amok, and we are like a deer "caught-in-the-headlights" of the
. shock of our self induced devolution and simultaneously entirely
. incapable of seeing the speeding mass that is about to plow over us
. with a tremendous kinetic energy -- evolutionary stagnation!
.
. Sadly this disease of selfishness infects many aspects of the human
. species to the very detriment of our collective evolution. Maybe one
. day we will see the light of logic and choose to unite in a collective
. evolution. Even after thousands of years we are but infants on the
. evolutionary scale because we continue to feed the primal urges of
. selfishness.

What does any of that have to do with Lisp, rantingrick?

And you omitted the #1 most serious objection to Xah's proposal,
rantingrick, which is that to implement it would require unrealistic
things such as replacing every 101-key keyboard with 10001-key keyboards
and training everyone to use them. Xah would have us all replace our
workstations with machines that resemble pipe organs, rantingrick, or
perhaps the cockpits of the three surviving Space Shuttles. No doubt
they'd be enormously expensive, as well as much more difficult to learn
to use, rantingrick.

Cor Gest

Feb 17, 2011, 9:55:47 PM
Some entity, AKA Cthun <cthu...@qmail.net.au>,
wrote this mindboggling stuff:
(selectively-snipped-or-not-p)


> And you omitted the #1 most serious objection to Xah's proposal,
> rantingrick, which is that to implement it would require unrealistic
> things such as replacing every 101-key keyboard with 10001-key
> keyboards and training everyone to use them. Xah would have us all
> replace our workstations with machines that resemble pipe organs,
> rantingrick, or perhaps the cockpits of the three surviving Space
> Shuttles. No doubt they'd be enormously expensive, as well as much
> more difficult to learn to use, rantingrick.

At least it should try to mimic a space-cadet keyboard, shouldn't it?

Cor

--
Answering in monosyllables is easier; I can even afford milk
Advanced political correctness is indistinguishable from sarcasm
First rule of engaging in a gunfight: HAVE A GUN
SPAM DELENDA EST http://www.spammesenseless.nl


Littlefield, Tyler

Feb 17, 2011, 10:42:12 PM
to pytho...@python.org
>My intention was to educate him on the pitfalls of multiplicity.
Oh, that's what you call that long-winded nonsense? Education? You must
live in America. Can I hazard a guess that your universal language might
be English? Has it never occurred to you that people take pride in
their language? It is part of their culture. And yet you rant on about
selfishness?
On 2/17/2011 8:29 PM, rantingrick wrote:

> On Feb 17, 8:40 pm, Cthun<cthun_...@qmail.net.au> wrote:
>
>> What does your aversion to cultural diversity have to do with Lisp,
>> rantingrick? Gee, I do hope you're not a racist, rantingrick.
> Why must language be constantly "connected-at-the-hip" to cultural
> diversity? People have this irrational fear that if we create a single
> universal language then *somehow* freedom has been violated.
>
> You *do* understand that language is just a means of communication,
> correct? And i would say a very inefficient means. However, until
> telekinesis becomes common-place the only way us humans have to
> communicate is through a fancy set of grunts and groans. Since that is
> the current state of our communication thus far, would it not be
> beneficial that at least we share a common world wide mapping of this
> noise making?
>
> <sarcasm> Hey, wait, i have an idea... maybe some of us should drive
> on the right side of the road and some on the left. This way we can be
> unique (psst: SELFISH) from one geographic location on the earth to
> another geographic location on the earth. Surely this multiplicity
> would not cause any problems? Because, heck, selfishness is so much
> more important than anyone's personal safety anyway</sarcasm>
>
> Do you see how this morphs into a foolish consistency?

>
>> . Now don't misunderstand all of this as meaning "multiplicity is bad",
>> . because i am not suggesting any such thing! On the contrary,
>> . multiplicity is VERY important in emerging problem domains. Before
>> . such a domain is understood by the collective unconscious we need
>> . options (multiplicity!) from which to choose. However, once a
>> . "collective understanding" is reached we must rein in the
>> . multiplicity or it will become yet another millstone around our
>> . evolutionary necks, slowing our evolution.
>>
>> Classic illogic. Evolution depends upon diversity as grist for the mill
>> of selection, rantingrick. A genetically homogeneous population cannot
>> undergo allele frequency shifts, rantingrock.
> Oh, maybe you missed this paragraph:

>
> . Now don't misunderstand all of this as meaning "multiplicity is
> bad",
> . because i am not suggesting any such thing! On the contrary,
> . multiplicity is VERY important in emerging problem domains. Before
> . such a domain is understood by the collective unconscious we need
> . options (multiplicity!) from which to choose. However, once a
> . "collective understanding" is reached we must rein in the
> . multiplicity or it will become yet another millstone around our
> . evolutionary necks, slowing our evolution.
>
> Or maybe this one:

>
> . I think in theory the idea of using Unicode chars is good, however
> in
> . reality the implementation would be a nightmare! A wise man once
> . said: "The road to hell is paved in good intentions". ;-)
>
> Or this one:

>
> . If we consider all the boundaries that exist between current
> . (programming) languages (syntax, IDE's, paradigms, etc) then we
> will
> . realize that adding *more* symbols does not help, no, it actually
> . hinders! And Since Unicode is just a hodgepodge encoding of many
> . regional (natural) languages --of which we have too many already in
> . this world!
>
>> What does any of that have to do with Lisp, rantingrick?
> The topic is *ahem*... "Problems of Symbol Congestion in Computer
> Languages"... of which i think is not only a lisp issue but an issue
> of any language. (see my comments about selfishness for insight)

>
>> And you omitted the #1 most serious objection to Xah's proposal,
>> rantingrick, which is that to implement it would require unrealistic
>> things such as replacing every 101-key keyboard with 10001-key keyboards
>> and training everyone to use them. Xah would have us all replace our
>> workstations with machines that resemble pipe organs, rantingrick, or
>> perhaps the cockpits of the three surviving Space Shuttles. No doubt
>> they'd be enormously expensive, as well as much more difficult to learn
>> to use, rantingrick.
> Yes, if you'll read my entire post then you'll clearly see that i
> disagree with Mr Lee on using Unicode chars in source code. My
> intention was to educate him on the pitfalls of multiplicity.
>
>


--

Thanks,
Ty

Cthun

Feb 17, 2011, 10:49:21 PM
On 17/02/2011 10:29 PM, rantingrick wrote:
> On Feb 17, 8:40 pm, Cthun<cthun_...@qmail.net.au> wrote:
>
>> What does your aversion to cultural diversity have to do with Lisp,
>> rantingrick? Gee, I do hope you're not a racist, rantingrick.
>
> Why must language be constantly "connected-at-the-hip" to cultural
> diversity?

Language is a part of culture, rantingrick.

> People have this irrational fear that if we create a single
> universal language then *somehow* freedom has been violated.

No, it is that if we stop using the others, or forcibly wipe them out,
that something irreplaceable will have been lost, rantingrick.

> You *do* understand that language is just a means of communication,
> correct?

Classic unsubstantiated and erroneous claim. A language is also a
cultural artifact, rantingrick. If we lose, say, the French language, we
lose one of several almost-interchangeable means of communication,
rantingrick. But we also lose something as unique and irreplaceable as
the original canvas of the Mona Lisa, rantingrick.

> And i would say a very inefficient means. However, until
> telekinesis becomes common-place the only way us humans have to
> communicate is through a fancy set of grunts and groans. Since that is
> the current state of our communication thus far, would it not be
> beneficial that at least we share a common world wide mapping of this
> noise making?

What does your question have to do with Lisp, rantingrick?

> <sarcasm> Hey, wait, i have an idea... maybe some of us should drive
> on the right side of the road and some on the left. This way we can be
> unique (psst: SELFISH) from one geographic location on the earth to
> another geographic location on the earth.

Classic illogic. Comparing, say, the loss of the French language to
standardizing on this is like comparing the loss of the Mona Lisa to
zeroing one single bit in a computer somewhere, rantingrick.

> Surely this multiplicity
> would not cause any problems? Because, heck, selfishness is so much
> more important than anyone's personal safety anyway</sarcasm>

Non sequitur.

> Do you see how this morphs into a foolish consistency?

What does your classic erroneous presupposition have to do with Lisp,
rantingrick?

>> Classic illogic. Evolution depends upon diversity as grist for the mill


>> of selection, rantingrick. A genetically homogeneous population cannot
>> undergo allele frequency shifts, rantingrock.
>

> Oh, maybe you missed this paragraph

What does your classic erroneous presupposition have to do with Lisp,
rantingrick?

> . Now don't misunderstand all of this as meaning "multiplicity is
> bad",
> . because i am not suggesting any such thing! On the contrary,
> . multiplicity is VERY important in emerging problem domains. Before
> . such a domain is understood by the collective unconscious we need
> . options (multiplicity!) from which to choose. However, once a
> . "collective understanding" is reached we must rein in the
> . multiplicity or it will become yet another millstone around our
> . evolutionary necks, slowing our evolution.

Classic erroneous presupposition that evolution is supposed to reach a
certain point and then stop and stagnate on a single universal standard,
rantingrick.

> Or maybe this one:


>
> . I think in theory the idea of using Unicode chars is good, however
> in
> . reality the implementation would be a nightmare! A wise man once
> . said: "The road to hell is paved in good intentions". ;-)

Classic unsubstantiated and erroneous claim. I read that one, rantingrick.

> Or this one:


>
> . If we consider all the boundaries that exist between current
> . (programming) languages (syntax, IDE's, paradigms, etc) then we
> will
> . realize that adding *more* symbols does not help, no, it actually
> . hinders! And Since Unicode is just a hodgepodge encoding of many
> . regional (natural) languages --of which we have too many already in
> . this world!

Classic unsubstantiated and erroneous claim. I read that one, too,
rantingrick.

>> What does any of that have to do with Lisp, rantingrick?
>

> The topic is *ahem*... "Problems of Symbol Congestion in Computer
> Languages"... of which i think is not only a lisp issue but an issue
> of any language.

Classic illogic. The topic of the *thread* is *computer* languages, yet
you attacked non-computer languages in the majority of your rant,
rantingrick. Furthermore, the topic of the *newsgroup* is the *Lisp
subset* of computer languages.

> (see my comments about selfishness for insight)

What does that have to do with Lisp, rantingrick?

>> And you omitted the #1 most serious objection to Xah's proposal,


>> rantingrick, which is that to implement it would require unrealistic
>> things such as replacing every 101-key keyboard with 10001-key keyboards
>> and training everyone to use them. Xah would have us all replace our
>> workstations with machines that resemble pipe organs, rantingrick, or
>> perhaps the cockpits of the three surviving Space Shuttles. No doubt
>> they'd be enormously expensive, as well as much more difficult to learn
>> to use, rantingrick.
>

> Yes, if you'll read my entire post then you'll clearly see that i
> disagree with Mr Lee on using Unicode chars in source code.

Classic erroneous presuppositions that I did not read your entire post
and that I thought you weren't disagreeing with Mr. Lee, rantingrick.

> My intention was to educate him on the pitfalls of multiplicity.

Classic illogic, since "multiplicity" (also known as "diversity") does
not in and of itself have pitfalls, rantingrick.

On the other hand, monoculture has numerous well-known pitfalls,
rantingrick.

alex23

Feb 17, 2011, 11:04:15 PM
rantingrick <rantingr...@gmail.com> wrote:

> Cthun <cthun_...@qmail.net.au> wrote:
>
> > What does your aversion to cultural diversity have to do with Lisp,
> > rantingrick? Gee, I do hope you're not a racist, rantingrick.
>
> Why must language be constantly "connected-at-the-hip" to cultural
> diversity? People have this irrational fear that if we create a single

> universal language then *somehow* freedom has been violated.

Because monocultures _die_ and no amount of fascist-like rick-ranting
about a One True Way will ever change that.


Chris Jones

Feb 17, 2011, 10:28:23 PM
to pytho...@python.org
On Thu, Feb 17, 2011 at 09:55:47PM EST, Cor Gest wrote:
> Some entity, AKA Cthun <cthu...@qmail.net.au>,

[..]

> > And you omitted the #1 most serious objection to Xah's proposal,
> > rantingrick, which is that to implement it would require unrealistic
> > things such as replacing every 101-key keyboard with 10001-key
> > keyboards and training everyone to use them. Xah would have us all
> > replace our workstations with machines that resemble pipe organs,
> > rantingrick, or perhaps the cockpits of the three surviving Space
> > Shuttles. No doubt they'd be enormously expensive, as well as much
> > more difficult to learn to use, rantingrick.

> At least it should try to mimick a space-cadet keyboard, shouldn't it?

Implementation details, and not very accurate at that.. the APL keyboard
has no additional keys and yet it has the potential to add up to 100
additional symbols to the US-ASCII keyboard, half of which are produced
via a single modifier.. same as upper-case letters. So unless more than
50+20 = 70 symbols are needed the keyboard conversion would cost about..
what.. $2.00 in stickers and maybe ten minutes to place them.

Maybe the problem lies elsewhere..?

cj


John Nagle

Feb 18, 2011, 12:43:37 AM
On 2/17/2011 6:55 PM, Cor Gest wrote:
> Some entity, AKA Cthun<cthu...@qmail.net.au>,
> wrote this mindboggling stuff:
> (selectively-snipped-or-not-p)
>
>
>> And you omitted the #1 most serious objection to Xah's proposal,
>> rantingrick, which is that to implement it would require unrealistic
>> things such as replacing every 101-key keyboard with 10001-key
>> keyboards and training everyone to use them. Xah would have us all
>> replace our workstations with machines that resemble pipe organs,
>> rantingrick, or perhaps the cockpits of the three surviving Space
>> Shuttles. No doubt they'd be enormously expensive, as well as much
>> more difficult to learn to use, rantingrick.
>
> At least it should try to mimick a space-cadet keyboard, shouldn't it?

I've used both the "MIT Space Cadet" keyboard on a Symbolics LISP
machine, and the Stanford SAIL keyboard. There's
something to be said for having more mathematical symbols.

Some programs use a bigger character set. MathCAD,
for example, has a broader range of mathematical symbols on
the input side than ASCII offers. They're not decorative;
MathCAD has different "=" symbols for assignment, algebraic
equivalence, identity, and comparison.

I've previously mentioned that Python suffers in a few places
from unwanted overloading. Using "+" for concatenation of
strings, then extending that to vectors, resulted in undesirable
semantics. "+" on arrays from "numpy" and "+" on built-in vectors
(lists) behave quite differently. A dedicated concatenation operator
would have avoided that mess.
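
A minimal sketch of the difference being described, written with only
built-in lists so that it runs without numpy installed. The helper
elementwise_add is a hypothetical name; it mimics what "+" does on two
numpy arrays of equal length.

```python
def elementwise_add(xs, ys):
    """Add two equal-length sequences item by item (what numpy's + does)."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return [x + y for x, y in zip(xs, ys)]


a = [1, 2, 3]
b = [10, 20, 30]

concatenated = a + b             # built-in list: "+" concatenates
summed = elementwise_add(a, b)   # array style: "+" adds item by item
joined = "ab" + "cd"             # str: "+" concatenates as well

print(concatenated, summed, joined)
```

The same expression a + b therefore gives a 6-element or a 3-element
result depending on whether a and b are lists or numpy arrays.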

C++ has worse problems, because it uses < and > as both
brackets and operators. This does horrible things to the syntax.

However, adding a large number of new operators or new
bracket types is probably undesirable.

John Nagle

Westley Martínez

Feb 18, 2011, 1:27:45 AM
to pytho...@python.org
On Thu, 2011-02-17 at 21:38 -0800, rantingrick wrote:
> <rant />
That was a very insightful point-of-view. Perhaps you should write a
book or blog, as I'd be very interested in reading more about what you
have to say.

Westley Martínez

Feb 18, 2011, 1:30:04 AM
to pytho...@python.org
On Thu, 2011-02-17 at 22:28 -0500, Chris Jones wrote:
> On Thu, Feb 17, 2011 at 09:55:47PM EST, Cor Gest wrote:
> > Some entity, AKA Cthun <cthu...@qmail.net.au>,
>
> [..]

>
> > > And you omitted the #1 most serious objection to Xah's proposal,
> > > rantingrick, which is that to implement it would require unrealistic
> > > things such as replacing every 101-key keyboard with 10001-key
> > > keyboards and training everyone to use them. Xah would have us all
> > > replace our workstations with machines that resemble pipe organs,
> > > rantingrick, or perhaps the cockpits of the three surviving Space
> > > Shuttles. No doubt they'd be enormously expensive, as well as much
> > > more difficult to learn to use, rantingrick.
>
> > At least it should try to mimick a space-cadet keyboard, shouldn't it?
>
> Implementation details, and not very accurate at that.. the APL keyboard
> has not additional keys and yet it has the potential to add up to 100
> additional symbols to the US-ASCII keyboard, half of which are produced
> via a single modifier.. same as upper-case letters. So unless more than
> 50+20 = 70 symbols are needed the keyboard conversion would cost about..
> what.. $2.00 in stickers and maybe ten minutes to place them.
>
> Maybe the problem lies elsewhere..?
>
> cj
>
$2.00 * thousands of programmers -> thousands of dollars + thousands of
hours of lost training time; not to mention code conversion.

Chris Jones

Feb 18, 2011, 2:05:39 AM
to pytho...@python.org
On Fri, Feb 18, 2011 at 01:30:04AM EST, Westley Martínez wrote:
> On Thu, 2011-02-17 at 22:28 -0500, Chris Jones wrote:
> > On Thu, Feb 17, 2011 at 09:55:47PM EST, Cor Gest wrote:
> > > Some entity, AKA Cthun <cthu...@qmail.net.au>,
> >
> > [..]

> >
> > > > And you omitted the #1 most serious objection to Xah's proposal,
> > > > rantingrick, which is that to implement it would require unrealistic
> > > > things such as replacing every 101-key keyboard with 10001-key
> > > > keyboards and training everyone to use them. Xah would have us all
> > > > replace our workstations with machines that resemble pipe organs,
> > > > rantingrick, or perhaps the cockpits of the three surviving Space
> > > > Shuttles. No doubt they'd be enormously expensive, as well as much
> > > > more difficult to learn to use, rantingrick.
> >
> > > At least it should try to mimick a space-cadet keyboard, shouldn't it?
> >
> > Implementation details, and not very accurate at that.. the APL keyboard
> > has not additional keys and yet it has the potential to add up to 100
> > additional symbols to the US-ASCII keyboard, half of which are produced
> > via a single modifier.. same as upper-case letters. So unless more than
> > 50+20 = 70 symbols are needed the keyboard conversion would cost about..
> > what.. $2.00 in stickers and maybe ten minutes to place them.
> >
> > Maybe the problem lies elsewhere..?
> >
> > cj
> >

> $2.00 * thousands of programmers -> thousands of dollars + thousands
> of lost training time; not to mention code conversion.

Yeah.. yeah.. one coffee break missed.. get real..

:-)

cj

Chris Jones

Feb 18, 2011, 2:50:11 AM
to pytho...@python.org
On Fri, Feb 18, 2011 at 12:43:37AM EST, John Nagle wrote:
> On 2/17/2011 6:55 PM, Cor Gest wrote:

[..]

>> At least it should try to mimick a space-cadet keyboard, shouldn't
>> it?

> I've used both the "MIT Space Cadet" keyboard on a Symbolics LISP
> machine, and the Stanford SAIL keyboard. There's something to be
> said for having more mathematical symbols.

Really..? Wow..! I only ever saw pictures of the beast and I was never
really convinced it was for real.. :-)

> Some programs use a bigger character set. MathCAD, for example,
> has a broader range of mathematical symbols on the input side than
> ASCII offers. They're not decorative; MathCAD has different "="
> symbols for assignment, algebraic equivalence, identity, and
> comparison.

Out of curiosity, I played a bit with APL lately, and I was amazed at how
quickly you get to learn the extra symbols and their location on the
keyboard. Had integrating the basic concepts of the language been that
easy, I would have been comfortably coding within a couple of hours.
I was also rather enchanted by the fact that the coding closely matched
my intentions. No overloading in this respect. Not that I'm an APL
advocate, but who knows what programming languages will look like in the
not-so-distant future.

> I've previously mentioned that Python suffers in a few places
> from unwanted overloading. Using "+" for concatenation of
> strings, then extending that to vectors, resulted in undesirable
> semantics. "+" on arrays from "numpy", and on built-in vectors
> behave quite differently. A dedicated concatenation operator
> would have avoided that mess.

And the worst part of it is that you get so used to it that you take
such matters for granted. Thanks for the eye-opener.

> C++ has worse problems, because it uses < and > as both brackets
> and operators. This does horrible things to the syntax.

.. from a quite different perspective it may be worth noting that
practically all programming languages (not to mention the attached
documentation) are based on the English language. And interestingly
enough, most any software of note appears to have come out of cultures
where English is the native language, or where the native
language is relatively close to English.. Northern Europe
mostly.. and to no small extent, countries where English is
well-established as a universal second language, such as India. Always
struck me as odd that a country like Japan for instance, with all its
achievements in the industrial realm, never came up with one single
major piece of software.

cj

Steven D'Aprano

Feb 18, 2011, 5:26:02 AM
On Thu, 17 Feb 2011 21:43:37 -0800, John Nagle wrote:

> I've used both the "MIT Space Cadet" keyboard on a Symbolics LISP
> machine, and the Stanford SAIL keyboard. There's something to be said
> for having more mathematical symbols.

Agreed. I'd like Python to support proper mathematical symbols like ∞ for
float('inf'), ≠ for not-equal, ≤ for less-than-or-equal, and ≥ for
greater-than-or-equal.

They would have to be optional, because most editors still make it
difficult to enter such characters, and many fonts are lacking in glyphs,
but still, now that Python supports non-ASCII encodings in source files,
it could be done.
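
One way such optional symbols could be prototyped today is to translate
them to the ASCII spellings before compiling. This is only a sketch
under stated assumptions: the SYMBOLS table and the helper names
to_ascii and is_rejected are hypothetical, and the plain text
replacement is naive (it would also rewrite symbols inside string
literals).

```python
# Hypothetical mapping from math symbols to Python's ASCII spellings.
SYMBOLS = {
    "≠": "!=",
    "≤": "<=",
    "≥": ">=",
    "∞": "float('inf')",
}


def to_ascii(source):
    """Naively replace each math symbol with its ASCII equivalent."""
    for symbol, ascii_form in SYMBOLS.items():
        source = source.replace(symbol, ascii_form)
    return source


def is_rejected(source):
    """Return True if Python refuses to compile the source as written."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return True
    return False


print(is_rejected("3 ≤ 5"))          # the symbol is not accepted as-is
print(eval(to_ascii("3 ≤ 5")))       # evaluated after translation
print(eval(to_ascii("∞ ≥ 10**9")))
```

A real implementation would have to work on tokens rather than raw
text, but the sketch shows that the symbols map one-to-one onto
existing operators.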

> Some programs use a bigger character set. MathCAD,
> for example, has a broader range of mathematical symbols on the input
> side than ASCII offers. They're not decorative; MathCAD has different
> "=" symbols for assignment, algebraic equivalence, identity, and
> comparison.
>
> I've previously mentioned that Python suffers in a few places
> from unwanted overloading. Using "+" for concatenation of strings, then
> extending that to vectors, resulted in undesirable semantics. "+" on
> arrays from "numpy", and on built-in vectors behave quite differently.
> A dedicated concatenation operator would have avoided that mess.

I don't quite agree that the mess is as large as you make out, but yes,
more operators would be useful.


--
Steven

Steven D'Aprano

Feb 18, 2011, 6:40:17 AM
On Fri, 18 Feb 2011 02:50:11 -0500, Chris Jones wrote:

> Always
> struck me as odd that a country like Japan for instance, with all its
> achievements in the industrial realm, never came up with one single
> major piece of software.

I think you are badly misinformed.

The most widespread operating system in the world is not Windows. It's
something you've probably never heard of, from Japan, called TRON.

http://www.linuxinsider.com/story/31855.html
http://web-japan.org/trends/science/sci030522.html

Japan had an ambitious, but sadly failed, "Fifth Generation Computing"
project:

http://en.wikipedia.org/wiki/Fifth_generation_computer
http://vanemden.wordpress.com/2010/08/21/who-killed-prolog/

They did good work, but unfortunately were ahead of their time and the
project ended in failure.

Japan virtually *owns* the video game market. Yes, yes, Americans publish
a few high-profile first-person shooters. For every one of them, there's
about a thousand Japanese games that never leave the country.

There's no shortages of programming languages which have come out of
Japan:

http://hopl.murdoch.edu.au/findlanguages.prx?id=jp&which=ByCountry
http://no-sword.jp/blog/2006/12/programming-without-ascii.html

The one you're most likely to have used or at least know of is Ruby.

--
Steven

Giacomo Boffi

Feb 18, 2011, 7:02:09 AM2/18/11
to
Chris Jones <cjns...@gmail.com> writes:

> [...] most any software of note appears to have come out of cultures
> where English is either the native language, or where the native
> language is either relatively close to English...

i do acknowledge your "most", but how do you spell "Moon" in Portuguese?

Giacomo Boffi

Feb 18, 2011, 7:06:24 AM2/18/11
to
Steven D'Aprano <steve+comp....@pearwood.info> writes:

>> A dedicated concatenation operator would have avoided that mess.
>
> I don't quite agree that the mess is as large as you make out, but yes,
> more operators would be useful.

am i wrong, or is "|" still available?
--
l'amore e' un sentimento a senso unico. a volte una via comincia dove
finisce un'altra e viceversa -- Caldana, in IFQ

Xah Lee

Feb 18, 2011, 7:43:13 AM2/18/11
to
On 2011-02-16, Xah Lee  wrote:
│ Vast majority of computer languages use ASCII as its character set.
│ This means, it jams multitude of operators into about 20 symbols.
│ Often, a symbol has multiple meanings depending on context.

On 2011-02-17, rantingrick wrote:

On 2011-02-17, Cthun wrote:
│ And you omitted the #1 most serious objection to Xah's proposal,
│ rantingrick, which is that to implement it would require unrealistic
│ things such as replacing every 101-key keyboard with 10001-key keyboards
│ and training everyone to use them. Xah would have us all replace our
│ workstations with machines that resemble pipe organs, rantingrick, or
│ perhaps the cockpits of the three surviving Space Shuttles. No doubt
│ they'd be enormously expensive, as well as much more difficult to learn
│ to use, rantingrick.

keyboard shouldn't be a problem.

Look at APL users.
http://en.wikipedia.org/wiki/APL_(programming_language)
they are happy campers.

Look at Mathematica, which has supported a lot of math symbols since v3
(~1997), before unicode became popular.
see:
〈How Mathematica does Unicode?〉
http://xahlee.org/math/mathematica_unicode.html

word processors also automatically insert symbols such as “curly quotes”,
trade mark sign ™, copyright sign ©, arrow →, bullet •, ellipsis …
etc, and the number of people who produce documents with these chars
is probably larger than the number of programmers.

in emacs, i recently also wrote a mode that lets you easily input a few
hundred unicode chars.
〈Emacs Math Symbols Input Mode (xmsi-mode)〉
http://xahlee.org/emacs/xmsi-math-symbols-input.html

the essence is that you just need an input system.

look at Chinese, Japanese, Korean, or Arabic. They happily type
without requiring that every symbol they use must have a corresponding
key on the keyboard. For some langs, such as Chinese, that's impossible
or impractical.

when an input system is well designed, it could actually be more
efficient than keyboard combinations for typing special symbols (such as
Mac OS X's opt key, or Windows's AltGr). Because an input system can be
context based, it can look at adjacent text to guess what you want.

for example, when you type >= in python, the text editor can
automatically change it to ≥ (when it detects that it's appropriate,
e.g. there's an “if” nearby)

Chinese phonetic input systems use this extensively. The abbrev system
in word processors and emacs is also a form of this. I wrote some
thoughts about this here:

〈Designing a Math Symbols Input System〉
http://xahlee.org/comp/design_math_symbol_input.html
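
As a rough illustration of the abbrev idea, the sketch below expands an
abbreviation that ends at the cursor into a single symbol. The table and
the function name `expand_before_cursor` are hypothetical; this is not
how xmsi-mode or any particular editor implements it.

```python
# Hypothetical abbreviation table; the entries are illustrative only.
ABBREVIATIONS = {
    ">=": "≥",
    "<=": "≤",
    "!=": "≠",
    "inf": "∞",
    "->": "→",
}


def expand_before_cursor(text, cursor):
    """Replace the longest known abbreviation ending at `cursor`.

    Returns the new text and the new cursor position; both are unchanged
    when no abbreviation ends at the cursor.
    """
    for abbrev in sorted(ABBREVIATIONS, key=len, reverse=True):
        start = cursor - len(abbrev)
        if start >= 0 and text[start:cursor] == abbrev:
            symbol = ABBREVIATIONS[abbrev]
            new_text = text[:start] + symbol + text[cursor:]
            return new_text, start + len(symbol)
    return text, cursor


line, pos = expand_before_cursor("if x >= y:", 7)
print(line)  # if x ≥ y:
```

An editor would call something like this on a hotkey or after each
keystroke, so no extra keys are needed on the keyboard.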

Xah Lee

Xah Lee

Feb 18, 2011, 8:01:45 AM2/18/11
to

Chris Jones wrote:
«.. from a quite different perspective it may be worth noting that
practically all programming languages (not to mention the attached
documentation) are based on the English language. And interestingly
enough, most any software of note appears to have come out of cultures
where English is either the native language, or where the native
language is either relatively close to English.. Northern Europe
mostly.. and not to some small extent, countries where English is
well-established as a universal second language, such as India. Always
struck me as odd that a country like Japan for instance, with all its
achievements in the industrial realm, never came up with one single
major piece of software.»

btw, english is one of India's two official langs. It's used
between Indians, and i think it's rare or non-existent for a college
in india to teach in a local dialect. (this is second hand knowledge. I
learned this from Wikipedia and experience with indian co-workers)

i also wondered about why japan doesn't seem to have created major
software or OS. Though, Ruby was invented in Japan. I do think they
have some OSes just not that popular... i think for special purposes
OSes, they have quite a lot ... from Mitsubishi, NEC, etc... in their
huge robotics industry among others. (again, this is all second hand
knowledge)

... i recall having read about a non-english comp lang that appeared
recently...

Xah Lee

Steve Schafer

Feb 18, 2011, 8:55:53 AM2/18/11
to
On Thu, 17 Feb 2011 20:47:57 -0800 (PST), rantingrick
<ranti...@gmail.com> wrote:

>What is evolution?
>
>Evolution is the pursuit of perfection at the expense of anything and
>everything!

No, evolution is the pursuit of something just barely better than what
the other guy has. Evolution is about gaining an edge, not gaining
perfection.

Perfect is the enemy of good.

-Steve Schafer


Steve Schafer

Feb 18, 2011, 10:01:01 AM2/18/11
to
On Fri, 18 Feb 2011 06:22:32 -0800 (PST), rantingrick
<ranti...@gmail.com> wrote:

>Evolution is about one cog gaining an edge over another, yes. However
>the system itself moves toward perfection at the expense of any and
>all cogs.

Um, do you actually know anything about (biological) evolution? There is
no evidence of an overall "goal" of any kind, perfect or not.

* There are many examples of evolutionary "arms races" in nature; e.g.,
the cheetah and the gazelle, each gaining incrementally on the other,
and a thousand generations later, each in essentially the same place
relative to the other that they started from, only with longer legs
or a more supple spine.

* There are many adaptations that confer a serious DISadvantage in one
aspect of survivability, that survive because they confer an
advantage in another (sickle-cell disease in humans, a peacock's
tail, etc.).

>If perfection is evil then what is the pursuit of perfection: AKA:
>gaining an edge?

1) I never said that perfection is evil; those are entirely your words.

2) If you don't already see the obvious difference between "pursuit of
perfection" and "gaining an edge," then I'm afraid there's nothing I can
do or say to help you.

-Steve Schafer

Cthun

Feb 18, 2011, 10:07:43 AM2/18/11
to
On 18/02/2011 7:43 AM, Xah Lee wrote:
> On 2011-02-17, Cthun wrote:
> │ And you omitted the #1 most serious objection to Xah's proposal,
> │ rantingrick, which is that to implement it would require unrealistic
> │ things such as replacing every 101-key keyboard with 10001-key
> keyboards

What does your classic unsubstantiated and erroneous claim have to do
with Lisp, Lee?

Stephen Hansen

Feb 18, 2011, 10:08:56 AM2/18/11
to pytho...@python.org
On 2/17/11 7:42 PM, Littlefield, Tyler wrote:
>>My intention was to educate him on the pitfalls of multiplicity.
> O. that's what you call that long-winded nonsense? Education? You must
> live in America. Can I hazard a guess that your universal language might
> be english? Has it not ever occurred to you that people take pride in
> their language? It is part of their culture. And yet you rant on about
> selfishness?

This is an old rant, there's nothing new to it. Rick's racist and
imperialistic anti-Unicode rants have all been fully hashed out months
if not years ago, Tyler. There's really nothing more to say about it.

He doesn't get it.

--

Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/


Chris Jones

Feb 18, 2011, 9:06:37 AM2/18/11
to pytho...@python.org

Food for thought.. Thanks much for the links..!

cj

Westley Martínez

Feb 18, 2011, 2:16:30 PM2/18/11
to pytho...@python.org
More people despise APL than like it.

Allowing non-ascii characters as operators is a silly idea simply
because if xorg breaks, which it's very likely to do with the current
video drivers, I'm screwed. Not only does the Linux virtual terminal not
support displaying these special characters, but there's also no way of
inputting them. On top of that, these special characters require more
bytes to display than ascii text, which would bloat source files
unnecessarily.

You say we have symbol congestion, but in reality we have our own symbol
bloat. Japanese has more or less three punctuation marks, while
English has perhaps more than the alphabet! The fundamental point here
is using non-ascii operators violates the Zen of Python. It violates
"Simple is better than complex," as well as "There should be one-- and
preferably only one --obvious way to do it."

Steven D'Aprano

Feb 18, 2011, 7:03:06 PM2/18/11
to
On Fri, 18 Feb 2011 04:43:13 -0800, Xah Lee wrote:

> for example, when you type >= in python, the text editor can
> automatically change it to ≥ (when it detects that it's appropriate,
> e.g. there's a “if” nearby)

You can't rely on the presence of an `if`.

flag = x >= y
value = lookup[x >= y]
filter(lambda pair: pair[0] >= pair[1], sequence)

Not that you need to. There are no circumstances in Python where the
meaning of >= is changed by an `if` statement.
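
Since the meaning of >= never depends on context, a display-side
replacement only needs to know that it is looking at an operator token.
The following is a minimal sketch using the standard `tokenize` module;
the `SYMBOLS` table and the `prettify` name are assumptions made for
illustration, and the result is for display only (Python will not
compile the output).

```python
import io
import tokenize

# Hypothetical display mapping from ASCII digraphs to single symbols.
SYMBOLS = {">=": "≥", "<=": "≤", "!=": "≠"}


def prettify(source):
    """Return `source` with comparison digraphs shown as Unicode symbols.

    Only operator tokens are touched, so digraphs inside string literals
    and comments are left alone.
    """
    lines = source.splitlines(keepends=True)
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    ops = [t for t in tokens if t.type == tokenize.OP and t.string in SYMBOLS]
    # Work right to left so earlier column offsets stay valid.
    for tok in reversed(ops):
        row, col = tok.start
        line = lines[row - 1]
        lines[row - 1] = line[:col] + SYMBOLS[tok.string] + line[tok.end[1]:]
    return "".join(lines)


code = 'flag = x >= y\nvalue = lookup[x >= y]\nlabel = "a >= b"\n'
print(prettify(code))
```

No `if` is consulted anywhere: the first two lines are rewritten and the
string literal on the third is not.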


Followups set to comp.lang.python.


--
Steven

Steven D'Aprano

Feb 18, 2011, 8:01:57 PM2/18/11
to
On Fri, 18 Feb 2011 11:16:30 -0800, Westley Martínez wrote:

> Allowing non-ascii characters as operators is a silly idea simply
> because if xorg breaks, which it's very likely to do with the current
> video drivers, I'm screwed.

And if your hard drive crashes, you're screwed too. Why stop at "xorg
breaks"?

Besides, Windows and MacOS users will be scratching their head asking
"xorg? Why should I care about xorg?"

Programming languages are perfectly allowed to rely on the presence of a
working environment. I don't think general purpose programming languages
should be designed with reliability in the presence of a broken
environment in mind.

Given the use-cases people put Python to, it is important for the
language to *run* without a GUI environment. It's also important (but
less so) to allow people to read and/or write source code without a GUI,
which means continuing to support the pure-ASCII syntax that Python
already supports. But Python already supports non-ASCII source files,
with an optional encoding line in the first two lines of the file, so it
is already possible to write Python code that runs without X but can't be
easily edited without a modern, Unicode-aware editor.


> Not only does the Linux virtual terminal not
> support displaying these special characters, but there's also no way of
> inputting them.

That's a limitation of the Linux virtual terminal. In 1984 I used to use
a Macintosh which was perfectly capable of displaying and inputting non-
ASCII characters with a couple of key presses. Now that we're nearly a
quarter of the way into 2011, I'm using a Linux PC that makes entering a
degree sign or a pound sign a major undertaking, if it's even possible at
all. It's well past time for Linux to catch up with the 1980s.


> On top of that, these special characters require more
> bytes to display than ascii text, which would bloat source files
> unnecessarily.

Oh come on now, now you're just being silly. "Bloat source files"? From a
handful of multi-byte characters? Cry me a river!

This is truly a trivial worry:

>>> s = "if x >= y:\n"
>>> u = u"if x ≥ y:\n"
>>> len(s)
11
>>> len(u.encode('utf-8'))
12


The source code to the decimal module in Python 3.1 is 205470 bytes in
size. It contains 63 instances of ">=" and 62 of "<=". Let's suppose
every one of those were changed to ≥ or ≤ characters. This would "bloat"
the file by 0.06%.
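
The arithmetic behind that figure can be checked in a few lines. The
file size and the operator counts are the ones quoted above; the only
assumption is that the file is stored as UTF-8.

```python
# Figures quoted in the post for Python 3.1's decimal module.
file_size = 205470          # bytes
digraph_count = 63 + 62     # occurrences of ">=" and "<="

# ">=" is 2 bytes in UTF-8; "≥" and "≤" are 3 bytes each.
ascii_bytes = len(">=".encode("utf-8"))
symbol_bytes = len("≥".encode("utf-8"))
extra_per_symbol = symbol_bytes - ascii_bytes

growth = digraph_count * extra_per_symbol
percent = round(100 * growth / file_size, 2)

print(growth)   # 125
print(percent)  # 0.06
```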

Oh the humanity!!! How will my 2TB hard drive cope?!?!


> You say we have symbol congestion, but in reality we have our own symbol
> bloat. Japanese has more or less three punctuation marks, while
> English has perhaps more than the alphabet! The fundamental point here
> is using non-ascii operators violates the Zen of Python. It violates
> "Simple is better than complex," as well as "There should be one-- and
> preferably only one --obvious way to do it."

Define "simple" and "complex" in this context.

It seems to me that single character symbols such as ≥ are simpler than
digraphs such as >=, simply because the parser knows what the symbol is
after reading a single character. It doesn't have to read on to tell
whether you meant > or >=.
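
A toy lexer shows the lookahead in question. This is only a sketch of
the idea, with a made-up function name; it is not how CPython's
tokenizer is written, and it treats every other character as its own
token.

```python
def lex_comparisons(text):
    """Split `text` into comparison operators and other characters.

    After '>' or '<' the lexer must peek at the next character to decide
    between the one- and two-character operator; '≥' and '≤' need no peek.
    """
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in "<>":
            # One character of lookahead is needed here.
            if i + 1 < len(text) and text[i + 1] == "=":
                tokens.append(ch + "=")
                i += 2
                continue
        tokens.append(ch)
        i += 1
    return tokens


print(lex_comparisons("a>=b"))  # ['a', '>=', 'b']
print(lex_comparisons("a>b"))   # ['a', '>', 'b']
print(lex_comparisons("a≥b"))   # ['a', '≥', 'b']
```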

You can add complexity to one part of the language (hash tables are more
complex than arrays) in order to simplify another part (dict lookup is
simpler and faster than managing your own data structure in a list).

And as for one obvious way, there's nothing obvious about using a | b for
set union. Why not a + b? The mathematician in me wants to spell set
union and intersection as a ⋃ b ⋂ c, which is the obvious way to me (even
if my lousy editor makes it a PITA to *enter* the symbols).

The lack of good symbols for operators in ASCII is a real problem. Other
languages have solved it in various ways, sometimes by using digraphs (or
higher-order symbols), and sometimes by using Unicode (or some platform/
language specific equivalent). I think that given the poor support for
Unicode in the general tools we use, the use of non-ASCII symbols will
have to wait until Python4. Hopefully by 2020 input methods will have
improved, and maybe even xorg be replaced by something less sucky.

I think that the push for better and more operators will have to come
from the Numpy community. Further reading:


http://mail.python.org/pipermail/python-dev/2008-November/083493.html


--
Steven

Westley Martínez

Feb 18, 2011, 9:14:32 PM2/18/11
to pytho...@python.org
On Sat, 2011-02-19 at 01:01 +0000, Steven D'Aprano wrote:
> On Fri, 18 Feb 2011 11:16:30 -0800, Westley Martínez wrote:
>
> > Allowing non-ascii characters as operators is a silly idea simply
> > because if xorg breaks, which it's very likely to do with the current
> > video drivers, I'm screwed.
>
> And if your hard drive crashes, you're screwed too. Why stop at "xorg
> breaks"?
Because I can still edit text files in the terminal.

I guess you could manually control the magnet in the hard-drive if it
failed but that'd be horrifically tedious.


> Besides, Windows and MacOS users will be scratching their head asking
> "xorg? Why should I care about xorg?"

Why should I care if my programs run on Windows and Mac? Because I'm a
nice guy I guess....


> Programming languages are perfectly allowed to rely on the presence of a
> working environment. I don't think general purpose programming languages
> should be designed with reliability in the presence of a broken
> environment in mind.
>
> Given the use-cases people put Python to, it is important for the
> language to *run* without a GUI environment. It's also important (but
> less so) to allow people to read and/or write source code without a GUI,
> which means continuing to support the pure-ASCII syntax that Python
> already supports. But Python already supports non-ASCII source files,
> with an optional encoding line in the first two lines of the file, so it
> is already possible to write Python code that runs without X but can't be
> easily edited without a modern, Unicode-aware editor.
>
> > Not only does the Linux virtual terminal not
> > support displaying these special characters, but there's also no way of
> > inputting them.
>
> That's a limitation of the Linux virtual terminal. In 1984 I used to use
> a Macintosh which was perfectly capable of displaying and inputting non-
> ASCII characters with a couple of key presses. Now that we're nearly a
> quarter of the way into 2011, I'm using a Linux PC that makes entering a
> degree sign or a pound sign a major undertaking, if it's even possible at
> all. It's well past time for Linux to catch up with the 1980s.

I feel it's unnecessary for Linux to "catch up" simply because we have
no need for these special characters! When I read Python code, I only
see text from Latin-1, which is easy to input and every *decent* font
supports it. When I read C code, I only see text from Latin-1. When I
read code from just about everything else that's plain text, I only see
text from Latin-1. Even Latex, which is designed for typesetting
mathematical formulas, only allows ASCII in its input. Languages that
accept non-ASCII input have always been somewhat esoteric. There's
nothing wrong with being different, but there is something wrong in
being so different that you're causing problems or at least speed bumps
for particular users.


> > On top of that, these special characters require more
> > bytes to display than ascii text, which would bloat source files
> > unnecessarily.
>
> Oh come on now, now you're just being silly. "Bloat source files"? From a
> handful of double-byte characters? Cry me a river!
>
> This is truly a trivial worry:
>
> >>> s = "if x >= y:\n"
> >>> u = u"if x ≥ y:\n"
> >>> len(s)
> 11
> >>> len(u.encode('utf-8'))
> 12
>
>
> The source code to the decimal module in Python 3.1 is 205470 bytes in
> size. It contains 63 instances of ">=" and 62 of "<=". Let's suppose
> every one of those were changed to ≥ or ≤ characters. This would "bloat"
> the file by 0.06%.
>
> Oh the humanity!!! How will my 2TB hard drive cope?!?!

A byte saved is a byte earned. What about embedded systems trying to
conserve as much resources as possible?


> > You say we have symbol congestion, but in reality we have our own symbol
> > bloat. Japanese has more or less three punctuation marks, while
> > English has perhaps more than the alphabet! The fundamental point here
> > is using non-ascii operators violates the Zen of Python. It violates
> > "Simple is better than complex," as well as "There should be one-- and
> > preferably only one --obvious way to do it."
>
> Define "simple" and "complex" in this context.
>
> It seems to me that single character symbols such as ≥ are simpler than
> digraphs such as >=, simply because the parser knows what the symbol is
> after reading a single character. It doesn't have to read on to tell
> whether you meant > or >=.
>
> You can add complexity to one part of the language (hash tables are more
> complex than arrays) in order to simplify another part (dict lookup is
> simpler and faster than managing your own data structure in a list).

I believe dealing with ASCII is simpler than dealing with Unicode, for
reasons on both the developer's and user's side.


> And as for one obvious way, there's nothing obvious about using a | b for
> set union. Why not a + b? The mathematician in me wants to spell set
> union and intersection as a ⋃ b ⋂ c, which is the obvious way to me (even
> if my lousy editor makes it a PITA to *enter* the symbols).

Not all programmers are mathematicians (in fact I'd say most aren't). I
know what those symbols mean, but some people might think "a u b n c ...
what?" | actually makes sense because it relates to bitwise OR in which
bits are turned on. Here's an example just for context:

01010101 | 10101010 = 11111111
{1, 2, 3} | {4, 5, 6} = {1, 2, 3, 4, 5, 6}

For me, someone who is deeply familiar with bitwise operations but not
very familiar with sets, I found the set syntax to be quite easy to
understand.
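
The analogy can be run directly in Python 3. The set `c` is an extra
value added here for illustration, to show "&" alongside "|".

```python
# Bitwise OR on integers: a bit is set in the result if it is set in
# either operand.
bits = 0b01010101 | 0b10101010
print(format(bits, "08b"))  # 11111111

# The same operator on sets gives the union; "&" gives the intersection.
a = {1, 2, 3}
b = {4, 5, 6}
c = {3, 4}
print(sorted(a | b))        # [1, 2, 3, 4, 5, 6]
print(sorted((a | b) & c))  # [3, 4]

# The named methods are an equivalent spelling.
assert a | b == a.union(b)
assert (a | b) & c == (a | b).intersection(c)
```

Note that "&" binds more tightly than "|" in Python, so a | b & c
without parentheses means a | (b & c).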


> The lack of good symbols for operators in ASCII is a real problem. Other
> languages have solved it in various ways, sometimes by using digraphs (or
> higher-order symbols), and sometimes by using Unicode (or some platform/
> language specific equivalent). I think that given the poor support for
> Unicode in the general tools we use, the use of non-ASCII symbols will
> have to wait until Python4. Hopefully by 2020 input methods will have
> improved, and maybe even xorg be replaced by something less sucky.
>
> I think that the push for better and more operators will have to come
> from the Numpy community. Further reading:
>
>
> http://mail.python.org/pipermail/python-dev/2008-November/083493.html
>
>
> --
> Steven
>

You have provided me with some well thought out arguments and have
stimulated my young programmer's mind, but I think we're coming from
different angles. You seem to come from a more math-minded, idealist
angle, while I come from a more practical angle. Being a person who has
had to deal with the í in my last name and Japanese text on a variety of
platforms, I've found the current methods of non-ascii input to be
largely platform-dependent and---for lack of a better word---crappy,
i.e. not suitable for a 'wide-audience' language like Python.

Paul Rubin

Feb 18, 2011, 9:24:43 PM2/18/11
to
Westley Martínez <anik...@gmail.com> writes:
> When I read Python code, I only
> see text from Latin-1, which is easy to input and every *decent* font
> supports it. When I read C code, I only see text from Latin-1. When I
> read code from just about everything else that's plain text, I only see
> text from Latin-1. Even Latex, which is designed for typesetting
> mathematical formulas, only allows ASCII in its input. Languages that
> accept non-ASCII input have always been somewhat esoteric.

Maybe we'll see more of them as time goes by. C, Python, and Latex all
predate Unicode by a number of years. If Latex were written today it
would probably accept Unicode for math symbols, accented and non-Latin
characters, etc.

Steven D'Aprano

Feb 19, 2011, 1:29:39 AM2/19/11
to
On Fri, 18 Feb 2011 18:14:32 -0800, Westley Martínez wrote:

>> Besides, Windows and MacOS users will be scratching their head asking
>> "xorg? Why should I care about xorg?"
> Why should I care if my programs run on Windows and Mac? Because I'm a
> nice guy I guess....

Python is a programming language that is operating system independent,
and not just a Linux tool. So you might not care about your Python
programs running on Windows, but believe me, the Python core developers
care about Python running on Windows and Mac OS. (Even if sometimes their
lack of resources makes Windows and Mac somewhat second-class citizens.)


>> That's a limitation of the Linux virtual terminal. In 1984 I used to
>> use a Macintosh which was perfectly capable of displaying and inputting
>> non- ASCII characters with a couple of key presses. Now that we're
>> nearly a quarter of the way into 2011, I'm using a Linux PC that makes
>> entering a degree sign or a pound sign a major undertaking, if it's
>> even possible at all. It's well past time for Linux to catch up with
>> the 1980s.
>
> I feel it's unnecessary for Linux to "catch up" simply because we have
> no need for these special characters!

Given that your name is Westley Martínez, that's an astonishing claim!
How do you even write your name in your own source code???

Besides, speak for yourself, not for "we". I have need for them.


> When I read Python code, I only
> see text from Latin-1, which is easy to input

Hmmm. I wish I knew an easy way to input it. All the solutions I've come
across are rubbish. How do you enter (say) í at the command line of a
xterm?

But in any case, ASCII != Latin-1, so you're already using more than
ASCII characters.


> Languages that
> accept non-ASCII input have always been somewhat esoteric.

Then I guess Python is esoteric, because with source code encodings it
supports non-ASCII literals and even variables:

[steve@sylar ~]$ cat encoded.py
# -*- coding: utf-8 -*-
résumé = "Some text here..."
print(résumé)

[steve@sylar ~]$ python3.1 encoded.py
Some text here...
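
The line between what Python 3 accepts and what it rejects can be probed
from within the language. This is a small sketch, assuming a Python 3
interpreter: letters such as é are allowed in names, while a symbol such
as ≥ is neither an identifier character nor an operator.

```python
# Letters outside ASCII are valid in Python 3 identifiers...
name_ok = "résumé".isidentifier()
print(name_ok)    # True

# ...but a symbol such as ≥ is not an identifier character,
symbol_ok = "≥".isidentifier()
print(symbol_ok)  # False

# and the compiler does not accept it as an operator either.
try:
    compile("x ≥ y", "<example>", "eval")
    accepted = True
except SyntaxError:
    accepted = False
print(accepted)   # False
```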


[...]


> A byte saved is a byte earned. What about embedded systems trying to
> conserve as much resources as possible?

Then they don't have to use multi-byte characters, just like they can
leave out comments, and .pyo files, and use `ed` for their standard text
editor instead of something bloated like vi or emacs.

[...]


> I believe dealing with ASCII is simpler than dealing with Unicode, for
> reasons on both the developer's and user's side.

Really? Well, I suppose if you want to define "you can't do this AT ALL"
as "simpler", then, yes, ASCII is simpler.

Using pure-ASCII means I am forced to write extra code because there
aren't enough operators to be useful, e.g. element-wise addition versus
concatenation. It means I'm forced to spell out symbols in full, like
"British pound" instead of £, and use legally dubious work-arounds like
"(c)" instead of ©, and misspell words (including people's names) because
I can't use the correct characters, and am forced to use unnecessarily
long and clumsy English longhand for standard mathematical notation.

If by simple you mean "I can't do what I want to do", then I agree
completely that ASCII is simple.


>> And as for one obvious way, there's nothing obvious about using a | b
>> for set union. Why not a + b? The mathematician in me wants to spell
>> set union and intersection as a ⋃ b ⋂ c, which is the obvious way to me
>> (even if my lousy editor makes it a PITA to *enter* the symbols).
>
> Not all programmers are mathematicians (in fact I'd say most aren't). I
> know what those symbols mean, but some people might think "a u b n c ...
> what?" | actually makes sense because it relates to bitwise OR in which
> bits are turned on.

Not all programmers are C programmers who have learned that | represents
bitwise OR. Some will say "a | b ... what?". I know I did, when I was
first learning Python, and I *still* need to look them up to be sure I
get them right.

In other languages, | might be spelled as any of

bitor() OR .OR. || ∨


[...]


> Being a person who has
> had to deal with the í in my last name and Japanese text on a variety of
> platforms, I've found the current methods of non-ascii input to be
> largely platform-dependent and---for lack of a better word---crappy,

Agreed one hundred percent! Until there are better input methods for non-
ASCII characters, without the need for huge keyboards, Unicode is hard
and ASCII easy, and Python can't *rely* on Unicode tokens.

That doesn't mean that languages like Python can't support Unicode
tokens, only that they shouldn't be the only way to do things. For a long
time Pascal included (* *) as a synonym for { } because not all keyboards
included the { } characters, and C has support for trigraphs:

http://publications.gbdirect.co.uk/c_book/chapter2/alphabet_of_c.html

Eventually, perhaps in another 20 years, digraphs like != and <= will go
the same way as trigraphs. Just as people today find it hard to remember
a time when keyboards didn't include { and }, hopefully they will find it
equally hard to remember a time that you couldn't easily enter non-ASCII
characters.


--
Steven

Westley Martínez

Feb 19, 2011, 2:41:20 AM2/19/11
to pytho...@python.org
On Sat, 2011-02-19 at 06:29 +0000, Steven D'Aprano wrote:
> On Fri, 18 Feb 2011 18:14:32 -0800, Westley Martínez wrote:
>
> >> Besides, Windows and MacOS users will be scratching their head asking
> >> "xorg? Why should I care about xorg?"
> > Why should I care if my programs run on Windows and Mac? Because I'm a
> > nice guy I guess....
>
> Python is a programming language that is operating system independent,
> and not just a Linux tool. So you might not care about your Python
> programs running on Windows, but believe me, the Python core developers
> care about Python running on Windows and Mac OS. (Even if sometimes their
> lack of resources makes Windows and Mac somewhat second-class citizens.)

You didn't seem to get my humor. It's ok; most people don't.


> >> That's a limitation of the Linux virtual terminal. In 1984 I used to
> >> use a Macintosh which was perfectly capable of displaying and inputting
> >> non- ASCII characters with a couple of key presses. Now that we're
> >> nearly a quarter of the way into 2011, I'm using a Linux PC that makes
> >> entering a degree sign or a pound sign a major undertaking, if it's
> >> even possible at all. It's well past time for Linux to catch up with
> >> the 1980s.
> >
> > I feel it's unnecessary for Linux to "catch up" simply because we have
> > no need for these special characters!
>
> Given that your name is Westley Martínez, that's an astonishing claim!
> How do you even write your name in your own source code???
>
> Besides, speak for yourself, not for "we". I have need for them.

The í is easy to input. (Vim has a diacritic feature) It's the funky
mathematical symbols that are difficult.


> > When I read Python code, I only
> > see text from Latin-1, which is easy to input
>
> Hmmm. I wish I knew an easy way to input it. All the solutions I've come
> across are rubbish. How do you enter (say) í at the command line of a
> xterm?

I use this in my xorg.conf:

Section "InputDevice"
    Identifier "Keyboard0"
    Driver "kbd"
    Option "XkbLayout" "us"
    Option "XkbVariant" "dvorak-alt-intl"
EndSection

Simply remove 'dvorak-' to get qwerty. It allows you to use the right
Alt key as AltGr. For example:
AltGr+' i = í
AltGr+c = ç
AltGr+s = ß

I don't work on Windows or Mac enough to have figured out how to do it on
those platforms, but I'm sure there's a simple way.
Again, it's the funky symbols that would be difficult to input.

> But in any case, ASCII != Latin-1, so you're already using more than
> ASCII characters.
>
>
> > Languages that
> > accept non-ASCII input have always been somewhat esoteric.
>
> Then I guess Python is esoteric, because with source code encodings it
> supports non-ASCII literals and even variables:
>
> [steve@sylar ~]$ cat encoded.py
> # -*- coding: utf-8 -*-
> résumé = "Some text here..."
> print(résumé)
>
> [steve@sylar ~]$ python3.1 encoded.py
> Some text here...

I should reword that to "Languages that require non-ASCII input have
always been somewhat esoteric", e.g. APL.


> [...]
> > A byte saved is a byte earned. What about embedded systems trying to
> > conserve as much resources as possible?
>
> Then they don't have to use multi-byte characters, just like they can
> leave out comments, and .pyo files, and use `ed` for their standard text
> editor instead of something bloated like vi or emacs.

Hey, I've heard of jobs where all you do is remove comments from source
code, believe it or not!


> [...]
> > I believe dealing with ASCII is simpler than dealing with Unicode, for
> > reasons on both the developer's and user's side.
>
> Really? Well, I suppose if you want to define "you can't do this AT ALL"
> as "simpler", then, yes, ASCII is simpler.
>
> Using pure-ASCII means I am forced to write extra code because there
> aren't enough operators to be useful, e.g. element-wise addition versus
> concatenation. It means I'm forced to spell out symbols in full, like
> "British pound" instead of £, and use legally dubious work-arounds like
> "(c)" instead of ©, and misspell words (including people's names) because
> I can't use the correct characters, and am forced to use unnecessarily
> long and clumsy English longhand for standard mathematical notation.
>
> If by simple you mean "I can't do what I want to do", then I agree
> completely that ASCII is simple.

I guess it's a matter of taste. I don't mind seeing my name as
westley_martinez and am so used to seeing **, sqrt(), and / that seeing
the original symbols is a bit foreign!


> >> And as for one obvious way, there's nothing obvious about using a | b
> >> for set union. Why not a + b? The mathematician in me wants to spell
> >> set union and intersection as a ⋃ b ⋂ c, which is the obvious way to me
> >> (even if my lousy editor makes it a PITA to *enter* the symbols).
> >
> > Not all programmers are mathematicians (in fact I'd say most aren't). I
> > know what those symbols mean, but some people might think "a u b n c ...
> > what?" | actually makes sense because it relates to bitwise OR in which
> > bits are turned on.
>
> Not all programmers are C programmers who have learned that | represents
> bitwise OR. Some will say "a | b ... what?". I know I did, when I was
> first learning Python, and I *still* need to look them up to be sure I
> get them right.
>
> In other languages, | might be spelled as any of
>
> bitor() OR .OR. || ∨

Good point, but C is a very popular language.
I'm not saying we should follow C, but we should be aware that that's
where the majority of Python's users are probably coming from (or from
languages with C-like syntax)
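
For reference, here is how the spellings under discussion behave in Python today. A minimal sketch with made-up sets: the operator and method forms agree, `+` is not defined for sets, and `&` binds tighter than `|`, just as with the bitwise operators.

```python
a = {1, 2}
b = {2, 3}
c = {3, 4}

# Operator and method spellings give the same result.
assert a | b == a.union(b) == {1, 2, 3}
assert b & c == b.intersection(c) == {3}

# "Why not a + b?" -- sets simply do not define +.
try:
    a + b
    plus_supported = True
except TypeError:
    plus_supported = False

# & binds tighter than |, so a | b & c means a | (b & c).
mixed = a | b & c
grouped = (a | b) & c

print(plus_supported, mixed, grouped)
```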


> [...]
> > Being a person who has
> > had to deal with the í in my last name and Japanese text on a variety of
> > platforms, I've found the current methods of non-ascii input to be
> > largely platform-dependent and---for lack of a better word---crappy,
>
> Agreed one hundred percent! Until there are better input methods for non-
> ASCII characters, without the need for huge keyboards, Unicode is hard
> and ASCII easy, and Python can't *rely* on Unicode tokens.
>
> That doesn't mean that languages like Python can't support Unicode
> tokens, only that they shouldn't be the only way to do things. For a long
> time Pascal included (* *) as a synonym for { } because not all keyboards
> included the { } characters, and C has support for trigraphs:
>
> http://publications.gbdirect.co.uk/c_book/chapter2/alphabet_of_c.html
>
> Eventually, perhaps in another 20 years, digraphs like != and <= will go
> the same way as trigraphs. Just as people today find it hard to remember
> a time when keyboards didn't include { and }, hopefully they will find it
> equally hard to remember a time that you couldn't easily enter non-ASCII
> characters.
>
>
> --
> Steven

That was good info. I think there is possibility for more symbols, but
not for a long while, and I'll probably never use them if they do become
available, because I don't really care.

Nicholas Devenish

unread,
Feb 19, 2011, 6:00:59 AM2/19/11
to
On 19/02/2011 07:41, Westley Martínez wrote:
> Simply remove 'dvorak-' to get qwerty. It allows you to use the right
> Alt key as AltGr. For example:
> AltGr+' i = í
> AltGr+c = ç
> AltGr+s = ß
>
> I don't work on Windows or Mac enough to have figured out how to do it on
> those platforms, but I'm sure there's a simple way.
> Again, it's the funky symbols that would be difficult to input.

On mac, the acute accent is Alt-e + vowel, so Alt-e i.

This seems to work universally, regardless of gui application or
terminal. I don't work with X applications enough to know if they work
there, however.

Nicholas Devenish

unread,
Feb 19, 2011, 6:10:01 AM2/19/11
to
On 18/02/2011 10:26, Steven D'Aprano wrote:
>
> Agreed. I'd like Python to support proper mathematical symbols like ∞ for
> float('inf'), ≠ for not-equal, ≤ for less-than-or-equal, and ≥ for
> greater-than-or-equal.
>

This would be joyful! At least with the subset of operations that
already exist/exist as operators, the possibility of using these
wouldn't affect anyone not using them (like the set/intersection
notation mentioned in another post).

I'm not very optimistic about anything like this ever being accepted
into python main, however (I can't imagine it being terribly complicated
to add to the accepted language, though).
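
Short of changing the language, the idea can be tried out with a small translation layer. This is only a sketch under stated assumptions: `SYMBOLS`, `to_ascii` and `evaluate` are hypothetical names, not part of Python, and the textual replacement is naive (it would also rewrite symbols inside string literals).

```python
# Hypothetical mapping from mathematical symbols to Python's ASCII spellings.
SYMBOLS = {
    "≠": "!=",
    "≤": "<=",
    "≥": ">=",
    "∞": "float('inf')",
}

def to_ascii(expr):
    """Rewrite the symbols above into plain Python (naive: ignores string literals)."""
    for symbol, spelling in SYMBOLS.items():
        expr = expr.replace(symbol, spelling)
    return expr

def evaluate(expr, names=None):
    """Evaluate a single expression written with the symbols above."""
    # Only float() is exposed as a builtin; variables come from `names`.
    allowed = {"__builtins__": {"float": float}}
    return eval(to_ascii(expr), allowed, dict(names or {}))

print(evaluate("x ≤ ∞", {"x": 3}))
print(evaluate("1 ≠ 2"))
```

Because the ASCII spellings keep working unchanged, code that never uses the symbols is unaffected, which is the point made above.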

BartC

unread,
Feb 20, 2011, 8:08:27 PM2/20/11
to

"WestleyMartínez" <anik...@gmail.com> wrote in message
news:mailman.202.1298081...@python.org...


> You have provided me with some well thought out arguments and have
> stimulated my young programmer's mind, but I think we're coming from
> different angles. You seem to come from a more math-minded, idealist
> angle, while I come from a more practical angle. Being a person who has
> had to deal with the í in my last name

What purpose does the í serve in your last name, and how is it different
from i?

(I'd have guessed it indicated stress, but it looks Spanish and I thought
that syllable was stressed anyway.)

--
Bartc

alex23

unread,
Feb 20, 2011, 11:52:36 PM2/20/11
to
rantingrick <rantingr...@gmail.com> wrote:
> You lack vision.

And you lack education.

> Evolution is the pursuit of perfection at the expense of anything and
> everything!

Evolution is the process by which organisms change over time through
genetically shared traits. There is no 'perfection', there is only
'fitness', that is, survival long enough to reproduce. Fitness is not
something any of your ideas possess.

The rest of your conjecture about my opinions and beliefs is just pure
garbage. You'd get far fewer accusations of being a troll if you
stopped putting words into other people's mouths; then we'd just think
you're exuberantly crazy.

Also, Enough! With! The! Hyperbole! Already! "Visionary" is _never_ a
self-appointed title.

Steven D'Aprano

unread,
Feb 21, 2011, 2:16:05 AM2/21/11
to
On Sun, 20 Feb 2011 20:52:36 -0800, alex23 wrote:

> Also, Enough! With! The! Hyperbole! Already! "Visionary" is _never_ a
> self-appointed title.

You only say that because you lack the vision to see just how visionary
rantingrick's vision is!!!!1!11!

Followups set to c.l.p.


--
Steven

Ian

unread,
Feb 18, 2011, 7:27:19 AM2/18/11
to pytho...@python.org
On 18/02/2011 07:50, Chris Jones wrote:
> Always
> struck me as odd that a country like Japan for instance, with all its
> achievements in the industrial realm, never came up with one single
> major piece of software.
I think there are two reasons for this.

1) Written Japanese is so hard that the effective illiteracy rate in
Japan is astonishingly high.

Both UK and Japan claim 99% literacy rate, but UK has 10-20% who find
reading so hard they
don't read. In Japan by some estimates the proportion is about half.
Figures from memory.

Result - too many users struggle to read instructions and titles on
screen. Help texts? Forgetaboutit.

2) Culture. In the West, a designer will decide the architecture of a
major system, and it is a basis
for debate and progress. If he gets it wrong, it is not a personal
disgrace or career limiting. If it is
nearly right, then that is a major success. In Japan, the architecture
has to be debated and agreed.
This takes ages, costs lots, and ultimately fails. The failure is
because architecture is always a trade off -
there is no perfect answer.

I was involved in one software project where a major Japanese company
had done the feasibility
study. It was much much too expensive. The UK company I worked for was
able, not only to win the bid,
but complete the job profitably for less than the Japanese giant had
charged for the feasibility study.
We ignored their study - we did not have time to read through the
documentation which took
10 foot of shelf space to house.

Ian

Tim Wintle

unread,
Feb 21, 2011, 6:07:47 AM2/21/11
to hobs...@gmail.com, pytho...@python.org
On Fri, 2011-02-18 at 12:27 +0000, Ian wrote:
> 2) Culture. In the West, a designer will decide the architecture of a
> major system, and it is a basis
> for debate and progress. If he gets it wrong, it is not a personal
> disgrace or career limiting. If it is
> nearly right, then that is a major success. In Japan, the architecture
> has to be debated and agreed.
> This takes ages, costs lots, and ultimately fails. The failure is
> because architecture is always a trade off -
> there is no perfect answer.

I find this really interesting - we spend quite a lot of time studying
the Toyota production system and seeing how we can do programming work
in a similar way, and it's worked fairly well for us (Kanban, Genchi
Genbutsu, eliminating Muda & Mura, etc).

I would have expected Japanese software to have worked quite smoothly,
with continuous improvement taking in everybody's opinions etc -
although I suppose that if production never starts because the
improvements are done to a spec, rather than the product, it would be a
massive hindrance.

Tim Wintle

Westley Martínez

unread,
Feb 21, 2011, 12:59:44 PM2/21/11
to pytho...@python.org
I don't know. I don't speak Spanish, but to my knowledge it's not a
critical diacritic like in some languages.

Westley Martínez

unread,
Feb 21, 2011, 6:34:56 PM2/21/11
to pytho...@python.org
On Mon, 2011-02-21 at 11:28 -0800, rantingrick wrote:
> On Feb 20, 7:08 pm, "BartC" <b...@freeuk.com> wrote:
> > "WestleyMartínez" <aniko...@gmail.com> wrote in message

> >
> > news:mailman.202.1298081...@python.org...
> >
> > > You have provided me with some well thought out arguments and have
> > > stimulated my young programmer's mind, but I think we're coming from
> > > different angles. You seem to come from a more math-minded, idealist
> > > angle, while I come from a more practical angle. Being a person who has
> > > had to deal with the í in my last name
> >
> > What purpose does the í serve in your last name, and how is it different
> > from i?
>
> Simple, it does not have a purpose. Well, that is, except to give the
> *impression* that a few pseudo intellectuals have a reason to keep
> their worthless tenure at universities world wide. It's window
> dressing, and nothing more.

>
> > (I'd have guessed it indicated stress, but it looks Spanish and I thought
> > that syllable was stressed anyway.)
>
> The ascii char "i" would suffice. However some languages feel it
> necessary to create an ongoing tutorial of the language. Sure French
> and Latin can sound "pretty", however if all you seek is "pretty
> music" then listen to music. Language should be for communication and
> nothing more.
Nicely said; you're absolutely right.

Alexander Kapps

unread,
Feb 21, 2011, 6:48:36 PM2/21/11
to pytho...@python.org

http://en.wikipedia.org/wiki/Newspeak


(Babbling Rick is just an Orwellian Nightmare, try to ignore him)

Westley Martínez

unread,
Feb 21, 2011, 7:34:48 PM2/21/11
to pytho...@python.org
I don't quite get what you mean, but I'm just loving the troll.

rusi

unread,
Feb 28, 2011, 12:58:28 PM2/28/11
to
On Feb 17, 3:07 am, Xah Lee <xah...@gmail.com> wrote:
> might be interesting.
>
> 〈Problems of Symbol Congestion in Computer Languages (ASCII Jam;
> Unicode; Fortress)〉http://xahlee.org/comp/comp_lang_unicode.html

Haskell is slowly moving this way; see for example
http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#unicode-syntax

But it's not so easy: the lambda does not work straight off -- see
http://hackage.haskell.org/trac/ghc/ticket/1102

Dotan Cohen

unread,
Feb 28, 2011, 1:39:28 PM2/28/11
to Xah Lee, pytho...@python.org
You miss the canonical bad character reuse case: = vs ==.

Had there been more meta keys, it might be nice to have a symbol for
each key on the keyboard. I personally have experimented with putting
the symbols as regular keys and the numbers as the Shifted versions.
It's great for programming.


--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

rusi

unread,
Feb 28, 2011, 10:30:49 PM2/28/11
to
On Feb 28, 11:39 pm, Dotan Cohen <dotanco...@gmail.com> wrote:
> You miss the canonical bad character reuse case: = vs ==.
>
> Had there been more meta keys, it might be nice to have a symbol for
> each key on the keyboard. I personally have experimented with putting
> the symbols as regular keys and the numbers as the Shifted versions.
> It's great for programming.

Hmmm... Clever!
Is it X or Windows?
Can I have your setup?

One problem we programmers face is that keyboards were made for
typists not programmers.
Another is that when we move from 'hi-level' questions eg code reuse
-- to lower and lower -- eg ergonomics of reading and writing code --
the focus goes from the center of consciousness to the periphery and
we miss how many inefficiencies there are in our semi-automatic
actions.

Xah Lee

unread,
Mar 1, 2011, 12:04:57 AM3/1/11
to
On Feb 28, 7:30 pm, rusi <rustompm...@gmail.com> wrote:
> On Feb 28, 11:39 pm, Dotan Cohen <dotanco...@gmail.com> wrote:
>
> > You miss the canonical bad character reuse case: = vs ==.
>
> > Had there been more meta keys, it might be nice to have a symbol for
> > each key on the keyboard. I personally have experimented with putting
> > the symbols as regular keys and the numbers as the Shifted versions.
> > It's great for programming.
>
> Hmmm... Clever!
> Is it X or Windows?
> Can I have your setup?

hi Russ,

there's a programmer's dvorak layout i think is bundled with linux.

or you can do it with xmodmap on X-11 or AutoHotKey on Windows, or
within emacs... On the mac, you can use keyboardMaestro, Quickeys, or
just write a os wide config file yourself. You can see tutorials and
sample files for all these here http://xahlee.org/Periodic_dosage_dir/keyboarding.html

i'd be interested to know what Dotan Cohen uses too.

i tried the swapping number row with symbols a few years back. didn't
like it so much because numbers are frequently used as well,
especially when you need to enter a series of numbers. e.g. heavy
math, or dates 2010-02-28. One can use the number pad but i use that
as extra programable buttons.

Xah

Dotan Cohen

unread,
Mar 1, 2011, 4:13:39 AM3/1/11
to rusi, pytho...@python.org
On Tue, Mar 1, 2011 at 05:30, rusi <rusto...@gmail.com> wrote:
>> Had there been more meta keys, it might be nice to have a symbol for
>> each key on the keyboard. I personally have experimented with putting
>> the symbols as regular keys and the numbers as the Shifted versions.
>> It's great for programming.
>
> Hmmm... Clever!
> Is it X or Windows?
> Can I have your setup?
>

It's X, on Kubuntu. I've since "destroyed" that layout, but you can
easily play around in /usr/share/X11/xkb/symbols/us or whichever
layout you prefer. I am working on another one, though, actually I
just started working on it yesterday. It's currently broken (I'm in the
middle of troubleshooting it) but you can see what I currently have
here:
http://dotancohen.com/eng/keyboard_layout.html

> One problem we programmers face is that keyboards were made for
> typists not programmers.

Yes, I'm trying to solve that! Ideally in the end all the brackets
including {} won't need modifier keys. Give me some feedback, please,
on that layout.

Dotan Cohen

unread,
Mar 1, 2011, 4:19:42 AM3/1/11
to Xah Lee, pytho...@python.org
On Tue, Mar 1, 2011 at 07:04, Xah Lee <xah...@gmail.com> wrote:
> hi Russ,
>
> there's a programer's dvorak layout i think is bundled with linux.
>
> or you can do it with xmodmap on X-11 or AutoHotKey on Windows, or
> within emacs... On the mac, you can use keyboardMaestro, Quickeys, or
> just write a os wide config file yourself. You can see tutorials and
> sample files for all these here http://xahlee.org/Periodic_dosage_dir/keyboarding.html
>
> i'd be interested to know what Dotan Cohen use too.
>

You can see what I started working on yesterday, but it's far from finished:
http://dotancohen.com/eng/keyboard_layout.html

I tried reaching you on Skype yesterday, Xah, but I think that you
blocked me suspecting that I may be a bot. Try to Skype-chat with me
at user "dotancohen", I think that we can help each other.


> i tried the swapping number row with symbols a few years back. didn't
> like it so much because numbers are frequently used as well,
> especially when you need to enter a series of numbers. e.g. heavy
> math, or dates 2010-02-28. One can use the number pad but i use that
> as extra programable buttons.
>

I don't like the number pad so I'm looking for another solution. I
wired up a spring-off lightswitch to the Shift key and I operate it
with my foot. It's great but it only works when I'm home: it is too
ridiculous to take with me. I'm wiring up two more for Ctrl and Alt,
too bad it's too cumbersome to have ESC, Enter, and Backspace there as
well.

Mark Thomas

unread,
Mar 1, 2011, 8:01:22 AM3/1/11
to
I know someone who was involved in creating a language called A+. It
was invented at Morgan Stanley where they used Sun keyboards and had
access to many symbols, so the language did have set symbols, math
symbols, logic symbols etc. Here's a keyboard map including the
language's symbols (the red characters). http://www.aplusdev.org/keyboard.html

I have no idea if this language is still in use.

rusi

unread,
Mar 1, 2011, 9:46:19 AM3/1/11
to
On Mar 1, 6:01 pm, Mark Thomas <m...@thomaszone.com> wrote:
> I know someone who was involved in creating a language called A+. It
> was invented at Morgan Stanley where they used Sun keyboards and had
> access to many symbols, so the language did have set symbols, math
> symbols, logic symbols etc. Here's a keyboard map including the
> language's symbols (the red characters). http://www.aplusdev.org/keyboard.html

>
> I have no idea if this language is still in use.

Runs (ok limps) under debian/ubuntu -- see http://packages.debian.org/squeeze/aplus-fsf

My own attempts at improving the scene http://www.emacswiki.org/emacs/AplInDebian

If anyone has any further findings on this, I'd be happy to know.

Chris Jones

unread,
Mar 1, 2011, 6:40:17 PM3/1/11
to pytho...@python.org

Well.. a couple months back I got to the point where I'd really had it
with the anglo-centric verbosity of common programming languages (there
are days when even python makes me think of COBOL.. ugh..) and I took
a look at A+.

At first it looks like something MS (Morgan Stanley..) dumped into the
OSS lap fifteen years ago and nobody ever used it or maintained it.. so
it takes a bit of digging to make it.. sort of work in current GNU/linux
distributions.. especially since it knows nothing about Unicode.

Here's the X/A+ map I came up with:

// A+ keyboard layout: /usr/share/X11/xkb/symbols/apl
// Chris Jones - 18/12/2010

// Enable via:
// $ setxkbmap -v 10 apl

default
partial alphanumeric_keys modifier_keys
xkb_symbols "APL" {

name[Group1]= "APL";

// Alphanumeric section
key <TLDE> { [ grave, asciitilde, 0x010000fe, 0x0100007e ] };
key <AE01> { [ 1, exclam, 0x010000a1, 0x010000e0 ] };
key <AE02> { [ 2, at, 0x010000a2, 0x010000e6 ] };
key <AE03> { [ 3, numbersign, 0x0100003c, 0x010000e7 ] };
key <AE04> { [ 4, dollar, 0x010000a4, 0x010000e8 ] };
key <AE05> { [ 5, percent, 0x0100003d, 0x010000f7 ] };
key <AE06> { [ 6, asciicircum, 0x010000a6, 0x010000f4 ] };
key <AE07> { [ 7, ampersand, 0x0100003e, 0x010000e1 ] };
key <AE08> { [ 8, asterisk, 0x010000a8, 0x010000f0 ] };
key <AE09> { [ 9, parenleft, 0x010000a9, 0x010000b9 ] };
key <AE10> { [ 0, parenright, 0x0100005e, 0x010000b0 ] };
key <AE11> { [ minus, underscore, 0x010000ab, 0x01000021 ] };
key <AE12> { [ equal, plus, 0x010000df, 0x010000ad ] };

key <AD01> { [ q, Q, 0x0100003f, 0x010000bf ] };
key <AD02> { [ w, W, 0x010000d7, Nosymbol ] };
key <AD03> { [ e, E, 0x010000c5, 0x010000e5 ] };
key <AD04> { [ r, R, 0x010000d2, Nosymbol ] };
key <AD05> { [ t, T, 0x0100007e, Nosymbol ] };
key <AD06> { [ y, Y, 0x010000d9, 0x010000b4 ] };
key <AD07> { [ u, U, 0x010000d5, Nosymbol ] };
key <AD08> { [ i, I, 0x010000c9, 0x010000e9 ] };
key <AD09> { [ o, O, 0x010000cf, 0x010000ef ] };
key <AD10> { [ p, P, 0x0100002a, 0x010000b3 ] };
key <AD11> { [ bracketleft, braceleft, 0x010000fb, 0x010000dd ] };
key <AD12> { [ bracketright, braceright, 0x010000fd, 0x010000db ] };

key <AC01> { [ a, A, 0x010000c1, Nosymbol ] };
key <AC02> { [ s, S, 0x010000d3, 0x010000be ] };
key <AC03> { [ d, D, 0x010000c4, Nosymbol ] };
key <AC04> { [ f, F, 0x0100005f, 0x010000bd ] };
key <AC05> { [ g, G, 0x010000c7, 0x010000e7 ] };
key <AC06> { [ h, H, 0x010000c8, 0x010000e8 ] };
key <AC07> { [ j, J, 0x010000ca, 0x010000ea ] };
key <AC08> { [ k, K, 0x01000027, Nosymbol ] };
key <AC09> { [ l, L, 0x010000cc, 0x010000ec ] };
key <AC10> { [ semicolon, colon, 0x010000db, 0x010000bc ] };
key <AC11> { [ apostrophe, quotedbl, 0x010000dd, 0x010000bb ] };

key <AB01> { [ z, Z, 0x010000da, 0x010000fa ] };
key <AB02> { [ x, X, 0x010000d8, Nosymbol ] };
key <AB03> { [ c, C, 0x010000c3, 0x010000e3 ] };
key <AB04> { [ v, V, 0x010000d6, Nosymbol ] };
key <AB05> { [ b, B, 0x010000c2, 0x010000e2 ] };
key <AB06> { [ n, N, 0x010000ce, 0x010000ee ] };
key <AB07> { [ m, M, 0x0100007c, 0x010000cd ] };
key <AB08> { [ comma, less, 0x010000ac, 0x0100003c ] };
key <AB09> { [ period, greater, 0x010000dc, 0x010000ae ] };
key <AB10> { [ slash, question, 0x010000af, 0x0100003f ] };

key <BKSL> { [ backslash, bar, 0x010000dc, 0x010000fc ] };
key <CAPS> { [ Caps_Lock ] };
// End alphanumeric section

include "level3(win_switch)"
include "level3(menu_switch)"
};

In fine.. you fire up an xterm.. issue a ‘setxkbmap apl’ command from
the shell prompt and you're in business.

I used it daily for about a month before I switched to APLX - aka micro
APL.. and I had zero problems.. so I suspect it is 100% A+
compatible.

Initially, I thought of writing a python wrapper that would handle
conversion from Unicode to A+'s peculiar brand of latin1 and back (among
other things) but never had the time.
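
As a starting point for such a wrapper: the 0x0100xxxx values in the map follow the X11 convention of writing a Unicode keysym as 0x01000000 plus the code point, so the low byte is presumably the Latin-1 code that A+ sees. A minimal sketch of decoding them; the function names are hypothetical, and no A+ glyph table is assumed:

```python
UNICODE_KEYSYM_BASE = 0x01000000  # X11 convention: keysym = base + Unicode code point

def keysym_to_char(keysym):
    """Return the Unicode character encoded by a 0x01xxxxxx keysym."""
    if keysym < UNICODE_KEYSYM_BASE:
        raise ValueError("not a Unicode keysym: %#x" % keysym)
    return chr(keysym - UNICODE_KEYSYM_BASE)

def keysym_to_latin1(keysym):
    """Return the single Latin-1 byte for that character."""
    return keysym_to_char(keysym).encode("latin-1")

# Level-3 symbols of the <TLDE> and <AE01> keys from the map above.
for keysym in (0x010000FE, 0x010000A1):
    print(hex(keysym), keysym_to_latin1(keysym))
```

Mapping those bytes to the APL glyphs that the A+ fonts draw for them would need a separate table, which is not attempted here.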

cj

Xah Lee

unread,
Mar 3, 2011, 7:29:08 AM3/3/11
to

a few tips i hope are helpful.

unicode has the complete set of APL chars.

APL ⌶ ⌷ ⌸ ⌹ ⌺ ⌻ ⌼ ⌽ ⌾ ⌿ ⍀ ⍁ ⍂ ⍃ ⍄ ⍅ ⍆ ⍇ ⍈ ⍉ ⍊ ⍋ ⍌ ⍍ ⍎ ⍏ ⍐ ⍑ ⍒ ⍓ ⍔ ⍕ ⍖
⍗ ⍘ ⍙ ⍚ ⍛ ⍜ ⍝ ⍞ ⍟ ⍠ ⍡ ⍢ ⍣ ⍤ ⍥ ⍦ ⍧ ⍨ ⍩ ⍪ ⍫ ⍬ ⍭ ⍮ ⍯ ⍰ ⍱ ⍲ ⍳ ⍴ ⍵ ⍶ ⍷ ⍸ ⍹
⍺ ⎕

and much more, of course.

see: 〈Computing Symbols in Unicode〉
http://xahlee.org/comp/unicode_computing_symbols.html

if you are a emacs user, you might try my

〈Emacs Unicode Math Symbols Input Mode (xmsi-mode)〉
http://xahlee.org/emacs/xmsi-math-symbols-input.html

or, you can create your own rather easily. There are a few ways, depending
on how you want the input to work.

you can insert unicode by abbrev, such as typing “alpha” auto expands
to α or “->” auto becomes →.

Sample code here:
〈Using Emacs's Abbrev Mode for Abbreviation〉
http://xahlee.org/emacs/emacs_abbrev_mode.html

or, you can setup systematic keys in emacs. e.g.
hold down Win key and any letter becomes the ones in APL keyboard.
Here's code example:

〈Emacs Custom Keybinding to Enhance Productivity〉
http://xahlee.org/emacs/emacs_useful_user_keybinding.html

if you are on OS X, you can also set up a system-wide config to enter a
completely custom-designed unicode layout. Super easy too (though there
are a few problems with certain key combos, because they are low-level
ones that Apple doesn't want changed). See:

〈Creating Keyboard Layout in Mac OS X〉
http://xahlee.org/emacs/osx_keybinding.html

all of the above ways let you directly use unicode symbols, so you
can view what you are defining in your keybinding source code.

Xah

Xah Lee

unread,
Mar 5, 2011, 4:32:25 AM3/5/11
to
On Mar 1, 3:40 pm, Chris Jones <cjns1...@gmail.com> wrote:

hi Chris,

i created a page dedicated to creating math symbol layouts for
different langs.
I linked to your post.

I wonder if you would let me mirror your X code on my site? Or, if you
place it somewhere more permanent or on a dedicated page such as git, i'd
link to that. Thanks.

Xah

Albert van der Horst

unread,
Mar 5, 2011, 7:43:20 AM3/5/11
to
In article <mailman.487.1298918...@python.org>,

Dotan Cohen <dotan...@gmail.com> wrote:
>You miss the canonical bad character reuse case: = vs ==.
>
>Had there been more meta keys, it might be nice to have a symbol for
>each key on the keyboard. I personally have experimented with putting
>the symbols as regular keys and the numbers as the Shifted versions.
>It's great for programming.

People might be interested in the colorforth solution:

This goes the other way: characters are limited (lowercase
and few special chars) to what is needed for programming.
So the fingers never need to leave the home position,
reaching about 30 chars at most.
Different uses (defining a function versus using a function)
are indicated by color, so don't use up char's.

http://www.colorforth.com

I was forced to use it (a development environment required it)
and it is not as bad as it sounds.

>--
>Dotan Cohen

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Robert Maas, http://tinyurl.com/uh3t

unread,
Mar 13, 2011, 4:52:24 AM3/13/11
to
> From: rantingrick <rantingr...@gmail.com>
> Anyone with half a brain understands the metric system is far
> superior (on many levels) than any of the other units of
> measurement.

Anyone with a *whole* brain can see that you are mistaken. The
current "metric" system has two serious flaws:

It's based on powers of ten rather than powers of two, creating a
disconnect between our communication with computers (in decimal)
and how computers deal with numbers internally (in binary). Hence
the confusion newbies have as to why if you type into the REP loop
(+ 1.1 2.2 3.3)
you get out
6.6000004
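
The same effect is easy to reproduce in Python. A minimal sketch: Python floats are double precision, so the digits differ from the 6.6000004 above (which looks like a single-precision result), but the cause is the same. The variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# A float literal is stored as the nearest binary fraction, not the decimal typed.
stored = Decimal(1.1)     # exact value of the double nearest to 1.1
typed = Decimal("1.1")    # the decimal number as written
print(stored == typed)

# The stored value is a ratio whose denominator is a power of two.
denominator = Fraction(1.1).denominator
is_power_of_two = denominator & (denominator - 1) == 0
print(is_power_of_two)

# A familiar consequence of the rounding:
print(0.1 + 0.2 == 0.3)
```

The mismatch comes from binary floating point, not from the units being measured; a base-2 unit system would not change how 1.1 is stored.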

The fundamental units are absurd national history artifacts such as
the French "metre" stick when maintained at a particular
temperature, and the Greenwich Observatory "second" as 1/(24*60*60)
of the time it took the Earth to rotate once relative to a
line-of-sight to the Sun under some circumstance long ago.

And now these have been more precisely defined as *exactly* some
inscrutable multiples of the wavelength and time-period of some
particular emission from some particular isotope under certain
particular conditions:
http://en.wikipedia.org/wiki/Metre#Standard_wavelength_of_krypton-86_emission
(that direct definition replaced by the following:)
http://en.wikipedia.org/wiki/Metre#Speed_of_light
"The metre is the length of the path travelled by light in vacuum
during a time interval of 1/299,792,458 of a second."
http://en.wikipedia.org/wiki/Second#Modern_measurements
"the duration of 9,192,631,770 periods of the radiation corresponding to
the transition between the two hyperfine levels of the ground state of
the caesium-133 atom"
Exercise to the reader: Combine those nine-decimal-digit and
ten-decimal-digit numbers appropriately to express exactly how many
wavelengths of the hyperfine transition equals one meter.
Hint: You either multiply or divide, hence if you just guess you
have one chance out of 3 of being correct.

Steven D'Aprano

unread,
Mar 13, 2011, 8:54:31 AM3/13/11
to
On Sun, 13 Mar 2011 00:52:24 -0800, Robert Maas, http://tinyurl.com/uh3t
wrote:

> Exercise to the reader: Combine those nine-decimal-digit and
> ten-decimal-digit numbers appropriately to express exactly how many
> wavelengths of the hyperfine transition equals one meter. Hint: You
> either multiply or divide, hence if you just guess you have one chance
> out of 3 of being correct.


Neither. The question is nonsense. The hyperfine transition doesn't have
a wavelength. It is the radiation emitted that has a wavelength. To work
out the wavelength of the radiation doesn't require guessing, and it's
not that complicated, it needs nothing more than basic maths.

Speed of light = 1 metre travelled in 1/299792458 of a second
If 9192631770 periods of the radiation takes 1 second, 1 period takes
1/9192631770 of a second.

Combine that with the formula for wavelength:
Wavelength = speed of light * period
= 299792458 m/s * 1/9192631770 s
= 0.03261225571749406 metre
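
The arithmetic above can be checked exactly with rational numbers. A short sketch, assuming Python 3.6 or later for the underscore literals; the constant names are illustrative. It also gives the figure the original exercise seems to be after, the number of wavelengths in one metre, which comes from dividing the frequency by the speed of light:

```python
from fractions import Fraction

SPEED_OF_LIGHT = 299_792_458        # metres per second (exact by definition)
CAESIUM_FREQUENCY = 9_192_631_770   # periods per second (exact by definition)

# wavelength = speed of light * period = c / f
wavelength = Fraction(SPEED_OF_LIGHT, CAESIUM_FREQUENCY)   # metres
wavelengths_per_metre = 1 / wavelength

print(float(wavelength))
print(float(wavelengths_per_metre))
```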


Your rant against the metric system is entertaining but silly. Any
measuring system requires exact definitions of units, otherwise people
will disagree on how many units a particular thing is. The imperial
system is a good example of this: when you say something is "15 miles",
do you mean UK statute miles, US miles, survey miles, international
miles, nautical miles, or something else? The US and the UK agree that a
mile is exactly 1,760 yards, but disagree on the size of a yard. And
let's not get started on fluid ounces (a measurement of volume!) or
gallons...

The metric system is defined to such a ridiculous level of precision
because we have the technology, and the need, to measure things to that
level of precision. Standards need to be based on something which is
universal and unchanging. Anybody anywhere in the world can (in
principle) determine their own standard one metre rule, or one second
timepiece, without arguments about which Roman soldier's paces defines a
yard, or which king's forearm is a cubit.


Follow-ups set to comp.lang.python.


--
Steven
