A proposal for Unicode variable and atom names in Erlang.
 
From: Chris Hicks <silent_vende...@hotmail.com>
Date: Thu, 25 Oct 2012 23:04:47 -0700
Local: Fri, Oct 26 2012 2:04 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

While I can understand the frustration that some people seem to fear, I can't help but think that there are solutions to these problems that seem to be ignored. For example, if the language has full Unicode support and can tell the difference between characters which look the same to us, but are obviously not the same to the machine, how hard would it be to create a parser for a source file, or input, which literally does nothing more than turn the code into strings of Unicode values? Don't want the entire file? How about just the functions, listed in order? How about just one function matching the input just passed in? There just seem to be so many ways that a Unicode-capable system could compensate for our human failings.
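A minimal sketch of the "turn the code into strings of the unicode value" idea, written in Python rather than Erlang purely for illustration. The helper name `code_points` is hypothetical, and the Cyrillic letter is just one example of a look-alike.

```python
def code_points(text):
    """Render each character of `text` as its U+XXXX code point."""
    return [f"U+{ord(ch):04X}" for ch in text]

# Latin X and Cyrillic Х (U+0425) look alike on screen but differ as code points.
print(code_points("X"))        # ['U+0058']
print(code_points("\u0425"))   # ['U+0425']
```

Run over the identifiers of a single function, a dump like this would make a stray look-alike character visible at a glance.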
Now, you can absolutely tell me that implementing a system that is aware and capable in that way is a royal pain in the ass, and I would have zero experience with Unicode to counter your argument. But if you could have your VM default to accepting only a specific encoding, and to throwing warnings and then stopping or continuing (depending on your needs) when compiling or running a module, then you could, with virtually no effort, make sure that your code at least only interacts with other code that you expect.
That one example is hardly a complete system but the point is this. If expanding the language respects the cultural wishes of peoples and laws of other countries without detracting from the experience of those people who don't want to be bothered with it then don't we at least need to give it a serious look? If it's technically impossible without seriously compromising the integrity of the language then sure, don't do it, but even if it's difficult but can be done without degrading the experience of others...why not?
Chris.

_______________________________________________
erlang-questions mailing list
erlang-questi...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions


 
From: Vlad Dumitrescu <vladd...@gmail.com>
Date: Fri, 26 Oct 2012 08:31:34 +0200
Local: Fri, Oct 26 2012 2:31 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.
<snip>100+ of posts about using Unicode (some mine)</snip>

Oh, so this is what the Maya predicted... _now_ I get it!

:-)

regards,
Vlad


 
From: Pierpaolo Bernardi <olopie...@gmail.com>
Date: Fri, 26 Oct 2012 10:12:31 +0200
Local: Fri, Oct 26 2012 4:12 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On Fri, Oct 26, 2012 at 5:32 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:

> On 22/10/2012, at 11:45 PM, Yurii Rashkovskii wrote:

>> Also, consider this: there are characters that look the same but encoded differently.

> You did read the part of the proposal that said to normalise?

I think Yurii meant cases like latin 'a' and cyrillic 'а', for example.

They are certainly a problem. But mixing scripts in this way can only
be done maliciously.  In normal cases it is clear from context which
script is being used.

Also, we already have a similar problem without leaving ASCII:  in
many fonts, I and l are indistinguishable...
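If mixed-script identifiers only arise maliciously, a tool could simply flag them. Here is a rough Python sketch (not Erlang, and not part of the proposal). The function `scripts_used` is a hypothetical name, and taking the first word of the Unicode character name is only a heuristic stand-in for the real Unicode Script property.

```python
import unicodedata

def scripts_used(identifier):
    """Return rough script labels (LATIN, CYRILLIC, ...) for the letters in a name."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            # Heuristic: the first word of the character name approximates its script.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

print(sorted(scripts_used("data")))        # ['LATIN']
print(sorted(scripts_used("d\u0430ta")))   # ['CYRILLIC', 'LATIN']  (second letter is Cyrillic а)
```

An identifier that yields more than one label is a candidate for a warning. A single-script identifier passes untouched.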

Пoкa
P.


 
From: Kunthar <kunt...@gmail.com>
Date: Fri, 26 Oct 2012 12:06:32 +0300
Local: Fri, Oct 26 2012 5:06 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.
> On 23/10/2012, at 4:41 AM, Henning Diedrich wrote:
>> But how many programmers do we really know who don't speak English?

I would like to kindly respond to Mr. Henning Diedrich:

Sen ne şekil bir denyosun canım kardeşim?

--
BR,
\|/ Kunthar


 
From: "Richard O'Keefe" <o...@cs.otago.ac.nz>
Date: Tue, 30 Oct 2012 16:49:03 +1300
Local: Mon, Oct 29 2012 11:49 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On 22/10/2012, at 6:08 PM, Yurii Rashkovskii wrote:

> Richard,

> Please excuse my ignorance, but can you name a single good reason for non-latin atoms and variable names?

There are literally billions of people on this planet
who are most comfortable reading and writing non-Latin scripts,
and many of them write programs.

> From my personal point of view, this is a sure road to hell.

Once Erlang accepted non-ASCII characters, it had already gone
at least half way down that road.  I note that IBM mainframe
programming languages like Fortran and PL/I supported DBCS
(double-byte character set) characters decades ago, and somehow,
hell completely failed to materialise.



 
From: "Richard O'Keefe" <o...@cs.otago.ac.nz>
Date: Tue, 30 Oct 2012 17:11:06 +1300
Local: Tues, Oct 30 2012 12:11 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On 22/10/2012, at 7:44 PM, Rustom Mody wrote:

> 1.
> Python made a choice to embrace unicode more thoroughly in going from python 2 to python 3.  This seems to have caused some grief in that 'ASCII' code that used to work in python 2 now often does not in python 3. Maybe this has nothing to do with Richard's EEP because that is about the string data structure this is about variable names. Still just mentioning.

Can you be more specific?  Each ASCII character has the same numeric value
in Unicode, and an ASCII string represented as UTF-8 is exactly the same
sequence of bytes.  I can't help wondering if "ASCII" here really means
some 8-bit character set rather than ASCII.
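The claim is easy to check. A small Python sketch, with sample strings of my own choosing: a pure-ASCII string encodes to identical bytes as ASCII and as UTF-8, whereas a string from an 8-bit set such as Latin-1 does not.

```python
text = "hello, world"
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
print(ascii_bytes == utf8_bytes)   # True: same byte sequence

# Each ASCII character also has the same numeric value in Unicode.
print(all(ord(ch) == b for ch, b in zip(text, utf8_bytes)))   # True

# An 8-bit (Latin-1) string is where the encodings diverge.
latin1_bytes = "caf\u00e9".encode("latin-1")   # é is one byte in Latin-1
utf8_cafe = "caf\u00e9".encode("utf-8")        # é is two bytes in UTF-8
print(len(latin1_bytes), len(utf8_cafe))       # 4 5
```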

> In all fairness (for Yurii's points) I should mention:
> 1. I was typing this on a windows box and could not see the characters until I switched to linux
> 2. Our computers may become completely, effortlessly unicode-capable someday, our keyboards will never. So to the extent that code is meant to be written, ASCII will always trump.  To the extent that it is to be read, a richer (within limits) character set has its attractions.

You are assuming that everyone who is using a keyboard is using a US keyboard.
That's not true.  For example, on a visit to Sweden, I was allowed to use my
host's computer to read my mail remotely, and my fingers kept tripping up
because it was a Swedish keyboard with lots of non-ASCII characters.
Heck, my wife has an iPad, and I have one on loan from the department, and
both of them have Greek keyboards installed, making it pretty much effortless
to type Greek, which I assure you is NOT ASCII.  It's just a matter of touching
the globe symbol and flicking over to the other keyboard.  This is old technology.
The Xerox D-machines had fast-switch virtual keyboards back in the 1980s.
It takes two mouse movements to switch from a US keyboard to a Greek one on my
desktop Mac (or to a Hebrew one or a Russian one or ...).

RIGHT NOW, our keyboards ARE completely, effortlessly non-Latin-1 capable.

Nobody is suggesting that any one programmer will want to use all 100,000+
Unicode characters in the same document.  What is suggested is that some
programmers, who can effortlessly type Russian on their Russian keyboard or
Gujarati on the Gujarati keyboard -- both of which Windows supports -- and
see that on their screen, should be able to do so.

I cannot for the life of me understand why, at this late date, anyone should
for an instant suppose that only ASCII can be easily typed.

As it happens, for my national needs, the Mac _does_ have Māori keyboard
support.  It's two mouse movements to switch from US keyboard to Māori one,
and then getting a vowel with a macron is just a matter of pressing the
Option key while typing the vowel.  A Māori student would have little reason
ever to switch over to the US keyboard.  I can certainly type words like
kurī and kīrehe and Ākarana without taking my fingers from the keyboard.

The idea that "ASCII will always trump" on account of being easier to type
deserves some kind of award for wrongness.



 
From: "Richard O'Keefe" <o...@cs.otago.ac.nz>
Date: Tue, 30 Oct 2012 17:18:19 +1300
Local: Tues, Oct 30 2012 12:18 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On 22/10/2012, at 8:33 PM, Michael Uvarov wrote:

> What is the problem about unicode variables is that some characters
> are not equal: Х != X, but they look the same.

This would be a persuasive argument IF
(a) we did not already allow both XO and X0, Xl and X1, and so on;
(b) mixed scripts in a single token were plausible.
Neither is the case.

> Other problem about unicode is that a lot of algorithms are
> locale-based and difficult (a lot of rules and exceptions).

None of those algorithms applies to the current topic,
except for normalisation, which is not locale-based.

> Even non-locale based (unified and simple version of to_lower) contains this:
> - Contains additional case mappings that map to more than one
> character, such as "ß" to "SS".

That already applies to Latin-1, which Erlang supports RIGHT NOW.
(Nit-pick: that's an example of to_upper.)

> - Characters may have case mappings that depend on the locale.
>  For example, in Turkish the letter U+0049 "I" capital letter i
> lowercases to U+0131 "ı" small dotless i.

Indeed.  But since neither variable names nor unquoted atoms are
subjected to any kind of case mapping by the Erlang parser, how
is that relevant _here_?

You're mainly talking about problems with Unicode *data*, and
we don't have any option about dealing with those.
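For reference, the case mappings being discussed look like this in Python, whose `str.upper` and `str.lower` are locale-independent. This is an illustration of the *data* problem only; nothing here is applied to variable names or atoms by a parser.

```python
sharp_s_upper = "\u00df".upper()            # ß: one character maps to two
capital_i_lower = "I".lower()               # default mapping, not the Turkish one
dotted_capital_i_lower = "\u0130".lower()   # İ: lowercases to i + combining dot above

print(sharp_s_upper)                 # SS
print(capital_i_lower)               # i
print(len(dotted_capital_i_lower))   # 2
```

A Turkish-aware lowercase of "I" would give U+0131 instead, which is exactly the kind of locale rule that matters for text processing and not for lexing identifiers.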



 
From: "Richard O'Keefe" <o...@cs.otago.ac.nz>
Date: Tue, 30 Oct 2012 17:33:04 +1300
Local: Tues, Oct 30 2012 12:33 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On 22/10/2012, at 8:48 PM, Michael Uvarov wrote:

> There is no such thing as "language" in Unicode.

Actually, that's not quite true.  Unicode *does* include so-called
"language tags", so it is perfectly possible to mark up sections
of text with the language they are supposed to be in, all in straight
Unicode.

> "language" is a locale.

No, locale is more specific than that.  A locale is a script, a language,
a set of cultural conventions for writing numbers and money and dates,
and so on.  An "English" phone book and an "English" dictionary would
use different locales, because they use different rules for sorting.

> Locale-based algorithms are difficult and each
> character can have different meaning for each locale.

Locale-based algorithms are difficult, true.

Give one example of a character that has a different meaning
in two locales.  OK, characters stand for different *sounds*
in different languages, but there is no case I can find in Unicode
where the class a character belongs to depends on the locale.

> There are a lot of cases, when I even cannot say which case a variable
> is in.

Tell me just ONE.  Hint: there aren't _any_ such cases.
Each defined Unicode character has one and only one class, and that
class is not in any way locale- or context-dependent.

> How I will detect is it a variable or an atom?

The proposal you are claiming to comment on gives a precise,
unambiguous, and natural way to do so, which is consistent with
other programming languages making a case distinction.

> Here is an example:
> I want to write a module in Turkish, then the  "length" id will be a
> variable, not a function.

What on earth are you talking about?  Lower case l is a lower case
letter, whether you're writing English, Turkish, or Old High Martian.
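A sketch of why the variable/atom decision needs no locale, in Python for illustration. The rule below (leading underscore, or a first character in general category Lu or Lt, means "variable") is a simplified assumption and not the exact rule set of the proposal; `looks_like_variable` is a hypothetical name.

```python
import unicodedata

def looks_like_variable(name):
    """Simplified rule: a name is variable-like if it starts with '_' or an
    uppercase/titlecase letter (general category Lu or Lt)."""
    if not name:
        return False
    first = name[0]
    return first == "_" or unicodedata.category(first) in ("Lu", "Lt")

print(unicodedata.category("l"))           # Ll, in every locale
print(looks_like_variable("length"))       # False: starts with lower case l
print(looks_like_variable("Iength"))       # True: starts with capital I
print(looks_like_variable("\u0131s\u0131"))  # False: dotless ı is Ll
print(looks_like_variable("\u0130sim"))      # True: dotted İ is Lu
```

The general category comes from the Unicode character database, so the answer is the same whether the author is writing English or Turkish.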

> Using code, written in few languages will be a hell.

WE ALREADY HAVE THAT POSSIBILITY RIGHT NOW.

You could, *right now*, have a module containing words from a
couple of dozen languages.  Imagine a mixture of English,
Swedish, Irish, Klingon, and Latino Sine Flexione.

Guess what?  IT DOESN'T HAPPEN!

At most we get a mixture of English and one other language.



 
From: Rustom Mody <rustompm...@gmail.com>
Date: Sat, 3 Nov 2012 15:48:27 +0530
Local: Sat, Nov 3 2012 6:18 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On Tue, Oct 30, 2012 at 12:24 PM, Stephen Hansen
<me+list/erl...@ixokai.io>wrote:

I was not referring to the semantic incompatibilities introduced in going
from python 2 to 3.
I was referring to the (claims that) python 3 is slower than 2,
as for example here:
http://mail.python.org/pipermail/python-list/2012-August/629317.html (and
whole thread)

Can these problems be addressed? Of course.
Are they directly related to this EEP? Probably not...
I was just mentioning them so that Erlang can learn from python's mistakes.

Basically python has chosen a 'flexible string representation'
http://www.python.org/dev/peps/pep-0393/
which does the magic of using only 1 byte for ascii, 2 for bmp and 4 for
the rest (Unicode 2.0 onwards).
In the process however (of detecting the optimal char-width) some inner
loops seem to have got less efficient (my guess; don't know for sure).
So python has traded time for space.
A command-line option to choose string-engine at start time could solve
this problem.
[Though in a world where one erlang node talking to another is a very
normal usecase, this could cause its own challenges]

Also 32 bits for 'wide' unicode is wasteful, given that the number of
unicode codepoints is 1114112.
1114112 = 17*2^16 < 32*2^16 = 2^21 < 2^24 < 2^32
IOW an acceptable width could be 3 bytes and at 21 bits one could even pack
3 chars into 64 bits
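The arithmetic above can be checked with a short Python sketch. The helpers `pack3` and `unpack3` are hypothetical names showing one possible way to fit three 21-bit code points into a 64-bit word; this is not how any existing runtime stores strings.

```python
MAX_CODE_POINTS = 0x10FFFF + 1   # 1114112 code points, i.e. 17 planes of 2**16
BITS = 21                        # enough bits for any code point
MASK = (1 << BITS) - 1

def pack3(a, b, c):
    """Pack three characters into one integer of at most 63 bits."""
    return (ord(a) << (2 * BITS)) | (ord(b) << BITS) | ord(c)

def unpack3(word):
    """Recover the three characters from a packed integer."""
    return "".join(chr((word >> shift) & MASK) for shift in (2 * BITS, BITS, 0))

print(MAX_CODE_POINTS == 17 * 2**16)   # True
print((0x10FFFF).bit_length())         # 21

word = pack3("a", "\u0416", "\U0001F600")
print(word.bit_length() <= 64)         # True
print(unpack3(word) == "a\u0416\U0001F600")   # True
```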



 
From: "Richard O'Keefe" <o...@cs.otago.ac.nz>
Date: Mon, 5 Nov 2012 11:34:30 +1300
Local: Sun, Nov 4 2012 5:34 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On 3/11/2012, at 11:18 PM, Rustom Mody wrote:

> I was referring to the (claims that) python 3 is slower than 2
> as for example here: http://mail.python.org/pipermail/python-list/2012-August/629317.html (and whole thread)

> Can these problems be addressed? Of course.
> Are they directly related to this EEP? Probably not...

Certainly not.  EEP 40 is about the *lexical structure* of
Erlang *variables* and *unquoted atoms*.  This is something
that happens at compile time.  The speed of the Erlang *compiler*
will almost certainly be affected by the adoption of Unicode.
The speed of the run-time will be affected to the extent that
atom_to_list and list_to_atom will need changing to allow
Unicode characters in atom names, but that's going to happen
anyway; all EEP 40 has to say about that is which ones don't
need quotation marks.



 
From: Steve Davis <steven.charles.da...@gmail.com>
Date: Sun, 4 Nov 2012 16:57:38 -0800 (PST)
Local: Sun, Nov 4 2012 7:57 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

I'm personally looking forward to attempting to maintain open source kanji.
An awesome challenge.



 
From: Toby Thain <t...@telegraphics.com.au>
Date: Sun, 04 Nov 2012 20:02:37 -0500
Local: Sun, Nov 4 2012 8:02 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.
On 04/11/12 7:57 PM, Steve Davis wrote:

> I'm personally looking forward to attempting to maintain open source
> kanji. An awesome challenge.

Is it that a lot of people on this thread don't read ROK's posts? Or is
there another explanation for what just looks like wilful obtuseness?

--T


 
From: Henning Diedrich <hd2...@eonblast.com>
Date: Mon, 5 Nov 2012 04:42:45 +0100
Local: Sun, Nov 4 2012 10:42 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On Oct 30, 2012, at 5:33 AM, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:

>> Here is an example:
>> I want to write a module in Turkish, then the  "length" id will be a
>> variable, not a function.

> What on earth are you talking about?  Lower case l is a lower case
> letter, whether you're writing English, Turkish, or Old High Martian.

My point, and maybe Michael's in a way, too, was this:

1> Iength = length.
length
2> Ienght.
* 1: variable 'Ienght' is unbound
3> length = Iength.
length
4> Iength.
length

Fun factor depends on font you're using.

"Yes, exactly" to your second and fifth thought.

I think it matters to minimize the number of things to watch out for.

Henning


 
From: "Richard O'Keefe" <o...@cs.otago.ac.nz>
Date: Mon, 5 Nov 2012 18:49:29 +1300
Local: Mon, Nov 5 2012 12:49 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On 5/11/2012, at 4:42 PM, Henning Diedrich wrote:

That's because the last two letters were swapped.  There's nothing
here to do with Turkish.  (For that matter, while Turkish has an
extra dotted capital I İ and an extra dotless small i ı, it uses
the same dotless capital I that we do, it's just the capital of a
dotless small i.)

> 3> length = Iength.
> length
> 4> Iength.
> length

> Fun factor depends on font you're using.

To quote a Pogo strip, "you have the wrong mistake".

I am reminded of a burglar indignantly protesting his
innocence:  "I didn't rob *THAT* house" (but don't ask
me about the one next door).

We *already* have confusable characters in Latin-1:
i/l/1, o/O/0 -- I'm seeing a slashed zero here and
very much wish I weren't because that's not how I was taught
to write a zero -- 2/Z, s/5, and if you had to read the handwriting
I'm reading during marking, you'd wonder if there were _any_ two
characters that couldn't be confused.  (There was a time when
Australian school-children were _taught_ to write unclosed
small "p" letters so they looked like long-tailed "r".  Why?)
So our burglar is saying "I don't have THOSE [Unicode] confusable
characters" (just don't ask me about all the others I do have).

If you are talking about the confusability of characters,
you could bring in CAPITAL A WITH RING ABOVE and ANGSTROM SIGN,
or for that matter the already mentioned Latin capital A,
Cyrillic capital A, and Greek capital alpha, all of which look
exactly the same.
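These two kinds of confusable behave differently under normalisation, which a Python sketch can show. NFC is used here as an assumption; this part of the thread does not say which normalisation form is meant.

```python
import unicodedata

angstrom_sign = "\u212b"   # ANGSTROM SIGN
a_with_ring = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE

# Normalisation folds the Angstrom sign into A-with-ring...
nfc_angstrom = unicodedata.normalize("NFC", angstrom_sign)
print(nfc_angstrom == a_with_ring)   # True

# ...but Latin A, Cyrillic А and Greek Α stay three distinct characters.
look_alikes = ["A", "\u0410", "\u0391"]
normalised = {unicodedata.normalize("NFC", ch) for ch in look_alikes}
print(len(normalised))               # 3
```

So normalisation removes the first kind of duplicate entirely, and the cross-script look-alikes remain, exactly like I/l/1 within Latin-1.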

If we once allow any kind of vaguely stringly-like thing to
include Unicode characters, we are *going* to have the problem
of confusable letters in data.  You could restrict identifiers
to be sequences of a/A characters and we'd still have the
problem in data.

Of all places, the very topmost *safest* place to have the
problem is in Erlang variable names, because of the singleton
style check.  The next safest is probably in function names.
These are places where the compiler will _tell_ us if things
do not match up.

Suppose someone writes

Ο_Φόβος = ο_φονιάς(του_μυαλού)

Yes, the Ο and ο will look like an O and an o, so someone
_could_ trick you.  But they won't be TRYING to.  And if they
_do_ type too much with the wrong keyboard set (as I did while
typing this!) the compiler will tell them.

And all this cowering in fear at the very time that we're seeing
more and more type checking in Erlang, checking that would quite
certainly catch such mistakes very well.  Makes you wonder about
people, really it does.

[The example is as close as I could get to 'Fear is the mind-killer'.]



 
From: Henning Diedrich <hd2...@eonblast.com>
Date: Mon, 5 Nov 2012 12:23:06 +0100
Local: Mon, Nov 5 2012 6:23 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On Nov 5, 2012, at 6:49 AM, "Richard O'Keefe" <o...@cs.otago.ac.nz> wrote:

> To quote a Pogo strip, "you have the wrong mistake".

That was the very point. In this instance I can do without more choice.

Henning


 
From: Steve Davis <steven.charles.da...@gmail.com>
Date: Mon, 5 Nov 2012 08:28:51 -0600
Local: Mon, Nov 5 2012 9:28 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.
It just seemed to be a short way to encapsulate the issues I see with the proposal. There's no doubt that ROK's posts were far more detailed, but a casual reader may miss the point. I have no doubt that if a coder can write in their native language then they would choose to do that more often than not. There's also no real reason that module or function names should not be "unicoded"... so the intent of the entire source could be natural-language encoded, which would balkanize the codebase. I'm not sure what the solution is, but is a gradual move towards introducing the ability to express source in natural language a solution to this problem? I'm not at all convinced of that.

On Nov 4, 2012, at 7:02 PM, Toby Thain <t...@telegraphics.com.au> wrote:


 
From: Jesper Louis Andersen <jesper.louis.ander...@erlang-solutions.com>
Date: Mon, 5 Nov 2012 17:41:35 +0100
Local: Mon, Nov 5 2012 11:41 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On Nov 5, 2012, at 12:23 PM, Henning Diedrich <hd2...@eonblast.com> wrote:

> On Nov 5, 2012, at 6:49 AM, "Richard O'Keefe" <o...@cs.otago.ac.nz> wrote:

>> To quote a Pogo strip, "you have the wrong mistake".

> That was the very point. In the instance I can do without more choice.

I think most of these problems about confusion are largely a non-issue. Given languages which actually allow you to write full, unnormalized Unicode (Google Go comes to mind), I see very few such actual problems in programs.

We already have these kinds of problems: tabs and spaces are not distinguishable. Neither is trailing white space. There are characters which are hard to recognize, and some fonts make it a priority to tell them apart, like Richard said.

What I think is the key point is that I may be able to express certain things better with a larger symbol table. Yes, this also means I can obfuscate more, but I honestly only need the Erlang Pre-processor to win that battle of obfuscation.

Jesper Louis Andersen
  Erlang Solutions Ltd., Copenhagen



 
From: Steve Davis <steven.charles.da...@gmail.com>
Date: Mon, 5 Nov 2012 18:35:29 -0600
Local: Mon, Nov 5 2012 7:35 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.
OK, so I can't resist this example:

Suppose the author is writing in a natural language where the *exact same unicode characters* have entirely different semantics?

Map = ...

In Dutch, "Map" translates to "Folder" for an English speaker -- but the kick is that the Dutch also happen to be amazing English speakers - so it could mean what you expect a map to be or not. So the naming in the source means precisely nothing and does not help you (no matter how much post-processing you may choose to apply).

I have enough of a hard time with computer languages without having to know over 200 natural languages to boot.

Is the right decision, perhaps, to say that we need to agree on just one natural language for source - since that means you need to learn at most two languages? (And, also, did that natural language decision not happen already in every major computer system?)

If you think it's a good idea to change that status quo, then please let me know which natural language to use (yes, even if the choice were not a natural language that I currently know), just so I have a limit on where I need to educate myself. I have enough issues with encodings without being asked to learn every natural language in existence.

/s

On Nov 5, 2012, at 8:28 AM, Steve Davis <steven.charles.da...@gmail.com> wrote:

> It just seemed to be a short way to encapsulate the issues I see with the proposal. There's no doubt that ROK's posts were far more detailed, but a casual reader may miss the point. I have no doubt that if a coder can write in their native language then they would choose to do that more often than not. There's also no real reason that module or function names should not be "unicoded"... so the intent of the entire source could be natural-language encoded, which would balkanize the codebase. I'm not sure what the solution is, but is a gradual move towards introducing the ability to express source in natural language a solution to this problem? I'm not at all convinced of that.

> On Nov 4, 2012, at 7:02 PM, Toby Thain <t...@telegraphics.com.au> wrote:

>> On 04/11/12 7:57 PM, Steve Davis wrote:
>>> I'm personally looking forward to attempting to maintain open source
>>> kanji. An awesome challenge.

>> Is it that a lot of people on this thread don't read ROK's posts? Or is there another explanation for what just looks like wilful obtuseness?

>> --T

_______________________________________________
erlang-questions mailing list
erlang-questi...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

 
Loïc Hoguin  
From: Loïc Hoguin <es...@ninenines.eu>
Date: Tue, 06 Nov 2012 02:27:02 +0100
Local: Mon, Nov 5 2012 8:27 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.
On 11/06/2012 01:35 AM, Steve Davis wrote:

> I have enough of a hard time with computer languages without having to know over 200 natural languages to boot.

Please, don't be ridiculous, you'll never encounter 200 different
natural languages in your life, code or not. It therefore does not
matter if people write code in a language that you can't understand.
You'll never have access to it anyway! Seen any French Erlang code yet?
No? Then look harder.

I don't see why you guys treat letting more people use their own
language through Unicode as if it were something bad. Many languages can
already be used in Erlang with latin1, and they certainly are. We just
want to extend that to languages that require Unicode for writing.

Last I heard, Erlang models the real world. The world is concurrent. The
world is also multilingual. The world has many different writing
systems. Why would you want to prevent Erlang from catering to the
billions of people who don't use English?

--
Loïc Hoguin
Erlang Cowboy
Nine Nines
http://ninenines.eu


 
Richard O'Keefe  
From: "Richard O'Keefe" <o...@cs.otago.ac.nz>
Date: Tue, 6 Nov 2012 16:24:45 +1300
Local: Mon, Nov 5 2012 10:24 pm
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.

On 6/11/2012, at 1:35 PM, Steve Davis wrote:

> Suppose the author is writing in a natural language where the *exact same unicode characters* have entirely different semantics?

There's a science fiction story (sorry, I forget the title and author)
where one gimmick is the ambiguity of "Pet Shop".

> I have enough of a hard time with computer languages without having to know over 200 natural languages to boot.

> Is the right decision, perhaps, to say that we need to agree on just one natural language for source

No.  It is that each *exchange* needs to involve an agreed language.

When I was at Quintus, we had a company in Israel develop some graphics
software for us.  (Good software too, but for unrelated reasons we never
shipped it that I know of.)  You say in the contract that the documentation
will be in English (although several of our people could read Hebrew) and
you say that the code and comments will be in English too.

What Unicode makes possible is a contract where a company in Israel asks
a company in the US to provide documentation and code in Hebrew, and
there is no technical barrier to them doing it.  It also lets the
Israelis write scaffolding code in Hebrew if they want to.

We do not need "One Ring to rule them all and in the darkness bind them".
English for everything would suit me fine, if it _was_ English, and not
American (:-).

> - since that means you need to learn at most two languages? (And, also, did that natural language decision not happen already in every major computer system?)

Every major computer system has been busy unmaking that decision for
decades.

> If you think it's a good idea to change that status quo, then please let me know which natural language to use (yes, even if the choice were not a natural language that I currently know), just so I have a limit on where I need to educate myself. I have enough issues with encodings without being asked to learn every natural language in existence.

Nobody is asking you to do that.
For one thing, there are about six or seven thousand natural languages
in existence.  Unicode covers dozens of _scripts_ that I've never heard
of.  Heck, it includes scripts that nobody in the whole world can _read_.
(Unless you believe that the author of 'Code Breaker' got it right, and
I thought he was pretty convincing.)  Yes, I do mean U+101D0 to U+101FD,
the PHAISTOS DISC SIGN ... characters.

We are *not* talking about something new here.
As I keep pointing out, *nothing* stops people writing Erlang
in Klingon.  They don't even have to leave ASCII for that.
It's just that _if_ they do, they have to take the consequences of
nearly everyone else being unable to read it.
Nobody has forced you to learn Klingon just because it's possible
to write Erlang in Klingon, have they?

Or let's take a real example.  Erlang currently uses Latin-1.
Latin-1 lets you write Icelandic.  Has anybody been dumping Icelandic
Erlang on your desk, _expecting you to read it_?
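
To make the Latin-1 point concrete, here is a minimal sketch (in Python rather than Erlang, purely for illustration). A name can appear in a Latin-1 source file only if every one of its characters is encodable in Latin-1. The helper name `fits_latin1` and the sample words are assumptions made up for this example, not anything from the Erlang toolchain.

```python
def fits_latin1(name):
    """Return True if every character of `name` can be encoded in Latin-1."""
    try:
        name.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Sample words are illustrative only, not taken from any real code base.
samples = {
    "Icelandic": "þjóð",                  # thorn, o-acute and eth are all in Latin-1
    "French": "été",                      # e-acute is in Latin-1
    "Hebrew": "\u05e9\u05dc\u05d5\u05dd",  # Hebrew letters lie outside Latin-1
    "Greek": "λόγος",                     # Greek letters lie outside Latin-1
}

results = {language: fits_latin1(word) for language, word in samples.items()}
for language, ok in results.items():
    print(f"{language}: {'fits Latin-1' if ok else 'needs Unicode'}")
```

The Icelandic and French samples pass, which is the sense in which non-English source has been possible all along; the Hebrew and Greek samples are the ones that need Unicode.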

Unicode introduces the problem that Erlang code might be written in
a *script* that you cannot read.  But the problem that it might be
in a *language* you cannot make head or tail of has been with us for
a long time, and the sky has not fallen.
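
The script-versus-language distinction can be sketched in a few lines of Python (again only an illustration, not Erlang tooling). The function `scripts_in` is a hypothetical helper, and it uses a heuristic: it takes the first word of each letter's Unicode character name as a stand-in for its script, because the Python standard library does not expose the official Unicode Script property.

```python
import unicodedata


def scripts_in(identifier):
    """Approximate the set of scripts used by the letters of `identifier`.

    Heuristic only: uses the first word of each letter's Unicode character
    name (e.g. "LATIN", "HEBREW"), not the official Script property.
    Non-letters such as digits and underscores are ignored.
    """
    return {
        unicodedata.name(ch).split()[0]
        for ch in identifier
        if ch.isalpha()
    }


print(scripts_in("Map"))                       # same letters for a Dutch or English reader
print(scripts_in("þjóð"))                      # Icelandic: unfamiliar language, familiar script
print(scripts_in("\u05e9\u05dc\u05d5\u05dd"))  # Hebrew: a different script altogether
```

The first two calls report only a Latin script even though the languages differ, so a script check says nothing about whether a reader understands the words. Only the third case is new with Unicode.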



 
Steve Davis  
From: Steve Davis <steven.charles.da...@gmail.com>
Date: Tue, 6 Nov 2012 07:19:39 -0600
Local: Tues, Nov 6 2012 8:19 am
Subject: Re: [erlang-questions] A proposal for Unicode variable and atom names in Erlang.
On Nov 5, 2012, at 9:24 PM, "Richard O'Keefe" <o...@cs.otago.ac.nz> wrote:

> Unicode introduces the problem that Erlang code might be written in
> a *script* that you cannot read.  But the problem that it might be
> in a *language* you cannot make head or tail of has been with us for
> a long time, and the sky has not fallen.

I have to admit this to be true. So maybe it's not such a problematic issue, after all.

Regards,
/s


 