Annoying octal notation

David

Aug 20, 2009, 3:06:43 PM

Hi all,

Is there some magic to make the 2.x CPython interpreter ignore the
annoying octal notation?
I'd really like 012 to be "12" and not "10".

If I want an octal I'll use oct()!

"Explicit is better than implicit..."

TIA
David

Johannes Bauer

Aug 20, 2009, 4:24:24 PM

David wrote:

> If I want an octal I'll use oct()!
>
> "Explicit is better than implicit..."

A leading "0" *is* explicit.

Implicit would be when some functions would interpret a "0" prefix as
octal and others wouldn't.

Regards,
Johannes

--
"My countersuit against you will then be for deliberate mendacity,
slander of God, the Bible and me, and deliberate blasphemy."
-- Prophet and visionary Hans Joss aka HJP in de.sci.physik
<48d8bf1d$0$7510$5402...@news.sunrise.ch>

Simon Forman

Aug 20, 2009, 4:37:33 PM

On Aug 20, 3:06 pm, David <71da...@libero.it> wrote:
> Hi all,
>
> Is there some magic to make the 2.x CPython interpreter to ignore the
> annoying octal notation?

No. You would have to modify and recompile the interpreter. This is
not exactly trivial, see "How to Change Python's Grammar"
http://www.python.org/dev/peps/pep-0306/

However, see "Integer Literal Support and Syntax" http://www.python.org/dev/peps/pep-3127/

(Basically in 2.6 and onwards you can use 0oNNN notation.)

> I'd really like 012 to be "12" and not "10".
>
> If I want an octal I'll use oct()!

But that gives you a string, you're asking about literals.
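To illustrate the distinction drawn here: oct() formats an existing integer as text, while the 0oNNN form from PEP 3127 is an integer literal. A minimal sketch, assuming Python 3 (where oct() itself emits the 0o prefix); the variable names are only for illustration.

```python
# Sketch, assuming Python 3: oct() returns text, a 0o literal is an int.
as_text = oct(10)      # a string that describes the number in octal
as_literal = 0o12      # an integer written in octal notation

print(type(as_text).__name__, as_text)        # str 0o12
print(type(as_literal).__name__, as_literal)  # int 10
```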

Mensanator

Aug 20, 2009, 6:18:35 PM

On Aug 20, 2:06 pm, David <71da...@libero.it> wrote:
> Hi all,
>
> Is there some magic to make the 2.x CPython interpreter to ignore the
> annoying octal notation?
> I'd really like 012 to be "12" and not "10".

Use 3.1:

>>> int('012')
12

(Just kidding! That works in 2.5 also. How are you using it where
it's coming out wrong? I can see you pulling '012' out of a text
file and want to calculate with it, but how would you use a
string without using int()? Passing it to functions that allow
string representations of numbers?)
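A short sketch of how int() treats a zero-padded string, assuming Python 3. The base-0 case is included because it asks int() to follow the source-literal rules, which is where the leading zero starts to matter; the names are illustrative only.

```python
# Sketch, assuming Python 3: parsing a zero-padded string with int().
padded = "012"

decimal_value = int(padded)      # default base 10, leading zeros are ignored
octal_value = int(padded, 8)     # the caller asks for base 8 explicitly

# Base 0 means "parse like a source literal", so Python 3 rejects "012".
try:
    int(padded, 0)
    literal_style_ok = True
except ValueError:
    literal_style_ok = False

print(decimal_value, octal_value, literal_style_ok)
```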

James Harris

Aug 20, 2009, 7:59:14 PM

On 20 Aug, 20:06, David <71da...@libero.it> wrote:

> Hi all,
>
> Is there some magic to make the 2.x CPython interpreter to ignore the
> annoying octal notation?
> I'd really like 012 to be "12" and not "10".

This is (IMHO) a sad hangover from C (which took it from B but not
from BCPL which used #<octal> and #x<hex>) and it appears in many
places. It sounds like you want to use leading zeroes in literals -
perhaps for spacing. I don't think there's an easy way. You just have
to be aware of it.

Note that it seems to apply to integers and not floating point
literals

>>> 012
10
>>> int("012")
12
>>> 012.5
12.5
>>>
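The same observation about floating point can be checked in Python 3, where a leading zero is still accepted in a float literal even though it is rejected in an integer literal. A minimal sketch; the literal is evaluated from text only to make explicit which spelling is being tested.

```python
# Sketch, assuming Python 3: leading zeros remain legal in float literals.
source = "012.5"
value = eval(source)          # evaluated from text to show the exact spelling
from_string = float(source)   # float() is equally tolerant of the padding

print(value, from_string)
```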

This daft notation is recognised in some surprising places to catch
the unwary. For example, the place I first came across it was in a
windows command prompt:

s:\>ping 192.168.1.012
Pinging 192.168.1.10 with 32 bytes of data:
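The ping result above is what one would expect from a parser that applies the C number convention to each part of the address. The sketch below only emulates that convention; it is not the code used by the Windows ping command, and both helper names are hypothetical.

```python
def parse_c_style(text):
    """Parse a number using the classic C prefixes (hypothetical helper)."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)      # 0x prefix: hexadecimal
    if len(text) > 1 and text.startswith("0"):
        return int(text[1:], 8)          # bare leading zero: octal
    return int(text, 10)                 # anything else: decimal


def normalize_dotted_quad(address):
    """Re-render each dotted part in decimal after C-style parsing."""
    return ".".join(str(parse_c_style(part)) for part in address.split("."))


print(normalize_dotted_quad("192.168.1.012"))  # 192.168.1.10
```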

On B's use of the leading zero see

http://cm.bell-labs.com/cm/cs/who/dmr/kbman.html

and note the comment: "An octal constant is the same as a decimal
constant except that it begins with a zero. It is then interpreted in
base 8. Note that 09 (base 8) is legal and equal to 011."
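The remark that 09 is legal and equal to 011 follows if the digits are simply accumulated positionally in base 8 without checking that each digit is below 8. The sketch below assumes that reading of the quoted sentence; the manual states the result, not the algorithm, and the function name is made up.

```python
def b_style_octal(text):
    """Accumulate decimal digits in base 8 without range checks (assumed reading)."""
    value = 0
    for ch in text:
        value = value * 8 + int(ch)   # the digits 8 and 9 are not rejected
    return value


print(b_style_octal("09"), b_style_octal("011"))  # 9 9
```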

It maybe made sense once but this relic of the past should have been
consigned to the waste bin of history long ago.

James

David

Aug 21, 2009, 12:35:07 PM

On Thu, 20 Aug 2009 22:24:24 +0200, Johannes Bauer wrote:

> David wrote:
>
>> If I want an octal I'll use oct()!
>>
>> "Explicit is better than implicit..."
>
> A leading "0" *is* explicit.

It isn't explicit enough, at least IMO.

regards
David

David

Aug 21, 2009, 12:40:55 PM

On Thu, 20 Aug 2009 15:18:35 -0700 (PDT), Mensanator wrote:

> (Just kidding! That works in 2.5 also. How are you using it where
> it's coming out wrong? I can see you pulling '012' out of a text
> file and want to calculate with it, but how would you use a
> string without using int()? Passing it to functions that allow
> string representations of numbers?)

Obviously it's not a programming issue, just a hassle when using the
interpreter on the command line.

David

Aug 21, 2009, 12:46:09 PM

On Thu, 20 Aug 2009 16:59:14 -0700 (PDT), James Harris wrote:

>
> It maybe made sense once but this relic of the past should have been
> consigned to the waste bin of history long ago.

I perfectly agree with you!

David.

MRAB

Aug 21, 2009, 12:58:25 PM

to pytho...@python.org

Is this better?

Python 3.1 (r31:73574, Jun 26 2009, 20:21:35) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 010
  File "<stdin>", line 1
    010
      ^

I would've preferred it to be decimal unless there's a prefix:

Python 3.1 (r31:73574, Jun 26 2009, 20:21:35) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 0x10
16
>>> 0o10
8

Ah well, something for Python 4. :-)
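The behaviour shown in these sessions can be checked without an interactive prompt by compiling the literals from text. A minimal sketch, assuming Python 3; `literal_compiles` is a hypothetical helper, not a standard function.

```python
def literal_compiles(source):
    """Return True if `source` is accepted as an expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Old-style octal is rejected; prefixed forms and plain zeros are accepted.
results = {text: literal_compiles(text) for text in ("010", "0o10", "0x10", "0", "00")}
for text, ok in results.items():
    print(text, ok)
```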

Mensanator

Aug 21, 2009, 1:36:35 PM

Aha! Then I WAS right after all. Switch to 3.1 and you'll
soon be cured of that bad habit:

>>> 012 + 012
SyntaxError: invalid token (<pyshell#4>, line 1)

>
> David

John Nagle

Aug 21, 2009, 1:45:51 PM

Simon Forman wrote:
> On Aug 20, 3:06 pm, David <71da...@libero.it> wrote:
>> Hi all,
>>
>> Is there some magic to make the 2.x CPython interpreter to ignore the
>> annoying octal notation?
>
> No. You would have to modify and recompile the interpreter. This is
> not exactly trivial, see "How to Change Python's Grammar"
> http://www.python.org/dev/peps/pep-0306/
>
> However, see "Integer Literal Support and Syntax" http://www.python.org/dev/peps/pep-3127/
>
> (Basically in 2.6 and onwards you can use 0oNNN notation.)

Yes, and making lead zeros an error as suggested in PEP 3127 is a good idea.
It will be interesting to see what bugs that flushes out.

In 2009, Unisys finally exited the mainframe hardware business, and the
last of the 36-bit machines, the ClearPath servers, are being phased out.
That line of machines goes back to the UNIVAC 2200 series, and the UNIVAC
1100 series, all the way back to the vacuum-tube UNIVAC 1103 from 1952.
It's the longest running series of computers in history, and code for all
those machines used octal heavily.

And it's over. We can finally dispense with octal by default.

John Nagle

Derek Martin

Aug 21, 2009, 3:48:57 PM

to pytho...@python.org

John Nagle wrote:
> Yes, and making lead zeros an error as suggested in PEP 3127 is a
> good idea. It will be interesting to see what bugs that flushes
> out.

James Harris wrote:
> It maybe made sense once but this relic of the past should have been
> consigned to the waste bin of history long ago.

Sigh. Nonsense. I use octal notation *every day*, for two extremely
prevalent purposes: file creation umask, and Unix file permissions
(i.e. the chmod() function/command).

I fail to see how 0O012, or even 0o012 is more intelligible than 012.
The latter reads like a typo, and the former is virtually
indistinguishable from 00012, O0012, or many other combinations that
someone might accidentally type (or intentionally type, having to do
this in dozens of other programming languages). I can see how 012 can
be confusing to new programmers, but at least it's legible, and the
great thing about humans is that they can be taught (usually). I for
one think this change is completely misguided. More than flushing out
bugs, it will *cause* them in ubiquity, requiring likely terabytes of
code to be pored over and fixed. Changing decades-old behaviors
common throughout a community for the sake of avoiding a minor
inconvenience of the n00b is DUMB.
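For the two uses named above, a brief sketch of how permission bits and a umask look with the 0o prefix, assuming Python 3 and the standard stat module. Nothing here touches the filesystem; the values are only combined and printed.

```python
import stat

# Permission bits built from the named constants match the octal literal.
mode = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH
print(oct(mode), mode == 0o644)

# A umask clears bits from the requested mode: 0o666 with umask 0o022.
umask = 0o022
effective = 0o666 & ~umask
print(oct(effective))

# stat.filemode renders a full mode word the way ls -l does.
listing = stat.filemode(stat.S_IFREG | mode)
print(listing)
```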

--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D

Benjamin Peterson

Aug 21, 2009, 4:22:47 PM

to pytho...@python.org

Simon Forman <sajmikins <at> gmail.com> writes:

> No. You would have to modify and recompile the interpreter. This is
> not exactly trivial, see "How to Change Python's Grammar"
> http://www.python.org/dev/peps/pep-0306/

And even that's incorrect. You'd have to modify the tokenizer.

Benjamin Peterson

Aug 21, 2009, 4:25:45 PM

to pytho...@python.org

Derek Martin <code <at> pizzashack.org> writes:

> More than flushing out
> bugs, it will *cause* them in ubiquity, requiring likely terabytes of
> code to be poured over and fixed.

2to3, however, can fix it for you extremely easily.
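To give a rough idea of the rewrite involved, here is a regex-based sketch that turns old-style octal literals into the 0o form. This is not how 2to3 works (it operates on a parse tree rather than raw text), and the pattern below will also touch digits inside strings and comments; the function name and pattern are assumptions made for illustration.

```python
import re

# A leading 0 followed by octal digits, not glued to a name, a dot, or other digits.
_OLD_OCTAL = re.compile(r"(?<![\w.])0([0-7]+)(?![\w.])")


def modernize_octal(source):
    """Rewrite 0NNN as 0oNNN in source text (naive, text-based sketch)."""
    return _OLD_OCTAL.sub(r"0o\1", source)


print(modernize_octal("os.chmod(path, 0644)"))  # os.chmod(path, 0o644)
print(modernize_octal("x = 012.5 + 0x012"))     # unchanged: float and hex are skipped
```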

Derek Martin

Aug 21, 2009, 4:44:14 PM

to Benjamin Peterson, pytho...@python.org

On Fri, Aug 21, 2009 at 08:25:45PM +0000, Benjamin Peterson wrote:
> > More than flushing out bugs, it will *cause* them in ubiquity,
> > requiring likely terabytes of code to be poured over and fixed.
>
> 2to3, however, can fix it for you extreme easily.

Sure, but that won't stop people who've been writing code for 20 years
from continuing to type octal that way... Humans can learn fairly
easily, but UN-learning is often much harder, especially when the
behavior to be unlearned is still very commonly in use.

Anyway, whatever. This change (along with a few of the other
seemingly arbitrary changes in 3.x) is annoying, but Python is still
one of the best languages to code in for any multitude of problems.

Piet van Oostrum

Aug 21, 2009, 4:44:41 PM

>>>>> Derek Martin <co...@pizzashack.org> (DM) wrote:

>DM> I fail to see how 0O012, or even 0o012 is more intelligible than 012.
>DM> The latter reads like a typo, and the former is virtually
>DM> indistinguishable from 00012, O0012, or many other combinations that
>DM> someone might accidentally type (or intentionally type, having to do
>DM> this in dozens of other programming languages).

You're right. Either hexadecimal should have been 0h or octal should
have been 0t :=)
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org

MRAB

Aug 21, 2009, 5:18:41 PM

to pytho...@python.org

Piet van Oostrum wrote:
>>>>>> Derek Martin <co...@pizzashack.org> (DM) wrote:
>
>> DM> I fail to see how 0O012, or even 0o012 is more intelligible than 012.
>> DM> The latter reads like a typo, and the former is virtually
>> DM> indistinguishable from 00012, O0012, or many other combinations that
>> DM> someone might accidentally type (or intentionally type, having to do
>> DM> this in dozens of other programming languages).
>
> You're right. Either hexadecimal should have been 0h or octal should
> have been 0t :=)

I have seen the use of Q/q instead in order to make it clearer. I still
prefer Smalltalk's 16rFF and 8r377.

James Harris

Aug 21, 2009, 7:23:57 PM

On 21 Aug, 20:48, Derek Martin <c...@pizzashack.org> wrote:

...

> James Harris wrote:
> > It maybe made sense once but this relic of the past should have been
> > consigned to the waste bin of history long ago.
>
> Sigh. Nonsense. I use octal notation *every day*, for two extremely
> prevalent purposes: file creation umask, and Unix file permissions
> (i.e. the chmod() function/command).

You misunderstand. I was saying that taking a leading zero as
indicating octal is archaic. Octal itself is fine where appropriate.

The chmod command doesn't require a leading zero.

James

James Harris

Aug 21, 2009, 7:52:29 PM

On 21 Aug, 22:18, MRAB <pyt...@mrabarnett.plus.com> wrote:
> Piet van Oostrum wrote:

Two interesting options. In a project I have on I have also considered
using 0q as indicating octal. I maybe saw it used once somewhere else
but I have no idea where. 0t was a second choice and 0c third choice
(the other letters of oct). 0o should NOT be used for obvious reasons.

So you are saying that Smalltalk has <base in decimal>r<number> where
r is presumably for radix? That's maybe best of all. It preserves the
syntactic requirement of starting a number with a digit and seems to
have greatest flexibility. Not sure how good it looks but it's
certainly not bad.

0xff & 0x0e | 0b1101
16rff & 16r0e | 2r1101
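The two lines above are meant to denote the same value. A small sketch of a parser for the <base>r<digits> form shows this, assuming the base is always written in decimal as described; `parse_radix` is a hypothetical helper built on Python's int().

```python
def parse_radix(text):
    """Parse '<base>r<digits>' such as '16rFF' (hypothetical helper)."""
    base_text, _, digits = text.partition("r")   # the first 'r' ends the decimal base
    return int(digits, int(base_text))


prefixed = 0xff & 0x0e | 0b1101
radix_form = parse_radix("16rff") & parse_radix("16r0e") | parse_radix("2r1101")

print(prefixed, radix_form)                           # 15 15
print(parse_radix("16rFF"), parse_radix("8r377"))     # 255 255
```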

Hmm. Maybe a symbol would be better than a letter.

James

Ben Finney

Aug 21, 2009, 8:03:35 PM

Derek Martin <co...@pizzashack.org> writes:

> James Harris wrote:
> > It maybe made sense once but this relic of the past should have been
> > consigned to the waste bin of history long ago.
>
> Sigh. Nonsense. I use octal notation *every day*, for two extremely
> prevalent purposes: file creation umask, and Unix file permissions
> (i.e. the chmod() function/command).

Right. Until Unix stops using these (and whatever replaces it would have
to be pretty compelling, given the prevalence of these in octal
notation), or until people stop using Unix, these will be with us and
require octal notation.

That doesn't mean, of course, that we need to elevate it above
hexadecimal in our language syntax; ‘0o012’ will allow octal notation
for literals just fine.

> I fail to see how 0O012, or even 0o012 is more intelligible than 012.
> The latter reads like a typo

No, it reads like a very common notation for decimal numbers in natural
usage. It's very frequently not a typo, but an expression of a
three-digit decimal number that happens to be less than 100.

> and the former is virtually indistinguishable from 00012, O0012, or
> many other combinations that someone might accidentally type (or
> intentionally type, having to do this in dozens of other programming
> languages).

Only if you type the letter in uppercase. The lower-case ‘o’ is much
easier to distinguish.

Whether or not you find ‘0o012’ easily distinguishable as a non-decimal
notation, it's undoubtedly easier to distinguish than ‘012’.

> I can see how 012 can be confusing to new programmers, but at least
> it's legible, and the great thing about humans is that they can be
> taught (usually). I for one think this change is completely misguided.

These human programmers, whether newbies or long-experienced, also deal
with decimal numbers every day, many of which are presented as a
sequence of digits with leading zeros — and we continue to think of them
as decimal numbers regardless. Having the language syntax opposed to
that is a wart, a cognitive overhead with little benefit, and I'll be
glad to see it go in favour of a clearer syntax.

--
\ “… one of the main causes of the fall of the Roman Empire was |
`\ that, lacking zero, they had no way to indicate successful |
_o__) termination of their C programs.” —Robert Firth |
Ben Finney

Ben Finney

Aug 21, 2009, 8:09:29 PM

Derek Martin <co...@pizzashack.org> writes:

> Sure, but that won't stop people who've been writing code for 20 years
> from continuing to type octal that way... Humans can learn fairly
> easily, but UN-learning is often much harder, especially when the
> behavior to be unlearned is still very commonly in use.

This is exactly the argument for removing ‘012’ octal notation: humans
(and programmers who have far less need for octal numbers than for
decimal numbers) are *already* trained, and reinforced many times daily,
to think of that notation as a decimal number. They should not need to
un-learn that association in order to understand octal literals in code.

> Anyway, whatever. This change (along with a few of the other seemingly
> arbitrary changes in 3.x) is annoying, but Python is still one of the
> best languages to code in for any multitude of problems.

Hear hear.

--
\ “I wrote a song, but I can't read music so I don't know what it |
`\ is. Every once in a while I'll be listening to the radio and I |
_o__) say, ‘I think I might have written that.’” —Steven Wright |
Ben Finney

Steven D'Aprano

Aug 21, 2009, 10:55:51 PM

On Fri, 21 Aug 2009 14:48:57 -0500, Derek Martin wrote:

>> It maybe made sense once but this relic of the past should have been
>> consigned to the waste bin of history long ago.
>
> Sigh. Nonsense. I use octal notation *every day*, for two extremely
> prevalent purposes: file creation umask, and Unix file permissions (i.e.
> the chmod() function/command).

And you will still be able to, by explicitly using octal notation.

> I fail to see how 0O012, or even 0o012 is more intelligible than 012.

The first is wrong, bad, wicked, and if I catch anyone using it, they
will be soundly slapped with a halibut. *wink*

Using O instead of o for octal is so unreadable that I think it should be
prohibited by the language, no matter that hex notation accepts both x
and X.

> The latter reads like a typo,

*Everything* reads like a typo if you're unaware of the syntax being used.

> and the former is virtually
> indistinguishable from 00012, O0012, or many other combinations that
> someone might accidentally type (or intentionally type, having to do
> this in dozens of other programming languages).

Agreed.

> I can see how 012 can
> be confusing to new programmers, but at least it's legible, and the
> great thing about humans is that they can be taught (usually).

And the great thing is that now you get to teach yourself to stop writing
octal numbers implicitly and write them explicitly with a leading 0o
instead :)

It's not just new programmers -- it's any programmer who is unaware of
the notation (possibly derived from C) that a leading 0 means "octal".
That's a strange and bizarre notation to use, because 012 is a perfectly
valid notation for decimal 12, as are 0012, 00012, 000012 and so forth.
Anyone who has learnt any mathematics beyond the age of six will almost
certainly expect 012 to equal 12. Having 012 equal 10 comes as a surprise
even to people who are familiar with octal.

> I for
> one think this change is completely misguided. More than flushing out
> bugs, it will *cause* them in ubiquity, requiring likely terabytes of
> code to be poured over and fixed. Changing decades-old behaviors common
> throughout a community for the sake of avoiding a minor inconvenience of
> the n00b is DUMB.

Use of octal isn't common. You've given two cases where octal notation is
useful, but for every coder who frequently writes umasks on Unix systems,
there are a thousand who don't.

It's no hardship to write 0o12 instead of 012.

--
Steven

David

Aug 22, 2009, 5:15:16 AM

On Fri, 21 Aug 2009 10:36:35 -0700 (PDT), Mensanator wrote:

> Aha! Then I WAS right after all. Switch to 3.1 and you'll
> soon be cured of that bad habit:
>
>>>> 012 + 012
> SyntaxError: invalid token (<pyshell#4>, line 1)

I have three (four) problems:

1) I am forced to use 2.5 since the production server has 2.5 installed.

2) Quite often I have to enter many zero-leading numbers and now my fingers
put a leading zero almost everywhere

3) I don't understand why useless but *harmless* things like algebraically
insignificant leading zeros should be *forbidden* and promoted to errors.

4) I still don't like the '0o..' notation because 0 (zero) and o (lowercase
O) glyphs appear very similar in many character sets. I'd prefer something
like '0c..' so it resembles the word 'oc' for 'octal'.

David

Aug 22, 2009, 5:27:32 AM

On Fri, 21 Aug 2009 16:52:29 -0700 (PDT), James Harris wrote:

>
> 0xff & 0x0e | 0b1101
> 16rff & 16r0e | 2r1101
>
> Hmm. Maybe a symbol would be better than a letter.

What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?

David

Steven D'Aprano

Aug 22, 2009, 5:57:32 AM

On Sat, 22 Aug 2009 11:15:16 +0200, David wrote:

> 3) I don't understand why useless but *harmless* things like
> algebrically insignificant leading zeros should be *forbidden* and
> promoted to errors.

The PEP covering this change says:

"There are still some strong feelings that '0123' should be allowed as a
literal decimal in Python 3.0. If this is the right thing to do, this can
easily be covered in an additional PEP. This proposal only takes the
first step of making '0123' not be a valid octal number, for reasons
covered in the rationale."

http://www.python.org/dev/peps/pep-3127/

--
Steven

MRAB

Aug 22, 2009, 8:55:39 AM

to pytho...@python.org

Dennis Lee Bieber wrote:
> On Fri, 21 Aug 2009 16:52:29 -0700 (PDT), James Harris
> <james.h...@googlemail.com> declaimed the following in
> gmane.comp.python.general:

>
>> So you are saying that Smalltalk has <base in decimal>r<number> where
>> r is presumably for radix? That's maybe best of all. It preserves the
>> syntactic requirement of starting a number with a digit and seems to
>> have greatest flexibility. Not sure how good it looks but it's
>> certainly not bad.
>>
>> 0xff & 0x0e | 0b1101
>> 16rff & 16r0e | 2r1101
>>
>> Hmm. Maybe a symbol would be better than a letter.
>>

> Or Ada's 16#FF#, 8#377#...
>
'#' starts a comment, so that's right out! :-)

> I forget if DEC/VMS FORTRAN or Xerox Sigma FORTRAN used x'FF' or
> 'FF'x, and o'377' or '377'o

MRAB

Aug 22, 2009, 8:55:42 AM

to pytho...@python.org

'_': what if in the future we want to allow them in numbers for clarity?

';': used to separate multiple statements on a line (but not used that
often).
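On the underscore point: later Python versions (3.6 onwards, via PEP 515) did adopt underscores as digit separators, which would indeed have clashed with a base_digits notation. A minimal sketch, assuming Python 3.6 or newer:

```python
# Sketch, assuming Python 3.6+: underscores group digits and change nothing else.
grouped = 123_000
parsed = int("123_000")          # int() accepts the same grouping in strings
after_prefix = (0o_17, 0x_ff)    # an underscore may follow the base prefix

print(grouped == 123000, parsed, after_prefix)
```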

Grant Edwards

Aug 22, 2009, 10:08:09 AM

On 2009-08-22, Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
> On Fri, 21 Aug 2009 10:45:51 -0700, John Nagle <na...@animats.com>

> declaimed the following in gmane.comp.python.general:
>
>>

>> And it's over. We can finally dispense with octal by default.
>>

> I've not looked at modern Intel processor format, but if there are
> folks still using variants of 8080 (8051?) and Z-80, Octal still works
> nice for op-codes... I don't recall the exact values, but the MOV
> instruction was something like '1SD'o, where S and D are three bit
> register specifications (A, B, C, D, E, H, L, and Memory as I recall)

The Heathkit terminal I have uses a Z80, and IIRC, the
assembly listings were in split-octal [a 16 bit word ranges
from 000 000 to 377 377]. Stuff for the PDP-11 (which also had
instruction fields 3 bits wide) was always in octal as well.
The PDP-11 is pretty much dead, but I think there are embedded
Z80 derivatives still in use.
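Both points can be sketched in a few lines. The decoder assumes the 8080 MOV layout of 01 DDD SSS (destination field before source field) and the register codes B, C, D, E, H, L, M, A for 0 to 7, which is how I understand the instruction set; the function names are hypothetical. Note that octal 166 is HLT on a real 8080, which this sketch does not special-case.

```python
REGISTERS = "BCDEHLMA"   # assumed 8080 register codes 0..7 (M = memory via HL)


def decode_mov(opcode):
    """Decode an 8080 MOV opcode by reading it as three octal digits."""
    group = (opcode >> 6) & 0o3
    dest = (opcode >> 3) & 0o7
    src = opcode & 0o7
    if group != 1:
        raise ValueError("not in the MOV group: {:03o}".format(opcode))
    return "MOV {},{}".format(REGISTERS[dest], REGISTERS[src])


def split_octal(word):
    """Format a 16-bit word as two bytes, each printed as three octal digits."""
    return "{:03o} {:03o}".format((word >> 8) & 0xFF, word & 0xFF)


print(decode_mov(0o170))      # MOV A,B
print(split_octal(0xFFFF))    # 377 377
```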

--
Grant

Derek Martin

Aug 22, 2009, 3:04:17 PM

to Ben Finney, pytho...@python.org

On Sat, Aug 22, 2009 at 10:03:35AM +1000, Ben Finney wrote:
> > and the former is virtually indistinguishable from 00012, O0012, or
> > many other combinations that someone might accidentally type (or
> > intentionally type, having to do this in dozens of other programming
> > languages).
>
> Only if you type the letter in uppercase. The lower-case ‘o’ is much
> easier to distinguish.

It is easier, but I dispute that it is much easier.

> Whether or not you find ‘0o012’ easily distinguishable as a non-decimal
> notation, it's undoubtedly easier to distinguish than ‘012’.

012 has meant decimal 10 in octal to me for so long, from its use in
MANY other programming languages, that I disagree completely.

> > I can see how 012 can be confusing to new programmers, but at least
> > it's legible, and the great thing about humans is that they can be
> > taught (usually). I for one think this change is completely misguided.
>
> These human programmers, whether newbies or long-experienced, also deal
> with decimal numbers every day, many of which are presented as a
> sequence of digits with leading zeros — and we continue to think of them
> as decimal numbers regardless. Having the language syntax opposed to
> that is

...consistent with virtually every other popular programming language.

Jan Kaliszewski

Aug 22, 2009, 4:44:35 PM

to Derek Martin, Ben Finney, pytho...@python.org

22-08-2009 at 21:04:17 Derek Martin <co...@pizzashack.org> wrote:

> On Sat, Aug 22, 2009 at 10:03:35AM +1000, Ben Finney wrote:

>> These human programmers, whether newbies or long-experienced, also deal
>> with decimal numbers every day, many of which are presented as a
>> sequence of digits with leading zeros — and we continue to think of them
>> as decimal numbers regardless. Having the language syntax opposed to
>> that is

> ...consistent with virtually every other popular programming language.

Probably not every other...

Anyway -- being (as it was said) inconsistent with every-day-convention --
it'd be also inconsistent with *Python* conventions, i.e.:

0x <- hex prefix
0b <- bin prefix
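The consistency argument can be seen by formatting one value with each prefix and parsing it back. A minimal sketch, assuming Python 3, where the '#' format flag adds the prefix and base 0 tells int() to honour it:

```python
# Sketch, assuming Python 3: the three prefixes follow one pattern.
round_trips = []
for spec in ("#b", "#o", "#x"):
    text = format(10, spec)                  # '0b1010', '0o12', '0xa'
    round_trips.append((text, int(text, 0))) # base 0 reads the prefix

for text, value in round_trips:
    print(text, value)
```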

Cheers,
*j

--
Jan Kaliszewski (zuo) <z...@chopin.edu.pl>

James Harris

Aug 22, 2009, 5:54:41 PM

On 22 Aug, 10:27, David <71da...@libero.it> wrote:

... (snipped a discussion on languages and other systems interpreting
numbers with a leading zero as octal)

> > Either hexadecimal should have been 0h or octal should
> > have been 0t :=)
>
>
> I have seen the use of Q/q instead in order to make it clearer. I still
> prefer Smalltalk's 16rFF and 8r377.
>
>
> Two interesting options. In a project I have on I have also considered
> using 0q as indicating octal. I maybe saw it used once somewhere else
> but I have no idea where. 0t was a second choice and 0c third choice
> (the other letters of oct). 0o should NOT be used for obvious reasons.
>
> So you are saying that Smalltalk has <base in decimal>r<number> where
> r is presumably for radix? That's maybe best of all. It preserves the
> syntactic requirement of starting a number with a digit and seems to
> have greatest flexibility. Not sure how good it looks but it's
> certainly not bad.
>
>

> > 0xff & 0x0e | 0b1101
> > 16rff & 16r0e | 2r1101
>
> > Hmm. Maybe a symbol would be better than a letter.

...

> > Or Ada's 16#FF#, 8#377#...

> > I forget if DEC/VMS FORTRAN or Xerox Sigma FORTRAN used x'FF' or

> > 'FF'x, and o'377' or '377'o

...

>
> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?

They look good - which is important. The trouble (for me) is that I
want the notation for a new programming language and already use these
characters. I have underscore as an optional separator for groups of
digits - 123000 and 123_000 mean the same. The semicolon terminates a
statement. Based on your second idea, though, maybe a colon could be
used instead as in

2:1011, 8:7621, 16:c26b

I don't (yet) use it as a range operator.

I could also use a hash sign as although I allow hash to begin
comments it cannot be preceded by anything other than whitespace so
these would be usable

2#1011, 8#7621, 16#c26b

I have no idea why Ada which uses the # also apparently uses it to end
a number

2#1011#, 8#7621#, 16#c26b#

Copying this post also to comp.lang.misc. Folks there may either be
interested in the discussion or have comments to add.

James

Mel

Aug 22, 2009, 7:16:39 PM

James Harris wrote:

> I have no idea why Ada which uses the # also apparently uses it to end
> a number
>
> 2#1011#, 8#7621#, 16#c26b#

Interesting. They do it because of this example from
<http://archive.adaic.com/standards/83rat/html/ratl-02-01.html#2.1>:

2#1#E8 -- an integer literal of value 256

where the E prefixes a power-of-2 exponent, and can't be taken as a digit of
the radix. That is to say

16#1#E2

would also equal 256, since it's 1*16**2.
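A sketch of a parser for this notation makes the role of the closing # visible: without it, the E in 16#1E2 could be read as a hex digit. This covers only the integer subset used in the examples above and is not a full implementation of Ada's based literals; the names are hypothetical.

```python
import re

# base '#' digits '#' optional exponent; the closing '#' separates digits from E.
_BASED = re.compile(r"^(\d+)#([0-9A-Fa-f_]+)#(?:[Ee]\+?(\d+))?$")


def parse_ada_based(text):
    """Evaluate an Ada-style based integer literal (integer subset only)."""
    match = _BASED.match(text)
    if match is None:
        raise ValueError("not a based integer literal: %r" % text)
    base = int(match.group(1))
    mantissa = int(match.group(2).replace("_", ""), base)
    exponent = int(match.group(3) or "0")
    return mantissa * base ** exponent     # the exponent scales by the base


print(parse_ada_based("2#1#E8"), parse_ada_based("16#1#E2"))   # 256 256
print(parse_ada_based("16#FF#"), parse_ada_based("8#377#"))    # 255 255
```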

Mel.

Carl Banks

Aug 22, 2009, 7:45:20 PM

If you know anything about Python, you should know that "consistent
with virtually every other programming language" is, at most, a polite
suggestion for how Python should do it.

Carl Banks

Aug 22, 2009, 7:49:40 PM

On Aug 21, 12:48 pm, Derek Martin <c...@pizzashack.org> wrote:
> John Nagle wrote:
> > Yes, and making lead zeros an error as suggested in PEP 3127 is a
> > good idea. It will be interesting to see what bugs that flushes
> > out.
> James Harris wrote:
> > It maybe made sense once but this relic of the past should have been
> > consigned to the waste bin of history long ago.
>
> Sigh. Nonsense. I use octal notation *every day*, for two extremely
> prevalent purposes: file creation umask, and Unix file permissions
> (i.e. the chmod() function/command).

Unix file permissions maybe made sense once but this relic of the past
should have been consigned to the waste bin of history long ago. :)

Carl Banks

Derek Martin

Aug 22, 2009, 11:19:01 PM

to pytho...@python.org

On Sat, Aug 22, 2009 at 02:55:51AM +0000, Steven D'Aprano wrote:
> > I can see how 012 can
> > be confusing to new programmers, but at least it's legible, and the
> > great thing about humans is that they can be taught (usually).
>
> And the great thing is that now you get to teach yourself to stop writing
> octal numbers implicitly and be write them explicitly with a leading 0o
> instead :)

Sorry, I don't write them implicitly. A leading zero explicitly
states that the numeric constant that follows is octal. It is so in 6
out of 7 computer languages I have more than a passing familiarity
with (the 7th being scheme, which is a thing unto itself), including
Python. It's that way on Bourne-compatible and POSIX-compatible Unix
shells (though it requires a leading backslash before the leading zero
there). I'm quite certain it can not be the case on only those 6
languages that I happen to be familiar with...
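Python has a close analogue of the backslash form mentioned for shells: inside string literals, a backslash followed by up to three octal digits is an octal escape, and that is unaffected by the change to integer literals. A minimal sketch, assuming Python 3:

```python
# Sketch, assuming Python 3: octal escapes in string literals still work.
newline = "\012"      # backslash plus octal digits: character code 10

print(newline == "\n", ord(newline), len(newline))
```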

While it may be true that people commonly write decimal numbers with
leading zeros (I dispute even this, having to my recollection only
recently seen it as part of some serial number, which in practice is
really more of a string identifier than a number, often containing
characters other than numbers), it's also true that in the context of
computer programming languages, for the last 40+ years, a number
represented with a leading zero is most often an octal number. This
has been true even in Python for nearly *twenty years*. Why the
sudden need to change it?

So no, I don't get to teach myself to stop writing octal numbers with
a leading zero. Instead, I have to remember an exception to the rule.

Also I don't think it's exactly uncommon for computer languages to do
things differently than they are done in non-CS circles. A couple of
easy examples: we do not write x+=y except in computer languages. The
POSIX system call to create a file is called creat(). If you think
about it, I'm sure you can come up with lots of examples where even
Python takes liberties. Is this a bad thing? Not inherently, no.
Will it be confusing to people who aren't familiar with the usage?
Quite possibly, but that is not inherently bad either. It's all about
context.

> Use of octal isn't common.

It's common enough. Peruse the include files for your C libraries, or
the source for your operating system's kernel, or system libraries,
and I bet you'll find plenty of octal. I did. [Note that it is
irrelevant that these are C/C++ files; here we are only concerned with
whether they use octal, not how it is represented therein.] I'd guess
there's a fair chance that any math or scientific software package
uses octal. Octal is a convenient way to represent bit fields that
naturally occur in groups of 3, of which there are potentially
limitless cases.
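The point about 3-bit groups is easy to see with permission bits, where each octal digit maps to one rwx triple. A minimal sketch; `rwx_string` is a hypothetical helper, not a standard function.

```python
def rwx_string(mode):
    """Render the low nine bits of `mode` as three rwx triples."""
    flags = (("r", 0o4), ("w", 0o2), ("x", 0o1))
    parts = []
    for shift in (6, 3, 0):                 # owner, group, other
        bits = (mode >> shift) & 0o7        # one octal digit is one 3-bit field
        parts.append("".join(ch if bits & mask else "-" for ch, mask in flags))
    return "".join(parts)


print(rwx_string(0o754))   # rwxr-xr--
```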

> You've given two cases were octal notation is useful, but for every
> coder who frequently writes umasks on Unix systems, there are a
> thousand who don't.

I gave two cases that I use *daily*, or very nearly daily. My hats
currently include system admin, SQA, and software development, and I
find it convenient to use octal in each of those. But those are
hardly the only places where octal is useful. Have a look at the
ncurses library, for example. Given that Python has an ncurses
interface, I'm guessing it's used there too. In fact if the Python
source had no octal in it, I would find that very surprising.

> It's no hardship to write 0o12 instead of 012.

Computer languages are not write-only, excepting maybe Perl. ;-)
Writing 0o12 presents no hardship; but I assert, with at least some
support from others here, that *reading* it does.

Derek Martin

unread,

Aug 22, 2009, 11:32:03 PM8/22/09

to James Harris, pytho...@python.org

On Fri, Aug 21, 2009 at 04:23:57PM -0700, James Harris wrote:
> You misunderstand. I was saying that taking a leading zero as
> indicating octal is archaic. Octal itself is fine where appropriate.

I don't see that the leading zero is any more archaic than the use of
octal itself... Both originate from around the same time period, and
are used in the same cases. We should just prohibit octal entirely
then.

But I suppose it depends on which definition of "archaic" you use. In
the other common sense of the word, the leading zero is no more
archaic than the C programming language. Let's ban the use of all
three. :) (I believe C is still the language in which the largest
number of lines of new code are written, but if not, it's way up
there.)

> The chmod command doesn't require a leading zero.

No, but it doesn't need any indicator that the number given to it is
in octal; in the case of the command line tool, octal is *required*,
and the argument is *text*. However, the chmod() system call, and the
interfaces to it in every language I'm familiar with that has one, do
require the leading zero (because that's how you represent octal).
Including Python, for some 20 years or so.
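
A small sketch of the chmod() interface from Python, assuming Python 3 (where the literal is spelled 0o640 rather than 0640) and a POSIX system. The scratch file is created only for the example:

```python
import os
import stat
import tempfile

# Create a scratch file so the example does not touch real data.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    os.chmod(path, 0o640)                       # octal literal, Python 3 spelling
    mode = stat.S_IMODE(os.stat(path).st_mode)  # keep only the permission bits
    # The same mode built from named constants instead of an octal literal.
    same = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP
finally:
    os.remove(path)

print(oct(mode), mode == same)
```

The command line tool receives the text "640"; in code the equivalent conversion would be int("640", 8).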

Richard Harter

unread,

Aug 22, 2009, 11:38:59 PM8/22/09

to

On Sat, 22 Aug 2009 14:54:41 -0700 (PDT), James Harris
<james.h...@googlemail.com> wrote:

>On 22 Aug, 10:27, David <71da...@libero.it> wrote:
>
>... (snipped a discussion on languages and other systems interpreting
>numbers with a leading zero as octal)
>
>> > Either hexadecimal should have been 0h or octal should

>> > have been 0t :=)

>>
>>
>> I have seen the use of Q/q instead in order to make it clearer. I still
>> prefer Smalltalk's 16rFF and 8r377.
>>
>>
>> Two interesting options. In a project I have on I have also considered
>> using 0q as indicating octal. I maybe saw it used once somewhere else
>> but I have no idea where. 0t was a second choice and 0c third choice
>> (the other letters of oct). 0o should NOT be used for obvious reasons.
>>
>> So you are saying that Smalltalk has <base in decimal>r<number> where
>> r is presumably for radix? That's maybe best of all. It preserves the
>> syntactic requirement of starting a number with a digit and seems to
>> have greatest flexibility. Not sure how good it looks but it's
>> certainly not bad.

I opine that a letter is better; special characters are a
valuable piece of real estate. However for floating point you
need at least three letters because a floating point number has
three parts: the fixed point part, the exponent base, and the
exponent. Now we can represent the radices of the individual
parts with the 'r' scheme, e.g., 2r101001, but we need separate
letters to designate the exponent base and the exponent. B and E
are the obvious choices, though we want to be careful about a
confusion with 'b' in hex. For example, using 'R',

3R20.1B2E16Rac

is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
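
The arithmetic in this example can be checked with a short sketch. The function name `parse_radix_fraction` is hypothetical, and the code only evaluates the pieces separately; it does not parse the full 3R20.1B2E16Rac notation:

```python
from fractions import Fraction

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def digit_value(ch, base):
    """Value of one digit character, rejecting digits too large for the base."""
    value = DIGITS.index(ch.lower())   # raises ValueError for unknown characters
    if value >= base:
        raise ValueError("digit %r is not valid in base %d" % (ch, base))
    return value


def parse_radix_fraction(text, base):
    """Exact value of a digit string such as '20.1' read in the given base."""
    whole, _, frac = text.partition(".")
    value = Fraction(0)
    for ch in whole:
        value = value * base + digit_value(ch, base)
    scale = Fraction(1, base)
    for ch in frac:
        value += digit_value(ch, base) * scale
        scale /= base
    return value


mantissa = parse_radix_fraction("20.1", 3)   # the fixed point part, base 3
exponent = int("ac", 16)                     # the exponent, written in hex
value = mantissa * 2 ** exponent             # exponent base 2
print(mantissa, exponent)
```

Using Fraction keeps the result exact, since 1/3 has no finite binary representation.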

I grant that this example looks a bit gobbledegookish, but normal
usage would be much simpler. The notation doesn't handle
balanced trinary; however I opine that balanced trinary requires
special notation.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
No one asks if a tree falls in the forest
if there is no one there to see it fall.

Steven D'Aprano

unread,

Aug 23, 2009, 12:07:05 AM8/23/09

to

On Sat, 22 Aug 2009 14:04:17 -0500, Derek Martin wrote:

>> These human programmers, whether newbies or long-experienced, also deal
>> with decimal numbers every day, many of which are presented as a
>> sequence of digits with leading zeros — and we continue to think of
>> them as decimal numbers regardless. Having the language syntax opposed
>> to that is
>
> ...consistent with virtually every other popular programming language.

A mistake is still a mistake even if it is shared with others.

Treating ints with a leading zero as octal was a design error when it was
first thought up (possibly in C?) and it remains a design error no matter
how many languages copy it. I feel your pain of having to unlearn
something you have learned, but just because you have been led astray by
the languages you use doesn't mean we should compound the error by
leading the next generation of coders astray too.

Octal is of little importance today. As near as I can tell, it has only
two common uses in high level languages: file umasks and permissions on
Unix systems. It simply isn't special enough to justify implicit notation
that surprises people, leads to silent errors, and is inconsistent with
standard mathematical notation and treatment of floats:

>>> 123.2000 # insignificant trailing zeroes don't matter
123.2
>>> 000123.2 # neither do insignificant leading zeroes
123.2
>>> 001.23e0023 # not even if it is an integer
1.23e+23
>>> 000123 # but here it matters
83
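
For comparison, a sketch of how the same literals fare under Python 3 (an assumption about the reader's interpreter; the session above is Python 2). `literal_or_error` is a hypothetical helper that evaluates source text as a literal and reports a syntax error instead of raising:

```python
import ast


def literal_or_error(source):
    """Evaluate a literal from source text, or report that it is rejected."""
    try:
        return ast.literal_eval(source)
    except SyntaxError:
        return "SyntaxError"


for src in ("123.2000", "000123.2", "001.23e0023", "0o123", "000123"):
    print(src, "->", literal_or_error(src))
```

Under that assumption the float forms still ignore leading zeros, 0o123 is the explicit octal spelling, and the bare 000123 is refused rather than silently read as octal.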

--
Steven

Steven D'Aprano

unread,

Aug 23, 2009, 2:13:31 AM8/23/09

to

On Sat, 22 Aug 2009 22:19:01 -0500, Derek Martin wrote:

> On Sat, Aug 22, 2009 at 02:55:51AM +0000, Steven D'Aprano wrote:
>> > I can see how 012 can
>> > be confusing to new programmers, but at least it's legible, and the
>> > great thing about humans is that they can be taught (usually).
>>
>> And the great thing is that now you get to teach yourself to stop
>> writing octal numbers implicitly and write them explicitly with a
>> leading 0o instead :)
>
> Sorry, I don't write them implicitly. A leading zero explicitly states
> that the numeric constant that follows is octal.

That is incorrect.

Decimal numbers implicitly use base 10, because there's nothing in the
literal 12340 (say) to indicate the base is ten, rather than 16 or 9 or
23. Although implicit is usually bad, when it's as common and expected as
decimal notation, it's acceptable.

Hexadecimal numbers explicitly use base 16, because the leading 0x is defined to
mean "base 16". 0x is otherwise not a legal decimal number, or hex number
for that matter. (It would be legal in base 34 or greater, but that's
rare enough that we can ignore this.) For the bases we care about, a
leading 0x can't have any other meaning -- there's no ambiguity, so we
can treat it as a synonym for "base 16".

(Explicitness isn't a binary state, and it would be even more explicit if
the base was stated in full, as in e.g. Ada where 16#FF# = decimal 255.)

However, octal numbers are defined implicitly: 012 is a legal base 10
number, or base 3, or base 9, or base 16. There's nothing about a leading
zero that says "base 8" apart from familiarity. We can see the difference
between leading 0x and leading 0 if you repeat it: repeating an explicit
0x, as in 0x0xFF, is a syntax error, while repeating an implicit 0
silently does nothing different:

>>> 0x0xFF
File "<stdin>", line 1
0x0xFF
^
SyntaxError: invalid syntax
>>> 0077
63
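
The same contrast can be sketched with int() and base 0, which asks Python to read the text as it would a source literal. This assumes Python 3, where the implicit leading-zero form is no longer accepted; `parse_prefixed` is a made-up wrapper name:

```python
def parse_prefixed(text):
    """Parse text the way a Python 3 integer literal would be read."""
    try:
        return int(text, 0)   # base 0: honour 0x, 0o and 0b prefixes
    except ValueError:
        return None           # not a valid literal


for text in ("12340", "0xFF", "0o77", "0b1011", "0x0xFF", "0077"):
    print(text, "->", parse_prefixed(text))
```

With explicit prefixes only, both the doubled 0x and the bare leading zeros are rejected instead of one of them being quietly accepted.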

> It is so in 6 out of 7
> computer languages I have more than a passing familiarity with (the 7th
> being scheme, which is a thing unto itself), including Python. It's
> that way on Bourne-compatible and POSIX-compatible Unix shells (though
> it requires a leading backslash before the leading zero there). I'm
> quite certain it can not be the case on only those 6 languages that I
> happen to be familiar with...

No, of course not. There are a bunch of languages, pretty much all
heavily influenced by C, which treat integer literals with leading 0s as
oct: C++, Javascript, Python 2.x, Ruby, Perl, Java. As so often is the
case, C's design mistakes become common practice. Sigh.

However, there are many, many languages that don't, or otherwise do
things differently to C. Even some modern C-derived languages reject the
convention:

C# doesn't have octal literals at all.

As far as I can tell, Objective-C and Cocoa require you to explicitly
enable support for octal literals before you use them.

In D, at least some people want to follow Python's lead and either drop
support for oct literals completely, or require a 0o prefix:
http://d.puremagic.com/issues/show_bug.cgi?id=2656

E makes a leading 0 a syntax error.

As far as other, non-C languages go, leading 0 = octal seems to be rare
or non-existent:

Basic and VB use a leading &O for octal.

FORTRAN 90 uses a leading O (uppercase o) for octal, and surrounds the
literal in quotation marks: O"12" would be ten in octal. 012 would be
decimal 12.

As far as I can tell, COBOL also ignores leading zeroes.

Forth interprets literals according to the current value of BASE (which
defaults to 10). There's no special syntax for it. To enter ten in octal,
you might say:

8 BASE ! 12

or if your system provides it:

OCT 12

Standard Pascal ignores leading 0s in integers, and doesn't support octal
at all. A leading $ is used for hex. At least one non-standard Pascal
uses leading zero for octal.

Haskell requires an explicit 0o:
http://www.haskell.org/onlinereport/lexemes.html#lexemes-numeric

So does OCaml.

Ada uses decimal unless you explicitly give the base:
http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html

Leading zeroes are insignificant in bc:

[steve@sylar ~]$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
012 + 011
23

Leading zeroes are also insignificant in Hewlett-Packard RPN language
(e.g. HP-48GX calculators), Hypertalk and languages derived from it.

I'm not sure, but it looks to me like Boo doesn't support octal literals,
although it supports hex with 0x and binary with 0b.

Algol uses an explicit base: 8r12 to indicate octal 10.

Common Lisp and Scheme use a #o prefix.

As far as *languages* go, 0-based octal literals are in the tiny
minority. As far as *programmers* go, it may be in a plurality, perhaps
even a small majority, but remember there are still millions of VB
programmers out there who are just as unfamiliar with C conventions.

> While it may be true that people commonly write decimal numbers with
> leading zeros (I dispute even this

[...]

Leading zeroes in decimal numbers are *very* common in dates and times.

[...]

> Given that Python has an ncurses interface, I'm
> guessing it's used there too. In fact if the Python source had no octal
> in it, I would find that very surprising.

I can't see any oct literals in the standard library, not even in the
ncurses interface, but then my grep-fu is weak and I may have made a
mistake. I encourage you to look for yourself.

>> It's no hardship to write 0o12 instead of 012.
>
> Computer languages are not write-only, excepting maybe Perl. ;-) Writing
> 0o12 presents no hardship; but I assert, with at least some support from
> others here, that *reading* it does.

No more so than 0x or 0b literals. If anything, 0o12 stands out as "not
twelve" far more than 012 does.

--
Steven

Dmitry A. Kazakov

unread,

Aug 23, 2009, 4:21:52 AM8/23/09

to

On Sat, 22 Aug 2009 14:54:41 -0700 (PDT), James Harris wrote:

> They look good - which is important. The trouble (for me) is that I
> want the notation for a new programming language and already use these
> characters. I have underscore as an optional separator for groups of
> digits - 123000 and 123_000 mean the same. The semicolon terminates a
> statement. Based on your second idea, though, maybe a colon could be
> used instead as in
>
> 2:1011, 8:7621, 16:c26b
>
> I don't (yet) use it as a range operator.
>
> I could also use a hash sign as although I allow hash to begin
> comments it cannot be preceded by anything other than whitespace so
> these would be usable
>
> 2#1011, 8#7621, 16#c26b
>
> I have no idea why Ada which uses the # also apparently uses it to end
> a number
>
> 2#1011#, 8#7621#, 16#c26b#

If you are going Unicode, you could use the mathematical notation, which is

1011₂, 7621₈, c26b₁₆

(subscript specification of the base). Yes, it might be difficult to type
(:-)), and would require some look-ahead in the parser. One of the
advantages of Ada notation is that a numeric literal always starts with a
decimal digit. That makes things simple for a recursive descent parser. I
guess this choice was intentional; back in 1983 a complex parser would eat
too many resources...
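
A rough sketch of reading the subscript notation, to show where the look-ahead comes from: the base is only known once the end of the token has been seen. The function name `parse_subscripted` is hypothetical, and this is plain string handling, not a real tokenizer:

```python
SUBSCRIPT_DIGITS = "₀₁₂₃₄₅₆₇₈₉"
TO_ASCII = str.maketrans(SUBSCRIPT_DIGITS, "0123456789")


def parse_subscripted(text):
    """Parse digits followed by a subscript base, e.g. '1011₂'."""
    # Walk back from the end to find where the subscript run begins;
    # nothing about the digits can be decided before this is known.
    split = len(text)
    while split > 0 and text[split - 1] in SUBSCRIPT_DIGITS:
        split -= 1
    digits, base = text[:split], text[split:]
    if not digits or not base:
        raise ValueError("expected digits followed by a subscript base")
    return int(digits, int(base.translate(TO_ASCII)))


print(parse_subscripted("1011₂"), parse_subscripted("c26b₁₆"))
```

With a base-first form such as 16#c26b# the parser knows the radix before it reads the first digit.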

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Bearophile

unread,

Aug 23, 2009, 5:09:20 AM8/23/09

to

MRAB:

>'_': what if in the future we want to allow them in numbers for clarity?

Hettinger says it's hard (= requires too many changes) to do that and
Python programs don't have big integer constants often enough, so
probably that improvement will not see the light.

In the meantime, in a Python program of mine I introduced a small bug by
writing 1000000 instead of 10000000. Now in Python I write
10*1000*1000, because I try to learn from my bugs. In D I enjoy
writing 10_000_000.
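
For readers on a later interpreter: underscores in numeric literals did eventually arrive in Python 3.6 (PEP 515). A minimal sketch, assuming such an interpreter; the variable name `budget` is only for illustration:

```python
budget = 10_000_000                 # digit grouping, Python 3.6+

print(budget == 10 * 1000 * 1000)   # the workaround and the literal agree
print(int("10_000_000"))            # int() accepts the same grouping in text
print(format(budget, "_"))          # and format() can write it back out
```

The multiplication trick still works everywhere, but the grouped literal keeps the value in one token.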

-------------------------------

Steven D'Aprano:

> In D, at least some people want to follow Python's lead and either drop
> support for oct literals completely, or require a 0o prefix:http://d.puremagic.com/issues/show_bug.cgi?id=2656

Yes, people in the D community are trying to improve things, but it's
a slow and painful process, and often it goes nowhere. There's a lot of
politics.

Bye,
bearophile

garabik-ne...@kassiopeia.juls.savba.sk

unread,

Aug 23, 2009, 6:08:43 AM8/23/09

to

In comp.lang.python James Harris <james.h...@googlemail.com> wrote:
> On 22 Aug, 10:27, David <71da...@libero.it> wrote:

...
>>

>> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
>
> They look good - which is important. The trouble (for me) is that I
> want the notation for a new programming language and already use these
> characters. I have underscore as an optional separator for groups of
> digits - 123000 and 123_000 mean the same.

Why not just use the space? 123 000 looks better than 123_000, and
is not syntactically ambiguous (at least in python). And as it
already works for string literals, it could be applied to numbers, too…

--
-----------------------------------------------------------
| Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ |
| __..--^^^--..__ garabik @ kassiopeia.juls.savba.sk |
-----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!

Matthew Woodcraft

unread,

Aug 23, 2009, 9:13:32 AM8/23/09

to

Dennis Lee Bieber <wlf...@ix.netcom.com> writes:
> About the only place one commonly sees leading zeros on decimal
> numbers, in my experience, is zero-filled COBOL data decks (and since
> classic COBOL stores in BCD anyway... binary (usage is
> computational/comp-1) was a later add-on to the data specification model
> as I recall...)

A more common case is dates.

I've seen people trip over this writing things like

xxx = [
date(2009, 10, 12),
date(2009, 12, 26),
date(2010, 02, 09),
]
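
A sketch of two ways around the problem, written for Python 3 (an assumption), where a zero-padded literal such as 02 or 09 is rejected outright. Either drop the padding in source, or keep padded dates as text and parse them:

```python
from datetime import date, datetime

# Unpadded integer arguments mean the same in any Python version.
holidays = [
    date(2009, 10, 12),
    date(2009, 12, 26),
    date(2010, 2, 9),
]

# Zero-padded dates are fine as text; strptime reads the fields as decimal.
parsed = datetime.strptime("2010-02-09", "%Y-%m-%d").date()
print(parsed == holidays[2])
```

The list name `holidays` replaces the placeholder xxx purely for readability.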

-M-

Ben Finney

unread,

Aug 23, 2009, 10:01:46 AM8/23/09

to

garabik-ne...@kassiopeia.juls.savba.sk writes:

> Why not just use the space? 123 000 looks better than 123_000, and is
> not syntactically ambiguous (at least in python). And as it already
> works for string literals, it could be applied to numbers, too…

+1 to all this. I think this discussion was had many months ago, but
can't recall how it ended back then.

--
\ “Only the educated are free.” —Epictetus, _Discourses_ |
`\ |
_o__) |
Ben Finney

J. Cliff Dyer

unread,

Aug 23, 2009, 10:35:22 AM8/23/09

to Ben Finney, pytho...@python.org

I had an objection to using spaces in numeric literals last time around,
and it still stands this time.

What happens if you use a literal like 0x10f 304? Is 304 treated as
decimal or hexadecimal? It's not clear how you would begin to combine
it.

The way string concatenation works, it takes two independent string
literals, and combines them. If you specify r'\n' 'abc\n', the first
half is treated independently as a raw string, and the second half is
treated as a normal string. The result is '\\nabc\n'.
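
The string case can be checked directly. A minimal sketch (the variable name `combined` is only for illustration):

```python
# Adjacent string literals: each is interpreted on its own, then joined.
combined = r'\n' 'abc\n'

print(repr(combined))
print(len(combined))                     # backslash, n, a, b, c, newline
print(combined == r'\n' + 'abc\n')       # same as explicit concatenation
```

The raw prefix applies only to the first literal, so the second one still ends in a real newline.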

With numeric literals, this behavior doesn't even make sense. How do
you concatenate hex 10f with decimal 304? I suppose you could multiply
0x10f by 1000, and add them, but this probably wouldn't fit any
practical use case.

Alternatively, you could raise an exception, and require the user to use
numeric literals of the same type, like 0x10f 0x304, but then you lose
any readability benefit you might have gained by dropping the _ to begin
with.

If, on the other hand, you want to combine the tokens before processing
their independent meanings, which makes the most intuitive sense, well,
in that case we're no longer talking about an operation analogous to
string concatenation. We're talking about integers no longer being
simple tokens that can be assigned a value. I'm not familiar with the
code that makes all this happen in CPython (or any other implementation
for that matter), but it seems like it extends the complexity of the
parser unnecessarily.

I'm concerned that the benefit in readability will be outweighed by the
burden it places on the parser, and the cognitive burden on the
programmer of knowing what to expect when using non-decimal numeric
literals. For that reason, I'm a -1 on using a space in numeric
literals, but +1 on using some other separator, and an _, in spite of
its slight awkwardness in typing, seems like a good idea.

If someone with a solid understanding of the python parser could chime
in that this wouldn't cause as much friction as I think, and explain a
clean, elegant implementation for this, many of my concerns would be
alleviated, and I would change my -1 to a -0.

Cheers,
Cliff

bartc

unread,

Aug 23, 2009, 11:57:57 AM8/23/09

to

<garabik-ne...@kassiopeia.juls.savba.sk> wrote in message
news:h6r4fb$18a$1...@aioe.org...

> In comp.lang.python James Harris <james.h...@googlemail.com> wrote:
>> On 22 Aug, 10:27, David <71da...@libero.it> wrote:
>
> ...
>>>
>>> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
>>
>> They look good - which is important. The trouble (for me) is that I
>> want the notation for a new programming language and already use these
>> characters. I have underscore as an optional separator for groups of
>> digits - 123000 and 123_000 mean the same.
>
> Why not just use the space? 123 000 looks better than 123_000, and
> is not syntactically ambiguous (at least in python).

If the purpose is to allow "_" to introduce a non-base ten literal, using
this to enter a hexadecimal number might result in:

16_1234 ABCD

I'd say that that was ambiguous (depending on whether a name can follow a
number; if you have an operator called ABCD, then that would be a problem).
Unless each block of digits used its own base:

16_1234 16_ABCD

> And as it
> already works for string literals, it could be applied to numbers, too…

String literals are conveniently surrounded by quotes, so they're a bit easier
to recognise.

--
Bart

James Harris

unread,

Aug 23, 2009, 2:11:36 PM8/23/09

to

On 21 Aug, 00:59, James Harris <james.harri...@googlemail.com> wrote:

...

> > Is there some magic to make the 2.x CPython interpreter to ignore the
> > annoying octal notation?
> > I'd really like 012 to be "12" and not "10".
>
> This is (IMHO) a sad hangover from C (which took it from B ...

This seemed worth writing up so I've taken the snipped comments and
posted them at

http://sundry.wikispaces.com/octal-zero-prefix

The idea is that the page can be pointed to any time the issue comes
up again.

I've also fleshed the comments out a bit.

James

James Harris

unread,

Aug 23, 2009, 4:55:19 PM8/23/09

to

On 23 Aug, 04:38, c...@tiac.net (Richard Harter) wrote:
> On Sat, 22 Aug 2009 14:54:41 -0700 (PDT), James Harris
>
>
>
>
>
> <james.harri...@googlemail.com> wrote:
> >On 22 Aug, 10:27, David <71da...@libero.it> wrote:
>
> >... (snipped a discussion on languages and other systems interpreting
> >numbers with a leading zero as octal)
>
> >> > Either hexadecimal should have been 0h or octal should
> >> > have been 0t :=)
>
> >> I have seen the use of Q/q instead in order to make it clearer. I still
> >> prefer Smalltalk's 16rFF and 8r377.
>
> >> Two interesting options. In a project I have on I have also considered
> >> using 0q as indicating octal. I maybe saw it used once somewhere else
> >> but I have no idea where. 0t was a second choice and 0c third choice
> >> (the other letters of oct). 0o should NOT be used for obvious reasons.
>
> >> So you are saying that Smalltalk has <base in decimal>r<number> where
> >> r is presumably for radix? That's maybe best of all. It preserves the
> >> syntactic requirement of starting a number with a digit and seems to
> >> have greatest flexibility. Not sure how good it looks but it's
> >> certainly not bad.
>
> I opine that a letter is better; special characters are a
> valuable piece of real estate.

Very very true.

> However for floating point you
> need at least three letters because a floating point number has
> three parts: the fixed point part, the exponent base, and the
> exponent. Now we can represent the radices of the individual
> parts with the 'r' scheme, e.g., 2r101001, but we need separate
> letters to designate the exponent base and the exponent. B and E
> are the obvious choices, though we want to be careful about a
> confusion with 'b' in hex. For example, using 'R',
>
> 3R20.1B2E16Rac

Ooh err!

> is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
>
> I grant that this example looks a bit gobbledegookish,

You think? :-)

> but normal
> usage would be much simpler. The notation doesn't handle
> balanced trinary; however I opine that balanced trinary requires
> special notation.

When the programmer needs to construct such values how about allowing
him or her to specify something like

(20.1 in base 3) times 2 to the power of 0xac

Leaving out how to specify (20.1 in base 3) for now this could be

(20.1 in base 3) * 2 ** 0xac

The compiler could convert this to a constant.

James

James Harris

unread,

Aug 23, 2009, 5:42:16 PM8/23/09

to

On 23 Aug, 00:16, Mel <mwil...@the-wire.com> wrote:
> James Harris wrote:
> > I have no idea why Ada which uses the # also apparently uses it to end
> > a number
>
> > 2#1011#, 8#7621#, 16#c26b#
>
> Interesting. They do it because of this example from
> <http://archive.adaic.com/standards/83rat/html/ratl-02-01.html#2.1>:

Thanks for providing an explanation.

>
> 2#1#E8 -- an integer literal of value 256
>
> where the E prefixes a power-of-2 exponent, and can't be taken as a digit of
> the radix. That is to say
>
> 16#1#E2
>
> would also equal 256, since it's 1*16**2 .
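
The two Ada examples can be checked with a simplified sketch. It handles integer based literals only, and the name `parse_ada_based` is made up; it is not a full implementation of the Ada lexical rules:

```python
import re

# base '#' digits '#' optional exponent, e.g. 2#1#E8 or 16#FF#
ADA_BASED = re.compile(r"^(\d+)#([0-9A-Fa-f_]+)#(?:[Ee]\+?(\d+))?$")


def parse_ada_based(text):
    """Value of an Ada-style based integer literal (simplified)."""
    match = ADA_BASED.match(text)
    if match is None:
        raise ValueError("not a based literal: %r" % text)
    base = int(match.group(1))
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    mantissa = int(match.group(2).replace("_", ""), base)
    exponent = int(match.group(3) or 0)
    # The exponent is a power of the base, not a power of ten.
    return mantissa * base ** exponent


print(parse_ada_based("2#1#E8"), parse_ada_based("16#1#E2"))
```

The closing # is what lets the E after it be read as an exponent marker instead of the hex digit E.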

Here's another suggested number literal format. First, keep the
familiar 0x and 0b of C and others and add 0t for octal. (T is the
third letter of octal as X is the third letter of hex.) The numbers
above would be

0b1011, 0t7621, 0xc26b

Second, allow an arbitrary number base by putting base and number in
quotes after a zero as in

0"2:1011", 0"8:7621", 0"16:c26b"

This would work for arbitrary bases and allows an exponent to be
tagged on the end. It only depends on zero followed by a quote mark
not being used elsewhere. Finally, although it uses a colon it doesn't
take it away from being used elsewhere in the language.

Another option:

0.(2:1011), 0.(8:7621), 0.(16:c26b)

where the three characters "0.(" begin the sequence.

Comments? Improvements?

James

James Harris

unread,

Aug 23, 2009, 5:45:39 PM8/23/09

to

On 23 Aug, 21:55, James Harris <james.harri...@googlemail.com> wrote:

...

> > However for floating point you

> > need at least three letters because a floating point number has
> > three parts: the fixed point part, the exponent base, and the
> > exponent. Now we can represent the radices of the individual
> > parts with the 'r' scheme, e.g., 2r101001, but we need separate
> > letters to designate the exponent base and the exponent. B and E
> > are the obvious choices, though we want to be careful about a
> > confusion with 'b' in hex. For example, using 'R',
>
> > 3R20.1B2E16Rac
>
> Ooh err!
>
> > is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
>
> > I grant that this example looks a bit gobbledegookish,
>
> You think? :-)
>
> > but normal
> > usage would be much simpler. The notation doesn't handle
> > balanced trinary; however I opine that balanced trinary requires
> > special notation.
>
> When the programmer needs to construct such values how about allowing
> him or her to specify something like
>
> (20.1 in base 3) times 2 to the power of 0xac
>
> Leaving out how to specify (20.1 in base 3) for now this could be
>
> (20.1 in base 3) * 2 ** 0xac

Using the suggestion from another post would convert this to

0.(3:20.1) * 2 ** 0xac

MRAB

unread,

Aug 23, 2009, 6:15:55 PM8/23/09

to pytho...@python.org

James Harris wrote:
> On 23 Aug, 00:16, Mel <mwil...@the-wire.com> wrote:
>> James Harris wrote:
>>> I have no idea why Ada which uses the # also apparently uses it to end
>>> a number
>>> 2#1011#, 8#7621#, 16#c26b#
>> Interesting. They do it because of this example from
>> <http://archive.adaic.com/standards/83rat/html/ratl-02-01.html#2.1>:
>
> Thanks for providing an explanation.
>
>> 2#1#E8 -- an integer literal of value 256
>>
>> where the E prefixes a power-of-2 exponent, and can't be taken as a digit of
>> the radix. That is to say
>>
>> 16#1#E2
>>
>> would also equal 256, since it's 1*16**2 .
>
> Here's another suggested number literal format. First, keep the
> familiar 0x and 0b of C and others and add 0t for octal. (T is the
> third letter of octal as X is the third letter of hex.) The numbers
> above would be
>
> 0b1011, 0t7621, 0xc26b
>
> Second, allow an arbitrary number base by putting base and number in
> quotes after a zero as in
>
> 0"2:1011", 0"8:7621", 0"16:c26b"
>

Why not just put the base first, followed by the value in quotes:

2"1011", 8"7621", 16"c26b"

Scott David Daniels

unread,

Aug 23, 2009, 6:50:29 PM8/23/09

to

James Harris wrote:...

> Another option:
>
> 0.(2:1011), 0.(8:7621), 0.(16:c26b)
>
> where the three characters "0.(" begin the sequence.
>
> Comments? Improvements?

I did a little interpreter where non-base 10 numbers
(up to base 36) were:

.7.100 == 64 (octal)
.9.100 == 100 (decimal)
.F.100 == 256 (hexadecimal)
.1.100 == 4 (binary)
.3.100 == 9 (trinary)
.Z.100 == 46656 (base 36)
Advantages:
Tokenizer can recognize chunks easily.
Not visually too confusing,
No issue of what base the base indicator is expressed in.

--Scott David Daniels
Scott....@Acm.Org

bartc

unread,

Aug 23, 2009, 7:04:37 PM8/23/09

to

"Scott David Daniels" <Scott....@Acm.Org> wrote in message
news:kN2dnSZR5b0BWAzX...@pdx.net...

It can be assumed however that .9. isn't in binary?

That's a neat idea. But an even simpler scheme might be:

.octal.100
.decimal.100
.hex.100
.binary.100
.trinary.100

until it gets to this anyway:

.thiryseximal.100

--
Bartc

Max Erickson

unread,

Aug 23, 2009, 9:19:53 PM8/23/09

to pytho...@python.org

"bartc" <ba...@freeuk.com> wrote:

>
> "Scott David Daniels" <Scott....@Acm.Org> wrote in message
> news:kN2dnSZR5b0BWAzX...@pdx.net...
>> James Harris wrote:...
>>> Another option:
>

> It can be assumed however that .9. isn't in binary?
>
> That's a neat idea. But an even simpler scheme might be:
>
> .octal.100
> .decimal.100
> .hex.100
> .binary.100
> .trinary.100
>
> until it gets to this anyway:
>
> .thiryseximal.100
>

At some point, abandoning direct support for literals and just
having a function that can handle different bases starts to make a
lot of sense to me:

>>> int('100', 8)
64
>>> int('100', 10)
100
>>> int('100', 16)
256
>>> int('100', 2)
4
>>> int('100', 3)
9
>>> int('100', 36)
1296

max

greg

unread,

Aug 23, 2009, 9:46:18 PM8/23/09

to

J. Cliff Dyer wrote:

> What happens if you use a literal like 0x10f 304?

To me the obvious thing to do is concatenate them
textually and then treat the whole thing as a single
numeric literal. Anything else wouldn't be sane, IMO.

--
Greg

Ben Finney

unread,

Aug 23, 2009, 10:29:20 PM8/23/09

to

Max Erickson <maxer...@gmail.com> writes:

> At some point, abandoning direct support for literals and just
> having a function that can handle different bases starts to make a
> lot of sense to me:
>
> >>> int('100', 8)
> 64
> >>> int('100', 10)
> 100
> >>> int('100', 16)
> 256
> >>> int('100', 2)
> 4
> >>> int('100', 3)
> 9
> >>> int('100', 36)
> 1296

Hah! You don't get me that easily, nobody would make something so simple
and obvious.

Right, guys?

--
\ “When a well-packaged web of lies has been sold to the masses |
`\ over generations, the truth will seem utterly preposterous and |
_o__) its speaker a raving lunatic.” —Dresden James |
Ben Finney

Ben Finney

unread,

Aug 23, 2009, 10:45:25 PM8/23/09

to

greg <gr...@cosc.canterbury.ac.nz> writes:

Yet, as was pointed out, that behaviour would be inconsistent with the
concatenation of string literals::

>>> "abc" r'def' u"ghi" 'jkl'
u'abcdefghijkl'

So, different representations of literals are parsed as separate
literals, then concatenated. To have the behaviour you describe, the
case needs to be made separately that digit concatenation should not be
consistent with the established string literal parsing behaviour.

--
\ “What if the Hokey Pokey IS what it's all about?” —anonymous |
`\ |
_o__) |
Ben Finney

Piet van Oostrum

unread,

Aug 24, 2009, 3:51:37 AM8/24/09

to

>>>>> Scott David Daniels <Scott....@Acm.Org> (SDD) wrote:

>SDD> James Harris wrote:...

>>> Another option:
>>>
>>> 0.(2:1011), 0.(8:7621), 0.(16:c26b)
>>>
>>> where the three characters "0.(" begin the sequence.
>>>
>>> Comments? Improvements?

>SDD> I did a little interpreter where non-base 10 numbers
>SDD> (up to base 36) were:

>SDD> .7.100 == 64 (octal)
>SDD> .9.100 == 100 (decimal)
>SDD> .F.100 == 256 (hexadecimal)
>SDD> .1.100 == 4 (binary)
>SDD> .3.100 == 9 (trinary)
>SDD> .Z.100 == 46656 (base 36)

I wonder how you wrote that interpreter, given that some answers are wrong.
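
For reference, a sketch of the scheme under one assumption: that the character between the dots is the largest digit of the base, so .7. means base 8 and .Z. means base 36. The function name `parse_dot_scheme` is hypothetical and this is not the original interpreter:

```python
def parse_dot_scheme(text):
    """Parse '.D.digits', where D is taken to be the largest digit of the base."""
    empty, marker, digits = text.split(".", 2)
    if empty != "" or len(marker) != 1:
        raise ValueError("expected the form .D.digits")
    base = int(marker, 36) + 1     # largest digit plus one gives the base
    return int(digits, base)


for literal in (".7.100", ".9.100", ".F.100", ".1.100",
                ".2.100", ".3.100", ".Z.100"):
    print(literal, "==", parse_dot_scheme(literal))
```

Under that reading, trinary would be spelled .2.100, while .3.100 is base 4.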
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org

Erik Max Francis

unread,

Aug 24, 2009, 4:05:40 AM8/24/09

to

It's always a bit impressive how syntax suggestions get more and more
involved and, if you'll forgive me for saying, ridiculous as the
conversation continues. This is starting to get truly nutty.

What I've done in my projects is simply extend the pattern of 0x... for
hexadecimal literals in C to 0b... for binary, 0o... for octal, 0d...
for decimal (though redundant as that's the default), and so on. (Go
crazy and add 0t... for trinary and 0q... for quaternary if you feel
like it.) To me this always seemed elegant, simple, and understandable.

If arbitrary radix values are what's desired, then some syntax like

(e.g., 8r024222570 for an octal number which represents a very lame
joke) would work, but seems to me like huge overkill. A normal string
literal coupled with a "constructor" type function would seem far more
appropriate -- and we already have that with `int`.
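
A sketch of that "string plus constructor" approach applied to the radix-r spelling mentioned earlier in the thread (16rFF, 8r377). The function name `parse_radix_literal` is made up, and the heavy lifting is simply int():

```python
def parse_radix_literal(text):
    """Parse '<base>r<digits>' such as '16rFF', using int() for the digits."""
    base_text, sep, digits = text.partition("r")
    # The base is written in decimal, so the first 'r' is always the separator.
    if not sep or not base_text.isdigit():
        raise ValueError("expected <base>r<digits>")
    base = int(base_text)
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    return int(digits, base)


print(parse_radix_literal("16rFF"), parse_radix_literal("8r377"))
```

Nothing here needs new syntax, which is the argument for leaving arbitrary bases to a function.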

As for large literals, I'd go with having spaces indicate automatic
concatenation (though only the first in the series can indicate the
radix, whichever method you choose above). It's the same as for
strings, and it's the common SI recommendation for thousands separators
anyway.

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis
The little I know, I owe to my ignorance.
-- Sacha Guitry

Erik Max Francis

unread,

Aug 24, 2009, 4:12:22 AM8/24/09

to

J. Cliff Dyer wrote:
> I had an objection to using spaces in numeric literals last time around,
> and it still stands this time.
>
> What happens if you use a literal like 0x10f 304? Is 304 treated as
> decimal or hexadecimal? It's not clear how you would begin to combine
> it.

Well, you can't combine them in any meaningful mathematical or
computational sense if they're of different bases, so the answer lies
therein: You shouldn't be allowed to do that.

> The way string concatenation works, it takes two independent string
> literals, and combines them. If you specify r'\n' 'abc\n', the first
> half is treated independently as a raw string, and the second half is
> treated as a normal string. The result is '\\nabc\n'.
>
> With numeric literals, this behavior doesn't even make sense. How do
> you concatenate hex 10f with decimal 304?

You can't, and the operation makes no sense, which is what makes the
syntax unambiguous. An extended numeric literal continues the radix of
wherever it started.

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis

Do not seek death. Death will find you.
-- Dag Hammarskjold

James Harris

Aug 24, 2009, 4:16:07 AM

to

On 24 Aug, 02:19, Max Erickson <maxerick...@gmail.com> wrote:

...

> > It can be assumed however that .9. isn't in binary?
>
> > That's a neat idea. But an even simpler scheme might be:
>
> > .octal.100
> > .decimal.100
> > .hex.100
> > .binary.100
> > .trinary.100
>
> > until it gets to this anyway:
>
> > .thiryseximal.100
>
> At some point, abandoning direct support for literals and just
> having a function that can handle different bases starts to make a
> lot of sense to me:
>
> >>> int('100', 8)
> 64
> >>> int('100', 10)
> 100
> >>> int('100', 16)
> 256
> >>> int('100', 2)
> 4
> >>> int('100', 3)
> 9
> >>> int('100', 36)
> 1296

This is fine typed into the language directly, but it couldn't be
entered by the user, read in from a file or written to one.

James

Erik Max Francis

Aug 24, 2009, 4:20:02 AM

to

Ben Finney wrote:
> Yet, as was pointed out, that behaviour would be inconsistent with the
> concatenation of string literals::
>
> >>> "abc" r'def' u"ghi" 'jkl'
> u'abcdefghijkl'
>
> So, different representations of literals are parsed as separate
> literals, then concatenated. To have the behaviour you describe, the
> case needs to be made separately that digit concatenation should not be
> consistent with the established string literal parsing behaviour.

Since digit concatenation can't possibly be useful any other way, it
makes perfect sense.

Why is the operator ** right-to-left associative? The same basic
reason: Because it would be dumb for it not to be. Does that make it
confusing and inconsistent compared to most of the other binary
operators? In some sense, yes, it does. But it also makes it sane. Is
anyone so upset by this that it didn't make it into the language, or
cause huge confusion on a regular basis that upsets a lot of users? Nope.
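
As a quick illustration of the associativity point, here is a minimal
sketch (the variable names are only for illustration; any Python version
should behave this way):

```python
# ** groups right-to-left, unlike most other binary operators
right = 2 ** 3 ** 2      # parsed as 2 ** (3 ** 2)
left = (2 ** 3) ** 2     # what left-to-right grouping would give

# subtraction, by contrast, groups left-to-right
sub = 10 - 4 - 3         # parsed as (10 - 4) - 3

print("%d %d %d" % (right, left, sub))
```

The first value is the one the language actually produces for the
unparenthesized expression; the second shows what the "consistent"
grouping would have given instead.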

James Harris

Aug 24, 2009, 4:25:35 AM

to

On 24 Aug, 09:05, Erik Max Francis <m...@alcyone.com> wrote:

...

> >> Here's another suggested number literal format. First, keep the
> >> familiar 0x and 0b of C and others and add 0t for octal. (T is the
> >> third letter of octal as X is the third letter of hex.) The numbers
> >> above would be
>
> >> 0b1011, 0t7621, 0xc26b
>
> >> Second, allow an arbitrary number base by putting base and number in
> >> quotes after a zero as in
>
> >> 0"2:1011", 0"8:7621", 0"16:c26b"
>
> > Why not just put the base first, followed by the value in quotes:
>
> > 2"1011", 8"7621", 16"c26b"
>
> It's always a bit impressive how syntax suggestions get more and more
> involved and, if you'll forgive me for saying, ridiculous as the
> conversation continues. This is starting to get truly nutty.

Why do you say that here? MRAB's suggestion is one of the clearest
there has been. And it incorporates the other requirements: starts
with a digit, allows an appropriate alphabet, has no issues with
spacing digit groups, shows clearly where the number ends and could
take an exponent suffix.

James

Hendrik van Rooyen

Aug 24, 2009, 4:29:20 AM

to pytho...@python.org

On Monday 24 August 2009 01:04:37 bartc wrote:

> That's a neat idea. But an even simpler scheme might be:
>
> .octal.100
> .decimal.100
> .hex.100
> .binary.100
> .trinary.100
>
> until it gets to this anyway:
>
> .thiryseximal.100

Yeah right. So now I first have to type a string, which probably has a strict
spelling, before a number. It is only marginally less stupid than this:

1.0 - Unary
11.0101 - Binary
111.012012 - Trinary
11111111.01234567 - Octal
1111111111.0123456789 - Decimal
1111111111111111.0123456789abcdef - Hex

Any parser that can count will immediately know what to do.

I also tried to include an example of a literal with a base of a Googol but I
ran out of both ink and symbols.
:-)
- Hendrik

Erik Max Francis

Aug 24, 2009, 4:30:11 AM

to

In your opinion. Obviously not in others. Which is pretty obviously
what I meant, so the rhetorical question is a bit weird here.

There's a reason that languages designed by committee end up horrific
nightmares.

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis

Erik Max Francis

Aug 24, 2009, 4:35:47 AM

to

Why would a programmer be expecting an arbitrary-radix numeric literal
typed in by a user or read from a file? If you're reading it from a
file, you've already got it in some satisfactory form, binary or
otherwise. If you're taking it as input from a user, they're not going
to know the Python syntax anyway, and would type in the radix and then
the literal (in the unlikely case this would ever be required, which is
still hard to imagine).

Either way, conversion is, as Max showed, one line of code. It's hard
to see the explicit need for truly arbitrary-radix literals in any
language -- and I'm the guy who's put quaternary literals in syntaxes
he's had to develop just for fun. Binary, octal, decimal, hexadecimal,
sure. Beyond that it's a solution begging for problems.
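
A minimal sketch of the user-input case, assuming the user types the
radix and the digits separated by a colon. That "radix:digits"
convention and the name parse_radix_input are purely illustrative
assumptions, not something proposed in this thread; the actual
conversion is the one-line int() call Max showed:

```python
def parse_radix_input(text):
    """Parse hypothetical 'radix:digits' input, e.g. '8:7621'."""
    radix_text, digits = text.split(":", 1)
    radix = int(radix_text)          # the radix itself is written in decimal
    return int(digits, radix)        # int() accepts bases 2 through 36

for sample in ("2:1011", "8:7621", "16:c26b", "36:100"):
    print("%s -> %d" % (sample, parse_radix_input(sample)))
```

No new literal syntax is needed for this; the radix is just ordinary
data handed to int().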

Erik Max Francis

Aug 24, 2009, 4:41:14 AM

to

Hendrik van Rooyen wrote:
> I also tried to include an example of a literal with a base of a Googol but I
> ran out of both ink and symbols.
> :-)

... or particles in the observable Universe, for that matter.

James Harris

Aug 24, 2009, 4:47:44 AM

to

On 24 Aug, 09:30, Erik Max Francis <m...@alcyone.com> wrote:
> James Harris wrote:
> > On 24 Aug, 09:05, Erik Max Francis <m...@alcyone.com> wrote:
> >>>> Here's another suggested number literal format. First, keep the
> >>>> familiar 0x and 0b of C and others and add 0t for octal. (T is the
> >>>> third letter of octal as X is the third letter of hex.) The numbers
> >>>> above would be
> >>>> 0b1011, 0t7621, 0xc26b
> >>>> Second, allow an arbitrary number base by putting base and number in
> >>>> quotes after a zero as in
> >>>> 0"2:1011", 0"8:7621", 0"16:c26b"
> >>> Why not just put the base first, followed by the value in quotes:
> >>> 2"1011", 8"7621", 16"c26b"
> >> It's always a bit impressive how syntax suggestions get more and more
> >> involved and, if you'll forgive me for saying, ridiculous as the
> >> conversation continues. This is starting to get truly nutty.
>
> > Why do you say that here? MRAB's suggestion is one of the clearest
> > there has been. And it incorporates the other requirements: starts
> > with a digit, allows an appropriate alphabet, has no issues with
> > spacing digit groups, shows clearly where the number ends and could
> > take an exponent suffix.
>
> In your opinion. Obviously not in others. Which is pretty obviously
> what I meant, so the rhetorical question is a bit weird here.

Don't get defensive.... Yes, in my opinion, if you like, but you can't
say "obviously not in others" as no one else but you has commented on
MRAB's suggestion.

Also, when you say "This is starting to get truly nutty" would you
accept that that's in your opinion?

> There's a reason that languages designed by committee end up horrific
> nightmares.

True but I would suggest that mistakes are also made by designers who
do not seek the opinions of others. There's a balance to be struck
between a committee and an ivory tower.

James

Carl Banks

Aug 24, 2009, 6:56:08 AM

to

On Aug 23, 7:45 pm, Ben Finney <ben+pyt...@benfinney.id.au> wrote:

> greg <g...@cosc.canterbury.ac.nz> writes:
> > J. Cliff Dyer wrote:
>
> > > What happens if you use a literal like 0x10f 304?
>
> > To me the obvious thing to do is concatenate them textually and then
> > treat the whole thing as a single numeric literal. Anything else
> > wouldn't be sane, IMO.
>
> Yet, as was pointed out, that behaviour would be inconsistent with the
> concatenation of string literals::
>
> >>> "abc" r'def' u"ghi" 'jkl'
> u'abcdefghijkl'

Well my take on it is that this would not be the same as string
concatenation, the series of digits would be parsed as a single token
with spaces automatically removed. That does make a difference to the
users (it's not just under the covers).

For instance, string concatenation works across lines:

"abc"
"def"

but if the numbers were parsed as a single token it wouldn't
necessarily be allowed, and would be unwise, so this is out:

100
200

You might want to also enforce rules such as only a single space can
separate digits, no tabs, not multiple spaces, so this

100   200

would also be right out. You might even want to enforce that spaces
be at regular intervals. I don't think it would matter too much that
digit separation can superficially resemble string concatenation if
you don't break the strings across lines, it's not too difficult to
explain what the difference is, and there's really not much chance
anyone would be confused by their meanings.
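
A rough sketch of how strict such rules could be, treating the grouped
digits as one token. The three-digit grouping, the regular expression
and the name parse_spaced_decimal are assumptions made for
illustration; nothing like this exists in Python:

```python
import re

# one leading group of 1-3 digits, then groups of exactly 3 digits,
# each separated by exactly one space (no tabs, no newlines)
_SPACED_DECIMAL = re.compile(r"[0-9]{1,3}(?: [0-9]{3})*\Z")

def parse_spaced_decimal(token):
    """Parse a decimal literal written with single-space digit groups."""
    if not _SPACED_DECIMAL.match(token):
        raise ValueError("badly spaced literal: %r" % (token,))
    return int(token.replace(" ", ""))

print(parse_spaced_decimal("1 234 567"))
```

With this pattern a literal broken across lines, or one with doubled
spaces or tabs, is rejected rather than silently joined.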

Having said all that, I would favor _ as a digit separator in Python
any day of the week, and I don't think it's all that important to have
one at all.

HOWEVER, I once proposed that if I were designing a new language I'd
consider allowing spaces in identifiers. (That didn't stop people
from arguing why it would be confusing in Python, but never mind
that.) If spaces were allowed in identifiers, then I'd be also in
favor of spaces in numeric literals.

> So, different representations of literals are parsed as separate
> literals, then concatenated. To have the behaviour you describe, the
> case needs to be made separately that digit concatenation should not be
> consistent with the established string literal parsing behaviour.

Well, one doesn't really *need* to make that case, they just might not
care about consistency.

But if they did I think Erik's case is a good one: very little chance
of confusion because there's really only one reasonable
interpretation. The point of consistency is to help understand things
by analogy, but if analogy doesn't help understanding--and it wouldn't
in this case--there's no point.

Carl Banks

NevilleDNZ

Aug 24, 2009, 8:22:42 AM

to

On Aug 23, 9:42 pm, James Harris <james.harri...@googlemail.com>
wrote:

> The numbers above would be
>
> 0b1011, 0t7621, 0xc26b

Algol68 has the type BITS, that is converted to INT with the ABS
operator.
The numbers above would be:
> 2r1011, 8r7621, 16rc26b

"r" is for radix: http://en.wikipedia.org/wiki/Radix

The standard supports 2r, 4r, 8r & 16r only.

The standard supports LONG BITS, LONG LONG BITS etc, but does not
include UNSIGNED.

Compare gcc's:

bash$ cat num_lit.c
#include <stdio.h>
int main(void){
    /* note: the 0b prefix is a GCC extension, not standard C */
    printf("%d %d %d %d\n",0xffff,07777,9999,0b1111);
    return 0;
}

bash$ ./num_lit
65535 4095 9999 15

With Algol68's: https://sourceforge.net/projects/algol68/

bash$ cat num_lit.a68
main:(
printf(($g$,ABS 16rffff,ABS 8r7777,9999,ABS 2r1111,$l$))
)

bash$ algol68g ./num_lit.a68
+65535 +4095 +9999 +15
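
For comparison, the same four values in Python. This is a minimal
sketch that assumes 2.6 or later, where the 0o and 0b prefixes are
accepted; the name values is only for illustration:

```python
# hexadecimal, octal, decimal and binary literals, as in the C example
values = (0xffff, 0o7777, 9999, 0b1111)
print("%d %d %d %d" % values)
```

This prints the same numbers as the gcc and algol68g runs above.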

Enjoy
N

Mel

Aug 24, 2009, 9:05:24 AM

to

James Harris wrote:

> On 24 Aug, 02:19, Max Erickson <maxerick...@gmail.com> wrote:

[ ... ]

>> >>> int('100', 3)
>> 9
>> >>> int('100', 36)
>> 1296
>
> This is fine typed into the language directly but couldn't be entered
> by the user or read-in from or written to a file.

That's rather beside the point. Literals don't essentially come from files
or user input. Essentially literals are a subset of expressions, just like
function calls are, and they have to be evaluated by Python to yield a
value. I'm not averse to 32'rst', but we already have

Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> int ('rst', 32)
28573

Mel.

>
> James

Derek Martin

Aug 24, 2009, 9:42:31 AM

to Steven D'Aprano, pytho...@python.org

On Sun, Aug 23, 2009 at 06:13:31AM +0000, Steven D'Aprano wrote:
> On Sat, 22 Aug 2009 22:19:01 -0500, Derek Martin wrote:
> > On Sat, Aug 22, 2009 at 02:55:51AM +0000, Steven D'Aprano wrote:
> >> And the great thing is that now you get to teach yourself to stop
> >> writing octal numbers implicitly and write them explicitly with a
> >> leading 0o instead :)
> >
> > Sorry, I don't write them implicitly. A leading zero explicitly states
> > that the numeric constant that follows is octal.
>
> That is incorrect.

No, it simply isn't. It is a stated specification in most popular
programming languages that an integer preceded by a leading zero is an
octal number. That makes it explicit, when used by a programmer to
write an octal literal. By definition. End of discussion.

> (Explicitness isn't a binary state

Of course it is. Something can be either stated or implied... there
are no shades in between. Perhaps you mean "obvious and intuitive"
where you are using the word "explicit" above (and that would be a
matter of subjective opinion). The leading zero, however, is
undoubtedly explicit. It is an explicitly written token which, in
that context, has the meaning that the digits that follow are an octal
number. One simply needs to be aware of that aspect of the
specification of the programming language, and one will clearly know
that it is octal.

My point in mentioning the many other programming languages, by the
way, was NOT to suggest, "See, look here, all these other folks
do it that way too, so it must be right." It was to refute the notion that
the leading zero as octal was in some way unusual. It is, in fact,
ubiquitous in computing, taught roughly in the first week of any
beginning computing course using nearly any modern popular programming
language, and discussed within the first full page of text in the
section on numerical literals in _Learning Python_ (and undoubtedly
many other books on Python). It may be a surprise the first time you
run into it, but you typically won't forget that detail after you run
into it the first time.

> However, octal numbers are defined implicitly: 012 is a legal base 10
> number, or base 3, or base 9, or base 16.

Not in any programming language I use today, it's not. In all of
those, 012 is an octal integer literal, per the language spec.

> There's nothing about a leading zero that says "base 8" apart from
> familiarity.

That is simply untrue. What says base 8 about a leading zero is the
formal specification of the programming language.

The only way using octal could be implicit in the code is if you
wrote something like:

x = 12

in your code, and then had to pass a flag to your compiler or
interpreter to tell it that you meant to use octal integer literals
instead of decimal ones. That, of course, would be insane. But
specifying a leading zero to represent an octal number is very
much explicit, by definition.

> We can see the difference between leading 0x and leading 0 if you
> repeat it: repeating an explicit 0x, as in 0x0xFF, is a syntax
> error, while repeating an implicit 0 silently does nothing
> different:

No, we can't. Just as you can type 0012, you can also type 0x0FF.
It's not different AT ALL. In both cases, the marker designated by
the programming language as the base indicator can be followed by an
arbitrary number of zeros which do not impact the magnitude of the
integer so specified. Identical behavior. The key is simply to
understand that the first 0 is not a digit -- it's a syntactic marker,
an operator if you will (though Python may not technically think of it
that way). The definition of '0' is overloaded, just as other
language tokens often are. This, too, is hardly unusual.
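
The same behaviour can be checked without relying on 2.x literal syntax
by handing the digit strings to int() with an explicit base. This is
only a small sketch, and the variable names are made up for the
example:

```python
# extra zeros after the base marker do not change the value
octal_plain = int("012", 8)
octal_padded = int("0012", 8)
hex_plain = int("0xFF", 16)
hex_padded = int("0x0FF", 16)

print("%d %d %d %d" % (octal_plain, octal_padded, hex_plain, hex_padded))
```

In both notations the padded and unpadded forms come out equal.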

> There are a bunch of languages, pretty much all heavily influenced
> by C, which treat integer literals with leading 0s as oct: C++,
> Javascript, Python 2.x, Ruby, Perl, Java. As so often is the case,
> C's design mistakes become common practice. Sigh.

That it is a design mistake is a matter of opinion. Obviously the
people who designed it didn't think it was a mistake, and neither do
I. If you search the web for this topic (I did), you will find no
shortage of people who find the recent movement to eradicate the
leading zero frustrating, annoying, and/or stupid. And by
the way, this representation predates C. It was at least present in
B.

> FORTRAN 90 uses a leading O (uppercase o) for octal

That clearly IS a design mistake, because O is virtually
indistinguishable from 0, especially considering varying fonts and
people's varying eyesight.

> Algol uses an explicit base: 8r12 to indicate octal 10.

This is far better than 0o01. I maintain that 0o1 is only marginally
better than O01 (from your Fortran example) or 0O1, allowed by Python.
The string 8r12 has the nicety that it can easily be used to represent
integers of any radix in a consistent fashion. But I maintain that
the leading zero is just fine, and changing it after 20 years of
Python seems more than a little arbitrary and brain-damaged to me.

> > Computer languages are not write-only, excepting maybe Perl. ;-) Writing
> > 0o12 presents no hardship; but I assert, with at least some support from
> > others here, that *reading* it does.
>
> No more so than 0x or 0b literals. If anything, 0o12 stands out as "not
> twelve" far more than 012 does.

Obviously, I don't agree.

Moving on... I've wasted enough time arguing about this.

--
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D

Derek Martin

Aug 24, 2009, 9:56:48 AM

to Matthew Woodcraft, pytho...@python.org

On Sun, Aug 23, 2009 at 01:13:32PM +0000, Matthew Woodcraft wrote:
> Dennis Lee Bieber <wlf...@ix.netcom.com> writes:
> > About the only place one commonly sees leading zeros on decimal
> > numbers, in my experience, is zero-filled COBOL data decks (and since
> > classic COBOL stores in BCD anyway... binary (usage is
> > computational/comp-1) was a later add-on to the data specification model
> > as I recall...)
>
> A more common case is dates.

I suppose this is true, but I can't remember the last time I
hard-coded a date in a program, or worked on someone else's code with
hard-coded dates. I'm fairly certain I've never done it, and if I
had, I obviously would not have used leading zeros. I think
hard-coding dates is more uncommon than using octal. ;-) [It
unquestionably is, for me personally.] I tend to also discount this
example, because when we write dates with leading zeros, usually it's
in some variation of the form 08/09/2009, which, containing slashes,
is a string, not a number, and can not be used as a date literal in
any language I know. We do it for reasons of format, which again
implies that it has more the characteristics of a string than of a
number. To use such a thing in any programming language I can think
of would require translation from a string.

> I've seen people trip over this writing things like
>
> xxx = [
> date(2009, 10, 12),
> date(2009, 12, 26),
> date(2010, 02, 09),
> ]

I've never seen anyone do this (no doubt because it would be an
error), but as I said, I don't think I've ever seen hard-coded dates
in any programs I've worked on. I've never encountered anyone having
problems with octals who was not a total noob at programming. The
changing of this syntax seems like much ado about nothing to me, and
as such is annoying, considering that I use it very often.
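
To make the failure mode concrete, here is a sketch of how zero-padded
date fields behave when read as octal, which is what a leading zero
means in a 2.x literal. It uses int(text, 8) to mimic that reading, so
it does not depend on the interpreter version; the helper name
read_as_octal is made up for this example:

```python
def read_as_octal(text):
    """Interpret a digit string the way a leading-zero 2.x literal is read."""
    try:
        return int(text, 8)
    except ValueError:
        return None          # not a valid octal number at all

# 02 survives unchanged, 010 silently becomes 8, 09 is rejected outright
for field in ("02", "010", "09"):
    print("%s -> %r" % (field, read_as_octal(field)))
```

The loud failure on 09 is easy to notice; the quiet change from 010 to
8 is the case people actually trip over.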

Derek Martin

Aug 24, 2009, 10:14:25 AM

to Matthew Woodcraft, pytho...@python.org

On Mon, Aug 24, 2009 at 08:56:48AM -0500, Derek Martin wrote:
> On Sun, Aug 23, 2009 at 01:13:32PM +0000, Matthew Woodcraft wrote:
> > A more common case is dates.
>

> I suppose this is true, but [...]

> I tend to also discount this example, because when we write dates
> with leading zeros, usually it's in some variation of the form
> 08/09/2009, which, containing slashes, is a string, not a number

In fact, now that I think of it...

I just looked at some old school papers I had tucked away in a family
album. I'm quite sure that in grammar school, I was taught to use a
date format of 8/9/79, without leading zeros. I can't prove it, of
course, but feel fairly sure that the prevalence of leading zeros in
dates occurred only in the mid to late 1980's as computers became more
prevalent in our society (no doubt because thousands of COBOL
programmers writing business apps needed a way to convert dates as
strings to numbers that was easy and fit in small memory).

Assuming I'm right about that, then the use of a leading 0 to
represent octal actually predates the prevalence of using 0 in dates
by almost two decades. And while using leading zeros in other
contexts is "familiar" to me, I would certainly not consider it
"common" by any means. Thus I think it's fair to say that when this
syntax was selected, it was a rather good choice.

garabik-ne...@kassiopeia.juls.savba.sk

Aug 24, 2009, 10:35:56 AM

to

J. Cliff Dyer <j...@sdf.lonestar.org> wrote:
> I had an objection to using spaces in numeric literals last time around
> and it still stands in the new one.
>

Or, we can use U+00A0 NO-BREAK SPACE, once we already have unicode
variable names :-)
(probably some people would find it difficult to type, though
with my keyboard layout it is COMPOSE + SPACE + SPACE, not
more difficult than _).
Well, reading code listings could be a bit confusing.
Thinking about it, U+2005 FOUR-PER-EM SPACE makes more sense.
Aesthetically, too :-)

--
-----------------------------------------------------------
| Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ |
| __..--^^^--..__ garabik @ kassiopeia.juls.savba.sk |
-----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!

Carl Banks

Aug 24, 2009, 11:31:13 AM

to

On Aug 24, 6:56 am, Derek Martin <c...@pizzashack.org> wrote:
> I think
> hard-coding dates is more uncommon than using octal. ;-) [It
> unquestionably is, for me personally.]

You just don't get it, do you? Do you really think this is a contest
over what's more common and the winner gets to choose the syntax? You
really think that's the issue?

It is not. The issue is that C's arcane octal notation is MIND-
BOGGLINGLY RETARDED.

So, even if Unix file permissions were a hundred times more common
than padding integer constants with zero, it still wouldn't be a good
idea to have it in Python because the notation is retarded.

Even if 99.999% of other languages use the notation, it still wouldn't
be a good idea to have it in Python because the notation is retarded.

The vast majority of people reading this will understand intuitively
why the arcane octal notation is retarded. I was going to explain it,
but I decided it's not worth a serious argument.

Carl Banks

Steven D'Aprano

Aug 24, 2009, 11:36:24 AM

to

On Mon, 24 Aug 2009 12:45:25 +1000, Ben Finney wrote:

> greg <gr...@cosc.canterbury.ac.nz> writes:
>
>> J. Cliff Dyer wrote:
>>
>> > What happens if you use a literal like 0x10f 304?
>>
>> To me the obvious thing to do is concatenate them textually and then
>> treat the whole thing as a single numeric literal. Anything else
>> wouldn't be sane, IMO.

Agreed. It's the only sane way to deal with concatenating numeric
literals. It makes it simple and easy to understand: remove the
whitespace from inside the literal, and parse as normal.

123 4567 => 1234567 # legal
0xff 123 => 0xff123 # legal
123 0xff => 1230xff # illegal

The first two examples would be legal, the last would raise a syntax
error, for obvious reasons. This would also work for floats:

1.23 4e5 => 1.234e5 # legal
1.23 4.5 => 1.234.5 # illegal
1e23 4e5 => 1e234e5 # illegal

> Yet, as was pointed out, that behaviour would be inconsistent with the
> concatenation of string literals::
>
> >>> "abc" r'def' u"ghi" 'jkl'
> u'abcdefghijkl'

Unicode/byte conversion is obviously a special case, and arguably should
have been prohibited, although "practicality beats purity" suggests that
a single unicode string in the sequence should make the lot unicode.
(What else could it mean?)

In any case, numeric concatenation and string concatenation are very
different beasts. With strings, you have to interpret each piece as
either bytes or characters, you have to treat escapes specially, you have
to deal with matching delimiters. For numeric concatenation, none of
those complications is relevant: there is no equivalent to the byte/
character dichotomy, there are no escape sequences, there are no
delimiters.

Numeric literals are much simpler than string literals, consequently the
concatenation rule can be correspondingly simpler too. There's no need to
complicate it by *adding* complexity: you can't have mixed bases in a
single numeric literal without spaces, why would you expect to have mixed
bases in one with spaces?
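
The rule described above (remove the whitespace, then parse as normal)
can be sketched in a few lines. The function name parse_spaced_number
is hypothetical, and falling back from int() to float() is just one
assumed way of covering both kinds of literal:

```python
def parse_spaced_number(text):
    """Drop internal whitespace, then parse what is left as one literal."""
    joined = "".join(text.split())
    try:
        return int(joined, 0)    # base 0: honour 0x, 0o and 0b prefixes
    except ValueError:
        return float(joined)     # raises ValueError if still malformed

for sample in ("123 4567", "0xff 123", "1.23 4e5"):
    print("%s => %r" % (sample, parse_spaced_number(sample)))
```

The three examples marked illegal above ("123 0xff", "1.23 4.5" and
"1e23 4e5") all end in a ValueError here, since the joined text is not
a valid literal in any single base.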

--
Steven

Derek Martin

Aug 24, 2009, 11:50:28 AM

to Hendrik van Rooyen, pytho...@python.org, Matthew Woodcraft

On Mon, Aug 24, 2009 at 05:22:39PM +0200, Hendrik van Rooyen wrote:
> > Assuming I'm right about that, then the use of a leading 0 to
> > represent octal actually predates the prevalence of using 0 in dates
> > by almost two decades.
>

> Not quite - at the time I started, punch cards and data entry forms were
> already well established practice, and at least on the English machines, (ICL
> 1500/1900 series) octal was prevalent, but I don't know when the leading zero
> octal notation started, and where.

I said "prevalence." The key is that the average person did not start
using leading zeros in dates until (I think) much, much later, and
that's what's relevant to this discussion. If it were not commonplace
for people to use decimal numbers with leading zeros, this whole
thread would be a moot point, the python devs likely never would have
considered changing the syntax, and we would not be having this
discussion. Most people did not work as data entry clerks on ICL
computers... :)

Those participating in this thread pretty much all seem to agree
that the only places where decimal numbers with leading zeros really
are common are either in rather specialized applications, such as
computer-oriented data or serial numbers (which typically behave more
like strings, from a computer science perspective), or the rather
common one of dates. The latter case is perhaps what's significant,
if any of those cases are. I tend to think that within the computer
science arena, the history and prevalence of the leading 0 indicating
octal far outweighs all of those cases combined.

> I think you give it credence for far more depth of design thinking than what
> actually happened in those days - some team working on a compiler made a
> decision (based on gut feel or experience, or precedent, or whim ) and that
> was that - lo! - a standard is born!

Rather, I think you give the folks at Bell Labs way too little credit.
They designed a programming language and an operating system that,
while certainly not exactly the same as their original incarnations,
even then contained a lot of features and design principles that
remain state-of-the-art (though perhaps their specific implementation
details have since been improved) and in many ways superior to a lot
of what has come since (e.g. virtually anything that came out of
Microsoft). [That's just my opinion, of course... but shared by many.
:)] I don't think that happened by mere accident. That's not to say
they were perfect, but those guys had their proverbial $#!t together.

Scott David Daniels

Aug 24, 2009, 12:18:20 PM

to

Piet van Oostrum wrote:
>>>>>> Scott David Daniels <Scott....@Acm.Org> (SDD) wrote:
>
>> SDD> James Harris wrote:...
>>>> Another option:
>>>>
>>>> 0.(2:1011), 0.(8:7621), 0.(16:c26b)
>>>>
>>>> where the three characters "0.(" begin the sequence.
>>>>
>>>> Comments? Improvements?
>
>> SDD> I did a little interpreter where non-base 10 numbers
>> SDD> (up to base 36) were:
>
>> SDD> .7.100 == 64 (octal)
>> SDD> .9.100 == 100 (decimal)
>> SDD> .F.100 == 256 (hexadecimal)
>> SDD> .1.100 == 4 (binary)
>> SDD> .3.100 == 9 (trinary)
>> SDD> .Z.100 == 46656 (base 36)
>
> I wonder how you wrote that interpreter, given that some answers are wrong.

Obviously I started with a different set of examples and edited after
starting to make a table that could be interpreted in each base. After
doing that, I forgot to double check, and lo and behold .Z.1000 = 46656,
while .Z.100 = 1296. Since it has been decades since I've had access
to that interpreter, this is all from memory.
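
Reconstructing the scheme as described, with the marker between the
dots read as the largest digit of the base, a sketch might look like
the following. That reading is an assumption drawn from the .7., .9.
and .F. examples, since the interpreter itself is not available, and
the function name parse_dotted is hypothetical:

```python
def parse_dotted(literal):
    """Parse '.M.digits', where M is assumed to be the base's largest digit."""
    empty, marker, digits = literal.split(".")
    if empty or len(marker) != 1:
        raise ValueError("expected the form .M.digits: %r" % (literal,))
    base = int(marker, 36) + 1       # '7' -> 8, 'F' -> 16, 'Z' -> 36
    return int(digits, base)

for sample in (".7.100", ".9.100", ".F.100", ".1.100", ".Z.100", ".Z.1000"):
    print("%s == %d" % (sample, parse_dotted(sample)))
```

Under that reading trinary would be written with a .2. marker, since 2
is the largest trinary digit.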

--Scott David Daniels
Scott....@Acm.Org

Derek Martin

Aug 24, 2009, 12:21:46 PM

to Carl Banks, pytho...@python.org

On Mon, Aug 24, 2009 at 08:31:13AM -0700, Carl Banks wrote:
> On Aug 24, 6:56 am, Derek Martin <c...@pizzashack.org> wrote:
> > I think hard-coding dates is more uncommon than using octal. ;-)
> > [It unquestionably is, for me personally.]
>
> You just don't get it, do you?

I think I get it just fine, thanks.

> Do you really think this is a contest over what's more common and
> the winner gets to choose the syntax? You really think that's the
> issue?

No, I think it's about egos. Someone got the idea that 0o1 was better
than 01, and had to be Right. And had the power to make it happen, or
at least (sadly) convince the people with the power.

I'm simply presenting an argument that the need for the change is not so
clear. You say the old syntax is retarded. I say the new syntax, and
the very act of making the change itself is retarded. I think my
argument is very solid and persuasive; but of course some minds are
invulnerable to persuasion. I might not even disagree that the old
syntax could be improved upon, except that it already is what it is,
and the new syntax is NOT better; I personally believe it's not only
not better, but that it's actually worse. Others have agreed.

> It is not. The issue is that C's arcane octal notation is MIND-
> BOGGLINGLY RETARDED.

As I said, I searched the web on this topic before I bothered to post.
I did a bit of research. One of the things that my search turned up:
A lot of smart people disagree with you. If the use of the leading
zero boggles your mind, then perhaps your mind is too easily boggled,
and perhaps you should seek a different way to occupy your time.

This is yet another case where some Pythonista has gotten it in his
head that "There is One Truth, and the Old Way be Damned, my way is
The Way, and Thus Shall It Be Evermore." And worse yet, managed to
convince others. Well, there's no such thing as One Truth, and there
are different perspectives that are just as valid as yours. I'm
expressing one now. This change sucks. I already know that my rant
won't change the syntax. The only reason I bothered to post is
because I do actually quite like Python -- something I can say of only
one other programming language -- and I think the powers that be are
(in some cases) making it worse, not better. I hoped to open a few
minds with a different perspective, but of course I should have known
better.

0o1 is not better than 01. On my terminal it's hard to see the
difference between 0 and o. YMMV. But since YMMV, and since the old
syntax is prevalent both within and without the Python community,
making the change is, was, and always will be a bad idea.

Steven D'Aprano

Aug 24, 2009, 12:47:43 PM8/24/09

to

On Mon, 24 Aug 2009 09:14:25 -0500, Derek Martin wrote:

> Assuming I'm right about that, then the use of a leading 0 to represent
> octal actually predates the prevalence of using 0 in dates by almost two
> decades. And while using leading zeros in other contexts is "familiar"
> to me, I would certainly not consider it "common" by any means. Thus I
> think it's fair to say that when this syntax was selected, it was a
> rather good choice.

Except of course to anyone familiar with mathematics in the last, oh,
five hundred years or so. Mathematics has used a positional system for
numbers for centuries now: leading zeroes have been insignificant, just
like trailing zeroes after the decimal point:

9 = 09 = 009 = 9.0 = 9.00 = 0009.000 etc.

--
Steven

Derek Martin

Aug 24, 2009, 1:02:39 PM8/24/09

to Steven D'Aprano, pytho...@python.org

On Mon, Aug 24, 2009 at 04:47:43PM +0000, Steven D'Aprano wrote:
> Except of course to anyone familiar with mathematics in the last, oh,
> five hundred years or so. Mathematics has used a positional system for
> numbers for centuries now: leading zeroes have been insignificant, just
> like trailing zeroes after the decimal point:
>
> 9 = 09 = 009 = 9.0 = 9.00 = 0009.000 etc.

Dude, seriously. No one ever *uses* leading zeros in the context of
mathematics except in 2nd grade math class.

Steven D'Aprano

Aug 24, 2009, 1:03:28 PM8/24/09

to

On Mon, 24 Aug 2009 11:21:46 -0500, Derek Martin wrote:

> since the old
> syntax is prevalent both within and without the Python community, making
> the change is, was, and always will be a bad idea.

Octal syntax isn't prevalent *at all*, except in a small number of niche
areas.

You've said that this change is a hardship for you, because on your
terminal 0 and o are hard to distinguish. Personally, I'd say that's a
good sign that your terminal is crappy and you should use a better one,
but putting that aside, let's accept that. To you, for whatever reason,
0o looks just like 00.

Okay then. Under the current 2.x syntax, 0012 would be interpreted as
octal. Under the new 3.x syntax, 0o12 which looks just like 0012 also
would be interpreted as octal. You have argued that it might not be any
harder to type the extra 'o' to get an octal literal, but that it will
hurt readability. I quote:

"Writing 0o12 presents no hardship; but I assert, with at least some
support from others here, that *reading* it does."

But according to you, reading 0o12 is just like reading 0012. 0o12 under
the new syntax gives decimal ten, and it looks just like 0012 in the old
syntax, which also gives ten. So there's no difference in reading, and
you've already accepted that the extra effort in writing it "presents no
hardship".

A whole lot of noise over a change which is more or less invisible.
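
To make the comparison concrete, here is a minimal sketch that runs on Python 3. The variable names are only illustrative, and the old 2.x literal is emulated with int() because Python 3 no longer accepts it as source:

```python
# Python 3 spelling of an octal literal: the 0o prefix is required.
new_style = 0o12

# The same digits parsed explicitly as base 8, which is how the
# 2.x interpreter treated the literal 0012.
old_style_value = int("0012", 8)

print(new_style, old_style_value)
```

Both names end up bound to the same integer, which is the whole point being made here.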

--
Steven

Harald Luessen

Aug 24, 2009, 1:18:16 PM8/24/09

to

On Mon, 24 Aug 2009 Derek Martin wrote:
>Those participating in this thread pretty much all seem to agree
>that the only places where decimal numbers with leading zeros really
>are common are either in rather specialized applications, such as
>computer-oriented data or serial numbers (which typically behave more
>like strings, from a computer science perspective), or the rather
>common one of dates. The latter case is perhaps what's significant,
>if any of those cases are.

I don't like the 'leading 0 is octal'-syntax. I typically think of
numbers as decimal and bytes as hexadecimal.
I would even write something like this:

# iterate over bits 3 to 5
for i in range(0x00, 0x40, 0x08):
    # ...
    print("0x%02x" % i)

0x00
0x08
0x10
0x18
0x20
0x28
0x30
0x38

For me it is easier to see where the bits are in the hex notation.
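
A small sketch of why the hex form lines up with the bit positions, assuming Python 3 (or 2.6+) for format(). The names MASK and rows are made up for this example:

```python
# Bits 3 to 5 together form the mask 0b111000, which is 0x38.
MASK = 0x38

# Walk through every value of that 3-bit field and show each one
# both as zero-padded hex and as zero-padded binary.
rows = [(format(i, "#04x"), format(i, "06b")) for i in range(0x00, 0x40, 0x08)]

for hex_text, bin_text in rows:
    print(hex_text, bin_text)
```

Each step of 0x08 moves exactly one position in bit 3, so the binary column counts from 000 to 111 in its top three digits.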

And it is very common to use numbers with leading zeroes
that are hexadecimal. Like this:

# print address and data
for i in range(0x10000):
    print("%04x: %d" % (i, data[i]))

0000: ...
0001: ...
...
000f: ...
0010: ...
...

When you are looking for examples of numbers where leading zeroes
do not mean octal then consider decimal AND hexadecimal.
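
As a self-contained sketch of the address dump idea (the contents of data are invented sample values, not anything from a real device):

```python
# Hypothetical sample data; any sequence of integers would do.
data = [65, 66, 67]

# Zero-padded hexadecimal address, then the decimal value stored there.
lines = ["%04x: %d" % (address, value) for address, value in enumerate(data)]
print("\n".join(lines))

# Leading zeros after the 0x prefix never change the value of a literal.
same_value = (0x0010 == 0x10)
```

The formatting operator needs a tuple on its right-hand side when there is more than one placeholder, hence the parentheses around (address, value).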

>I tend to think that within the computer
>science arena, the history and prevalence of the leading 0 indicating
>octal far outweighs all of those cases combined.

I disagree.

Harald

Derek Martin

Aug 24, 2009, 1:40:24 PM8/24/09

to Steven D'Aprano, pytho...@python.org

On Mon, Aug 24, 2009 at 05:03:28PM +0000, Steven D'Aprano wrote:
> On Mon, 24 Aug 2009 11:21:46 -0500, Derek Martin wrote:
> > since the old syntax is prevalent both within and without the
> > Python community, making the change is, was, and always will be a
> > bad idea.
>
> Octal syntax isn't prevalent *at all*, except in a small number of
> niche areas.

Steven, don't be obtuse. Where octal is used in programming, the
leading zero is prevalent.

> You've said that this change is a hardship for you, because on your
> terminal 0 and o are hard to distinguish. Personally, I'd say that's a
> good sign that your terminal is crappy and you should use a better one,

The terminal I use is just fine. Stringing together similar
characters always has the possibility of confusing the reader. The
human mind tends to see what it expects, and fills in the gaps when it
does not. It wouldn't matter much if I changed my terminal font,
unless I made the font big enough to be not especially useful, except
for the rather exceptional case of detecting 0o1 and similar patterns
in python code. The suggestion is asinine, and you know it.

> but putting that aside, let's accept that. To you, for whatever reason,
> 0o looks just like 00.

It doesn't look "just like" 00, but similar enough that you have to
pay close attention.

> Okay then. Under the current 2.x syntax, 0012 would be interpreted as
> octal. Under the new 3.x syntax, 0o12 which looks just like 0012 also
> would be interpreted as octal. You have argued that it might not be any
> harder to type the extra 'o' to get an octal literal, but that it will
> hurt readability. I quote:
>
> "Writing 0o12 presents no hardship; but I assert, with at least some
> support from others here, that *reading* it does."

Let me clarify my statement. Writing 0o12 is easy -- there is no
hardship to type the characters 0o12 (well, actually it feels a bit
cumbersome, to be honest). Remembering to do so, however, when
virtually everywhere else that uses octal writes it 012, is not easy.
Then I stand corrected: There is indeed hardship.

> But according to you, reading 0o12 is just like reading 0012. 0o12 under
> the new syntax gives decimal ten, and it looks just like 0012 in the old
> syntax, which also gives ten. So there's no difference in reading,

But there *IS* a difference in reading, because 0o12 is not the same
as 0012, and which one you use *matters*. In particular, it will matter
with the adoption of Python 3.x, where the latter will be an error.
But it matters even in 2.6 because right now, you can write it either
way, and that is (I think) even more confusing... There is also still
discussion (mentioned in the relevant PEP) about making 0012 *valid
decimal*. That should never, ever, ever happen.
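
For anyone who wants to check what their interpreter does with each spelling, here is a rough sketch. It assumes a Python 3 interpreter, and literal_status is a helper name invented for this example:

```python
def literal_status(source):
    """Return the value of a numeric literal, or "SyntaxError" if rejected."""
    try:
        # compile() in "eval" mode runs the literal through the real tokenizer.
        return eval(compile(source, "<literal>", "eval"))
    except SyntaxError:
        return "SyntaxError"


for text in ("0o12", "0012", "12"):
    print(text, "->", literal_status(text))
```

On 3.x the middle case is rejected outright rather than being read as either octal or decimal.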

Why is it so hard for you to accept that intelligent people can
disagree with you, and that what's right for you might be bad for
others?

Gabriel Genellina

Aug 24, 2009, 3:40:14 PM8/24/09

to pytho...@python.org

On Mon, 24 Aug 2009 14:40:24 -0300, Derek Martin <co...@pizzashack.org>
wrote:

> Why is it so hard for you to accept that intelligent people can
> disagree with you, and that what's right for you might be bad for
> others?

Ask the same question yourself please.

--
Gabriel Genellina

Derek Martin

Aug 24, 2009, 4:14:24 PM8/24/09

to Gabriel Genellina, pytho...@python.org

On Mon, Aug 24, 2009 at 04:40:14PM -0300, Gabriel Genellina wrote:
> On Mon, 24 Aug 2009 14:40:24 -0300, Derek Martin
> <co...@pizzashack.org> wrote:
>
> >Why is it so hard for you to accept that intelligent people can
> >disagree with you, and that what's right for you might be bad for
> >others?
>
> Ask the same question yourself please.

I accept it. But I reserve the right to voice my dissent, and am
doing so. The Usual Suspects in this forum seem to suggest that the
change is some silver bullet that makes Python suddenly Right With The
World, and I say it just ain't so. I happen to opine that the old
behavior was better, and I will not be dissuaded from that opinion
just because a few prominent posters in this forum suggest that I'm an
idiot for disagreeing with them.

My original post in this thread, if you weren't paying attention, was
in opposition to several people trying to cram the idea down the
throats of other posters that leading zeros to represent octal numbers
is inherently evil, and that anyone who suggests otherwise is an
Apostate to be damned for all eternity.

Alright, I exaggerate. Slightly. :)

James Harris

Aug 24, 2009, 7:23:06 PM8/24/09

to

On 24 Aug, 14:05, Mel <mwil...@the-wire.com> wrote:
> James Harris wrote:
> > On 24 Aug, 02:19, Max Erickson <maxerick...@gmail.com> wrote:

> [ ... ]
> >> >>> int('100', 3)
> >> 9
> >> >>> int('100', 36)
> >> 1296
>
> > This is fine typed into the language directly but couldn't be entered
> > by the user or read-in from or written to a file.
>
> That's rather beside the point. Literals don't essentially come from files
> or user input. Essentially literals are a subset of expressions, just like
> function calls are, and they have to be evaluated by Python to yield a
> value. I'm not averse to 32'rst', but we already have

...

> >>> int ('rst', 32)
>
> 28573

Sure but while I wouldn't normally want to type something as obscure
as 32"rst" into a file of data I might want to type 0xff00 or similar.
That is far clearer than 65280 in some cases.

My point was that int('ff00', 16) is OK for the programmer but cannot
be used generally as it includes a function call.

James

Steven D'Aprano

Aug 24, 2009, 8:25:39 PM8/24/09

to

On Mon, 24 Aug 2009 16:23:06 -0700, James Harris wrote:

> Sure but while I wouldn't normally want to type something as obscure as
> 32"rst" into a file of data I might want to type 0xff00 or similar. That
> is far clearer than 65280 in some cases.
>
> My point was that int('ff00', 16) is OK for the programmer but cannot be
> used generally as it includes a function call.

No, it's the other way around. If you have *data*, whether entered at run
time by the user or read from a file, you can easily pass it to a
function to convert to an int. (In fact you have to do this anyway,
because the data will be a string and you need an int.)

If you want your data file to have values entered in hex, or oct, or even
unary (1=one, 11=two, 111=three, 1111=four...) you can. There's no need
to have the user enter int('ff00', 16) to get hex, just have them enter
ff00.
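
A minimal sketch of that conversion step. The function names are made up for illustration; the only real API used is the built-in int() with its optional base argument:

```python
def parse_hex(text):
    """Bare hex digits, e.g. 'ff00', exactly as a user might type them."""
    return int(text, 16)


def parse_prefixed(text):
    """Base 0 tells int() to honour a 0x, 0o or 0b prefix found in the data."""
    return int(text, 0)


samples = ["0xff00", "0o17", "0b101", "65280"]
values = [parse_prefixed(s) for s in samples]
print(parse_hex("ff00"), values)
```

With base 0 the data itself says which base it is in, so a file can mix hex, octal, binary and decimal values without any per-field configuration.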

But when writing *code*, you want syntax which will accept integers in
the most common bases (decimal, a distant second hex, an even more
distant third octal, and way out on the horizon binary) without the
runtime cost of a function call.

--
Steven

Steven D'Aprano

Aug 24, 2009, 8:37:36 PM8/24/09

to

On Mon, 24 Aug 2009 12:40:24 -0500, Derek Martin wrote:

> On Mon, Aug 24, 2009 at 05:03:28PM +0000, Steven D'Aprano wrote:
>> On Mon, 24 Aug 2009 11:21:46 -0500, Derek Martin wrote:
>> > since the old syntax is prevalent both within and without the Python
>> > community, making the change is, was, and always will be a bad idea.
>>
>> Octal syntax isn't prevalent *at all*, except in a small number of
>> niche areas.
>
> Steven, don't be obtuse. Where octal is used in programming, the
> leading zero is prevalent.

Now who is being obtuse? If you take *any* feature at all, no matter how
rare, you can say "Where it is used, it is prevalent". Among people who
program in Whitespace, all three of them, the use of spaces and tabs as
significant programming tokens is prevalent.

This whole argument is over whether or not a "feature" desired by a tiny
proportion of the programming community -- the intersection of those who
use octal frequently enough that using an extra 'o' is a hardship, and
those who use C-based languages -- should hold *everyone else* hostage to
their badly thought out notation.

[...]

> Why is it so hard for you to accept that intelligent people can disagree
> with you, and that what's right for you might be bad for others?

I can accept that intelligent people can disagree with me. I even
sympathise with you, that you're one of the minority who don't find octal
archaic and unnecessary, and you'll need to learn a new syntax for octal
literals in Python 3.x. But your argument is fundamentally "but we've
always done it this way, and other languages do it, so why should we
change?". We should change because the desire to prevent silent errors
caused by (e.g.) 012 being interpreted as 10, and the desire to be
consistent with both mathematical notation and floating point syntax
outweigh the need to be backward compatible.
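
The date case is the usual place this bites. A short sketch, with parse_iso_date being a name invented for the example:

```python
def parse_iso_date(text):
    """Split 'YYYY-MM-DD' and convert each zero-padded field as decimal."""
    year, month, day = (int(part) for part in text.split("-"))
    return year, month, day


print(parse_iso_date("2009-08-24"))

# The same digits give different answers depending on the base requested.
as_decimal = int("012")
as_octal = int("012", 8)
print(as_decimal, as_octal)
```

int() on a string always treats leading zeros as insignificant unless a base is given, which is the behaviour the literal syntax is being brought in line with.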

This change was not a spur of the moment thing, it went through the
entire PEP process with due concern for backward compatibility, which
Python does *not* change without good reason. You lost, get over it. I'm
sorry that you personally find this change a hardship, but HTFU. If and
when you move to Python 3.x, you'll get used to it. If you can get used
to putting braces around code blocks in C and braces around dicts in
Python, you're more than capable of getting used to writing 012 in C and
0o12 in Python.

--
Steven

greg

Aug 24, 2009, 8:54:13 PM8/24/09

to

Ben Finney wrote:

> So, different representations of literals are parsed as separate
> literals, then concatenated. To have the behaviour you describe, the
> case needs to be made separately that digit concatenation should not be
> consistent with the established string literal parsing behaviour.

I think it's a pretty easy case to make, since there is no
obvious way of "concatenating" numbers written in different
bases. So if it's to be allowed at all, it pretty much has
to be restricted to a single base.

However, I'd be just as happy with underscores. I don't
see how there could be any great difficulty with implementing
that -- it only affects the scanner.
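
For reference, underscores as digit separators were later added to the language in Python 3.6 (PEP 515). A sketch of what that looks like, which needs 3.6 or newer to run:

```python
# Underscores are purely visual; the tokenizer drops them.
million = 1_000_000
mask = 0xff_00
perms = 0o7_5_5

# int() accepts the same grouping when parsing strings.
parsed = int("1_000", 10)

print(million, mask, perms, parsed)
```

Each literal stays within a single base, so the question of how to combine digits from different bases never comes up.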

--
Greg

Mensanator

Aug 24, 2009, 9:01:38 PM8/24/09

to

On Aug 24, 7:25 pm, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Mon, 24 Aug 2009 16:23:06 -0700, James Harris wrote:
> > Sure but while I wouldn't normally want to type something as obscure as
> > 32"rst" into a file of data I might want to type 0xff00 or similar. That
> > is far clearer than 65280 in some cases.
>
> > My point was that int('ff00', 16) is OK for the programmer but cannot be
> > used generally as it includes a function call.
>
> No, it's the other way around. If you have *data*, whether entered at run
> time by the user or read from a file, you can easily pass it to a
> function to convert to an int. (In fact you have to do this anyway,
> because the data will be a string and you need an int.)
>
> If you want your data file to have values entered in hex, or oct, or even
> unary (1=one, 11=two, 111=three, 1111=four...) you can.

Unary? I think you'll find that Standard Positional Number
Systems are not defined for radix 1.

Mel

Aug 24, 2009, 9:21:32 PM8/24/09

to

Mensanator wrote:
[ ... ]

>> If you want your data file to have values entered in hex, or oct, or even
>> unary (1=one, 11=two, 111=three, 1111=four...) you can.
>
> Unary? I think you'll find that Standard Positional Number
> Systems are not defined for radix 1.

It has to be tweaked. If the only digit you have is 0 then your numbers
take the form

0*1 + 0*1**2 + 0*1**3 ...

and every number has an infinitely long representation. If you cheat and
take a 1 digit instead then it becomes workable.
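
A rough sketch of that tweak, evaluating the usual positional sum with radix 1. The function positional_value is invented for this example, and note that the built-in int() itself refuses base 1:

```python
def positional_value(digits, radix):
    """Evaluate digits (most significant first) as the sum of d * radix**i."""
    return sum(d * radix ** i for i, d in enumerate(reversed(digits)))


# With radix 1 every place is worth 1, so only the count of 1 digits matters.
print(positional_value([1, 1, 1, 1], 1))
print(positional_value([0, 0, 0, 0], 1))

# An ordinary base for comparison: binary 101.
print(positional_value([1, 0, 1], 2))
```

With a finite string of 1 digits the sum is just the length of the string, which is the tally-mark reading of "unary".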

Mel.

Erik Max Francis

Aug 24, 2009, 11:20:21 PM8/24/09

to

Trailing zeroes are quite important when you're indicating the
significance of a figure. 9 is not the same as 9.0 or 9.000.
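
Python's decimal module happens to model this distinction. A small sketch using only the standard library:

```python
from decimal import Decimal

a = Decimal("9")
b = Decimal("9.000")

# Numerically equal...
print(a == b)

# ...but the trailing zeros are remembered in the representation.
print(str(a), str(b))
print(a.as_tuple().exponent, b.as_tuple().exponent)
```

The two values compare equal, yet the stored exponent records how many decimal places were written.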

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis
If the sky should fall, hold up your hands.
-- (a Spanish proverb)

Mensanator

Aug 25, 2009, 1:27:51 AM8/25/09

to

Not really. If your single digit is one, you still have
an infinitely long representation only instead of every
position being zero, every position is one.

So either the only number that can be represented is 0,
or the only number that can be represented is infinity.
No amount of tweaking can fix this.

So, to use radix 1, you have to abandon the concept
of "Standard" (contains a 0) AND abandon "Positional"
(infinitely long representation). It's all in TAOCP
by Knuth if you want to get it straight.

You can have a radix 1 number system, but it is meaningless
to speak of "unary" in the same context as hex, decimal,
octal & binary.

>
> Mel.