... (snipped a discussion on languages and other systems interpreting
numbers with a leading zero as octal)
> > Either hexadecimal should have been 0h or octal should
> > have been 0t :=)
>
>
> I have seen the use of Q/q instead in order to make it clearer. I still
> prefer Smalltalk's 16rFF and 8r377.
>
>
> Two interesting options. In a project I have on I have also considered
> using 0q as indicating octal. I maybe saw it used once somewhere else
> but I have no idea where. 0t was a second choice and 0c third choice
> (the other letters of oct). 0o should NOT be used for obvious reasons.
>
> So you are saying that Smalltalk has <base in decimal>r<number> where
> r is presumably for radix? That's maybe best of all. It preserves the
> syntactic requirement of starting a number with a digit and seems to
> have greatest flexibility. Not sure how good it looks but it's
> certainly not bad.
>
>
> > 0xff & 0x0e | 0b1101
> > 16rff & 16r0e | 2r1101
>
> > Hmm. Maybe a symbol would be better than a letter.
...
> > Or Ada's 16#FF#, 8#377#...
> > I forget if DEC/VMS FORTRAN or Xerox Sigma FORTRAN used x'FF' or
> > 'FF'x, and o'377' or '377'o
...
>
> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
They look good - which is important. The trouble (for me) is that I
want the notation for a new programming language and already use these
characters. I have underscore as an optional separator for groups of
digits - 123000 and 123_000 mean the same. The semicolon terminates a
statement. Based on your second idea, though, maybe a colon could be
used instead as in
2:1011, 8:7621, 16:c26b
I don't (yet) use it as a range operator.
I could also use a hash sign as although I allow hash to begin
comments it cannot be preceded by anything other than whitespace so
these would be usable
2#1011, 8#7621, 16#c26b
I have no idea why Ada which uses the # also apparently uses it to end
a number
2#1011#, 8#7621#, 16#c26b#
Copying this post also to comp.lang.misc. Folks there may either be
interested in the discussion or have comments to add.
James
> I have no idea why Ada which uses the # also apparently uses it to end
> a number
>
> 2#1011#, 8#7621#, 16#c26b#
Interesting. They do it because of this example from
<http://archive.adaic.com/standards/83rat/html/ratl-02-01.html#2.1>:
2#1#E8 -- an integer literal of value 256
where the E prefixes a power-of-2 exponent, and can't be taken as a digit of
the radix. That is to say
16#1#E2
would also equal 256, since it's 1*16**2 .
Mel.
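To make the role of the closing # concrete, here is a minimal Python sketch of how such a literal could be evaluated. The function name ada_based_int and the regular expression are illustrative assumptions, not taken from any Ada compiler, and the sketch handles integer literals only.

```python
import re

# Hypothetical helper, not part of any Ada toolchain: evaluates an
# Ada-style based integer literal such as 16#FF# or 2#1#E8.
_BASED = re.compile(r"^(\d+)#([0-9A-Za-z_]+)#(?:[Ee]\+?(\d+))?$")

def ada_based_int(text):
    m = _BASED.match(text)
    if m is None:
        raise ValueError("not a based integer literal: %r" % text)
    base = int(m.group(1))
    if not 2 <= base <= 16:
        raise ValueError("base must be in 2..16")
    digits = m.group(2).replace("_", "")   # underscores only group digits
    mantissa = int(digits, base)           # rejects digits invalid for the base
    exponent = int(m.group(3) or "0")      # exponent is written in decimal
    return mantissa * base ** exponent     # exponent scales by the radix

print(ada_based_int("2#1#E8"))    # 1 * 2**8
print(ada_based_int("16#1#E2"))   # 1 * 16**2
print(int("1E2", 16))             # without the closing #, E reads as a hex digit
```

The last line shows the ambiguity: with no terminator, the characters 1E2 in base 16 are simply the number 482.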
>On 22 Aug, 10:27, David <71da...@libero.it> wrote:
>
>... (snipped a discussion on languages and other systems interpreting
>numbers with a leading zero as octal)
>
>> > Either hexadecimal should have been 0h or octal should
>> > have been 0t :=)
>>
>>
>> I have seen the use of Q/q instead in order to make it clearer. I still
>> prefer Smalltalk's 16rFF and 8r377.
>>
>>
>> Two interesting options. In a project I have on I have also considered
>> using 0q as indicating octal. I maybe saw it used once somewhere else
>> but I have no idea where. 0t was a second choice and 0c third choice
>> (the other letters of oct). 0o should NOT be used for obvious reasons.
>>
>> So you are saying that Smalltalk has <base in decimal>r<number> where
>> r is presumably for radix? That's maybe best of all. It preserves the
>> syntactic requirement of starting a number with a digit and seems to
>> have greatest flexibility. Not sure how good it looks but it's
>> certainly not bad.
I opine that a letter is better; special characters are a
valuable piece of real estate. However for floating point you
need at least three letters because a floating point number has
three parts: the fixed point part, the exponent base, and the
exponent. Now we can represent the radices of the individual
parts with the 'r' scheme, e.g., 2r101001, but we need separate
letters to designate the exponent base and the exponent. B and E
are the obvious choices, though we want to be careful about a
confusion with 'b' in hex. For example, using 'R',
3R20.1B2E16Rac
is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
I grant that this example looks a bit gobbledegookish, but normal
usage would be much simpler. The notation doesn't handle
balanced trinary; however I opine that balanced trinary requires
special notation.
Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
No one asks if a tree falls in the forest
if there is no one there to see it fall.
> They look good - which is important. The trouble (for me) is that I
> want the notation for a new programming language and already use these
> characters. I have underscore as an optional separator for groups of
> digits - 123000 and 123_000 mean the same. The semicolon terminates a
> statement. Based on your second idea, though, maybe a colon could be
> used instead as in
>
> 2:1011, 8:7621, 16:c26b
>
> I don't (yet) use it as a range operator.
>
> I could also use a hash sign as although I allow hash to begin
> comments it cannot be preceded by anything other than whitespace so
> these would be usable
>
> 2#1011, 8#7621, 16#c26b
>
> I have no idea why Ada which uses the # also apparently uses it to end
> a number
>
> 2#1011#, 8#7621#, 16#c26b#
If you are going Unicode, you could use the mathematical notation, which is
1011₂, 7621₈, c26b₁₆
(subscript specification of the base). Yes, it might be difficult to type
(:-)), and would require some look-ahead in the parser. One of the
advantages of Ada notation is that a numeric literal always starts with a
decimal digit. That makes things simple for a recursive descent parser. I
guess this choice was intentional; back in 1983 a complex parser would have
eaten too many resources...
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
...
>>
>> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
>
> They look good - which is important. The trouble (for me) is that I
> want the notation for a new programming language and already use these
> characters. I have underscore as an optional separator for groups of
> digits - 123000 and 123_000 mean the same.
Why not just use the space? 123 000 looks better than 123_000, and
is not syntactically ambiguous (at least in python). And as it
already works for string literals, it could be applied to numbers, too…
--
-----------------------------------------------------------
| Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ |
| __..--^^^--..__ garabik @ kassiopeia.juls.savba.sk |
-----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!
> Why not just use the space? 123 000 looks better than 123_000, and is
> not syntactically ambiguous (at least in python). And as it already
> works for string literals, it could be applied to numbers, too…
+1 to all this. I think this discussion was had many months ago, but
can't recall how it ended back then.
--
\ “Only the educated are free.” —Epictetus, _Discourses_ |
`\ |
_o__) |
Ben Finney
If the purpose is to allow "_" to introduce a non-base ten literal, using
this to enter a hexadecimal number might result in:
16_1234 ABCD
I'd say that that was ambiguous (depending on whether a name can follow a
number; if you have an operator called ABCD, then that would be a problem).
Unless each block of digits used its own base:
16_1234 16_ABCD
> And as it
> already works for string literals, it could be applied to numbers, too…
String literals are conveniently surrounded by quotes, so they're a bit easier
to recognise.
--
Bart
Very very true.
> However for floating point you
> need at least three letters because a floating point number has
> three parts: the fixed point part, the exponent base, and the
> exponent. Now we can represent the radices of the individual
> parts with the 'r' scheme, e.g., 2r101001, but we need separate
> letters to designate the exponent base and the exponent. B and E
> are the obvious choices, though we want to be careful about a
> confusion with 'b' in hex. For example, using 'R',
>
> 3R20.1B2E16Rac
Ooh err!
> is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
>
> I grant that this example looks a bit gobbledegookish,
You think? :-)
> but normal
> usage would be much simpler. The notation doesn't handle
> balanced trinary; however I opine that balanced trinary requires
> special notation.
When the programmer needs to construct such values how about allowing
him or her to specify something like
(20.1 in base 3) times 2 to the power of 0xac
Leaving out how to specify (20.1 in base 3) for now this could be
(20.1 in base 3) * 2 ** 0xac
The compiler could convert this to a constant.
James
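As a rough illustration of what "the compiler could convert this to a constant" would involve, here is a small Python sketch that evaluates a digit string with a radix point exactly. The helper name digits_in_base is made up for the example, and the use of exact fractions is an assumption; a real compiler would round to its floating point format.

```python
from fractions import Fraction

def digits_in_base(text, base):
    """Hypothetical helper: exact value of digits with an optional radix point."""
    whole, _, frac = text.partition(".")
    value = Fraction(int(whole, base)) if whole else Fraction(0)
    # Each digit after the point is worth digit / base**position.
    for position, ch in enumerate(frac, start=1):
        value += Fraction(int(ch, base), base ** position)
    return value

mantissa = digits_in_base("20.1", 3)   # 2*3 + 0*1 + 1/3
result = mantissa * 2 ** 0xac          # (20.1 in base 3) * 2 ** 0xac
print(mantissa)
```

Here mantissa is the exact fraction 19/3, i.e. the 6 1/3 mentioned earlier in the thread, and result is that value scaled by 2**172.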
Thanks for providing an explanation.
>
> 2#1#E8 -- an integer literal of value 256
>
> where the E prefixes a power-of-2 exponent, and can't be taken as a digit of
> the radix. That is to say
>
> 16#1#E2
>
> would also equal 256, since it's 1*16**2 .
Here's another suggested number literal format. First, keep the
familiar 0x and 0b of C and others and add 0t for octal. (T is the
third letter of octal as X is the third letter of hex.) The numbers
above would be
0b1011, 0t7621, 0xc26b
Second, allow an arbitrary number base by putting base and number in
quotes after a zero as in
0"2:1011", 0"8:7621", 0"16:c26b"
This would work for arbitrary bases and allows an exponent to be
tagged on the end. It only depends on zero followed by a quote mark
not being used elsewhere. Finally, although it uses a colon it doesn't
take it away from being used elsewhere in the language.
Another option:
0.(2:1011), 0.(8:7621), 0.(16:c26b)
where the three characters "0.(" begin the sequence.
Comments? Improvements?
James
...
> > However for floating point you
> > need at least three letters because a floating point number has
> > three parts: the fixed point part, the exponent base, and the
> > exponent. Now we can represent the radices of the individual
> > parts with the 'r' scheme, e.g., 2r101001, but we need separate
> > letters to designate the exponent base and the exponent. B and E
> > are the obvious choices, though we want to be careful about a
> > confusion with 'b' in hex. For example, using 'R',
>
> > 3R20.1B2E16Rac
>
> Ooh err!
>
> > is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
>
> > I grant that this example looks a bit gobbledegookish,
>
> You think? :-)
>
> > but normal
> > usage would be much simpler. The notation doesn't handle
> > balanced trinary; however I opine that balanced trinary requires
> > special notation.
>
> When the programmer needs to construct such values how about allowing
> him or her to specify something like
>
> (20.1 in base 3) times 2 to the power of 0xac
>
> Leaving out how to specify (20.1 in base 3) for now this could be
>
> (20.1 in base 3) * 2 ** 0xac
Using the suggestion from another post would convert this to
0.(3:20.1) * 2 ** 0xac
I did a little interpreter where non-base 10 numbers
(up to base 36) were:
.7.100 == 64 (octal)
.9.100 == 100 (decimal)
.F.100 == 256 (hexadecimal)
.1.100 == 4 (binary)
.3.100 == 9 (trinary)
.Z.100 == 46656 (base 36)
Advantages:
Tokenizer can recognize chunks easily.
Not visually too confusing,
No issue of what base the base indicator is expressed in.
--Scott David Daniels
Scott....@Acm.Org
It can be assumed however that .9. isn't in binary?
That's a neat idea. But an even simpler scheme might be:
.octal.100
.decimal.100
.hex.100
.binary.100
.trinary.100
until it gets to this anyway:
.thirtyseximal.100
--
Bartc
>SDD> James Harris wrote:...
>>> Another option:
>>>
>>> 0.(2:1011), 0.(8:7621), 0.(16:c26b)
>>>
>>> where the three characters "0.(" begin the sequence.
>>>
>>> Comments? Improvements?
>SDD> I did a little interpreter where non-base 10 numbers
>SDD> (up to base 36) were:
>SDD> .7.100 == 64 (octal)
>SDD> .9.100 == 100 (decimal)
>SDD> .F.100 == 256 (hexadecimal)
>SDD> .1.100 == 4 (binary)
>SDD> .3.100 == 9 (trinary)
>SDD> .Z.100 == 46656 (base 36)
I wonder how you wrote that interpreter, given that some answers are wrong.
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org
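For anyone checking the table, here is a minimal Python sketch of the dotted notation. It assumes the character between the dots is the largest digit of the base (so .7. means base 8 and .F. means base 16), which is what most of the listed examples suggest; the function name parse_dotted is made up for the illustration.

```python
import string

DIGITS = string.digits + string.ascii_uppercase   # '0'..'9' then 'A'..'Z'

def parse_dotted(literal):
    """Hypothetical reading of '.D.digits': D is the largest digit of the base."""
    if not literal.startswith(".") or literal.count(".") < 2:
        raise ValueError("expected .D.digits, got %r" % literal)
    _, marker, digits = literal.split(".", 2)
    if len(marker) != 1 or marker.upper() not in DIGITS:
        raise ValueError("bad base marker in %r" % literal)
    base = DIGITS.index(marker.upper()) + 1        # largest digit + 1
    if base < 2:
        raise ValueError("base must be at least 2")
    return int(digits, base)

for lit in (".7.100", ".9.100", ".F.100", ".1.100", ".2.100", ".3.100", ".Z.100"):
    print(lit, parse_dotted(lit))
```

Under that reading, trinary would be written .2.100 (value 9), .3.100 is base 4 (value 16), and .Z.100 is 36 squared, 1296; the value 46656 is 36 cubed, which would be .Z.1000.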
...
> >> Here's another suggested number literal format. First, keep the
> >> familiar 0x and 0b of C and others and add 0t for octal. (T is the
> >> third letter of octal as X is the third letter of hex.) The numbers
> >> above would be
>
> >> 0b1011, 0t7621, 0xc26b
>
> >> Second, allow an arbitrary number base by putting base and number in
> >> quotes after a zero as in
>
> >> 0"2:1011", 0"8:7621", 0"16:c26b"
>
> > Why not just put the base first, followed by the value in quotes:
>
> > 2"1011", 8"7621", 16"c26b"
>
> It's always a bit impressive how syntax suggestions get more and more
> involved and, if you'll forgive me for saying, ridiculous as the
> conversation continues. This is starting to get truly nutty.
Why do you say that here? MRAB's suggestion is one of the clearest
there has been. And it incorporates the other requirements: starts
with a digit, allows an appropriate alphabet, has no issues with
spacing digit groups, shows clearly where the number ends and could
take an exponent suffix.
James
In your opinion. Obviously not in others. Which is pretty obviously
what I meant, so the rhetorical question is a bit weird here.
There's a reason that languages designed by committee end up horrific
nightmares.
--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis
Do not seek death. Death will find you.
-- Dag Hammarskjold
Don't get defensive.... Yes, in my opinion, if you like, but you can't
say "obviously not in others" as no one else but you has commented on
MRAB's suggestion.
Also, when you say "This is starting to get truly nutty" would you
accept that that's in your opinion?
> There's a reason that languages designed by committee end up horrific
> nightmares.
True but I would suggest that mistakes are also made by designers who
do not seek the opinions of others. There's a balance to be struck
between a committee and an ivory tower.
James
Algol68 has the type BITS, which is converted to INT with the ABS
operator.
The numbers above would be:
> 2r1011, 8r7621, 16rc26b
"r" is for radix: http://en.wikipedia.org/wiki/Radix
The standard supports 2r, 4r, 8r & 16r only.
The standard supports LONG BITS, LONG LONG BITS etc, but does not
include UNSIGNED.
Compare gcc's:
bash$ cat num_lit.c
#include <stdio.h>
int main(void){
printf("%d %d %d %d\n",0xffff,07777,9999,0b1111);
return 0;
}
bash$ ./num_lit
65535 4095 9999 15
With Algol68's: https://sourceforge.net/projects/algol68/
bash$ cat num_lit.a68
main:(
printf(($g$,ABS 16rffff,ABS 8r7777,9999,ABS 2r1111,$l$))
)
bash$ algol68g ./num_lit.a68
+65535 +4095 +9999 +15
Enjoy
N
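For comparison, the same <base>r<digits> form is easy to read in other languages too. The following Python sketch is only an illustration; parse_radix is a made-up name, and unlike the Algol68 standard it accepts any base from 2 to 36 rather than just 2, 4, 8 and 16.

```python
def parse_radix(literal):
    """Hypothetical helper for '<base in decimal>r<digits>' literals."""
    base_text, sep, digits = literal.partition("r")   # split at the first 'r'
    if not sep or not base_text.isdigit():
        raise ValueError("expected <base>r<digits>, got %r" % literal)
    base = int(base_text)
    if not 2 <= base <= 36:
        raise ValueError("base must be in 2..36")
    return int(digits, base)

print(parse_radix("16rffff"), parse_radix("8r7777"), 9999, parse_radix("2r1111"))
# prints: 65535 4095 9999 15
```

Because the base is always written in decimal and comes first, the literal still starts with a digit, and the first r is unambiguous as the separator.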
>They look good - which is important. The trouble (for me) is that I
>want the notation for a new programming language and already use these
>characters. I have underscore as an optional separator for groups of
>digits - 123000 and 123_000 mean the same. The semicolon terminates a
>statement. Based on your second idea, though, maybe a colon could be
>used instead as in
XPL uses "(2)1011" for base 4,
"(3)03212" for octal,
"(4)0741" for base 16.
PL/I uses 8FXN for numeric hex and X suffix for a hex character constant.
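A short Python sketch of the XPL forms quoted above, on the assumption that the parenthesised number is the count of bits per digit (so the base is 2 to that power), which is what the three examples suggest. The name xpl_bits and the restriction to 1 through 4 bits are assumptions made for the illustration.

```python
import re

_XPL = re.compile(r"^\((\d)\)([0-9A-Fa-f]+)$")

def xpl_bits(literal):
    """Hypothetical helper: '(n)digits' where each digit carries n bits."""
    m = _XPL.match(literal)
    if m is None:
        raise ValueError("expected (n)digits, got %r" % literal)
    bits = int(m.group(1))
    if not 1 <= bits <= 4:
        raise ValueError("bits per digit must be in 1..4")
    return int(m.group(2), 2 ** bits)   # base is 2**n: 2, 4, 8 or 16

for lit in ("(2)1011", "(3)03212", "(4)0741"):
    print(lit, xpl_bits(lit))
```

Under this reading the three examples come out as 69, 1674 and 1857 in decimal.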