> > Either hexadecimal should have been 0h or octal should > > have been 0t :=)
> I have seen the use of Q/q instead in order to make it clearer. I still > prefer Smalltalk's 16rFF and 8r377.
> Two interesting options. In a project I have on I have also considered > using 0q as indicating octal. I maybe saw it used once somewhere else > but I have no idea where. 0t was a second choice and 0c third choice > (the other letters of oct). 0o should NOT be used for obvious reasons.
> So you are saying that Smalltalk has <base in decimal>r<number> where > r is presumably for radix? That's maybe best of all. It preserves the > syntactic requirement of starting a number with a digit and seems to > have greatest flexibility. Not sure how good it looks but it's > certainly not bad.
> > Hmm. Maybe a symbol would be better than a letter.
...
> > Or Ada's 16#FF#, 8#377#... > > I forget if DEC/VMS FORTRAN or Xerox Sigma FORTRAN used x'FF' or > > 'FF'x, and o'377' or '377'o
...
> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
They look good - which is important. The trouble (for me) is that I want the notation for a new programming language and already use these characters. I have underscore as an optional separator for groups of digits - 123000 and 123_000 mean the same. The semicolon terminates a statement. Based on your second idea, though, maybe a colon could be used instead as in
2:1011, 8:7621, 16:c26b
I don't (yet) use it as a range operator.
I could also use a hash sign. Although I allow hash to begin comments, a comment hash cannot be preceded by anything other than whitespace, so these would be usable
2#1011, 8#7621, 16#c26b
I have no idea why Ada which uses the # also apparently uses it to end a number
2#1011#, 8#7621#, 16#c26b#
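For anyone who wants to experiment with the hash form, here is a minimal Python sketch of a reader for it. The name `parse_based` and the 2..36 base limit are my own assumptions, not part of any proposal above. It accepts both the open form and the Ada-style form with a closing hash, and treats underscore as a digit-group separator.

```python
import re

# Hypothetical helper: accepts 2#1011, 16#c26b and the Ada-style 16#c26b#.
_LITERAL = re.compile(r"^(\d+)#([0-9A-Za-z_]+)#?$")

def parse_based(text):
    """Return the integer value of a <base>#<digits> literal."""
    m = _LITERAL.match(text)
    if m is None:
        raise ValueError(f"not a based literal: {text!r}")
    base = int(m.group(1))                 # the base itself is written in decimal
    digits = m.group(2).replace("_", "")   # underscores only group digits
    if not 2 <= base <= 36:
        raise ValueError(f"unsupported base: {base}")
    return int(digits, base)               # int() rejects digits invalid for the base

print(parse_based("2#1011"), parse_based("8#7621"), parse_based("16#c26b#"))
```

Because the literal always starts with a decimal digit, a tokenizer can commit to "this is a number" on the first character, which is the property several posters ask for.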
Copying this post also to comp.lang.misc. Folks there may either be interested in the discussion or have comments to add.
<james.harri...@googlemail.com> wrote: >On 22 Aug, 10:27, David <71da...@libero.it> wrote:
>... (snipped a discussion on languages and other systems interpreting >numbers with a leading zero as octal)
>> > Either hexadecimal should have been 0h or octal should >> > have been 0t :=)
>> I have seen the use of Q/q instead in order to make it clearer. I still >> prefer Smalltalk's 16rFF and 8r377.
>> Two interesting options. In a project I have on I have also considered >> using 0q as indicating octal. I maybe saw it used once somewhere else >> but I have no idea where. 0t was a second choice and 0c third choice >> (the other letters of oct). 0o should NOT be used for obvious reasons.
>> So you are saying that Smalltalk has <base in decimal>r<number> where >> r is presumably for radix? That's maybe best of all. It preserves the >> syntactic requirement of starting a number with a digit and seems to >> have greatest flexibility. Not sure how good it looks but it's >> certainly not bad.
I opine that a letter is better; special characters are a valuable piece of real estate. However for floating point you need at least three letters because a floating point number has three parts: the fixed-point part, the exponent base, and the exponent. Now we can represent the radices of the individual parts with the 'r' scheme, e.g., 2r101001, but we need separate letters to designate the exponent base and the exponent. B and E are the obvious choices, though we want to be careful about a confusion with 'b' in hex. For example, using 'R',
3R20.1B2E16Rac
is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
I grant that this example looks a bit gobbledegookish, but normal usage would be much simpler. The notation doesn't handle balanced trinary; however I opine that balanced trinary requires special notation.
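To check the arithmetic in the example, here is a rough Python sketch that evaluates the R/B/E form exactly. It rests on assumptions that are mine rather than the poster's: the markers R, B and E are uppercase, digits above 9 are lowercase (which sidesteps the 'b'-in-hex confusion), the exponent base is written in decimal, and the exponent is non-negative. The name `parse_radix_float` is made up.

```python
import re
from fractions import Fraction

# <radix>R<digits>[.<digits>]B<exponent base>E[<radix>R]<digits>
_FLOAT = re.compile(
    r"^(\d+)R([0-9a-z]+)(?:\.([0-9a-z]+))?B(\d+)E(?:(\d+)R)?([0-9a-z]+)$"
)

def parse_radix_float(text):
    """Evaluate e.g. 3R20.1B2E16Rac as an exact Fraction."""
    m = _FLOAT.match(text)
    if m is None:
        raise ValueError(f"not a radix float literal: {text!r}")
    radix = int(m.group(1))
    whole, frac = m.group(2), m.group(3) or ""
    # All digits read as one integer, then scaled down by the fraction length.
    mantissa = Fraction(int(whole + frac, radix), radix ** len(frac))
    exp_base = int(m.group(4))
    exp_radix = int(m.group(5) or 10)      # exponent defaults to decimal
    exponent = int(m.group(6), exp_radix)
    return mantissa * Fraction(exp_base) ** exponent

value = parse_radix_float("3R20.1B2E16Rac")
print(value == Fraction(19, 3) * 2 ** 172)   # 19/3 is 6 1/3, 0xac is 172
```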
On Sat, 22 Aug 2009 14:54:41 -0700 (PDT), James Harris wrote: > They look good - which is important. The trouble (for me) is that I > want the notation for a new programming language and already use these > characters. I have underscore as an optional separator for groups of > digits - 123000 and 123_000 mean the same. The semicolon terminates a > statement. Based on your second idea, though, maybe a colon could be > used instead as in
> 2:1011, 8:7621, 16:c26b
> I don't (yet) use it as a range operator.
> I could also use a hash sign as although I allow hash to begin > comments it cannot be preceded by anything other than whitespace so > these would be usable
> 2#1011, 8#7621, 16#c26b
> I have no idea why Ada which uses the # also apparently uses it to end > a number
> 2#1011#, 8#7621#, 16#c26b#
If you are going Unicode, you could use the mathematical notation, which is
(subscript specification of the base). Yes, it might be difficult to type (:-)), and would require some look-ahead in the parser. One of the advantages of the Ada notation is that a numeric literal always starts with a decimal digit. That makes things simple for a recursive descent parser. I guess this choice was intentional; back in 1983 a complex parser would have eaten too many resources...
In comp.lang.python James Harris <james.harri...@googlemail.com> wrote:
> On 22 Aug, 10:27, David <71da...@libero.it> wrote:
...
>> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
> They look good - which is important. The trouble (for me) is that I > want the notation for a new programming language and already use these > characters. I have underscore as an optional separator for groups of > digits - 123000 and 123_000 mean the same.
Why not just use the space? 123 000 looks better than 123_000, and is not syntactically ambiguous (at least in python). And as it already works for string literals, it could be applied to numbers, too…
-- Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ garabik @ kassiopeia.juls.savba.sk
garabik-news-2005...@kassiopeia.juls.savba.sk writes: > Why not just use the space? 123 000 looks better than 123_000, and is > not syntactically ambiguous (at least in python). And as it already > works for string literals, it could be applied to numbers, too…
+1 to all this. I think this discussion was had many months ago, but can't recall how it ended back then.
-- “Only the educated are free.” (Epictetus, _Discourses_) Ben Finney
> In comp.lang.python James Harris <james.harri...@googlemail.com> wrote: >> On 22 Aug, 10:27, David <71da...@libero.it> wrote:
> ...
>>> What about 2_1011, 8_7621, 16_c26h or 2;1011, 8;7621, 16;c26h ?
>> They look good - which is important. The trouble (for me) is that I >> want the notation for a new programming language and already use these >> characters. I have underscore as an optional separator for groups of >> digits - 123000 and 123_000 mean the same.
> Why not just use the space? 123 000 looks better than 123_000, and > is not syntactically ambiguous (at least in python).
If the purpose is to allow "_" to introduce a non-base ten literal, using this to enter a hexadecimal number might result in:
16_1234 ABCD
I'd say that that was ambiguous (depending on whether a name can follow a number; if you have an operator called ABCD, then that would be a problem). Unless each block of digits used its own base:
16_1234 16_ABCD
> And as it > already works for string literals, it could be applied to numbers, too…
String literals are conveniently surrounded by quotes, so they're a bit easier to recognise.
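The difference between strings and numbers here can be checked directly in Python. A small sketch follows; the helper name `is_valid_expression` is made up for illustration. Adjacent string literals are joined by the compiler, whereas two adjacent number tokens are rejected. For what it's worth, Python 3.6 and later accept the underscore as a digit separator in numeric literals.

```python
import ast

def is_valid_expression(source):
    """Return True if `source` compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Adjacent string literals are concatenated at compile time.
print(ast.literal_eval('"123" "000"'))        # the single string "123000"

# Two number tokens separated by a space do not form one literal.
print(is_valid_expression("123 000"))         # False

# Underscore digit grouping (Python 3.6+).
print(is_valid_expression("123_000"))         # True
```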
> >... (snipped a discussion on languages and other systems interpreting > >numbers with a leading zero as octal)
> >> > Either hexadecimal should have been 0h or octal should > >> > have been 0t :=)
> >> I have seen the use of Q/q instead in order to make it clearer. I still > >> prefer Smalltalk's 16rFF and 8r377.
> >> Two interesting options. In a project I have on I have also considered > >> using 0q as indicating octal. I maybe saw it used once somewhere else > >> but I have no idea where. 0t was a second choice and 0c third choice > >> (the other letters of oct). 0o should NOT be used for obvious reasons.
> >> So you are saying that Smalltalk has <base in decimal>r<number> where > >> r is presumably for radix? That's maybe best of all. It preserves the > >> syntactic requirement of starting a number with a digit and seems to > >> have greatest flexibility. Not sure how good it looks but it's > >> certainly not bad.
> I opine that a letter is better; special characters are a > valuable piece of real estate.
Very very true.
> However for floating point you > need at least three letters because a floating point number has > three parts: the fixed point point, the exponent base, and the > exponent. Now we can represent the radices of the individual > parts with the 'r'scheme, e.g., 2r101001, but we need separate > letters to designate the exponent base and the exponent. B and E > are the obvious choices, though we want to be careful about a > confusion with 'b' in hex. For example, using 'R',
> 3R20.1B2E16Rac
Ooh err!
> is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
> I grant that this example looks a bit gobbledegookish,
You think? :-)
> but normal > usage would be much simpler. The notation doesn't handle > balanced trinary; however I opine that balanced trinary requires > special notation.
When the programmer needs to construct such values how about allowing him or her to specify something like
(20.1 in base 3) times 2 to the power of 0xac
Leaving out how to specify (20.1 in base 3) for now this could be
(20.1 in base 3) * 2 ** 0xac
> where the E prefixes a power-of-2 exponent, and can't be taken as a digit of > the radix. That is to say
> 16#1#E2
> would also equal 256, since it's 1*16**2 .
Here's another suggested number literal format. First, keep the familiar 0x and 0b of C and others and add 0t for octal. (T is the third letter of octal as X is the third letter of hex.) The numbers above would be
0b1011, 0t7621, 0xc26b
Second, allow an arbitrary number base by putting base and number in quotes after a zero as in
0"2:1011", 0"8:7621", 0"16:c26b"
This would work for arbitrary bases and allows an exponent to be tagged on the end. It only depends on zero followed by a quote mark not being used elsewhere. Finally, although it uses a colon it doesn't take it away from being used elsewhere in the language.
Another option:
0.(2:1011), 0.(8:7621), 0.(16:c26b)
where the three characters "0.(" begin the sequence.
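To get a feel for how these forms would be recognised, here is a minimal Python sketch of a reader for the three proposals in this post: the 0b/0t/0x prefixes, the quoted form, and the "0.(" form. The function name `parse_literal` is an assumption of mine, and this is not a strict validator (it leans on Python's `int()`, which is somewhat lenient about its input).

```python
import re

_PREFIX_BASES = {"b": 2, "t": 8, "x": 16}    # 0t for octal is the proposal above
_QUOTED = re.compile(r'^0"(\d+):([0-9A-Za-z]+)"$')       # 0"16:c26b"
_PAREN = re.compile(r"^0\.\((\d+):([0-9A-Za-z]+)\)$")    # 0.(16:c26b)

def parse_literal(text):
    """Return the integer value of one literal in any of the proposed forms."""
    if len(text) > 2 and text[0] == "0" and text[1] in _PREFIX_BASES:
        return int(text[2:], _PREFIX_BASES[text[1]])
    for pattern in (_QUOTED, _PAREN):
        m = pattern.match(text)
        if m:
            # Base is written in decimal, digits are read in that base.
            return int(m.group(2), int(m.group(1)))
    return int(text, 10)                      # plain decimal fallback

for sample in ("0b1011", "0t7621", "0xc26b", '0"8:7621"', "0.(16:c26b)"):
    print(sample, "->", parse_literal(sample))
```

In every form the first character is the digit 0, so the tokenizer never has to look ahead to know it is reading a number.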
> > However for floating point you > > need at least three letters because a floating point number has > > three parts: the fixed point point, the exponent base, and the > > exponent. Now we can represent the radices of the individual > > parts with the 'r'scheme, e.g., 2r101001, but we need separate > > letters to designate the exponent base and the exponent. B and E > > are the obvious choices, though we want to be careful about a > > confusion with 'b' in hex. For example, using 'R',
> > 3R20.1B2E16Rac
> Ooh err!
> > is 20.1 in trinary (6 1/3) times 2**172 (hex ac).
> > I grant that this example looks a bit gobbledegookish,
> You think? :-)
> > but normal > > usage would be much simpler. The notation doesn't handle > > balanced trinary; however I opine that balanced trinary requires > > special notation.
> When the programmer needs to construct such values how about allowing > him or her to specify something like
> (20.1 in base 3) times 2 to the power of 0xac
> Leaving out how to specify (20.1 in base 3) for now this could be
> (20.1 in base 3) * 2 ** 0xac
Using the suggestion from another post would convert this to
> where the three characters "0.(" begin the sequence.
> Comments? Improvements?
I did a little interpreter where non-base 10 numbers (up to base 36) were:
    .7.100 == 64 (octal)
    .9.100 == 100 (decimal)
    .F.100 == 256 (hexadecimal)
    .1.100 == 4 (binary)
    .3.100 == 9 (trinary)
    .Z.100 == 46656 (base 36)
Advantages:
    Tokenizer can recognize chunks easily.
    Not visually too confusing.
    No issue of what base the base indicator is expressed in.
>>>>> Scott David Daniels <Scott.Dani...@Acm.Org> (SDD) wrote: >SDD> James Harris wrote:... >>> Another option:
>>> 0.(2:1011), 0.(8:7621), 0.(16:c26b)
>>> where the three characters "0.(" begin the sequence.
>>> Comments? Improvements?
>SDD> I did a little interpreter where non-base 10 numbers
>SDD> (up to base 36) were:
>SDD> .7.100 == 64 (octal)
>SDD> .9.100 == 100 (decimal)
>SDD> .F.100 == 256 (hexadecimal)
>SDD> .1.100 == 4 (binary)
>SDD> .3.100 == 9 (trinary)
>SDD> .Z.100 == 46656 (base 36)
I wonder how you wrote that interpreter, given that some answers are wrong.
-- Piet van Oostrum <p...@cs.uu.nl> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4] Private email: p...@vanoostrum.org
> >> Here's another suggested number literal format. First, keep the > >> familiar 0x and 0b of C and others and to add 0t for octal. (T is the > >> third letter of octal as X is the third letter of hex.) The numbers > >> above would be
> >> 0b1011, 0t7621, 0xc26b
> >> Second, allow an arbitrary number base by putting base and number in > >> quotes after a zero as in
> >> 0"2:1011", 0"8:7621", 0"16:c26b"
> > Why not just put the base first, followed by the value in quotes:
> > 2"1011", 8"7621", 16"c26b"
> It's always a bit impressive how syntax suggestions get more and more > involved and, if you'll forgive me for saying, ridiculous as the > conversation continues. This is starting to get truly nutty.
Why do you say that here? MRAB's suggestion is one of the clearest there has been. And it incorporates the other requirements: starts with a digit, allows an appropriate alphabet, has no issues with spacing digit groups, shows clearly where the number ends and could take an exponent suffix.
James Harris wrote:
> On 24 Aug, 09:05, Erik Max Francis <m...@alcyone.com> wrote:
>>>> Here's another suggested number literal format. First, keep the
>>>> familiar 0x and 0b of C and others and to add 0t for octal. (T is the
>>>> third letter of octal as X is the third letter of hex.) The numbers
>>>> above would be
>>>> 0b1011, 0t7621, 0xc26b
>>>> Second, allow an arbitrary number base by putting base and number in
>>>> quotes after a zero as in
>>>> 0"2:1011", 0"8:7621", 0"16:c26b"
>>> Why not just put the base first, followed by the value in quotes:
>>> 2"1011", 8"7621", 16"c26b"
>> It's always a bit impressive how syntax suggestions get more and more
>> involved and, if you'll forgive me for saying, ridiculous as the
>> conversation continues. This is starting to get truly nutty.
> Why do you say that here? MRAB's suggestion is one of the clearest > there has been. And it incorporates the other requirements: starts > with a digit, allows an appropriate alphabet, has no issues with > spacing digit groups, shows clearly where the number ends and could > take an exponent suffix.
In your opinion. Obviously not in others. Which is pretty obviously what I meant, so the rhetorical question is a bit weird here.
There's a reason that languages designed by committee end up horrific nightmares.
-- Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/ San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis Do not seek death. Death will find you. -- Dag Hammarskjold
> James Harris wrote:
> > On 24 Aug, 09:05, Erik Max Francis <m...@alcyone.com> wrote:
> >>>> Here's another suggested number literal format. First, keep the
> >>>> familiar 0x and 0b of C and others and to add 0t for octal. (T is the
> >>>> third letter of octal as X is the third letter of hex.) The numbers
> >>>> above would be
> >>>> 0b1011, 0t7621, 0xc26b
> >>>> Second, allow an arbitrary number base by putting base and number in
> >>>> quotes after a zero as in
> >>>> 0"2:1011", 0"8:7621", 0"16:c26b"
> >>> Why not just put the base first, followed by the value in quotes:
> >>> 2"1011", 8"7621", 16"c26b"
> >> It's always a bit impressive how syntax suggestions get more and more
> >> involved and, if you'll forgive me for saying, ridiculous as the
> >> conversation continues. This is starting to get truly nutty.
> > Why do you say that here? MRAB's suggestion is one of the clearest > > there has been. And it incorporates the other requirements: starts > > with a digit, allows an appropriate alphabet, has no issues with > > spacing digit groups, shows clearly where the number ends and could > > take an exponent suffix.
> In your opinion. Obviously not in others. Which is pretty obviously > what I meant, so the rhetorical question is a bit weird here.
Don't get defensive.... Yes, in my opinion, if you like, but you can't say "obviously not in others" as no one else but you has commented on MRAB's suggestion.
Also, when you say "This is starting to get truly nutty" would you accept that that's in your opinion?
> There's a reason that languages designed by committee end up horrific > nightmares.
True but I would suggest that mistakes are also made by designers who do not seek the opinions of others. There's a balance to be struck between a committee and an ivory tower.
> I wonder how you wrote that interpreter, given that some answers are wrong.
Obviously I started with a different set of examples and edited after starting to make a table that could be interpreted in each base. After doing that, I forgot to double check, and lo and behold .Z.1000 = 46656, while .Z.100 = 1296. Since it has been decades since I've had access to that interpreter, this is all from memory.
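For reference, the dotted scheme is easy to sketch in Python if one assumes that the indicator character is the largest digit of the base, so the base is that digit's value plus one. That reading is my inference from the examples, not something the poster states, and the name `parse_dotted` is made up.

```python
import re

# .<largest digit of the base>.<digits>, e.g. .F.100 for hexadecimal
_DOTTED = re.compile(r"^\.([1-9A-Za-z])\.([0-9A-Za-z]+)$")

def parse_dotted(text):
    """Return the integer value of a .<indicator>.<digits> literal."""
    m = _DOTTED.match(text)
    if m is None:
        raise ValueError(f"not a dotted literal: {text!r}")
    base = int(m.group(1), 36) + 1     # '1' -> base 2, 'F' -> 16, 'Z' -> 36
    return int(m.group(2), base)       # raises ValueError on out-of-range digits

for sample in (".7.100", ".9.100", ".F.100", ".1.100", ".2.100", ".3.100", ".Z.100"):
    print(sample, "==", parse_dotted(sample))
```

Under this reading .2.100 is the ternary case (9), .3.100 is base 4 (16), and .Z.100 is 1296, with 46656 being .Z.1000.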
>They look good - which is important. The trouble (for me) is that I >want the notation for a new programming language and already use these >characters. I have underscore as an optional separator for groups of >digits - 123000 and 123_000 mean the same. The semicolon terminates a >statement. Based on your second idea, though, maybe a colon could be >used instead as in
XPL uses "(2)1011" for base 4, "(3)03212" for octal, "(4)0741" for base 16.
PL/I uses 8FXN for numeric hex and X suffix for a hex character constant.
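The XPL form mentioned above, where the number in parentheses is the count of bits per digit, can be sketched in Python as follows. The function name `parse_bits_per_digit` is hypothetical, and the limit of 1 to 4 bits is my assumption based on the three examples given; the surrounding quote marks are taken as already stripped.

```python
import re

_BITS = re.compile(r"^\((\d)\)([0-9A-Fa-f]+)$")   # (bits per digit)digits

def parse_bits_per_digit(text):
    """Return the integer value of e.g. (3)03212, read as octal."""
    m = _BITS.match(text)
    if m is None:
        raise ValueError(f"not a bits-per-digit literal: {text!r}")
    bits = int(m.group(1))
    if not 1 <= bits <= 4:
        raise ValueError(f"unsupported digit width: {bits}")
    return int(m.group(2), 2 ** bits)   # n bits per digit means base 2**n

for sample in ("(2)1011", "(3)03212", "(4)0741"):
    print(sample, "->", parse_bits_per_digit(sample))
```

The scheme only reaches power-of-two bases, which is a narrower goal than the arbitrary-base proposals earlier in the thread.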