maybe digit subgroup readable numeric constants


Jeffrey Sarnoff

May 17, 2012, 11:50:42 AM
to juli...@googlegroups.com
On comp.lang.c, in a thread about "C's replacement", BGB suggested the ability to put "_" in numbers as a separator:
int64_t myConst = 0x0123_4567_89AB_CDEF

Too much time is spent triple-checking each digit when entering or checking tables of multidigit coefficients. Would this be good for Julia?

John Cowan

May 17, 2012, 1:20:44 PM
to juli...@googlegroups.com
On Thu, May 17, 2012 at 11:50 AM, Jeffrey Sarnoff
<jeffrey...@gmail.com> wrote:

> the ability to put "_" in numbers as a separator

+1

--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures

Stefan Karpinski

May 17, 2012, 7:35:44 PM
to juli...@googlegroups.com
Seems reasonable to me although I've never really used this feature much in other languages. We had at some point tossed around the idea of allowing numbers in bases 2-10 to be written, e.g., as 1010101_2 with the 2 after the underscore indicating the base, but maybe that's a silly syntax. Realistically, how many bases do you really need to input things in? We already support decimal and hex, which are the two big ones for computing.
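
A minimal sketch of how the `digits_base` idea floated above might be read, written as an ordinary function rather than parser syntax. The name `parse_suffixed` is hypothetical, and the sketch assumes a current Julia release where `parse` takes a `base` keyword.

```julia
# Hypothetical helper: read "digits_base", e.g. "1010101_2", for bases 2 through 10.
function parse_suffixed(s::AbstractString)
    parts = split(s, '_')
    length(parts) == 2 || error("expected digits_base, got $s")
    base = parse(Int, parts[2])                 # the part after the underscore is the base
    2 <= base <= 10 || error("base must be between 2 and 10")
    return parse(Int, parts[1]; base=base)      # the part before it holds the digits
end

@assert parse_suffixed("1010101_2") == 85       # 64 + 16 + 4 + 1
@assert parse_suffixed("777_8") == 511
```

If underscores are also accepted as digit separators, the same characters written as a literal would presumably read as the decimal number 10101012, so the two notations could not coexist.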

Jameson Nash

May 17, 2012, 7:59:49 PM
to juli...@googlegroups.com
I've found binary can be occasionally useful, commonly with the notation 0b101010. Octal is sometimes available in other languages, although I've never really used it and I'm not a huge fan of the notation 07654 (where the leading zero is significant, presumably because of its similarity to the letter O).

There could be an argument that it would make importing data easier, if this syntax is common in any output formats. I don't see any disadvantages to reserving this syntax until someone has time to actually implement it.

Elliot Saba

May 17, 2012, 8:09:46 PM
to juli...@googlegroups.com
I would second binary notation (0b00000011, etc....), even though it's trivially easy to convert binary -> hex. It's helpful in a few cases, and I don't see any downsides.

Octal has never had a very clean method of implementation, as far as I can see, and its use case is even more limited than binary, IMO.
-E

Stefan Karpinski

May 17, 2012, 8:14:10 PM
to juli...@googlegroups.com
Adding more input syntaxes like 0x12 adds to the number of syntax conflicts with numeric literal coefficients: 0b11 could mean either 0*b11 or a binary representation of 3. We resolve such conflicts in favor of the numeric literal; the more such conflicts are possible, the more annoying that behavior gets, although I guess using a 0 as a literal coefficient is a bit weird. However, binary input doesn't seem important enough to add that potential annoyance. The leading zero syntax for octal is downright evil. Hate it.
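
To make the conflict concrete, here is a small illustration of numeric literal coefficients next to prefixed literals. It assumes a Julia release that has since adopted the 0b prefix discussed in this thread; the variable names are made up for the example.

```julia
x = 7
@assert 2x == 14        # numeric literal coefficient: 2x means 2*x

b11 = 5                 # a legal variable name that looks like the tail of a binary literal
@assert 2b11 == 10      # reads as 2*b11

# With a 0 in front, the literal interpretation wins over 0*b11 and 0*x12.
@assert 0b11 == 3       # binary literal (assumes a release with the 0b prefix)
@assert 0x12 == 18      # hex literal
```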

John Cowan

May 17, 2012, 8:53:43 PM
to juli...@googlegroups.com
On Thu, May 17, 2012 at 8:14 PM, Stefan Karpinski <ste...@karpinski.org> wrote:

> The leading zero syntax for octal is downright evil.
> Hate it.

Quite apart from the notation, base 8 is a relic of computers that
stopped being made in 1986. There is no excuse for it in a modern
programming language. (I made this case to the Go people before Go
was released, to no avail. Apparently some people just love them
their 3 bits at a time notation.)

Binary is more general, but marginal enough that I'd say don't worry about it.

Jeffrey Sarnoff

May 18, 2012, 12:50:35 AM
to juli...@googlegroups.com
  
Don Knuth has moved away from octal constants (http://mmix.cs.hm.edu/doc/mmixal.pdf), and so should Julia.
(Freedom from octal confers a more zero-like zero: 721 == 000721 == 00721 == 0721 == 721.)

Hexadecimal constants prefix '0x' to one or more hex digits (at least one; 0x alone is an invalid numerical constant).
Binary constants would prefix '0b' to one or more binary digits (at least one; 0b alone would be invalid, like 0x).
Allowing the empty string as a prefix, decimal constants follow this same pattern.
Decimal constants prefix '' to one or more decimal digits (at least one; '' is an invalid numerical constant).
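
The prefix-plus-digits pattern above can be sketched as a regular expression. This is only an illustration of the rule as stated, not how Julia's parser works; `LITERAL_PATTERN` and `is_valid_literal` are hypothetical names.

```julia
# Hypothetical validator: a prefix ("0x", "0b", or empty) followed by
# one or more digits of the matching base.
const LITERAL_PATTERN = r"^(0x[0-9a-fA-F]+|0b[01]+|[0-9]+)$"

is_valid_literal(s::AbstractString) = occursin(LITERAL_PATTERN, s)

@assert is_valid_literal("0x1F")
@assert is_valid_literal("0b101")
@assert is_valid_literal("000721")   # leading zeros are plain decimal digits here
@assert !is_valid_literal("0x")      # prefix with no digits
@assert !is_valid_literal("0b")
@assert !is_valid_literal("")        # empty prefix with no digits
```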

+1 for binary constants where they are part of the solution, e.g. when writing software to control circuitry.

Julia has hexadecimal integer constants, and does not have hexadecimal float constants.
Where precise, reproducible values are necessary, best practice is to use hex float literals:

 "Hexadecimal floating-point constants, also known as hexadecimal floating-point literals, 
  are an alternative way to represent floating-point numbers in a computer program.
  A hexadecimal floating-point constant is shorthand for binary scientific notation, 
  which is an abstract — yet direct — representation of a binary floating-point number. 
  As such, hexadecimal floating-point constants have exact representations in binary 
  floating-point, unlike decimal floating-point constants, which in general do not.
 
   Hexadecimal floating-point constants are useful for two reasons: they bypass 
  decimal to floating-point conversions, which are sometimes done incorrectly, 
  and they bypass floating-point to decimal conversions which, even if done 
  correctly, are often limited to a fixed number of decimal digits. In short, 
  their advantage is that they allow for direct control of floating-point variables, 
  letting you read and write their exact contents."
  
+1 for hex floats as best practice
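
A short illustration of the point in the quoted passage, assuming a later Julia release that parses hex float literals (the message above notes that Julia did not at the time of writing). A hex float spells out the significand and the power of two directly, so no decimal conversion is involved.

```julia
# Significand in hex, exponent as a power of two after the 'p'.
@assert 0x1.8p1 == 3.0      # 1.5 * 2^1
@assert 0x1p-1 == 0.5       # 1.0 * 2^-1

# The decimal literal 0.1 is rounded to the nearest Float64;
# the hex float literal names that stored value exactly.
@assert 0x1.999999999999ap-4 == 0.1

# The stored value is not the real number 1/10.
@assert big(0.1) != big"0.1"
```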

Stefan Karpinski

May 18, 2012, 1:10:05 AM
to juli...@googlegroups.com
We don't do the octal 0-prefix thing and won't, so we're on the same page as Don Knuth already. I'm in favor of hex floats too, it's just a matter of implementing them in the parser, which is Jeff's territory. Everyone seems to favor disallowing 0 as a numeric literal coefficient, freeing up 0b as a binary prefix.

Our decimal-to-floating-point conversions are correct and our printing of floating-point values is precise and minimal: we print the fewest decimal digits that reproduce the float value exactly (the exception is in arrays, where we print fewer digits so that we can show more stuff on the screen). Shout-out to Florian Loitsch's double-conversion library, which provides this functionality with excellent performance.
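
The round-trip property described above can be checked directly. This is a small sketch assuming a current Julia release; the printing internals may differ from what the 2012 build used, but the shortest-digits behavior is the same idea.

```julia
x = 0.1 + 0.2
s = string(x)

# Enough digits are printed to distinguish x from its neighbor 0.3 ...
@assert s == "0.30000000000000004"
# ... and parsing the printed form gives back exactly the same Float64.
@assert parse(Float64, s) === x

# No more digits than necessary: 0.1 prints as "0.1", not a long expansion.
@assert string(0.1) == "0.1"
```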

You can also reinterpret a Uint64 (or Uint32) value as a float to control the bits exactly; this is done in various places in base/float.jl. Still, hex float literals are a desirable feature. The examples in base/float.jl use box and unbox, because they're defined before the reinterpret function, but the more idiomatic way to do something like this is to use the reinterpret function:

julia> reinterpret(Float64,0x7ff0000000000000)
Inf
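
A slightly longer sketch of the same technique, going in both directions. It assumes a current Julia release, where the unsigned type is spelled `UInt64` rather than the `Uint64` used in this message.

```julia
# Float64 -> raw bits: sign 0, exponent 0x3ff, fraction 0 encodes 1.0.
bits = reinterpret(UInt64, 1.0)
@assert bits == 0x3ff0000000000000

# Raw bits -> Float64 recovers exactly the same value.
@assert reinterpret(Float64, bits) === 1.0

# All exponent bits set with a zero fraction is infinity; the top bit is the sign.
@assert reinterpret(Float64, 0x7ff0000000000000) == Inf
@assert reinterpret(Float64, 0xfff0000000000000) == -Inf
```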

Jeffrey Sarnoff

May 18, 2012, 1:26:05 AM
to juli...@googlegroups.com
thanks

Stefan Karpinski

May 21, 2012, 5:50:04 PM
to juli...@googlegroups.com
This now works: 123_456, 0x1234_abcd.
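
A quick check of what that looks like in practice, assuming a Julia build that includes this change. The variable names are made up for illustration.

```julia
# Underscores between digits are ignored by the parser; they only help the reader.
big_decimal = 123_456
@assert big_decimal == 123456

# Hex literals accept them too, and the value is unchanged.
mask = 0x1234_abcd
@assert mask == 0x1234abcd

# The constant from the first message in this thread, grouped four digits at a time.
my_const = 0x0123_4567_89AB_CDEF
@assert my_const == 0x0123456789ABCDEF
```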