SIP proposal - support _ in numeric literals


Eran Medan

24.06.2015, 18:10:32
to scala-...@googlegroups.com
Hi scala-debate!

Following a positive comment by Martin in r/scala regarding adding a _ separator to numeric literals (he said he is OK with this as a SIP), I'd like to ask for your feedback / moral support as well.

The request is to support _ in number literals to allow writing more readable numeric constants (e.g. 1_000_000 instead of 1000000). This exists in Java (since Java 7), Rust and perhaps in other languages.
The Java specification simply ignores any underscore chars inside a number literal (i.e. as long as the first and last chars in a literal are digits, all _ sequences in between are ignored).
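
A minimal sketch of that rule, restricted to plain decimal integer literals (the real Java rules also cover prefixes, suffixes and decimal points, which are ignored here). The object and method names are hypothetical and exist only for illustration; this is not scalac code.

```scala
object UnderscoreLiterals {
  private def isAsciiDigit(c: Char): Boolean = c >= '0' && c <= '9'

  /** Strips `_` separators from a decimal integer literal.
    * Returns Left(message) when the literal does not follow the rule.
    */
  def stripUnderscores(literal: String): Either[String, String] =
    if (literal.isEmpty)
      Left("empty literal")
    else if (!literal.forall(c => isAsciiDigit(c) || c == '_'))
      Left(s"unexpected character in '$literal'")
    else if (!isAsciiDigit(literal.head) || !isAsciiDigit(literal.last))
      Left(s"'$literal' must start and end with a digit")
    else
      Right(literal.filter(_ != '_'))
}
```

With this sketch, `UnderscoreLiterals.stripUnderscores("1_000_000")` yields `Right("1000000")`, while `"1_000_"` is rejected with a `Left`.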

Since Scala has always been OK with supporting basic Java expressions, this is not so much a new language feature as keeping up with a very cosmetic, low-risk readability feature.
My questions 
  1. Can I go ahead and create an SIP for it? (even though I'm a "newbie" to contribution to scala and scalac in particular) 
  2. Should I also include the other Java 7 literal changes (e.g. binary literals)? Or should this be a different SIP?
  3. Should I also implement it, or rather have someone who has committed to Scala's compiler before pick it up? (It's probably a small change.)
  4. Anything I need to know that is not covered here? http://www.scala-lang.org/contribute/hacker-guide.html 
  5. Is there a formal language definition syntax for Scala that I also need to update if I make this change?
Here is my naive implementation outline (I didn't test it or even try to compile it yet, just to ask if this is the right location and approach)
I'm sure it's much more complex than that, but well, I have to start from somewhere, right? (And the best way to get the right answer on the internet is not to ask, but to post the wrong answer :))

The only "complexity" is to handle this case 1_000_000_ (which personally I think should be ok, but the Java specification doesn't allow it, so we should follow)


    protected def getNumber(): Unit = {
        // consume digits of a radix
        def consumeDigits(radix: Int): Unit = {
            var lastIsUnderscore = false;
            while (digit2int(ch, radix) >= 0 || ch == '_') {
                if (ch != '_') {
                    putChar(ch)
                    lastIsUnderscore = false;
                } else {
                    lastIsUnderscore = true;
                }
                nextChar()
            }
            if (lastIsUnderscore) {
                // TODO: how do I throw an error if the last char in the number sequence is an _?
            }
        }
        // ... (remainder of getNumber omitted)
    }

How far is it from what needs to be actually done? Also if by accident it's actually doing what it should do, can you help me find a more elegant way to implement it? Or is it readable enough?
(again, didn't compile / test yet, just getting initial feedback, never committed to Scala before... )

Should I go ahead and continue? or should I stay away from the compiler? :)

Thanks!
Eran

Eran Medan

24.06.2015, 18:13:54
to scala-...@googlegroups.com
Oops, obviously, without the semicolons in lastIsUnderscore
Again, this is just to verify this is the right starting spot, before I spend hours trying to debug it :)

Som Snytt

24.06.2015, 21:21:57
to Eran Medan, scala-debate
I think you could go direct to scala-language with this one. Who would debate the opportunity for another use of underscore in Scala? My guess is _.

The only hang-up might be your use of semicolons. Now that it's out there, it will haunt you forever like nude selfies on a Tibetan mountaintop.

To answer your question, you don't throw, which is like so side-effectual, but invoke something like syntaxError. There are variants. Yes, those are basically side-effectual as well.
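
A rough sketch of what "report instead of throw" could look like, using a toy scanner rather than the real one. Everything here (`ToyScanner`, its `errors` buffer, the signature of `syntaxError`) is a hypothetical stand-in and should not be read as the actual scalac API; it only assumes that the scanner records an offset plus a message and keeps going.

```scala
import scala.collection.mutable.ListBuffer

// Toy stand-in for a scanner; all names are hypothetical.
final class ToyScanner(input: String) {
  private var offset = 0
  private val buf = new StringBuilder
  val errors = ListBuffer.empty[String]

  // Current character, or the NUL char once the input is exhausted.
  private def ch: Char = if (offset < input.length) input.charAt(offset) else 0.toChar
  private def nextChar(): Unit = offset += 1
  private def putChar(c: Char): Unit = buf.append(c)

  // Records the problem and lets scanning continue, instead of throwing.
  private def syntaxError(off: Int, msg: String): Unit =
    errors += s"$off: $msg"

  /** Reads decimal digits and `_` separators starting at the current offset. */
  def getNumber(): String = {
    var lastUnderscoreAt = -1
    while (Character.digit(ch, 10) >= 0 || ch == '_') {
      if (ch == '_') lastUnderscoreAt = offset
      else { putChar(ch); lastUnderscoreAt = -1 }
      nextChar()
    }
    if (lastUnderscoreAt >= 0)
      syntaxError(lastUnderscoreAt, "trailing separator is not allowed")
    buf.toString
  }
}
```

Scanning `"1_000_000"` with this toy returns `"1000000"` and leaves `errors` empty; scanning `"1_000_ +"` still returns the digits read so far but records one error at the offset of the trailing underscore.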

For the record, I just made up the term "side-effectual", which I like.

Someone made fun of b"101" syntax so that might be a good target for a separate SIP. The debate is between minimalism and naturalism. That discussion would probably stay on scala-debate. You'll hear no debate from me. Why aren't plain ints expressed as int"42"?

A couple of numerals, 4 and 2, interspersed in the program text, couldn't possibly mean anything independent of the string representation from which they spring. Right?

--
You received this message because you are subscribed to the Google Groups "scala-debate" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-debate...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dennis Haupt

25.06.2015, 03:35:42
to Som Snytt, Eran Medan, scala-debate
underscores look rather ugly. why not 1.000.000,345?

Viktor Klang

25.06.2015, 04:27:32
to Dennis Haupt, scala-debate, Eran Medan, Som Snytt

The presentation of numeric literals seems to me to be an IDE concern, not a language concern.

--
Cheers,

martin odersky

25.06.2015, 05:25:56
to Viktor Klang, Dennis Haupt, scala-debate, Eran Medan, Som Snytt
One can (and should) debate about pros and cons of putting underscores in literals. But one hard argument for me was that Scala by design adopted its lexical syntax from Java. Not because we were convinced it is best, but because we did not care about it much, and therefore decided to simply adopt the established standard. Now, the standard has evolved with Java 7 and there are now underscores in literals and binary numbers. It seems to me that we should stick to our design philosophy and just adopt these things in Scala as well. That's why I think the idea is sufficiently plausible to become a SIP. If Java had not had underscores in literals, I would be dead-set against them (precisely because I think there are more important issues than lexical syntax), but now that it has them, the balance swings in the other direction.

Cheers

 - Martin
Martin Odersky
EPFL

Simon Ochsenreither

25.06.2015, 05:57:27
to scala-...@googlegroups.com, ehrann...@gmail.com, som....@gmail.com, d.ha...@gmail.com, viktor...@gmail.com
I'm opposed to this. We have already enough uses of _ in the language. We already abandoned complete Java-equivalence (octal literals, floating point numbers), so I would prefer using something which is not that overused already (and ugly, btw). 1'000'000 feels acceptable to me, but I'm quite fine with doing nothing at all, too.

If people have special requirements like these, they could also just define their own string interpolator and support whatever literal they want.

(I think Viktor has a valid point, too.)

Daniel Armak

25.06.2015, 06:05:56
to Dennis Haupt, Som Snytt, Eran Medan, scala-debate

Hi Dennis,

Different languages have different thousands separators, typically ‘.’, ‘,’ or ‘ ‘ (dot, comma and space, and different unicode spaces at that).

If Scala adopts just one of these, it could be unnecessarily inconvenient or confusing for speakers of other languages. But most likely no-one uses underscores and they won’t confuse people.

BTW, in your example, presumably you meant 1,000,000.345. I don’t think we should change the decimal separator to a comma :-)


Daniel Armak

Jon Pretty

25.06.2015, 06:30:55
to Daniel Armak, Dennis Haupt, Som Snytt, Eran Medan, scala-debate
I'd make a few points about this, despite not feeling particularly strongly about it...

I think the underscores look ugly in that context, but there's no requirement to use them, so it needn't affect most people. I'd quite like the 1000s grouping to be enforced by the compiler, though, so that `10_00.00` would not be legal.

@odersky I understand the argument about following Java's numeric literal syntax consistently, but that's much less important than it was ten years ago: fewer people are learning Java before moving to Scala than back then. Given that I feel the two options are to maintain 100% consistency with Java, or to do what we think is best (regardless of Java), then choosing the first option precludes us from innovating in this area. One example would be introducing new kinds of numeric literal (e.g. byte literals or binary literals) or deprecating the use of a lower-case `l` to signify a long (because it looks too much like a 1). So I'd favour the latter approach, and would therefore like to justify the use of underscores in this context for the reason that we think it's best.

@soc I kind of disagree with the principle that the underscore "has enough uses already", just on the basis that that's its point: it works as a versatile operator, and this is now well-established, whereas most other symbols typically only have a unique purpose (or maybe two). So introducing yet another new role for the underscore does minimal damage to a developer's confidence in understanding the source code they read. (But again, this isn't something I feel that strongly about.)

@Daniel @Dennis Unfortunately both `1,000,000.345` and `1.000.000,345` look like multiple parameters. What would `List(1,000,000.345).length` or `List(1.000.000,345).length` give?

Cheers,
Jon
Jon Pretty | @propensive

Dennis Haupt

25.06.2015, 06:43:48
to Jon Pretty, Daniel Armak, Som Snytt, Eran Medan, scala-debate
i was giving 1.000.000,345 as an example for a readable number, not as a real proposal... i'm in favor of separators, i am just against the _ which i perceive as ugly in numbers.
i would prefer to use ' or something like that, or even better, i will tell the jetbrains guys that i want intellij to display numbers in a formatted way, which completely solves the problem without making me enter weird characters

Jon Pretty

25.06.2015, 06:50:44
to Dennis Haupt, Daniel Armak, Som Snytt, Eran Medan, scala-debate
Yes, I should have added that I'd actually favour this particular feature being part of an IDE.

Cheers,
Jon
--
Jon Pretty | @propensive

Paul Hudson

25.06.2015, 06:57:56
to Jon Pretty, Daniel Armak, Dennis Haupt, Som Snytt, Eran Medan, scala-debate

On 25 June 2015 at 11:30, Jon Pretty <prope...@gmail.com> wrote:
> I think the underscores look ugly in that context, but there's no requirement to use them, so it needn't affect most people. I'd quite like the 1000s grouping to be enforced by the compiler, though, so that `10_00.00` would not be legal.

Tom Switzer

25.06.2015, 06:58:28
to Jon Pretty, Eran Medan, Som Snytt, Dennis Haupt, scala-debate, Daniel Armak

As an argument against enforced thousands groupings, think of the case where you have an Int/Long representing cents; then you may want something like:

val amount = 29_995_00

Another useful thing would be grouping bytes or shorts in hex format.
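
To make the trade-off concrete, here is a sketch of what a strict groups-of-three check might look like for decimal integer literals. The names are hypothetical, and nothing in this thread settles whether the compiler should perform such a check at all.

```scala
object GroupingCheck {
  private def isAsciiDigit(c: Char): Boolean = c >= '0' && c <= '9'

  /** True when the literal has no separators at all, or when the first group
    * has one to three digits and every later group has exactly three.
    */
  def hasThousandsGrouping(literal: String): Boolean = {
    // limit -1 keeps empty groups, so "1__000" and "1_000_" are rejected below
    val groups = literal.split("_", -1)
    val allDigits = groups.forall(g => g.nonEmpty && g.forall(isAsciiDigit))
    allDigits &&
      (groups.length == 1 ||
        (groups.head.length <= 3 && groups.tail.forall(_.length == 3)))
  }
}
```

Under such a rule `1_000_000` passes, but `29_995_00` (the cents example) and `10_00` are both rejected, which is exactly the flexibility that would be lost.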

Naftoli Gugenheim

25.06.2015, 06:58:58
to Dennis Haupt, Jon Pretty, Daniel Armak, Som Snytt, Eran Medan, scala-debate
How high is the demand, meaning what percentage of scala users (or programs in general) have large amounts of long numeric literals?

Have people tried other solutions, like an interpolator, and found that even after getting used to it, it's still not good enough? (Does spire have such an interpolator?)

martin odersky

25.06.2015, 07:12:22
to Simon Ochsenreither, scala-debate, Eran Medan, Som Snytt, Dennis Haupt, Viktor Klang
On Thu, Jun 25, 2015 at 11:57 AM, Simon Ochsenreither
<simon.och...@gmail.com> wrote:
> I'm opposed to this. We have already enough uses of _ in the language. We
> already abandoned complete Java-equivalence (octal literals, floating point
> numbers),

Octal literals are sort of an exception in that even for Java it was a
legacy feature that hardly anyone ever used. So the WTF effect of surprising
behavior with leading zeroes was more important than the benefit of having
them. What's different in terms of floating point numbers?

- Martin

> so I would prefer using something which is not that
> overused already (and ugly, btw). 1'000'000 feels acceptable to me, but I'm
> quite fine with doing nothing at all, too.
>
> If people have special requirements like these, they could also just define
> their own string interpolator and support whatever literal they want.
>
> (I think Viktor has a valid point, too.)
>

Nick Stanchenko

25.06.2015, 09:00:14
to scala-...@googlegroups.com
assert(List(1, 2, 3).map(1_0) == List(110, 120, 130))

Eran Medan

25.06.2015, 09:34:07
to scala-...@googlegroups.com, dana...@gmail.com, ehrann...@gmail.com, d.ha...@gmail.com, som....@gmail.com
Many, many developers (e.g. data scientists) use vim / emacs / notepad for coding. They also use a lot of numeric literals. I think supporting _ in literals is something that most modern languages will add because, well, it is simple to add, low risk, backward compatible, and you can opt out. And no one can argue that 1000000000 is more readable than 1_000_000_000.

Eran Medan

25.06.2015, 09:40:29
to scala-...@googlegroups.com
o_0

:)

Simon Ochsenreither

25.06.2015, 09:49:26
to scala-...@googlegroups.com, viktor...@gmail.com, d.ha...@gmail.com, ehrann...@gmail.com, som....@gmail.com, simon.och...@gmail.com

> What's different in terms of floating point numbers?

`1.` is a valid floating-point literal in Java, but not in Scala.

Haoyi Li

25.06.2015, 16:16:39
to Simon Ochsenreither, √iktor Ҡlang, Dennis Haupt, scala-...@googlegroups.com, som....@gmail.com, ehrann...@gmail.com

I'm opposed to this.

I don't think doing it "because java did" is convincing: many other similar java syntaxes around literals were added and then had to be excised (octal literals, trailing . floats, ...) and others hang around and still cause grief (Unicode escapes...). Also, Scala.js is now a thing, and Scala is definitely bigger than "a better java" by this point.

Also, the alternatives have not been properly covered. The alternatives now are:

123 * 1000 * 1000 for round numbers
n"123,456,789" for non-round-numbers

Which look perfectly fine to me. It is 3 extra chars in the latter case, which hurts, but I'd like to see a piece of code with sufficient density of long numbers that the three extra chars are problematic. The comparisons so far have been against very-non-optimal use of what already exists, so of course it looks better.
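
For reference, a minimal sketch of how such an `n"..."` interpolator could be defined. `NumberSyntax` and `NumberInterpolator` are hypothetical names, the result type is assumed to be `Int`, and the check happens at runtime (a malformed or overflowing number throws `NumberFormatException`), unlike a real literal, which the compiler checks.

```scala
object NumberSyntax {
  // Hypothetical `n` interpolator: n"123,456,789" evaluates to an Int.
  implicit class NumberInterpolator(val sc: StringContext) {
    def n(args: Any*): Int = {
      require(args.isEmpty, "the n interpolator takes no arguments")
      // Drop the comma separators, then parse what is left as a decimal Int.
      sc.parts.mkString.filter(_ != ',').toInt
    }
  }
}
```

After `import NumberSyntax._`, both `n"123,456,789"` and `123 * 1000 * 1000` are ordinary `Int` expressions.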

Lastly, we've all talked about making the language smaller, and this definitely makes it bigger. By a small amount, but all the small amounts add up!

That's all I have to say



Russ Paielli

25.06.2015, 16:27:58
to Haoyi Li, Simon Ochsenreither, √iktor Ҡlang, Dennis Haupt, scala-debate, Som Snytt, ehrann...@gmail.com
For "Doubles" (i.e., "real" numbers), one can simply write 1.3455e6 to make it easier to see the magnitude. For integers, one can simply use 1.3455e6.toInt. It may not be as aesthetically pleasing as 1,345,500, but at least it relieves the reader of counting digits.
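
A small sketch of that approach, with one caveat worth keeping in mind: the conversion goes through `Double`, and `toInt` saturates rather than failing when the value does not fit. The object and value names are only for illustration.

```scala
object ExponentLiterals {
  // 1.3455e6 is exactly representable as a Double, so the conversion is exact.
  val fits: Int = 1.3455e6.toInt

  // 1e10 does not fit in an Int; toInt silently saturates at Int.MaxValue.
  val tooBigForInt: Int = 1e10.toInt

  // Going through Long keeps the intended value.
  val asLong: Long = 1e10.toLong
}
```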

Russ P.

25.06.2015, 20:00:19
to scala-...@googlegroups.com
Here's another approach:

    def int(ints: Int*): Int = {

        var int = 0
        var place = 1

        for (i <- ints.reverse) {
            int += i * place
            place *= 1000
            }

        int
        }

    println(int(34,222,895))
   
34222895


martin odersky

26.06.2015, 08:09:02
to Russ P., scala-debate
What I propose: We will discuss this in the next SIP meeting mid July.

Cheers

- Martin



--
Martin Odersky
EPFL