Ada has one good thing about numeric constants: They may contain _ (underscore) to make them more readable. Common Lisp can print numbers that look even nicer, with a , (comma) grouping digits, but it can't read them back in, again. This annoyed me today, as I was gawking at a whole bunch of large integers that had to be readable. So I made , non-terminating.
(set-macro-character #\, (get-macro-character #\,) t)
This meant 123,456,789 read back as the symbol named "123,456,789" which is almost exactly half as useful as what I had in mind. So I hacked the function that I thought made the wrong decision about numberhood of this string. The rest of the system was not amused, so I had to delete the commas from the input at a fairly low level, but that's not too satisfactory, as it's common in some cultures to use dot or space, not comma, for grouping (silly Europeans!), and parse-integer could really use some more options. Like a cancer my wish list item grew, so here's the condensed and sanitized version of the wish list item:
I'd like integers printed with internal commas for grouping to be readable as if the commas were absent. This requires the comma to be a non-terminating macro character, which _could_ cause problems with space-deficient legacy code. It would be nice if more of the system would grok human-readable numbers, and I would be positively thrilled if write and friends would do grouping, not just format.
Unfortunately, this requires rather low-level changes to most Common Lisp implementations, so I'm asking for reactions and suggestions.
#:Erik -- If this is not what you expected, please alter your expectations.
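For readers who want to try the round trip without touching the reader, here is a minimal sketch in portable Common Lisp. The names print-grouped-integer and parse-grouped-integer are hypothetical helpers invented for this illustration, not part of any implementation. Printing relies on the standard ~:D directive of format, and parsing simply deletes the grouping character before calling parse-integer, which is roughly the low-level workaround described above.

```lisp
(defun print-grouped-integer (n &key (group-char #\,) (interval 3))
  "Hypothetical helper: return N as a string with GROUP-CHAR between
every INTERVAL digits."
  ;; The directive is ~mincol,padchar,commachar,comma-interval:D.
  ;; Each V takes its parameter from the argument list.
  (format nil "~,,V,V:D" group-char interval n))

(defun parse-grouped-integer (string &key (group-char #\,) (radix 10))
  "Hypothetical helper: parse STRING as an integer, ignoring GROUP-CHAR.
This is lenient: it does not check that the groups are well formed."
  (parse-integer (remove group-char string) :radix radix))

;; Round trip with the default comma, then with a dot for grouping.
(parse-grouped-integer (print-grouped-integer 123456789))
(parse-grouped-integer (print-grouped-integer 123456789 :group-char #\.)
                       :group-char #\.)
```

The same parser accepts :group-char #\_ for Ada-style input. None of this makes the reader itself accept such numbers, which is the actual wish.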
* I wrote:
>> Unfortunately, this requires rather low-level changes to most Common
>> Lisp implementations, so I'm asking for reactions and suggestions.
> I'd like this to work.
I'd also like it to work for floats, actually, so I can say
100,000.00
(of course that should probably really be a scaled integer or something if it's a financial calculation, but...)
HP calculators offer the two options of comma for separator and point for point and the other way around, are there other common choices in use?
I was going to say that it doesn't really affect whether the char is a readmacro, because it only matters in number parsing and that's kind of magic anyway. But apart from this being obviously wrong, there's a worse problem: if something like comma could be used for the decimal point, there's a whole nightmare with anything like `(,1), which seems to me to be genuinely ambiguous (in the sense that, if comma was decimal point, I would not know what it should mean), so I don't know how to deal with that...
(I don't think that Erik was asking for the additional hairiness of being able to choose the point & grouping chars, probably because he thought about this harder than me!)
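To make the float case and the "scaled integer" remark concrete, here is a rough sketch that parses a string such as 100,000.00 into an exact rational instead of a float. The function name parse-grouped-decimal and its keyword arguments are assumptions made for this illustration. The grouping and point characters are parameters, so the swapped convention mentioned for HP calculators is covered by passing different characters.

```lisp
(defun parse-grouped-decimal (string &key (group-char #\,) (point-char #\.))
  "Hypothetical helper: parse STRING, e.g. \"100,000.00\", into an exact
rational.  GROUP-CHAR is ignored and POINT-CHAR separates the fraction.
Assumes at least one digit before the point."
  (let* ((clean (remove group-char string))
         (point (position point-char clean))
         (int-part (subseq clean 0 point))
         (frac-part (if point (subseq clean (1+ point)) ""))
         ;; Remember the sign separately so that "-0.5" does not lose it.
         (negative (and (plusp (length clean))
                        (char= (char clean 0) #\-)))
         (whole (parse-integer int-part))
         (frac (if (plusp (length frac-part))
                   (/ (parse-integer frac-part)
                      (expt 10 (length frac-part)))
                   0)))
    (if negative (- whole frac) (+ whole frac))))

;; An amount in cents as a scaled integer.
(* 100 (parse-grouped-decimal "100,000.00"))

;; The other convention: dot for grouping, comma for the point.
(parse-grouped-decimal "423.234.544,95" :group-char #\. :point-char #\,)
```

Returning a rational keeps the value exact, which suits the financial case. Coercing to a float afterwards is a single call to float if that is what is wanted.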
Use the FFI to call a Java program that uses that language's library for reading the numbers? Then modify the readtable for 0-9, + and - to read something and then call that parser.
-- Thomas A. Russ, USC/Information Sciences Institute t...@isi.edu
* Thomas A. Russ
| Use the FFI to call a Java program that uses that language's library
| for reading the numbers? Then modify the readtable for 0-9, + and -
| to read something and then call that parser.
I'm sorry to say this, but you need to understand how the reader works before you can make useful suggestions.
#:Erik -- If this is not what you expected, please alter your expectations.
I confirm the need (you and others provided me with FORMAT examples for currency amount printing) and agree with the need for configurability. An OT question:
Erik Naggum wrote:
> it's common in some cultures to use dot or space, not comma, for grouping
> (silly Europeans!)
For what reason do you prefer the Anglo-Saxon way over the Continental European (and I think ISO) notation? Maybe because "." has a meaning of full stop, which could mean "stream of integer digits ended here", whereas comma suggests "read on"? Otherwise, they look like arbitrary conventions to me that work equally well internally and are equally confusing* across the sea (whichever it is).
* Example: 4,288 Non-example: "The state deficit of Andorra was EUR 423.234.544,95 in 1999"
> * Robert Monfera <monf...@fisec.com>
> | For what reason do you prefer the Anglo-Saxon way over the
> | Continental European (and I think ISO) notation?
> Well, to be honest: Because it doesn't suck.
> [Incredible, 85-line rant deleted]
Boy, Erik, that is the best rant I've read in a long time!
So many rants just ramble along, switching from topic to topic, spewing bad grammar and worse spelling, and generally making no sense whatever. This one was focused, well-written, and included wonderfully biting sarcasm!
Of course, the fact that I too disdain linguistic inanity had nothing to do with my enjoyment of your rant... :)
> The European notation is ambiguous. In a list of numbers, "they"
> chose comma as the separator between numbers. Within a number,
> "they" then chose, what -- are we out of symbols, yet?, [...]

In (some?) German schools the semicolon is used as list separator, but I am not happy with this, as the pupils have to learn to deal with the usual (ambiguous) notation. Now usually this is not a problem, otherwise the notations would have changed. Usually it is clear from the context what is meant.
J.B.
In article <3167733463524...@naggum.no>, Erik Naggum <e...@naggum.no> wrote:
> Ada has one good thing about numeric constants: They may contain _
> (underscore) to make them more readable.

You know the use of the _ as a group spacer is mentioned on pg. 517 of CLTL2. The _ is an extension character for what he calls a potential number. Unfortunately, a potential number token that does not match the actual number syntax is interpreted in an implementation-dependent manner, the options being to treat it as a symbol, signal an error, or take some other action.
Erik Naggum <e...@naggum.no> writes:
> * Tim Bradshaw <t...@cley.com>
> | I'd also like it to work for floats, actually, so I can say
> |
> | 100,000.00
> |
> | (of course that should probably really be a scaled integer or
> | something if it's a financial calculation, but...)
>
> I agree that this is a useful thing while we're at it: There would
> be no harm in supporting noise characters for floats, too. As far
> as I recall (my Ada literature is at home), underscores are allowed
> in floats in Ada, too.
Underscore makes more sense than comma. (There is some long-ago reason not to use "_" because it meant something different in older dialects of Lisp, but I imagine those are mostly forgotten, so it wouldn't bother me.)
The problem with comma is that 1,000 is a meaningful number to people in some locales and means what 1.000 means to us. A sufficient case could be made that comma in numbers be allowed as a synonym for "." that I'd rather just not allow it in numbers at all. It just invites people to think in a non-computery way about numbers, but there are multiple non-computery ways to think, and so it invites people to quarrel over whose non-computery way should dominate.
Not to mention the incompatibility of things like `(100,200). Though I doubt many exploit that.
Permitting "_" to occur in numbers, at arbitrary points even, would seem fine to me. Though in the noise level compared to the things that really need to be done to the language.
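The backquote incompatibility can be seen with a small experiment. The sketch below reads the same text twice, once with the standard readtable and once with a copy in which comma has been made non-terminating, as in the original post. The function read-form is a hypothetical helper for this illustration. It works on a fresh copy of the standard readtable, so the global readtable is left alone.

```lisp
(defun read-form (string &key non-terminating-comma)
  "Hypothetical helper: read one form from STRING using a fresh copy of
the standard readtable, optionally with comma made non-terminating."
  (let ((*readtable* (copy-readtable nil)))
    (when non-terminating-comma
      ;; Keep the standard comma reader function, but let a comma
      ;; inside a token become part of that token.
      (set-macro-character #\, (get-macro-character #\,) t))
    (read-from-string string)))

;; Standard reader: the comma ends the token 100 and unquotes 200,
;; so the backquoted form evaluates to a list of two integers.
(eval (read-form "`(100,200)"))

;; Non-terminating comma: 100,200 is now a single token.  It is not
;; valid number syntax, so it should read as a symbol, giving a list
;; of one element.
(eval (read-form "`(100,200)" :non-terminating-comma t))
```

A comma at the start of a token, as in `(,x), still invokes the macro function in both cases. Only code that writes a comma directly after a token without intervening whitespace changes meaning.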
In article <3167733463524...@naggum.no>, Erik Naggum <e...@naggum.no> wrote:
> I'd like integers printed with internal commas for grouping to be
> readable as if the commas were absent.
Can't you achieve this behavior with an *evalhook* function that walks the form to be evaluated and replaces symbols with numbers as you want? It's not as clean as a reader mod, but it would seem to provide the behavior you want without having to resort to non-standard extensions.
* Erann Gat wrote:
> Can't you achieve this behavior with an *evalhook* function that walks
> the form to be evaluated and replaces symbols with numbers as you want?
> It's not as clean as a reader mod, but it would seem to provide the
> behavior you want without having to resort to non-standard extensions.
No, because a symbol could have an all-numeric name, or a name with a comma in, or anything generally. You have to do it at read-time as far as I can see.
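A short illustration of why a walker over already-read forms cannot tell the cases apart: once reading is done, a symbol whose name merely looks like a number is an ordinary symbol, and nothing records whether the user escaped it on purpose. The sketch uses only standard functions. The variable names are made up for the example.

```lisp
(with-standard-io-syntax
  (let ((escaped (read-from-string "|123|"))        ; a symbol named "123"
        (grouped (read-from-string "|123,456|"))    ; a symbol with a comma in its name
        (number  (read-from-string "123")))         ; the integer 123
    (list (symbolp escaped)
          (symbol-name escaped)
          (symbolp grouped)
          (integerp number)
          (eql escaped number))))
;; => (T "123" T T NIL)
```

A form walker that rewrote every symbol named like a grouped number would also rewrite the second symbol here, even though it was deliberately written as a symbol.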