The `real?` predicate in R5RS/R7RS and R6RS


John Cowan

Feb 20, 2019, 1:07:37 AM
to scheme-re...@googlegroups.com
In R5RS, the predicate `real?` is defined as `(lambda (z) (zero? (imag-part z)))`.  That is, it does not care whether the imaginary part is 0 or 0.0 or even -0.0.  In R6RS, the `real?` predicate is redefined as `(lambda (z) (and (exact? (imag-part z)) (zero? (imag-part z))))`.

The rationale here is that 0.0, considered as the result of an inexact computation, may not represent a true mathematical zero; in fact, it can be any number between 0 and the halfway point from 0 to the smallest representable positive inexact number.  Similarly, -0.0 can be any number between 0 and the halfway point from 0 to the smallest-magnitude representable negative inexact number.  By saying a number is real only if its imaginary part is exactly 0, we avoid reducing complex numbers with such pseudo-zeros to real numbers.  This is what Common Lisp does, for instance: exact complex numbers with imaginary part 0 are real, but no inexact complex number is ever real.

The R7RS WG1, however, decided by vote that this silent breaking change in R6RS was unacceptable, and preserved the R5RS definition.  I have been thinking since then about whether the R6RS definition should also be present using some name other than `real?`.  For the purposes of this email, I will refer to the R5RS/R7RS `real?` as `r5rs:real?` and the R6RS `real?` as `r6rs:real?`.
(In R6RS, the R5RS definition is available under the name `real-valued?`, but it's hard to remember which of `real?` and `real-valued?` does what.  Note: the `rational?` and `integer?` predicates behave in exactly the same way, so I will disregard them for the rest of this email.)
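Written out under those names, the two predicates look like this (a sketch; the `complex?` guard is my addition so that non-numbers are rejected rather than `imag-part` raising an error, and the result for mixed-exactness arguments depends on what `imag-part` returns):

```scheme
;; R5RS/R7RS semantics: only the *value* of the imaginary part matters.
(define (r5rs:real? z)
  (and (complex? z) (zero? (imag-part z))))

;; R6RS semantics: the imaginary part must be an exact zero.
(define (r6rs:real? z)
  (and (complex? z)
       (exact? (imag-part z))
       (zero? (imag-part z))))

(r5rs:real? 3+0.0i)   ; => #t
(r6rs:real? 3+0.0i)   ; => #f
(r6rs:real? 3+0i)     ; => #t
(r6rs:real? 3.0+0i)   ; depends on what (imag-part 3.0+0i) returns
```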

Now Scheme's rectangular complex numbers can take any of four formats, depending on whether the real part, the imaginary part, both, or neither is inexact.  We write these, for example, as 3.0+0i, 3+0.0i, 3.0+0.0i, and 3+0i.  The first three count as inexact in all systems and the last as exact, although some implementations automatically coerce exact complex numbers to inexact ones, and some that do keep exact and inexact complex numbers separate nevertheless coerce mixed-exactness complex numbers (the first two formats) to inexact ones.  Finally, there are two real number formats: exact 3 and inexact 3.0.

The exact numbers 3+0i and 3 are treated the same way by both `r5rs:real?` and `r6rs:real?`: both return true.  Likewise, in all Schemes 3.0+0.0i and 3+0.0i return true to `r5rs:real?` and false to `r6rs:real?`.  However, for the mixed-exactness complex number 3.0+0i and the inexact real number 3.0, `imag-part` may return either 0 or 0.0: neither R5RS nor R7RS specifies the result.

All R6RS systems behave the same way:  `(imag-part 3.0+0i)` => 0.0, whereas `(imag-part 3.0)` => 0.  However, when we take into account the R5RS and R7RS Schemes that support both exact and inexact complex numbers, things become more complicated:

Chicken 4 with the numbers egg, Chicken 5, and Scheme 48 return 0.0 as the value of both `(imag-part 3.0+0i)` and `(imag-part 3.0)`.

Gambit, MIT, and STklos return 0 as the value of both `(imag-part 3.0+0i)` and `(imag-part 3.0)`.

Lastly, in Chibi `(imag-part 3.0+0i)` => 0 but `(imag-part 3.0)` => 0.0, the exact opposite of R6RS.

The consequence is that when we call `r5rs:real?` on any of these values, we always get true, but `r6rs:real?` is less predictable: it returns true for exact real and complex numbers and false for inexact complex numbers, but for mixed-exactness complex numbers and inexact real numbers it may return either true or false.

What I conclude from this is that the `r6rs:real?` function is not useful as part of R7RS-large, since it doesn't consistently do what we expect it to do.  So either we must specify a particular behavior for `imag-part`, or else we must abandon making `r6rs:real?` part of R7RS-large.

Comments?

-- 
John Cowan          http://vrici.lojban.org/~cowan        co...@ccil.org
It's the old, old story.  Droid meets droid.  Droid becomes chameleon.
Droid loses chameleon, chameleon becomes blob, droid gets blob back
again.  It's a classic tale.  --Kryten, Red Dwarf

Marc Nieper-Wißkirchen

Feb 20, 2019, 2:28:40 AM
to scheme-re...@googlegroups.com
What about adding an optional library "(scheme r6rs)" to R7RS-large?
This library would export procedures and keywords with the R6RS
semantics ("real?" and "letrec", for example). (A program including
"(scheme r6rs)" and, say, "(scheme base)" at the same time would have
to rename or exclude some identifiers.)

A program can check for the existence of "(scheme r6rs)" using
"cond-expand". If "(scheme r6rs)" is provided by a Scheme system, its
implementation makes some guarantees that are forced by R6RS. For
example that `(imag-part 3.0+0i)` => 0.0 and `(imag-part 3.0)` => 0.

(It will need some care to specify the exact guarantees that have to
be made as soon as "(scheme r6rs)" is provided.)
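A feature test along these lines might look like the following (a sketch; the library name `(example numeric-guarantees)` and the exported flag are invented for illustration, and only `(scheme r6rs)` is the name Marc proposes):

```scheme
(define-library (example numeric-guarantees)
  (import (scheme base))
  (export r6rs-numerics?)
  (cond-expand
    ((library (scheme r6rs))
     ;; (scheme r6rs) is present, so the stricter R6RS guarantees
     ;; hold, e.g. (imag-part 3.0+0i) => 0.0 while (imag-part 3.0) => 0.
     (begin (define r6rs-numerics? #t)))
    (else
     ;; Only the looser R5RS/R7RS behaviour can be assumed.
     (begin (define r6rs-numerics? #f)))))
```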

Some prospective R7RS-large systems may not want to support "(scheme
r6rs)", while others like Larceny may easily support "(scheme r6rs)".

Marc
> --
> You received this message because you are subscribed to the Google Groups "scheme-reports-wg2" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to scheme-reports-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Alex Shinn

Feb 20, 2019, 2:50:55 AM
to scheme-re...@googlegroups.com
On Wed, Feb 20, 2019 at 2:07 PM John Cowan <co...@ccil.org> wrote:
All R6RS systems behave the same way:  `(imag-part 3.0+0i)` => 0.0, whereas `(imag-part 3.0)` => 0.  However, when we take into account the R5RS and R7RS Schemes that support both exact and inexact complex numbers, things become more complicated:

Chicken 4 with the numbers egg, Chicken 5, and Scheme 48 return 0.0 as the value of both `(imag-part 3.0+0i)` and `(imag-part 3.0)`.

Gambit, MIT, and STklos return 0 as the value of both `(imag-part 3.0+0i)` and `(imag-part 3.0)`.

Lastly, in Chibi `(imag-part 3.0+0i)` => 0 but `(imag-part 3.0)` => 0.0, the exact opposite of R6RS.

Chibi behaves the same as Gambit, MIT and STklos, returning exact 0 for numbers with no imaginary part.

--
Alex

Marc Nieper-Wißkirchen

Feb 20, 2019, 8:04:42 AM
to scheme-re...@googlegroups.com
How is "rational?" related to this?

Chibi gives (real? 3.0+0.0i) => #t but (rational? 3.0+0.0i) => #f.

Jim Rees

Feb 20, 2019, 10:19:47 AM
to scheme-re...@googlegroups.com
On Wed, Feb 20, 2019 at 8:04 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:
How is "rational?" related to this?

Chibi gives (real? 3.0+0.0i) => #t but (rational? 3.0+0.0i) => #f.

I observe differently on Chibi.
(real? 3.0+0.0i) ==> #f

...and this appears consistent with the code in lib/init-7.scm when (features) contains "complex".  It then means that (rational? 3.0+0.0i) ==> #f is consistent as well.

Marc Nieper-Wißkirchen

Feb 20, 2019, 10:34:35 AM
to scheme-re...@googlegroups.com
Am Mi., 20. Feb. 2019 um 16:19 Uhr schrieb Jim Rees <jimr...@gmail.com>:
>
> On Wed, Feb 20, 2019 at 8:04 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:
>>
>> How is "rational?" related to this?
>>
>> Chibi gives (real? 3.0+0.0i) => #t but (rational? 3.0+0.0i) => #f.
>
>
> I observe differently on Chibi.
> (real? 3.0+0.0i) ==> #f

Indeed. I cannot reproduce my previous findings. However, this would
mean that Chibi's real? is not R5RS's?

John Cowan

Feb 20, 2019, 3:41:28 PM
to scheme-re...@googlegroups.com
On Wed, Feb 20, 2019 at 2:28 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

What about adding an optional library "(scheme r6rs)" to R7RS-large?

This is a reasonable idea, but someone would need to spell out exactly
which procedures and syntax would need to be included.  Would the
scope be (rnrs base) only, or would it be the whole of R6RS?  This
"someone" would probably not be me.  I have enough on my plate
juggling eggs in variable gravity.

-- 
John Cowan          http://vrici.lojban.org/~cowan        co...@ccil.org
By naming the names they rejoiced in the complexity and specificity,
the wealth and beauty of the world, they participated in the fullness of
being. They described, they named, they told all about everything. But
they did not pray for anything.  --Le Guin, The Telling

John Cowan

Feb 20, 2019, 3:57:24 PM
to scheme-re...@googlegroups.com
On Wed, Feb 20, 2019 at 2:50 AM Alex Shinn <alex...@gmail.com> wrote:

Chibi behaves the same as Gambit, MIT and STklos, returning exact 0 for numbers with no imaginary part.

Thanks; I've corrected the ComplexImplementations page.  Probably just a glitch on my part when I was creating it, which (like all the other subpages of ImplementationContrasts) was basically a manual process: my script just ran each REPL in turn and let me paste in the test case(s) and write the result(s) directly into the subpage.

I will try to get Chicken changed to do one or the other, but it may not make sense because there are no mixed-exactness complex numbers there.

-- 
"Well, I'm back."  --Sam        John Cowan <co...@ccil.org>
 

Marc Nieper-Wißkirchen

Feb 20, 2019, 4:00:06 PM
to scheme-re...@googlegroups.com
Am Mi., 20. Feb. 2019 um 21:41 Uhr schrieb John Cowan <co...@ccil.org>:
>
>
>
> On Wed, Feb 20, 2019 at 2:28 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:
>
>> What about adding an optional library "(scheme r6rs)" to R7RS-large?
>
>
> This is a reasonable idea, but someone would need to spell out exactly
> which procedures and syntax would need to be included. Would the
> scope be (rnrs base) only, or would it be the whole of R6RS?

One strategy could be to add a minimal set of variables/keywords to
(scheme r6rs), namely those that differ between R7RS and R6RS and
those that are needed to implement all the (rnrs *) libraries (and
that can reasonably be provided by an R7RS Scheme system).

If we follow this strategy, it wouldn't make sense to specify (scheme
r6rs) before the rest of R7RS-large is specified. For example, if we
get, say, (scheme syntax-case), we wouldn't need to include that in
(scheme r6rs).

Alternatively, (scheme r6rs) could include all bindings of (rnrs) but
this would be a rather large library and duplicate a lot of
functionality.

One point of having (scheme r6rs) would be that code could test for
(stricter) R6RS semantics by using cond-expand with that library name.
This would also work with a minimal library.

Marc

> This
> "someone" would probably not be me. I have enough on my plate
> juggling eggs in variable gravity.
>
> --
> John Cowan http://vrici.lojban.org/~cowan co...@ccil.org
> By naming the names they rejoiced in the complexity and specificity,
> the wealth and beauty of the world, they participated in the fullness of
> being. They described, they named, they told all about everything. But
> they did not pray for anything. --Le Guin, The Telling
>

Jim Rees

Feb 21, 2019, 11:46:34 AM
to scheme-re...@googlegroups.com
I am confused about the errata at https://small.r7rs.org/wiki/R7RSSmallErrata/

Item #24, regarding the example "(real? 2.5+0.0i)", says the conflict hasn't yet been resolved; but you say here, and item #19 confirms, that R7RS did not adopt the R6RS semantics, which to me means that (real? 2.5+0.0i) ==> #t is an R7RS requirement and there is no conflict.

Beyond that, I was going to make some observations, but later realized it's all been hashed out already.  I'm going with my personal conclusion that mixed-exactness (and mixed-precision) complex numbers are problematic.  They are less problematic in an R6RS world, where <foo>+0.0i is not real and thus does not have to be supported by procedures that accept real arguments.


Yongming Shen

Aug 8, 2022, 5:25:26 AM
to scheme-reports-wg2
The rationale mentioned at the beginning of this thread for making `(real? 3.0+0.0i)` return `#f` seems problematic. By the same logic, `(integer? 3.0)` should also return `#f`, because the fractional part may not be exactly zero, and `(rational? 3.14)` should also return `#f`, because an inexact number that has a rational value may in fact be an inexact representation of an irrational number. To extrapolate further, even `(zero? 0.0)` should return `#f`. R5RS, R6RS, and R7RS all state that the type of a number (in the numerical tower) is orthogonal to the exactness of the number; involving exactness testing in `real?`/`rational?`/`integer?` violates this principle. This violation aside, a reasonable design should either consistently use a number's apparent value (i.e., the value in memory) to decide the number's type without consulting exactness, or consistently take exactness into account. The R6RS semantics is a mixture of both.

Also worth mentioning is that the documented rationale for R6RS's change to `real?` is in fact not the one mentioned at the beginning of this thread. Rather, it has to do with making `real?` friendlier to flow analysis [http://www.r6rs.org/final/html/r6rs-rationale/r6rs-rationale-Z-H-13.html#node_sec_11.6.6.2; `(real? x)` should probably be `(and (real? x) (inexact? x))` in the example there]. However, I'm not convinced that changing `real?` is the right solution to the flow analysis problem. `flonum?` is part of the R6RS standard library and can be used instead of `(and (real? x) (inexact? x))` to identify floating-point code; it is shorter and communicates the programmer's intention more clearly. Furthermore, the same rationale mentions that `flsqrt` and `flexpt` were introduced to aid flow analysis in floating-point code; using `flonum?` to identify floating-point code is similar in spirit, so it is not "extra trouble".
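For comparison, the two tests look like this (a sketch using the R6RS libraries; the name `inexact-real?` is mine, and the equivalence with `flonum?` assumes an implementation whose only inexact reals are flonums):

```scheme
(import (rnrs base (6))
        (rnrs arithmetic flonums (6)))

;; The test from the flow-analysis example in the R6RS rationale:
(define (inexact-real? x)
  (and (real? x) (inexact? x)))

(inexact-real? 3.0)        ; => #t
(flonum? 3.0)              ; => #t
(inexact-real? 3.0+0.0i)   ; => #f under R6RS: the imag part is inexact
(flonum? 3.0+0.0i)         ; => #f
```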

Marc Nieper-Wißkirchen

Aug 9, 2022, 4:18:58 AM
to scheme-re...@googlegroups.com
I know that this is not the motivation for R6RS's specification of the `real?' procedure, but mathematically it can make sense to view x+0.0i and x-0.0i as proper complex and not as real numbers at least when +0.0 and -0.0 are distinguished.  One hint at that is the discontinuity of the principal value of the logarithm at the negative real axis.  Operator theory where one inverts $A \pm i \epsilon$ for a small \epsilon (and a self-adjoint A) is another example.  Here, $\pm i \epsilon$ could be replaced by $+0.0i$ and $-0.0i$.

In any case, I don't think that exact non-real complex numbers are very interesting.  Complex numbers of mixed exactness even less (I would remove them if that makes things simpler).

Yongming Shen

Aug 13, 2022, 3:41:01 PM
to scheme-re...@googlegroups.com
This seems to boil down to whether or not to use +0.0/-0.0 to mean "some extremely small positive/negative number" or "zero". Even without complex numbers, both interpretations are already in use, as `(/ 1.0 +0.0) => +inf.0`, `(/ 1.0 -0.0) => -inf.0`, `(= +inf.0 -inf.0) => #f`, but `(= +0.0 -0.0) => #t` and `(= +/-0.0 0) => #t`. So the consistency ship has sailed. For complex numbers, we can do the same and choose among the two interpretations based on need. For example, when computing the principal value of the logarithm, we go with "some extremely small positive/negative number", but when evaluating `(= 1.0+0.0i 1.0-0.0i)`, we go with "zero"; for `r7rs:real?` we go with `zero`, but for `r6rs:real?` we go with "some extremely small positive/negative number", etc.

Back to the opening question: whether to abandon `r6rs:real?` as a library function, or to pin down `imag-part` as R6RS does in order to make `r6rs:real?` useful. Refining `imag-part` provides a portable way to tell `1.0+0.0i` and `1.0` apart, which is important in cases where the "some extremely small positive/negative number" interpretation of `+/-0.0` is desired for `+/-0.0i`, such as `r6rs:real?` and the two cases that Marc mentioned. Overall, refining `imag-part` reduces ambiguity, has practical use cases, and does not seem too much to ask from implementations, so it is likely the right choice.

BTW, the opening email mentions that "All R6RS systems behave the same way: `(imag-part 3.0+0i)` => 0.0", that is not true. In Chez Scheme 9.5.8, `(imag-part 3.0+0i) => 0`. I think Chez Scheme simply stores `3.0+0i` as a flonum.



Marc Nieper-Wißkirchen

Aug 15, 2022, 6:45:49 AM
to scheme-re...@googlegroups.com, Bradley Lucier
CCing Bradley, who is mentioned in the R⁶RS rationale for having come
up with the idea of the way ‘real?’ is specified in R⁶RS.

Marc

Linas Vepstas

Aug 15, 2022, 7:37:15 AM
to scheme-re...@googlegroups.com
FYI, a string of minor comments.

The branch cut of the logarithm extends along the negative real axis only by convention. It can be moved anywhere, and need not be a straight line: it can be an arbitrary curve on the complex plane, extending from the origin to infinity in any direction. The polylogarithm has two branch cuts, one extending from 0 to infinity and the other from +1 to infinity (by convention, on the negative and positive real axes).

In practice, some versions of Guile sometimes return x+0.0i for the root of a positive real number. Clearly a bug, but I don't know why. It happens often enough that I have to work around it explicitly, but not often enough to complain to the Guile people.

It would have been nicer(??) if Guile had thrown an exception when it couldn't compute some square root, instead of silently returning a complex number. All the complex number did was raise an exception somewhere further down the road.

My wish-list from 10 years ago: scheme with arbitrary-precision complex numbers. As I no longer do arbitrary-precision work, this is now moot, but thought I should mention it.

-- linas



--
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.
 

Marc Nieper-Wißkirchen

Aug 15, 2022, 10:54:11 AM
to scheme-re...@googlegroups.com
Another comment: We shouldn't think too much of a complex number as a
pair of real numbers. This is just one way of representing them. The
polar representation is another one.

If `real?' returns true on `(make-rectangular x 0.0)', one can make
the case that it should also return true on `(make-polar 1 (atan +0.0
-0.0))'. But I am not sure whether the latter can be guaranteed.
(Note that the atan-expression is only present to describe the value
of pi.)

Am Mo., 15. Aug. 2022 um 13:37 Uhr schrieb Linas Vepstas
<linasv...@gmail.com>:

Jay Freeman

Aug 16, 2022, 5:18:08 AM
to scheme-re...@googlegroups.com, Jay Freeman
I have been watching this discussion for a while, and am hesitant to speak up since my Scheme-implementing experience stops at R5 (Wraith Scheme). But if you don't mind comments from my limited viewpoint, I have three points to make:

First point: Some of the issues in this thread depend on implementation details in ways that perhaps should not be allowed to affect the language specification. For example, a Scheme implementation might operate as follows: (This is what Wraith Scheme does, but I am not suggesting this is the best way, just using it as an example.)

1) Treat any numerical constant with an explicit zero imaginary part as if the imaginary part weren't there. Thus "3+0i" would be treated as if it were "3", and might well end up stored as a fixnum with an exact bit set. (I am being vague about what and where the exact bits are, and about how the system can tell whether a stored number is a fixnum, a flonum, or whatever, but those are details that all implementers have to deal with and there is no need to go there at present.) The system would report (number? 3) (real? 3) and (complex? 3) as #t, and would report (real-part 3) as 3 and (imag-part 3) as 0. And ...

2) If any series of mathematical operations produces a result with zero imaginary part, forget the value of the imaginary part and store the result as (e.g.) a fixnum, flonum or whatever, but make sure to clear the exact bit if the calculated imaginary part was inexact. Thus a result which might have turned out looking something like (exact three)+(inexact zero times i) would be returned as three with the exact bit cleared, and would display as "3.". (There are of course other reasons to clear the exact bit.)

Second point: I suspect that most cases when a Scheme mathematical operation produces a result with an underflowed zero as its imaginary part are cases when the true mathematical operation would produce a number with exactly zero imaginary part, and the problem amounts to the Scheme implementation pedaling hard but not quite hard enough when faced with a complex number whose angle is pi. In these cases it would be risky to assume that the sign of the underflowed zero indicated which side of a branch cut was involved, even if there were agreement about where the branch cut should be in the first place.

Third point: I suggest that parts of this discussion tread uncomfortably close to the assumption that when Scheme reports a number as inexact, the number presented is in some everyday sense "close to" the mathematically correct value. That of course isn't necessarily so, as persons who subtract two nearly identical but uncertain numbers and then take the reciprocal will realize: The mathematical correct result in that case could be infinity or anywhere along the real line.
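A two-line illustration of that third point, assuming IEEE double-precision inexact reals:

```scheme
;; x and y are "nearly identical but uncertain": they differ by a
;; single ulp near 1.0.
(define x 1.0000000000000002)
(define y 1.0)

;; The subtraction cancels almost every significant digit, and the
;; reciprocal magnifies whatever noise survives:
(/ 1.0 (- x y))   ; => about 4.5e15; had x been off by just one ulp,
                  ;    the result would be half that, or +inf.0
```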

-- Jay Reynolds Freeman
---------------------
Jay_Reynol...@mac.com
http://JayReynoldsFreeman.com (personal web site)

Yongming Shen

Aug 17, 2022, 5:03:53 AM
to scheme-re...@googlegroups.com, Jay Freeman
Interpreting an inexact number as an inexact representation of some math number (a number in the sense of mathematics) is indeed problematic. Without context, there is no way to know what the math number is, or what the error margin introduced by inexactness is. Related to this, I think there are two possible mental models for explaining what an inexact number in Scheme is.

1)  An inexact number is a tainted number. When an arithmetic function cannot find a mathematically correct result, it returns a mathematically incorrect but "sensible" number as the result and marks the number as tainted. When thinking of inexact numbers in this way, it is natural to think that an inexact number is a math number; the taint is just there to show that some mathematically incorrect arithmetic was involved in its creation. In this case, having `-0.0`, `+inf.0`, `-inf.0`, and `+nan.0` as inexact numbers doesn't make sense; nor does having `1.0` and `1.0+0.0i` be distinct inexact numbers. And while it makes perfect sense to propagate the taint, it doesn't make much sense to say that "since X and Y are tainted, `(+ X Y)` can return a mathematically incorrect result". So if possible, arithmetic operations should return mathematically correct results even if their inputs are tainted. Overall, thinking of inexact numbers as tainted numbers is clean and straightforward, but does not fit well with the reality of Scheme.

2) An inexact number is some sort of strange number, or pseudo number. There is an incomplete mapping (`+inf.0`, `-inf.0`, and `+nan.0` are excluded) from strange numbers to math numbers, but strange numbers are not math numbers. Arithmetic operations on strange numbers resemble arithmetic operations on math numbers, but they are not the same. For example, if X and Y are strange numbers and their math counterparts are A and B, `(exact (+ X Y))` and `(+ A B)` are not necessarily the same math number. As another example, two strange numbers X and Y can map to the same math number (e.g., +0.0 and -0.0 both map to 0), yet arithmetic operations on strange numbers can tell X and Y apart. The classification of strange numbers into integral, rational, real, and complex (strange) numbers again resembles the classification of math numbers but is not the same, as it involves `+inf.0` and `-inf.0`. The propagation of inexactness (or strangeness) when strange numbers and math numbers are mixed in generic arithmetic is not a matter of taint propagation, but a matter of implicit conversion of math numbers into strange numbers. The returning of a strange number from a generic arithmetic operation that has only math number inputs is not a matter of returning an incorrect math number, but a matter of "performing this operation in the math number domain is not supported, so I converted the inputs to strange numbers and performed a 'similar' operation in the strange number domain, and here's the result". Overall, thinking of inexact numbers as strange numbers is a bit messy, but does fit well with the reality of Scheme.

Within the mental model of inexact numbers as strange numbers, what R6RS does is add more strangeness to strange numbers: it distinguishes `1.0` and `1.0+0.0i` as two different strange numbers, maps them both to the math number 1, and yet does not classify `1.0+0.0i` as a real (strange) number. Having `(imag-part 1.0)` return the math number `0` instead of the strange number `0.0` also adds strangeness, but really is just a way to tell `1.0` and `1.0+0.0i` apart. A less confusing design would be to introduce a strange number predicate `has-explicit-imag-part?` and make `(imag-part 1.0)` return `0.0` instead of `0`.
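A sketch of that alternative (everything here is hypothetical; the predicate cannot be written portably, since it reveals exactly the representation detail that current Schemes hide):

```scheme
;; Hypothetical predicate: #t iff the number is stored in rectangular
;; form with an explicit (possibly zero) imaginary part.
(has-explicit-imag-part? 1.0)        ; => #f under the proposal
(has-explicit-imag-part? 1.0+0.0i)   ; => #t under the proposal

;; imag-part could then uniformly return an inexact zero for
;; inexact arguments:
(imag-part 1.0)        ; => 0.0 under the proposal
(imag-part 1.0+0.0i)   ; => 0.0 under the proposal
```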

Changing "exact number and inexact number" to "normal number and strange number", or to "math number and pseudo number" may reduce confusion when describing numbers in Scheme.



Alaric Snell-Pym

Aug 17, 2022, 8:01:09 AM
to scheme-re...@googlegroups.com, Jay Freeman, Yongming Shen
Another angle to take is: What are people using numbers for in Scheme?
What are the use cases? What are their expectations, or (to avoid
being trapped by the common case of
programmers-not-really-understanding-floating-point-numbers) what
SHOULD their expectations be?

One reason to want numbers-in-Scheme to be "like" the numbers described
in mathematics is to allow reasoning about them. C is infamous for its
arithmetic being full of surprises and undefined behaviour, which means
that although it looks like "int x = a+b; return x-c" should be
identical to "int x = a-c; return x+b" it isn't, as if (a+b) overflows
but (a-c)+b doesn't the former has undefined behaviour and the latter
doesn't. Making numbers-in-Scheme behave "like" "actual" numbers as we
learnt about in school reduces the chance for subtle bugs and annoying
problems (have you seen C code that tries to detect if an arithmetic
operation is going to overflow? It's... convoluted). This is pleasant
for programmers, and also quite pleasant for people writing optimising
compilers. And essential for people writing maths software like theorem
provers.

And yet sometimes that's not what we want. When writing code to emulate
hardware arithmetic circuits (eg, because we're emulating some VM, or an
encryption/hashing algorithm defined in terms of fixed-width registers),
or just interfacing with C code, we might *want* to use specific types
like "twos-complement signed 32-bit integer with overflowing arithmetic"
(or we might want ones-complement, or might want saturating arithmetic,
or want different bit widths, which might not only correspond to the
usual 8/16/32/64/128... progression). Should a Scheme library offering
those wrap them in a completely opaque fixed-width-binary-integer type
outside of the numeric tower and have special procedures like fix+ and
fix- to operate on them? Or should they be manipulable with + and -?
Should "(+ (make-twos-complement-signed-32-bit-integer-with-overflow 1)
1)" work? My hunch would be that these types of things are sufficiently
alien from "proper" numbers, although their behaviour may seem
indistinguishable *within a limited range*, that we should be using
special procedures to deal with them, and maybe those special procedures
should accept "proper" numbers and attempt to fit them into the
binary-integer world using appropriate and well-defined semantics as a
convenient shortcut for programmers who want to add 1 to such an object.

And that's just integers. What about floating point numbers? All our
talk of the real meaning of inexact numbers in Scheme, I feel, is
skirting around the fact that we're trying to give a sensible
mathematical semantics to things like IEEE floats (and their relatives,
such as bfloat16 - and maybe other less-used representations (I vaguely
remember reading about a numeric representation that was a fixed-point
binary representation of the logarithm of the number, whatever happened
to that?)).

Floats are certainly a useful data type. They have fast hardware
implementations and are appropriate for a lot of approximate numerical
work, but I think too many programming languages have dropped them in
and called them "real numbers" and treated them as the de-facto,
natural, even *only* way of representing numbers with
better-than-integer precision. I mean, in most languages any number with
a decimal point in it automatically gets handled as a float. Of some
unspecified precision.

I am not a numerical computing expert, so please correct me if I'm
wrong, but I think that float-of-unspecifed-precision is far too woolly
a notion for any serious work where correctness matters; it's about
right for non-critical work such as calculating heuristics - I've
written code, deep in the belly of a SQL query engine, that used
floating point arithmetic to estimate the costs of different join orders
in order to optimise joins, for instance; the estimation used a simple
statistical model of the structure of the data, with "number of rows in
table" and "number of unique values of the join key in the table" as the
inputs, assuming a uniform distribution of join key values, and so on,
so the precision of the result was already *entirely undefined* and
"best effort" was all we could have, and vastly better than nothing.

(As an interesting aside, I wasn't the original author of this code. I
went in to fix it because it wasn't working and nobody knew why. I found
that the code was written as a direct translation of a formula from a
paper, and had an expression something like pow(1.0-1.0/rows_in_table,
values_in_column) - where rows_in_table would be in the billions or
trillions, so 1.0/rows_in_table would be a tiny number, and 1.0 fp-minus
that would often just come out as 1.0, so the code *always produced the
same estimates*. On that day I found out why the C math.h library has a
log1p function that computes log(1+x) even when x is close to zero, so I
could rewrite it as exp(values_in_column*log1p(-1.0/rows_in_table)) and
suddenly our engine produced half-sensible join orders - as I mentioned
earlier, programmers are often confused by the nature of floating point
numbers)
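(To make that failure mode concrete, here is the same calculation sketched in Python rather than the original C; the magnitudes are made up for illustration, but Python's math.log1p behaves just like the math.h function:

```python
import math

# Hypothetical magnitudes: a table large enough that 1/rows falls below
# one ulp of 1.0, so the subtraction loses the information entirely.
rows_in_table = 1e17
values_in_column = 1e9

# Naive translation of the paper's formula: 1.0 - 1e-17 rounds to
# exactly 1.0, so the estimate collapses to 1.0 regardless of inputs.
naive = (1.0 - 1.0 / rows_in_table) ** values_in_column

# Rewritten via log1p, which computes log(1 + x) accurately for tiny x.
accurate = math.exp(values_in_column * math.log1p(-1.0 / rows_in_table))

print(naive)     # 1.0 -- always the same estimate
print(accurate)  # just below 1.0, and actually sensitive to the inputs
```

The rewritten form recovers a result that still depends on the inputs, which is the whole point of a cost estimate.)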

(As a second aside, the original programmer had seemed unaware of the
availability of the "pow" function in math.h so had implemented the
exponentiation with a repeated multiplication loop, which has even worse
numerical stability issues than calling "pow", and was slow when the
number of values in the column was in the millions - so a later visitor
to the code had limited the number of iterations of the loop to at most
a thousand for performance reasons, noting correctly that the
intermediate value would be 1.0 by then - but seemingly not considering
the fact that in a table of billions of rows there would be a massive
difference in join cost between using a column with a thousand unique
values and a column with a billion unique values... I suspect the change
made the join orders we produced *no worse* than the already broken
code, but made query planning a lot faster, so what's not to like, eh?)

So... I feel that a programming language might want to offer something
that could perhaps be called an "approximate number" or even a "vague
number" (to make programmers suitably aware that it's not to be trusted)
to be the "whatever floating point representation is fast on this CPU"
for that kind of work, if that is even a useful concept in this day and age.

But I strongly suspect a language should certainly offer types
explicitly labelled as "IEEE 754 binary32" or the like. Specific
floating-point types with well-defined semantics, so that people doing
proper numeric computation can ask for, and get, the semantics they
want, and the same code will run the same on different machines. And
there should be fixed-width decimal types so people doing money can do
money right (floats really, really, aren't the way to do money right).

And those things, like the fixed-width binary types, should be kept away
from the purer notion of a "number" that people might mistake them for
and make mistakes.

I suspect the inbuilt notion of "number" embedded in the core of a
language, and accessed with the nice obvious procedure names like +,
should only cover things that are close enough to the mathematical
concept of a "proper number" that the differences are irrelevant or obvious.

We can do integers properly, with bignums. The difference between a
bignum and an actual integer boils down to memory size limits. Failing
those will result in an error rather than silently wrong results. I can
live with that.

We can do rational numbers properly, subject to the same limit.
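(For instance -- a Python illustration, since its standard library has exact rationals much as Scheme does -- exact rationals behave the way school arithmetic says they should, where binary floats don't:

```python
from fractions import Fraction

# Binary floats can't represent 0.1 or 0.2 exactly, so the sum drifts.
print(0.1 + 0.2 == 0.3)                                      # False

# Exact rationals have no such problem.
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True
```

Only memory, not precision, limits the result.)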

We can even do rational intervals, so things like sensors can return a
value with measurement error margins.

We can do symbolic numbers, so (sqrt 2) can be represented properly.

Complex numbers are more interesting, as they are usually represented as
two real numbers (a real and an imaginary part). We could debate the mathematical meaning of a complex
number with an exact real part and an inexact imaginary part (or inexact
magnitude and exact angle), but I really don't know enough about the
practical uses of complex numbers in software to say anything useful
about "But what would actually be useful to programmers?".

Would "IEEE binary32 complex" be a useful type of number to somebody,
which in my world would be tainted by the IEEE-float-ness and so need to
be kept isolated from the numeric tower?

Would "complex of proper numbers (be they integers or rationals or
rational intervals or symbolic numbers)" be useful? To be fair, if we
have symbolic numbers, we basically get that for free because we can
have an exact answer to "(sqrt -1/4)".

Complex numbers with non-floaty semantics in the numeric tower is
certainly doable, but what would they be useful for?

Are the applications of complex numbers in software more in the
numeric-programming domain that floats are made for, and appropriate
for, and the users can be expected to know about the caveats?

But I think fixed-width integers and floats and so on need to be put in
a box of their own, rather than shoehorned into the numeric tower with
"proper" numbers. And that box of their own (by which I mean: a set of
distinct procedures for manipulating them) can have their own
"real?"/"zero?" procedures whose semantics are defined in terms that
make sense in that domain - or omitted if there are no sensible
semantics, as is arguably the case for rectangular-complex-of-ieee-binary32.

In other words, 1.0 should just be a verbose way of writing 1, and 1.5
should be an exact rational. (sqrt 2) and (sqrt -1) should raise an
error unless you have symbolic numbers, while (ieee754:sqrt
(ieee754:make-binary32 2)) should return ~1.4142135623731, (ieee754:sqrt
(ieee754:make-binary32 -1)) should return +nan.0, and
(ieee754-complex:sqrt (ieee754:make-binary32 -1)) should return
~0.0+~1.0i :-)

--
Alaric Snell-Pym (M0KTN neé M7KIT)
http://www.snell-pym.org.uk/alaric/

Linas Vepstas

Aug 17, 2022, 11:18:28 AM
to scheme-re...@googlegroups.com, Jay Freeman, Yongming Shen
Against my better judgement, I will attempt a short reply to this long email. 

On Wed, Aug 17, 2022 at 3:01 PM Alaric Snell-Pym <ala...@snell-pym.org.uk> wrote:
Another angle to take is: What are people using numbers for in Scheme?

This year (and the last few, and the next few) the only thing I do with scheme numbers is ordinary floating-point probability theory. So lots of log_2 of sums of ordinary real numbers. Usually in the range of -300 to +300.  I'm more or less happy with 32-bit floating point, although out of paranoia, I used 64-bit floats.  Speed is important.  But not that important; the critical stuff I write in C anyway.

What are the use cases? What are their expectations (or, to avoid being
trapped by the common case of
programmers-not-really-understanding-floating-point-numbers) what SHOULD
their expectations be?

The few cases where I had to deal with equality comparisons, I did so with the intent of avoiding traps.

The people who really care about these things program in fortran or C, and can explain them, even at 3AM while drunk.

I don't see too much of a point in trying to alleviate such issues in scheme; if you create some complicated system, it will just be harder to grok what scheme is actually doing.



And yet sometimes that's not what we want. When writing code to emulate
hardware arithmetic circuits (eg, because we're emulating some VM, or an
encryption/hashing algorithm defined in terms of fixed-width registers),

Those people will be programming in ASP (answer set programming), with the Potsdam solver (or any one of a few dozen other arcane systems). SAT solvers have revolutionized hardware emulation. That crowd is lost to scheme. Unless, of course, you provide a scheme module for the Carnegie Mellon BAP (Binary Analysis Platform).  There is room to make headway there.



I am not a numerical computing expert, so please correct me if I'm
wrong, but I think that float-of-unspecified-precision is far too woolly
a notion for any serious work where correctness matters;

Eh? In conventional arbitrary-precision libraries, you always state exactly how many digits of precision you want. You want 300 binary digits? 3000? 30K? you just say so.  It's a setting.

Correctness is more subtle; you have to understand rounding errors.  If you expect an answer that is correct to 3000 decimal places, you *will* be very conscientious about where and how you are losing digits of precision.  95% of the effort in arbitrary-precision work is understanding where those rounding errors go.
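For example, in Python's decimal module (one such arbitrary-precision library, used here only as an illustration), the precision is literally a setting:

```python
from decimal import Decimal, getcontext

# Ask for 50 significant decimal digits; any supported value works.
getcontext().prec = 50

d = Decimal(1) / Decimal(7)
print(d)  # 0.14285714285714285714285714285714285714285714285714
```

You state the precision up front; tracking where rounding happens from there on is still your job.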
 

(As an interesting aside, I wasn't the original author of this code. I
went in to fix it because it wasn't working and nobody knew why. I found
that the code was written as a direct translation of a formula from a
paper, and had an expression something like pow(1.0-1.0/rows_in_table,
values_in_column) - where rows_in_table would be in the billions or
trillions, so 1.0/rows_in_table would be a tiny number, and 1.0 fp-minus
that would often just come out as 1.0, so the code *always produced the
same estimates*. On that day I found out why the C math.h library has a
log1p function that computes log(1+x) even when x is close to zero, so I
could rewrite it as exp(values_in_column*log1p(-1.0/rows_in_table)) and
suddenly our engine produced half-sensible join orders - as I mentioned
earlier, programmers are often confused by the nature of floating point
numbers)

Sure. But anyone in the business eventually learns this the hard way.  In your next job, you will be given code with a race condition in it, and you will learn the hard way all about mutexes, semaphores and condition variables.  I don't see that either issue is trivially avoidable.

Perhaps there is some magic way to automatically deal with disastrous floating-point rounding errors, but it's going to be .. umm, "from left field". Very different from today's world.

(As a second aside, the original programmer had seemed unaware of

... And there you have it. Programming in a nutshell.


there should be fixed-width decimal types so people doing money can do
money right (floats really, really, aren't the way to do money right).

But people who do money already know this,  in the same way that scientists have math libraries, and kernel programmers know mutexes.

We can do symbolic numbers, so (sqrt 2) can be represented properly.

No, because symbolic algebra takes you down a very very different road, involving proof theory and deductive systems.  There are large and (extremely) complicated systems and libraries for this.

Complex numbers are more interesting, as they are usually represented as
two real parts. We could debate the mathematical meaning of a complex
number with an exact real part and an inexact imaginary part (or inexact
magnitude and exact angle), but I really don't know enough about the
practical uses of complex numbers in software to say anything useful
about "But what would actually be useful to programmers?".

Argh. What's a programmer?  Does a mathematician working on the Riemann hypothesis count?  How accurately can you compute sin(1/z) for complex-valued z very close to zero?  Say, z = 1.0e-12 + i2.0e-13 ? That is useful enough that I wrote a library function for it. https://github.com/linas/anant
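(To see why the naive double-precision route fails for such z -- this Python sketch is illustrative only and is not what the anant library does:

```python
import cmath
import math

z = 1e-12 + 2e-13j
w = 1 / z                 # |w| is about 1e12; w.imag is about -1.9e11

# sin(x + iy) = sin(x)cosh(y) + i cos(x)sinh(y).  cosh/sinh of ~2e11
# overflow IEEE doubles (which top out near 1.8e308), so a naive
# sin(1/z) cannot produce a finite answer in double precision.
try:
    result = cmath.sin(w)
    overflowed = not (math.isfinite(result.real) and math.isfinite(result.imag))
except OverflowError:
    overflowed = True

print(overflowed)         # True
```

Getting a usable value out of sin(1/z) there needs arbitrary precision or a cleverer formulation, hence the library function.)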

Would "IEEE binary32 complex" be a useful type of number to somebody,
which in my world would be tainted by the IEEE-float-ness and so need to
be kept isolated from the numeric tower?

It would be to people designing Kalman filters to control the flight-path of ballistic missiles. Very much so; stability theory is all on the complex plane. Of course, they will not be programming in scheme... Software radio quadrature comes to mind, too. It's all 32-bit complex numbers ... written by hand in assembly code.  Yes, I did that once, on a motorola 68K on a GPS receiver.


Would "complex of proper numbers (be they integers or rationals or
rational intervals or symbolic numbers)" be useful?

Another naive answer: the number theorists already have well-developed libraries for this that they are committed to maintaining...  It's hard to see where scheme fits in, unless you get some powerful personality who decides to "just do it", and drag the entire number-theory community to scheme.

The group theory people also have their own stuff.
 

Complex numbers with non-floaty semantics in the numeric tower is
certainly doable, but what would they be useful for?

All I can think of are the algebraicists, and they've already got their systems for this ....

Are the applications of complex numbers in software more in the
numeric-programming domain that floats are made for, and appropriate
for, and the users can be expected to know about the caveats?

Yes. You will always have young, inexperienced programmers. And always experienced old hands ...

In other words, 1.0 should just be a verbose way of writing 1,

Err, for floating-point x, if you make (+ 1 x) return something different than (+ 1.0 x), you will be drawn and quartered.

and 1.5
should be an exact rational.

Same comment as above. Performance is important.  Don't make my floating-point code run slow just so you can shoe-horn in an exact rational.

(sqrt 2) and (sqrt -1) should raise an
error unless you have symbolic numbers,

Eh? it is a disservice to force people to write (sqrt 2.0) to avoid getting an exception.

Yes, (sqrt -1.0) should throw an exception, because it lies in the cut.  It can be +i or -i.

while (ieee754:sqrt (ieee754:make-binary32 2)) should return ~1.4142135623731,

please don't force me to type all that just to get (sqrt 2.0) ...
 
(ieee754:sqrt
(ieee754:make-binary32 -1)) should return +nan.0,

Maybe. Or maybe it should throw an exception? That I do not know.
 
and
(ieee754-complex:sqrt (ieee754:make-binary32 -1)) should return
~0.0+~1.0i :-)

In other languages, csqrt is how I say I want a complex number result.  Also, fsqrt is the conventional abbreviation for 32-bit float sqrt.

I really like these common-sense, easy-to-use, not-much-typing abbreviations, even though I understand that some people perceive this to be impure, imprecise and hiding grunge. But if you are writing more than 100 lines of math code, then short, simple, easy-to-use notation is vastly preferable to something verbose and precise.

--linas

Alaric Snell-Pym

Aug 17, 2022, 12:15:37 PM
to scheme-re...@googlegroups.com, Linas Vepstas, Jay Freeman, Yongming Shen
On 17/08/2022 16:18, Linas Vepstas wrote:
> Against my better judgement, I will attempt a short reply to this long
> email.

Thank you :-)

>> Another angle to take is: What are people using numbers for in Scheme?

[Linas gives various examples of how people who know what they're doing
deal with interesting kinds of numbers]

It sounds like people doing these kinds of specialist things use special
libraries for them and don't really expect them to be part of their
language's "numeric tower" (by which I mean, the inbuilt support for
arithmetic in the language; the sorts of numbers you can use as indices
into an array).

>> I am not a numerical computing expert, so please correct me if I'm
>> wrong, but I think that float-of-unspecified-precision is far too woolly
>> a notion for any serious work where correctness matters;
>
> Eh? In conventional arbitrary-precision libraries, you always state exactly
> how many digits of precision you want. You want 300 binary digits? 3000?
> 30K? you just say so. It's a setting.

That's what I mean; people who know what they're doing want to request,
and get, a defined level of precision, rather than just having an
"inexact" type of unspecified precision.

[my anecdote of broken floating point code]
> Sure. But anyone in the business eventually learns this the hard way. In
> your next job, you will be given code with a race condition in it, and you
> will learn the hard way all about mutexes, semaphores and condition
> variables. I don't see that either issue is trivially avoidable.

> Perhaps there is some magic way to automatically deal with disastrous
> floating-point rounding errors, but its going to be .. umm, "from left
> field". Very different than today's world.

Locking floating-point arithmetic operations in a box labelled "Here be
dragons, open only if you know what you're doing" might be a start :-)

>> (As a second aside, the original programmer had seemed unaware of
>>
>
> ... And there you have it. Programming in a nutshell.

Alas!

>> We can do symbolic numbers, so (sqrt 2) can be represented properly.
>>
>
> No, because symbolic algebra takes you down a very very different road,
> involving proof theory and deductive systems. There are large and
> (extremely) complicated systems and libraries for this.

Yeah, for the record I don't actually think it's a good idea to include
in a numeric tower (just that I think it's allowable from my
numbers-are-allowed-in-if-they-behave-like-you'd-expect rule).

[...various examples of specific user groups that have special maths
libraries for their objects of interest...]

> Another naive answer: the number theorists already have well-developed
> libraries for this that they are committed to maintaining... It's hard to
> see where scheme fits in, unless you get some powerful personality who
> decides to "just do it", and drag the entire number-theory community to
> scheme.
>
> The group theory people also have their own stuff.
>

Ok, as to whether to bind/write these kinds of specialist arithmetic
libraries in C - or whether to mandate them in a standard - well, I can
but hope that Scheme can be a platform for this kind of work.

I think it'd be remiss NOT to have, say, binary32 floats available in a
standard.

Not mandated; they're a pain to implement on some hardware, and not
always necessary, so I've no problem with implementers treating them as
an optional feature.

They're useful for a whole bunch of heuristic fudging, but I'd be
happier having them in a language if they were a *little* fenced off
from the core numeric tower, just because it makes it all too easy to
assume they work like real numbers.

>> In other words, 1.0 should just be a verbose way of writing 1,
>
>
> Err, for floating-point x, if you make (+ 1 x) return something different
> than (+ 1.0 x), you will be drawn and quartered.
>
> and 1.5
>> should be an exact rational.
>
>
> Same comment as above. Performance is important. Don't make my
> floating-point code run slow just so you can shoe-horn in an exact rational.

I know, I know, I'm talking about MY dream where floating point numbers
are held at arm's length from the core numerical types in a programming
language :-)

> while (ieee754:sqrt (ieee754:make-binary32 2)) should return
>> ~1.4142135623731,
>
> please don't force me to type all that just to get (sqrt 2.0) ...
[...]
> In other languages, csqrt is how I say I want a complex number result.
> Also, the abbreviation fsqrt for 32-bit float sqrt is the conventional
> abbreviation.

Well, just because in my hypothetical example I've prefixed my imports
from the ieee734 library with a long name, doesn't mean somebody writing
lots of code using them has to :-)

> --linas

John Cowan

Aug 17, 2022, 1:51:36 PM
to scheme-re...@googlegroups.com


On Tue, Aug 9, 2022 at 4:18 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

I know that this is not the motivation for R6RS's specification of the `real?' procedure, but in any case, I don't think that exact non-real complex numbers are very interesting.  Complex numbers of mixed exactness even less (I would remove them if that makes things simpler).

I hear talk of Gaussian integers from time to time, but I confess to not being interested enough to know what they are used for, if anything.  Wikipedia treats them as purely objets d'art, as in most of its mathematical articles.
On Thursday, February 21, 2019 at 11:46:34 AM UTC-5 Jim Rees wrote:
I am confused about the errata at https://small.r7rs.org/wiki/R7RSSmallErrata/

Item #24 regarding the example "(real? 2.5+0.0i)" says the conflict hasn't yet been resolved, but you say here and item #19 says that r7rs did not adopt r6rs semantics, which to me says (real? 2.5+0.0i) ==> #t as an R7RS requirement and there is no conflict.

The trouble is that the prose says one thing, the examples (which were copied from R6RS, incautiously, by me) say another.  It had never occurred to me that anyone would think examples were normative, but the Steering Committee didn't agree.  So the matter is in suspense and will probably remain so.  The main reason R7RS adopted the R5RS rule, as far as committee discussions were concerned, was to avoid a silent breaking change; the Racket/R6RS viewpoint was that the breaking change was in fact what people expected and wanted.

Marc Nieper-Wißkirchen

Aug 17, 2022, 2:50:09 PM
to scheme-re...@googlegroups.com
Am Mi., 17. Aug. 2022 um 19:51 Uhr schrieb John Cowan <co...@ccil.org>:
>
>
>
> On Tue, Aug 9, 2022 at 4:18 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:
>
>> I know that this is not the motivation for R6RS's specification of the `real?' procedure, but in any case, I don't think that exact non-real complex numbers are very interesting. Complex numbers of mixed exactness even less (I would remove them if that makes things simpler).
>
>
> I hear talk of Gaussian integers from time to time, but I confess to not being interested enough to know what they are used for, if anything. Wikipedia treats them as purely objets d'art, as in most of its mathematical articles.

The Gaussian integers and their field of quotients (complex numbers of
the form x + i y with x and y rational numbers) are used in number
theory. In fact, they are one of the simplest examples of the ring of
integers in an algebraic number field. (To understand why they are
useful, consider an expression like x^2 + y^2. This can be factored
over the Gaussian integers as (x + i y) (x - i y). Such factorization
problems appear in solutions to Fermat's last theorem, for example.)

So, the Gaussian integers and their field of quotients are
interesting. But in isolation, they are very specific. To support
algebraic number theory, Scheme would have to provide other number
fields as well, e.g. the Eisenstein integers, which, for example,
appear in solutions to Fermat for n = 3. Here, i is replaced by omega.
While i is a primitive fourth root of unity, omega is a primitive
sixth root of unity.

>>>
>>> On Thursday, February 21, 2019 at 11:46:34 AM UTC-5 Jim Rees wrote:
>>>>
>>>> I am confused about the errata at https://small.r7rs.org/wiki/R7RSSmallErrata/
>>>>
>>>> Item #24 regarding the example "(real? 2.5+0.0i)" says the conflict hasn't yet been resolved, but you say here and item #19 says that r7rs did not adopt r6rs semantics, which to me says (real? 2.5+0.0i) ==> #t as an R7RS requirement and there is no conflict.
>
>
> The trouble is that the prose says one thing, the examples (which were copied from R6RS, incautiously, by me) say another. It had never occurred to me that anyone would think examples were normative, but the Steering Committee didn't agree. So the matter is in suspense and will probably remain so. The main reason R7RS adopted the R5RS rule, as far as committee discussions were concerned, was to avoid a silent breaking change; the Racket/R6RS viewpoint was that the breaking change was in fact what people expected and wanted.

I would have also expected the text to be more normative than an
example. I don't understand the reasoning of the R7RS committee,
though. A silent breaking change from R5RS was avoided, but a silent
breaking change from R6RS was caused. Obviously, one cannot have both,
so it would be good to have a reason that is independent of questions
of compatibility.

In any case, even should it be so that the matter cannot be resolved
in R7RS (small), we have the chance to fix the semantics of `real?' in
the Foundations.

I would like to hear from the R6RS people why they chose to change the
meaning of `real?'. As it seems from Yongming's comment, `flonum?' is
all that is needed for flow analysis.

Linas Vepstas

Aug 17, 2022, 3:46:34 PM
to scheme-re...@googlegroups.com

I know that this is not the motivation for R6RS's specification of the `real?' procedure, but In any case, I don't think that exact non-real complex numbers are very interesting.  Complex numbers of mixed exactness even less (I would remove them if that makes things simpler).

All that I want out of scheme is a conventional, "common-sense" kind of double-precision math library, with any strange warts and blebs cleaned up so that everything is as you would expect it to be, when coming from a different programming language.  This means that (real? 1+0i) should return false and that (sqrt -1.0) should return NaN ... unless I explicitly asked for a complex number -- say (csqrt -1.0) or perhaps (sqrt -1.0+1e-300i) -- Principle of least surprise.
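(Python, as it happens, already follows exactly this convention -- shown here only to illustrate the least-surprise split, not as a claim about any Scheme:

```python
import math
import cmath

# The real-valued sqrt refuses to leave the reals...
try:
    math.sqrt(-1.0)
    raised = False
except ValueError:
    raised = True
print(raised)            # True -- an error, not a surprise complex number

# ...and you get a complex result only when you explicitly ask the
# complex-aware function, Python's spelling of "csqrt".
print(cmath.sqrt(-1.0))  # 1j
```

Two differently named entry points, two sets of expectations.)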


I hear talk of Gaussian integers from time to time,

The fragment of a "numeric tower" in math is shown in the first paragraph of https://en.wikipedia.org/wiki/Integral_domain

Things that can be added, subtracted, multiplied and divided are the second from the last entry: https://en.wikipedia.org/wiki/Field_(mathematics)  Observe that there's a zoo, there, of strange beasts.

Fields that can *also* have greater/less-than relations are https://en.wikipedia.org/wiki/Ordered_field  and the *complete* ones are always isomorphic to the real numbers.  That is, the real numbers are "unique" in this sense.

Complex numbers do NOT have greater/less-than relations, but they are "algebraically closed" https://en.wikipedia.org/wiki/Algebraically_closed_field

There are many kinds of algebraically closed fields, but the complex numbers are unique in model theory (transfinite set theory) -- in all the crazy infinitely-infinite towers of transfinite numbers,  there is exactly one model of the complex numbers.  I don't understand it; I was told it's true.

The upshot is that -- surprise -- integers, fractions, reals and complex are not just useful, but they are also kind-of-ish unique for what they are. Once you stray from this cozy little world, it's a zoo, a bestiary, with thousands of strange beasts, of which "Gaussian integers" are but one.

The systems that can deal with this bestiary tend to take the form of a domain-specific language, because, to even get started, you need to have a notation for which beast you're interested in, and how to write down an instance of it. Things you might want to do include, for example, finding automorphism groups, which is not interesting for the reals or the complexes, which is why conventional math libs don't do that.  They usually do NOT have foreign language bindings.  They do usually have REPLs.  My experience is limited.

-- Linas

Jay Freeman

Aug 17, 2022, 3:54:20 PM
to scheme-re...@googlegroups.com
When I was working at Sun Microsystems Laboratory (1997 - 2006) there was a project on "interval arithmetic" -- representing each number as an interval (x, y) where x < y, and (I think) x and y were both IEEE floats of some kind. I did not work on this project and I am not sure whether anything about it ever made its way into print or onto the Internet, but some quite bright people worked on it for a long time and I am sure they encountered many of the issues recently discussed here.

The project was in connection with a proposed high-performance supercomputer that Sun was working on, that did not make it out of R&D.

I mention it here in case anyone is interested and wants to search, and also because such an approach is a relatively straightforward way to enhance numerical computation at the programming-language level without relying on a new underlying numeric type: It would be a great rathole for your copious spare time. 😸😸😸
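For the curious, the core idea is small enough to sketch -- a toy in Python with no connection to Sun's actual implementation, and ignoring the directed rounding of endpoints that a real interval library must do:

```python
from itertools import product

def iv_add(a, b):
    """[a0, a1] + [b0, b1] = [a0 + b0, a1 + b1]."""
    return (a[0] + b[0], a[1] + b[1])

def iv_mul(a, b):
    """Product interval: min and max over all endpoint products."""
    ps = [x * y for x, y in product(a, b)]
    return (min(ps), max(ps))

x = (2.9, 3.1)    # "about 3", with a measurement error margin
y = (-1.1, -0.9)  # "about -1"

print(iv_add(x, y))  # roughly (1.8, 2.2)
print(iv_mul(x, y))  # roughly (-3.41, -2.61)
```

Each operation widens the interval to cover every value the true result could take, which is exactly the error-margin propagation mentioned earlier in the thread.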

By coincidence, this thread became active when I was immersed in a major rewrite of numerical work -- parsing, arithmetic and display -- in Wraith Scheme (R5 parallel Scheme for the Macintosh, new release "real soon now"*). I have been a bit wary of following the thread for fear of learning something that would make me lose focus or give up in despair, but so far, so good. 😸

*New release in a month or two, knock on wood. 🙀🙀  Current release is 2.26

John Cowan

Aug 17, 2022, 8:33:09 PM
to scheme-re...@googlegroups.com, Jay Freeman, Yongming Shen
On Wed, Aug 17, 2022 at 8:01 AM Alaric Snell-Pym <ala...@snell-pym.org.uk> wrote:

One reason to want numbers-in-Scheme to be "like" the numbers described
in mathematics is to allow reasoning about them. C is infamous for its
arithmetic being full of surprises and undefined behaviour,

Well, we've got that stuff in R[67]RS fixnums and flonums if you want it.
Making numbers-in-Scheme behave "like" "actual" numbers as we
learnt about in school reduces the chance for subtle bugs and annoying
problems

Yes, at a price.  Sometimes you *do* want it faster at the expense of being sometimes wrong.  Of fixed-width formats, fixnums are faster but blow up outside their range; flonums are just about as fast (nowadays) and have a wide range.
 
And that's just integers. What about floating point numbers? All our
talk of the real meaning of inexact numbers in Scheme, I feel, is
skirting around the fact that we're trying to give a sensible
mathematical semantics to things like IEEE floats (and their relatives,
such as bfloat16 - and maybe other less-used representations

http://www.quadibloc.com/comp/cp0201.htm lists just about every floating-point format ever used.
(I vaguely
remember reading about a numeric representation that was a fixed-point
binary representation of the logarithm of the number, whatever happened
to that?)).

Google "quasilogarithmic floating point".  It combines the disadvantages of binary FP with those of decimal FP.
 
And
there should be fixed-width decimal types so people doing money can do
money right (floats really, really, aren't the way to do money right).
 
PL/I has a nice model: you declare a variable as either FIXED or FLOAT and either BINARY or DECIMAL, and you specify the number of bits or digits you want, both total and right of the decimal/binary point.  You don't necessarily get exactly what you ask for: FIXED BINARY(20) will probably come out as  FIXED BINARY(31).  The PL/I General Purpose Subset, which is what most compilers other than IBM's actually support, restricts FIXED BINARY to integers only.  The compiler does type checking but not inference.  Overflow throws an exception: the behavior of underflow is machine-dependent.
 
Are the applications of complex numbers in software more in the
numeric-programming domain that floats are made for, and appropriate
for, and the users can be expected to know about the caveats?

The engineering applications of complex numbers (electromagnetic fields, e.g.) are as easily satisfied by pairs of IEEE floats as the engineering applications of real numbers are satisfied by one IEEE float.

John Cowan

Aug 17, 2022, 8:42:20 PM
to scheme-re...@googlegroups.com
On Wed, Aug 17, 2022 at 3:46 PM Linas Vepstas <linasv...@gmail.com> wrote:
 
All that I want out of scheme is a conventional, "common-sense" kind of double-precision math library, with any strange warts and blebs cleaned up so that everything is as you would expect it to be, when coming from a different programming language.

If everything must be as you expected it to be in C, why not write in C?  When coming from a language without proper tail calling, you no doubt don't expect proper tail calling, and think of recursion as something much too dangerous to use outside a college class.  Schemers think otherwise.

Linas Vepstas

Aug 18, 2022, 4:11:01 AM
to scheme-re...@googlegroups.com
Eh?  But I *do* write in C (well C++) and use guile as a wrapper on that code.  Despite this, I have lots and lots of scheme code where I use casual floating-point arithmetic -- sums, accumulation, division, logs. None of it is performance-critical, but I don't want it to be slow, either. It does contribute to wall-clock time, and the python contingent are happy to thumb their noses at scheme when given the opportunity.

I don't expect things to be "exactly as in C", but I do expect things to behave in a way that a conventional person of normal intelligence might find unsurprising.  That's because I like getting things done, rather than spending an afternoon searching documentation.  Simple things should be simple, should not require deep theoretical understanding.

Returning to (real? 1+0i) -- the "obvious solution" is to have not one but two predicates here: (is-imaginary-part-zero? 1+0i) returns true and (is-complex-number-type? 1+0i) returns true.  The friction in the conversation seems to be that some people are thinking is-real? means the first, and some the second. The naive programmer who is skimming the docs with glazed eyes will be .. surprised.   Because naive programmers think that "real?" is a reference to the *type* and not a reference to the *value*.

Here's the other way to put it:   (is-integer? 1.0)  -- what should it return?  #t or #f? I claim most naive programmers think of "integer" as a type, and not as a way of asking if the fractional part is zero.
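Python makes the same split explicit, which may be a useful point of comparison (a sketch, not Scheme): the value question and the type question are two different operations.

```python
x = 1.0
print(x.is_integer())      # True: value question -- is the fractional part zero?
print(isinstance(x, int))  # False: type question -- is this an integer object?
```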

Since we're in an argumentative mood:  in 1982, I was privy to a conversation with the IBM xlC compiler developers, who had just implemented a fragment of tail calls for C.  Basically, the idea was that you didn't "buy a stack frame", if it wasn't really needed, and there was technical discussion on just how far this could be pushed in the direction of true tail calls.  There were AIX kernel programmers in the room, and they understood what was going on quite clearly, and were very interested in this.  The real magic would have been tail calls across shared-library boundaries, as shlib calls always pass through a pointer table (a thunk).

The moral of the story: there are senior programmers who grok this stuff, and junior programmers who do not. Whatever you do, don't make design decisions that cause the senior programmers to think "that's a stupid idea". And don't make life hard for the junior programmers, because they will come back to bite you in some blog post.

--linas

Marc Nieper-Wißkirchen

unread,
Aug 18, 2022, 4:48:05 AM8/18/22
to scheme-re...@googlegroups.com
Am Mi., 17. Aug. 2022 um 11:03 Uhr schrieb Yongming Shen <sym...@gmail.com>:
>
> Interpreting an inexact number as an inexact representation of some math number (a number in the sense of mathematics) is indeed problematic. Without context, there is no way to know what the math number is, or what the error margin introduced by inexactness is. Related to this, I think there are two possible mental models for explaining what an inexact number in Scheme is.
>
> 1) An inexact number is a tainted number. When an arithmetic function cannot find a mathematically correct result, it returns a mathematically incorrect but "sensible" number as the result, and marks the number as tainted. When thinking of inexact numbers in this way, it is natural to think that an inexact number is a math number; the taint is just there to show that some mathematically incorrect arithmetic was involved in its creation. In this case, having `-0.0`, `+inf.0`, `-inf.0`, and `+nan.0` as inexact numbers doesn't make sense, and having `1.0` and `1.0+0.0i` be distinct inexact numbers also doesn't make sense. And while it makes perfect sense to propagate the taint, it doesn't make much sense to say that "since X and Y are tainted, `(+ X Y)` can return a mathematically incorrect result". So if possible, arithmetic operations should return mathematically correct results even if their inputs are tainted. Overall, thinking of inexact numbers as tainted numbers is clean and straightforward, but does not fit well with the reality of Scheme.

Actually, I like this point of view.

Think of the definition of (uniform) continuity: Given a (uniformly)
continuous function f and a real number x, I can (within a suitable
model) effectively compute the real number f(x). If I want to know the
result up to an error of epsilon, uniform continuity gives me a delta,
telling me how exactly I have to know x. (In this model, a real number
is a kind of oracle. Given an error estimate, the oracle tells me a
rational number that is within this error estimate of the "ideal" real
number.)

In general, however, theorems stating that a certain function is
continuous are not formulated in an effective way. The epsilon/delta
relation is usually hidden in the proof.

Now the Scheme procedures for numbers are locally uniformly continuous
(when branch cuts are removed or the left and the right of a branch
cut are counted separately), so the above reasoning applies. Of
course, an actual Scheme number is not an ideal real number as one
modeled by an oracle as above; its precision is limited. An
inexact (real) number in Scheme is thus a real number up to an error
estimate that is implicit and that is as sharp as possible. Procedures
operating on inexact real numbers and returning inexact real numbers
make mathematical sense when they model locally uniformly continuous
functions. As in the above observation about the formulation
of a typical mathematical theorem stating continuity, the propagation
of the error estimate is implicit, but the programmer should be able
to expect that the error estimate of the resulting value is as sharp
as possible.

This way, one can explain the behavior of procedures like `+'.
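A concrete illustration of the implicit error estimate, sketched in Python under the assumption of IEEE binary64 floats (which most Scheme implementations also use for inexact reals):

```python
# Each literal is the binary64 nearest the ideal value; the implicit
# error propagates through +, and the sum is as sharp as possible.
a = 0.1   # nearest binary64 to 1/10, not 1/10 itself
b = 0.2
s = a + b
print(s == 0.3)              # False: the hidden errors surface here
print(abs(s - 0.3) < 1e-15)  # True: but the result is still very sharp
```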

We need a different explanation for non-continuous procedures like
`integer?'. These procedures only work meaningfully for exact
arguments. So, here, any inexact argument has to be replaced by the
exact argument whose value is closest to the inexact value (meaning
that any error estimate is not weakened).

Under this point of view, `real?' should behave as in R7RS, and not as in R6RS.

If we drop the R6RS definition, one should - to help the compiler -
introduce the negative of `r6rs:real?' instead. Let us call it
`compnum?'. In optimizing implementations, it can just mean the C type
double complex.
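Python's built-in complex is, in effect, this representation (a pair of binary64s), so it can serve as a hedged sketch of the compnum idea; `is_compnum` below is a hypothetical name, not an existing API.

```python
# A Python complex is a pair of binary64s -- i.e. a "compnum" in this sense.
z = complex(1.0, 0.0)

print(z == 1.0)              # True: numerically equal to the real 1.0 ...
print(type(z) is type(1.0))  # False: ... but a distinct representation type

def is_compnum(obj):
    """Hypothetical compnum? analog: a pure representation-type test."""
    return type(obj) is complex

print(is_compnum(z))    # True
print(is_compnum(1.0))  # False
```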

John Cowan

unread,
Aug 18, 2022, 9:23:40 PM8/18/22
to scheme-re...@googlegroups.com
On Thu, Aug 18, 2022 at 4:11 AM Linas Vepstas <linasv...@gmail.com> wrote:

If everything must be as you expected it to be in C, why not write in C? 

I was using the impersonal "you" here.
 
Eh?  But I *do* write in C (well C++) and use guile as a wrapper on that code.  Despite this, I have lots and lots of scheme code where I use casual floating-point arithmetic -- sums, accumulation, division, logs. None of it is performance-critical, but I don't want it to be slow, either.

Since Scheme mandates that there be both exact and inexact numbers (although the latter are extremely underspecified -- I think it would be conformant for only #i0 to exist), the vast majority of implementers choose C doubles as their implementation of inexact numbers, and the vast majority of processors in production use (as opposed to those in museums) implement those as IEEE binary64s.
 
It does contribute to wall-clock time, and the python contingent are happy to thumb their noses at scheme when given the opportunity.

That is why the vast majority of *them* choose the slowest available implementation, namely CPython, I suppose.  :-)

Simple things should be simple, should not require deep theoretical understanding.

"Simple" here means something like "intuitive", I think.  We find floating-point arithmetic simple because it is mostly intuitive, and when it is not we either appeal to our pre-existing deep theoretical understanding or (as the Scheme standards advise) consult a numerical analyst.

Returning to (real? 1+i0) -- the "obvious solution" is to have not one but two predicates, here: (is-imaginary-part-zero? 1+i0) returns true and (is-complex-number-type? 1+i0) returns true. 

I don't think anyone disputes that.  The question is, what happens when the imaginary part is inexact zero?

The friction in the conversation seems to be that some people in the conversation are thinking is-real? means the first, and some the second. The naive programmer who is skimming the docs with glazed eyes will be .. surprised.   Because naive programmers think that "real?" is a reference to the *type* and not a reference to the *value*.

The Racket and R6RS folks think it will be less surprising if numbers with inexact imaginary parts are non-real.  The R7RS folks think it will be less surprising if R5RS semantics remain the same.

Here's the other way to put it:   (is-integer? 1.0)  -- what should it return?  #t or #f? I claim most naive programmers think of "integer" as a type, and not as a way of asking if the fractional part is zero.

That's not the case in Scheme, so they will have to learn better.

Since we're in an argumentative mood:  in 1982, I was privy to a conversation with the IBM xlC compiler developers, who had just implemented a fragment of tail calls for C.  Basically, the idea was that you didn't "buy a stack frame", if it wasn't really needed, and there was technical discussion on just how far this could be pushed in the direction of true tail calls.

Gcc does tail-calling for "sibling calls"; that is, if the caller and callee return the same type and require the same number of bytes for their argument lists.  Tail recursion is obviously a special case of this.  Clang can be made to tail-call all calls in tail position, provided the callee isn't constrained to use a calling convention that won't work properly.

My issue is not whether we should have the two different predicates, it's what we should call them so that nobody mixes them up.  I think that Marc's idea is pretty good except for the specific name `compnum?`.

Yongming Shen

unread,
Aug 21, 2022, 12:25:07 AM8/21/22
to scheme-re...@googlegroups.com
On Thu, Aug 18, 2022 at 4:48 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

If we drop the R6RS definition, one should - to help the compiler -
introduce the negative of `r6rs:real?' instead. Let us call it
`compnum?'. In optimizing implementations, it can just mean the C type
double complex.


Assuming `compnum?` is introduced, there are two related questions to consider:

> 1) Can a function which promises to return a `real?` value (when all the inputs are `real?` values) be allowed to return a `compnum?` value? For example, can `(+ 0.0 1.0)` return `1.0+0.0i`? R6RS does not allow it, but R5RS and R7RS-small do. Vice versa, can `(+ 0.0 1.0+0.0i)` return `1.0`? Imposing the same restrictions as R6RS may make `compnum?` more useful, but complicates the specification of generic arithmetic.

2) Should `(compnum? 1.0+0i)` return #t or #f? In R6RS there is an example of `(real? -2.5+0i) => #t`, and in Chez Scheme `1.0+0i` and `1.0` are synonymous. Making `(compnum? 1.0+0i)` return #f helps with R6RS compatibility, but again introduces the notion of exact vs. inexact zero imaginary parts, albeit limited to the discussion of numeric literals.

Taylan Kammer

unread,
Aug 21, 2022, 12:24:57 PM8/21/22
to scheme-re...@googlegroups.com
On 21.08.2022 06:25, Yongming Shen wrote:
Well, mathematically the complex numbers include the reals, so

(real? x)

implies

(complex? x)

Indeed, R6RS has 'complex?' which is basically equivalent to 'number?' since
there isn't a broader set of numbers than complex in R6RS (or in mathematics,
unless you count quaternions, which are weird and usually excluded).

I guess the predicate you have in mind should be called 'imaginary?' but it
could (probably should) just be defined as:

(and (number? x) (not (real? x)))

So it again comes down to whether 1.0+0i and 1.0+0.0i are 'real?' or not.

--
Taylan

Jakob Wuhrer

unread,
Aug 21, 2022, 12:52:09 PM8/21/22
to scheme-re...@googlegroups.com, Taylan Kammer

Taylan Kammer <taylan...@gmail.com> writes:
> Well, mathematically the complex numbers include the reals, so
> (real? x)
> implies
> (complex? x)

While R is often embedded in C, this embedding is not always considered
an identity. Furthermore, I'd wager a guess that the majority of
mathematicians would reply negatively when asked whether, say, pi is a
complex number.

There's no exact consensus as to how one should talk about these terms,
and even if someone has an opinion they may not apply it consistently.

> Indeed, R6RS has 'complex?' which is basically equivalent to 'number?' since
> there isn't a broader set of numbers than complex in R6RS (or in mathematics,
> unless you count quaternions, which are weird and usually excluded).
>
> I guess the predicate you have in mind should be called 'imaginary?'
This may be interpreted as a predicate that evaluates to true iff its
argument is purely imaginary (has a real part of zero).

I think (as others have argued before) we should have predicates that
distinguish based on the type but not value of their argument as well as
predicates that distinguish based on the type and value of their
argument.

Marc Nieper-Wißkirchen

unread,
Aug 21, 2022, 2:39:39 PM8/21/22
to scheme-re...@googlegroups.com
Am So., 21. Aug. 2022 um 18:52 Uhr schrieb Jakob Wuhrer
<jakob....@gmail.com>:
>
>
> Taylan Kammer <taylan...@gmail.com> writes:
> > Well, mathematically the complex numbers include the reals, so
> > (real? x)
> > implies
> > (complex? x)
>
> While R is often embedded in C, this embedding is not always considered
> an identity. Furthermore, I'd wager a guess that the majority of
> mathematicians would reply negatively when asked whether, say, pi is a
> complex number.

A mathematician speaking. If you call 5 a real number, you will also
call pi a complex number.

Of course, within a given framework like ZFC set theory, the
*representation* of pi may be different than the representation of the
pair (pi, 0), but it should not make a difference for everyday
mathematics.

[...]

> > Indeed, R6RS has 'complex?' which is basically equivalent to 'number?' since
> > there isn't a broader set of numbers than complex in R6RS (or in mathematics,
> > unless you count quaternions, which are weird and usually excluded).

There are many more numbers in mathematics, e.g. the p-adic numbers.
The number "tower" in mathematics is more like a number tree.

[...]

Yongming Shen

unread,
Aug 21, 2022, 2:43:17 PM8/21/22
to scheme-re...@googlegroups.com
It doesn't come down to whether 1.0+0i and 1.0+0.0i are `real?` or not, because Marc's suggestion to introduce `compnum?` is assuming a context in which `real?` is defined according to R5RS/R7RS-small. In this context, `real?` returns #t for both 1.0+0i and 1.0+0.0i and so can't tell them apart; neither can `(and (number? x) (not (real? x)))`. Furthermore, `real?` can't tell 1.0 and 1.0+0.0i apart either. `compnum?` (not to be confused with `complex?`) is suggested as a way to tell 1.0 and 1.0+0.0i apart in this context by returning #f for 1.0 and #t for 1.0+0.0i. And related to this, the second question I posed is about whether `(compnum? 1.0+0i)` should return #t or #f, which is basically asking whether 1.0+0i should be a synonym of 1.0 or of 1.0+0.0i.
 

Marc Nieper-Wißkirchen

unread,
Aug 21, 2022, 2:53:44 PM8/21/22
to scheme-re...@googlegroups.com
Am So., 21. Aug. 2022 um 06:25 Uhr schrieb Yongming Shen <sym...@gmail.com>:
>
> On Thu, Aug 18, 2022 at 4:48 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:
>>
>>
>> If we drop the R6RS definition, one should - to help the compiler -
>> introduce the negative of `r6rs:real?' instead. Let us call it
>> `compnum?'. In optimizing implementations, it can just mean the C type
>> double complex.
>>
>
> Assuming `compnum?` is introduced, there are two related questions to consider:
>
> 1) Can a function which promises to return a `real?` value (when all the inputs are `real?` values) be allowed to return a `compnum?` value? For example, can `(+ 0.0 1.0)` return `1.0+0.0i`? R6RS does not allow it, but R5RS and R7RS-small do. Vice versa, can `(+ 0.0 1.0+0.0i)` return `1.0`? Imposing the same restrictions as R6RS may make `compnum?` more useful, but complicates the specification of generic arithmetic.

R6RS and R7RS-large have the concept of flonums, which provide a
particular representation of a subset of the reals. Similarly, a
compnum should be a particular representation of a complex number
(namely one made from two flonum parts). The R7RS-large restriction
would be (at least) that the sum of two flonums must not be a compnum.

> 2) Should `(compnum? 1.0+0i)` return #t or #f? In R6RS there is an example of `(real? -2.5+0i) => #t`, and in Chez Scheme `1.0+0i` and `1.0` are synonymous. Making `(compnum? 1.0+0i)` return #f helps with R6RS compatibility, but again introduces the notion of exact vs. inexact zero imaginary parts, albeit limited to the discussion of numeric literals.

This is, indeed, more a question of the reader and not of what a
compnum is. Where it helps, I would suggest strengthening the large
language and not leaving as many things undefined as in R[57]RS.

This is my tentative definition of `compnum?':

(define compnum?
  (lambda (obj)
    (and (complex? obj)
         (flonum? (real-part obj))
         (flonum? (imag-part obj))
         (eqv? obj (make-rectangular (real-part obj) (imag-part obj))))))

John Cowan

unread,
Aug 22, 2022, 12:00:35 PM8/22/22
to scheme-re...@googlegroups.com
On Sun, Aug 21, 2022 at 2:53 PM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:
 
> 2) Should `(compnum? 1.0+0i)` return #t or #f? In R6RS there is an example of `(real? -2.5+0i) => #t`, and in Chez Scheme `1.0+0i` and `1.0` are synonymous. Making `(compnum? 1.0+0i)` return #f helps with R6RS compatibility, but again introduces the notion of exact vs. inexact zero imaginary parts, albeit limited to the discussion of numeric literals.

It's normal for the meaning of the literal x+yi to be the same as (make-rectangular x y), so it's not limited to the reader.
 
This is my tentative definition of `compnum?':

(define compnum?
  (lambda (obj)
    (and (complex? obj)
         (flonum? (real-part obj))
         (flonum? (imag-part obj))
         (eqv? obj (make-rectangular (real-part obj) (imag-part obj))))))

Without necessarily accepting this, I would point out that we can replace eqv? with =, since they mean the same thing if both arguments are numbers, and I think using = is clearer.

However, the behavior of compnum? is still variable depending on the behavior of imag-part.  What are the values of:

(imag-part 1.0) => 0 or 0.0?
(imag-part (make-rectangular 1.0 2)) => 2 or 2.0?

In Chibi (R7RS) and Vicare (R6RS) the answers are 0 and 2,  in Chicken (R7RS) and Chez (R6RS) the answers are 0.0 and 2.0, in Guile (R6RS) and Cyclone (R7RS) the answers are 0 and 2.0.  I have not found any Schemes where the answers are 0.0 and 2, but it is conceivable.
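For comparison (a Python sketch, not one of the Schemes surveyed): Python behaves like the 0.0/2.0 column, always coercing the accessor results to floats.

```python
# imag-part analog: Python numbers expose .real and .imag, and both
# are always floats -- there is no exact-0 answer available.
print((1.0).imag)            # 0.0
print(complex(1.0, 2).imag)  # 2.0 -- the exact 2 was coerced on construction
```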



Marc Nieper-Wißkirchen

unread,
Aug 22, 2022, 12:51:57 PM8/22/22
to scheme-re...@googlegroups.com
Am Mo., 22. Aug. 2022 um 18:00 Uhr schrieb John Cowan <co...@ccil.org>:
>
>
>
> On Sun, Aug 21, 2022 at 2:53 PM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:
>
>>
>> > 2) Should `(compnum? 1.0+0i)` return #t or #f? In R6RS there is an example of `(real? -2.5+0i) => #t`, and in Chez Scheme `1.0+0i` and `1.0` are synonymous. Making `(compnum? 1.0+0i)` return #f helps with R6RS compatibility, but again introduces the notion of exact vs. inexact zero imaginary parts, albeit limited to the discussion of numeric literals.
>
>
> It's normal for the meaning of the literal x+yi to be the same as (make-rectangular x y), so it's not limited to the reader.
>
>>
>> This is my tentative definition of `compnum?':
>>
>> (define compnum?
>>   (lambda (obj)
>>     (and (complex? obj)
>>          (flonum? (real-part obj))
>>          (flonum? (imag-part obj))
>>          (eqv? obj (make-rectangular (real-part obj) (imag-part obj))))))
>
>
> Without necessarily accepting this, I would point out that we can replace eqv? with =, since they mean the same thing if both arguments are numbers and I think using = is clearer.

They only mean the same if the arguments are numbers of the same exactness.

(= 0 0.0) but (not (eqv? 0 0.0))

Here, I would like to stress that the exactness must also be the same.

[...]

John Cowan

unread,
Aug 22, 2022, 3:36:47 PM8/22/22
to scheme-re...@googlegroups.com
On Mon, Aug 22, 2022 at 12:51 PM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:


They only mean the same if the arguments are numbers of the same exactness.
(= 0 0.0) but (not (eqv? 0 0.0))
Here, I would like to stress that the exactness must also be the same.

True.  However, flonum? can only return #t on an inexact value, and make-rectangular always returns an inexact value if either argument is inexact.  Therefore, exactness is not relevant as far as I can see.

[...]


Yongming Shen

unread,
Aug 23, 2022, 1:16:54 AM8/23/22
to scheme-re...@googlegroups.com
On Mon, Aug 22, 2022 at 12:00 PM John Cowan <co...@ccil.org> wrote:
However, the behavior of compnum? is still variable depending on the behavior of imag-part.  What are the values of:

(imag-part 1.0) => 0 or 0.0?
(imag-part (make-rectangular 1.0 2)) => 2 or 2.0?

In Chibi (R7RS) and Vicare (R6RS) the answers are 0 and 2,  in Chicken (R7RS) and Chez (R6RS) the answers are 0.0 and 2.0, in Guile (R6RS) and Cyclone (R7RS) the answers are 0 and 2.0.  I have not found any Schemes where the answers are 0.0 and 2, but it is conceivable.


Chez Scheme has probably changed; in the latest version (9.5.8), the answers for Chez are 0 and 2.0. In fact, `(imag-part 1.0) => 0` is mandated by R6RS, because `(imag-part 1.0) => 0.0` would imply `(real? 1.0) => #f` by R6RS's definition of `real?`. The following also holds for Chez Scheme:

  (imag-part (make-rectangular 1.0 0)) => 0
  (string->number "1.0+0i") => 1.0
  (flonum? 1.0+0i) => #t

Not sure if all three are mandated by R6RS, but IMO it is counter intuitive for an R6RS implementation to behave otherwise.

Linas Vepstas

unread,
Aug 23, 2022, 4:10:18 AM8/23/22
to scheme-re...@googlegroups.com
Hi John,

On Fri, Aug 19, 2022 at 4:23 AM John Cowan <co...@ccil.org> wrote:

Returning to (real? 1+i0) -- the "obvious solution" is to have not one but two predicates, here: (is-imaginary-part-zero? 1+i0) returns true and (is-complex-number-type? 1+i0) returns true. 

I don't think anyone disputes that.  The question is, what happens when the imaginary part is inexact zero?

The friction in the conversation seems to be that some people in the conversation are thinking is-real? means the first, and some the second. The naive programmer who is skimming the docs with glazed eyes will be .. surprised.   Because naive programmers think that "real?" is a reference to the *type* and not a reference to the *value*.

The Racket and R6RS folks think it will be less surprising if numbers with inexact imaginary parts are non-real.  The R7RS folks think it will be less surprising if R5RS semantics remain the same.

Ah. I finally understand the issue!

So, there are two classes of predicates. -- "is object x of type t?"  and "does the value of x have property p?"  and `real?` is of the second class.

Well, does inexact zero have the property of being zero? Seems like the answer is yes.

So, does inexact 1.0+i0.0 have the property that the imaginary part is zero? Answer seems to be "yes".

I'm happy with the r5rs/r7rs choice, ... however, is there some other predicate that allows me to ask the question "is 1.0+i0.0 of complex number type?" It seems like there isn't and it seems like this is needed?

BTW in guile
(real? 1+0i)    #t
(real? 1.0+0.0i) #f
(exact? 1+0i)   #t
(exact? 1+1i)  #f   ... really? WTF?

That implies that 1+0i is being "automatically" converted to exact 1.  Which seems weird. We don't automatically convert hash tables to lists ... or bitvectors to lists ... why would that be done implicitly, here?
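The exactness contagion at work here can be sketched with Python's fractions module standing in for exact numbers and floats for inexact ones (an analogy, not Guile's actual mechanism):

```python
from fractions import Fraction

# Exact + exact stays exact:
print(Fraction(1, 3) + Fraction(1, 6))  # 1/2

# One inexact operand contaminates the whole result:
r = Fraction(1, 3) + 0.5
print(type(r).__name__)  # float -- exactness is silently lost
```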

-- linas


John Cowan

unread,
Aug 23, 2022, 3:52:18 PM8/23/22
to scheme-re...@googlegroups.com
On Tue, Aug 23, 2022 at 4:10 AM Linas Vepstas <linasv...@gmail.com> wrote:

So, there are two classes of predicates. -- "is object x of type t?"  and "does the value of x have property p?"  and `real?` is of the second class.

Well, if you want to put it that way.  But I think of "has the property of being complex" and "belonging to (dynamic) type Complex" as equivalent, and the same for all other property/type pairs.

Well, does inexact zero have the property of being zero? seems like the answer is yes. 

That's correct in both R6 and R5/R7.
 
I'm happy with the r5rs/r7rs choice, ... however, is there some other predicate that allows me to ask the question "is 1.0+i0.0 of complex number type?" It seems like there isn't and it seems like this is needed?

It's easy to have both predicates, the one that returns #t on 1.0+0.0i and the one that returns #f.  The question is, what shall we call them to avoid confusing people?

BTW in guile
(real? 1+0i)    #t
(real? 1.0+0.0i) #f
(exact? 1+0i)   #t
(exact? 1+1i)  #f   ... really? WTF?

That implies that 1+0i is being "automatically" converted to exact 1. 

That's true, because the "+0i" is being discarded and the result is a real number (or to put it another way, when make-rectangular's second argument is an exact 0, the result is real).

But what is really odd is that 1+1i becomes 1.0+1.0i.  In other words, Guile supports exact and inexact real numbers but only inexact non-real numbers.  R6RS seems to allow this.

Yongming Shen

unread,
Aug 23, 2022, 8:44:11 PM8/23/22
to scheme-re...@googlegroups.com
It appears that Scheme's rationale behind numeric literals is similar to that of its generic arithmetic: be exact if possible; if not, fall back to inexact. `(exact? 1+1i) => #f` is probably not common among Scheme implementations, but I suspect that for the majority of implementations out there, `(exact? 1@1)` will return #f. In R6RS, one can use the `#e` prefix to force an exact number to be produced from a numeric literal, and the exact number must match the mathematical interpretation of the literal; if an implementation does not support such an exact number, the parsing of the literal fails. For example, in Chez Scheme:

    (string->number "1@1") => 0.5403023058681398+0.8414709848078965i
    (string->number "#e1@1") => #f
    (string->number "#e1.0@1.0") => #f 
    (string->number "#e1.0@0.0") => 1
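Why `#e1@1` must fail: cos 1 and sin 1 are irrational, so no exact rectangular result exists. A Python sketch with cmath shows what the inexact `1@1` denotes:

```python
import cmath
import math

# 1@1 in Scheme polar notation: magnitude 1, angle 1 radian.
z = cmath.rect(1, 1)
print(z)  # approximately (0.5403023058681398+0.8414709848078965j),
          # matching the Chez result above
assert abs(z - complex(math.cos(1), math.sin(1))) < 1e-12
```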

Alex Shinn

unread,
Aug 23, 2022, 9:08:27 PM8/23/22
to scheme-re...@googlegroups.com
On Wed, Aug 24, 2022 at 4:52 AM John Cowan <co...@ccil.org> wrote:

It's easy to have both predicates, the one that returns #t on 1.0+0.0i and the one that returns #f.  The question is, what shall we call them to avoid confusing people?

real? and exactly-real? :)

--
Alex

Yongming Shen

unread,
Aug 23, 2022, 9:18:32 PM8/23/22
to scheme-re...@googlegroups.com
In R6RS they are `real?` and `real-valued?`, with `(real? 1.0+0.0i) => #f` and `(real-valued? 1.0+0.0i) => #t`. And it is not quite just a matter of naming. Some generic arithmetic functions in the standard require some or all of their arguments to be "real numbers", and for those functions `real?` is used as the definition of real numbers.

Yongming Shen

unread,
Aug 23, 2022, 10:57:09 PM8/23/22
to scheme-re...@googlegroups.com
On Sun, Aug 21, 2022 at 2:53 PM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

This is my tentative definition of `compnum?':

(define compnum?
  (lambda (obj)
    (and (complex? obj)
         (flonum? (real-part obj))
         (flonum? (imag-part obj))
         (eqv? obj (make-rectangular (real-part obj) (imag-part obj))))))


For implementations that return 0.0 for `(imag-part 1.0)`, this definition only works if `eqv?` can tell 1.0 and 1.0+0.0i apart. In R6RS `eqv?` can indeed tell 1.0 and 1.0+0.0i apart; not sure if that is the case for R7RS. Replacing `eqv?` with `=` as John suggested won't work, because whether it is R6RS or R7RS, `=` cannot tell 1.0 and 1.0+0.0i apart. Maybe it is better to just provide `compnum?` as a primitive like `flonum?`?

Yongming Shen

unread,
Aug 23, 2022, 11:02:36 PM8/23/22
to scheme-re...@googlegroups.com
On Tue, Aug 23, 2022 at 3:52 PM John Cowan <co...@ccil.org> wrote:

But what is really odd is that 1+1i becomes 1.0+1.0i.  In other words, Guile supports exact and inexact real numbers but only inexact non-real numbers.  R6RS seems to allow this.



Checked R6RS and found that it actually requires `make-rectangular` to return an exact number when both arguments are exact, so Guile is in fact not conforming to R6RS on this matter. The same requirement does not apply to `make-polar` though.  

Linas Vepstas

unread,
Aug 24, 2022, 12:01:59 PM8/24/22
to scheme-re...@googlegroups.com
Hi John,

On Tue, Aug 23, 2022 at 10:52 PM John Cowan <co...@ccil.org> wrote:

On Tue, Aug 23, 2022 at 4:10 AM Linas Vepstas <linasv...@gmail.com> wrote:

So, there are two classes of predicates. -- "is object x of type t?"  and "does the value of x have property p?"  and `real?` is of the second class.

Well, if you want to put it that way.  But I think of "has the property of being complex" and "belonging to (dynamic) type Complex" as equivalent, and the same for all other property/type pairs.

I dunno. The type "complex" has two cells, and it should not matter if the second cell holds a value of zero. Now some uber-clever scheme implementer could design a system having a dynamic complex type, where the second cell is released to the garbage collector whenever a zero is written into it.  Are you protecting the right of scheme implementors to do this?

I see this as being more like the distinction between vectors and lists. In a certain sense, lists and vectors are "exactly the same thing". You could have a scheme implementation that automatically converts between lists and vectors on a whim, whenever some clever optimization can be made.  I doubt most users would be happy with that. They really expect `is-vector?` to return #f on lists, and would be surprised to find that sometimes, it's #t.

(In philosophy, we can argue about whether and how types differ from having-the-property-of. In programming practice, types refer to the (internal, hidden, possibly dynamic) structure of the storage location, independent of the contents of that location. Properties are always about specific values, and not about how they are represented "under the covers". I think this is a safe distinction in quotidian programming.)


It's easy to have both predicates, the one that returns #t on 1.0+0.0i and the one that returns #f.  The question is, what shall we call them to avoid confusing people?


I like  Yongming Shen's answer:

> In R6RS they are `real?` and `real-valued?`, with `(real? 1.0+0.0i) => #f` and `(real-valued? 1.0+0.0i) => #t`.

but it does leave open the question of how `complex?` and `complex-valued?` should behave.

--linas

John Cowan

unread,
Aug 24, 2022, 2:48:52 PM8/24/22
to scheme-re...@googlegroups.com
On Wed, Aug 24, 2022 at 12:01 PM Linas Vepstas <linasv...@gmail.com> wrote:
 
I see this as being more like the distinction between vectors and lists. In a certain sense, lists and vectors are "exactly the same thing". You could have a scheme implementation that automatically converts between lists and vectors on a whim, whenever some clever optimization can be made.  I doubt most users would be happy with that. They really expect `is-vector?` to return #f on lists, and would be surprised to find that sometimes, it's #t.

In fact R[4567] forbid such behavior: anything you create with `{make-,}vector` answers #t to `vector?`, anything you create with `cons` does not.  What is *not* forbidden is something that contains contiguous storage but is a list (see <https://www.cs.cmu.edu/Groups/AI/html/faqs/lang/lisp/part2/faq-doc-9.html> for cdr-coding), or something that is represented as a tree but is a vector.
 
In programming practice, types refer to the (internal, hidden, possibly dynamic) structure of the storage location

If something is hidden, what is the point in having a label for it?  We never know the structure of the storage location in Scheme (or CL, Perl, Python, etc.)  Maybe a complex number occupies two cells, maybe it doesn't.  In Chicken, for example, an inexact complex number is two raw binary64s, but an exact complex number is typically two pointers with immutable contents (same size on 64-bit systems, not the same on 32-bit systems).

I like  Yongming Shen's answer:

> In R6RS they are `real?` and `real-valued?`, with `(real? 1.0+0.0i) => #f` and `(real-valued? 1.0+0.0i) => #t`.

In R5RS the `real-valued?` predicate is named `real?` and the other predicate does not have a standard name.  In any case, the R7RS WG1 (including me) didn't think it was obvious, given the names `real?` and `real-valued?`, which is which.

but it does leave open the question of how `complex?` and `complex-valued?` should behave.

That distinction makes no sense unless you have quaternions (a+bi+cj+dk) or equivalently hypercomplex numbers ((a+bi)+(c+di)j), in which case we deal with c = d = 0 vs. (c = 0.0) or (d = 0.0).

Alex Shinn

unread,
Aug 24, 2022, 6:32:07 PM
to scheme-re...@googlegroups.com
On Wed, Aug 24, 2022 at 10:18 AM Yongming Shen <sym...@gmail.com> wrote:
On Tue, Aug 23, 2022 at 9:08 PM Alex Shinn <alex...@gmail.com> wrote:
On Wed, Aug 24, 2022 at 4:52 AM John Cowan <co...@ccil.org> wrote:

It's easy to have both predicates, the one that returns #t on 1.0+0.0i and the one that returns #f.  The question is, what shall we call them to avoid confusing people?

real? and exactly-real? :)
 
In R6RS they are `real?` and `real-valued?`, with `(real? 1.0+0.0i) => #f` and `(real-valued? 1.0+0.0i) => #t`. And it is not quite just a matter of naming. Some generic arithmetic functions in the standard require some or all of their arguments to be "real numbers", and for those functions `real?` is used as the definition of real numbers.
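
For concreteness, here is a sketch of the two behaviors, using the `r5rs:`/`r6rs:` prefixes from John's original message; it assumes an implementation with full complex-number support:

```scheme
;; R5RS/R7RS behavior: any zero imaginary part, exact or inexact, counts.
(define (r5rs:real? z)
  (and (complex? z) (zero? (imag-part z))))

;; R6RS behavior: the imaginary part must be an exact zero.
(define (r6rs:real? z)
  (and (complex? z)
       (exact? (imag-part z))
       (zero? (imag-part z))))

(r5rs:real? 1.0+0.0i)  ; => #t
(r6rs:real? 1.0+0.0i)  ; => #f  (imag-part is the inexact 0.0)
```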

John was repeating his question from 2.5 years ago which so far has received much discussion but not a single suggestion.  I was just trying to get the ball rolling.

There's agreement that both predicates are useful.  R6RS introduced an incompatibility by providing both but renaming real? to real-valued? and introducing a new definition of real?.  R7RS-small had already discussed and voted on this change and decided not to break R5RS compatibility, in accordance with the WG1 charter.  For better or worse, we can't change that now.

--
Alex

John Cowan

unread,
Aug 24, 2022, 7:35:21 PM
to scheme-re...@googlegroups.com
On Wed, Aug 24, 2022 at 6:32 PM Alex Shinn <alex...@gmail.com> wrote:
 
John was repeating his question from 2.5 years ago which so far has received much discussion but not a single suggestion.

Well, actually Marc N-W did put forward a suggestion, namely to incorporate the negation of `r6rs:real?` under the name `compnum?`.

Jay Freeman

unread,
Aug 24, 2022, 9:48:25 PM
to scheme-re...@googlegroups.com, Jay Freeman
I think this discussion of Scheme numbers and mathematics still lacks a clear distinction between what is modeled and how it is represented, and that distinction seems very important for some of the points raised herein.

My thoughts on the subject turned out lengthier than I anticipated, so I will start by putting the summary first and then let you plow through the details if you wish:


Summary:

(1) In Scheme, there is, or ought to be, a clear distinction between the mathematical numbers modeled and the details of how those modeled numbers are represented as patterns of bits.

(2) I suggest that insofar as possible, the details of how mathematical numbers are represented as bits should be invisible to the user while performing calculations.

(3) I further suggest that the details of how mathematical numbers are represented as bits should be easily available to any user who wants to know about such things.

End of summary -- read on at risk of falling asleep from boredom.


I advocate the point of view that useful modeling of mathematical numbers should, insofar as possible, not let the details of how those numbers are represented get in the way of their effective use in performing calculations, yet it seems clear that some of those details of representation could matter a lot to some users. Thus I suggest that Scheme should provide easy ways for users to investigate how any number is represented. In less formal terms, I think Scheme should sweep the details of how numbers are represented under the rug but make it easy for users to roll back the rug and take a look if they so wish. Yet there are some problems in doing so. Let me elaborate:

I think all agree that the intent is to model the mathematical numbers as developed in a course on mathematical analysis (which is *not* numerical analysis). These are the integers, the rationals, the reals and the complexes. Each type in that list is a subset of all the types that follow it, and together they make up the numbers. (Actually, an analysis course likely starts with the positive integers, and adds zero and then the negative integers before going further, but we don't need this level of detail in Scheme.)

In this modeling, we computer types find it useful to add placeholders to indicate that something has gotten bogus: Popular placeholders are nans, infs both signed and unsigned, and signed zeros. Other indications of wonkiness are often provided by various kinds of hardware floating-point exceptions or by suspicious users, but Schemers tend to lump them all into a single stick-on label of bogosity; namely, a cleared "exact bit" or its equivalent. I use terms like "bogus" because I don't believe there is general agreement among mathematicians about how to introduce these concepts into the hierarchy of mathematical types. (Except possibly for nans -- there are lots of nans known to mathematics, such as bananas, and I don't think that any formal mathematical operations are likely to produce bananas as results, so any formal model of numbers can exclude all bananas by fiat and thus be done with them. If there are any formal mathematical operations that produce bananas I am not sure I want to know about them.)

The preceding paragraphs say nothing about where the bits are that represent these numbers and friends, or how to use the bits of a particular representation to determine which number is intended, or how the Scheme system decides which representation has been used to create a particular bunch of bits in the first place.

There are lots of well-known representations. They include signed and unsigned integers of 8, 16, 32 and 64 bits, floating-point numbers in several formats of length 32, 64 and 80 bits, and several likely representations of bignums. (And there used to be more -- does anyone remember the saying, "If you don't have 36 bits, you're not playing with a full DEC!"?) There is also a need for made-up representations, e.g. for dealing with rationals represented as fractions with separate storage of numerator and denominator, or for complex numbers with non-zero imaginary part.

Tag bits, in essence stuck onto the side of the representation at a known location, are one way to identify the representation to the system -- all the Schemes I have implemented have used them -- but again, there are other ways, and there are no doubt many different tagging systems.

Having shown that there are a large number of possible representations of numbers, one problem becomes clear: Any specification of the Scheme language cannot hope to require tools to investigate all of them, not only because there are so many, but also because some representations will be specific to particular implementations of Scheme, perhaps including implementations that had not yet been created or designed when the R-whatever specification of the Scheme language was made final. Thus -- if you agree with my roll-back-the-rug analogy -- it is rather a moral obligation of the implementer to provide means to do so, rather than a potential requirement for the language specification.

Let me use the version of Wraith Scheme (R5 for the Macintosh) that I am presently working on, to show how these issues combine: Wraith Scheme's numeric tower is complicated enough to show that there are indeed issues, but I think it is still understandable, and of course it is the only implementation that I understand well enough to describe -- at least, I think I do -- because I wrote it. And since it is R5, there can be no issue of me banging on a drum about how R7 or later Schemes ought to work.

Wraith Scheme uses tagged objects in a register-, stack- and Scheme-main-memory- based architecture that is implemented in C++ and Objective C. Its registers and stack locations (which are not the processor registers and stack) contain tagged objects: Each tagged object is either an atom -- complete in itself -- or a tagged pointer to some particular kind of entry in Scheme main memory. All the registers, and all the stack locations, are roots for garbage collection.

Wraith Scheme in essence uses four kinds of representations of numbers, which I refer to as fixnums, floats, long ratnums and long complexes. (There will be more if I ever put in a bignum package or extend the numeric tower by adding quaternions (but not bananas).) In this context, "long" means something particular, as I shall shortly describe.

In Wraith Scheme, a fixnum is a tagged atom whose tag bits identify it as a signed 64-bit integer.

In Wraith Scheme, a float is a tagged atom whose tag bits identify it as an IEEE 64-bit floating-point number.

In Wraith Scheme, a long ratnum is a tagged pointer whose tag bits specify that it points to a dotted pair in Scheme main memory, whose car and cdr are respectively the numerator and denominator of a rational number represented as a fraction -- what you might get if you parsed "1/3" according to recent Scheme language specifications. "Long" means that it takes more than one tagged object to describe the number -- in the case of long ratnums, three -- one each for the tagged pointer, the numerator and the denominator. And incidentally, "long ratnum" is distinct from "list" -- the predicate "list?" returns #f when applied to any long ratnum.

In Wraith Scheme, a long complex is a tagged pointer whose tag bits specify that it points to a dotted pair in Scheme main memory, whose car and cdr are respectively the real and imaginary parts of a number; however, each of the car and the cdr may be either a fixnum, a float, or a long ratnum. Thus there are actually nine different representations for storing a number as a long complex, and it might take as many as seven tagged objects to do so. And incidentally, "long complex" is distinct from "list" -- the predicate "list?" returns #f when applied to any long complex.

I have described several different ways Wraith Scheme can use to represent mathematical numbers and their friends, but I haven't actually said what numbers Wraith Scheme represents thereby. That is important -- let's see why.

Wraith Scheme uses all but one of the 2-to-the-64th possible fixnums to represent signed integers, and performs arithmetic on them the usual way. The exception is the bit pattern #x8000000000000000 -- a non-zero number whose negative is itself makes my teeth itch, so I simply don't use that value -- if you try to type it in as a literal, the parser will coerce it to an inexact number of a different value, represented as a float, et cetera.

Wraith Scheme uses floats to represent a great many integers (such as the IEEE-64-bit-float representations of zero, four, and ten to the 100th), a great many rationals (such as the IEEE-64-bit-float representations of one-third and two-and-a-half -- note that the former representation is not an exact representation of one-third but the latter is an exact representation of two-and-a-half), as well as signed infinities of both signs, and nans. Wraith Scheme considers floats to contain only one kind each of nan, positive infinity, negative infinity and zero, even where there may be more than one bit pattern provided by the IEEE standard.

Wraith Scheme uses long ratnums to store rationals represented as fractions, in which the denominator is never zero. Any calculation that attempts to produce a long ratnum with zero denominator will return either an inf (for nonzero numerator) or a nan.

Wraith Scheme uses long complexes to store integers and rationals (in which case the imaginary part is zero), as well as a motley collection of numbers scattered over the complex plane, whose real and imaginary parts may be anything represented as a fixnum, a float or a long ratnum, except that no long complex may have a nan as either real or imaginary part -- Wraith Scheme returns float nans instead of long complexes when calculations with long complexes produce nans.

Now you see a problem: If x is a Wraith Scheme number, and (integer? x) returns #t, you have no idea whether you are looking at a number represented as a simple 64-bit integer, an IEEE 64-bit float, or a more complicated data structure containing several tagged objects, like 117/1 or 1.0+0i, and in some cases you might wish to know.

In the spirit of rolling back the rug, Wraith Scheme provides non-R5 predicates e::fixnum?, e::float?, e::long-ratnum? and e::complex? to investigate which representation is used (Wraith Scheme declares all symbols that start with "e::" reserved for enhancements), and in connection with the standard procedures numerator, denominator, real-part and imag-part, a user can ferret out all the details of how x is represented. There is also a procedure to print out the bit pattern of any tagged object. I could easily add more such procedures -- one to tell all you ever wanted to know about whether an IEEE 64-bit float is a nan, an inf, or a signed zero might be useful -- but I haven't done so yet.
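
A sketch of what rolling back the rug looks like in practice, built on the Wraith-specific `e::` predicates named above (the helper `describe-representation` itself is hypothetical, not part of Wraith Scheme):

```scheme
;; Hypothetical helper on top of Wraith Scheme's reserved e:: predicates.
;; Which representation a given number actually uses is an implementation
;; decision, so no particular results are shown here.
(define (describe-representation x)
  (cond ((e::fixnum? x)      'fixnum)        ; tagged signed 64-bit integer atom
        ((e::float? x)       'float)         ; tagged IEEE binary64 atom
        ((e::long-ratnum? x) 'long-ratnum)   ; pair holding numerator/denominator
        ((e::complex? x)     'long-complex)  ; pair holding real/imaginary parts
        (else                'not-a-number)))
```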

There is a related issue: With many ways to represent the same mathematical number, there are times when it might be difficult for a user to tell which representation of a number is in use by looking at the literal value provided or printed. For example, if you type "12345." as a literal constant, Wraith Scheme will store it as an inexact fixnum. That is because Wraith Scheme can represent every mathematical integer in the range [-9223372036854775807, 9223372036854775807] as a fixnum, but the range in which every mathematical integer can be represented as a float is much smaller. Thus if you are working with integers, it is an advantage to keep them stored as fixnums as long as no overflow occurs, and Wraith Scheme does so. Wraith Scheme will detect fixnum overflow and will silently coerce the un-overflowed result to the best float it can find, and will clear the exact bit to advise that something wonky has happened. For another example, if a fixnum operation produces a real non-fixnum result, Wraith Scheme will coerce the intended result to a flonum and set the exact bit appropriately. And there are more.


-- Jay Reynolds Freeman
---------------------
Jay_Reynol...@mac.com
http://JayReynoldsFreeman.com (personal web site)

Alaric Snell-Pym

unread,
Aug 25, 2022, 7:08:25 AM
to scheme-re...@googlegroups.com
On 25/08/2022 02:48, 'Jay Freeman' via scheme-reports-wg2 wrote:
> I think this discussion of Scheme numbers and mathematics still lacks a clear distinction between what is modeled and how it is represented, and that distinction seems very important for some of the points raised herein.

Yes. I think that distinction is one of the better things about Scheme,
but it's easy for us to forget it due to sloppy thinking and peer
pressure from other languages!

> (1) In Scheme, there is, or ought to be, a clear distinction between the mathematical numbers modeled and the details of how those modeled numbers are represented as patterns of bits.

Yes

> (2) I suggest that insofar as possible, the details of how mathematical numbers are represented as bits should be invisible to the user while performing calculations.

Yes

> (3) I further suggest that the details of how mathematical numbers are represented as bits should be easily available to any user who wants to know about such things.

Maybe (my main objection being: there are few users who really need to
know this, and too many who think they do when they really don't,
because they've succumbed to peer pressure from other languages)

> End of summary -- read on at risk of falling asleep from boredom.

Well, *I* found it interesting :-) Thanks for this well-reasoned
walkthrough of your design decisions.

--
Alaric Snell-Pym (M0KTN neé M7KIT)
http://www.snell-pym.org.uk/alaric/

John Cowan

unread,
Aug 25, 2022, 1:19:04 PM
to scheme-re...@googlegroups.com, Jay Freeman
On Wed, Aug 24, 2022 at 9:48 PM 'Jay Freeman' via scheme-reports-wg2 <scheme-re...@googlegroups.com> wrote:

I advocate the point of view that useful modeling of mathematical numbers should, insofar as possible, not let the details of how those numbers are represented get in the way of their effective use in performing calculations, yet it seems clear that some of those details of representation could matter a lot to some users.

I don't think anyone disagrees with that.

Thus I suggest that Scheme should provide easy ways for users to investigate how any number is represented.

There are levels of representation, however.  I find it hard to swallow that anyone cares at the Scheme level whether integers are big-endian or little-endian, though it can and does make a difference when reading and writing binary files.  Still less is anyone interested in which voltages encode 0 or 1.  So I would focus on just the levels above raw machine integers (and raw machine floats if available), as it seems you do.

Of these levels, some can be left to the documentation.  If there is a range of integers that is represented particularly efficiently, it makes sense to be able to find the upper and lower bounds of that range programmatically.  Knowing that such integers are or are not represented using tagged pseudo-pointers, and how many bits of the pseudo-pointer are available to represent the integer, is something that can be put in an appendix to the implementation's manual and left there.

Yet there are some problems in doing so. Let me elaborate:
I think all agree that the intent is to model the mathematical numbers as developed in a course on mathematical analysis.

Yes.

In this modeling, we computer types find it useful to add placeholders to indicate that something has gotten bogus: Popular placeholders are nans, infs both signed and unsigned, and signed zeros. Other indications of wonkiness are often provided by various kinds of hardware floating-point exceptions or by suspicious users, but Schemers tend to lump them all into a single stick-on label of bogosity; namely, a cleared "exact bit" or its equivalent.

I don't think that's true any more.  Support for the special values slowly crept into R5RS+ systems, and R[67]RS standardized the literal notations +nan.0 (-nan.0 is also accepted, though it need not produce a different NaN), +inf.0, -inf.0, and -0.0 for them, as well as useful procedures like `finite?`, `infinite?`, and `nan?`.  (These apply only to real numbers in R6RS, but are extended to complex numbers in R7RS.)  There is no predicate `negative-zero?`; note that it cannot simply be the conjunction of `negative?` and `zero?`, since -0.0 = 0.0 numerically and so `(negative? -0.0)` is #f, but `(eqv? x -0.0)` serves the purpose where the two zeros are distinguished.  Today I think only MIT Scheme does not support the special values in its default mode (for hysterical raisins) though you can make it do so by changing a run-time setting.
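
The standard notations and predicates in a sketch (R7RS behavior; results for complex arguments differ under R6RS, which restricts these predicates to reals):

```scheme
(nan? +nan.0)         ; => #t
(nan? -nan.0)         ; => #t  (need not denote a distinct NaN)
(infinite? +inf.0)    ; => #t
(infinite? -inf.0)    ; => #t
(finite? 1.0e308)     ; => #t
;; R7RS extends these predicates to complex arguments:
(nan? +nan.0+5.0i)    ; => #t in R7RS
```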

I use terms like "bogus" because I don't believe there is general agreement among mathematicians about how to introduce these concepts into the hierarchy of mathematical types. 

I think you're right.  (For my purposes I model the IEEE floating-point types as intervals of the real line, with the understanding that floating-point operations such as + are not the same as interval arithmetic but different operations altogether.  For that purpose, NaN is ambiguous between two intervals: the empty interval returned by (flsqrt -2.0) and the universal interval returned by (/ 0 0).  But that's just me, and neither here nor there.)

There are lots of well-known representations. They include signed and unsigned integers of 8, 16, 32 and 64 bits, floating-point numbers in several formats of length 32, 64 and 80 bits, and several likely representations of bignums.

In addition, there are now 16-bit and 128-bit floats plus decimal floats of 64 and 128 bits, all standardized by IEEE 754:2008.  GNU GMP's representation of bignums, big rationals, and big floats are pretty standard nowadays too.

(And there used to be more -- does anyone remember the saying, "If you don't have 36 bits, you're not playing with a full DEC!"?)

No, but as an occasional TOPS-10 user, I like it!

Let me use the version of Wraith Scheme (R5 for the Macintosh)

(Speaking softly and without a stick of any sort, I'd urge you to take the additional steps to make it an R7RS-small implementation.  The complete list of differences (it was literally derived from a diff of the TeX source) is given in just two pages of the R7RS-small standard, in the section "Language changes".  The largest single change is library support, and if you don't want to do that (or any other specific things) just document that you conform to the R7RS-small standard with derogations and then say what they are.)

In Wraith Scheme, a float is a tagged atom whose tag bits identify it as an IEEE 64-bit floating-point number.

Where do you store these bits and the exact/inexact bit?  If you grab mantissa bits, you have to do your own rounding, very very carefully.  If you grab exponent bits, you seriously limit the range.  You could do NaN-boxing, at the expense of limiting yourself to 52-bit integers.  For fixnums, there don't seem to be any bits to grab, since they have the full 64-bit range minus the most negative value.

Or do you mean that *pointers* to fixnums and flonums are tagged, and that the fixnums and floats themselves are stored as untagged words?  But this seems to contradict what you say about atoms.  I'm confused.

what you might get if you parsed "1/3" according to recent Scheme language specifications

"Recent" meaning "1985", the date of R2RS.

Scheme considers floats to contain only one kind each of nan, positive infinity, negative infinity and zero, even where there may be more than one bit pattern provided by the IEEE standard.

Only NaN has multiple representations: the rest are unique.

 Any calculation that attempts to produce a long ratnum with zero denominator will return either an inf (for nonzero numerator) or a nan.

That conforms to R[57]RS, though most implementations raise an exception in that case.

no long complex may have a nan as either real or imaginary part -- Wraith Scheme returns float nans instead of long complexes when calculations with long complexes produce nans.

I *very strongly* urge you to change that.  Alex Shinn proposed this behavior for R7RS-small, and the numerical Scheme community pushed back hard, saying it would make R7RS-small useless for them.  The committee heard their arguments, which I cannot reproduce here, and agreed.  So in R7RS +nan.0+32.0i, 32+nan.0i, and +nan.0+nan.0i are perfectly cromulent complex numbers distinct (in the sense of eqv?) from +nan.0.
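
A sketch of the R7RS behavior just described, assuming an implementation with complex support:

```scheme
;; Perfectly valid R7RS complex numbers, each distinct (per eqv?) from +nan.0:
(number? 32+nan.0i)         ; => #t
(real-part +nan.0+32.0i)    ; => +nan.0
(imag-part +nan.0+32.0i)    ; => 32.0  (the finite part survives)
(eqv? +nan.0+32.0i +nan.0)  ; => #f
```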

That is because Wraith Scheme can represent every mathematical integer in the range [-9223372036854775807, 9223372036854775807] as a fixnum, but the range in which every mathematical integer can be represented as a float is much smaller.

Well, it's [-9007199254740992, 9007199254740992] (binary64 represents every integer of magnitude up to 2^53 exactly), which is still pretty big, but I take your point.

Thus if you are working with integers, it is an advantage to keep them stored as fixnums as long as no overflow occurs

The real advantage is that although float arithmetic is about as fast as fixnum arithmetic nowadays, conversion between the two formats is much slower, and you need something close to machine integers for such purposes as indexing.
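
The fixnum-vs-float integer ranges can be checked directly; in binary64, consecutive integers start to collide just above 2^53:

```scheme
;; 2^53 and 2^53 + 1 round to the same binary64 value:
(= (exact->inexact 9007199254740992)    ; 2^53
   (exact->inexact 9007199254740993))   ; 2^53 + 1
; => #t in an implementation whose inexact reals are IEEE binary64

;; whereas as exact integers (fixnums or bignums) they differ:
(= 9007199254740992 9007199254740993)   ; => #f
```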

Jay Freeman

unread,
Aug 25, 2022, 11:30:00 PM
to scheme-re...@googlegroups.com, Jay Freeman, John Cowan
Replying to John Cowan:

I will save you all the trouble of reading another long message from me by referring to Wraith Scheme documentation on line (whose total wordcount is between that of Pride and Prejudice and that of War and Peace). That documentation is also good for putting yourself to sleep when you run out of sheep to count. The most useful item is probably the first one listed -- a guide to the internal details of the implementation -- which I shall refer to herein as the INTERNALS DOCUMENT:


There are also:



This stuff is for the current release of Wraith Scheme, and is a little obsolete compared to the new release I am working on now, but I suspect it will suffice, especially if you can't sleep. (Hmn, I am not sure that mousing these links will open them, you may have to copy and paste into your browser.)


On Aug 25, 2022, at 10:18, John Cowan <co...@ccil.org> wrote:

There are levels of representation, however.  I find it hard to swallow that anyone cares at the Scheme level whether integers are big-endian or little-endian, though it can and does make a difference when reading and writing binary files. [...]

<manic chuckling with shaky hands>
When I first released Wraith Scheme, nearly 15 years ago, Apple was converting its hardware from PowerPC processors (big-endian) to Intel processors (little-endian), and developers at that time were encouraged to create "Universal Binary" software that would run on both architectures. There were plenty of tools to help do that, but the world files that Wraith Scheme read and wrote could not very well be both. I therefore created code that would recognize whether the architecture it was running on was PowerPC or Intel, and convert endian-ness appropriately when reading and writing world files. (The actual files were little-endian; conversion was necessary for the PowerPC.) That code is still present in the current Wraith Scheme release, though if Apple were ever to start supporting big-endian processors again <hands give a violent shake> I would have to write some more in-line assembly to deal with lockless coding on whatever the new architecture was. (Wraith Scheme does parallel processing, and needs lockless coding so different processes do not mess up the commonly-held Scheme main heap storage.)
</manic chuckling, but hands continue to shake a little>

[...]

(... does anyone remember the saying, "If you don't have 36 bits, you're not playing with a full DEC!"?)

No, but as an occasional TOPS-10 user, I like it!

... and every December, we gathered around the binary tree to celebrate the glory and wonder of the DEC-20 -- on December 20, of course. I have a friend who used to have a DEC-20 in her garage: Ah, what people will do in the name of startups. Eventually she was able to exchange the '20 for a Jaguar XJ-S, so it was not in vain ...

[...]

Speaking softly and without a stick of any sort, I'd urge you to take the additional steps to make [Wraith Scheme] an R7RS-small implementation.  [...]  The largest single change is library support [...]

Wraith Scheme has what I call a "package system", which I put in as an enhancement while the Scheme community was still, er, "discussing" where to go after R5. (Please note that although many people told it where to go in firm terms, I always refrained from giving travel advice.) It is sorta kinda based on the package system of Zetalisp/Common Lisp, and you can read about it in the "Package Management" section of the internals document. I think the primitives I built it with could be used to create an R5+-style library.

Most of my decisions about how to enhance Wraith Scheme were based on things I was interested in or thought I might need -- there is a lot of stuff in the program that the RN creators never heard of and probably don't want to, such as forgettable objects, a turtle-graphics system, and of course the Big Red Button.

[...]


In Wraith Scheme, a float is a tagged atom whose tag bits identify it as an IEEE 64-bit floating-point number.

Where do you store these bits and the exact/inexact bit? [...] I'm confused.

So am I, but let me try to explain: You can look at the section on "Tagged Avals" in the internals document, or read on: The basic storage element of Wraith Scheme is what I call a tagged aval. (The 'a' is for "ambidextrous" and is intended to suggest that it might represent either of what other programming languages call "lval"s and "rval"s, the 'l' and 'r' standing respectively for "left" and "right".) A tagged aval comprises two adjacent machine words, one for the tag, and the other for a data element that is, e.g., some kind of atom or some kind of pointer into Scheme main memory, or perhaps something else.

That's right -- I have 8 bytes worth of tag bits -- there is no avoiding it, all the hardware memory accesses are machine-word aligned, so there is no sane way to pack more than one tag into a word and still know where its associated data element is. Perhaps surprisingly, I have found good use for half of those bits, and have some ideas for the rest.

All of Wraith Scheme's registers and stack locations (which are not the hardware registers and stack) are tagged avals, and the garbage collector and storage manager work in such a way that they never look into the "middle" of a tagged aval -- they always know what kind of thing they are looking at, and how to follow its pointer if the data element happens to be a pointer, and how much to bump C++ pointers by to get past this tagged aval to the next one in the heap.

When I wrote Pixie Scheme -- Wraith Scheme's predecessor -- I knew about implementations that stripped bits off pointers or machine words to use for tagging. We had a TI lisp machine in house at the AI lab where I was working, and it used five bits of a 32-bit word for tag bits. Unfortunately, the machine I was using to develop Pixie Scheme was a Mac II, which was almost certainly the most lisp-capable personal computer available at the time, but it only had a 24-bit address space. If I stripped five tag bits off that, the memory space left over would have been a bit small. (Though if I had implemented Pixie Scheme that way I would probably hold the rare distinction of having created a system whose natural word size had a prime number of bits.) (I must apologize for mentioning systems with 27-, 24- and 19-bit word sizes to you folks. If the thought of them gives you nightmares, it is all my fault.)

"Recent" meaning "1985", the date of R2RS.

"Recent" is a relative term. I started work on Pixie Scheme in 1987 -- and was proud that it conformed to the hot-off-the-presses R3 specification, though with an incomplete numeric tower.

[...]


no long complex may have a nan as either real or imaginary part -- Wraith Scheme returns float nans instead of long complexes when calculations with long complexes produce nans.

I *very strongly* urge you to change that.  Alex Shinn proposed this behavior for R7RS-small, and the numerical Scheme community pushed back hard, saying it would make R7RS-small useless for them.  The committee heard their arguments, which I cannot reproduce here, and agreed.  So in R7RS +nan.0+32.0i, 32+nan.0i, and +nan.0+nan.0i are perfectly cromulent complex numbers distinct (in the sense of eqv?) from +nan.0.

Do you have a reference for what those arguments were, or at least an idea of when they were made so that I can perhaps look them up in my archive of postings to this list? It seems an oxymoron, even to the point of being ludicrous, to have something that contains a nan respond to "number?" with #t: The production of a nan indicates that the computation has escaped from the boundaries of formal mathematical analysis, and if that happens in any system that claims to model formal mathematical analysis, something is wrong with the model.

John Cowan

unread,
Aug 26, 2022, 3:51:50 PM
to Jay Freeman, scheme-re...@googlegroups.com
On Thu, Aug 25, 2022 at 11:30 PM Jay Freeman <jay_reynol...@mac.com> wrote:

I suspect it will suffice, especially if you can't sleep. (Hmn, I am not sure that mousing these links will open them, you may have to copy and paste into your browser.)

The links work great, though I am not sure if you are supplying them or Gmail is; 99% of the time I don't care, because an URL in plain text gets recognized (unless it happens to be a Gemini URL).  I am working my way through the internals document.

Eventually she was able to exchange the '20 for a Jaguar XJ-S, so it was not in vain ...

Huh.  Trade a perfectly fine computer for a broken means of transportation that won't even let you read while traveling unless you have your own human driver!  (h/t Arthur C. Clarke)

It is sorta kinda based on the package system of Zetalisp/Common Lisp, and you can read about it in the "Package Management" section of the internals document. I think the primitives I built it with could be used to create an R5+-style library.

I'll look into that.  CL-R <https://docs.google.com/document/d/1Nh28vxYjratsjYaUj_MOKZlgtWdsoJI0cO4J0O4JdHc> is a reduced version of Common Lisp that can be easily compiled to C or a similar low-level language: it is static (although dynamically typed).  I mention it here because on p. 6 there is an explanation of how to use a subset of the CL package system as a statically typed module system similar to R6RS.

 such as forgettable objects, a turtle-graphics system, and of course the Big Red Button.

 
A tagged aval comprises two adjacent machine words, one for the tag, and the other for a data element that is, e.g., some kind of atom or some kind of pointer into Scheme main memory, or perhaps something else.

That's expensive, but since you do cdr-coding you get to reclaim much of the cost.

 
When I wrote Pixie Scheme -- Wraith Scheme's predecessor -- I knew about implementations that stripped bits off pointers or machine words to use for tagging.

That's what people usually call "tagging" nowadays.  Smalltalk, for example, has a 1-bit tag for fixnums (or SmallIntegers, as they are called), but every other object carries a full-word pointer to its class object.

Do you have a reference for what those arguments were, or at least an idea of when they were made so that I can perhaps look them up in my archive of postings to this list?

I thought they were in the WG1 list, but I can't find them in the archive.  However, the general idea IIRC was that NaN indicates that information has been lost, and it is better to lose just some of it than all of it when that is possible.  Consider the analogous case of infinities: (+ 2.0+1.6e308i 2.0+1.6e308i) is too far from 0+0i to be representable, but it is better to return 4.0+inf.0i, thus preserving the real part of the sum, than simply +inf.0.
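The infinity half of that argument can be seen at a REPL. A hedged sketch, assuming an implementation whose inexact reals are IEEE binary64 floats and whose complex arithmetic works component-wise (results may differ elsewhere):

```scheme
;; Both additions overflow binary64, but the complex one loses less:
(+ 1.6e308 1.6e308)            ; => +inf.0 (the whole value is lost)
(+ 2.0+1.6e308i 2.0+1.6e308i)  ; => 4.0+inf.0i (the real part 4.0 survives)
```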

Jay Freeman

unread,
Aug 27, 2022, 6:04:53 PM8/27/22
to scheme-re...@googlegroups.com, Jay Freeman, John Cowan
Replying again to John Cowan:

>> Eventually she was able to exchange the '20 for a Jaguar XJ-S, so it was not in vain ...
>
> Huh. Trade a perfectly fine computer for a broken means of transportation that won't even let you read while traveling unless you have your own human driver! (h/t Arthur C. Clarke)

She liked cats and she liked hot cars. And the '20 went to the startup once it was successful enough to have a building of its own, so she still got to play with it.

>> such as forgettable objects, a turtle-graphics system, and of course the Big Red Button.
>
> [...]

Wraith Scheme has a user interface that allows various kinds of non-text interactions: There is a graphics system much like "turtle graphics" -- which is of course called "kitten graphics", in keeping with Wraith Scheme's feline-based design paradigm -- and a couple of buttons that can be programmed to interrupt the REPL. And there are others.

>> A tagged aval comprises two adjacent machine words, one for the tag, and the other for a data element that is, e.g., some kind of atom or some kind of pointer into Scheme main memory, or perhaps something else.
>
> That's expensive, but since you do cdr-coding you get to reclaim much of the cost.

Yes. And cdr-coding has another advantage which may not always be understood: It much improves locality of reference of Scheme main-memory data, and thereby reduces the incidence of cache misses and page faults. With today's processors so much faster than memory and disks, both of those can cause run-time delays of enormous numbers of processor clocks -- perhaps millions of clocks in the case of page faults and slow disks. People who think that speed optimization is solely a matter of reducing the number of instructions the program has to execute are sometimes chasing the wrong fox. (Note that I am careful not to say that Wraith Scheme is fast -- it isn't -- but cdr-coding provides some low-hanging fruit that I have tried to gather.)

> However, the general idea IIRC was that NaN indicates that information has been lost, and it is better to lose just some of it than all of it when that is possible. Consider the analogous case of infinities: (+ 2.0+1.6e308i 2.0+1.6e308i) is too far from 0+0i to be representable, but it is better to return 4.0+inf.0i, thus preserving the real part of the sum, than simply +inf.0.

I will have to think about that ...

Linas Vepstas

unread,
Aug 29, 2022, 8:25:34 AM8/29/22
to scheme-re...@googlegroups.com
I'm ready to throw my hands in the air, and mutter "whatever". There seems to be some kind of misunderstanding.

On Wed, Aug 24, 2022 at 9:48 PM John Cowan <co...@ccil.org> wrote:


On Wed, Aug 24, 2022 at 12:01 PM Linas Vepstas <linasv...@gmail.com> wrote:
 
In programming practice, types refer to the (internal, hidden, possibly dynamic) structure of the storage location

If something is hidden, what is the point in having a label for it?  We never know the structure of the storage location in Scheme (or CL, Perl, Python, etc.).  Maybe a complex number occupies two cells, maybe it doesn't.  In Chicken, for example, an inexact complex number is two raw binary64s, but an exact complex number is typically two pointers with immutable contents (same size on 64-bit systems, not the same on 32-bit systems).

To me, this comes off as intentional misunderstanding. The concept of "types" is a real thing. It is useful for program semantics, for reasoning about what programs do.  Ints and bools and strings are types. Their encoding is internal, hidden and possibly dynamic, and yet we continue to talk about ints and strings as being actual "things" that have different types.

We reason about software as if those types really exist, no matter what the bits are doing under the covers.  If we have a function that returns strings, we expect the returned values to always be strings.  We don't expect the returned value to sometimes be an int or a bool. Even when one could argue that it makes sense: "ahh, but you see, the empty string is empty, so it is better to return 0 or #f in that case".

If someone gives me a function that is claimed to return strings, I'm going to be pretty ticked off if it sometimes returns bools or ints. The contract was for strings, that is what I want.

I'm applying the same reasoning to reals and complex.  If a function is supposed to return a real, I do not want to be surprised to find it sometimes returns 42+i0.  If a function is supposed to return a complex number, it should do that, instead of sometimes returning an int.

But you might say "Ahh, don't worry, the reals and complex will automatically convert to one another as-needed under the covers!"  If that is the case, then the predicates should also "convert automatically", and always respond to questions about type correctly.  If, in your system, a function that returns strings might sometimes return 0, then (string? 0) should evaluate to #t.

-- Linas

John Cowan

unread,
Aug 29, 2022, 11:32:24 AM8/29/22
to scheme-re...@googlegroups.com
On Mon, Aug 29, 2022 at 8:25 AM Linas Vepstas <linasv...@gmail.com> wrote:

I'm ready to throw my hands in the air, and mutter "whatever". There seems to be some kind of misunderstanding.

There is, but it is not intentional.
  
The concept of "types" is a real thing.

I agree with everything that follows up to here:
 
I'm applying the same reasoning to reals and complex.   If a function is supposed to return a real, I do not want to be surprised to find it sometimes returns 42+i0  If a function is supposed to return a complex number, it should do that, instead of sometimes returning an int.
 
Scheme numbers simply don't work like that, and never have since R2RS back in 1985.  I'll quote Section 6.2.1 of R[4-7]RS, which is the most recent expression of this idea:

For example, 3 is an integer. Therefore 3 is also a rational, a real, and a complex [number]. The same is true of the Scheme numbers that model 3. For Scheme numbers, these types are defined by the predicates number?, complex?, real?, rational?, and integer?.

So (complex? 3) is #t, and there is no way to distinguish the value of (+ 1 2) from the value of (make-rectangular (+ 1 2) 0).  This is a fundamental difference between Scheme and most programming languages.
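A sketch of that point at a REPL (standard predicates throughout; the last line is the indistinguishability described above):

```scheme
(integer?  3)  ; => #t
(rational? 3)  ; => #t
(real?     3)  ; => #t
(complex?  3)  ; => #t
;; A complex number with exact zero imaginary part *is* the real number:
(eqv? (+ 1 2) (make-rectangular (+ 1 2) 0))  ; => #t
```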

But you might say "Ahh, don't worry, the reals and complex will automatically convert to one another as-needed under the covers!"  If that is the case, then the predicates should also "convert automatically", and always respond to questions about type correctly.  If, in your system, a function that returns strings might sometimes return 0, then (string? 0) should evaluate to #t.

Not at all, because `string?` and `number?` are mutually exclusive, whereas `complex?`, `real?`, ... `integer?` are not.  Functionally speaking, all Scheme numbers belong to the same type, `number?`, which has two subtypes, `exact?` and `inexact?`.  That's all. 

The current problem arises because a non-real number is on the one hand a number, and on the other hand a compound object with two parts, `real` and `imag`.

Yongming Shen

unread,
Aug 29, 2022, 12:34:50 PM8/29/22
to scheme-re...@googlegroups.com
Arithmetic operations in "traditional languages" don't exactly have "obvious types" either. Take C, for example: there are implicit conversions from shorter int types to longer int types, and from int types to float types. So if one wants to give C arithmetic operators type signatures, one will likely come up with concepts not unlike Scheme's `number?` (a type encompassing all number objects). Suppose on top of that one adds two unary operators to C, $i and $r. $i returns 1 if its operand is an integer-valued number (including integer-valued floating points), 0 otherwise. $r returns 1 if its operand is a rational-valued number (so `$i x` is 1 implies `$r x` is 1), 0 otherwise (for `-inf`, `+inf` and `nan`). Now numbers and arithmetic in C start to resemble those in Scheme if you squint.



Marc Nieper-Wißkirchen

unread,
Aug 29, 2022, 12:43:45 PM8/29/22
to scheme-re...@googlegroups.com
In order to warrant the name `compnum?' and to provide a predicate
useful for type inference (as R6RS had in mind with `real?'), my
proposed specification for `compnum?' should probably read "a complex
number that is a compound object of two flonums, its real and
imaginary part".

PS The R7RS version of `real?` makes it a bit harder for
implementations to follow the specification. These are my Chibi
experiments:

> (max 1.0+0.0i 2)
1.0+0.0i
> (rationalize 1.0+0.0i 1/2)
ERROR in "floor": invalid type, expected Number: 1.5+0.0i

The point is that the domain of some procedures is restricted to
reals. If a real can have a non-trivial imaginary part (+0.0 or -0.0),
the implementation has to cope with that.

John Cowan

unread,
Aug 29, 2022, 1:40:33 PM8/29/22
to scheme-re...@googlegroups.com
That's because in Chibi (real? 5.0+0.0i) = #f.  Which reminds me that what I've been calling r7rs:real? is not actually mandated by R7RS, because of the apparently-normative examples s.v. `real?`.  So some R7RS implementations, like Chicken (with the -R r7rs switch) and Kawa, provide r7rs:real?; others, like Larceny and Chibi, provide r6rs:real?.  Which makes things even messier than before.



Jay Freeman

unread,
Aug 29, 2022, 6:56:12 PM8/29/22
to scheme-re...@googlegroups.com, Jay Freeman
Linas Vepstas wrote:

> If a function is supposed to return a real, I do not want to be surprised to find it sometimes returns 42+i0.

The trouble is that, when read as a mathematical number, 42+i0 *is* real -- no ifs, ands, buts, or maybes.

Part of the problem is that the discussion has been using two different meanings of the word "type": (1) "Type" is used to mean the kind of mathematical number that a Scheme implementation is intending to model, and (2) "type" is used to mean the pattern of bits that the Scheme implementation is using to store its representation of that number.

These two uses are *not* the same.

Herein I will use the word "type" only with meaning (1), and will coin the phrase "storage class" to substitute for its use with meaning (2).

Some Scheme users wish to record that certain calculations have "underflowed", and produced a result which is not zero but is too small for the available storage classes to represent conventionally: Some storage classes provide special bit patterns to represent calculations that have underflowed, possibly different patterns for underflows from the positive and negative directions. We might use the syntax "+0" and "-0" -- as distinguished from just plain "0" -- when printing out these bit patterns, but there are some issues:

(A) Any Scheme number whose imaginary part is either "+0" or "-0" is clearly *not* representing a mathematical number of type real (because its imaginary part is non-zero), and the predicate "real?" should return #f for it -- *BUT* -- any Scheme number whose imaginary part is just plain "0" clearly *is* representing a mathematical number of type real, and the predicate "real?" should return #t for it.

(B) Suppose (number->string x) prints out "42+0i". Does x have an imaginary part that is the non-zero value "+0", or is it just plain "0"? I can imagine changes to the syntax to make such matters clearer, but any such proposal would probably start another controversial discussion, so I am not going to go there at present. Still, the lack of a good syntax for underflows should not obscure the fact that THE PRESENT SYNTAX FOR UNDERFLOWS IS CONFUSING.

(C) Procedure imag-part should return a value for every Scheme number, regardless of its storage class. E.g., even if Scheme is representing the mathematical number 3 and happens to use a 64-bit signed fixnum to store its representation, then (imag-part 3) should return 0. One might therefore in principle use imag-part to distinguish the cases in (B) by using the conventional syntax: Thus if (number->string x) printed out "42+0i", then (number->string (imag-part x)) might give either "0" or "+0". Procedure "=" might also distinguish them.

(D) Note that there are similar, related issues for infs and nans.

I suggest that if users wish to know details about Scheme numbers that are easily provided by exploring their storage classes, then the Scheme-language specifications should take the bull by the horns and provide means to do so. Just off the cuff, such means might include the following, some of which may already exist:

(a) Predicates to test for infs, nans and underflows.

(b) A procedure to return the name of the storage class used for a particular number? The language specification could define particular names for common storage classes, including, e.g., "ieee-64-bit-floating-point" or "signed-64-bit-integer", and allow implementations to add others as needed.

(c) A procedure to return the actual bit pattern(s) used for a particular number, possibly as a list of bytes or words.

I further suggest that predicates and other procedures that deal with numbers according to mathematical type (in the sense of (1) above) should not special-case inputs or outputs on the basis of storage class.

John Cowan

unread,
Aug 29, 2022, 9:54:05 PM8/29/22
to scheme-re...@googlegroups.com, Jay Freeman
On Mon, Aug 29, 2022 at 6:56 PM 'Jay Freeman' via scheme-reports-wg2 <scheme-re...@googlegroups.com> wrote:
 
 Herein I will use the word "type" only with meaning (1), and will coin the phrase "storage class" to substitute for its use with meaning (2).

Good (although note that "storage class" is used for the type of a backing vector on which an array is overlaid in SRFI 231).
 
We might use the syntax "+0" and "-0" -- as distinguished from just plain "0" -- when printing out these bit patterns

We might, but we don't.  We use the syntax +0.0 and -0.0, as distinguished from 0, when printing out these bit patterns.  Why?  Because anything with a decimal point or exponent, or that is prefixed with #i, plays that way, and anything with neither a decimal point nor an exponent, or that is prefixed with #e, does not.  If you don't like writing 0.0 as distinct from 0, write #i+0 and #i-0.  (Don't confuse #i with the i of complex number notation.)
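Spelling those notations out at a REPL -- a sketch assuming an implementation that distinguishes signed inexact zeros (e.g. one using IEEE floats):

```scheme
;; Reader syntax: exactness comes from the decimal point or a prefix.
0     ; exact zero
0.0   ; inexact positive zero (the same value as #i+0)
-0.0  ; inexact negative zero (the same value as #i-0)
(eqv? 0.0 -0.0)  ; => #f where signed zeros are distinguished
(= 0 0.0 -0.0)   ; => #t -- all three are numerically equal
```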
(A) Any Scheme number whose imaginary part is either "+0" or "-0" is clearly *not* representing a mathematical number of type real (because its imaginary part is non-zero), and the predicate "real?" should return #f for it

That's logical, and it's why R6RS does it the way it does.  However, R5RS predates the signed zeros and the other oddball objects, and its rule is that if the imaginary part is the same as 0 in the sense of = (and #i+0 and #i-0 and 0 are all = for the same reason that 0.5 and 1/2 are), then the result is real.  "The life of [Lisp] has not been logic, it has been experience."
-- *BUT* -- any Scheme number whose imaginary part is just plain "0" clearly *is* representing a mathematical number of type real, and the predicate "real?" should return #t for it.

No one disagrees with that.
(B) Suppose (number->string x) prints out "42+0i". Does x have an imaginary part that is the non-zero value "+0", or is it just plain "0"?

Clearly the latter.
the fact that THE PRESENT SYNTAX FOR UNDERFLOWS IS CONFUSING.

(Except it's not a fact, because +0 and -0 aren't what you think they are.)
(C) Procedure imag-part should return a value for every Scheme number, regardless of its storage class. E.g., even if Scheme is representing the mathematical number 3 and happens to use a 64-bit signed fixnum to store its representation, then (imag-part 3) should return 0.

Yes.
(a) Predicates to test for infs, nans and underflows.

We have those:  infinite?, nan?, (lambda (x) (and (inexact? x) (zero? x) (positive? x))), and (lambda (x) (and (inexact? x) (zero? x) (negative? x))).  Note that without the inexact? clause, the last two procedures always return #f.
(b) A procedure to return the name of the storage class used for a particular number? The language specification could define particular names for common storage classes, including, e.g., "ieee-64-bit-floating-point" or "signed-64-bit-integer", and allow implementations to add others as needed.

We could, but for what?  I would need to be convinced that this is actually useful.

(c) A procedure to return the actual bit pattern(s) used for a particular number, possibly as a list of bytes or words.

Same thing.
I further suggest that predicates and other procedures that deal with numbers according to mathematical type (in the sense of (1) above) should not special-case inputs or outputs on the basis of storage class.

That's already true with one exception, namely that the oddballs answer #t to `real?` but not `rational?`.  Arguably this is the Wrong Thing, but see my earlier remark about experience (and it is true in both R6 and R7).

Taylan Kammer

unread,
Aug 30, 2022, 7:27:34 AM8/30/22
to scheme-re...@googlegroups.com, John Cowan, Jay Freeman
On 30.08.2022 03:53, John Cowan wrote:
>
> (lambda (x) (and (inexact? x) (zero? x) (positive? x)))
> (lambda (x) (and (inexact? x) (zero? x) (negative? x)))
>

Are you sure these work? In Guile, positive? and negative? give #f for 0.0 and -0.0,
which R6RS seems to define as the correct behavior, and R7RS-small I'm not sure (it
seems unspecified but I might have missed some nuance).

Purely intuitively I would expect zero? to be mutually exclusive with positive? and
negative? so the R6RS/Guile behavior seems desirable, but then I'm not a numerical
analyst so dunno. :-)

I guess (lambda (x) (eqv? x 0.0)) and (lambda (x) (eqv? x -0.0)) should work though.

--
Taylan

Linas Vepstas

unread,
Aug 30, 2022, 7:39:57 AM8/30/22
to scheme-re...@googlegroups.com
Ah hah! The tower!  OK, so then the question becomes "when is automatic/implicit conversion between subtypes allowed, and when is it forbidden?"

For example, if there is a function that takes inexact reals as an argument, should it throw an exception if given an exact argument, or should it implicitly convert?

If there's a function that "usually" returns inexact reals, is it ever allowable for it to return a rectangular complex number?

Both of the above are determined by a "directionality" in the tower:  exact numbers can be converted into inexact ones, without ambiguity or confusion, but not the other way. Single-component reals can be converted to complex without confusion or ambiguity, but not the other way.
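That directionality can be seen with the standard conversion procedures. A hedged sketch, assuming binary64 inexact reals (the exact rational shown is the usual binary64 value nearest 1/3):

```scheme
;; Exact -> inexact: well-defined rounding to the nearest representable:
(inexact 1/3)          ; e.g. 0.3333333333333333
;; Inexact -> exact: yields *an* exact number, but not the one you meant:
(exact (inexact 1/3))  ; e.g. 6004799503160661/18014398509481984, not 1/3
;; Real -> complex: a trivial embedding:
(make-rectangular 2 0) ; => 2
;; Complex -> real: only by discarding information:
(real-part 2+3i)       ; => 2
```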

============
For those who enjoy long emails: look up Galois theory.  You can model certain subsets of the complex numbers with polynomials (that have integer coefficients only).  The "numbers" are ratios of polynomials, and either they are reducible, or they are not.  For example, (x^2 + 1) is a kind-of model for imaginary i.  Arithmetic proceeds "as normal", and it "feels like" you are working with numbers (they are fields, after all).  Just that you don't have access to all complex numbers. ...

--linas   

--
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.
 

Linas Vepstas

unread,
Aug 30, 2022, 8:00:31 AM8/30/22
to scheme-re...@googlegroups.com, Jay Freeman
Sigh, well, here I throw up my hands again.

On Tue, Aug 30, 2022 at 1:56 AM 'Jay Freeman' via scheme-reports-wg2 <scheme-re...@googlegroups.com> wrote:
Linas Vepstas wrote:

> If a function is supposed to return a real, I do not want to be surprised to find it sometimes returns 42+i0.

The trouble is, that when read as a mathematical number, 42+i0 *is* real -- no ifs, ands, buts or maybes.

Part of the problem is that the discussion has been using two different meanings of the word "type": (1) "Type" is used to mean the kind of mathematical number that a Scheme implementation is intending to model, and (2) "type" is used to mean the pattern of bits that the Scheme implementation is using to store its representation of that number.

I was using the word "type" in sense number (3) -- that thing that the type-theory people call a type.  Type-theory types are neither (1) nor (2) but are something else again.

I object to the use of the concept of a "mathematical number" in these conversations, since that is an extremely vague concept, to the point of being unusable. What the heck, are you proposing that Scheme implement finite function fields, which are technically "mathematical numbers"?

What about irrational numbers? There are two kinds: quadratic irrationals (those with periodic continued fraction expansions) and those that are not (those with chaotic orbits). The quadratic irrationals are a kind of "exact" number; you can represent them exactly.  I bet there aren't any Schemes that treat them correctly, as exact numbers.  Why? What, do you really want to get into the theory of Bratteli-Vershik compacta to implement numbers correctly in Scheme?

-- linas

John Cowan

unread,
Aug 30, 2022, 8:43:31 AM8/30/22
to scheme-re...@googlegroups.com
On Tue, Aug 30, 2022 at 7:39 AM Linas Vepstas <linasv...@gmail.com> wrote:
 
 
For example, if there is a function that takes inexact reals as an argument, should it throw an exception if given an exact argument, or should it implicitly convert?

All of the procedures specified in R7RS-small that accept inexact numbers also accept exact ones.  This and the I/O procedures are the only instances of ad hoc polymorphism in the language.  The only procedures that require exact numbers are ones like `vector-ref` that use exact integers as indexes; it is an error (= undefined behavior) to pass anything but an exact integer.  In R6RS, an exception is raised in this circumstance.

R6RS and R7RS-large have flonum libraries, where a flonum is a subset of the inexact real numbers (typically all of them).  Again, R6RS raises an exception if you pass a non-flonum argument; in R7RS it is an error.

If there's a function that "usually" returns inexact reals, is it ever allowable for it to return a rectangular complex number?

Every inexact real number is a complex number, per discussion above.  If you mean "is it allowable to return a non-real number?" you have to look at the definition of the procedure.  `sqrt` returns non-real numbers if its argument is negative, whereas `flsqrt` (from the flonum libraries) always returns a flonum, and will return NaN if its argument is negative.
 
Both of the above are determined by a "directionality" in the tower:  exact numbers can be converted into inexact ones, without ambiguity or confusion, but not the other way.

It's not so simple, because typically the maximum range of an inexact number (excluding the infinities) is much less than that of an exact number.  When comparing exact and inexact numbers, the standard requires that the implementation behave as if it converted all inexact numbers (except the oddballs) to exact ones, because transitivity.
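A hedged sketch of why the as-if-exact rule matters, assuming binary64 inexact reals:

```scheme
(define a (expt 2 53))   ; exact 9007199254740992, exactly representable
(define b (inexact a))   ; 9007199254740992.0
(define c (+ a 1))       ; exact 9007199254740993, NOT representable in binary64
;; Comparing as if everything were exact, as the standard requires:
(= a b)  ; => #t
(= b c)  ; => #f -- naively converting c to inexact would round it to b
(= a c)  ; => #f
;; If (= b c) returned #t, then a = b and b = c but not a = c:
;; transitivity would fail.
```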

John Cowan

unread,
Aug 30, 2022, 8:52:51 AM8/30/22
to scheme-re...@googlegroups.com, Jay Freeman
On Tue, Aug 30, 2022 at 8:00 AM Linas Vepstas <linasv...@gmail.com> wrote:

Sigh, well, here I throw up my hands again.

Nil desperandum, or as we say in English, Don't give up the ship.

I was using the word "type" in sense number (3) -- that thing that the type-theory people call a type.  Type-theory types are neither (1) nor (2) but are something else again.

Indeed: static types are not types of objects but of expressions, and Scheme has only one static type, "any".  In any case they are not the subject here.
I object to the use of the concept of a "mathematical number" in these conversations, since that is an extremely vague concept, to the point of being unusable. What the heck, are you proposing that scheme implement finite function fields, which are technically "mathematical numbers"? 

Come, come, you know that we are talking about subsets of the integers and rational numbers, plus the four oddballs.  All other numbers, or "numbers", are out of the picture.  If you do not know what mathematical integers are, consult Russell and Whitehead's _Principia Mathematica_.

What about irrational numbers?

There is no reason why a Scheme can't support a subset of the irrational numbers (a trivial example would be "the existing numbers times pi"), although the lack of pluggability makes it hard for anyone but the implementer to do this, and no implementers have felt the need thus far.

Marc Nieper-Wißkirchen

unread,
Aug 30, 2022, 9:04:42 AM8/30/22
to scheme-re...@googlegroups.com, Jay Freeman
Am Di., 30. Aug. 2022 um 14:52 Uhr schrieb John Cowan <co...@ccil.org>:

>> I was using the word "type" in sense number (3) -- that thing that the type-theory people call a type.  Type-theory types are neither (1) nor (2) but are something else again.
>
>
> Indeed: static types are not types of objects but of expressions, and Scheme has only one static type, "any".  In any case they are not the subject here.

Comment irrelevant for the current discussion: One can make a point that Scheme's static types (the types of expressions) correspond not to a single type but to one type per natural number where each natural number corresponds to the number of values.  In order to have a type for each form, one would have to take subsets of natural numbers instead of individual natural numbers.

Alaric Snell-Pym

unread,
Aug 31, 2022, 4:40:19 AM8/31/22
to scheme-re...@googlegroups.com
On 30/08/2022 02:53, John Cowan wrote:

>> (b) A procedure to return the name of the storage class used for a
>> particular number? The language specification could define particular names
>> for common storage classes, including, e.g., "ieee-64-bit-floating-point"
>> or "signed-64-bit-integer", and allow implementations to add others as
>> needed.
>>
>
> We could, but for what? I would need to be convinced that this is actually
> useful.
>
>>
>> (c) A procedure to return the actual bit pattern(s) used for a particular
>> number, possibly as a list of bytes or words.
>>
>
> Same thing.

Yeah, this kind of what-is-the-bit-pattern stuff only really matters
when doing low-level FFIs or hardware-level I/O, and even then, I think
it's best handled by declaring "This external thing is of type
signed-64-bit-integer, now put the Scheme number 73 into it please".

After all, a Scheme implementation should be allowed to:

1) Represent ALL numbers as a complex pair of rationals, with
inexactness handled with bit flags rather than using floats. Integers
are just stored in the form "x/1+i0/1". So there is only one storage class.

2) Change representations of existing numbers during a garbage
collection (as long as the numeric semantics are preserved), perhaps
using a copying GC that compacts compnums with a zero imaginary
component down to flonums, compacts bignums within the fixnum range down
to fixnums, etc. This might conceivably arise from a situation where
numbers are sometimes mutated in-place as part of some other
optimisation, but the mutator can't change the "storage class" without
moving the number (there's type-specific heaps for small objects,
perhaps) and it's not in a position to find and update reference
pointers, which the GC of course is.

3) Be built on some kind of hardware (probably a VM in practice) that
provides its own numeric tower, so from the point of view of the
implementation, everything is just a "number" (possibly inexact) and
something outside of its control or view decides how to represent that
as bits.

Jay Freeman

unread,
Sep 22, 2022, 4:11:58 AM9/22/22
to scheme-re...@googlegroups.com, Jay Freeman
We were having a good discussion a month or so ago. For what it's worth (possibly not much), I just released a new version (2.27) of my Macintosh R5+ Scheme -- Wraith Scheme -- that implements some of the stuff I was talking about. You can find it at http://www.jayreynoldsfreeman.com/My/Wraith_Scheme_%2864-bit_version%29.html