R6RS test suite

Matthew Flatt

Jul 22, 2008, 6:08:55 PM

The R6RS test suite for PLT Scheme is written as a collection of R6RS
libraries, and we hope that it can be useful to other R6RS
implementors. We'd very much like to have R6RS implementors and users
contribute to the test suite.

You can find the test suite in the PLT Scheme SVN repository:

http://svn.plt-scheme.org/plt/trunk/collects/tests/r6rs/

So far, we've tried to get basic coverage of the standard: using each
function and syntactic form at least once, and trying the interesting
input cases for each.

The content of the "README.txt" file follows. As you can see, we
haven't had much success running the test suite on some other
implementations. For Ikarus and Larceny, the problems are not merely
how to load the libraries that contain the tests; the test suite's
organization prevents it from loading when an implementation is
missing small features. I'm not sure how to improve that without
relying on `eval' (and the content below explains why I'd prefer to
avoid `eval') or breaking up the tests into really small sets.

Any suggestions?

======================================================================
Files and libraries
======================================================================

Files that end ".sps" are R6RS programs. The main one is "main.sps",
which runs all the tests.

Files that end ".sls" are R6RS libraries. For example, "base.sls" is a
library that implements `(tests r6rs base)', which is a set of tests
for `(rnrs base)'. Many R6RS implementations will auto-load ".sls"
files if you put the directory of tests in the right place.

In general, for each `(rnrs <id> ... <id>)' in the standard:

* There's a library of tests "<id>/.../<id>.sls". It defines and
exports a function `run-<id>-...<id>-tests'.

* There's a program "run/<id>/.../<id>.sps" that imports
"<id>/.../<id>.sls", runs the tests, and reports the results.

And then there's "main.sps", which runs all the tests (as noted
above). Also, "test.sls" implements `(tests r6rs test)', which
implements the testing utilities that are used by all the other
libraries.
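
A minimal, self-contained sketch of the naming convention above, written as a single R6RS top-level program. The `test' macro and the two counters here are simplified stand-ins (assumptions) for what `(tests r6rs test)' provides; the real library is not reproduced.

```scheme
(import (rnrs) (rnrs mutable-pairs))

;; Simplified stand-ins for the utilities in (tests r6rs test).
(define checked 0)
(define failures 0)

(define-syntax test
  (syntax-rules ()
    [(_ expr expected)
     (begin
       (set! checked (+ checked 1))
       (unless (equal? expr expected)
         (set! failures (+ failures 1))))]))

;; Follows the run-<id>-tests naming pattern for (rnrs mutable-pairs).
(define (run-mutable-pairs-tests)
  (test (let ([p (cons 1 2)]) (set-car! p 3) (car p)) 3)
  (test (let ([p (cons 1 2)]) (set-cdr! p 4) (cdr p)) 4))

;; What a "run/<id>.sps" program does: run the tests, then report.
(run-mutable-pairs-tests)
(display failures)
(display " of ")
(display checked)
(display " tests failed.")
(newline)
```

In the actual suite the test function lives in its own ".sls" library and the driver in a separate ".sps" program; they are merged here only so the sketch runs on its own.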

======================================================================
Limitations and feedback
======================================================================

One goal of this test suite is to avoid using `eval' (except when
specifically testing `eval'). Avoiding `eval' makes the test suite as
useful as possible to ahead-of-time compilers that implement `eval'
with a separate interpreter. A drawback of the current approach,
however, is that if an R6RS implementation doesn't supply one binding
or does not support a bit of syntax used in a set of tests, then the
whole set of tests fails to load.
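
To illustrate the trade-off, here is a sketch of the `eval'-based alternative that the suite deliberately avoids. The helper name `try-eval' and the unbound name `no-such-binding' are hypothetical; the point is only that each expression is expanded separately, so a missing binding should fail one test rather than prevent a whole library from loading.

```scheme
(import (rnrs) (rnrs eval))

;; Hypothetical helper: evaluate one quoted test expression in isolation
;; and turn any raised object into a marker symbol.
(define (try-eval expr)
  (guard (c [#t 'failed-to-evaluate])
    (eval expr (environment '(rnrs)))))

;; A supported expression evaluates normally.
(display (try-eval '(+ 1 2)))
(newline)

;; Stands in for a binding that the implementation does not supply.
(display (try-eval '(no-such-binding 1 2)))
(newline)
```

The cost is the one described above: an ahead-of-time compiler that implements `eval' with a separate interpreter would be testing the interpreter, not the compiler.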

A related problem is that each set of tests is placed into one
function that runs all the tests. This format creates a block of code
that is much larger than in a typical program, which might give some
compilers trouble.

In any case, reports of bugs (in the tests) and new tests would be
very much appreciated. File them as a PLT Scheme bug report at

http://bugs.plt-scheme.org

======================================================================
Hints on running the tests
======================================================================

Ikarus
------

Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"

  cd <somewhere>
  ikarus --r6rs-script tests/r6rs/run.sps

or run an individual library's test, such as "run/program.sps" as

  cd <somewhere>
  ikarus --r6rs-script tests/r6rs/run/program.sps

As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load,
mostly because condition names like &error cannot be used as
expressions.

Larceny
-------

Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"

  larceny -path <somewhere> -r6rs -program run.sps

or run an individual library's test, such as "run/program.sps" as

  larceny -path <somewhere> -r6rs -program run/program.sps

As of Larceny 0.962, many test suites (such as "base.sls") take too
long and use too much memory to load on our machine; probably the test
functions are too big.

PLT Scheme
----------

If you get an SVN-based or the "Full" nightly build, then these tests
are in a `tests/r6rs' collection already. You can run all of the tests
using

  mzscheme -l tests/r6rs/run.sps

and so on.

Otherwise, install this directory as a `tests/r6rs' collection,
perhaps in the location reported by

  (build-path (find-system-path 'addon-dir)
              (version) "collects"
              "tests" "r6rs")

As of PLT Scheme 4.0.2.5, two tests fail. They correspond to
documented non-conformance with R6RS.

Ypsilon
-------

[If there's a library-autoload mechanism, we didn't figure it
out. Better ideas are welcome...]

Load the library declarations that you're interested in. For `(rnrs
<id> ... <id>)':

* Load "test.sls"
* Load "<id>/.../<id>.sls"
* Eval `(import tests r6rs <id> ... <id>)'
* Eval `(run-<id>-...<id>-tests)'
* Eval `(import tests r6rs test)'
* Eval `(show-test-results)'

Matthew Flatt

Jul 22, 2008, 8:58:11 PM

On Jul 22, 4:08 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load,
> mostly because condition names like &error cannot be used as
> expressions.

That would be because R6RS doesn't require it to work as an
expression. I was confused about whether `record-type-descriptor'
is required and even whether it's built into `condition-predicate'.

Thanks to Aziz for helping so quickly to sort out this test-suite
bug! Most of the tests now load into Ikarus.

Matthew

William D Clinger

Jul 22, 2008, 10:11:25 PM

Matthew Flatt wrote:
> The R6RS test suite for PLT Scheme is written as a collection of R6RS
> libraries, and we hope that it can be useful to other R6RS
> implementors.

Is it ever! Thank you, Matthew. This is great.

From the README.txt file:

> As of Larceny 0.962, many test suites (such as "base.sls") take too
> long and use too much memory to load on our machine; probably the test
> functions are too big.

I wrote a quick-and-dirty shell script that runs every
library test program individually. In Larceny v0.962,
two of the test programs appear to go into an infinite
loop, and I don't yet know why. Several others failed
to compile (for the reasons shown below), terminating
before they could begin to execute. All other test
programs ran to completion in at most 11 seconds on my
test machine. That time includes compilation.

I'm posting a short summary of the test results below.
I will determine the cause of each failed test and add
the newly discovered bugs to Larceny's bug database.
Most of them look easy to fix; writing the tests was
the hard part. Thanks again for doing that.

Will

----
Summary of results for
R6RS test suite for PLT Scheme (revision 10866)
system tested:
Larceny v0.962 (Jul 18 2008 04:26:20, precise:Posix Unix:unified)
test machine:
MacBook Pro (2.4 GHz Intel Core 2 Duo, 4 GB RAM)
Mac OS X 10.5.3

library under test           summary of outcome
==================           ==================
(rnrs arithmetic bitwise)    6 of 232 tests failed.
(rnrs arithmetic fixnums)    infinite loop during compilation?
(rnrs arithmetic flonums)    3 of 365 tests failed.
(rnrs base)                  unbound variable: angle
(rnrs bytevectors)           464 tests passed
(rnrs conditions)            unbound variable: record-type-descriptor
(rnrs control)               11 tests passed
(rnrs enums)                 2 of 26 tests failed.
(rnrs eval)                  3 tests passed
(rnrs exceptions)            1 of 10 tests failed.
(rnrs hashtables)            15 of 248 tests failed.
(rnrs io ports)              unbound variable: standard-error-port
(rnrs io simple)             56 tests passed
(rnrs lists)                 infinite loop during execution?
(rnrs mutable-pairs)         3 tests passed
(rnrs mutable-strings)       3 tests passed
(rnrs programs)              1 of 2 tests failed.
(rnrs r5rs)                  71 tests passed
(rnrs reader)                70 tests passed
(rnrs records procedural)    21 tests passed
(rnrs records syntactic)     unbound variable: record-type-descriptor
(rnrs sorting)               1 of 4 tests failed.
(rnrs syntax-case)           24 of 95 tests failed.
(rnrs unicode)               1 of 118 tests failed.

Abdulaziz Ghuloum

Jul 22, 2008, 10:10:14 PM

I'm thinking about refactoring the tests and expressing them as data,
or specifications, but not code that the implementation has to run.
A separate program should take this data and generate the test suite
code as well as a driver program to drive the implementation being
tested. A few reasons for this:

1. I would like the test suite to provide as much information as
possible regarding conformance. Currently, one bug/error may take
the rest down.

2. I'm not that comfortable having the implementation test itself
by itself. I'd rather have both MzScheme and Ikarus testing Ikarus
instead of having only Ikarus testing itself. This also allows for
the test to be terminated should it diverge and to catch its error
code should it crash. Unfortunately, there is no portable way of
doing this.

3. This allows testing different evaluating strategies (e.g., you
can run the tests as scripts, libraries, or using eval) without
having to rewrite your tests. One can also test different
optimization levels, etc., that an implementation may provide.

4. Often times in Ikarus, I special case some primitives when I know
the values of part of their inputs. For example, (vector-ref x 4)
is compiled differently from (vector-ref '#(1 2 3) y) which is also
compiled differently from (vector-ref x y). A test-suite generator
can produce four tests from a single (vector-ref '#(1 2 3) 1). This
can produce too many tests for a large number of constants but may be
useful in some cases.
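
As a rough sketch of point 4, a macro can expand one call with constant arguments into the four constant/variable combinations. The name `four-way' is hypothetical, and whether a compiler actually treats the variable forms differently depends on how much it propagates constants.

```scheme
(import (rnrs))

;; Hypothetical helper: expand one binary call into four variants that
;; differ in which arguments appear literally in the call.
(define-syntax four-way
  (syntax-rules ()
    [(_ (op a b))
     (let ([x a] [y b])
       (list (op a b)      ; both arguments literal in the call
             (op x b)      ; first argument is a variable
             (op a y)      ; second argument is a variable
             (op x y)))])) ; both arguments are variables

(display (four-way (vector-ref '#(1 2 3) 1)))   ; prints (2 2 2 2)
(newline)
```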

Thoughts?

leppie

Jul 23, 2008, 6:32:32 AM

On Jul 23, 2:58 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> Thanks to Aziz for helping so quickly to sort out this test-suite
> bug! Most of the tests now load into Ikarus.
>
> Matthew

Has the SVN been updated yet?

Matthew Flatt

Jul 23, 2008, 7:22:43 AM

Yes.

Matthew Flatt

Jul 23, 2008, 7:44:17 AM

On Jul 22, 8:10 pm, Abdulaziz Ghuloum <aghul...@cee.ess.indiana.edu>
wrote:

> I'm thinking about refactoring the tests and expressing them as data,
> or specifications, but not code that the implementation has to run.

Ok.

Just in case it's not obvious: the test suite should still be
represented as code, but the code can generate test descriptions
instead of (as currently) test results.

By writing the test suite as code, I mean that functions or macros can
be used to describe test cases in ways that fit the tests. For
example, "condition.sls" uses a `test-cond' macro to generate a set of
tests for a given condition type.
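
A small sketch of that idea, assuming a test description is just a list of a quoted expression and its expected value. The function `square-tests' is made up for illustration and is not part of the suite.

```scheme
(import (rnrs))

;; Hypothetical generator: code that produces test descriptions (data)
;; instead of running the tests itself.
(define (square-tests inputs)
  (map (lambda (n) `((* ,n ,n) ,(* n n))) inputs))

(write (square-tests '(1 2 3)))
;; prints (((* 1 1) 1) ((* 2 2) 4) ((* 3 3) 9))
(newline)
```

A separate driver could then decide how to run each description: as a script, inside a library, or through `eval'.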

(And we could still have a set of R6RS programs that run tests in a
simple way. Each program would import for expand some libraries that
generate tests, and so on.)

> 2. I'm not that comfortable having the implementation test itself
> by itself. I'd rather have both MzScheme and Ikarus testing Ikarus
> instead of have only Ikarus testing itself. This also allows for
> the test to be terminated should it diverge and to catch its error
> code should it crash. Unfortunately, there is no portable way of
> doing this.
>

> [...]

>
> 4. Often times in Ikarus, I special case some primitives when I know
> the values of part of their inputs. For example, (vector-ref x 4)
> is compiled differently from (vector-ref '#(1 2 3) y) which is also
> compiled differently from (vector-ref x y). A test-suite generator
> can produce four tests from a single (vector-ref '#(1 2 3) 1). This
> can produce too many tests for large number of constants but may be
> useful in some cases.

On some of the details above, I worry about pushing the test suite's
role too far. Implementations will need a lot more tests than are
useful to try to share. But if it fits into a unified suite easily
enough, so much the better.

Matthew

leppie

Jul 23, 2008, 8:09:49 AM

On Jul 23, 2:58 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:

It seems a lot of the tests are still like that, using record names as
expressions. I have brought up this 'issue' before with Aziz, who in
turn explained the single namespace approach (I however tend to
disagree as I feel record names should not interfere, ie they are
separate).

So what is the situation here?
1. Can record names be used in expressions? [I agree with Aziz, that
it should not be allowed, is there any reason why the record name is
simply not wrapped in record-type-descriptor syntax?]
2. Should record names interfere with variable names? [I feel this
should not happen]

Cheers

leppie

Matthew Flatt

Jul 23, 2008, 8:35:23 AM

On Jul 23, 6:09 am, leppie <xacc....@gmail.com> wrote:
> On Jul 23, 2:58 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:
>
> > On Jul 22, 4:08 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
>
> > > As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load,
> > > mostly because condition names like &error cannot be used as
> > > expressions.
>
> > That would be because R6RS doesn't require it to work as an
> > expression. I was confused about whether `record-type-descriptor'
> > is required and even whether it's built into `condition-predicate'.
>
> > Thanks to Aziz for helping so quickly to sort out this test-suite
> > bug! Most of the tests now load into Ikarus.
>
> > Matthew
>
> It seems a lot of the tests are still like that, using record names as
> expressions.

Can you point me to examples?

I note that condition names are used directly in `test/exn' forms,
but the expansion of `test/exn' now wraps the condition name with
`record-type-descriptor'. It sounds like I've missed other places,
though.
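
A sketch of the wrapping technique, not the actual definition of `test/exn': the macro below (hypothetically named `raises?') takes a bare condition name and wraps it with `record-type-descriptor' in its expansion, so the name never appears as an expression.

```scheme
(import (rnrs))

;; Hypothetical macro: #t if evaluating expr raises a condition of the
;; named condition type, #f otherwise.
(define-syntax raises?
  (syntax-rules ()
    [(_ expr condition-name)
     (guard (c [((condition-predicate
                  (record-type-descriptor condition-name))
                 c)
                #t]
               [else #f])
       expr
       #f)]))

(display (raises? (error 'demo "boom") &error))   ; prints #t
(newline)
(display (raises? 'no-error-here &error))         ; prints #f
(newline)
```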

> 1. Can record names be used in expressions?

Not in portable code, as far as I can tell, though it seems that R6RS
allows an implementation to allow a record name as an expression.

> 2. Should record names interfere with variable names?

Sorry - I'm not sure what you mean.

Matthew

leppie

Jul 23, 2008, 9:08:30 AM

On Jul 23, 12:08 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:
>
> In any case, reports of bugs (in the tests) and new tests would be
> very much appreciated. File either as a PLT Scheme bug report at
>
> http://bugs.plt-scheme.org
>

Hi again

Sorry for posting the bug here, but I see this too much!

Line 44 of hashtables.sls: should be eqv?, not eq? EVER! (this goes
for the slatex and compiler benchmarks from Larceny too)

Also in line 161 of hashtables.sls: string-ci-hash and string-hash are
not comparable, unless you define string-ci-hash as:

  (define (string-ci-hash str)
    (string-hash (string-downcase str)))

But that is inefficient.

IMO the test is wrong; it should compare string-ci-hash to
string-ci-hash with differently cased strings.
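
A minimal sketch of the comparison suggested here, with arbitrary example strings. It relies only on `string-ci-hash' being consistent with `string-ci=?'.

```scheme
(import (rnrs))

;; Two spellings that are equal under string-ci=?.
(define h1 (string-ci-hash "Hash Table"))
(define h2 (string-ci-hash "hASH tABLE"))

(display (= h1 h2))   ; prints #t
(newline)
```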

Cheers

leppie

Matthew Flatt

Jul 23, 2008, 9:20:56 AM

On Jul 23, 7:08 am, leppie <xacc....@gmail.com> wrote:
> Line 44 of hashtables.sls: should be eqv?, not eq? EVER!

I think it should be `eq?' for an `eq?'-based hashtable, but I see
your point for other hashtables. I've adjusted the test to vary the
comparison based on the kind of hash table being tested.
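
One way to vary the comparison, sketched under the assumption that the check asks the table for its own equivalence predicate. This is not the code committed to SVN, and `round-trips?' is a hypothetical helper.

```scheme
(import (rnrs))

;; Hypothetical helper: store a value, read it back, and compare using
;; the table's own equivalence predicate (eq? or eqv? here).
(define (round-trips? table key value)
  (hashtable-set! table key value)
  ((hashtable-equivalence-function table)
   (hashtable-ref table key #f)
   value))

(display (round-trips? (make-eq-hashtable) 'a 'one))      ; prints #t
(newline)
(display (round-trips? (make-eqv-hashtable) 10.5 'half))  ; prints #t
(newline)
```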

> Also in line 161 of hashtables.sls: string-ci-hash and string-hash are
> not comparable, unless you define string-ci-hash as:
> (define (string-ci-hash str)
> (string-hash (string-downcase str)))
>
> But that is inefficient.
>
> IMO the test is wrong, it should compared string-ci-hash to string-ci-
> hash with different cased strings.

Right.

These bugs are now fixed in SVN.

Thanks,
Matthew

leppie

Jul 23, 2008, 10:21:25 AM

On Jul 23, 2:35 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
>
> > 2. Should record names interfere with variable names?
>
> Sorry - I'm not sure what you mean.
>
> Matthew

Say I do (on Ikarus):

> cons
#<procedure cons>
> (define-record-type cons)
> cons
Unhandled exception
Condition components:
1. &who: cons
2. &message: "invalid expression"
3. &syntax:
form: cons
subform: #f
4. &trace: #<syntax cons>

Why should a record name interfere? Now it seems 'cons' is some kind
of syntax, but yet it is only usable in 2 places (record-type-
descriptor & record-constructor-descriptor).

leppie

Jul 23, 2008, 10:22:39 AM

On Jul 23, 3:20 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:

> I think it should be `eq?' for an `eq?'-based hashtable, but I see
> your point for other hashtables. I've adjusted the test to vary the
> comparison based on the kind of hash table being testing.

Yeah, I missed that :)

William D Clinger

Jul 23, 2008, 10:59:43 AM

I'd like to suggest two changes to the test suite. With
these two changes, the current development version of
Larceny is able to complete its execution of the entire
test suite in a reasonable amount of time, with some test
failures but without any fatal errors.

In tests/r6rs/io/ports.sls, I suggest changing the last
three calls to open-file-input/output-port to use a
transcoder with the latin-1-codec instead of utf-8-codec.
The R6RS does not specify a semantics for opening a file
for input/output with a variable-width codec. The R6RS
editors who insisted upon adding textual input/output
ports to the R6RS had the Posix semantics in mind, but
the Posix semantics for input/output ports assumes a
fixed-width encoding. Since the semantics of the R6RS
input/output ports with variable-width transcoding is
implementation-dependent, I believe the R6RS allows or
should allow implementations to raise an exception when
a program attempts to open a file for mixed input/output
with a variable-width encoding. Larceny does that, and
I believe this aspect of Larceny's behavior is a feature,
not a bug. Changing the test suite to use a fixed-width
encoding here would not alter the nature of the test.

The other change I suggest would reduce the code size of
tests/r6rs/arithmetic/fixnums.sls. Its large code size
is caused by this code:

;; If you put N numbers here, it expads to N^3 tests!
(carry-tests 0
[0 1 2 -1 -2 38734 -3843 2484598
-348732487 (greatest-fixnum) (least-fixnum)])

The following change would reduce those 1331 tests to
500 without losing much in the way of test coverage:

(carry-tests 0 [0 1 2 -1 -2])

(carry-tests 0 [2 -1 -2 38734 -3843])

(carry-tests 0 [2 -1 -2 (greatest-fixnum) (least-fixnum)])

(carry-tests 0 [-3843 2484598 -348732487
(greatest-fixnum) (least-fixnum)])
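
The counts follow from the N^3 expansion: the original list has 11 entries, and the proposed replacement uses four lists of 5 entries each. A quick check:

```scheme
(import (rnrs))

(define (cube n) (* n n n))

(display (cube 11))        ; prints 1331, one list of 11 inputs
(newline)
(display (* 4 (cube 5)))   ; prints 500, four lists of 5 inputs
(newline)
```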

Thanks again for making these tests available to other
implementors of the R6RS.

Will

Matthew Flatt

Jul 23, 2008, 11:53:30 AM

On Jul 23, 8:59 am, William D Clinger <cesur...@yahoo.com> wrote:
> In tests/r6rs/io/ports.sls, I suggest changing the last
> three calls to open-file-input/output-port to use a
> transcoder with the latin-1-codec instead of utf-8-codec.

Done in SVN.

> The other change I suggest would reduce the code size of
> tests/r6rs/arithmetic/fixnums.sls. Its large code size
> is caused by this code:
>
> ;; If you put N numbers here, it expads to N^3 tests!
> (carry-tests 0
> [0 1 2 -1 -2 38734 -3843 2484598
> -348732487 (greatest-fixnum) (least-fixnum)])
>
> The following change would reduce those 1331 tests to
> 500 without losing much in the way of test coverage:
>
> (carry-tests 0 [0 1 2 -1 -2])
>
> (carry-tests 0 [2 -1 -2 38734 -3843])
>
> (carry-tests 0 [2 -1 -2 (greatest-fixnum) (least-fixnum)])
>
> (carry-tests 0 [-3843 2484598 -348732487
> (greatest-fixnum) (least-fixnum)])

I'm less sold on this one. Don't we want tests combining
`(greatest-fixnum)' with 0 and with 1, for example?

In fact, in SVN, I've made it worse. The `carry-tests' form was
meant to expand to tests of `fx+/carry' and `fx-/carry' in addition
to `fx*/carry'. So, now there are 3 times as many tests, instead of
1/3 as many tests. And even if I had re-organized as above, adding the
missing operators would leave us with more tests than before.

But it does seem unlikely that 3993 tests are really necessary.
Any ideas that will shrink the set of tests enough, even with the
added operators?

Thanks,
Matthew

Matthew Flatt

Jul 23, 2008, 12:21:58 PM

On Jul 23, 9:53 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> Any ideas that will shrink the set of tests enough, even with the
> added operators?

There's a really simple solution: instead of generating N test
expressions, just iterate over the inputs. (I was too stuck on
macros, which is a convenient way to track the original expression,
but obviously not the only way.)
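
A sketch of the iteration approach for `fx*/carry' only, not the code committed to SVN. The input list is a shortened example, and each triple is checked against the R6RS definition, under which the two results s0 and s1 satisfy s0 + s1 * 2^w = fx1 * fx2 + fx3 with w = (fixnum-width).

```scheme
(import (rnrs))

(define inputs (list 0 1 -1 (greatest-fixnum) (least-fixnum)))
(define checked 0)
(define failed 0)

;; One check per triple; the code size stays constant as inputs grow.
(define (check-fx*/carry a b c)
  (let-values ([(s0 s1) (fx*/carry a b c)])
    (set! checked (+ checked 1))
    (unless (= (+ s0 (* s1 (expt 2 (fixnum-width))))
               (+ (* a b) c))
      (set! failed (+ failed 1)))))

(for-each
 (lambda (a)
   (for-each
    (lambda (b)
      (for-each (lambda (c) (check-fx*/carry a b c)) inputs))
    inputs))
 inputs)

(display checked)   ; prints 125, i.e. 5^3 triples
(newline)
```

The loop still performs N^3 checks at run time, but the compiler sees only one copy of the checking code.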

Done in SVN.

Matthew

William D Clinger

Jul 23, 2008, 3:08:12 PM

leppie wrote:
> Line 44 of hashtables.sls: should be eqv?, not eq? EVER! (this goes
> for the slatex and compiler benchmarks from Larceny too)

I didn't write either of those benchmarks. Spurred by
your message, I went through the R6RS version of the
slatex benchmark and corrected all uses of eq? and memq
on characters. You can now obtain the corrected version
from Larceny's Trac site. I'll check in a corrected
version of the R5RS slatex benchmark after I test it.
The more important thing, of course, is to correct the
slatex program itself, which I haven't done.

With the compiler benchmark, it's harder to tell which
uses of eq?, memq, and assq need to be corrected. It
looks as though the easiest way to find out is to add
some instrumentation to the code. That's why I haven't
yet corrected the compiler benchmark, but I plan to get
it done eventually.

Will

leppie

Jul 23, 2008, 4:19:30 PM

Thanks :)

I managed to get slatex working, totally gave up on the compiler
one :)

Cheers

leppie

Jul 23, 2008, 4:20:22 PM

Some more bugs (the conditions should all be wrapped with
(record-type-descriptor)):

- Line 80 of test.sls
- Line 264 & line 266 of flonums.sls

Cheers

leppie

Matthew Flatt

Jul 23, 2008, 5:03:10 PM

Fixed in SVN.

Thanks,
Matthew

y.fuji...@gmail.com

Jul 26, 2008, 4:49:14 AM

>Ypsilon
>-------
>[If there's a library-autoload mechanism, we didn't figure it out. Better ideas are welcome...]

I have released Ypsilon 0.9.5-update2 at:
http://code.google.com/p/ypsilon/downloads/list
It recognizes the '.sls' extension and autoloads the test suite
libraries. :)

I wrote a description for the "Hints on running the tests" section.
Please check.
===============================
Ypsilon 0.9.5-update2
------

Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"

  cd <somewhere>
  ypsilon --sitelib=. --no-letrec-check tests/r6rs/run.sps

or run an individual library's test, such as "run/program.sps" as

  cd <somewhere>
  ypsilon --sitelib=. --no-letrec-check tests/r6rs/run/program.sps
================================
*Because Ypsilon checks for letrec restriction violations during macro
expansion, the expression "(letrec ((x y) (y x)) 'should-not-get-here)"
in test/r6rs/base.sls raises an exception during load. I added
'--no-letrec-check' to avoid this problem.

Finally, running all tests on Ypsilon 0.9.5-update2 shows "93 of 8836
tests failed" :)
Thank you for your great work!

---
Yoshikatsu Fujita

Matthew Flatt

Jul 26, 2008, 8:32:50 AM

On Jul 26, 2:49 am, y.fujita....@gmail.com wrote:
> I have released Ypsilon 0.9.5-update2 at: http://code.google.com/p/ypsilon/downloads/list
> It recognize '.sls' extension and autoload test suite libraries. :)
>
> I write the description for "Hints on running the tests" section.

Great! I've updated "README.txt" in SVN.

Thanks,
Matthew

samth

Jul 26, 2008, 8:59:32 AM

On Jul 26, 4:49 am, y.fujita....@gmail.com wrote:

> *Because Ypsilon check letrec restriction violation during macro
> expansion, the expression "(letrec ((x y) (y x)) 'should-not-get-
> here)" in test/r6rs/base.sls raises exception during load. I added '--
> no-letrec-check' to avoid this problem.

I believe that the R6RS requires that this check be done at runtime
(at least sometimes). In section 11.4.6, discussing `letrec', it says,
under Implementation Responsibilities (emphasis mine):

Implementations must detect references to a <variable> during the
EVALUATION of the <init> expressions (using one particular evaluation
order and order of evaluating the <init> expressions).

There are some examples where it would be hard (or impossible) to
determine during macro expansion whether the letrec restriction will
be violated. For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
x 1)]) x) should always return 1, but the analysis is not obvious.
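
The example as a runnable program. Two freshly allocated pairs are never `eq?', so the reference to x inside the <init> is never evaluated; an implementation that rejects the program at expansion time would be the stricter behavior discussed above.

```scheme
(import (rnrs))

(display
 (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
                 x    ; never reached: the two pairs are distinct objects
                 1)])
   x))                ; prints 1
(newline)
```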

sam th

leppie

Jul 26, 2008, 9:25:13 AM

On Jul 26, 2:59 pm, samth <sam...@gmail.com> wrote:
> For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
> x 1)]) x) should always return 1, but the analysis is not obvious.

That's a good test to add :)

y.fuji...@gmail.com

Jul 26, 2008, 10:26:27 AM

Thank you for your input.
Now I understand why the letrec violation raises &assertion but not
&syntax!
I'm very happy to know that :)

On Jul 26, 10:25 pm, leppie <xacc....@gmail.com> wrote:
> On Jul 26, 2:59 pm, samth <sam...@gmail.com> wrote:
>

> > For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
> > x 1)]) x) should always return 1, but the analysis is not obvious.
>

> That's a good test to add :)

Yes, it is! :)

William D Clinger

Jul 29, 2008, 10:11:59 PM

As testimony to the value of a good test suite,
here are test results for Larceny v0.963. Please
compare to the results for v0.962 that I posted
one week ago.

Thanks again, Matthew.

Will

--------

Summary of results for
R6RS test suite for PLT Scheme (revision 10978)
29 July 2008
system tested:
Larceny v0.963

test machine:
MacBook Pro (2.4 GHz Intel Core 2 Duo, 4 GB RAM)
Mac OS X 10.5.3

45 of 8843 tests failed.

library under test           summary of outcome
==================           ==================
(rnrs arithmetic bitwise)    232 tests passed
(rnrs arithmetic fixnums)    4355 tests passed
(rnrs arithmetic flonums)    1 of 365 tests failed.
(rnrs base)                  8 of 1995 tests failed.
(rnrs bytevectors)           464 tests passed
(rnrs conditions)            131 tests passed
(rnrs control)               11 tests passed
(rnrs enums)                 26 tests passed
(rnrs eval)                  3 tests passed
(rnrs exceptions)            10 tests passed
(rnrs hashtables)            249 tests passed
(rnrs io ports)              35 of 431 tests failed.
(rnrs io simple)             56 tests passed
(rnrs lists)                 74 tests passed
(rnrs mutable-pairs)         3 tests passed
(rnrs mutable-strings)       3 tests passed
(rnrs programs)              2 tests passed
(rnrs r5rs)                  71 tests passed
(rnrs reader)                70 tests passed
(rnrs records procedural)    21 tests passed
(rnrs records syntactic)     53 tests passed
(rnrs sorting)               4 tests passed
(rnrs syntax-case)           1 of 96 tests failed.
(rnrs unicode)               118 tests passed

leppie

Jul 30, 2008, 3:43:17 AM

Looking good :)

I still got almost 140 in IronScheme!

Cheers

leppie

Jul 31, 2008, 3:30:08 PM

Hi Matthew

Can some of the arithmetic tests be adjusted to detect whether a
signed zero is supported? E.g.:

  (log -1.0-0.0i)
  => 0.0-3.141592653589793i ; approximately
                            ; if -0.0 is distinguished

The tests currently require you to distinguish between +0.0 and -0.0.

You could also add the following test: (test (= +nan.0 +nan.0) #f)

Cheers

leppie

Jul 31, 2008, 3:50:51 PM

On Jul 31, 9:30 pm, leppie <xacc....@gmail.com> wrote:
> You could also add the following test: (test (= +nan.0 +nan.0) #f)

Also overflow and divide by zero tests for fixnums :)

Matthew Flatt

Jul 31, 2008, 4:18:55 PM

On Jul 31, 1:30 pm, leppie <xacc....@gmail.com> wrote:
> Can the some of the arithmetic tests be adjusted to detect whether a
> signed zero is a supported? Eg:
>
> (log -1.0-0.0i)
> => 0.0-3.141592653589793i ; approximately
> ; if -0.0 is distinguished
>
> The tests currently requires you distinguish between +0.0 and -0.0.

I've wrapped the `log' and `angle' tests with

(unless (eqv? 0.0 -0.0) ....)

Is that right? Are there other tests that need a wrapper?

> You could also add the following test: (test (= +nan.0 +nan.0) #f)

Ok - added.

Matthew

Matthew Flatt

Jul 31, 2008, 4:45:30 PM

On Jul 31, 1:50 pm, leppie <xacc....@gmail.com> wrote:
> Also overflow and divide by zero tests for fixnums :)

Added and committed in SVN.

I'm not sure what to do with `fldiv' and company. The spec for
`div' says that the first argument cannot be infinite, for example,
but `fldiv' doesn't say.

Thanks,
Matthew

leppie

Jul 31, 2008, 6:21:20 PM

On Jul 31, 10:18 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
>
> Is that right? Are there other tests that need a wrapper?
>

Not that I can see for now.

On Jul 31, 10:45 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> Added and committed in SVN.
>
> I'm not sure what to do with `fldiv' and company. The spec for
> `div' says that the first argument cannot be infinite, for example,
> but `fldiv' doesn't say.

No idea either :)

Thanks

leppie

y.fuji...@gmail.com

Aug 1, 2008, 3:54:28 AM

I have released Ypsilon 0.9.6 at:
http://code.google.com/p/ypsilon/

New version has passed all 8886 tests in R6RS test suite revision
11016. :)

I wrote the description for the "Hints on running the tests" section
for Ypsilon 0.9.6 as follows. Please update README.txt.
===============================
Ypsilon 0.9.6
------

Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"

  cd <somewhere>
  ypsilon --sitelib=. --clean-acc tests/r6rs/run.sps

or run an individual library's test, such as "run/program.sps" as

  cd <somewhere>
  ypsilon --sitelib=. --clean-acc tests/r6rs/run/program.sps
================================
*I removed the '--no-letrec-check' option because it is the default
behavior of 0.9.6 and that option is deprecated.
*I added the '--clean-acc' option to force Ypsilon to clean the
auto-compile-cache and run fresh code.

Thank you again for your great test suite!

---
Yoshikatsu Fujita

leppie

Aug 1, 2008, 3:13:40 PM

Here are some more tests:

(test (exact? (imag-part 0.0)) #t)
(test (exact? (imag-part 1.0)) #t)
(test (exact? (imag-part 1.1)) #t)
(test (exact? (imag-part +nan.0)) #t)
(test (exact? (imag-part +inf.0)) #t)
(test (exact? (imag-part -inf.0)) #t)

(test (zero? (imag-part 0.0)) #t)
(test (zero? (imag-part 1.0)) #t)
(test (zero? (imag-part 1.1)) #t)
(test (zero? (imag-part +nan.0)) #t)
(test (zero? (imag-part +inf.0)) #t)
(test (zero? (imag-part -inf.0)) #t)

http://www.r6rs.org/final/html/r6rs-rationale/r6rs-rationale-Z-H-2.html#node_toc_node_sec_11.6.5

Cheers

leppie

Aug 4, 2008, 3:28:32 PM

Here is another tricky one, but I am not sure if it should assert or
not.

(letrec ((a (lambda () b))(b a)) (b))
(letrec* ((a (lambda () b))(b a)) (b))

              letrec    letrec*
IronScheme:   assert    ok
Ypsilon:      assert    assert
Ikarus:       ok        ok
PetiteChez:   assert    ok        (not R6RS)

Any ideas? ;-)

Cheers

leppie

Aug 4, 2008, 3:59:15 PM

On Aug 4, 9:28 pm, leppie <xacc....@gmail.com> wrote:
> Here is another tricky one, but I am not sure if it should assert or
> not.
>
> (letrec ((a (lambda () b))(b a)) (b))
> (letrec* ((a (lambda () b))(b a)) (b))

The following should assert though and does so in all but Ikarus.

(letrec ((a b)(b (lambda () a))) (b))
(letrec* ((a b)(b (lambda () a))) (b))

Cheers

leppie

Abdulaziz Ghuloum

Aug 4, 2008, 4:31:25 PM

On Aug 4, 12:59 pm, leppie <xacc....@gmail.com> wrote:
> On Aug 4, 9:28 pm, leppie <xacc....@gmail.com> wrote:
>
> > Here is another tricky one, but I am not sure if it should assert or
> > not.
>
> > (letrec ((a (lambda () b))(b a)) (b))
> > (letrec* ((a (lambda () b))(b a)) (b))
>
> The following should assert though and does so in all but Ikarus.

Ikarus does not check on letrec[*] restriction as noted in bug report
https://bugs.launchpad.net/ikarus/+bug/216832 .

The first test must signal an assertion violation, the second
must not.

leppie

Aug 8, 2008, 5:24:19 PM

Here is another test for io simple:

Make sure symbols are delimited in the output, so it can be reread.

(write 'a port)(write 'b port)(write 'c port)

Also a few 'silly' tests, but nevertheless required. These can be
easily overlooked if you are using some underlying runtime's type
system (e.g. .NET, Java).

(test (eq? 3 3.0) #f)
(test (eqv? 3 3.0) #f)
(test (equal? 3 3.0) #f)
(test (= 3 3.0) #t)

(test (eq? 3.0 3) #f)
(test (eqv? 3.0 3) #f)
(test (equal? 3.0 3) #f)
(test (= 3.0 3) #t)

;(test (eq? 3/1 3) #f)
(test (eqv? 3/1 3) #t)
(test (equal? 3/1 3) #t)
(test (= 3/1 3) #t)

(not sure about the commented test, can be either?)

Cheers

leppie

William D Clinger

Aug 8, 2008, 6:24:49 PM8/8/08

to

leppie wrote:
> Here is another test for io simple:
>
> Make sure symbols are delimited in the output, so it can be reread.
>
> (write 'a port)(write 'b port)(write 'c port)

Implementations of the R6RS are required to output "abc"
for that test, with no delimiter. R6RS Library section 8.3
says "The write procedure operates in the same way as
put-datum; see section 8.2.12." R6RS Library section 8.2.12
says

The put-datum procedure merely writes the external
representation, but no trailing delimiter. If
put-datum is used to write several subsequent
external representations to an output port, care
should be taken to delimit them properly so they
can be read back in by subsequent calls to get-datum.

As was pointed out during the R6RS discussion, that
paragraph is ambiguous as to whether the implementation
or the programmer is to take "care", but the discussion
and the editors' response to formal comment 131 made
clear that it was the programmer who was to take care;
implementations are required *not* to delimit [1,2,3,4,5,6].
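
To make the consequence concrete, here is a small sketch using string ports. It assumes a conforming `write` as described above; `read-all` is a helper invented for this example.

```scheme
(import (rnrs))

;; Three consecutive writes with no delimiter supplied by the program.
(define undelimited
  (call-with-string-output-port
    (lambda (port)
      (write 'a port) (write 'b port) (write 'c port))))

;; The same writes, with the program adding a space after each datum.
(define delimited
  (call-with-string-output-port
    (lambda (port)
      (for-each (lambda (sym) (write sym port) (put-char port #\space))
                '(a b c)))))

;; Read every datum back from a string.
(define (read-all str)
  (let ((in (open-string-input-port str)))
    (let loop ((acc '()))
      (let ((d (get-datum in)))
        (if (eof-object? d)
            (reverse acc)
            (loop (cons d acc)))))))

(write (read-all undelimited)) (newline)   ; one symbol: (abc)
(write (read-all delimited))   (newline)   ; three symbols: (a b c)
```

The point of the sketch is only that the delimiting has to happen in the program, not in `write`.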

> ;(test (eq? 3/1 3) #f)
> (test (eqv? 3/1 3) #t)
> (test (equal? 3/1 3) #t)
> (test (= 3/1 3) #t)
>
> (not sure about the commented test, can be either?)

The result of the commented test must be a boolean, but
can be either #t or #f.

Will

[1] http://www.r6rs.org/formal-comments/comment-131.txt
[2] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002566.html
[3] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002587.html
[4] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002588.html
[5] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002590.html
[6] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002591.html

leppie

Aug 8, 2008, 6:38:35 PM8/8/08

to

On Aug 9, 12:24 am, William D Clinger <cesur...@yahoo.com> wrote:
> As was pointed out during the R6RS discussion, that
> paragraph is ambiguous as to whether the implementation
> or the programmer is to take "care", but the discussion
> and the editors' response to formal comment 131 made
> clear that it was the programmer who was to take care;
> implementations are required *not* to delimit [1,2,3,4,5,6].

Thanks :) Was not really clear to me.

>> (write 'a port)(write 'b port)(write 'c port)

> Implementations of the R6RS are required to output "abc"
> for that test, with no delimiter.

Good news for me :) Not sure about the others.

Cheers

leppie

Bill

Aug 9, 2008, 6:34:12 PM8/9/08

to

I wonder if my trig tests would be useful to you guys? By
chance, I'm using a very similar test setup. I currently
have about 4600 tests, and hope someday to finish the r5rs
scheme funcs. The tests are currently generated by numtst.c
in the Snd tools directory (at sourceforge, or available
via anonymous ftp at ccrma-ftp). The code assumes linux
currently, since this was just aimed at my own interests.

William D Clinger

Aug 10, 2008, 1:19:06 AM8/10/08

to

On Aug 9, 6:34 pm, Bill <schottsta...@gmail.com> wrote:
> I wonder if my trig tests would be useful to you guys?

I'm sure they would be. Thanks.

> By chance, I'm using a very similar test setup. I currently
> have about 4600 tests, and hope someday to finish the r5rs
> scheme funcs. The tests are currently generated by numtst.c
> in the Snd tools directory (at sourceforge, or available
> via anonymous ftp at ccrma-ftp). The code assumes linux
> currently, since this was just aimed at my own interests.

Can you be a little more specific about the anonymous ftp
directory? I couldn't find it at ccrma-ftp.stanford.edu.

Will

Bill

Aug 10, 2008, 7:05:51 AM8/10/08

to

> Can you be a little more specific about the anonymous ftp
> directory? I couldn't find it at ccrma-ftp.stanford.edu.

It's in the snd tarball:

ftp://ccrma-ftp.stanford.edu/pub/Lisp/snd-9.tar.gz

or http://sourceforge.net/projects/snd/

William D Clinger

Aug 10, 2008, 10:55:36 AM8/10/08

to

On Aug 10, 7:05 am, Bill <schottsta...@gmail.com> wrote:
> It's in the snd tarball:
>
> ftp://ccrma-ftp.stanford.edu/pub/Lisp/snd-9.tar.gz
>
> or http://sourceforge.net/projects/snd/

Got it, and got it running in Larceny. Will report later.
Thanks!

Will

Brad Lucier

Aug 11, 2008, 1:24:22 AM8/11/08

to

On Aug 10, 7:05 am, Bill <schottsta...@gmail.com> wrote:

Nice testsuite, it found a bug in Gambit's complex acos. Here are the
changes needed to make it work with gambit (these are the differences
with the output of "./numtest guile"):

diff test.scm ~/Desktop/snd-9/tools/
1,2d0
< (define (1+ x)
< (+ 1 x))
11c9
< (if (char=? c #\~)
---
> (if (char=? c #~)
17c15
< (if (member c (list #\A #\D #\F))
---
> (if (member c (list #A #D #F))
21,22c19,21
< (if (char=? c #\%)
< (set! result (string-append result (string #\newline)))
---
> (if (char=? c #%)
> (set! result (string-append result (string #
> ewline)))
26,29c25,26
< (define-macro (test tst expected)
< `(let ((result
< (with-exception-handler (lambda e 'error)
< (lambda () ,tst))))
---
> (defmacro test (tst expected)
> `(let ((result (catch #t (lambda () ,tst) (lambda args 'error))))

leppie

Aug 11, 2008, 3:05:50 AM8/11/08

to

To compile on Cygwin:
gcc -mno-cygwin -o numtest numtst.c

Otherwise complex.h does not seem to be included :|

Cheers

leppie

Bill

Aug 11, 2008, 6:00:48 AM8/11/08

to

Thanks very much for the bug fixes -- I'll update my version,
and I'll add the comment about cygwin. I'll try to get the other
funcs added this week, at least more of expt and rationalize.

leppie

Aug 11, 2008, 6:44:41 AM8/11/08

to

I am not however sure it is generating the output correctly. I get
lines like:

(if (not (eqv? 0 (string->number (number->string 0))))
(display (format #f ";string<->number ~A -> ~A -> ~A?~%"
42831282186485760 (number->string 42784196460019715) (string->number
(number->string 9848720786980866)))))
(if (not (eqv? 4294967297 (string->number (number->string
4294967297))))
(display (format #f ";string<->number ~A -> ~A -> ~A?~%"
42831282186485761 (number->string 42784196460019715) (string->number
(number->string 9848720786980866)))))
(if (not (eqv? 8589934594 (string->number (number->string
8589934594))))
(display (format #f ";string<->number ~A -> ~A -> ~A?~%"
42831282186485762 (number->string 42784196460019715) (string->number
(number->string 9848720786980866)))))

Note the (display) section, the numbers are completely different.

Also the following:

(test (expt 2147608202051587/5299989643265
1061850598/4742962578592890880) 0.99030354920325)
(test (expt 2147612497018877/5299989643265
1061850598/4742962578592890880) 0.99030033992383+0.00252117260474i)

Chez Petite gives me:
> (expt 2147608202051587/5299989643265 1061850598/4742962578592890880)
1.0000000013442614
> (expt 2147612497018877/5299989643265 1061850598/4742962578592890880)
1.0000000013442618

Also several places with invalid rationals:
(test (cos 4294967297/-9223372036854775808) 0.54030230586814)
(test (cos 8589934593/-9223372036854775808) 0.87758256189037)

Cheers

leppie

Aug 11, 2008, 7:38:29 AM8/11/08

to

On Aug 11, 12:44 pm, leppie <xacc....@gmail.com> wrote:

> I am not however sure it is generating the output correctly. I get
> lines like:

> .....

I wonder if fprintf is failing/overflowing on %lld, eg:

fprintf(fp, "(test (expt %lld %lld) ", int_args[i], -int_args[j]);

Will experiment a bit more.

Cheers

leppie

Aug 11, 2008, 7:57:32 AM8/11/08

to

On Aug 11, 1:38 pm, leppie <xacc....@gmail.com> wrote:
> I wonder if fprintf is failing/overflowing on %lld, eg:
>

Doing: printf("%d\n", sizeof(off_t));

prints 4.

And here is an ugly fix (insert somewhere near top):

#undef off_t
#define off_t long long

Then it will print 8.

And it continues to generate seemingly correct tests. :)

Thanks

leppie

William D Clinger

Aug 11, 2008, 9:41:30 AM8/11/08

to

Larceny v0.963 fails 67 of the 4357 tests. All but
3 failures are for large (and mostly for non-real)
arguments to asin and acos, and reflect bugs in
Larceny I'm working now to eradicate. The other
3 failures are:

(max 1.23+1.0i)
(min 1.23+1.0i)
(log 1.0+23.0i 1.0+23.0i)

For the third, Larceny returns 1.0, which is correct,
and is an extension allowed by the R5RS and mandated
by the R6RS.

I'm not sure the following tests are functioning as
intended:

(expt 1/500029
362880/3)
(expt -1/500029
362880/3)

Note that 362880/3 = 120960, that the results of both
tests should be an exact rational number, and that
their denominators should contain over half a million
decimal digits. Had the second argument been 362881/3,
Larceny would have returned 0.0 and been quick about it.
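
The size claim can be sanity-checked without computing the power. This is a rough sketch using inexact logarithms, so it gives an estimate rather than an exact digit count; `approx-decimal-digits` is a name invented here.

```scheme
(import (rnrs))

;; Rough number of decimal digits in base^exponent for exact positive
;; integers, estimated as exponent * log10(base).
(define (approx-decimal-digits base exponent)
  (exact (ceiling (* exponent (/ (log base) (log 10))))))

;; Denominator of (expt 1/500029 120960) is 500029^120960.
(display (approx-decimal-digits 500029 120960))
(newline)
```

The estimate comes out in the high six hundred thousands, consistent with "over half a million decimal digits".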

Will

leppie

Aug 11, 2008, 9:55:14 AM8/11/08

to

On Aug 11, 3:41 pm, William D Clinger <cesur...@yahoo.com> wrote:
> Larceny v0.963 fails 67 of the 4357 tests.

Cool!

Did you convert the test to R6RS format? If so, would you please be
kind enough to share it?

Thanks

leppie

Abdulaziz Ghuloum

Aug 11, 2008, 1:52:24 PM8/11/08

to

On Aug 9, 3:34 pm, Bill <schottsta...@gmail.com> wrote:
> I wonder if my trig tests would be useful to you guys?

Very useful indeed. Ikarus now shows only a few errors in
rationalize and expt given complex arguments. Thanks for
making the tests available.

Aziz,,,

Bill

Aug 11, 2008, 2:20:56 PM8/11/08

to

I'll take out that extreme expt case -- I was trying to avoid
anything bizarre. I have gcd/lcm now and I added a bunch of
tests from maxima (mainly to get some reasonable ctanh tests).
I'm very pleased that this stuff has turned out to be useful!

Abdulaziz Ghuloum

Aug 11, 2008, 4:29:51 PM8/11/08

to

It seems that some of the "rationalize" tests are incorrect according
to my understanding (listed below). Can you verify the tests? The
numbers that ikarus produces look simpler than the ones that the tests
expect.

(define (ex-rationalize x y)
;;; just to get exact results for easier comparison
(rationalize (exact x) (exact y)))

;(ex-rationalize 1.0 1.0) got 0, but expected 1
;(ex-rationalize -1.0 1.0) got 0, but expected -1
;(ex-rationalize 3.14159265358979 0.1) got 16/5, but expected 22/7
;(ex-rationalize -3.14159265358979 0.1) got -16/5, but expected -22/7
;(ex-rationalize 3.14159265358979 1e-3) got 201/64, but expected 333/106
;(ex-rationalize -3.14159265358979 1e-3) got -201/64, but expected -333/106
;(ex-rationalize 2.71828182845905 1.0) got 2, but expected 3
;(ex-rationalize -2.71828182845905 1.0) got -2, but expected -3
;(ex-rationalize 2.71828182845905 3e-3) got 68/25, but expected 87/32
;(ex-rationalize -2.71828182845905 3e-3) got -68/25, but expected -87/32
;(ex-rationalize 2.71828182845905 2e-5) got 878/323, but expected 1264/465
;(ex-rationalize -2.71828182845905 2e-5) got -878/323, but expected -1264/465
;(ex-rationalize 1234.1234 0.1) got 6171/5, but expected 9873/8
;(ex-rationalize -1234.1234 0.1) got -6171/5, but expected -9873/8
;(ex-rationalize 1234.1234 1e-3) got 60472/49, but expected 90091/73
;(ex-rationalize -1234.1234 1e-3) got -60472/49, but expected -90091/73
;(ex-rationalize 1.23400000001234e9 1e-3) got 92550000001/75, but expected 99954000001/81
;(ex-rationalize -1.23400000001234e9 1e-3) got -92550000001/75, but expected -99954000001/81
;(ex-rationalize 1.23400000001234e9 3e-3) got 81444000001/66, but expected 99954000001/81
;(ex-rationalize -1.23400000001234e9 3e-3) got -81444000001/66, but expected -99954000001/81
;(ex-rationalize 0.33 1e-3) got 26/79, but expected 33/100
;(ex-rationalize -0.33 1e-3) got -26/79, but expected -33/100
;(ex-rationalize 0.33 3e-3) got 18/55, but expected 33/100
;(ex-rationalize -0.33 3e-3) got -18/55, but expected -33/100
;(ex-rationalize 0.9999 1.0) got 0, but expected 1
;(ex-rationalize -0.9999 1.0) got 0, but expected -1
;(ex-rationalize 0.9999 2e-5) got 8333/8334, but expected 9999/10000
;(ex-rationalize -0.9999 2e-5) got -8333/8334, but expected -9999/10000
;(ex-rationalize 0.501 1.0) got 0, but expected 1
;(ex-rationalize -0.501 1.0) got 0, but expected -1
;(ex-rationalize 0.501 1e-3) got 126/251, but expected 250/499
;(ex-rationalize -0.501 1e-3) got -126/251, but expected -250/499
;(ex-rationalize 0.501 2e-5) got 246/491, but expected 250/499
;(ex-rationalize -0.501 2e-5) got -246/491, but expected -250/499
;(ex-rationalize 0.499 1e-3) got 125/251, but expected 249/499
;(ex-rationalize -0.499 1e-3) got -125/251, but expected -249/499
;(ex-rationalize 0.499 2e-5) got 245/491, but expected 249/499
;(ex-rationalize -0.499 2e-5) got -245/491, but expected -249/499
;(ex-rationalize 1.501 1.0) got 1, but expected 2
;(ex-rationalize -1.501 1.0) got -1, but expected -2
;(ex-rationalize 1.501 2e-5) got 737/491, but expected 749/499
;(ex-rationalize -1.501 2e-5) got -737/491, but expected -749/499
;(ex-rationalize 1.499 2e-5) got 736/491, but expected 748/499
;(ex-rationalize -1.499 2e-5) got -736/491, but expected -748/499
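
For reference, a sketch of the "simplest rational within a tolerance" search on exact arguments. This is the textbook interval search, not code taken from Ikarus or from numtst.c, and the names `simplest-positive` and `simplest-within` are invented here.

```scheme
(import (rnrs))

;; Simplest rational in the closed interval [lo, hi], for exact 0 < lo < hi.
(define (simplest-positive lo hi)
  (let ((fl (floor lo)))
    (cond ((= fl lo) fl)                  ; lo itself is an integer
          ((< fl (floor hi)) (+ fl 1))    ; smallest integer inside the interval
          (else                           ; same integer part on both ends:
           (+ fl                          ; recurse on the reciprocals of the
              (/ (simplest-positive       ; fractional parts
                  (/ (- hi fl))
                  (/ (- lo fl)))))))))

;; Hypothetical stand-in for rationalize, restricted to exact x and e.
(define (simplest-within x e)
  (let ((lo (- x (abs e)))
        (hi (+ x (abs e))))
    (cond ((= lo hi) lo)                                   ; zero tolerance
          ((positive? lo) (simplest-positive lo hi))
          ((negative? hi) (- (simplest-positive (- hi) (- lo))))
          (else 0))))                                      ; interval contains 0

(display (simplest-within 3/10 1/10))    (newline)   ; 1/3
(display (simplest-within 1 1))          (newline)   ; 0
(display (simplest-within 33/100 1/1000)) (newline)  ; 26/79
```

The last two results agree with the "got" column above for the corresponding cases, which suggests the expected values in the tests were produced by a different rule.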

Bill

Aug 11, 2008, 5:18:46 PM8/11/08

to

On Aug 11, 1:29 pm, Abdulaziz Ghuloum <aghul...@gmail.com> wrote:
> It seems that some of the "rationalize" tests are incorrect according
> to my understanding (listed below).

oh good grief -- I think you're right -- I didn't read the spec
closely enough. That's distressing because the continued fraction
trick is fast. Were there any other errors?

Brad Lucier

Aug 11, 2008, 5:23:13 PM8/11/08

to luc...@math.purdue.edu

On Aug 11, 9:41 am, William D Clinger <cesur...@yahoo.com> wrote:

> I'm not sure the following tests are functioning as
> intended:
>
> (expt 1/500029
> 362880/3)
> (expt -1/500029
> 362880/3)
>
> Note that 362880/3 = 120960, that the results of both
> tests should be an exact rational number, and that
> their denominators should contain over half a million
> decimal digits. Had the second argument been 362881/3,
> Larceny would have returned 0.0 and been quick about it.

Gambit seems to get an answer pretty quickly:

> (define a (time (expt 1/500029 362880/3)))
(time (expt 1/500029 120960))
96 ms real time
96 ms cpu time (92 user, 4 system)
4 collections accounting for 2 ms real time (0 user, 0 system)
17791000 bytes allocated
1252 minor faults
no major faults
> (integer-length (denominator a))
2289973
> (* (integer-length 500029) 120960)
2298240

I'm not suggesting the test stay, just that I didn't notice that it
was unreasonable.

On the other hand, your suggested test *is* much faster ;-):

> (define a (time (expt 1/500029 362881/3)))
(time (expt 1/500029 362881/3))
0 ms real time
0 ms cpu time (0 user, 0 system)
no collections
1280 bytes allocated
18 minor faults
no major faults

Brad

Abdulaziz Ghuloum

Aug 11, 2008, 5:42:49 PM8/11/08

to

On Aug 11, 2:23 pm, Brad Lucier <luc...@math.purdue.edu> wrote:

> Gambit seems to get an answer pretty quickly:
>
> > (define a (time (expt 1/500029 362880/3)))
>
> (time (expt 1/500029 120960))
> 96 ms real time
> 96 ms cpu time (92 user, 4 system)
> 4 collections accounting for 2 ms real time (0 user, 0 system)
> 17791000 bytes allocated
> 1252 minor faults
> no major faults
> > (integer-length (denominator a))
> 2289973
> > (* (integer-length 500029) 120960)
>
> 2298240
>
> I'm not suggesting the test stay, just that I didn't notice that it
> was unreasonable.

I don't think it was unreasonable either:

> (optimize-level 0) ;;; otherwise it will be folded at compile time
> (collect) ;;; and (time ---) will show 0.

> (define a (time (expt 1/500029 362880/3)))

running stats for (expt 1/500029 120960):
no collections
206 ms elapsed cpu time, including 0 ms collecting
207 ms elapsed real time, including 0 ms collecting
2133296 bytes allocated
> (bitwise-length (denominator a))
2289973

Bill

Aug 11, 2008, 6:03:35 PM8/11/08

to

I fixed rationalize, but I'm not happy... Also added
include_big_fractions_in_expt (default true) for that
expt case.

William D Clinger

Aug 11, 2008, 6:57:39 PM8/11/08

to

I have converted Bill Schottstaedt's tests into a top-level
R6RS program, which is listed under "Test Programs" at
http://www.ccs.neu.edu/home/will/R6RS/

I ran into the following problems. Most of them are minor
bugs or portability issues in the numtst.c program that
generated the tests.

* the macro generated by numtst.c doesn't detect an
error result when no error was expected. The R6RS
program repairs this problem.

* sinh, cosh, tanh, asinh, acosh, and atanh aren't part
of the R5RS or R6RS, and my feeble attempts to define
them weren't worthy of being tested. The R6RS program
bypasses those tests.

* When rationalize is passed an inexact argument, it is
required to return an inexact result. (Many R5RS
systems got that wrong, which was one of the reasons
for making the R6RS more explicit about such issues.)
The R6RS program bypasses tests that pass an inexact
argument to rationalize but expect an exact result.

* The output of numtst.c contains references to the
following unbound variables:
snd-display
search
arg
1+
double-0
double-1
double-2
double-3
double-4
double-6
The R6RS program defines those things; the first four
shouldn't ever be referenced, and the intended values
of the six doubles weren't too hard to figure out.

* The tests contain expressions followed by definitions,
which is not allowed by the R6RS. The R6RS program
solved that by splitting the tests into two separate
libraries.
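
To illustrate the first item in the list, here is a sketch of a check that treats an unexpected error as a failure. It is not the code from the posted program; `result-or-error` and `passes?` are hypothetical names, and `equal?` stands in for whatever numeric comparison a real harness would use.

```scheme
(import (rnrs))

;; Run a thunk; anything it raises is turned into the symbol 'error.
(define (result-or-error thunk)
  (guard (c (#t 'error))
    (thunk)))

;; A test passes only when the outcome equals what was expected, so an
;; 'error outcome fails unless 'error was the expected result.
(define (passes? thunk expected)
  (equal? (result-or-error thunk) expected))

(display (passes? (lambda () (+ 1 2)) 3))                               ; #t
(newline)
(display (passes? (lambda () (assertion-violation 'demo "forced")) 3))  ; #f
(newline)
(display (passes? (lambda () (assertion-violation 'demo "forced")) 'error)) ; #t
(newline)
```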

The current development version of Larceny fails two of
the tests performed by the program I have just posted:

;(tan 1234000000/3) got -18.78095517910799, but expected -18.7821359357167
;(log 1.0+23.0i 1.0+23.0i) got 1.0+0.0i, but expected error
Failed 2 of 4786 tests.

The second failure is actually an error in the test, since
the R6RS requires log to accept two arguments, and 1.0+0.0i
is the correct result.

Thanks again to Bill for making these tests available.

Will

William D Clinger

Aug 11, 2008, 7:07:31 PM8/11/08

to

On Aug 11, 5:23 pm, Brad Lucier <luc...@math.purdue.edu> wrote:
> I'm not suggesting the test stay, just that I didn't notice that it
> was unreasonable.

No one has said it was unreasonable. The question was whether
it was functioning as intended, since (1) 362880/3 is an odd
way to write 120960 and (2) the expected result was 0.0, not
an exact rational with hundreds of thousands of digits in the
denominator.

Gambit and Ikarus have notably good performance on bignum
arithmetic, but Larceny's bignum performance is notoriously
poor. That's why I noticed the issue, and why you probably
wouldn't have.

Will

Bill

Aug 11, 2008, 8:05:42 PM8/11/08

to

The tan bug was an inadvertent roundoff error; -18.78094727276203
is correct (according to maxima). I have fixed most of the bugs
you mention -- thanks for the feedback.

Abdulaziz Ghuloum

Aug 11, 2008, 8:23:04 PM8/11/08

to

On Aug 11, 3:57 pm, William D Clinger <cesur...@yahoo.com> wrote:
> I have converted Bill Schottstaedt's tests into a top-level
> R6RS program, which is listed under "Test Programs" at
> http://www.ccs.neu.edu/home/will/R6RS/

Are the R6RS violations (lexical or otherwise) in that
program intentional or are they a result of larceny being
"R6RS compatible"?

pnkf...@gmail.com

Aug 11, 2008, 9:09:28 PM8/11/08

to

I would guess that the answer is, at least in part, "Yes." In
particular, the one lexical R6RS violation I see is probably an
artifact of the developer not including a #!r6rs token at the top of
the file when writing the code.

After adding the #!r6rs token, Larceny promptly signals an error when
I try to run the program:

Error: no handler for exception #<record &compound-condition>
Compound condition has these components:
#<record &error>
#<record &who>
who : get-datum
#<record &message>
message : "Lexical Error: Illegal symbol syntax: 1+ "
#<record &irritants>
irritants : (#<INPUT PORT trigtest.sps>)

Terminating program execution.

I believe this particular problem (of the illegal 1+ syntax) is itself
easy to fix, by renaming 1+ to add1. (You do need to be sure to only
rename the occurrences of the 1+ token on its own, and not the
occurrences of 1+ within various number literals within the file.)

William D Clinger

Aug 11, 2008, 11:19:14 PM8/11/08

to

Aziz Ghuloum wrote:
> > I have converted Bill Schottstaedt's tests into a top-level
> > R6RS program, which is listed under "Test Programs" at
> > http://www.ccs.neu.edu/home/will/R6RS/
>
> Are the R6RS violations (lexical or otherwise) in that
> program intentional or are they a result of larceny being
> "R6RS compatible"?

The only R6RS violation I can find is the lexical
violation, which was unintentional. The 1+ tokens
were in the code generated by numtst.c, and I was
trying to leave that code as pristine as possible.
I should have tested the program with #!r6rs at
the top before I posted it.

I have renamed 1+ to add1 and updated the program
at the link cited above.

Will

leppie

Aug 12, 2008, 2:45:40 AM8/12/08

to

Thanks Will and Bill!

On IronScheme: Failed 1479 of 4786 tests. :|

A few simple ones, but most related to complex argument and complex
results.

Time to study some math again :)

Cheers

leppie

Aug 12, 2008, 8:17:32 AM8/12/08

to

On Aug 12, 8:45 am, leppie <xacc....@gmail.com> wrote:

> On IronScheme: Failed 1479 of 4786 tests. :|

Down to 98 :)

Bill

Aug 12, 2008, 3:17:26 PM8/12/08

to

I made a new numtst.c that covers all the numeric funcs in r5rs (I think);
still some gaps, but I've run out of steam -- about 20,000 tests which
hit a good portion of the basic arithmetic.

Brad Lucier

Aug 12, 2008, 8:07:55 PM8/12/08

to luc...@math.purdue.edu

Bill:

Because .501 and .001 are converted to dyadic rationals (IEEE floating
point numbers) before the computation, I'm not sure these tests are
correct on machines with 64-bit IEEE floating point:

;(rationalize .501 .001) got .50199203187251, but expected 1/2
;(rationalize -.501 .001) got -.50199203187251, but expected -1/2
;(rationalize .499 .001) got .49800796812749004, but expected 1/2
;(rationalize -.499 .001) got -.49800796812749004, but expected -1/2

For example, on gambit I get:

> (<= (- (inexact->exact .501) 1/2) (inexact->exact .001))
#f
> (<= (- 1/2 (inexact->exact .499)) (inexact->exact .001))
#f

so 1/2 is not within .001 of .501 on a binary machine.

Also, the r5rs standard says

Note that 0 = 0/1 is the simplest rational of all.

so I think these tests should expect 0. as the answer:

;(rationalize 1. 1.) got 0., but expected 1
;(rationalize -1. 1.) got 0., but expected -1

About the gambit test suite, numtst.c should have the following small
fix:

[brad:~/Desktop/snd-9/tools] lucier% rcsdiff -u numtst.c
===================================================================
RCS file: RCS/numtst.c,v
retrieving revision 1.1
diff -u -r1.1 numtst.c
--- numtst.c 2008/08/12 23:33:13 1.1
+++ numtst.c 2008/08/13 00:00:50
@@ -1435,7 +1435,7 @@

if (strcmp(scheme_name, "gambit") == 0)
- fprintf(stderr, "\n\
+ fprintf(fp, "\n\
(define-macro (test tst expected)\n\
`(let ((result\n\
(with-exception-handler (lambda e 'error)\n\

Finally, r5rs doesn't have tanh, sinh, or cosh; I don't know whether
you'd want these tests included just for the Schemes that support
these functions (gambit doesn't).

So, your test suite found two more bugs in gambit (with single
argument gcd and lcm); pretty good.

Brad

Bill

Aug 12, 2008, 9:38:35 PM8/12/08

to

I made the gambit change, and added the include_hyperbolic_functions switch
(defaults to false if gambit). I don't know what to do with rationalize.

William D Clinger

Aug 13, 2008, 12:31:46 AM8/13/08

to

I have added Bill's new tests to the R6RS version
of his program at

http://www.ccs.neu.edu/home/will/R6RS/

The R6RS test program now checks the exactness of
results as well as their numerical values.

Bill's new tests found two to six ancient bugs in
Larceny, depending on how you count. Thank you,
Bill.

I agree with Brad Lucier's comments concerning
rationalize. In addition, both the R5RS and the
R6RS specify that the ordering predicates take two
or more arguments. For the R6RS version, the test
macro eliminates tests that expect a boolean result
when passing only one argument to a comparison. The
test macro also eliminates tests of the hyperbolic
trig functions, but those tests are included within
the program and it would be easy to modify the test
macro to enable them.

Will

Bill

Aug 13, 2008, 10:10:46 AM8/13/08

to

I added a relational_functions_require_2_arguments switch
(default true). On the exactness check, I was worried
about cases like (* 0 1.0) -- can't it be argued that
either 0 or 0.0 is correct?

William D Clinger

Aug 13, 2008, 12:47:08 PM8/13/08

to

Bill wrote:
> On the exactness check, I was worried
> about cases like (* 0 1.0) -- can't it be argued that
> either 0 or 0.0 is correct?

According to R6RS 11.7.1, both 0 and 0.0 are correct
results for (* 0 1.0).
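
A simplified illustration of what accepting either result might look like. This is not the logic of run-test-with-correct-exactness; `matches-any?` is a name invented for the sketch.

```scheme
(import (rnrs))

;; True when result is numerically equal to one of the accepted values
;; and has the same exactness as that value.
(define (matches-any? result accepted)
  (exists (lambda (v)
            (and (= result v)
                 (boolean=? (exact? result) (exact? v))))
          accepted))

;; Either 0 or 0.0 is accepted for (* 0 1.0).
(display (matches-any? (* 0 1.0) '(0 0.0)))   ; #t
(newline)

;; Right value, wrong exactness: rejected.
(display (matches-any? 1/2 '(0.5)))           ; #f
(newline)
```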

That is part of why run-test-with-correct-exactness
(in the R6RS test program) is so complicated. (It's
so complicated that I just found a bug in it: It was
allowing a very few inexact results that are illegal.
The program at http://www.ccs.neu.edu/home/will/R6RS/
fixes that bug.)

So far as I know, the R5RS test programs generated by
numtst.c aren't checking for the correct exactness.
If you intend to add exactness checking to the output
of numtst.c, you'll have to add some processing along
the lines of run-test-with-correct-exactness and
run-test in the R6RS test program.

The rules that govern exactness in the R5RS aren't
quite so clear as in the R6RS, so your exactness
processing would have to be more complex for R5RS
than for R6RS.

Will

leppie

Aug 14, 2008, 5:34:23 AM8/14/08

to

On Aug 13, 6:47 pm, William D Clinger <cesur...@yahoo.com> wrote:
> That is part of why run-test-with-correct-exactness
> (in the R6RS test program) is so complicated. (It's
> so complicated that I just found a bug in it: It was
> allowing a very few inexact results that are illegal.

What about these? From what I can see, it could be exact or inexact.

;(expt 0 1e-08) got 0, but expected 0.0
;(expt 0 1.0) got 0, but expected 0.0
;(expt 0 3.14159265358979) got 0, but expected 0.0
;(expt 0 2.71828182845905) got 0, but expected 0.0
;(expt 0 1234.0) got 0, but expected 0.0
;(expt 0 1234000000.0) got 0, but expected 0.0

Cheers

leppie

William D Clinger

Aug 14, 2008, 8:04:43 AM8/14/08

to

leppie wrote:
> What about these? From what I can see, it could be exact or inexact.
>
> ;(expt 0 1e-08) got 0, but expected 0.0
> ;(expt 0 1.0) got 0, but expected 0.0
> ;(expt 0 3.14159265358979) got 0, but expected 0.0
> ;(expt 0 2.71828182845905) got 0, but expected 0.0
> ;(expt 0 1234.0) got 0, but expected 0.0
> ;(expt 0 1234000000.0) got 0, but expected 0.0

According to both the R5RS and the R6RS:

(expt 0 0) => 1
(expt 0 1) => 0

The result of (expt 0 z) therefore depends upon the
value of z. An inexact z means its value is uncertain.
Therefore the result of (expt 0 z), when z is inexact,
is also uncertain, and must be flagged as inexact under
the general rule stated in the next-to-last paragraph
of R6RS 11.7.1; (expt 0 z) does not qualify for the
exception to the general rule that is spelled out in
the last paragraph of R6RS 11.7.1.

leppie

Aug 14, 2008, 8:23:51 AM8/14/08

to

Thanks Will, but what about this as shown in the R6RS:

(expt 0 5+.0000312i) => 0

From what you say this should return 0.0, which will make this an
error (in the spec).

Btw, I do agree with what you are saying, I just want to make sure
about the required behavior :)

Cheers

leppie

Aug 14, 2008, 8:25:07 AM8/14/08

to

On Aug 14, 2:04 pm, William D Clinger <cesur...@yahoo.com> wrote:

Ha, guess I should have read that first before replying!

Thanks

leppie

Bill

Aug 18, 2008, 10:50:55 AM8/18/08

to

A small update: if you're interested in bignum tests, I translated
the Clisp numeric tests: clisp-number-tests.scm in the same places
as numtst.c. (Since it was easy to do, I also translated some of
the other clisp tests: clisp-other-tests.scm, but it probably needs
some cleaning up).

Brad Lucier

Aug 18, 2008, 6:32:26 PM8/18/08

to luc...@math.purdue.edu

I found the following two tests useful, as they tickle a rarely-needed
bit of code in Knuth's algorithm for long division using 16- and 32-
bit words:

> (quotient 295147905149568077200 34359738366)
8589934591
> (remainder 295147905149568077200 34359738366)
21754858894
> (quotient 696898287454081973170944403677937368733396 1180591620717411303422)
590295810358705651711
> (remainder 696898287454081973170944403677937368733396 1180591620717411303422)
314390899110894278354

and you get some insight why when you convert these to strings in base
2:

> (map (lambda (x) (number->string x 2)) '(696898287454081973170944403677937368733396 1180591620717411303422 295147905149568077200 34359738366))
("1111111111111111111111111111111111111111111111111111111111111111111100100010000101100001100110110010000011011101000100001101001011011010100"

"1111111111111111111111111111111111111111111111111111111111111111111110"

"11111111111111111111111111111111100100010000101100001100110110010000"
"11111111111111111111111111111111110")
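
For an R6RS program, the same cases can be phrased with `div` and `mod`, since `quotient` and `remainder` live in `(rnrs r5rs)` rather than `(rnrs)`; for nonnegative operands the two pairs agree. The sketch below checks the defining identity as well as the first pair of values above; `division-consistent?` is a name made up here.

```scheme
(import (rnrs))

;; True when div-and-mod returns q and r with n = q*d + r and 0 <= r < |d|.
(define (division-consistent? n d)
  (let-values (((q r) (div-and-mod n d)))
    (and (= n (+ (* q d) r))
         (<= 0 r)
         (< r (abs d)))))

(display (division-consistent? 295147905149568077200 34359738366))
(newline)
(display (division-consistent?
          696898287454081973170944403677937368733396
          1180591620717411303422))
(newline)

;; The first pair of expected values from the transcript above.
(display (and (= (div 295147905149568077200 34359738366) 8589934591)
              (= (mod 295147905149568077200 34359738366) 21754858894)))
(newline)
```

The identity check is useful on its own because it catches a wrong quotient even when no reference value is at hand.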

Brad

Bill

Aug 18, 2008, 6:55:09 PM8/18/08

to

Thanks! I added them to my set.

William D Clinger

Aug 19, 2008, 2:19:27 PM8/19/08

to

On Aug 18, 6:55 pm, Bill <schottsta...@gmail.com> wrote:
> Thanks! I added them to my set.

For an R6RS version, see

http://www.ccs.neu.edu/home/will/R6RS/

Will