You can find the test suite in the PLT Scheme SVN repository:
http://svn.plt-scheme.org/plt/trunk/collects/tests/r6rs/
So far, we've tried to get basic coverage of the standard: using each
function and syntactic form at least once, and trying the interesting
input cases for each.
The content of the "README.txt" file follows. As you can see, we
haven't had much success running the test suite on some other
implementations. For Ikarus and Larceny, the problems are not merely
how to load the libraries that contain the tests; the test suite's
organization prevents it from loading when an implementation is
missing small features. I'm not sure how to improve that without
relying on `eval' (and the content below explains why I'd prefer to
avoid `eval') or breaking up the tests into really small sets.
Any suggestions?
======================================================================
Files and libraries
======================================================================
Files that end ".sps" are R6RS programs. The main one is "main.sps",
which runs all the tests.
Files that end ".sls" are R6RS libraries. For example, "base.sls" is a
library that implements `(tests r6rs base)', which is a set of tests
for `(rnrs base)'. Many R6RS implementations will auto-load ".sls"
files if you put the directory of tests in the right place.
In general, for each `(rnrs <id> ... <id>)' in the standard:
* There's a library of tests "<id>/.../<id>.sls". It defines and
exports a function `run-<id>-...<id>-tests'.
* There's a program "run/<id>/.../<id>.sps" that imports
"<id>/.../<id>.sls", runs the tests, and reports the results.
And then there's "main.sps", which runs all the tests (as noted
above). Also, "test.sls" implements `(tests r6rs test)', which
implements the testing utilities that are used by all the other
libraries.
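
As a rough illustration of the layout described above, the following is a minimal sketch of what a test set and its runner boil down to. It is written as a single R6RS top-level program rather than as separate ".sls" and ".sps" files, and `test', `run-demo-tests', and the counters are hypothetical stand-ins, not the actual contents of "test.sls".

```scheme
(import (rnrs))

;; Hypothetical stand-ins for the utilities that "test.sls" provides.
(define checked 0)
(define failures '())

(define-syntax test
  (syntax-rules ()
    [(_ expr expected)
     (let ([got expr])
       (set! checked (+ checked 1))
       (unless (equal? got expected)
         (set! failures
               (cons (list 'expr got expected) failures))))]))

;; One procedure that runs a whole set of tests, in the style of
;; `run-<id>-...<id>-tests'.
(define (run-demo-tests)
  (test (car '(1 2)) 1)
  (test (string-append "a" "b") "ab")
  (test (vector-ref '#(1 2 3) 1) 2))

(define (show-test-results)
  (display checked)
  (display " tests run, ")
  (display (length failures))
  (display " failed")
  (newline))

(run-demo-tests)
(show-test-results)
```

Because every test sits inside one procedure body, a single unsupported binding or syntactic form is enough to keep the whole set from loading.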
======================================================================
Limitations and feedback
======================================================================
One goal of this test suite is to avoid using `eval' (except when
specifically testing `eval'). Avoiding `eval' makes the test suite as
useful as possible to ahead-of-time compilers that implement `eval'
with a separate interpreter. A drawback of the current approach,
however, is that if an R6RS implementation doesn't supply one binding
or does not support a bit of syntax used in a set of tests, then the
whole set of tests fails to load.
A related problem is that each set of tests is placed into one
function that runs all the tests. This format creates a block of code
that is much larger than in a typical program, which might give some
compilers trouble.
In any case, reports of bugs (in the tests) and new tests would be
very much appreciated. File either as a PLT Scheme bug report at
======================================================================
Hints on running the tests
======================================================================
Ikarus
------
Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"
cd <somewhere>
ikarus --r6rs-script tests/r6rs/run.sps
or run an individual library's test, such as "run/program.sps" as
cd <somewhere>
ikarus --r6rs-script tests/r6rs/run/program.sps
As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load,
mostly because condition names like &error cannot be used as
expressions.
Larceny
-------
Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"
larceny -path <somewhere> -r6rs -program run.sps
or run an individual library's test, such as "run/program.sps" as
larceny -path <somewhere> -r6rs -program run/program.sps
As of Larceny 0.962, many test suites (such as "base.sls") take too
long and use too much memory to load on our machine; probably the test
functions are too big.
PLT Scheme
----------
If you get an SVN-based or the "Full" nightly build, then these tests are
in a `tests/r6rs' collection already. You can run all of the tests using
mzscheme -l tests/r6rs/run.sps
and so on.
Otherwise, install this directory as a `tests/r6rs' collection,
perhaps in the location reported by
(build-path (find-system-path 'addon-dir)
(version) "collects"
"tests" "r6rs")
As of PLT Scheme 4.0.2.5, two tests fail. They correspond to
documented non-conformance with R6RS.
Ypsilon
-------
[If there's a library-autoload mechanism, we didn't figure it
out. Better ideas are welcome...]
Load the library declarations that you're interested in. For `(rnrs
<id> ... <id>)':
* Load "test.sls"
* Load "<id>/...<id>.sls"
* Eval `(import tests r6rs <id> ... <id>)'
* Eval `(run-<id>-...<id>-tests)'
* Eval `(import tests r6rs test)'
* Eval `(show-test-results)'
That would be because R6RS doesn't require it to work as an
expression. I was confused about whether `record-type-descriptor'
is required and even whether it's built into `condition-predicate'.
Thanks to Aziz for helping so quickly to sort out this test-suite
bug! Most of the tests now load into Ikarus.
Matthew
Is it ever! Thank you, Matthew. This is great.
From the README.txt file:
> As of Larceny 0.962, many test suites (such as "base.sls") take too
> long and use too much memory to load on our machine; probably the test
> functions are too big.
I wrote a quick-and-dirty shell script that runs every
library test program individually. In Larceny v0.962,
two of the test programs appear to go into an infinite
loop, and I don't yet know why. Several others failed
to compile (for the reasons shown below), terminating
before they could begin to execute. All other test
programs ran to completion in at most 11 seconds on my
test machine. That time includes compilation.
I'm posting a short summary of the test results below.
I will determine the cause of each failed test and add
the newly discovered bugs to Larceny's bug database.
Most of them look easy to fix; writing the tests was
the hard part. Thanks again for doing that.
Will
----
Summary of results for
R6RS test suite for PLT Scheme (revision 10866)
system tested:
Larceny v0.962 (Jul 18 2008 04:26:20, precise:Posix Unix:unified)
test machine:
MacBook Pro (2.4 GHz Intel Core 2 Duo, 4 GB RAM)
Mac OS X 10.5.3
library under test            summary of outcome
==================            ==================
(rnrs arithmetic bitwise)     6 of 232 tests failed.
(rnrs arithmetic fixnums)     infinite loop during compilation?
(rnrs arithmetic flonums)     3 of 365 tests failed.
(rnrs base)                   unbound variable: angle
(rnrs bytevectors)            464 tests passed
(rnrs conditions)             unbound variable: record-type-descriptor
(rnrs control)                11 tests passed
(rnrs enums)                  2 of 26 tests failed.
(rnrs eval)                   3 tests passed
(rnrs exceptions)             1 of 10 tests failed.
(rnrs hashtables)             15 of 248 tests failed.
(rnrs io ports)               unbound variable: standard-error-port
(rnrs io simple)              56 tests passed
(rnrs lists)                  infinite loop during execution?
(rnrs mutable-pairs)          3 tests passed
(rnrs mutable-strings)        3 tests passed
(rnrs programs)               1 of 2 tests failed.
(rnrs r5rs)                   71 tests passed
(rnrs reader)                 70 tests passed
(rnrs records procedural)     21 tests passed
(rnrs records syntactic)      unbound variable: record-type-descriptor
(rnrs sorting)                1 of 4 tests failed.
(rnrs syntax-case)            24 of 95 tests failed.
(rnrs unicode)                1 of 118 tests failed.
1. I would like the test suite to provide as much information as
possible regarding conformance. Currently, one bug/error may take
the rest down.
2. I'm not that comfortable having the implementation test itself
by itself. I'd rather have both MzScheme and Ikarus testing Ikarus
instead of have only Ikarus testing itself. This also allows for
the test to be terminated should it diverge and to catch its error
code should it crash. Unfortunately, there is no portable way of
doing this.
3. This allows testing different evaluating strategies (e.g., you
can run the tests as scripts, libraries, or using eval) without
having to rewrite your tests. One can also test different
optimization levels, etc., that an implementation may provide.
4. Often times in Ikarus, I special case some primitives when I know
the values of part of their inputs. For example, (vector-ref x 4)
is compiled differently from (vector-ref '#(1 2 3) y) which is also
compiled differently from (vector-ref x y). A test-suite generator
can produce four tests from a single (vector-ref '#(1 2 3) 1). This
can produce too many tests for large number of constants but may be
useful in some cases.
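
A minimal sketch of the expansion described in point 4, assuming a hypothetical `four-way' macro: one written call is turned into four calls that differ only in which arguments are literal and which go through variables. A real generator would need some way to keep the compiler from propagating the let-bound constants; this sketch makes no attempt at that.

```scheme
(import (rnrs))

;; Hypothetical helper: expands (op a b) into the four combinations of
;; literal and variable arguments and collects the four results.
(define-syntax four-way
  (syntax-rules ()
    [(_ (op a b))
     (let ([x a] [y b])
       (list (op a b)      ; literal, literal
             (op x b)      ; variable, literal
             (op a y)      ; literal, variable
             (op x y)))])) ; variable, variable

(define variants (four-way (vector-ref '#(1 2 3) 1)))
;; Each of the four entries should be 2.
```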
Thoughts?
Has the SVN been updated yet?
Yes.
Ok.
Just in case it's not obvious: the test suite should still be
represented as code, but the code can generate test descriptions
instead of (as currently) test results.
By writing the test suite as code, I mean that functions or macros can
be used to describe test cases in ways that fit the tests. For
example, "condition.sls" uses a `test-cond' macro to generate a set of
tests for a given condition type.
(And we could still have a set of R6RS programs that run tests in a
simple way. Each program would import for expand some libraries that
generate tests, and so on.)
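
To make the idea of generating test descriptions instead of test results concrete, here is a small sketch under assumed names (`describe', `make-test', and `run-one' are hypothetical, not part of the suite). Each test becomes a data value holding the quoted expression, a thunk, and the expected value, so a separate driver can decide how to run it.

```scheme
(import (rnrs))

;; A test description is plain data: label, thunk, expected value.
(define (make-test label thunk expected)
  (vector label thunk expected))

(define-syntax describe
  (syntax-rules ()
    [(_ expr expected)
     (make-test 'expr (lambda () expr) expected)]))

;; Code still generates the tests, but nothing runs yet.
(define (demo-tests)
  (list (describe (+ 1 2) 3)
        (describe (* 2 3) 6)
        (describe (raise 'boom) 'unreachable)))

;; One possible driver: run each description in isolation, so a test
;; that raises does not take the remaining tests down with it.
(define (run-one t)
  (guard (c [else (list (vector-ref t 0) 'raised)])
    (let ([got ((vector-ref t 1))])
      (list (vector-ref t 0)
            (if (equal? got (vector-ref t 2)) 'pass 'fail)))))

(define results (map run-one (demo-tests)))
```

The same list of descriptions could instead be written out and fed to another implementation or to `eval', which is the flexibility points 2 and 3 above ask for.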
> 2. I'm not that comfortable having the implementation test itself
> by itself. I'd rather have both MzScheme and Ikarus testing Ikarus
> instead of have only Ikarus testing itself. This also allows for
> the test to be terminated should it diverge and to catch its error
> code should it crash. Unfortunately, there is no portable way of
> doing this.
>
> [...]
>
> 4. Often times in Ikarus, I special case some primitives when I know
> the values of part of their inputs. For example, (vector-ref x 4)
> is compiled differently from (vector-ref '#(1 2 3) y) which is also
> compiled differently from (vector-ref x y). A test-suite generator
> can produce four tests from a single (vector-ref '#(1 2 3) 1). This
> can produce too many tests for large number of constants but may be
> useful in some cases.
On some of the details above, I worry about pushing the test suite's
role too far. Implementations will need a lot more tests than are
useful to try to share. But if it fits into a unified suite easily
enough, so much the better.
Matthew
It seems a lot of the tests are still like that, using record names as
expressions. I have brought up this 'issue' before with Aziz, who in
turn explained the single namespace approach (I however tend to
disagree as I feel record names should not interfere, ie they are
separate).
So what is the situation here?
1. Can record names be used in expressions? [I agree with Aziz, that
it should not be allowed, is there any reason why the record name is
simply not wrapped in record-type-descriptor syntax?]
2. Should record names interfere with variable names? [I feel this
should not happen]
Cheers
leppie
Can you point me to examples?
I note that condition names are used directly in `test/exn' forms,
but the expansion of `test/exn' now wraps the condition name with
`record-type-descriptor'. It sounds like I've missed other places,
though.
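
For readers following along, the wrapping can be sketched roughly as below. The macro name `raises-condition?' is made up for this illustration and is not the suite's `test/exn'; the point is only that the condition name appears solely inside `record-type-descriptor', never as an expression.

```scheme
(import (rnrs))

;; Hypothetical macro: #t when evaluating expr raises a condition of the
;; named type, #f otherwise.
(define-syntax raises-condition?
  (syntax-rules ()
    [(_ expr condition-name)
     (let ([pred (condition-predicate
                  (record-type-descriptor condition-name))])
       (guard (c [else (pred c)])
         expr
         #f))]))

(define caught
  (raises-condition? (assertion-violation 'demo "boom") &assertion))
(define not-raised
  (raises-condition? (+ 1 2) &assertion))
```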
> 1. Can record names be used in expressions?
Not in portable code, as far as I can tell, though it seems that R6RS
allows an implementation to allow a record name as an expression.
> 2. Should record names interfere with variable names?
Sorry - I'm not sure what you mean.
Matthew
Hi again
Sorry for posting the bug here, but I see this too much!
Line 44 of hashtables.sls: should be eqv?, not eq? EVER! (this goes
for the slatex and compiler benchmarks from Larceny too)
Also in line 161 of hashtables.sls: string-ci-hash and string-hash are
not comparable, unless you define string-ci-hash as:
(define (string-ci-hash str)
(string-hash (string-downcase str)))
But that is inefficient.
IMO the test is wrong; it should compare string-ci-hash to
string-ci-hash with differently cased strings.
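
A sketch of the kind of check meant here, with a hypothetical helper name: the only portable property is that strings equal under `string-ci=?' hash alike under `string-ci-hash'.

```scheme
(import (rnrs))

;; Hypothetical helper: strings that are string-ci=? must hash alike
;; under string-ci-hash; nothing is assumed about string-hash.
(define (ci-hash-consistent? s1 s2)
  (or (not (string-ci=? s1 s2))
      (= (string-ci-hash s1) (string-ci-hash s2))))

(define consistent (ci-hash-consistent? "Hello" "hELLO"))
```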
Cheers
leppie
I think it should be `eq?' for an `eq?'-based hashtable, but I see
your point for other hashtables. I've adjusted the test to vary the
comparison based on the kind of hash table being tested.
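
Roughly, the adjusted shape looks like the sketch below. This is not the code from "hashtables.sls"; `check-table' is a hypothetical helper that takes the table constructor together with the matching comparison.

```scheme
(import (rnrs))

;; Hypothetical helper: store one key, then compare the key reported by
;; the table against the original using the table's own notion of equality.
(define (check-table make-table same?)
  (let ([ht (make-table)]
        [key (string #\k)])
    (hashtable-set! ht key 'v)
    (let ([keys (hashtable-keys ht)])
      (and (= 1 (vector-length keys))
           (same? key (vector-ref keys 0))))))

(define eq-ok (check-table make-eq-hashtable eq?))
(define eqv-ok (check-table make-eqv-hashtable eqv?))
(define string-ok
  (check-table (lambda () (make-hashtable string-hash string=?))
               string=?))
```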
> Also in line 161 of hashtables.sls: string-ci-hash and string-hash are
> not comparable, unless you define string-ci-hash as:
> (define (string-ci-hash str)
> (string-hash (string-downcase str)))
>
> But that is inefficient.
>
> IMO the test is wrong; it should compare string-ci-hash to
> string-ci-hash with differently cased strings.
Right.
These bugs are now fixed in SVN.
Thanks,
Matthew
Say I do (on Ikarus):
> cons
#<procedure cons>
> (define-record-type cons)
> cons
Unhandled exception
Condition components:
1. &who: cons
2. &message: "invalid expression"
3. &syntax:
form: cons
subform: #f
4. &trace: #<syntax cons>
Why should a record name interfere? Now it seems 'cons' is some kind
of syntax, but yet it is only usable in 2 places (record-type-
descriptor & record-constructor-descriptor).
> I think it should be `eq?' for an `eq?'-based hashtable, but I see
> your point for other hashtables. I've adjusted the test to vary the
> comparison based on the kind of hash table being tested.
Yeah, I missed that :)
In tests/r6rs/io/ports.sls, I suggest changing the last
three calls to open-file-input/output-port to use a
transcoder with the latin-1-codec instead of utf-8-codec.
The R6RS does not specify a semantics for opening a file
for input/output with a variable-width codec. The R6RS
editors who insisted upon adding textual input/output
ports to the R6RS had the Posix semantics in mind, but
the Posix semantics for input/output ports assumes a
fixed-width encoding. Since the semantics of the R6RS
input/output ports with variable-width transcoding is
implementation-dependent, I believe the R6RS allows or
should allow implementations to raise an exception when
a program attempts to open a file for mixed input/output
with a variable-width encoding. Larceny does that, and
I believe this aspect of Larceny's behavior is a feature,
not a bug. Changing the test suite to use a fixed-width
encoding here would not alter the nature of the test.
The other change I suggest would reduce the code size of
tests/r6rs/arithmetic/fixnums.sls. Its large code size
is caused by this code:
;; If you put N numbers here, it expands to N^3 tests!
(carry-tests 0
[0 1 2 -1 -2 38734 -3843 2484598
-348732487 (greatest-fixnum) (least-fixnum)])
The following change would reduce those 1331 tests to
500 without losing much in the way of test coverage:
(carry-tests 0 [0 1 2 -1 -2])
(carry-tests 0 [2 -1 -2 38734 -3843])
(carry-tests 0 [2 -1 -2 (greatest-fixnum) (least-fixnum)])
(carry-tests 0 [-3843 2484598 -348732487
(greatest-fixnum) (least-fixnum)])
Thanks again for making these tests available to other
implementors of the R6RS.
Will
Done in SVN.
> The other change I suggest would reduce the code size of
> tests/r6rs/arithmetic/fixnums.sls. Its large code size
> is caused by this code:
>
> ;; If you put N numbers here, it expands to N^3 tests!
> (carry-tests 0
> [0 1 2 -1 -2 38734 -3843 2484598
> -348732487 (greatest-fixnum) (least-fixnum)])
>
> The following change would reduce those 1331 tests to
> 500 without losing much in the way of test coverage:
>
> (carry-tests 0 [0 1 2 -1 -2])
>
> (carry-tests 0 [2 -1 -2 38734 -3843])
>
> (carry-tests 0 [2 -1 -2 (greatest-fixnum) (least-fixnum)])
>
> (carry-tests 0 [-3843 2484598 -348732487
> (greatest-fixnum) (least-fixnum)])
I'm less sold on this one. Don't we want tests combining
`(greatest-fixnum)' with 0 and with 1, for example?
In fact, in SVN, I've made it worse. The `carry-tests' form was
meant to expand to tests of `fx+/carry' and `fx-/carry' in addition
to `fx*/carry'. So, now there are 3 times as many tests, instead of
1/3 as many tests. And even if I had re-organized as above, adding the
missing operators would leave us with more tests than before.
But it does seem unlikely that 3993 tests are really necessary.
Any ideas that will shrink the set of tests enough, even with the
added operators?
Thanks,
Matthew
There's a really simple solution: instead of generating N test
expressions, just iterate over the inputs. (I was too stuck on
macros, which is a convenient way to track the original expression,
but obviously not the only way.)
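
A sketch of what iterating over the inputs can look like, assuming hypothetical helper names and checking each result against the formulas the R6RS gives for the carry operations. This is not the code that went into SVN. Candidate inputs are filtered with `fixnum?' because some of the constants may fall outside the fixnum range of a given implementation.

```scheme
(import (rnrs))

(define inputs
  (filter fixnum?
          (list 0 1 2 -1 -2 38734 -3843 2484598 -348732487
                (greatest-fixnum) (least-fixnum))))

(define checked 0)
(define failures 0)

;; Compare (op a b c) with the reference result: exact-op computes the
;; unbounded integer, which is then split with mod0 and div0.
(define (check-carry op exact-op a b c)
  (let* ([s (exact-op a b c)]
         [m (expt 2 (fixnum-width))]
         [want0 (mod0 s m)]
         [want1 (div0 s m)])
    (let-values ([(got0 got1) (op a b c)])
      (set! checked (+ checked 1))
      (unless (and (= got0 want0) (= got1 want1))
        (set! failures (+ failures 1))))))

;; The code size stays constant; only the running time grows as N^3.
(define (run-carry-tests)
  (for-each
   (lambda (a)
     (for-each
      (lambda (b)
        (for-each
         (lambda (c)
           (check-carry fx+/carry (lambda (x y z) (+ x y z)) a b c)
           (check-carry fx-/carry (lambda (x y z) (- x y z)) a b c)
           (check-carry fx*/carry (lambda (x y z) (+ (* x y) z)) a b c))
         inputs))
      inputs))
   inputs))

(run-carry-tests)
```

The trade-off is that a failure no longer carries the original source expression, so a real version would want to record the operator and arguments alongside each failure.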
Done in SVN.
Matthew
I didn't write either of those benchmarks. Spurred by
your message, I went through the R6RS version of the
latex benchmark and corrected all uses of eq? and memq
on characters. You can now obtain the corrected version
from Larceny's Trac site. I'll check in a corrected
version of the R5RS slatex benchmark after I test it.
The more important thing, of course, is to correct the
slatex program itself, which I haven't done.
With the compiler benchmark, it's harder to tell which
uses of eq?, memq, and assq need to be corrected. It
looks as though the easiest way to find out is to add
some instrumentation to the code. That's why I haven't
yet corrected the compiler benchmark, but I plan to get
it done eventually.
Will
Thanks :)
I managed to get slatex working, totally gave up on the compiler
one :)
Cheers
leppie
- Line 80 of test.sls
- Line 264 & line 266 of flonums.sls
Cheers
leppie
Fixed in SVN.
Thanks,
Matthew
I have released Ypsilon 0.9.5-update2 at:
http://code.google.com/p/ypsilon/downloads/list
It recognizes the '.sls' extension and autoloads the test suite libraries. :)
I have written a description for the "Hints on running the tests" section.
Please check.
===============================
Ypsilon 0.9.5-update2
------
Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"
cd <somewhere>
ypsilon --sitelib=. --no-letrec-check tests/r6rs/run.sps
or run an individual library's test, such as "run/program.sps" as
cd <somewhere>
ypsilon --sitelib=. --no-letrec-check tests/r6rs/run/program.sps
================================
*Because Ypsilon checks for letrec restriction violations during macro
expansion, the expression "(letrec ((x y) (y x)) 'should-not-get-here)"
in test/r6rs/base.sls raises an exception during load. I added
'--no-letrec-check' to avoid this problem.
Finally, running all tests on Ypsilon 0.9.5-update2 shows "93 of 8836
tests failed" :)
Thank you for your great work!
---
Yoshikatsu Fujita
Great! I've updated "README.txt" in SVN.
Thanks,
Matthew
> *Because Ypsilon checks for letrec restriction violations during macro
> expansion, the expression "(letrec ((x y) (y x)) 'should-not-get-here)"
> in test/r6rs/base.sls raises an exception during load. I added
> '--no-letrec-check' to avoid this problem.
I believe that the R6RS requires that this check be done at runtime
(at least sometimes). In section 11.4.6, discussing `letrec', it says,
under Implementation Responsibilities (emphasis mine):
Implementations must detect references to a <variable> during the
EVALUATION of the <init> expressions (using one particular evaluation
order and order of evaluating the <init> expressions).
There are some examples where it would be hard (or impossible) to
determine during macro expansion whether the letrec restriction will
be violated. For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
x 1)]) x) should always return 1, but the analysis is not obvious.
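
Written out as a small program, the example looks like this. Two freshly allocated pairs are never `eq?', so the branch that refers to `x' is never taken at run time, even though the reference appears textually inside the <init>.

```scheme
(import (rnrs))

;; The reference to x in the <init> is dead code at run time, so no
;; letrec violation occurs during evaluation.
(define result
  (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
                  x
                  1)])
    x))
```

An implementation that rejects this at expansion time is being stricter than the evaluation-time wording quoted above.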
sam th
That's a good test to add :)
Thank you for your input.
Now I understand why the letrec violation raises &assertion but not
&syntax!
I'm very happy to know that :)
On Jul 26, 10:25 pm, leppie <xacc....@gmail.com> wrote:
> On Jul 26, 2:59 pm, samth <sam...@gmail.com> wrote:
>
> > For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
> > x 1)]) x) should always return 1, but the analysis is not obvious.
>
> That's a good test to add :)
Yes, it is! :)
Thanks again, Matthew.
Will
--------
Summary of results for
R6RS test suite for PLT Scheme (revision 10978)
29 July 2008
system tested:
Larceny v0.963
test machine:
MacBook Pro (2.4 GHz Intel Core 2 Duo, 4 GB RAM)
Mac OS X 10.5.3
45 of 8843 tests failed.
library under test            summary of outcome
==================            ==================
(rnrs arithmetic bitwise)     232 tests passed
(rnrs arithmetic fixnums)     4355 tests passed
(rnrs arithmetic flonums)     1 of 365 tests failed.
(rnrs base)                   8 of 1995 tests failed.
(rnrs bytevectors)            464 tests passed
(rnrs conditions)             131 tests passed
(rnrs control)                11 tests passed
(rnrs enums)                  26 tests passed
(rnrs eval)                   3 tests passed
(rnrs exceptions)             10 tests passed
(rnrs hashtables)             249 tests passed
(rnrs io ports)               35 of 431 tests failed.
(rnrs io simple)              56 tests passed
(rnrs lists)                  74 tests passed
(rnrs mutable-pairs)          3 tests passed
(rnrs mutable-strings)        3 tests passed
(rnrs programs)               2 tests passed
(rnrs r5rs)                   71 tests passed
(rnrs reader)                 70 tests passed
(rnrs records procedural)     21 tests passed
(rnrs records syntactic)      53 tests passed
(rnrs sorting)                4 tests passed
(rnrs syntax-case)            1 of 96 tests failed.
(rnrs unicode)                118 tests passed
Looking good :)
I still got almost 140 in IronScheme!
Cheers
leppie
Hi Matthew
Can some of the arithmetic tests be adjusted to detect whether a
signed zero is supported? E.g.:
(log -1.0-0.0i)
=> 0.0-3.141592653589793i ; approximately
; if -0.0 is distinguished
The tests currently require you to distinguish between +0.0 and -0.0.
You could also add the following test: (test (= +nan.0 +nan.0) #f)
Cheers
leppie
Also overflow and divide by zero tests for fixnums :)
I've wrapped the `log' and `angle' tests with
(unless (eqv? 0.0 -0.0) ....)
Is that right? Are there other tests that need a wrapper?
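
For illustration, the wrapper can be packaged as a tiny macro. `when-signed-zero' is a hypothetical name, and this is only a sketch of the pattern, not the code in the suite: the body runs only on implementations where 0.0 and -0.0 are distinguishable by `eqv?'.

```scheme
(import (rnrs))

;; Hypothetical wrapper: evaluate the body only when the implementation
;; distinguishes -0.0 from 0.0.
(define-syntax when-signed-zero
  (syntax-rules ()
    [(_ body ...)
     (unless (eqv? 0.0 -0.0)
       body ...)]))

(define log-result #f)

(when-signed-zero
  ;; Tests that depend on the sign of zero would go here.
  (set! log-result (log -1.0-0.0i)))
```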
> You could also add the following test: (test (= +nan.0 +nan.0) #f)
Ok - added.
Matthew
Added and committed in SVN.
I'm not sure what to do with `fldiv' and company. The spec for
`div' says that the first argument cannot be infinite, for example,
but `fldiv' doesn't say.
Thanks,
Matthew
Not that I can see for now.
On Jul 31, 10:45 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> Added and committed in SVN.
>
> I'm not sure what to do with `fldiv' and company. The spec for
> `div' says that the first argument cannot be infinite, for example,
> but `fldiv' doesn't say.
No idea either :)
Thanks
leppie
The new version has passed all 8886 tests in R6RS test suite revision
11016. :)
I have written the description for the "Hints on running the tests"
section for Ypsilon 0.9.6 as follows. Please update README.txt.
===============================
Ypsilon 0.9.6
------
Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"
cd <somewhere>
ypsilon --sitelib=. --clean-acc tests/r6rs/run.sps
or run an individual library's test, such as "run/program.sps" as
cd <somewhere>
ypsilon --sitelib=. --clean-acc tests/r6rs/run/program.sps
================================
*I removed the '--no-letrec-check' option because it is the default
behavior in 0.9.6 and that option is deprecated.
*I added the '--clean-acc' option to force Ypsilon to clean the
auto-compile-cache and run fresh code.
Thank you again for your great test suite!
---
Yoshikatsu Fujita
(test (exact? (imag-part 0.0)) #t)
(test (exact? (imag-part 1.0)) #t)
(test (exact? (imag-part 1.1)) #t)
(test (exact? (imag-part +nan.0)) #t)
(test (exact? (imag-part +inf.0)) #t)
(test (exact? (imag-part -inf.0)) #t)
(test (zero? (imag-part 0.0)) #t)
(test (zero? (imag-part 1.0)) #t)
(test (zero? (imag-part 1.1)) #t)
(test (zero? (imag-part +nan.0)) #t)
(test (zero? (imag-part +inf.0)) #t)
(test (zero? (imag-part -inf.0)) #t)
http://www.r6rs.org/final/html/r6rs-rationale/r6rs-rationale-Z-H-2.html#node_toc_node_sec_11.6.5
Cheers
leppie
Here is another tricky one, but I am not sure if it should assert or
not.
(letrec ((a (lambda () b))(b a)) (b))
(letrec* ((a (lambda () b))(b a)) (b))
              letrec   letrec*
IronScheme:   assert   ok
Ypsilon:      assert   assert
Ikarus:       ok       ok
PetiteChez:   assert   ok        (not R6RS)
Any ideas? ;-)
Cheers
leppie
The following should assert though and does so in all but Ikarus.
(letrec ((a b)(b (lambda () a))) (b))
(letrec* ((a b)(b (lambda () a))) (b))
Cheers
leppie
Ikarus does not check on letrec[*] restriction as noted in bug report
https://bugs.launchpad.net/ikarus/+bug/216832 .
The first test must signal an assertion violation, the second
must not.
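
A small probe for reproducing the comparison on another implementation might look like the sketch below. `outcome' is a hypothetical helper, and the probe only reports what happens; an implementation that checks the restriction at expansion time may refuse to load the program at all.

```scheme
(import (rnrs))

;; Hypothetical helper: classify what happens when the thunk is called.
(define (outcome thunk)
  (guard (c [(assertion-violation? c) 'assert]
            [else 'other])
    (thunk)
    'ok))

(define letrec-outcome
  (outcome (lambda () (letrec ((a (lambda () b)) (b a)) (b)))))

(define letrec*-outcome
  (outcome (lambda () (letrec* ((a (lambda () b)) (b a)) (b)))))

(display (list letrec-outcome letrec*-outcome))
(newline)
```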
Make sure symbols are delimited in the output, so it can be reread.
(write 'a port)(write 'b port)(write 'c port)
Also a few 'silly' tests, but nevertheless required. These can be
easily overlooked if you are using some underlying runtime's type
system (eg .NET, Java).
(test (eq? 3 3.0) #f)
(test (eqv? 3 3.0) #f)
(test (equal? 3 3.0) #f)
(test (= 3 3.0) #t)
(test (eq? 3.0 3) #f)
(test (eqv? 3.0 3) #f)
(test (equal? 3.0 3) #f)
(test (= 3.0 3) #t)
;(test (eq? 3/1 3) #f)
(test (eqv? 3/1 3) #t)
(test (equal? 3/1 3) #t)
(test (= 3/1 3) #t)
(not sure about the commented test, can be either?)
Cheers
leppie
Implementations of the R6RS are required to output "abc"
for that test, with no delimiter. R6RS Library section 8.3
says "The write procedure operates in the same way as
put-datum; see section 8.2.12." R6RS Library section 8.2.12
says
The put-datum procedure merely writes the external
representation, but no trailing delimiter. If
put-datum is used to write several subsequent
external representations to an output port, care
should be taken to delimit them properly so they
can be read back in by subsequent calls to get-datum.
As was pointed out during the R6RS discussion, that
paragraph is ambiguous as to whether the implementation
or the programmer is to take "care", but the discussion
and the editors' response to formal comment 131 made
clear that it was the programmer who was to take care;
implementations are required *not* to delimit [1,2,3,4,5,6].
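
The difference is easy to see with string ports. In this sketch (helper names are hypothetical), the first string is what three bare `write' calls produce, and the second shows the programmer supplying the delimiters so the data can be read back.

```scheme
(import (rnrs))

;; Three writes with no delimiter between them.
(define undelimited
  (call-with-string-output-port
   (lambda (port)
     (write 'a port)
     (write 'b port)
     (write 'c port))))

;; The programmer takes care of delimiting each datum.
(define delimited
  (call-with-string-output-port
   (lambda (port)
     (for-each (lambda (datum)
                 (write datum port)
                 (put-char port #\space))
               '(a b c)))))

;; Read every datum back from a string.
(define (read-all str)
  (let ([in (open-string-input-port str)])
    (let loop ([acc '()])
      (let ([d (get-datum in)])
        (if (eof-object? d)
            (reverse acc)
            (loop (cons d acc)))))))
```

Reading back `undelimited' gives the single symbol abc, while `delimited' gives the three original symbols.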
> ;(test (eq? 3/1 3) #f)
> (test (eqv? 3/1 3) #t)
> (test (equal? 3/1 3) #t)
> (test (= 3/1 3) #t)
>
> (not sure about the commented test, can be either?)
The result of the commented test must be a boolean, but
can be either #t or #f.
Will
[1] http://www.r6rs.org/formal-comments/comment-131.txt
[2] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002566.html
[3] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002587.html
[4] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002588.html
[5] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002590.html
[6] http://lists.r6rs.org/pipermail/r6rs-discuss/2007-June/002591.html
Thanks :) Was not really clear to me.
>> (write 'a port)(write 'b port)(write 'c port)
> Implementations of the R6RS are required to output "abc"
> for that test, with no delimiter.
Good news for me :) Not sure about the others.
Cheers
leppie
I'm sure they would be. Thanks.
> By chance, I'm using a very similar test setup. I currently
> have about 4600 tests, and hope someday to finish the r5rs
> scheme funcs. The tests are currently generated by numtst.c
> in the Snd tools directory (at sourceforge, or available
> via anonymous ftp at ccrma-ftp). The code assumes linux
> currently, since this was just aimed at my own interests.
Can you be a little more specific about the anonymous ftp
directory? I couldn't find it at ccrma-ftp.stanford.edu.
Will
It's in the snd tarball:
Got it, and got it running in Larceny. Will report later.
Thanks!
Will
Nice testsuite, it found a bug in Gambit's complex acos. Here are the
changes needed to make it work with gambit (these are the differences
with the output of "./numtest guile"):
diff test.scm ~/Desktop/snd-9/tools/
1,2d0
< (define (1+ x)
< (+ 1 x))
11c9
< (if (char=? c #\~)
---
> (if (char=? c #~)
17c15
< (if (member c (list #\A #\D #\F))
---
> (if (member c (list #A #D #F))
21,22c19,21
< (if (char=? c #\%)
< (set! result (string-append result (string #\newline)))
---
> (if (char=? c #%)
> (set! result (string-append result (string #
> ewline)))
26,29c25,26
< (define-macro (test tst expected)
< `(let ((result
< (with-exception-handler (lambda e 'error)
< (lambda () ,tst))))
---
> (defmacro test (tst expected)
> `(let ((result (catch #t (lambda () ,tst) (lambda args 'error))))
To compile on Cygwin:
gcc -mno-cygwin -o numtest numtst.c
Else complex.h seems to be not included :|
Cheers
leppie
I am not however sure it is generating the output correctly. I get
lines like:
(if (not (eqv? 0 (string->number (number->string 0))))
(display (format #f ";string<->number ~A -> ~A -> ~A?~%"
42831282186485760 (number->string 42784196460019715) (string->number
(number->string 9848720786980866)))))
(if (not (eqv? 4294967297 (string->number (number->string
4294967297))))
(display (format #f ";string<->number ~A -> ~A -> ~A?~%"
42831282186485761 (number->string 42784196460019715) (string->number
(number->string 9848720786980866)))))
(if (not (eqv? 8589934594 (string->number (number->string
8589934594))))
(display (format #f ";string<->number ~A -> ~A -> ~A?~%"
42831282186485762 (number->string 42784196460019715) (string->number
(number->string 9848720786980866)))))
Note the (display) section, the numbers are completely different.
Also the following:
(test (expt 2147608202051587/5299989643265
1061850598/4742962578592890880) 0.99030354920325)
(test (expt 2147612497018877/5299989643265
1061850598/4742962578592890880) 0.99030033992383+0.00252117260474i)
Chez Petite gives me:
> (expt 2147608202051587/5299989643265 1061850598/4742962578592890880)
1.0000000013442614
> (expt 2147612497018877/5299989643265 1061850598/4742962578592890880)
1.0000000013442618
Also several places with invalid rationals:
(test (cos 4294967297/-9223372036854775808) 0.54030230586814)
(test (cos 8589934593/-9223372036854775808) 0.87758256189037)
Cheers
leppie
> I am not however sure it is generating the output correctly. I get
> lines like:
> .....
I wonder if fprintf is failing/overflowing on %lld, eg:
fprintf(fp, "(test (expt %lld %lld) ", int_args[i], -int_args[j]);
Will experiment a bit more.
Cheers
leppie
Doing: printf("%d\n", sizeof(off_t));
prints 4.
And here is an ugly fix (insert somewhere near top):
#undef off_t
#define off_t long long
Then it will print 8.
And it continues to generate seemingly correct tests. :)
Thanks
leppie
(max 1.23+1.0i)
(min 1.23+1.0i)
(log 1.0+23.0i 1.0+23.0i)
For the third, Larceny returns 1.0, which is correct,
and is an extension allowed by the R5RS and mandated
by the R6RS.
I'm not sure the following tests are functioning as
intended:
(expt 1/500029
362880/3)
(expt -1/500029
362880/3)
Note that 362880/3 = 120960, that the results of both
tests should be an exact rational number, and that
their denominators should contain over half a million
decimal digits. Had the second argument been 362881/3,
Larceny would have returned 0.0 and been quick about it.
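
The exactness point can be checked on a small scale without computing the huge power. In this sketch, 6/3 plays the role of 362880/3: the reader turns both into exact integers, so `expt' is given an exact rational base and an exact integer exponent.

```scheme
(import (rnrs))

;; 362880/3 is read as the exact integer 120960.
(define exponent 362880/3)

;; Same shape as the tests above, with a small exponent (6/3 is 2).
(define small (expt 1/7 6/3))
```

Here `small' is the exact rational 1/49; with the original arguments the same rule yields an exact result with an enormous denominator.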
Will
Cool!
Did you convert the test to R6RS format? If so, would you please be
kind enough to share it?
Thanks
leppie
Very useful indeed. Ikarus now shows only a few errors in
rationalize and expt given complex arguments. Thanks for
making the tests available.
Aziz,,,
(define (ex-rationalize x y)
;;; just to get exact results for easier comparison
(rationalize (exact x) (exact y)))
;(ex-rationalize 1.0 1.0) got 0, but expected 1
;(ex-rationalize -1.0 1.0) got 0, but expected -1
;(ex-rationalize 3.14159265358979 0.1) got 16/5, but expected 22/7
;(ex-rationalize -3.14159265358979 0.1) got -16/5, but expected -22/7
;(ex-rationalize 3.14159265358979 1e-3) got 201/64, but expected
333/106
;(ex-rationalize -3.14159265358979 1e-3) got -201/64, but expected
-333/106
;(ex-rationalize 2.71828182845905 1.0) got 2, but expected 3
;(ex-rationalize -2.71828182845905 1.0) got -2, but expected -3
;(ex-rationalize 2.71828182845905 3e-3) got 68/25, but expected 87/32
;(ex-rationalize -2.71828182845905 3e-3) got -68/25, but expected
-87/32
;(ex-rationalize 2.71828182845905 2e-5) got 878/323, but expected
1264/465
;(ex-rationalize -2.71828182845905 2e-5) got -878/323, but expected
-1264/465
;(ex-rationalize 1234.1234 0.1) got 6171/5, but expected 9873/8
;(ex-rationalize -1234.1234 0.1) got -6171/5, but expected -9873/8
;(ex-rationalize 1234.1234 1e-3) got 60472/49, but expected 90091/73
;(ex-rationalize -1234.1234 1e-3) got -60472/49, but expected
-90091/73
;(ex-rationalize 1.23400000001234e9 1e-3) got 92550000001/75, but
expected 99954000001/81
;(ex-rationalize -1.23400000001234e9 1e-3) got -92550000001/75, but
expected -99954000001/81
;(ex-rationalize 1.23400000001234e9 3e-3) got 81444000001/66, but
expected 99954000001/81
;(ex-rationalize -1.23400000001234e9 3e-3) got -81444000001/66, but
expected -99954000001/81
;(ex-rationalize 0.33 1e-3) got 26/79, but expected 33/100
;(ex-rationalize -0.33 1e-3) got -26/79, but expected -33/100
;(ex-rationalize 0.33 3e-3) got 18/55, but expected 33/100
;(ex-rationalize -0.33 3e-3) got -18/55, but expected -33/100
;(ex-rationalize 0.9999 1.0) got 0, but expected 1
;(ex-rationalize -0.9999 1.0) got 0, but expected -1
;(ex-rationalize 0.9999 2e-5) got 8333/8334, but expected 9999/10000
;(ex-rationalize -0.9999 2e-5) got -8333/8334, but expected -9999/10000
;(ex-rationalize 0.501 1.0) got 0, but expected 1
;(ex-rationalize -0.501 1.0) got 0, but expected -1
;(ex-rationalize 0.501 1e-3) got 126/251, but expected 250/499
;(ex-rationalize -0.501 1e-3) got -126/251, but expected -250/499
;(ex-rationalize 0.501 2e-5) got 246/491, but expected 250/499
;(ex-rationalize -0.501 2e-5) got -246/491, but expected -250/499
;(ex-rationalize 0.499 1e-3) got 125/251, but expected 249/499
;(ex-rationalize -0.499 1e-3) got -125/251, but expected -249/499
;(ex-rationalize 0.499 2e-5) got 245/491, but expected 249/499
;(ex-rationalize -0.499 2e-5) got -245/491, but expected -249/499
;(ex-rationalize 1.501 1.0) got 1, but expected 2
;(ex-rationalize -1.501 1.0) got -1, but expected -2
;(ex-rationalize 1.501 2e-5) got 737/491, but expected 749/499
;(ex-rationalize -1.501 2e-5) got -737/491, but expected -749/499
;(ex-rationalize 1.499 2e-5) got 736/491, but expected 748/499
;(ex-rationalize -1.499 2e-5) got -736/491, but expected -748/499
Oh good grief -- I think you're right -- I didn't read the spec
closely enough. That's distressing because the continued fraction
trick is fast. Were there any other errors?
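For comparison, here is a minimal sketch of the "simplest rational within
the tolerance" reading of rationalize, restricted to exact arguments. The
names simplest-between and simplest-rational-within are hypothetical; this
is the usual interval walk, not the code of any implementation mentioned in
this thread, and it makes no attempt to handle inexact inputs.

```scheme
(import (rnrs))

;; Hypothetical helper: simplest rational in the closed interval [lo, hi],
;; assuming exact rationals with 0 < lo < hi.
(define (simplest-between lo hi)
  (let ((fl (floor lo))
        (fh (floor hi)))
    (cond ((= fl lo) fl)            ; lo itself is an integer
          ((< fl fh) (+ fl 1))      ; an integer lies strictly above lo
          (else                     ; same integer part: recur on the
           (+ fl                    ; reciprocals of the fractional parts
              (/ 1 (simplest-between (/ 1 (- hi fh))
                                     (/ 1 (- lo fl)))))))))

;; Hypothetical entry point: simplest exact rational within tol of x.
(define (simplest-rational-within x tol)
  (let ((lo (- x (abs tol)))
        (hi (+ x (abs tol))))
    (cond ((= lo hi) x)                                   ; zero tolerance
          ((positive? lo) (simplest-between lo hi))
          ((negative? hi) (- (simplest-between (- hi) (- lo))))
          (else 0))))                                     ; interval contains 0

(simplest-rational-within 3/10 1/10)                      ; => 1/3
(simplest-rational-within 1 1)                            ; => 0
(simplest-rational-within 314159265358979/100000000000000 1/10) ; => 16/5
```

Under this reading the interval [x - tol, x + tol] is searched for the
fraction with the smallest denominator, which is why 16/5 wins over 22/7
for a tolerance of 1/10.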
> I'm not sure the following tests are functioning as
> intended:
>
> (expt 1/500029
> 362880/3)
> (expt -1/500029
> 362880/3)
>
> Note that 362880/3 = 120960, that the results of both
> tests should be an exact rational number, and that
> their denominators should contain over half a million
> decimal digits. Had the second argument been 362881/3,
> Larceny would have returned 0.0 and been quick about it.
Gambit seems to get an answer pretty quickly:
> (define a (time (expt 1/500029 362880/3)))
(time (expt 1/500029 120960))
96 ms real time
96 ms cpu time (92 user, 4 system)
4 collections accounting for 2 ms real time (0 user, 0 system)
17791000 bytes allocated
1252 minor faults
no major faults
> (integer-length (denominator a))
2289973
> (* (integer-length 500029) 120960)
2298240
I'm not suggesting the test stay, just that I didn't notice that it
was unreasonable.
On the other hand, your suggested test *is* much faster ;-):
> (define a (time (expt 1/500029 362881/3)))
(time (expt 1/500029 362881/3))
0 ms real time
0 ms cpu time (0 user, 0 system)
no collections
1280 bytes allocated
18 minor faults
no major faults
Brad
> Gambit seems to get an answer pretty quickly:
>
> > (define a (time (expt 1/500029 362880/3)))
>
> (time (expt 1/500029 120960))
> 96 ms real time
> 96 ms cpu time (92 user, 4 system)
> 4 collections accounting for 2 ms real time (0 user, 0 system)
> 17791000 bytes allocated
> 1252 minor faults
> no major faults
> > (integer-length (denominator a))
> 2289973
> > (* (integer-length 500029) 120960)
>
> 2298240
>
> I'm not suggesting the test stay, just that I didn't notice that it
> was unreasonable.
I don't think it was unreasonable either:
Ikarus Scheme version 0.0.3+ (revision 1586, build 2008-08-11, 64-bit)
Copyright (c) 2006-2008 Abdulaziz Ghuloum
> (optimize-level 0) ;;; otherwise it will be folded at compile time
> (collect) ;;; and (time ---) will show 0.
> (define a (time (expt 1/500029 362880/3)))
running stats for (expt 1/500029 120960):
no collections
206 ms elapsed cpu time, including 0 ms collecting
207 ms elapsed real time, including 0 ms collecting
2133296 bytes allocated
> (bitwise-length (denominator a))
2289973
I ran into the following problems. Most of them are minor
bugs or portability issues in the numtst.c program that
generated the tests.
* the macro generated by numtst.c doesn't detect an
error result when no error was expected. The R6RS
program repairs this problem.
* sinh, cosh, tanh, asinh, acosh, and atanh aren't part
of the R5RS or R6RS, and my feeble attempts to define
them weren't worthy of being tested. The R6RS program
bypasses those tests.
* When rationalize is passed an inexact argument, it is
required to return an inexact result. (Many R5RS
systems got that wrong, which was one of the reasons
for making the R6RS more explicit about such issues.)
The R6RS program bypasses tests that pass an inexact
argument to rationalize but expect an exact result.
* The output of numtst.c contains references to the
following unbound variables:
snd-display
search
arg
1+
double-0
double-1
double-2
double-3
double-4
double-6
The R6RS program defines those things; the first four
shouldn't ever be referenced, and the intended values
of the six doubles weren't too hard to figure out.
* The tests contain expressions followed by definitions,
which is not allowed by the R6RS. The R6RS program
solved that by splitting the tests into two separate
libraries.
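To illustrate the first item above, here is a rough sketch of a test macro
that turns any raised exception into the symbol error before comparing, so
an unexpected error is reported as a failure instead of slipping through.
The names test, report-failure, and failures are assumptions made for this
sketch; it is not the macro from the posted program.

```scheme
(import (rnrs))

;; Hypothetical failure counter and reporter.
(define failures 0)

(define (report-failure form result expected)
  (set! failures (+ failures 1))
  (display ";")
  (write form)
  (display " got ")
  (write result)
  (display ", but expected ")
  (write expected)
  (newline))

;; Evaluate expr; any raised object is replaced by the symbol error.
;; The comparison uses equal?, so 1 and 1.0 count as different results.
(define-syntax test
  (syntax-rules ()
    ((_ expr expected)
     (let ((result (guard (c (#t 'error)) expr)))
       (if (not (equal? result expected))
           (report-failure 'expr result expected))))))

(test (+ 1 2) 3)                       ; passes silently
(test (error 'demo "unexpected") 3)    ; reported: got error, expected 3
(test (error 'demo "expected") 'error) ; passes: an error was expected
```

Because the definitions come before the test expressions, the sketch also
respects the R6RS ordering rule mentioned in the last item.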
The current development version of Larceny fails two of
the tests performed by the program I have just posted:
;(tan 1234000000/3) got -18.78095517910799, but expected -18.7821359357167
;(log 1.0+23.0i 1.0+23.0i) got 1.0+0.0i, but expected error
Failed 2 of 4786 tests.
The second failure is actually an error in the test, since
the R6RS requires log to accept two arguments, and 1.0+0.0i
is the correct result.
Thanks again to Bill for making these tests available.
Will
No one has said it was unreasonable. The question was whether
it was functioning as intended, since (1) 362880/3 is an odd
way to write 120960 and (2) the expected result was 0.0, not
an exact rational with hundreds of thousands of digits in the
denominator.
Gambit and Ikarus have notably good performance on bignum
arithmetic, but Larceny's bignum performance is notoriously
poor. That's why I noticed the issue, and why you probably
wouldn't have.
Will
Are the R6RS violations (lexical or otherwise) in that
program intentional or are they a result of Larceny being
"R6RS compatible"?
I would guess that the answer is, at least in part, "Yes." In
particular, the one lexical R6RS violation I see is probably an
artifact of the developer not including a #!r6rs token at the top of
the file when writing the code.
After adding the #!r6rs token, Larceny promptly signals an error when
I try to run the program:
Error: no handler for exception #<record &compound-condition>
Compound condition has these components:
#<record &error>
#<record &who>
who : get-datum
#<record &message>
message : "Lexical Error: Illegal symbol syntax: 1+ "
#<record &irritants>
irritants : (#<INPUT PORT trigtest.sps>)
Terminating program execution.
I believe this particular problem (of the illegal 1+ syntax) is itself
easy to fix, by renaming 1+ to add1. (You do need to be sure to only
rename the occurrences of the 1+ token on its own, and not the
occurrences of 1+ within various number literals within the file.)
The only R6RS violation I can find is the lexical
violation, which was unintentional. The 1+ tokens
were in the code generated by numtst.c, and I was
trying to leave that code as pristine as possible.
I should have tested the program with #!r6rs at
the top before I posted it.
I have renamed 1+ to add1 and updated the program
at the link cited above.
Will
Thanks Will and Bill!
On IronScheme: Failed 1479 of 4786 tests. :|
A few simple ones, but most related to complex argument and complex
results.
Time to study some math again :)
Cheers
leppie
> On IronScheme: Failed 1479 of 4786 tests. :|
Down to 98 :)
Bill:
Because .501 and .001 are converted to dyadic rationals (IEEE floating
point numbers) before the computation, I'm not sure these tests are
correct on machines with 64-bit IEEE floating point:
;(rationalize .501 .001) got .50199203187251, but expected 1/2
;(rationalize -.501 .001) got -.50199203187251, but expected -1/2
;(rationalize .499 .001) got .49800796812749004, but expected 1/2
;(rationalize -.499 .001) got -.49800796812749004, but expected -1/2
For example, on gambit I get:
> (<= (- (inexact->exact .501) 1/2) (inexact->exact .001))
#f
> (<= (- 1/2 (inexact->exact .499)) (inexact->exact .001))
#f
so 1/2 is not within .001 of .501 on a binary machine.
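The same check can be written with R6RS exact, which makes the contrast
with exact decimal fractions easy to see. This is only a sketch: the helper
name within-tolerance? is made up here, and the second result assumes
64-bit IEEE doubles.

```scheme
(import (rnrs))

;; Hypothetical helper: is target within tol of x, using exact arithmetic
;; on whatever values the reader produced for x and tol?
(define (within-tolerance? target x tol)
  (<= (abs (- (exact x) (exact target))) (exact tol)))

(within-tolerance? 1/2 501/1000 1/1000) ; exact decimals: => #t
(within-tolerance? 1/2 .501 .001)       ; doubles, converted exactly: => #f
```

The literal .501 reads as a double slightly above 501/1000, and that small
excess is larger than the excess in the double for .001, so the distance to
1/2 ends up just outside the tolerance.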
Also, the r5rs standard says
Note that 0 = 0/1 is the simplest rational of all.
so I think these tests should expect 0. as the answer:
;(rationalize 1. 1.) got 0., but expected 1
;(rationalize -1. 1.) got 0., but expected -1
About the gambit test suite, numtst.c should have the following small
fix:
[brad:~/Desktop/snd-9/tools] lucier% rcsdiff -u numtst.c
===================================================================
RCS file: RCS/numtst.c,v
retrieving revision 1.1
diff -u -r1.1 numtst.c
--- numtst.c 2008/08/12 23:33:13 1.1
+++ numtst.c 2008/08/13 00:00:50
@@ -1435,7 +1435,7 @@
if (strcmp(scheme_name, "gambit") == 0)
- fprintf(stderr, "\n\
+ fprintf(fp, "\n\
(define-macro (test tst expected)\n\
`(let ((result\n\
(with-exception-handler (lambda e 'error)\n\
Finally, r5rs doesn't have tanh, sinh, or cosh; I don't know whether
you'd want these tests included just for the Schemes that support
these functions (gambit doesn't).
So, your test suite found two more bugs in gambit (with single
argument gcd and lcm); pretty good.
Brad
http://www.ccs.neu.edu/home/will/R6RS/
The R6RS test program now checks the exactness of
results as well as their numerical values.
Bill's new tests found two to six ancient bugs in
Larceny, depending on how you count. Thank you,
Bill.
I agree with Brad Lucier's comments concerning
rationalize. In addition, both the R5RS and the
R6RS specify that the ordering predicates take two
or more arguments. For the R6RS version, the test
macro eliminates tests that expect a boolean result
when passing only one argument to a comparison. The
test macro also eliminates tests of the hyperbolic
trig functions, but those tests are included within
the program and it would be easy to modify the test
macro to enable them.
Will
According to R6RS 11.7.1, both 0 and 0.0 are correct
results for (* 0 1.0).
That is part of why run-test-with-correct-exactness
(in the R6RS test program) is so complicated. (It's
so complicated that I just found a bug in it: It was
allowing a very few inexact results that are illegal.
The program at http://www.ccs.neu.edu/home/will/R6RS/
fixes that bug.)
So far as I know, the R5RS test programs generated by
numtst.c aren't checking for the correct exactness.
If you intend to add exactness checking to the output
of numtst.c, you'll have to add some processing along
the lines of run-test-with-correct-exactness and
run-test in the R6RS test program.
The rules that govern exactness in the R5RS aren't
quite so clear as in the R6RS, so your exactness
processing would have to be more complex for R5RS
than for R6RS.
Will
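As a starting point only, a strict comparison that demands both the same
value and the same exactness might look like the sketch below. The name
same-number-and-exactness? is hypothetical, and this is deliberately not
run-test-with-correct-exactness: it would reject cases where the R6RS
permits either answer, such as 0 versus 0.0 for (* 0 1.0), and it applies
no tolerance to inexact results.

```scheme
(import (rnrs))

;; Hypothetical strict check: both arguments are numbers, agree on
;; exactness, and are numerically equal.
(define (same-number-and-exactness? result expected)
  (and (number? result)
       (number? expected)
       (eq? (exact? result) (exact? expected))
       (= result expected)))

(same-number-and-exactness? 1/2 1/2)  ; => #t
(same-number-and-exactness? 0 0.0)    ; => #f, exactness differs
(same-number-and-exactness? 'error 0) ; => #f, not a number
```

A real harness would need a table of exceptions on top of this, which is
where most of the complexity described above comes from.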
What about these? From what I can see, it could be exact or inexact.
;(expt 0 1e-08) got 0, but expected 0.0
;(expt 0 1.0) got 0, but expected 0.0
;(expt 0 3.14159265358979) got 0, but expected 0.0
;(expt 0 2.71828182845905) got 0, but expected 0.0
;(expt 0 1234.0) got 0, but expected 0.0
;(expt 0 1234000000.0) got 0, but expected 0.0
Cheers
leppie
According to both the R5RS and the R6RS:
(expt 0 0) => 1
(expt 0 1) => 0
The result of (expt 0 z) therefore depends upon the
value of z. An inexact z means its value is uncertain.
Therefore the result of (expt 0 z), when z is inexact,
is also uncertain, and must be flagged as inexact under
the general rule stated in the next-to-last paragraph
of R6RS 11.7.1; (expt 0 z) does not qualify for the
exception to the general rule that is spelled out in
the last paragraph of R6RS 11.7.1.
See also
http://lists.r6rs.org/pipermail/r6rs-discuss/2008-August/003555.html
Will
Thanks Will, but what about this as shown in the R6RS:
(expt 0 5+.0000312i) => 0
From what you say this should return 0.0, which will make this an
error (in the spec).
Btw, I do agree with what you are saying, I just want to make sure
about the required behavior :)
Cheers
leppie
> See also
>
> http://lists.r6rs.org/pipermail/r6rs-discuss/2008-August/003555.html
Ha, guess I should have read that first before replying!
Thanks
leppie
I found the following two tests useful, as they tickle a rarely-needed
bit of code in Knuth's algorithm for long division using 16- and
32-bit words:
> (quotient 295147905149568077200 34359738366)
8589934591
> (remainder 295147905149568077200 34359738366)
21754858894
> (quotient 696898287454081973170944403677937368733396 1180591620717411303422)
590295810358705651711
> (remainder 696898287454081973170944403677937368733396 1180591620717411303422)
314390899110894278354
You get some insight into why when you convert these to strings in
base 2:
> (map (lambda (x) (number->string x 2)) '(696898287454081973170944403677937368733396 1180591620717411303422 295147905149568077200 34359738366))
("1111111111111111111111111111111111111111111111111111111111111111111100100010000101100001100110110010000011011101000100001101001011011010100"
"1111111111111111111111111111111111111111111111111111111111111111111110"
"11111111111111111111111111111111100100010000101100001100110110010000"
"11111111111111111111111111111111110")
Brad
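A quick way to use these inputs without trusting any particular printed
answer is to check the division identity directly. The sketch below assumes
an R6RS system, where quotient and remainder come from (rnrs r5rs); the
procedure name division-consistent? is made up for the example.

```scheme
(import (rnrs) (rnrs r5rs))

;; Hypothetical check: n = q*d + r with |r| < |d|, where q and r are
;; whatever the implementation's quotient and remainder return.
(define (division-consistent? n d)
  (let ((q (quotient n d))
        (r (remainder n d)))
    (and (= n (+ (* q d) r))
         (< (abs r) (abs d)))))

(division-consistent? 295147905149568077200 34359738366)
(division-consistent? 696898287454081973170944403677937368733396
                      1180591620717411303422)
```

Both calls should return #t on a correct bignum implementation. The check
depends on multiplication and addition being right, so it complements
comparing against the quotients and remainders listed above and does not
replace it.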
For an R6RS version, see
http://www.ccs.neu.edu/home/will/R6RS/
Will