The R6RS test suite for PLT Scheme is written as a collection of R6RS libraries, and we hope that it can be useful to other R6RS implementors. We'd very much like to have R6RS implementors and users contribute to the test suite.
You can find the test suite in the PLT Scheme SVN repository:
So far, we've tried to get basic coverage of the standard: using each function and syntactic form at least once, and trying the interesting input cases for each.
The content of the "README.txt" file follows. As you can see, we haven't had much success running the test suite on some other implementations. For Ikarus and Larceny, the problems are not merely how to load the libraries that contain the tests; the test suite's organization prevents it from loading when an implementation is missing small features. I'm not sure how to improve that without relying on `eval' (and the content below explains why I'd prefer to avoid `eval') or breaking up the tests into really small sets.
Any suggestions?
======================================================================
 Files and libraries
======================================================================
Files that end ".sps" are R6RS programs. The main one is "main.sps", which runs all the tests.
Files that end ".sls" are R6RS libraries. For example, "base.sls" is a library that implements `(tests r6rs base)', which is a set of tests for `(rnrs base)'. Many R6RS implementations will auto-load ".sls" files if you put the directory of tests in the right place.
In general, for each `(rnrs <id> ... <id>)' in the standard:
* There's a library of tests "<id>/.../<id>.sls". It defines and exports a function `run-<id>-...<id>-tests'.
* There's a program "run/<id>/.../<id>.sps" that imports "<id>/.../<id>.sls", runs the tests, and reports the results.
And then there's "main.sps", which runs all the tests (as noted above). Also, "test.sls" implements `(tests r6rs test)', which implements the testing utilities that are used by all the other libraries.
======================================================================
 Limitations and feedback
======================================================================
One goal of this test suite is to avoid using `eval' (except when specifically testing `eval'). Avoiding `eval' makes the test suite as useful as possible to ahead-of-time compilers that implement `eval' with a separate interpreter. A drawback of the current approach, however, is that if an R6RS implementation doesn't supply one binding or does not support a bit of syntax used in a set of tests, then the whole set of tests fails to load.
A related problem is that each set of tests is placed into one function that runs all the tests. This format creates a block of code that is much larger than in a typical program, which might give some compilers trouble.
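To make the `eval' trade-off above concrete, here is a minimal sketch of what an `eval'-based harness could look like. It is not how the test suite works; `run-isolated' is a hypothetical helper, and it assumes the implementation reports an unbound identifier inside `eval' by raising an exception.

```scheme
(import (rnrs) (rnrs eval))

;; Hypothetical helper (not part of the test suite): evaluate one quoted
;; test expression in a fresh (rnrs) environment, so that a missing
;; binding or unsupported syntax only affects this one test.
(define (run-isolated expr expected)
  (guard (c [#t 'could-not-run])
    (if (equal? (eval expr (environment '(rnrs))) expected)
        'pass
        'fail)))

(display (run-isolated '(+ 1 2) 3))
(newline)
;; Assumption: an unbound identifier makes `eval' raise an exception.
(display (run-isolated '(no-such-procedure 1) 3))
(newline)
```

The cost is the one described above: every test now goes through `eval', which may exercise a separate interpreter rather than the compiler.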
In any case, reports of bugs (in the tests) and new tests would be very much appreciated. File either as a PLT Scheme bug report at
======================================================================
 Hints on running the tests
======================================================================
Ikarus
------
Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"
  cd <somewhere>
  ikarus --r6rs-script tests/r6rs/run.sps
or run an individual library's test, such as "run/program.sps" as
  cd <somewhere>
  ikarus --r6rs-script tests/r6rs/run/program.sps
As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load, mostly because condition names like &error cannot be used as expressions.
Larceny
-------
Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"
larceny -path <somewhere> -r6rs -program run.sps
or run an individual library's test, such as "run/program.sps" as
As of Larceny 0.962, many test suites (such as "base.sls") take too long and use too much memory to load on our machine; probably the test functions are too big.
PLT Scheme
----------
If you get an SVN-based or the "Full" nightly build, then these tests are in a `tests/r6rs' collection already. You can run all of the tests using
mzscheme -l tests/r6rs/run.sps
and so on.
Otherwise, install this directory as a `tests/r6rs' collection, perhaps in the location reported by
On Jul 22, 4:08 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load,
> mostly because condition names like &error cannot be used as
> expressions.
That would be because R6RS doesn't require it to work as an expression. I was confused about whether `record-type-descriptor' is required and even whether it's built into `condition-predicate'.
Thanks to Aziz for helping so quickly to sort out this test-suite bug! Most of the tests now load into Ikarus.
Matthew Flatt wrote:
> The R6RS test suite for PLT Scheme is written as a collection of R6RS
> libraries, and we hope that it can be useful to other R6RS
> implementors.
Is it ever! Thank you, Matthew. This is great.
From the README.txt file:
> As of Larceny 0.962, many test suites (such as "base.sls") take too
> long and use too much memory to load on our machine; probably the test
> functions are too big.
I wrote a quick-and-dirty shell script that runs every library test program individually. In Larceny v0.962, two of the test programs appear to go into an infinite loop, and I don't yet know why. Several others failed to compile (for the reasons shown below), terminating before they could begin to execute. All other test programs ran to completion in at most 11 seconds on my test machine. That time includes compilation.
I'm posting a short summary of the test results below. I will determine the cause of each failed test and add the newly discovered bugs to Larceny's bug database. Most of them look easy to fix; writing the tests was the hard part. Thanks again for doing that.
Will
----
Summary of results for R6RS test suite for PLT Scheme (revision 10866)

system tested: Larceny v0.962 (Jul 18 2008 04:26:20, precise:Posix Unix:unified)
test machine:  MacBook Pro (2.4 GHz Intel Core 2 Duo, 4 GB RAM), Mac OS X 10.5.3

library under test          summary of outcome
==================          ==================
(rnrs arithmetic bitwise)   6 of 232 tests failed.
(rnrs arithmetic fixnums)   infinite loop during compilation?
(rnrs arithmetic flonums)   3 of 365 tests failed.
(rnrs base)                 unbound variable: angle
(rnrs bytevectors)          464 tests passed
(rnrs conditions)           unbound variable: record-type-descriptor
(rnrs control)              11 tests passed
(rnrs enums)                2 of 26 tests failed.
(rnrs eval)                 3 tests passed
(rnrs exceptions)           1 of 10 tests failed.
(rnrs hashtables)           15 of 248 tests failed.
(rnrs io ports)             unbound variable: standard-error-port
(rnrs io simple)            56 tests passed
(rnrs lists)                infinite loop during execution?
(rnrs mutable-pairs)        3 tests passed
(rnrs mutable-strings)      3 tests passed
(rnrs programs)             1 of 2 tests failed.
(rnrs r5rs)                 71 tests passed
(rnrs reader)               70 tests passed
(rnrs records procedural)   21 tests passed
(rnrs records syntactic)    unbound variable: record-type-descriptor
(rnrs sorting)              1 of 4 tests failed.
(rnrs syntax-case)          24 of 95 tests failed.
(rnrs unicode)              1 of 118 tests failed.
I'm thinking about refactoring the tests and expressing them as data, or specifications, but not code that the implementation has to run. A separate program should take this data and generate the test suite code as well as a driver program to drive the implementation being tested. A few reasons for this:
1. I would like the test suite to provide as much information as possible regarding conformance. Currently, one bug/error may take the rest down.
2. I'm not that comfortable having the implementation test itself by itself. I'd rather have both MzScheme and Ikarus testing Ikarus instead of having only Ikarus testing itself. This also allows the test to be terminated should it diverge, and its error code to be caught should it crash. Unfortunately, there is no portable way of doing this.
3. This allows testing different evaluating strategies (e.g., you can run the tests as scripts, libraries, or using eval) without having to rewrite your tests. One can also test different optimization levels, etc., that an implementation may provide.
4. Oftentimes in Ikarus, I special-case some primitives when I know the values of part of their inputs. For example, (vector-ref x 4) is compiled differently from (vector-ref '#(1 2 3) y), which is also compiled differently from (vector-ref x y). A test-suite generator can produce four tests from a single (vector-ref '#(1 2 3) 1). This can produce too many tests for a large number of constants but may be useful in some cases.
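To illustrate point 4, here is a rough sketch of how a generator might expand one test description into four variants. The procedure `variants' and its output shape are hypothetical, not taken from any existing tool, and the sketch runs the variants with `eval' only for brevity. Note that a `let'-bound literal may still be constant-propagated by a good compiler, so a real generator would probably pass the values in as procedure arguments.

```scheme
(import (rnrs) (rnrs eval))

;; Hypothetical generator: given an operator and two argument
;; expressions, build four variants that differ in which arguments
;; appear as literals in the call itself.
(define (variants op a b)
  (list `(,op ,a ,b)
        `(let ([x ,a]) (,op x ,b))
        `(let ([y ,b]) (,op ,a y))
        `(let ([x ,a] [y ,b]) (,op x y))))

;; The test is described once, as data.
(define tests (variants 'vector-ref ''#(1 2 3) 1))

;; Run each variant and print the expression with its value.
(for-each
 (lambda (t)
   (write t)
   (display " => ")
   (write (eval t (environment '(rnrs))))
   (newline))
 tests)
```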
On Jul 22, 8:10 pm, Abdulaziz Ghuloum <aghul...@cee.ess.indiana.edu> wrote:
> I'm thinking about refactoring the tests and expressing them as data,
> or specifications, but not code that the implementation has to run.
Ok.
Just in case it's not obvious: the test suite should still be represented as code, but the code can generate test descriptions instead of (as currently) test results.
By writing the test suite as code, I mean that functions or macros can be used to describe test cases in ways that fit the tests. For example, "condition.sls" uses a `test-cond' macro to generate a set of tests for a given condition type.
(And we could still have a set of R6RS programs that run tests in a simple way. Each program would import for expand some libraries that generate tests, and so on.)
> 2. I'm not that comfortable having the implementation test itself
> by itself. I'd rather have both MzScheme and Ikarus testing Ikarus
> instead of have only Ikarus testing itself. This also allows for
> the test to be terminated should it diverge and to catch its error
> code should it crash. Unfortunately, there is no portable way of
> doing this.
> [...]
> 4. Often times in Ikarus, I special case some primitives when I know
> the values of part of their inputs. For example, (vector-ref x 4)
> is compiled differently from (vector-ref '#(1 2 3) y) which is also
> compiled differently from (vector-ref x y). A test-suite generator
> can produce four tests from a single (vector-ref '#(1 2 3) 1). This
> can produce too many tests for large number of constants but may be
> useful in some cases.
On some of the details above, I worry about pushing the test suite's role too far. Implementations will need a lot more tests than are useful to try to share. But if it fits into a unified suite easily enough, so much the better.
On Jul 23, 2:58 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> On Jul 22, 4:08 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> > As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load,
> > mostly because condition names like &error cannot be used as
> > expressions.
> That would be because R6RS doesn't require it to work as an
> expression. I was confused about whether `record-type-descriptor'
> is required and even whether it's built into `condition-predicate'.
> Thanks to Aziz for helping so quickly to sort out this test-suite
> bug! Most of the tests now load into Ikarus.
> Matthew
It seems a lot of the tests are still like that, using record names as expressions. I have brought up this 'issue' before with Aziz, who in turn explained the single namespace approach (I however tend to disagree as I feel record names should not interfere, ie they are separate).
So what is the situation here?
1. Can record names be used in expressions? [I agree with Aziz that it should not be allowed. Is there any reason why the record name is not simply wrapped in record-type-descriptor syntax?]
2. Should record names interfere with variable names? [I feel this should not happen]
> On Jul 23, 2:58 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> > On Jul 22, 4:08 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> > > As of Ikarus 0.3.0+ (revision 1548), many libraries fail to load,
> > > mostly because condition names like &error cannot be used as
> > > expressions.
> > That would be because R6RS doesn't require it to work as an
> > expression. I was confused about whether `record-type-descriptor'
> > is required and even whether it's built into `condition-predicate'.
> > Thanks to Aziz for helping so quickly to sort out this test-suite
> > bug! Most of the tests now load into Ikarus.
> > Matthew
> It seems a lot of the tests are still like that, using record names as
> expressions.
Can you point me to examples?
I note that condition names are used directly in `test/exn' forms, but the expansion of `test/exn' now wraps the condition name with `record-type-descriptor'. It sounds like I've missed other places, though.
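For readers following along, here is a minimal sketch of the wrapping described above. The macro `raises?' is a hypothetical stand-in; the actual definition of `test/exn' in "test.sls" may differ. The point is only that the condition name appears inside `record-type-descriptor' and never as an expression on its own.

```scheme
(import (rnrs))

;; Hypothetical stand-in for `test/exn' (the real macro may differ).
;; The condition name is used only inside `record-type-descriptor'.
(define-syntax raises?
  (syntax-rules ()
    [(_ expr condition-name)
     (guard (c [((condition-predicate
                  (record-type-descriptor condition-name))
                 c)
                #t]
               [else #f])
       expr
       #f)]))

;; `error' raises a condition that includes &error.
(display (raises? (error 'demo "oops") &error))
(newline)
;; No exception is raised here, so the result is #f.
(display (raises? (+ 1 2) &error))
(newline)
```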
> 1. Can record names be used in expressions?
Not in portable code, as far as I can tell, though it seems that R6RS allows an implementation to allow a record name as an expression.
> 2. Should record names interfere with variable names?
Sorry for posting the bug here, but I see this too much!
Line 44 of hashtables.sls: should be eqv?, not eq? EVER! (this goes for the slatex and compiler benchmarks from Larceny too)
Also in line 161 of hashtables.sls: string-ci-hash and string-hash are not comparable, unless you define string-ci-hash as: (define (string-ci-hash str) (string-hash (string-downcase str)))
But that is inefficient.
IMO the test is wrong; it should compare string-ci-hash to string-ci-hash with differently cased strings.
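A sketch of the kind of check suggested here, assuming nothing beyond `(rnrs)'; the strings are arbitrary examples, not taken from "hashtables.sls".

```scheme
(import (rnrs))

;; Two strings that are string-ci=? should hash alike under
;; string-ci-hash, whatever string-hash returns for either of them.
(define a "Hello, World")
(define b "hELLO, wORLD")

(display (and (string-ci=? a b)
              (= (string-ci-hash a) (string-ci-hash b))))
(newline)
```

This compares `string-ci-hash' only with itself, so it does not constrain how an implementation relates `string-ci-hash' to `string-hash'.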
On Jul 23, 7:08 am, leppie <xacc....@gmail.com> wrote:
> Line 44 of hashtables.sls: should be eqv?, not eq? EVER!
I think it should be `eq?' for an `eq?'-based hashtable, but I see your point for other hashtables. I've adjusted the test to vary the comparison based on the kind of hash table being tested.
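One way to vary the comparison with the kind of table is to ask the table for its own equivalence procedure. This is only a sketch of the idea; the helper name is hypothetical and the actual change in "hashtables.sls" may look different.

```scheme
(import (rnrs))

;; Hypothetical helper: search the table's keys using the table's own
;; equivalence procedure (eq?, eqv?, or a user-supplied one).
(define (has-key-by-own-equivalence? ht key)
  (let ([same? (hashtable-equivalence-function ht)])
    (exists (lambda (k) (same? k key))
            (vector->list (hashtable-keys ht)))))

;; Flonums and characters are not guaranteed to be eq? to themselves,
;; so an eqv?-based table has to be checked with eqv?.
(define eqv-table (make-eqv-hashtable))
(hashtable-set! eqv-table 2.5 'flonum)
(hashtable-set! eqv-table #\a 'char)

(display (has-key-by-own-equivalence? eqv-table 2.5))
(newline)
(display (has-key-by-own-equivalence? eqv-table #\a))
(newline)
```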
> Also in line 161 of hashtables.sls: string-ci-hash and string-hash are
> not comparable, unless you define string-ci-hash as:
> (define (string-ci-hash str)
>   (string-hash (string-downcase str)))
> But that is inefficient.
> IMO the test is wrong, it should compared string-ci-hash to
> string-ci-hash with different cased strings.
Why should a record name interfere? Now it seems 'cons' is some kind of syntax, but yet it is only usable in 2 places (record-type- descriptor & record-constructor-descriptor).
On Jul 23, 3:20 pm, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> I think it should be `eq?' for an `eq?'-based hashtable, but I see
> your point for other hashtables. I've adjusted the test to vary the
> comparison based on the kind of hash table being testing.
I'd like to suggest two changes to the test suite. With these two changes, the current development version of Larceny is able to complete its execution of the entire test suite in a reasonable amount of time, with some test failures but without any fatal errors.
In tests/r6rs/io/ports.sls, I suggest changing the last three calls to open-file-input/output-port to use a transcoder with the latin-1-codec instead of utf-8-codec. The R6RS does not specify a semantics for opening a file for input/output with a variable-width codec. The R6RS editors who insisted upon adding textual input/output ports to the R6RS had the Posix semantics in mind, but the Posix semantics for input/output ports assumes a fixed-width encoding. Since the semantics of the R6RS input/output ports with variable-width transcoding is implementation-dependent, I believe the R6RS allows or should allow implementations to raise an exception when a program attempts to open a file for mixed input/output with a variable-width encoding. Larceny does that, and I believe this aspect of Larceny's behavior is a feature, not a bug. Changing the test suite to use a fixed-width encoding here would not alter the nature of the test.
The other change I suggest would reduce the code size of tests/r6rs/arithmetic/fixnums.sls. Its large code size is caused by this code:
  ;; If you put N numbers here, it expands to N^3 tests!
  (carry-tests 0
               [0 1 2 -1 -2 38734 -3843 2484598
                -348732487 (greatest-fixnum) (least-fixnum)])
The following change would reduce those 1331 tests to 500 without losing much in the way of test coverage:
On Jul 23, 8:59 am, William D Clinger <cesur...@yahoo.com> wrote:
> In tests/r6rs/io/ports.sls, I suggest changing the last
> three calls to open-file-input/output-port to use a
> transcoder with the latin-1-codec instead of utf-8-codec.
> The other change I suggest would reduce the code size of
> tests/r6rs/arithmetic/fixnums.sls. Its large code size
> is caused by this code:
>   ;; If you put N numbers here, it expands to N^3 tests!
>   (carry-tests 0
>                [0 1 2 -1 -2 38734 -3843 2484598
>                 -348732487 (greatest-fixnum) (least-fixnum)])
> The following change would reduce those 1331 tests to
> 500 without losing much in the way of test coverage:
I'm less sold on this one. Don't we want tests combining `(greatest-fixnum)' with 0 and with 1, for example?
In fact, in SVN, I've made it worse. The `carry-tests' form was meant to expand to tests of `fx+/carry' and `fx-/carry' in addition to `fx*/carry'. So, now there are 3 times as many tests, instead of 1/3 as many tests. And even if I had re-organized as above, adding the missing operators would leave us with more tests than before.
But it does seem unlikely that 3993 tests are really necessary. Any ideas that will shrink the set of tests enough, even with the added operators?
On Jul 23, 9:53 am, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> Any ideas that will shrink the set of tests enough, even with the
> added operators?
There's a really simple solution: instead of generating N test expressions, just iterate over the inputs. (I was too stuck on macros, which is a convenient way to track the original expression, but obviously not the only way.)
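A rough sketch of the iteration idea, not the code that went into SVN: loop over the inputs at run time and compare `fx*/carry' against a reference computed with generic arithmetic, following the definition of `fx*/carry' in the R6RS library report. The names `inputs', `reference-fx*/carry', `checked', and `failed' are hypothetical. The inputs are filtered with `fixnum?' because some of the literals need more than the 24 bits that R6RS guarantees.

```scheme
(import (rnrs))

;; Keep only the values that are fixnums on this implementation.
(define inputs
  (filter fixnum?
          (list 0 1 2 -1 -2 38734 -3843 2484598 -348732487
                (greatest-fixnum) (least-fixnum))))

;; Reference result, computed with generic (non-fixnum) arithmetic.
(define (reference-fx*/carry a b c)
  (let ([s (+ (* a b) c)]
        [m (expt 2 (fixnum-width))])
    (values (mod0 s m) (div0 s m))))

(define checked 0)
(define failed 0)

;; Three nested loops: the code size stays constant while the number
;; of combinations checked is N^3.
(for-each
 (lambda (a)
   (for-each
    (lambda (b)
      (for-each
       (lambda (c)
         (let-values ([(s0 s1) (fx*/carry a b c)]
                      [(r0 r1) (reference-fx*/carry a b c)])
           (set! checked (+ checked 1))
           (unless (and (= s0 r0) (= s1 r1))
             (set! failed (+ failed 1)))))
       inputs))
    inputs))
 inputs)

(display failed)
(display " of ")
(display checked)
(display " tests failed")
(newline)
```

The same loops could call `fx+/carry' and `fx-/carry' with their own reference definitions. What is lost compared with the macro version is an automatic record of the original expression for each failure.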
leppie wrote:
> Line 44 of hashtables.sls: should be eqv?, not eq? EVER! (this goes
> for the slatex and compiler benchmarks from Larceny too)
I didn't write either of those benchmarks. Spurred by your message, I went through the R6RS version of the latex benchmark and corrected all uses of eq? and memq on characters. You can now obtain the corrected version from Larceny's Trac site. I'll check in a corrected version of the R5RS slatex benchmark after I test it. The more important thing, of course, is to correct the slatex program itself, which I haven't done.
With the compiler benchmark, it's harder to tell which uses of eq?, memq, and assq need to be corrected. It looks as though the easiest way to find out is to add some instrumentation to the code. That's why I haven't yet corrected the compiler benchmark, but I plan to get it done eventually.
> leppie wrote:
> > Line 44 of hashtables.sls: should be eqv?, not eq? EVER! (this goes
> > for the slatex and compiler benchmarks from Larceny too)
> I didn't write either of those benchmarks. Spurred by
> your message, I went through the R6RS version of the
> latex benchmark and corrected all uses of eq? and memq
> on characters. You can now obtain the corrected version
> from Larceny's Trac site. I'll check in a corrected
> version of the R5RS slatex benchmark after I test it.
> The more important thing, of course, is to correct the
> slatex program itself, which I haven't done.
> With the compiler benchmark, it's harder to tell which
> uses of eq?, memq, and assq need to be corrected. It
> looks as though the easiest way to find out is to add
> some instrumentation to the code. That's why I haven't
> yet corrected the compiler benchmark, but I plan to get
> it done eventually.
> Will
Thanks :)
I managed to get slatex working, totally gave up on the compiler one :)
I wrote a description for the "Hints on running the tests" section. Please check.
===============================
Ypsilon 0.9.5-update2
------
Put this directory at "<somewhere>/tests/r6rs" and run with "run.sps"
  cd <somewhere>
  ypsilon --sitelib=. --no-letrec-check tests/r6rs/run.sps
or run an individual library's test, such as "run/program.sps" as
  cd <somewhere>
  ypsilon --sitelib=. --no-letrec-check tests/r6rs/run/program.sps
================================
*Because Ypsilon checks for letrec restriction violations during macro expansion, the expression "(letrec ((x y) (y x)) 'should-not-get-here)" in test/r6rs/base.sls raises an exception during load. I added '--no-letrec-check' to avoid this problem.
Finally, running all tests on Ypsilon 0.9.5-update2 shows "93 of 8836 tests failed" :) Thank you for your great work!
> *Because Ypsilon check letrec restriction violation during macro
> expansion, the expression "(letrec ((x y) (y x)) 'should-not-get-here)"
> in test/r6rs/base.sls raises exception during load. I added
> '--no-letrec-check' to avoid this problem.
I believe that the R6RS requires that this check be done at runtime (at least sometimes). In section 11.4.6, discussing `letrec', it says, under Implementation Responsibilities (emphasis mine):
Implementations must detect references to a <variable> during the EVALUATION of the <init> expressions (using one particular evaluation order and order of evaluating the <init> expressions).
There are some examples where it would be hard (or impossible) to determine during macro expansion whether the letrec restriction will be violated. For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2)) x 1)]) x) should always return 1, but the analysis is not obvious.
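The two cases can be put side by side in a small program. This is a sketch under the reading of section 11.4.6 given above; an implementation that checks the restriction during expansion may reject the whole program before running any of it, and the symbols returned from the `guard' clauses are just illustrative labels.

```scheme
(import (rnrs))

;; Legal: two freshly allocated pairs are never eq?, so the <init>
;; for x never actually references x.
(display (letrec ([x (if (eq? (cons 1 2) (cons 1 2)) x 1)]) x))
(newline)

;; Violation: evaluating the <init> for x references y before y has
;; been initialized. Under the reading above, this is detected while
;; the <init> expressions are evaluated and reported as &assertion.
(display
 (guard (c [(assertion-violation? c) 'assertion-violation]
           [else 'other-condition])
   (letrec ([x y] [y x]) 'should-not-get-here)))
(newline)
```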
> > *Because Ypsilon check letrec restriction violation during macro
> > expansion, the expression "(letrec ((x y) (y x)) 'should-not-get-here)"
> > in test/r6rs/base.sls raises exception during load. I added
> > '--no-letrec-check' to avoid this problem.
> I believe that the R6RS requires that this check be done at runtime
> (at least sometimes). In section 11.4.6, discussing `letrec', it says,
> under Implementation Responsibilities (emphasis mine):
> Implementations must detect references to a <variable> during the
> EVALUATION of the <init> expressions (using one particular evaluation
> order and order of evaluating the <init> expressions).
> There are some examples where it would be hard (or impossible) to
> determine during macro expansion whether the letrec restriction will
> be violated. For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
> x 1)]) x) should always return 1, but the analysis is not obvious.
> sam th
Thank you for your input. Now I understand why the letrec violation raises &assertion and not &syntax! I'm very happy to know that :)
On Jul 26, 10:25 pm, leppie <xacc....@gmail.com> wrote:
> On Jul 26, 2:59 pm, samth <sam...@gmail.com> wrote:
> > For example (letrec ([x (if (eq? (cons 1 2) (cons 1 2))
> > x 1)]) x) should always return 1, but the analysis is not obvious.