Hello,

As proposed, I've started a new testing framework. This is work in
progress, but I'd be happy to get some feedback.
http://github.com/hemmecke/fricas/commits/test-framework
It currently is rather independent from the rest of FriCAS. It could
live anywhere even outside of FriCAS, but that makes no sense. I've,
therefore, chosen the subdirectory src/test.
Assuming that you have a compiled FriCAS build tree... actually, the
only things needed are notangle and AXIOMsys (but ")set breakmode quit"
must work, otherwise no failure will be detected).
You get it working like this...
mkdir $HOME/hemmecke
cd $HOME/hemmecke
git clone git://github.com/hemmecke/fricas.git
cd fricas
git checkout test-framework
cd src/test
export PATH=/path/to/where/AXIOMsys/lives:/path/to/notangle:$PATH
make -f Makefile.am mk
./build-setup.sh
mkdir $HOME/hemmecke/fricas-build
cd $HOME/hemmecke/fricas-build
$HOME/hemmecke/fricas/src/test/configure
make -j5 check #parallel tests allowed
Go to set.input.pamphlet and change something to make one or two tests
fail. Then type "make check" again. Well... I should probably point the
email address to fricas-devel. ;-)
If you want to add more tests, just create a new file
foo.input.pamphlet, write a few <<test:somename>>= ... @ chunks, run

make -f Makefile.am mk

in the source directory, and then type

make check

in the build directory. This will invoke automake.
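For example, a new foo.input.pamphlet might contain chunks like these
(just a sketch; the chunk names are made up and assertEquals is the
framework's assertion used throughout this thread):

<<test:factor>>=
assertEquals(# factors(factor(60)), 3)
@
<<test:gcd>>=
assertEquals(gcd(12, 18), 6)
@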
I'm waiting for feedback.
Ralf
> Hello,
>
> As proposed, I've started a new testing framework. This is work in
> progress, but I'd be happy to get some feedback.
Could you say how it differs from / how it improves on unittest.spad?
Martin
http://github.com/hemmecke/fricas/blob/test-framework/src/test/README
http://github.com/hemmecke/fricas/blob/test-framework/src/test/spadunit.spad.pamphlet
Instead of
f := (x+a)/(x*(x^3+(b+c)*x^2+b*c*x))
testEquals("numberOfFractionalTerms partialFraction(f, x)", "3")
s1 := "ab"; s2 := concat("a", "b");
testEquals("s1", "s2")
You would write
f := (x+a)/(x*(x^3+(b+c)*x^2+b*c*x))
(*) assertEquals(numberOfFractionalTerms partialFraction(f, x), 3)
s1 := "ab"; s2 := concat("a", "b");
(*) assertEquals(s1, s2)
The modified lines are marked with (*).
I don't use interpretString, but simply pipe the .input file to the
interpreter and check for a zero or non-zero exit status. As proposed,
each test runs in its own session. (That is why they can be run in
parallel.)
My assertEquals relies on the equality defined in the domain. I don't
care whether the string output is identical; equality in the domain is
what one cares about. For example, x+1 and 1+x are the same polynomial,
but as strings they are different.
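For instance (a minimal sketch using the assertEquals above):

assertEquals(x + 1, 1 + x)   -- passes: equal in Polynomial(Integer)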
I am not saying that your testing framework should be removed. The more
tests the better.
Ralf
http://github.com/hemmecke/fricas/blob/test-framework/src/test/README
http://github.com/hemmecke/fricas/blob/test-framework/src/test/spadunit.spad.pamphlet
And, btw, why not just run it...?
>>> As proposed, I've started a new testing framework. This is work in
>>> progress, but I'd be happy to get some feedback.
>>
>> Could you say how it differs from / how it improves on unittest.spad?
>
> http://github.com/hemmecke/fricas/blob/test-framework/src/test/README
> http://github.com/hemmecke/fricas/blob/test-framework/src/test/spadunit.spad.pamphlet
>
> And, btw, why not just run it...?
Because I don't know how, and I have no time to read the git manual.
> Instead of
>
> f := (x+a)/(x*(x^3+(b+c)*x^2+b*c*x))
> testEquals("numberOfFractionalTerms partialFraction(f, x)", "3")
> s1 := "ab"; s2 := concat("a", "b");
> testEquals("s1", "s2")
>
> You would write
>
> f := (x+a)/(x*(x^3+(b+c)*x^2+b*c*x))
> (*) assertEquals(numberOfFractionalTerms partialFraction(f, x), 3)
> s1 := "ab"; s2 := concat("a", "b");
> (*) assertEquals(s1, s2)
>
> The modified lines are marked with (*).
OK. Thus, when
s1 := "ab"; s2 := concat("a", "x");
assertEquals(s1, s2)
fails, you get
"ab" is not equal to "ax", rather than "s1" is not equal to "s2", right?
> My assertEquals relies on the equality defined in the domain. I don't
> care whether the string output is identical; equality in the domain is
> what one cares about. For example, x+1 and 1+x are the same polynomial,
> but as strings they are different.
That's almost as it currently is. *IF* the domain has InputForm, we
compare the InputForm, which should be (just as "=") something
reasonably close to a normal form. In particular, x+1 and 1+x have the
same InputForm as Polynomials, Expressions, etc., but "x+1" and "1+x"
are of course different strings.
> I am not saying that your testing framework should be removed. The
> more tests the better.
I agree about tests, but I'm not so sure about testing frameworks.
Thus, I'd rather have mine removed to have only one.
Martin
That is why I've also provided the exact sequence of commands to run it. ;-)
http://groups.google.com/group/fricas-devel/msg/eba54af0ed67f16c
> OK. Thus, when
>
> s1 := "ab"; s2 := concat("a", "x");
> assertEquals(s1, s2)
>
> fails, you get
>
> "ab" is not equal to "ax", rather than "s1" is not equal to "s2", right?
To be more precise: for the file martin.input.pamphlet, running
exclusively your tests via

(+) make TESTS="poly.martin.input string.martin.input" check

I get two failures and the files poly.martin.log and string.martin.log
(see below).
>> My assertEquals relies on the equality defined in the domain. I don't
>> care whether the string output is identical; equality in the domain is
>> what one cares about. For example, x+1 and 1+x are the same polynomial,
>> but as strings they are different.
>
> That's almost as it currently is. *IF* the domain has InputForm, we
> compare the InputForm, which should be (just as "=") something
> reasonably close to a normal form.
If the InputForms are equal, then we certainly also have equality in the
domain. But I don't require InputForm as an export; why should I?
Equality is what counts. I don't care about the internal representation
of the (mathematical) objects. If I wanted to test the latter, I'd have
to invent another form of tests.
>> I am not saying that your testing framework should be removed. The
>> more tests the better.
> I agree about tests, but I'm not so sure about testing frameworks.
> Thus, I'd rather have mine removed to have only one.
First, let's hear some other opinions. Of course, a testing framework
should not only test, but also be rather convenient for developers to use.
One drawback of my framework is that it requires automake+autoconf if
one wants to add new tests. I think that is not a big issue for core
developers. Other people who don't have write access to the SVN repo
just have to send a patch with a new test file; they can execute it
(even without automake and autoconf) by simply listing the tests as I
have done in (+) above, or by adding some lines to the TESTS variable
defined in src/test/Makefile by hand. Thus, automake and autoconf are
not really hard dependencies.
Ralf
---rhxBEGIN martin.input.pamphlet
<<test:poly>>=
f := (x+a)/(x*(x3+(b+c)*x2+b*c*x))
assertEquals(numberOfFractionalTerms partialFraction(f, x), 3)
@
<<test:string>>=
s1 := "ab"; s2 := concat("a", "x");
assertEquals(s1, s2)
@
---rhxEND martin.input.pamphlet
---rhxBEGIN poly.martin.log
FAIL: poly.martin.input (exit: 1)
=================================
Checking for foreign routines
AXIOM=NIL
spad-lib="/lib/libspad.so"
FriCAS (AXIOM fork) Computer Algebra System
Version: FriCAS 2010-01-08
Timestamp: Friday February 26, 2010 at 22:05:33
-----------------------------------------------------------------------------
Issue )copyright to view copyright notices.
Issue )summary for a summary of useful system commands.
Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------
(1) -> (1) -> SpadUnit0 is now explicitly exposed in frame initial
SpadUnit0 will be automatically loaded when needed from
/home/hemmecke/scratch/build/test-fricas/SPADZERO.NRLIB/SPADZERO
(1) -> SpadUnit is now explicitly exposed in frame initial
SpadUnit will be automatically loaded when needed from
/home/hemmecke/scratch/build/test-fricas/SPADUNIT.NRLIB/SPADUNIT
(1) -> Warning: HyperTeX macro table not found
                   x + a
   (1)  ---------------------------
         2
        x x3 + (c + b)x x2 + b c x
                                            Type: Fraction(Polynomial(Integer))
(2) -> Expected equal values, but got the following.
Value1 1:
2
Value1 2:
3
---rhxEND poly.martin.log
---rhxBEGIN string.martin.log
FAIL: string.martin.input (exit: 1)
===================================
Checking for foreign routines
AXIOM=NIL
spad-lib="/lib/libspad.so"
FriCAS (AXIOM fork) Computer Algebra System
Version: FriCAS 2010-01-08
Timestamp: Friday February 26, 2010 at 22:05:33
-----------------------------------------------------------------------------
Issue )copyright to view copyright notices.
Issue )summary for a summary of useful system commands.
Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------
(1) -> (1) -> SpadUnit0 is now explicitly exposed in frame initial
SpadUnit0 will be automatically loaded when needed from
/home/hemmecke/scratch/build/test-fricas/SPADZERO.NRLIB/SPADZERO
(1) -> SpadUnit is now explicitly exposed in frame initial
SpadUnit will be automatically loaded when needed from
/home/hemmecke/scratch/build/test-fricas/SPADUNIT.NRLIB/SPADUNIT
(1) -> Warning: HyperTeX macro table not found
                                                                Type: String
(2) -> Expected equal values, but got the following.
Value1 1:
"ab"
Value1 2:
"ax"
---rhxEND string.martin.log
Let me remind you why we settled on using InputForm. One problematic
case is testing integrals. Namely, proving that an unevaluated integral
is equal to a given function is easier than computing the integral. In
particular, "normalize" normalizes many such differences to 0.
Another problem is that, due to noncanonical representations, there
are many correct answers. Mathematically 1 and cos^2(x) + sin^2(x)
are equal, but practically the first one is much better.
InputForm was chosen because it avoids textual comparison but
approximates reasonably well what the user would see.
So I care about equality in the domain, but I want more. I am
not saying that the existing way is the best one (actually, I have
mentioned problems with the current framework several times). But it
solves some problems and we do not want to lose that.
Let me add my two topmost wishes for a testing framework:
- summary of results, something like "Run 65537 tests, no
unexpected failures" (that is easy)
- avoid going through the interpreter (for better performance)
Also, I think that the real problem is writing tests. It is easy
to write large/long-running tests that test only a little. It
is much harder to write small tests with good coverage. A testing
framework should help with recording results and provide some
utilities. But no framework will solve the deeper problems. In
particular, the test author needs to provide acceptance criteria
(for example, for integrals I wrote a little routine which
rejects unevaluated integrals and then tests whether the derivative
normalizes to the original function).
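A rough sketch of such a check in interpreter syntax (assertEquals as
in the framework discussed above; a full routine would additionally
inspect the kernels of the result to reject unevaluated integrals):

f := 1/(1 + x^2)
g := integrate(f, x)
-- accept if the derivative of the result normalizes back to the integrand
assertEquals(normalize(D(g, x) - f), 0)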
In other words, I think that a typical test should look as follows:
1) a table of data
2) a problem-specific routine that goes through the table, performs
the calculations and checks the results
Only experience with writing such routines can tell if some
parts are reusable.
--
Waldek Hebisch
heb...@math.uni.wroc.pl
Uh, I think you are raising the problem with Expression(Integer);
otherwise you would probably have a canonical representative in the
domain.
But anyway, my framework does not forbid checking equality of the
respective InputForms; just add
<<setUp>>=
assertEQ(x, y) ==> assertEquals(x::InputForm, y::InputForm)
@
and then use assertEQ in your tests. Since InputForm exports
SetCategory, this fits perfectly into the scheme of comparing
expressions with the = of a given domain.
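For example (a small sketch; assertEQ as defined above):

p := (x + 1)^2
assertEQ(p, x^2 + 2*x + 1)   -- the InputForms coincide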
> Let me add my two topmost wishes for a testing framework:
>
> - summary of results, something like "Run 65537 tests, no unexpected
> failures" (that is easy)
OK. Done.
> - avoid going through the interpreter (for better performance)
Pfffff, I don't know whether I agree in general. The interpreter should
only pass on arguments to the respective function that one wants to
test. So I don't see much overhead in going through the interpreter.
Of course, one doesn't want to write functions in the interpreter and
test them, since then one is basically testing the interpreter and not
the respective implementation under src/algebra.
I also think that one test should not run longer than about 10 seconds
(which is already quite long). Little functions that have no obvious
mathematical meaning should also be tested. And most important (at
least for me) is that tests run independently. I don't want a test B to
pass just because the testsuite always runs test A before B and A
affects (in some strange hidden way) the outcome of B.
The whole execution time of the testsuite should be rather short,
because every commit should pass the whole testsuite.
As you might have seen, I was thinking not only about *.input.pamphlet
files, but also about *.spad.pamphlet files. The idea was that test
commands coming from .spad.pamphlet files would be wrapped into a
domain, compiled, and then run. However, at the moment I don't see what
advantages this would have over calling algebra routines via the
interpreter. It also costs compilation time.
If I had proper exceptions available, I could probably also do an
implementation similar to AldorUnit. Then the tests would not run in
independent AXIOMsys sessions, but rather would be compiled and run in
a single session. A disadvantage would be that these tests could not be
run in parallel.
> Also, I think that the real problem is writing tests. It is easy to
> write large/long-running tests that test only a little. It is much
> harder to write small tests with good coverage.
Well, we can only achieve that if everyone looks at what the others are
doing... quite a hard task in an open-source project like FriCAS.
> A testing framework should help with recording results and provide
> some utilities.
Could you be more specific? What do you want to record? Results are
written inside the test files; that much is recorded. If you mean
timings... well, I have no idea how this should be done. At the moment I
was/am only concerned with correctness.
If someone has an idea how we could/should record timings or memory
usage, and how these could be compared across different computers, just
let me know.
> But no framework will solve the deeper problems. In
> particular, the test author needs to provide acceptance criteria (for
> example, for integrals I wrote a little routine which rejects
> unevaluated integrals and then tests whether the derivative normalizes
> to the original function).
This I don't understand. If one wants to test "integrate", one wouldn't
use "differentiate" to check the result. Both could contain a bug, so
that one bug masks the other. One should rather hardcode the exact
result one expects.
And if there is something like the branch cut problem that Tim mentioned
the other day, then I'd say the function specification should be made
more precise and clarify which branch cut it is using so that the result
is in some sense unique.
> In other words, I think that a typical test should look as follows:
>
> 1) a table of data
> 2) a problem-specific routine that goes through the table, performs
> the calculations and checks the results
What is the big difference from a file with a list of entries of the form
assertEquals(command-to-be-computed, expected-result)
?
And if you test the same command again and again, like
assertEquals(foo(input), expected-result)
then simply start with
<<setUp>>=
FOO(x, y) ==> assertEquals(foo(x), y)
@
and then use
FOO(input, expected-result)
That's basically the table you want. No?
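For instance, with factorial in the role of foo, such a "table" could
look like this (a made-up illustration):

<<setUp>>=
FOO(x, y) ==> assertEquals(factorial(x), y)
@
<<test:factorial>>=
FOO(0, 1)
FOO(5, 120)
FOO(10, 3628800)
@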
Ralf
PS: I still have no idea how I can catch spelling errors where the
interpreter tells me that the function does not exist. Any chance that
with ")set breakmode quit" such errors would be fatal and exit with a
non-zero status?
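For illustration (a hypothetical typo; at the moment the interpreter
only reports that the function does not exist, and the test is not
marked as failed):

assertEquals(numberOfFractionalTrems partialFraction(f, x), 3)  -- misspelled name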