[Haskell-cafe] Type System vs Test Driven Development


Jonathan Geddes

unread,
Jan 5, 2011, 4:44:20 AM1/5/11
to haskell-cafe
Cafe,

In every language I program in, I try to be as disciplined as possible
and use Test-Driven Development. That is, every language except
Haskell.

There are a few great benefits that come from having a comprehensive
test suite with your application:

1. Refactoring is safer/easier
2. You have higher confidence in your code
3. You have a sort of 'beacon' to show where code breakage occurs

Admittedly, I don't believe there is any magical benefit that comes
from writing your tests before your code. But I find that when I don't
write tests first, it is incredibly hard to go back and write them for
'completed' code.

But as mentioned, I don't write unit tests in Haskell. Here's why not.

When I write Haskell code, I write functions (and monadic actions)
that are either a) so trivial that writing any kind of unit/property
test seems silly, or are b) composed of other trivial functions using
equally-trivial combinators.

So, am I missing the benefits of TDD in my Haskell code?

Is the refactoring I do in Haskell less safe? I don't think so. I
would assert that there is no such thing as refactoring with the style
of Haskell I described: the code is already super-factored, so any
code reorganization would be better described as "recomposition." When
"recomposing" a program, its incredibly rare for the type system to
miss an introduced error, in my experience.

Am I less confident in my Haskell code? On the contrary. In general,
I feel more confident in Haskell code WITHOUT unit tests than code in
other languages WITH unit tests!

Finally, am I missing the "error beacon" when things break? Again I
feel like the type system has got me covered here. One of the things
that immediately appealed to me about Haskell is that the strong type
system gives the feeling of writing code against a solid test base.

The irony is that the type system (specifically the IO monad) forces
you to structure code in a way that is very easy to test, because
logic code is generally separated from IO code.
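Concretely, that separation looks something like the following sketch (`summarize` and `report` are invented names for illustration, not code from any real project): the logic lives in an ordinary pure function, and the IO action is only a thin shell around it.

```haskell
-- Hypothetical example: the "logic" half is a pure function,
-- testable without touching IO at all.
summarize :: [Int] -> String
summarize xs = "count=" ++ show (length xs) ++ ", total=" ++ show (sum xs)

-- Only this shell performs IO; it merely shuttles data in and out.
report :: FilePath -> IO ()
report path = do
    contents <- readFile path
    putStrLn (summarize (map read (lines contents)))
```

Any test of the interesting behavior only ever needs to call `summarize`.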

I explained these thoughts to a fellow programmer who is not familiar
with Haskell and his response was essentially that any language that
discourages you from writing unit tests is a very poor language. He
(mis)quoted: "compilation [is] the weakest form of unit testing" [0].
I vehemently disagreed, stating that invariants embedded in the type
system are stronger than any other form of assuring correctness I know
of.

I know that much of my code could benefit from a property test or two
on the more complex parts, but other than that I can't think that unit
testing will improve my Haskell code/programming practice. Am I
putting too much faith in the type system?

[0] http://blog.jayfields.com/2008/02/static-typing-considered-harmful.html

_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Erik de Castro Lopo

unread,
Jan 5, 2011, 5:05:49 AM1/5/11
to haskel...@haskell.org
Jonathan Geddes wrote:

<snip>

> So, am I missing the benefits of TDD in my Haskell code?

Probably. I work on a project which has 40000+ lines of
haskell code (a compiler written in haskell) and has a huge
test suite that is vital to continued development.

I've also written relatively small functions (eg a function
to find if a graph has cycles) that were wrong the first time
I wrote them. During debugging I wrote a test that I'm keeping
as part of the unit tests.

Furthermore, tests are also useful for preventing regressions
(something the programmer does today breaks something that
was working 6 months ago). Without tests, that breakage
may go unnoticed.

> I explained these thoughts to a fellow programmer who is not familiar
> with Haskell and his response was essentially that any language that
> discourages you from writing unit tests is a very poor language.

Haskell most certainly does not discourage anyone from writing
tests. One simply needs to look at the testing category of
hackage:

http://hackage.haskell.org/package/#cat:testing

to find 36 packages for doing testing.

> Am I putting too much faith in the type system?

Probably.

> [0] http://blog.jayfields.com/2008/02/static-typing-considered-harmful.html

Complete bollocks!

Good type systems combined with good testing leads to better
code than either good type systems or good testing alone.

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Sönke Hahn

unread,
Jan 5, 2011, 9:30:36 AM1/5/11
to haskel...@haskell.org
Erik de Castro Lopo wrote:

> Jonathan Geddes wrote:
>
> <snip>
>
>> So, am I missing the benefits of TDD in my Haskell code?
>
> Probably. I work on a project which has 40000+ lines of
> haskell code (a compiler written in haskell) and has a huge
> test suite that is vital to continued development.

<snip>

If I may, I would like to agree with you both: A test suite should ideally
cover all the aspects of the tested program that are not checked statically
by the compiler. So in python, I end up writing test cases that check for
runtime type errors; in haskell, I don't. In both languages, it's good
advice to write a test suite that checks the correctness of calculated
values.

Haskell's static type system feels to me like an automatically generated,
somewhat dumb test suite. It does not replace a full-fledged hand-written
one, but it does replace a big part of it (that is, of what you would have
to write in a dynamic language). And it runs much faster.

I also tend to write test suites when I feel the code exceeds a certain
level of complexity. This level is language-dependent, and in haskell's
case it's pretty high. (I should probably lower that level and write more
test cases, but that seems to be true for all languages.)

And yes, haskell has great support for writing test suites.

Jake McArthur

unread,
Jan 5, 2011, 12:12:13 PM1/5/11
to haskel...@haskell.org
On 01/05/2011 03:44 AM, Jonathan Geddes wrote:
> When I write Haskell code, I write functions (and monadic actions)
> that are either a) so trivial that writing any kind of unit/property
> test seems silly, or are b) composed of other trivial functions using
> equally-trivial combinators.

"There are two ways of constructing a software design. One way is to
make it so simple that there are obviously no deficiencies. And the
other way is to make it so complicated that there are no obvious
deficiencies." -- C.A.R. Hoare

If you actually manage to do the former, I'd say you don't need to test
those parts in isolation.

That said, I disagree with you overall. The Haskell type system is
simply not rich enough to guarantee everything you might need. Even if
it was, it would take a lot of work to encode all your invariants,
probably more work than writing tests would have been (although there
are obvious advantages to the former as far as having a high level of
assurance that your code is correct).

Haskell has some awesome testing tools, and I highly recommend getting
acquainted with them. In particular, you should definitely learn how to
use QuickCheck, which allows you to easily check high-level properties
about your code; this is beyond what most traditional unit tests could
hope to achieve. I tend to use QuickCheck, SmallCheck, *and*
LazySmallCheck in my test suites, as I feel that they complement each
other well. HUnit is probably the main one for traditional unit tests. I
admit I have never used it, and I'm not sure whether I'm missing out on
anything. There are also some pretty nice test frameworks out there to
help you manage all your tests, although they could probably use a
little more work overall.
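To make "high-level properties" concrete: a QuickCheck property is just a Bool-valued function, and QuickCheck's only job is to throw random inputs at it and report counterexamples. A minimal sketch (these particular properties are standard textbook examples, not from this thread):

```haskell
-- Properties are plain functions returning Bool; no framework is
-- needed to define them, only to fuzz them with random inputs.
prop_reverseInvolutive :: [Int] -> Bool
prop_reverseInvolutive xs = reverse (reverse xs) == xs

prop_reverseDistributes :: [Int] -> [Int] -> Bool
prop_reverseDistributes xs ys =
    reverse (xs ++ ys) == reverse ys ++ reverse xs
```

With QuickCheck in scope, `quickCheck prop_reverseInvolutive` runs 100 random cases by default.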

- Jake

Jonathan Geddes

unread,
Jan 5, 2011, 3:02:37 PM1/5/11
to Jake McArthur, haskell-cafe
> The Haskell type system is simply not rich enough to guarantee everything you might need.

That's true, and after giving this a bit more thought, I realized it's
not JUST the type system that I'm talking about here. There are a few
other features that make it hard for me to want to use unit/property
tests.

For example, say (for the sake of simplicity and familiarity) that I'm
writing the foldl function. If I were writing this function in any
other language, this would be my process: first I'd write a test to
check that foldl returns the original accumulator when the list is
empty. Then I would write code until the test passed. Then I would
move on to the next property of foldl and write a test for it. Rinse
repeat.

But in Haskell, I would just write the code:

> foldl _ acc [] = acc

The function is obviously correct for my (missing) test. So I move on
to the next parts of the function:

> foldl _ acc [] = acc
> foldl f acc (x:xs) = foldl f (f acc x) xs

and this is equally obviously correct. I can't think of a test that
would increase my confidence in this code. I might drop into the ghci
repl to manually test it once, but not a full unit test.

I said that writing Haskell code feels like "writing code against a
solid test base." But I think there's more to it than this. Writing
Haskell code feels like writing unit tests and letting the machine
generate the actual code from those tests. Declarative code for the
win.

Despite all this, I suspect that since Haskell is at a higher level of
abstraction than other languages, the tests in Haskell must be at a
correspondingly higher level than the tests in other languages. I can
see that such tests would give great benefits to the development
process. I am convinced that I should try to write such tests. But I
still think that Haskell makes a huge class of tests unnecessary.
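For instance, a higher-level property for foldl relates it to other functions instead of restating its own equations (these two properties are my own illustrative picks):

```haskell
-- foldl agrees with sum when folding (+) from 0 ...
prop_foldlSum :: [Int] -> Bool
prop_foldlSum xs = foldl (+) 0 xs == sum xs

-- ... and folding flipped cons from [] reverses the list.
prop_foldlReverse :: [Int] -> Bool
prop_foldlReverse xs = foldl (flip (:)) [] xs == reverse xs
```

Either property would catch a mistake in the recursive case that the trivial empty-list test never exercises.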

> Haskell has some awesome testing tool, and I highly recommend getting
> acquainted with them.

I will certainly take your advice here. Like I said, I use TDD in
other languages but mysteriously don't feel its absence in Haskell. I
probably need to get into better habits.

--Jonathan

Anthony Cowley

unread,
Jan 5, 2011, 3:30:26 PM1/5/11
to Jonathan Geddes, haskell-cafe
On Wed, Jan 5, 2011 at 3:02 PM, Jonathan Geddes
<geddes....@gmail.com> wrote:
>> The Haskell type system is simply not rich enough to guarantee everything you might need.
> Despite all this, I suspect that since Haskell is at a higher level of
> abstraction than other languages, the tests in Haskell must be at a
> correspondingly higher level than the tests in other languages. I can
> see that such tests would give great benefits to the development
> process. I am convinced that I should try to write such tests. But I
> still think that Haskell makes a huge class of tests unnecessary.

The way I think about this is that you want to write tests for things
that can not be usefully represented in the type system. If you have a
parametrically typed function, then the type system is doing a lot of
useful "testing" for you. If you want to make sure that you properly
parse documents in a given format, then having a bunch of examples
that feed into unit tests is a smart move.
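A small illustration of how much "testing" parametricity buys (`halve` is a made-up example): any f :: [a] -> [a] automatically satisfies the free theorem f . map g == map g . f, so tests for such a function only need to pin down the order and multiplicity of elements.

```haskell
-- A parametric function can only rearrange, drop, or duplicate
-- elements; it cannot inspect or invent them.
halve :: [a] -> [a]
halve xs = take (length xs `div` 2) xs

-- The free theorem holds for any g without writing any test code;
-- checking one instance merely demonstrates it.
prop_naturality :: [Int] -> Bool
prop_naturality xs = halve (map (* 2) xs) == map (* 2) (halve xs)
```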

Anthony

Edward Z. Yang

unread,
Jan 5, 2011, 3:35:26 PM1/5/11
to Jonathan Geddes, haskell-cafe
Haskell's type system makes large classes of traditional "unit tests"
irrelevant. Here are some examples:

- Tests that simply "run" code to make sure there are no syntax
errors or typos,

- Tests that exercise simple input validation that is handled by the
type system, i.e. passing an integer to a function when it
expects a string,

But, as many other people have mentioned, that doesn't nearly cover all
unit tests (although when I look at some people's unit tests, one might
think this was the case.)

Cheers,
Edward

Gregory Collins

unread,
Jan 5, 2011, 4:27:29 PM1/5/11
to Jonathan Geddes, haskell-cafe
On Wed, Jan 5, 2011 at 9:02 PM, Jonathan Geddes
<geddes....@gmail.com> wrote:

> Despite all this, I suspect that since Haskell is at a higher level of
> abstraction than other languages, the tests in Haskell must be at a
> correspondingly higher level than the tests in other languages. I can
> see that such tests would give great benefits to the development
> process. I am convinced that I should try to write such tests. But I
> still think that Haskell makes a huge class of tests unnecessary.

The testing stuff available in Haskell is top-notch, as others have
pointed out. One of the biggest PITAs with testing in other languages
is having to come up with a set of test cases to fully exercise your
code. If you don't keep code coverage at 100% or close to it, it is
quite easy to test only the inputs you are *expecting* to see (because
programmers are lazy) and end up with something which is quite broken
or even insecure w.r.t. buffer overruns, etc. (Of course we don't
usually have those in Haskell either.)

QuickCheck especially is great because it automates this tedious work:
it fuzzes out the input for you and you get to think in terms of
higher-level invariants when testing your code. Since about six months
ago with the introduction of JUnit XML support in test-framework, we
also have plug-in instrumentation support with continuous integration
tools like Hudson:

http://buildbot.snapframework.com/job/snap-core/
http://buildbot.snapframework.com/job/snap-server/

It's also not difficult to set up automated code coverage reports:

http://buildbot.snapframework.com/job/snap-core/HPC_Test_Coverage_Report/
http://buildbot.snapframework.com/job/snap-server/HPC_Test_Coverage_Report/

Once I had written the test harness, I spent literally less than a
half-hour setting this up. Highly recommended, even if it is a (blech)
Java program. Testing is one of the few areas where I think our
"software engineering" tooling is on par with or exceeds that which is
available in other languages.

G
--
Gregory Collins <gr...@gregorycollins.net>

Iustin Pop

unread,
Jan 5, 2011, 4:32:27 PM1/5/11
to Gregory Collins, haskell-cafe
On Wed, Jan 05, 2011 at 10:27:29PM +0100, Gregory Collins wrote:
> Once I had written the test harness, I spent literally less than a
> half-hour setting this up. Highly recommended, even if it is a (blech)
> Java program. Testing is one of the few areas where I think our
> "software engineering" tooling is on par with or exceeds that which is
> available in other languages.

Indeed, I have found this to be true as well, and have been trying to
explain it to non-Haskellers. Though I would also rank the memory/space
profiler very high compared to what is available for some other
languages.

And note, it's also easy to integrate with the Python-based buildbot, if
one doesn't want to run Java :)

regards,
iustin

Erik de Castro Lopo

unread,
Jan 5, 2011, 6:02:33 PM1/5/11
to haskel...@haskell.org
Jonathan Geddes wrote:

> I know that much of my code could benefit from a property test or two
> on the more complex parts, but other than that I can't think that unit
> testing will improve my Haskell code/programming practice.

One other thing I should mention is that since a lot of
Haskell code is purely functional, it's actually easier to test
than imperative code and particularly OO code.

The difficulties in unit testing OO code is coaxing objects
into the correct state to test a particular property. Usually
this means a whole bunch of extra code to implement mock
objects to feed the right data to the object under test.

By contrast, much Haskell code is purely functional. With
pure functions there is no state that needs to be set up. For
testing pure functions, it's just a matter of collecting a set
of representative inputs and making sure the correct output is
generated for each input. For example, Don Stewart reported that
the XMonad developers consciously made as much of the XMonad code
pure as possible, so it was more easily testable.
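Concretely, a table of representative inputs and outputs is all such a test needs (`slugify` and its cases are invented for illustration, not from any real codebase):

```haskell
import Data.Char (toLower)

-- A made-up pure function to test: no mocks, no state setup.
slugify :: String -> String
slugify = map (\c -> if c == ' ' then '-' else toLower c)

-- The whole test is a table of representative inputs and
-- expected outputs ...
slugifyCases :: [(String, String)]
slugifyCases =
    [ (""            , ""            )
    , ("Hello World" , "hello-world" )
    , ("already-fine", "already-fine")
    ]

-- ... plus one line to run it.
slugifyOK :: Bool
slugifyOK = all (\(input, expected) -> slugify input == expected) slugifyCases
```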

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/


John Zabroski

unread,
Jan 5, 2011, 6:41:49 PM1/5/11
to Gregory Collins, Jonathan Geddes, haskell-cafe
These are some heuristics & memories I have for myself, and you can feel free to take whatever usefulness you can get out of it.

1. Don't confuse TDD with writing tests, in general.

2. Studies show that if you do TDD, you can write more tests than if you write tests after you write the code.  Therefore, TDD is the most productive way to test your code.

3. TDD has nothing to do with what language you are using; if you want a great book on TDD, I'd recommend Nat Pryce and Steve Freeman's Growing Object-Oriented Software; it has nothing to do with Haskell but everything to do with attitude towards software process.  A language is not enough to dramatically improve quality, you need a sane process.  Picking a good language is just as important as picking a sane process, and the two go hand-in-hand in creating great results.

4. Haskell's type system gives you confidence, not certainty, that you are correct.  Understand the difference.  The value of TDD is that the tests force you to think through your functional and non-functional requirements, before you write the code.

5. I have a hard time understanding statements like "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property."  Difficulty in unit testing OO code is best documented in Robert Binder's tome [1], which is easily the best text on testing I've ever read and never gets cited by bloggers and other Internet programmarazzi (after all, who has time to read 1,500 pages on testing when you have to maintain a blog).  Moreover, you should not be mocking objects.  That will lead to a combinatorial explosion in tests and likely reveal that your object model leaks encapsulation details (think about it).  Mock the role the object plays in the system instead; this is kind of a silly way to say "use abstraction" but I've found most people need to hear a good idea 3 different ways in 3 different contexts before they can apply it beyond one playground trick.

6. If you care about individual objects, use design by contract and try to write your code as stateless as possible; design by contract is significantly different from TDD.

7. Difficulty in testing objects depends on how you describe object behavior, and has nothing to do with any properties of objects as compared with abstract data types!  For example, if object actions are governed by an event system, then to test an interaction, you simply mock the event queue manager.  This is because you've isolated your test to three variants: (A) the state prior to an atomic action, (B) the state after that action, and (C) any events the action generates.   This is really not any more complicated than using QuickCheck, but unfortunately most programmers are only familiar with using xUnit libraries for unit testing and they have subdued the concept of "unit testing" to a common API that is not particularly powerful.  Also, note earlier my dislike of the argument that "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property."; under this testing methodology, there is no "particular property" to test, since the state of the application is defined in terms of all the object's attributes *after* the action has been processed.  It doesn't make much sense to just test one property.  It's called unit testing, not property testing.

8. If you've got a complicated problem, TDD will force you to decompose it before trying to solve it.  This is sort of a silly point, again, since more naturally good programmers don't waste their time writing random code first and then trying to debug it.  Most naturally good programmers will think through the requirements and write it correctly the first time.  TDD is in some sense just a bondage & discipline slogan for the rest of us mere mortals; you get no safety word, however.  You just have to keep at it.

9. Watch John Hughes' Functional Programming Secret Weapon talk [2], if you haven't already.

10. Watch and learn.  Google "QuickCheck TDD" [3] and see what comes up.  Maybe you can be inspired by real world examples?

[1] http://www.amazon.com/dp/0201809389
[2] http://video.google.com/videoplay?docid=4655369445141008672
[3] http://www.google.com/search?q=QuickCheck+TDD

John Zabroski

unread,
Jan 5, 2011, 7:20:33 PM1/5/11
to Gregory Collins, Jonathan Geddes, haskell-cafe
On Wed, Jan 5, 2011 at 6:41 PM, John Zabroski <johnza...@gmail.com> wrote:

5. I have a hard time understanding statements like "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property."  Difficulty in unit testing OO code is best documented in Robert Binder's tome [1], which is easily the best text on testing I've ever read and never gets cited by bloggers and other Internet programmarazzi (after all, who has time to read 1,500 pages on testing when you have to maintain a blog).  Moreover, you should not be mocking objects.  That will lead to a combinatorial explosion in tests and likely reveal that your object model leaks encapsulation details (think about it).  Mock the role the object plays in the system instead; this is kind of a silly way to say "use abstraction" but I've found most people need to hear a good idea 3 different ways in 3 different contexts before they can apply it beyond one playground trick.


7. Difficulty in testing objects depends on how you describe object behavior, and has nothing to do with any properties of objects as compared with abstract data types!  For example, if object actions are governed by an event system, then to test an interaction, you simply mock the event queue manager.  This is because you've isolated your test to three variants: (A) the state prior to an atomic action, (B) the state after that action, and (C) any events the action generates.   This is really not any more complicated than using QuickCheck, but unfortunately most programmers are only familiar with using xUnit libraries for unit testing and they have subdued the concept of "unit testing" to a common API that is not particularly powerful.  Also, note earlier my dislike of the argument that "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property."; under this testing methodology, there is no "particular property" to test, since the state of the application is defined in terms of all the object's attributes *after* the action has been processed.  It doesn't make much sense to just test one property.  It's called unit testing, not property testing.

One small update for pedagogic purposes: Testing properties is really just a form of testing called negative testing: testing that something doesn't do something it shouldn't do.  The testing I covered above describes positive testing.  Negative testing is always going to be difficult, regardless of how you abstract your system and what language you use.  Think about it!

Erik de Castro Lopo

unread,
Jan 5, 2011, 7:26:57 PM1/5/11
to haskel...@haskell.org
John Zabroski wrote:

> 5. I have a hard time understanding statements like "The difficulties in
> unit testing OO code is coaxing objects into the correct state to test a
> particular property."

This is my direct experience of inheriting code written by others
without any tests and trying to add tests before doing more serious
work on extending and enhancing the code base.

> Difficulty in unit testing OO code is best documented
> in Robert Binder's tome [1],

I'm sure that's a fine book for testing OO code. I'm trying to avoid
OO code as much as possible :-).

My main point was that testing pure functions is easy and obvious
in comparison to test objects with internal state.

Cheers,


Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/


Evan Laforge

unread,
Jan 5, 2011, 9:26:52 PM1/5/11
to Gregory Collins, haskell-cafe
On Wed, Jan 5, 2011 at 1:27 PM, Gregory Collins <gr...@gregorycollins.net> wrote:
> On Wed, Jan 5, 2011 at 9:02 PM, Jonathan Geddes
> <geddes....@gmail.com> wrote:
>
>> Despite all this, I suspect that since Haskell is at a higher level of
>> abstraction than other languages, the tests in Haskell must be at a
>> correspondingly higher level than the tests in other languages. I can
>> see that such tests would give great benefits to the development
>> process. I am convinced that I should try to write such tests. But I
>> still think that Haskell makes a huge class of tests unnecessary.

I write plenty of tests. Where static typing helps is that of course
I don't write tests for type errors, and more things are type errors
than might be in other languages (such as incomplete cases). But I
write plenty of tests to verify high level relations: with this input,
I expect this kind of output.

A cheap analogue to "test driven" that I often do is "type driven": I
write down the types and functions with the hard bits filled in with
'undefined'. Then I :reload the module until it typechecks. Then I
write tests against the hard bits, and run the tests in ghci until
they pass.
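That workflow can be sketched with a toy function (`weightedMean` is my own example, not Evan's code):

```haskell
-- Step 1: write the type and stub out the hard bit:
--
--     weightedMean :: [(Double, Double)] -> Double
--     weightedMean = undefined
--
-- Step 2: :reload until everything that *uses* it typechecks.
-- Step 3: fill in the stub and test it from ghci:
weightedMean :: [(Double, Double)] -> Double
weightedMean ps = sum [x * w | (x, w) <- ps] / sum [w | (_, w) <- ps]
```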

However:

> QuickCheck especially is great because it automates this tedious work:
> it fuzzes out the input for you and you get to think in terms of
> higher-level invariants when testing your code. Since about six months
> ago with the introduction of JUnit XML support in test-framework, we
> also have plug-in instrumentation support with continuous integration
> tools like Hudson:

Incidentally, I've never been able to figure out how to use
QuickCheck. Maybe it has more to do with my particular app, but
QuickCheck seems to expect simple input data and simple properties
that should hold relating the input and output, and in my experience
that's almost never true. For instance, I want to ascertain that a
function is true for "compatible" signals and false for "incompatible"
ones, where the definition of compatible is quirky and complex. I can
make quickcheck generate lots of random signals, but to make sure the
"compatible" is right means reimplementing the "compatible" function.
Or I just pick a few example inputs and expected outputs. To get
abstract enough that I'm not simply reimplementing the function under
test, I have to move to a higher level, and say that notes that have
incompatible signals should be distributed among synthesizers so they
don't make each other sound funny. But now it's too high level: I
need a definition of "sound funny" and a model of a synthesizer... way
too much work and it's fuzzy anyway. And at this level the input data
is complex enough that I'd have to spend a lot of time writing and
tweaking (and testing!) the data generator to verify it's covering the
part of the state space I want to verify.

I keep trying to think of ways to use QuickCheck, and keep failing.

In my experience, the main work of testing devolves to a library of
functions to create the input data, occasionally very complex, and a
library of functions to extract the interesting bits from the output
data, which is often also very complex. Then it's just a matter of
'equal (extract (function (generate input data))) "abstract
representation of output data"'. This is how I do testing in python
too, so I don't think it's particularly haskell-specific.
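A sketch of the shape of that pattern (none of these names come from Evan's actual codebase; `Event`, `mkEvents`, and friends are placeholders):

```haskell
-- A stand-in data type for "complex input data".
data Event = Event { time :: Int, pitch :: Int } deriving (Eq, Show)

-- A library function to create the input data ...
mkEvents :: [Int] -> [Event]
mkEvents = zipWith Event [0 ..]

-- ... the function under test ...
transposeUp :: Int -> [Event] -> [Event]
transposeUp n = map (\e -> e { pitch = pitch e + n })

-- ... and a function to extract the interesting bits from the output.
extractPitches :: [Event] -> [Int]
extractPitches = map pitch

-- The test itself is then a single comparison.
transposeOK :: Bool
transposeOK = extractPitches (transposeUp 2 (mkEvents [60, 64])) == [62, 66]
```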

I initially tried to use the test-framework stuff and HUnit, but for
some reason it was really complicated and confusing to me, so I gave
up and wrote my own that just runs all functions starting with
'test_'. It means I don't get to use the fancy tools, but I'm not
sure I need them. A standard profile output to go into a tool to draw
some nice graphs of performance after each commit would be nice
though; surely there is such a thing out there?

Chung-chieh Shan

unread,
Jan 5, 2011, 10:31:32 PM1/5/11
to haskel...@haskell.org
Evan Laforge <qdu...@gmail.com> wrote in article <AANLkTinfYp-bpbS1GA8_=o9WcrHe+duu...@mail.gmail.com> in gmane.comp.lang.haskell.cafe:

> Incidentally, I've never been able to figure out how to use
> QuickCheck. Maybe it has more to do with my particular app, but
> QuickCheck seems to expect simple input data and simple properties
> that should hold relating the input and output, and in my experience
> that's almost never true. For instance, I want to ascertain that a
> function is true for "compatible" signals and false for "incompatible"
> ones, where the definition of compatible is quirky and complex. I can
> make quickcheck generate lots of random signals, but to make sure the
> "compatible" is right means reimplementing the "compatible" function.
> Or I just pick a few example inputs and expected outputs.

Besides those example inputs and expected outputs, what about:
If two signals are (in)compatible then after applying some simple
transformations to both they remain (in)compatible? A certain family of
signals is always compatible with another family of signals? Silence is
compatible with every signal? Every non-silent signal is (in)compatible
with itself (perhaps after applying a transformation)?
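The first of those suggestions has this shape (`compatible` and `shiftSignal` are placeholders here; the real definitions are quirky and private to Evan's application, and the point is the shape of the property, not the definitions):

```haskell
type Signal = [Double]

-- Stand-in definition only; imagine something quirky and complex.
compatible :: Signal -> Signal -> Bool
compatible a b = length a == length b

-- A simple transformation applied to both signals.
shiftSignal :: Double -> Signal -> Signal
shiftSignal d = map (+ d)

-- Metamorphic property: transforming both signals the same way
-- should never change whether they are compatible.
prop_shiftPreservesCompat :: Double -> Signal -> Signal -> Bool
prop_shiftPreservesCompat d a b =
    compatible a b == compatible (shiftSignal d a) (shiftSignal d b)
```

Such properties never reimplement `compatible`; they only constrain it.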

--
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig
<INSERT PARTISAN STATEMENT HERE>

Jesse Schalken

unread,
Jan 5, 2011, 11:45:35 PM1/5/11
to Jonathan Geddes, haskell-cafe
You need both. A good static type system will tell you whether or not the code is type-correct. It will not tell you whether or not it does what it's supposed to do.

Consider:

sort :: [a] -> [a]

If you change sort to be:

sort = id

It will still type check, but it obviously doesn't do what it's supposed to do anymore. You need tests to verify that.

If you then change sort to be:

sort _ = 5

Now it's also type-incorrect. Static typing will catch it at compile time (eg. Haskell will now infer the type as "Num b => a -> b" which will not unify with "[a] -> [a]"), and dynamic typing will likely throw some sort of type error at run time in the places it was previously used. (Any error thrown by the language itself, like PHP's "Cannot call method on non-object" or Python's "TypeError" or even Java's "NullPointerException" or C++'s "Segmentation Fault" can be considered a type error.)

So with static typing, the machine will verify type-correctness, but you still need tests to verify the program meets its specification. With dynamic typing, you need tests to verify that the program meets both its specification and doesn't throw any type errors - so you need to test more.

The fact that most errors in programming are type errors and that Haskell programs therefore tend to "just work" once you can get them past the type checker may lead you to believe you don't need to test at all. But you still do for the reasons above, you just need to test a hell of a lot less.

Arnaud Bailly

unread,
Jan 6, 2011, 2:55:24 AM1/6/11
to haskell-cafe
I would supplement this excellent list of advice with an emphasis on
the first item: Test-Driven Development is *not* testing; TDD is a
*design* process. Like you said, it is a discipline of thought that
forces you first to express your intent with a test, second to write
the simplest thing that can possibly succeed, third to remove
duplication and refactor your code.

It happens that this process is somewhat different in Haskell than in,
say, Java, and actually much more fun and interesting thanks to
Haskell's high signal-to-noise-ratio syntax (once you get acquainted
with it, of course) and its excellent support for abstraction,
duplication removal, generalization and, more generally, refactoring
(tool support may be better though...). For example, if I were to
develop map in TDD (which I did actually...), I could start with the
following unit test:

> map id [] ~?= []

which I would make pass very simply by copy and pasting, only changing
one symbol.

> map id [] = []

Then I would add a failing test case:

> TestList [ map id [] ~?= [] , map id [1] ~?= [1] ]

which I would make pass with, once again simple copy-pasting:

> map id [] = []
> map id [1] = [1]

Next test could be :

> TestList [ map id [] ~?= [] , map id [1] ~?= [1], map id [2] ~?= [2] ]

Which of course would pass with:

> map id [] = []
> map id [1] = [1]
> map id [2] = [2]

then I would notice an opportunity for refactoring:

> map id [] = []
> map id [x] = [x]

etc., etc... Sounds silly? Sure it does at first sight, and any
self-respecting Haskeller would write such a piece, just like you
said, without feeling the need to write the tests, simply by stating
the equations for map.

The nice thing with Haskell is that it has a few features that help
in making those bigger steps in TDD, whereas less gifted languages and
platforms require a lot more experience and self-confidence to "start
running":
- writing types takes some of the design burden off the tests, while
keeping their intent executable (instead of lying on dead trees),
- QuickCheck and friends help you express a whole class of those unit
tests in one invariant expression, while keeping the spirit of TDD, as
one can use the counter-examples produced to drive the code-writing
process.
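As a sketch of that last point (using the standard quickCheck driver; `myMap` is a local name so as not to shadow the Prelude), the whole ladder of unit tests for map above collapses into a single invariant:

```haskell
import Test.QuickCheck (quickCheck)

-- a local definition, so as not to clash with Prelude.map
myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs

-- one invariant subsumes `map id [] ~?= []`, `map id [1] ~?= [1]`, ...
prop_mapId :: [Int] -> Bool
prop_mapId xs = myMap id xs == xs

main :: IO ()
main = quickCheck prop_mapId
```

A counter-example reported here would point directly at the next equation to write, which is the TDD loop in miniature.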

<plug>
Some people might be interested in
http://www.oqube.net/projects/beyond-tdd/ (a session I co-presented at
SPA2009) which was an experiment to try bringing the benefits of TDD
with quickcheck in haskell to the java world.
</plug>

Regards,
Arnaud

PS: On

Serguey Zefirov

unread,
Jan 6, 2011, 5:21:21 AM1/6/11
to Arnaud Bailly, haskell-cafe
2011/1/6 Arnaud Bailly <arnaud...@gmail.com>:

> I would supplement this excellent list of advice with an emphasis on
> the first one: Test-Driven Development is *not* testing, TDD is a
> *design* process. Like you said, it is a discipline of thought that
> forces you first to express your intent with a test, second to write
> the simplest thing that can possibly succeed, third to remove
> duplication and refactor your code.

Change T in TDD from Test to Type and you still get a valid
description like "It is a discipline of thought that forces you first
to express your intent with a type, second to write the simplest
thing that can possibly succeed, third to remove duplication and
refactor your code."

As for me, I prefer testing at the largest possible scale.

I write functions, experiment with them in the REPL, combine them and
check the combined result in the REPL, and when I cannot specify an
experiment in one line of ghci, I write a test.

Serguey Zefirov

unread,
Jan 6, 2011, 5:36:55 AM1/6/11
to Evan Laforge, haskell-cafe
2011/1/6 Evan Laforge <qdu...@gmail.com>:

>> QuickCheck especially is great because it automates this tedious work:
>> it fuzzes out the input for you and you get to think in terms of
>> higher-level invariants when testing your code. Since about six months
>> ago with the introduction of JUnit XML support in test-framework, we
>> also have plug-in instrumentation support with continuous integration
>> tools like Hudson:
> Incidentally, I've never been able to figure out how to use
> QuickCheck.  Maybe it has more to do with my particular app, but
> QuickCheck seems to expect simple input data and simple properties
> that should hold relating the input and output, and in my experience
> that's almost never true.  For instance, I want to ascertain that a
> function is true for "compatible" signals and false for "incompatible"
> ones, where the definition of compatible is quirky and complex.  I can
> make quickcheck generate lots of random signals, but to make sure the
> "compatible" is right means reimplementing the "compatible" function.

I should say that this reimplementation would be good. If you can
compare two implementations (one in plain Haskell and a second in
declarative QuickCheck rules) you will be better off than with only
one.

We did that when testing implementations of commands in a CPU model.
Our model was built to a specification and we had to be sure we
implemented it right. One problem was in the CPU flags setup: the
specification was defined in terms of bit manipulation, so we wrote
tests that did the same but with ordinary arithmetic. For example,
carry = (a+b) `shiftR` 8 was compared with carry = bit operandA 7 &&
bit operandB 7 && not (bit result 7). We found errors in our
implementation, we fixed them, and almost no errors were found after
that.
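A sketch of how such a comparison might look with QuickCheck (the 8-bit word size and all names are my assumptions, and I have spelled out the full majority-style carry formula rather than the abbreviated one quoted above):

```haskell
import Data.Bits (shiftR, testBit)
import Data.Word (Word8)

-- carry-out of 8-bit addition, computed with plain arithmetic
carryArith :: Word8 -> Word8 -> Bool
carryArith a b = ((fromIntegral a + fromIntegral b :: Int) `shiftR` 8) /= 0

-- the same flag, computed bit by bit as a specification might state it
carryBits :: Word8 -> Word8 -> Bool
carryBits a b =
  let r = a + b
  in (testBit a 7 && testBit b 7)
     || ((testBit a 7 || testBit b 7) && not (testBit r 7))

-- the property to hand to QuickCheck: both implementations must agree
prop_carry :: Word8 -> Word8 -> Bool
prop_carry a b = carryArith a b == carryBits a b

main :: IO ()
main = print (and [prop_carry a b | a <- [minBound ..], b <- [minBound ..]])
-- prints True (here the space is small enough to check exhaustively)
```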

Doing two implementations for testing purposes can be boldly likened
to doing a code review with only one person.

Felipe Almeida Lessa

unread,
Jan 7, 2011, 9:02:34 AM1/7/11
to Serguey Zefirov, haskell-cafe
Seeing all the good discussion on this thread, I think we are missing
a TDD page on our Haskell.org wiki. =)

Cheers,

--
Felipe.

Florian Weimer

unread,
Jan 7, 2011, 2:36:26 PM1/7/11
to Jonathan Geddes, haskell-cafe
* Jonathan Geddes:

> When I write Haskell code, I write functions (and monadic actions)
> that are either a) so trivial that writing any kind of unit/property
> test seems silly, or are b) composed of other trivial functions using
> equally-trivial combinators.

You can write in this style in any language which has good support for
functional composition (which means some sort of garbage collection
and perhaps closures, but strong support for higher-order functions is
probably not so important). But this doesn't mean that you don't have
bugs. There are a few error patterns I've seen when following this
style (albeit not in Haskell):

While traversing a data structure (or parsing some input), you fail to
make progress and end up in an infinite loop.

You confuse right and left (or swap two parameters of the same type),
leading to wrong results.

All the usual stuff about border conditions still applies.

Input and output is more often untyped than typed. Typos in magic
string constants (such as SQL statements) happen frequently.

Therefore, I think that you cannot really avoid extensive testing for
a large class of programming tasks.

> I vehemently disagreed, stating that invariants embedded in the type
> system are stronger than any other form of assuring correctness I know
> of.

But there are very interesting invariants you cannot easily express in
the type system, such as "this list is of finite length". It also
seems to me that most Haskell programmers do not bother to turn the
typechecker into some sort of proof checker. (Just pick a few
standard data structures on hackage and see if they perform such
encoding. 8-)

Evan Laforge

unread,
Jan 7, 2011, 11:16:52 PM1/7/11
to Chung-chieh Shan, haskel...@haskell.org
On Wed, Jan 5, 2011 at 7:31 PM, Chung-chieh Shan
<ccs...@post.harvard.edu> wrote:
> Besides those example inputs and expected outputs, what about:
> If two signals are (in)compatible then after applying some simple
> transformations to both they remain (in)compatible?  A certain family of
> signals is always compatible with another family of signals?  Silence is
> compatible with every signal?  Every non-silent signal is (in)compatible
> with itself (perhaps after applying a transformation)?

Well, signals are never transformed. Silence is, in fact, not
specially compatible. The most I can say is that signals that don't
overlap are always compatible. So you're correct in that it's
possible to extract properties. However, this particular property,
being simple, is also expressed in a simple way directly in the code,
so past a couple tests to make sure I didn't reverse any (>)s, I don't
feel like it needs the exhaustive testing that quickcheck brings to
bear. And basically it's just reimplementing a part of the original
function, in this case the first guard... I suppose you could say if I
typoed the (>)s in the original definition, maybe I won't in the test
version. But this is too low level, what I care about is if the whole
thing has the conceptually simple but computationally complex result
that I expect. The interesting bug is when the first guard shadows an
exception later on, so it turns out it's *not* totally true that
non-overlapping signals must be compatible, or maybe my definition of
"overlapping" is not sufficiently defined, or defined different ways
in different places, or needs to be adjusted, or.... I suppose input
fuzzing should be able to flush out things like fuzzy definitions of
overlapping.

I can also say weak things about complex outputs, that they will be
returned in sorted order, that they won't overlap, etc. But those
are rarely the interesting complicated things that I really want to
test. Even my "signal compatibility" example is relatively amenable
to extracting properties, picking some other examples:

- Having a certain kind of syntax error will result in a certain kind
of error msg, and surrounding expressions will continue to be included
in the output. The error msg will include the proper location. So
I'd need an Arbitrary to generate the right structure with an error or
two and then have code to figure out the reported location from the
location in the data structure, and debug all that. There's actually
a fair amount of stuff that wants to look for a log msg, like "hit a
cache, the only sign of which is a log msg of a certain format".
Certainly caches can be tested by asserting that you get the same
results with the cache turned off, that's an easy property.

- Hitting a certain key sequence results in certain data being entered
in the UI. There's nothing particularly "property" like about this,
it's too ad-hoc... this seems to apply for all UI-level tests.

- There's also a large class of "integration" type tests: I've tested
the signal compatibility function separately, but the final proof is
that the high-level user input of this shape results in this output,
due to signal compatibility. These are the ones whose failure is
the most valuable because they test the emergent behaviour of a set of
interacting systems, and that's ultimately the user-visible behaviour
and also the place where the results are the most subtle. But those
are also the ones that have huge state spaces and, similar to the UI
tests, basically ad-hoc relationships between input and output.

- Testing for laziness of course doesn't work either. Or timed
things. As far as performance goes, some can be tested with tests
("taking the first output doesn't force the entire input" or "a new
key cancels the old threads and starts new ones") but some must be
tested with profiling and eyeballing the results.

QuickCheck seems to fit well when you have small input and output
spaces, but complicated stuff in the middle, but still simple
relations between the input and output. I think that's why data
structures are so easy to QuickCheck. I suppose I should look around
for more use of QuickCheck for non-data structures... the examples
I've seen have been trivial stuff like 'reverse . reverse = id'.

Evan Laforge

unread,
Jan 8, 2011, 12:11:34 AM1/8/11
to Serguey Zefirov, haskell-cafe
> I should say that this reimplementation would be good. If you can
> compare two implementations (one in plain Haskell and second in
> declarative QuickCheck rules) you will be better that with only one.

This presumes I know how to write a simple but slow version. Clearly,
that's an excellent situation, since you can trust your simple but
slow version more than the complex but fast one. Unfortunately, I'm
usually hard enough pressed to write just the slow version. If I
could think of a simpler way to write it I'd be really set, but I'm
already writing things in the simplest possible way I know how.

> Doing two implementation for testing purposes can be boldly likened to
> code review with only one person.

Indeed, but unfortunately it still all comes from the same brain. So
if it's too low level, I'll make the same wrong assumptions about the
input. If it's too high level, then writing a whole new program is
too much work. I think you make a good point, but one that's only
applicable in certain situations.

Heinrich Apfelmus

unread,
Jan 8, 2011, 8:33:54 AM1/8/11
to haskel...@haskell.org
Florian Weimer wrote:
> * Jonathan Geddes:
>
>> When I write Haskell code, I write functions (and monadic actions)
>> that are either a) so trivial that writing any kind of unit/property
>> test seems silly, or are b) composed of other trivial functions using
>> equally-trivial combinators.
>
> You can write in this style in any language which has good support for
> functional composition (which means some sort of garbage collection
> and perhaps closures, but strong support for higher-order functions is
> probably not so important). But this doesn't mean that you don't have
> bugs. There are a few error patterns I've seen when following this
> style (albeit not in Haskell):

As you mention, the bugs below can all be avoided in Haskell by using
the type system and the right abstractions and combinators. You can't
put everything in the type system - at some point, you do have to write
actual code - but you can isolate potential bugs to the point that their
correctness becomes obvious.

> While traversing a data structure (or parsing some input), you fail to
> make progress and end up in an infinite loop.

Remedy: favor higher-order combinators like fold and map over
primitive recursion.

> You confuse right and left (or swap two parameters of the same type),
> leading to wrong results.

Remedy: use more descriptive types, for instance by putting Int into
an appropriate newtype. Use infix notation source `link` target.
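A small sketch of that remedy (the names Source, Target and link are hypothetical): with distinct newtype wrappers, swapping the two arguments becomes a compile-time error instead of a wrong result.

```haskell
newtype Source = Source Int
newtype Target = Target Int

-- swapping the arguments, e.g. `link (Target 2) (Source 1)`,
-- would no longer compile
link :: Source -> Target -> (Int, Int)
link (Source s) (Target t) = (s, t)

main :: IO ()
main = print (Source 1 `link` Target 2)  -- prints (1,2)
```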

> All the usual stuff about border conditions still applies.

Partial remedy: choose natural boundary conditions, for example
"and [] = True" and "or [] = False".

But I would agree that this is one of the main use cases for QuickCheck.

> Input and output is more often untyped than typed. Typos in magic
> string constants (such as SQL statements) happen frequently.

Remedy: write magic string only once. Put them in a type-safe combinator
library.

> Therefore, I think that you cannot really avoid extensive testing for
> a large class of programming tasks.

Hopefully, the class is not so large anymore. ;)

>> I vehemently disagreed, stating that invariants embedded in the type
>> system are stronger than any other form of assuring correctness I know
>> of.
>
> But there are very interesting invariants you cannot easily express in
> the type system, such as "this list is of finite length".

Sure, you can.

data FiniteList a = Nil | Cons a !(FiniteList a)

I never needed to know whether a list is finite, though. It is more
interesting to know whether a list is infinite.

data InfiniteList a = a :> InfiniteList a

> It also
> seems to me that most Haskell programmers do not bother to turn the
> typechecker into some sort of proof checker. (Just pick a few
> standard data structures on hackage and see if they perform such
> encoding. 8-)

I at least regularly encode properties in the types, even if it's only a
type synonym. I also try to avoid classes of bugs by "making them
obvious", i.e. organizing my code in such a way that correctness becomes
obvious.


Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com

Serge Le Huitouze

unread,
Jan 12, 2011, 6:16:26 AM1/12/11
to haskel...@haskell.org
Evan Laforge <qdu...@gmail.com> wrote:

> QuickCheck seems to fit well when you have small input and output
> spaces, but complicated stuff in the middle, but still simple
> relations between the input and output. I think that's why data
> structures are so easy to QuickCheck. I suppose I should look around
> for more use of QuickCheck for non-data structures... the examples

> I've seen have been trivial stuff like 'reverse . reverse = id'.

I second this feeling...

For example, I've never seen (I haven't looked hard, though) QuickCheck
testing applied to graphs. Generating "interesting" (whatever that means
for your particular problem) graphs doesn't seem to be a trivial task,
even though a graph is a mere data structure...
Does anyone know of such examples?

Maybe genetic programming producing graphs is of relevance here (where
generating candidate graphs poses exactly the same problem).

Also, I wonder how you guys proceed when you're trying to test code
using a lot of numbers (be they floating point or even integer).

E.g., integers:
A code doing addition and subtraction of some sort.
A property such as "X = (X add Y) sub Y" is easily falsifiable when
the number of bits of your integer is too small for your numbers.

E.g. floating point numbers:
A code doing coordinate transformations of some sort.
Properties akin to "roundtripping" can be easily formulated (e.g.
"X = trfY_toX(trfX_toY(X))"), but they'll be falsified often due to lost
bits in the mantissa.
Replacing strict equality (almost always meaningless for floats) with
approximate equality will work.
However, definitions of such an approximate equality might vary (the
immediate idea is to compare the difference of the two numbers with
a small "epsilon", say 10^-9, but that will obviously not work if your
numbers are around 10^-12...).
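One common partial answer to this (a sketch; approxEq is a made-up name, and the cut-off is still a judgement call) is to make the tolerance relative, scaling it by the magnitude of the operands, so that it behaves the same around 10^-12 as around 10^12:

```haskell
-- approximate equality with a *relative* tolerance: the allowed gap
-- scales with the operands' magnitude.  Note that comparisons against
-- exactly 0 still need special treatment.
approxEq :: Double -> Double -> Bool
approxEq x y = abs (x - y) <= eps * max (abs x) (abs y)
  where eps = 1e-9

main :: IO ()
main = do
  print (approxEq 1.0e12 (1.0e12 + 1))  -- True: gap is tiny relative to 1e12
  print (approxEq 1.0 1.1)              -- False
```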

And there are also, of course, overflows and underflows, and singularities (e.g.
division by zero), non invertibility...

So, in addition to defining the approximation (not always easy, as I
tried to "demonstrate" above) to be used in comparisons, one probably
needs ad hoc generators whose complexity might very well exceed that
of the code one wants to test...

So, do you have any "methodology" for such use cases?

--Serge

Ivan Lazar Miljenovic

unread,
Jan 12, 2011, 7:16:52 AM1/12/11
to Serge Le Huitouze, haskel...@haskell.org
On 12 January 2011 21:16, Serge Le Huitouze <serge.le...@gmail.com> wrote:
> Evan Laforge <qdu...@gmail.com> wrote:
>
>> QuickCheck seems to fit well when you have small input and output
>> spaces, but complicated stuff in the middle, but still simple
>> relations between the input and output.  I think that's why data
>> structures are so easy to QuickCheck.  I suppose I should look around
>> for more use of QuickCheck for non-data structures... the examples
>> I've seen have been trivial stuff like 'reverse . reverse = id'.
>
> I second this feeling...
>
> For example, I've never seen (I've not looked hard, though) Quickcheck's
> testing applied on graphs. Generating "interesting" (whatever that means
> for your particular problem) graphs doesn't seem to be a trivial test, even
> if it's a mere data structure...
> Does anyone know of such examples?

I do some graph-based testing in graphviz [1]. It is non-trivial to
generate decent Arbitrary instances due to the recursive definitions
:s

[1]: http://hackage.haskell.org/package/graphviz

--
Ivan Lazar Miljenovic
Ivan.Mi...@gmail.com
IvanMiljenovic.wordpress.com

Henning Thielemann

unread,
Jan 12, 2011, 7:18:03 AM1/12/11
to Serge Le Huitouze, haskel...@haskell.org

On Wed, 12 Jan 2011, Serge Le Huitouze wrote:

> Also, I wonder how you guys proceed when you're trying to test code
> using a lot of numbers (be they floating point or even integer).

Machine size integers and floating point numbers are indeed nasty. I test
a lot in NumericPrelude with QuickCheck, but then I test on Integers and
Rationals in order to see whether my algorithms are in principle correct.
I can test floating point algorithms only with approximate equality tests
and finding general valid tolerances is difficult, as you pointed out.


> E.g., integers:
> A code doing addition and subtraction of some sort.
> A property such as "X = (X add Y) sub Y" is easily falsifiable when
> the number of bits of your integer is too small for your numbers.

Since fixed-width words implement modulo arithmetic, your law would
even hold in the case of overflow.
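Indeed, for a small fixed-width type the law can even be checked exhaustively rather than randomly (a quick sketch with Word8):

```haskell
import Data.Word (Word8)

-- (x + y) - y == x for every pair, even when x + y wraps around,
-- because Word8 arithmetic is arithmetic modulo 2^8
prop_addSub :: Word8 -> Word8 -> Bool
prop_addSub x y = (x + y) - y == x

main :: IO ()
main = print (and [prop_addSub x y | x <- [minBound ..], y <- [minBound ..]])
-- prints True
```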

Alberto G. Corona

unread,
Jan 12, 2011, 7:20:40 AM1/12/11
to Serge Le Huitouze, haskel...@haskell.org


2011/1/12 Serge Le Huitouze <serge.le...@gmail.com>
Evan Laforge <qdu...@gmail.com> wrote:
.....

So, in addition to defining the approximation (not always easy as I tried
to "demonstrate" above) to be used in comparisons, one probably needs
ad hoc generators whose complexity might very well exceed that of
the code one wants to test...

So, do you have any "methodology" for such use cases?


For this reason, whenever the space of test cases is difficult to cover, or when the border cases are not well known due to the complexity of the code, I created the package properties (http://hackage.haskell.org/package/properties).

The idea is to check the relevant subset of the test space, which is precisely the one the real application generates, so no extra generator coding is needed, while at the same time making sure that the property holds for all the real data.

It is, in essence, a library that permits assertions of properties defined somewhere else (something that assert does not permit), plus a mechanism for creating informative messages.

I didn't use it very much, however. Theoretically, I found it a good idea.

Alberto

Serge Le Huitouze

unread,
Jan 12, 2011, 7:45:22 AM1/12/11
to haskel...@haskell.org
Henning Thielemann <lem...@henning-thielemann.de> wrote:
>
>> A code doing addition and subtraction of some sort.
>> A property such as "X = (X add Y) sub Y" is easily falsifiable when
>> the number of bits of your integer is too small for your numbers.
>
> Since fix-width words represent modulo-arithmetic, your law would even hold
> in case of overflows.

True in this very example, but it's overly simplistic and I chose it
just for the sake of illustration.
A property such as "X = (X mul Y) div Y", with Y != 0 (of course ;-),
Y prime wrt 2^nbBits (nbBits being the size of your integers), and the
intermediate product exceeding 2^nbBits, would fail miserably...
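A concrete instance of that failure (the word size and values are my choice): once the intermediate product wraps modulo 2^8, integer division cannot undo the multiplication.

```haskell
import Data.Word (Word8)

prop_mulDiv :: Word8 -> Word8 -> Bool
prop_mulDiv x y = y == 0 || (x * y) `div` y == x

main :: IO ()
main = do
  print (prop_mulDiv 10 3)   -- True: 30 `div` 3 == 10, no overflow
  print (prop_mulDiv 100 3)  -- False: 300 wraps to 44, and 44 `div` 3 == 14
```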

--Serge

Ketil Malde

unread,
Jan 12, 2011, 8:05:52 AM1/12/11
to haskel...@haskell.org
Serge Le Huitouze <serge.le...@gmail.com> writes:

> So, do you have any "methodology" for such use cases?

QuickCheck has the ==> operator, which lets you add a precondition. So
you could limit the testing of your property to values that satisfy the
precondition.

An alternative is to use newtype with a custom generator to produce only
data that makes sense.
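Both suggestions might look like this for the earlier mul/div property (a sketch; QuickCheck actually ships a NonZero modifier already, the hand-rolled NonZeroW below is only for illustration, and QuickCheck's small default values keep the products far from Int overflow):

```haskell
import Test.QuickCheck

-- a newtype whose generator never produces zero
newtype NonZeroW = NonZeroW Int deriving Show

instance Arbitrary NonZeroW where
  arbitrary = NonZeroW <$> (arbitrary `suchThat` (/= 0))

-- precondition style: cases with y == 0 are generated, then discarded
prop_divPre :: Int -> Int -> Property
prop_divPre x y = y /= 0 ==> (x * y) `div` y == x

-- custom-generator style: no discarded cases at all
prop_divNewtype :: Int -> NonZeroW -> Bool
prop_divNewtype x (NonZeroW y) = (x * y) `div` y == x

main :: IO ()
main = do
  quickCheck prop_divPre
  quickCheck prop_divNewtype
```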

Of course, ideally you should design your types so that all possible
values are meaningful :-)

-k
--
If I haven't seen further, it is by standing in the footprints of giants

Gregory Crosswhite

unread,
Jan 12, 2011, 8:34:51 AM1/12/11
to haskel...@haskell.org
On 1/12/11 5:05 AM, Ketil Malde wrote:
> Of course, ideally you should design your types so that all possible
> values are meaningful:-)

Sadly we cannot all program in Agda. :-)

Cheers,
Greg

Charles Lambert

unread,
Jan 12, 2012, 1:32:14 AM1/12/12
to haskel...@googlegroups.com, haskell-cafe
I know this thread is about a year old. However, no one bothered to mention this:

Your tests should be the business requirements for the system you are building, in the form of source code. Every test should point back, either directly or indirectly, to a business rule. If you have tests that don't do this, then you are writing too many tests. Too many tests can make your code brittle and difficult to modify. This defeats the purpose of having the tests in the first place.

Also, when you get down to the unit level of testing, you should only be testing publicly visible functionality. In other words: don't test your hidden functions. If the publicly visible functions are working as expected, then so are your private ones.

For example, the system I'm working on right now has hard deletes. We have a new business rule that nothing gets deleted.

We already have tests that verify that if an item is deleted then you cannot retrieve it. I don't have to touch those tests. What I have to do is write some tests for a lower layer that verify that when you delete something it still exists in the data store. After those tests are written, I now have new rules in the system that I can use to verify that deleted items exist in the data store. Of course, my tests are going to fail because I have not implemented that functionality, but now I can write my code with confidence, knowing that the tests will verify that my soft-delete functionality works. The existing tests will verify that my new code does not break the existing business rule that when something is deleted you cannot retrieve it. Therefore regressions are less likely to be introduced into the system while making changes.

That is the whole point of red/green/refactor. You shouldn't be writing any code unless you have a business reason to do so. Therefore any new tests you write are for new business rules that did not exist before, and obviously the system does not currently provide that functionality. Therefore your test should fail. Once you write the code correctly, the test passes. After you have written it, you should go back and look for ways to reduce the complexity, using the tests to ensure that your reduced-complexity code still works as expected. The refactor step is just as important as the first two; it's what keeps your code maintainable.