A critique of test-first...

CTips

Nov 15, 2004, 12:07:06 AM

Everyone agrees that testing is important, but what use is it to write
the unit tests before starting coding (rather than after)?

IMO, its one advantage is that it forces the programmer to focus on the
problem before starting to code. Of course, this presumes that the
programmer is one of those who don't spend time thinking about the
problem, its specification, possible data-structures and algorithms etc.

Other advantages claimed for this rule (I'm quoting from
http://www.extremeprogramming.org/rules/testfirst.html ) are discussed below.

It is claimed: Test-first helps nail down (document?) the
requirements/specifications. Clearly, this is true when the
requirements/specifications are "shallow" - i.e. when one module
implements one (or many) requirements. On the other hand, as problem
complexity grows, this becomes false. Consider the requirement "the
database must implement rollback". This requirement will possibly be
implemented using multiple modules, each of which will have unit tests.
These tests, however, won't help nail down any specification/requirement.

It is claimed: Test-first helps define the scope of the module. (This
para assumes that "we create our unit tests first", all of them, which is
somewhat contradictory to the test-driven development approach
recommended 2 paras later). _IF_ one can define the functionality of a
particular module early, this may make sense. However, I have found that
the partitioning of work between modules only becomes clearer when a
certain amount of code has been written.

These two paras make a certain amount of sense in a situation where unit
tests are a subset of the acceptance tests - i.e. where modules directly
implement end-user function. They are less applicable in more complex
problems, where the end-user functionality requires several modules to
implement.

It is claimed: test-first makes code more testable at a system level. I
don't see how this follows. Does it matter whether the unit-tests are
written before or after coding? In fact, does it matter if the
unit-tests are written by a different team?

Phlip

Nov 15, 2004, 12:14:57 AM

CTips wrote:

> Everyone agrees that testing is important, but what use is it to write
> the unit tests before starting coding (rather than after)?
>
> IMO, its one advantage is that it forces the programmer to focus on the
> problem before starting to code. Of course, this presumes that the
> programmer is one of those who don't spend time thinking about the
> problem, its specification, possible data-structures and algorithms etc.
>
> Other advantages claimed for this rule (I'm quoting from
> http://www.extremeprogramming.org/rules/testfirst.html ) are discussed
> below.

You may want to read /Code Complete 2nd Edition/ by Steve McConnell on the
topic.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Laurent Bossavit

Nov 15, 2004, 4:04:57 AM

> Everyone agrees that testing is important, but what use is it to write
> the unit tests before starting coding (rather than after)?

One key benefit, not listed in the page you refer to, is avoiding
confirmation bias. You start with the hypothesis that your code doesn't
work, confirm the hypothesis, then write just enough code to "falsify"
the hypothesis.

Laurent

CTips

Nov 15, 2004, 7:38:47 AM

There is some truth to that. However, the bias will still exist, partly
because:
- at the time the test is written, the programmer may also have an
implementation in mind, so the test and implementation are coupled
- the implementation may pass the test, but have other bugs that will
show up.

The "write just enough code" approach will make it unlikely that there
are untested bugs _IF_ the module is simple. That condition often fails.
For instance, there are modules where one has to write 1kloc+ of code
before it is possible to run the simplest "real" test.

Alternative approaches to avoid confirmation bias (well, actually to
ensure that the module is adequately tested) include:
- use a coverage tool to ensure that a particular level of coverage has
been met
- have a separate tester.

Ronald E Jeffries

Nov 15, 2004, 7:52:12 AM

On Mon, 15 Nov 2004 00:07:06 -0500, CTips <ct...@bestweb.net> wrote:

>It is claimed: test-first makes code more testable at a system level. I
>don't see how this follows. Does it matter whether the unit-tests are
>written before or after coding? In fact, does it matter if the
>unit-tests are written by a different team?

Yes, it does matter. It's quite common for programmers to write code
that is hard to test when the tests are done after the fact. Much of
the point and value of Michael Feathers' new book on legacy code is
that he shows how to get such code back in testable shape.

When we write the tests first, at a bare minimum, we ensure that tests
can be written. Even if no other benefit accrued, that would be a
significant advantage compared to a lot of code that we see "out
there".

(I couldn't tell from CTips' entire article whether he understands
that TDD is done one test at a time, not a whole bunch of tests, then
code to match. It's really not well-described as "test-first";
"test-driven" does a better job. One alternates between test, code,
back to test.)

Regards,

--
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.

Ronald E Jeffries

Nov 15, 2004, 7:56:26 AM

On Mon, 15 Nov 2004 07:38:47 -0500, CTips <ct...@bestweb.net> wrote:

>Alternative approaches to avoid confirmation bias (well, actually to
>ensure that the module is adequately tested) include:
>- use a coverage tool to ensure that a particular level of coverage has
>been met
>- have a separate tester.

Yes, both these things are of value. A coverage tool will show up
parts of the code that need testing, but it will also often show up
"blind spots" in the testing that we're actually trying to do well.
Observing what's being missed will often help us adjust our style of
work for better coverage.

Separate testing is also of value. In Extreme Programming, one of the
methods that recommends TDD, this shows up in the "Customer Acceptance
Testing" practice, where independent tests are specified by the
customer team and used to verify overall system behavior.

I am not very favorably inclined to separate testers doing unit
testing, because the feedback comes to me later than I would like,
they often have difficulty understanding what I was doing well enough
to test it, and so on. Even in the case of Customer Acceptance
Testing, I'd like to have the tests automated and available right when
I think I'm done, so that I can run them before passing the code on.

Laurent Bossavit

Nov 15, 2004, 8:34:35 AM

> For instance, there are modules where one has to write 1kloc+ of
> code before it is possible to run the simplest "real" test.

What do you call a "real" test ?

What kind of module would require writing hundreds or a thousand lines
of code before you could unit test it ?

Laurent

John Roth

Nov 15, 2004, 9:38:15 AM

"CTips" <ct...@bestweb.net> wrote in message
news:10pgebf...@corp.supernews.com...

>
>
> Everyone agrees that testing is important, but what use is it to write the
> unit tests before starting coding (rather than after)?

As several other posters have mentioned, we're not talking about
writing the tests (plural) before writing the code. In fact, we're
not talking about designing a module, writing the tests, and then
writing the module.

What we're talking about is Test Driven Development, which
says to do some design, write one (singular) failing test, then
write exactly the code required to make that test pass, and
not one keystroke more. Repeat until done.

Designing a module, writing all the tests and then writing
the module is a strategy that has been suggested many times
over the last several decades, and it has never gained
enough traction to be fairly evaluated. It's pretty much regarded
as a failure.

Test Driven Development (write one test, write the code
to make it pass, refactor, repeat) is a success.

I think we've fallen into one of the major traps here: Test Driven
Development is a design technique first, and a testing technique
second. It is (or should be) a well known fact that the suite
of unit tests that comes out of TDD is not the same as the one a
professional tester would write after the fact. That test suite is grossly
insufficient according to classical testing techniques. However,
it works.

> IMO, its one advantage is that it forces the programmer to focus on the
> problem before starting to code. Of course, this presumes that the
> programmer is one of those who don't spend time thinking about the
> problem, its specification, possible data-structures and algorithms etc.

And that's simply wrong. Any developer who doesn't think about
design issues is going to be lost sooner rather than later if they
try to use Test Driven Development.

> Other advantages claimed for this rule (I'm quoting from
> http://www.extremeprogramming.org/rules/testfirst.html ) are discussed
> below.
>
> It is claimed: Test-first helps nail down (document?) the
> requirements/specifications. Clearly, this is true when the
> requirements/specifications are "shallow" - i.e. when one module
> implements one (or many) requirements. On the other hand, as problem
> complexity grows, this becomes false. Consider the requirement "the
> database must implement rollback". This requirement will possibly be
> implemented using multiple modules, each of which will have unit tests.
> These tests, however, won't help nail down any specification/requirement.

We've got another confusion here. You've shifted from programmer
(unit) tests to customer (acceptance) tests. They are two very different
things.

In any case, I wouldn't write "the database must implement rollback"
except to specify which data base to purchase, and that only if the
program required it.

What I would write is something like: "all transactions must leave
the data in the original state if they have to be abandoned before
successful completion." This is a "motherhood" or global requirement;
it should generate tests for each story that deals with a failure.

Mike Cohn (User Stories Applied) calls these constraints,
and suggests writing the word "constraint" on story cards
that must be obeyed rather than implemented. See p. 77.
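
As a rough illustration, one such failure-story test could look like the
sketch below. It uses Python's standard sqlite3 module; the accounts
table, transfer() and the insufficient-funds rule are hypothetical, made
up only to show the shape of the test.

```python
import sqlite3


def make_db():
    """Build an in-memory database with two accounts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def snapshot(conn):
    """Return the full table contents in a stable order."""
    return conn.execute(
        "SELECT name, balance FROM accounts ORDER BY name").fetchall()


def transfer(conn, src, dst, amount):
    """Move money between accounts; undo every change if the story fails."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()
    except Exception:
        conn.rollback()  # the constraint: abandoned work leaves no trace
        raise


def test_abandoned_transfer_leaves_data_unchanged():
    conn = make_db()
    before = snapshot(conn)
    try:
        transfer(conn, "alice", "bob", 500)
    except ValueError:
        pass
    else:
        raise AssertionError("expected the transfer to be abandoned")
    assert snapshot(conn) == before


test_abandoned_transfer_leaves_data_unchanged()
```

Each story that can fail would get its own test of this shape; the
constraint itself is never "implemented" as a separate feature.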

> It is claimed: Test-first helps define the scope of the module. (This para
> assumes that all "we create our unit tests first", which is somewhat
> contradictory to the test-driven development approach recommended 2 paras
> later). _IF_ one can define the functionality a particular module early,
> this may make sense. However, I have found that the partitioning of work
> between modules only becomes clearer when a certain amount of code has been
> written.
>
> These two paras make a certain amount of sense in a situation where unit
> tests are a subset of the acceptance tests - i.e. where modules directly
> implement end-user function. They are less applicable in more complex
> problems, where the end-user functionality requires several modules to
> implement.

See the comments I started out with. We don't deal with
module level practices.

> It is claimed: test-first makes code more testable at a system level. I
> don't see how this follows. Does it matter whether the unit-tests are
> written before or after coding? In fact, does it matter if the unit-tests
> are written by a different team?

It's very easy to write difficult-to-test code if you don't have
the tests in front of you. It's even easier if someone else is
doing the testing some time later.

It's practically impossible to write untestable code if you're
doing Test Driven Development. If you're also implementing
story by story, and doing continuous integration, it's practically
impossible to write a system that can't be tested at all
levels.

John Roth

Phlip

Nov 15, 2004, 10:05:22 AM

Have any of the coaches here seen anyone continue to ask questions
like these after using TDD, correctly and sustainably, for a few
weeks?

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

CTips

Nov 15, 2004, 10:42:55 AM

Laurent Bossavit wrote:

Many compiler optimizations - in one style of writing optimizations, we
perform an analysis phase (which does not make any observable change
to the output), followed by a transform phase (which does). The
analysis phases can be quite complex - several klocs long. Even if we
break it down to the minimal amount of analysis per transform kind, it
will still amount to 1kloc+. (BTW: this is based on recent experience,
not hypothetical).

Another case is certain kinds of device drivers - you have to get pretty
much all the basic parameters right or the device won't work at all - no
observable behavior. This also might require 1kloc+ of code. (Also based on
personal experience).

I suspect that a lot of your experience is in "business applications"
and/or GUI-centric apps, which tend to be "shallow", and other
small-to-medium sized, low complexity apps. Otherwise you'd probably be
able to think of lots of examples of stuff for which passing the "next"
unit test requires writing kloc+ of code.

CTips

Nov 15, 2004, 10:44:46 AM

Ilja Preuß

Nov 15, 2004, 10:59:53 AM

CTips wrote:

> IMO, its one advantage is that it forces the programmer to focus on
> the problem before starting to code. Of course, this presumes that the
> programmer is one of those who don't spend time thinking about the
> problem, its specification, possible data-structures and algorithms
> etc.

Actually, even those programmers who already do all those things might find
writing the tests a helpful tool to concentrate on client needs and to let
the thinking become more concrete.

> Other advantages claimed for this rule (I'm quoting from
> http://www.extremeprogramming.org/rules/testfirst.html ) are discussed
> below.
>
> It is claimed: Test-first helps nail down (document?) the
> requirements/specifications. Clearly, this is true when the
> requirements/specifications are "shallow" - i.e. when one module
> implements one (or many) requirements. On the other hand, as problem
> complexity grows, this becomes false. Consider the requirement "the
> database must implement rollback". This requirement will possibly be
> implemented using multiple modules, each of which will have unit
> tests. These tests, however, won't help nail down any
> specification/requirement.

The web page isn't very clear on this point, but I'd think that it is
speaking more about the implicit requirements on a module - "what does this
class need to do?" You are right that otherwise it doesn't make much sense.

> It is claimed: Test-first helps define the scope of the module. (This
> para assumes that all "we create our unit tests first", which is
> somewhat contradictory to the test-driven development approach
> recommended 2 paras later). _IF_ one can define the functionality a
> particular module early, this may make sense. However, I have found
> that the partitioning of work between modules only becomes clearer
> when a certain amount of code has been written.

Yes. But you certainly also can't start implementing a module before you
have any idea what it should do. That's why the tests are written in
parallel with the production code, alternating between writing a test and
making it run.

> It is claimed: test-first makes code more testable at a system level.
> I don't see how this follows. Does it matter whether the unit-tests
> are written before or after coding? In fact, does it matter if the
> unit-tests are written by a different team?

If I write the test first, and then the production code to make the test
pass, that production code is testable by definition, isn't it?

Cheers, Ilja

CTips

Nov 15, 2004, 12:05:30 PM

Ronald E Jeffries wrote:
> On Mon, 15 Nov 2004 00:07:06 -0500, CTips <ct...@bestweb.net> wrote:
> (I couldn't tell from CTips' entire article whether he understands
> that TDD is done one test at a time, not a whole bunch of tests, then
> code to match. It's really not well-described as "test-first";
> "test-driven" does a better job. One alternates between test, code,
> back to test.)

I did assume that it was talking about TDD. However, let's look at para 3
from http://www.extremeprogramming.org/rules/testfirst.html

"It is often not clear when a developer has finished all the necessary
functionality. Scope creep can occur as extensions and error conditions
are considered. If we create our unit tests first then we know when we
are done; the unit tests all run."

This appears to say that test-first limits scope-creep by using the
tests to specify the functionality. If all the unit-tests are written
ahead of time, that makes sense. However, if you're using TDD, then this
para is nonsense. If we can keep adding tests (=> function, scope), then
we will never "know when we are done" since we can always add more tests.

Also, if you're using TDD, the benefit described in para 2 becomes less
clear - "Requirements are nailed down firmly by tests." With TDD, we
will be adding tests incrementally, so in a sense we are making up the
requirements as we go along. In fact, the requirements on the module
won't be completely "nailed down firmly by tests" until the last test is
written.

There seems to be a disconnect between TDD and those 2 paras. Was it
ever the case that extreme programming first advocated all (or most)
tests first, and then switched to TDD?

Vladimir Levin

Nov 15, 2004, 12:13:46 PM

CTips <ct...@bestweb.net> wrote in message news:<10pgebf...@corp.supernews.com>...

> It is claimed: Test-first helps define the scope of the module. (This
> para assumes that all "we create our unit tests first", which is
> somewhat contradictory to the test-driven development approach
> recommended 2 paras later). _IF_ one can define the functionality a
> particular module early, this may make sense. However, I have found that
> the partitioning of work between modules only becomes clearer when a
> certain amount of code has been written.

I agree, and this is one of the reasons for doing TDD. As you evolve the code,
you refactor the design. The tests are there to buttress those changes.
Refactoring code may require tests to be modified, but the modifications
are relatively trivial, and it's good to know that everything is still
working after you've moved things around, tweaked the api, etc.

CTips, you seem interested enough in TDD and XP to criticise its practices,
but I have not read anything from you about trying it. I think it would
be pretty cool if you went to a conference to try it out. Maybe someone
like Robert Martin would be willing to do a session with you. If you come
out of it with something positive to say, then it would be good PR for
XP, but if not, I think your criticisms would be concrete instead of
abstract.

David

Nov 15, 2004, 12:38:49 PM

This is a very good thread. Thank you CTips and all the rest of you.
It cleared up a few misconceptions that I've not heard any of you mention
yet.

I'm not sold on TDD itself, but still interested in the concept.
Since I have had good teachers and experience, almost everything you
have mentioned is already part of my work plan. I tend to think
and design rather deep though and most of the refactoring is done
in my head before the base concepts are written down.

I write tests as I go as well. This has annoyed other programmers
a bit, since it makes my code base larger. However, everyone
seems to understand how to use and generally maintain the code I've
produced. It has also served me well to insist that any failure in a
system should leave some trace of why it happened. I test all code
changes before checking in and never bother with Release vs. Debug
code -- it's all Release code and must be diagnosable at a distance.
But that is also the world I live in.

David

CTips

Nov 15, 2004, 4:21:02 PM

Here's a question - remember the pseudo-hangman example you posted early
this year? It took me about 3 hours to code and debug a _complete_
hangman, including dictionary management and I/O, _NOT_ using TDD.
(Source is at http://users.bestweb.net/~ctips/hangman.c )

How long had you been working on the problem, using TDD? Do you think
if you hadn't tried doing it with TDD but just gone ahead and written
it, adding tests as and when it seemed appropriate, you might have been
able to get it done quicker?

Note that I have *tried* to do things TDD; unfortunately, it became
apparent that TDD is a completely inadequate way of doing things.

- With TDD you write code to pass test #1, then #1 & #2 and so on. If,
at some point along the process, you discover that the best way to do
#1...#n is to use a different approach (say, to use a table-driven
approach instead of a switch statement) then you have to rewrite the
code ("refactor"). If, however, you had written it using a table-driven
approach in the first place you would have saved yourself a lot of time.

- There are many situations in which it is not possible to write tests.
As an extreme example, consider implementing synchronization primitives.
They have to be proved to be correct.

- The problem is quite often complex enough that the first non-trivial
test requires most of the work.

I suspect that TDD works better for tackling simpler problems in less
complex domains, and at lower productivities.

CTips

Nov 15, 2004, 4:21:43 PM

Vladimir Levin wrote:

Laurent Bossavit

Nov 15, 2004, 4:51:44 PM

> It took me about 3 hours to code and debug a _complete_
> hangman, including dictionary management and I/O, _NOT_ using TDD.

Having written it that way, do you think it would be very difficult to
do that one over, this time test-driven ? Since you know where you're
going to end up, massive rewrites shouldn't be a concern. Also...

> #1...#n is to use a different approach (say, to use a table-driven
> approach instead of a switch statement) then you have to rewrite the
> code ("refactor").

Refactor doesn't mean "rewrite". It means a small, reversible change
which leaves all tests running. The accumulation of such transformations
tends to stress the code in such a way that interface ("what") and
implementation ("how") drift away from each other; that happens to be a
desirable property of code.
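
To make the distinction concrete, here is a small sketch in Python. The
evaluate functions and the operator names are made up for illustration
(this is not CTips's code); the if/elif chain stands in for the switch
statement. The point is only that the same checks pass before and after
the move to a table.

```python
import operator


def evaluate_chain(op, a, b):
    """Before: one branch per operation, like a switch statement."""
    if op == "add":
        return a + b
    elif op == "sub":
        return a - b
    elif op == "mul":
        return a * b
    raise ValueError(f"unknown op: {op}")


# After: the same behaviour, driven by a lookup table.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate_table(op, a, b):
    func = OPS.get(op)
    if func is None:
        raise ValueError(f"unknown op: {op}")
    return func(a, b)


def check(evaluate):
    """The tests do not care which implementation sits behind the name."""
    assert evaluate("add", 2, 3) == 5
    assert evaluate("sub", 2, 3) == -1
    assert evaluate("mul", 2, 3) == 6
    try:
        evaluate("div", 1, 1)
    except ValueError:
        pass
    else:
        raise AssertionError("unknown operations must be rejected")


check(evaluate_chain)   # green before the refactoring
check(evaluate_table)   # still green after it
```

In practice the change would be made one branch at a time, rerunning the
checks after each step, rather than as a single rewrite.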

Laurent

Laurent Bossavit

Nov 15, 2004, 4:55:55 PM

> This appears to say that test-first limits scope-creep by using the
> tests to specify the functionality. [...] If we can keep adding tests
> (=> function, scope), then we will never "know when we are done" since
> we can always add more tests.

Note that there are two kinds of scope creep - requirements inflation
(the customer wants more than she originally said) and gold-plating
(developers beef up the code more than needed to make customer happy).

Tests are involved in limiting scope creep at more than one level of
abstraction; system-level tests pin down the "big picture" view of what
the system does - while smaller (unit) tests pin down the implementation
details. Think of drawing an outline, then filling it in.

There is one level of tests that enjoys a special status - "acceptance
tests", which serve as a vehicle for requirements. If you go all the way
up to that level, then requirements inflation might occur, but
gold-plating is less likely.

Laurent

Laurent Bossavit

Nov 15, 2004, 5:23:53 PM

> Even if we break it down to the minimal amount of analysis per transform kind,
> it will still amount to 1kloc+. (BTW: this is based on recent experience,
> not hypothetical).

I suspect we differ on what we call a "unit" test, or a "real" test.

Here is a thought experiment. Suppose you suspect that, somewhere in
these 1000 lines of optimizer code for a given transformation, there is
one nasty bug.

I would suppose that you will instrument the code for debugging - put a
printf in there, or activate one that you've already put there for that
purpose.

The next step might be running the compiler on a test source file that
is carefully designed to exercise, among other paths, one that causes
the printf statement to be activated.

Now we have a string that has been output to stdout. If the debug
instrumentation is at all helpful, we have a precise idea what that
string should be. If the string actually output differs from that, we
have a "smoking gun" that the bug is indeed, in the main, what and where
we suspected.

We could make this more efficient by writing a small function that
examines stdout for us, and compares the particular line in the output
that we're interested in with the value we expect. Couldn't we ?
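
Something along these lines, say, sketched in Python with the standard
contextlib and io modules. The analyze() pass and its debug line are
stand-ins I made up; only the capture-and-compare helper is the point.

```python
import contextlib
import io


def analyze(block, debug=False):
    """Stand-in analysis pass (hypothetical): counts additions of zero."""
    redundant = sum(1 for op, _, rhs in block if op == "add" and rhs == 0)
    if debug:
        print(f"analysis: redundant_adds={redundant}")
    return redundant


def debug_line(func, *args, prefix):
    """Run func with stdout captured; return the first line with the prefix."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, debug=True)
    for line in buffer.getvalue().splitlines():
        if line.startswith(prefix):
            return line
    return None


# Each instruction is (op, operand, constant-or-name).
block = [("add", "x", 0), ("add", "y", 1), ("add", "z", 0)]
assert debug_line(analyze, block, prefix="analysis:") == "analysis: redundant_adds=2"
```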

Maybe we don't need to exercise the whole compiler, either. The analysis
is, I suppose, running over part of the partially compiled code - mostly
assembly with unresolved jump locations, or somesuch. Maybe we can store
the binary representation of that instead of the source file. And do we
really need more than a dozen or a few dozen bytes of it in order to
expose the bug ? I may be wrong (I've only written one compiler in my
life, with fairly simple peephole optimizations), but I'll assume that's
about right.

Here's the one part I'm not 100% sure of. Is the 1000 lines of analysis
code one big function, or is it broken down further into functions ? In
the latter case, presumably we can massage the input data further still,
until we know exactly what *would* be passed into the function that has
the debug printf, if we were encountering the bug in real use. Instead
of a printf, perhaps we can devise other ways to get the output back -
say in a return value, or a parameter passed by reference.

So we can write a test function that:
- constructs the input data we need
- passes it into the relevant function of the analysis module
- receives an encoded "status" string as one result of the function
- compares the string it received with an "expected" string

Well, surprise - that test function is an XP-style unit test. Ex
hypothesi (unless I've made a mistake in the above) it tests something
that is considerably less than 1000 lines of code. (I haven't put a
number on that size, but it's "whatever the function length you consider
reasonable"). Is it a "real" test ? Well, again ex hypothesi, I wrote it
because of an actual bug, so presumably it tests something relevant.
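
The four steps above can be sketched as follows, in Python for brevity.
classify_operand() and its status strings are hypothetical, a stand-in
for whichever small function of the analysis module is under suspicion.

```python
def classify_operand(instr):
    """Hypothetical helper from an analysis module.

    Returns a short status string instead of printing it, so that a
    test can read the result directly.
    """
    op, dest, lhs, rhs = instr
    if op == "add" and isinstance(rhs, int):
        return "add-zero" if rhs == 0 else "add-const"
    return "opaque"


def test_add_zero_is_recognised():
    instr = ("add", "z", "x", 0)       # 1. construct the input data
    status = classify_operand(instr)   # 2-3. call the function, get the status
    assert status == "add-zero"        # 4. compare with the expected string


test_add_zero_is_recognised()
```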

Instead of writing these tests when a bug surfaces, when arguably it is
too late, XP suggests that they should be written before the code. This
helps developers focus on testability in their design (they can't get
away with a 1000-line function), avoids confirmation bias, and helps
with debugging in much the same way that instrumentation does, when a
rare bug does surface. Though actually the beginning of this paragraph
should read "In addition to", not "Instead of" - in XP it's considered a
best practice to write a new test for any bug found.

Laurent

Ronald E Jeffries

Nov 15, 2004, 9:50:02 PM

On Mon, 15 Nov 2004 12:05:30 -0500, CTips <ct...@bestweb.net> wrote:

>Ronald E Jeffries wrote:
>> On Mon, 15 Nov 2004 00:07:06 -0500, CTips <ct...@bestweb.net> wrote:
>> (I couldn't tell from CTips' entire article whether he understands
>> that TDD is done one test at a time, not a whole bunch of tests, then
>> code to match. It's really not well-described as "test-first";
>> "test-driven" does a better job. One alternates between test, code,
>> back to test.)
>
>I did assume that it was talking about TDD. However, let's look at para 3
>from http://www.extremeprogramming.org/rules/testfirst.html
>
>"It is often not clear when a developer has finished all the necessary
>functionality. Scope creep can occur as extensions and error conditions
>are considered. If we create our unit tests first then we know when we
>are done; the unit tests all run."
>
>This appears to say that test-first limits scope-creep by using the
>tests to specify the functionality. If all the unit-tests are written
>ahead of time, that makes sense. However, if you're using TDD, then this
>para is nonsense. If we can keep adding tests (=> function, scope), then
>we will never "know when we are done" since we can always add more tests.

"Wisdom begins when we discover the difference between 'that makes no
sense' and 'I don't understand'" -- Mary Doria Russell

Permit me please, to help increase your understanding:

What we find in doing TDD is that when we switch from adding one bit
of function to writing the next test, it's easier to notice that we
really don't need the next bit. I'm not sure why that would be the
case, but many people report the same effect, so I'm rather sure it's
real.

>
>Also, if you're using TDD, the benefit described in para 2 becomes less
>clear - "Requirements are nailed down firmly by tests." With TDD, we
>will be adding tests incrementally, so in a sense we are making up the
>requirements as we go along. In fact, the requirements on the module
>won't be completely "nailed down firmly by tests" until the last test is
>written.

We are /translating/ the requirements that we have in our head, or in
what the customer told us or wherever we got them. Since TDD is
commonly done at the /unit/ level, it's like a kind of design. We
might have said to ourselves, "We need to write a linked list class",
written down some requirements or just winged them, and coded it up
and then tested. Or using TDD, we'd write a series of tests like
"emptylist.next() returns null", and so on, and again stop when we're
done.

But when we stop, the tests define the requirements that we used to
write the tests and code. The requirements are now written, whether
they were on paper elsewhere or not, in executable code. Therefore,
"nailed down".
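
The first couple of such tests, with just enough code to pass them,
might look like this sketch in Python. The LinkedList class and the
meaning given to next() (the value at the head, or None for an empty
list) are assumptions for illustration only.

```python
class LinkedList:
    """Just enough list to pass the two tests below, and no more."""

    def __init__(self):
        self._head = None  # None, or a (value, rest) pair

    def next(self):
        """Return the value at the head, or None when the list is empty."""
        return None if self._head is None else self._head[0]

    def push(self, value):
        """Put a value in front of the current head."""
        self._head = (value, self._head)


def test_empty_list_next_returns_none():
    # Test 1, written first: it fails until next() exists.
    assert LinkedList().next() is None


def test_next_returns_pushed_value():
    # Test 2, written next: it forces push() and a real head into being.
    items = LinkedList()
    items.push(42)
    assert items.next() == 42


test_empty_list_next_returns_none()
test_next_returns_pushed_value()
```

When no further test seems worth writing, the two functions above are
the written-down requirements for the class.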

>
>There seems to be a disconnect between TDD and those 2 paras. Was it
>ever the case that extreme programming first advocated all (or most)
>tests first, and then switched to TDD?

Not to my recollection. In early days we said you had to test
"everything that could possibly break" but we wrote most of the tests
after the fact. Then the test-first thing came along, where we write
them, basically, one at a time.

Ronald E Jeffries

Nov 15, 2004, 9:57:12 PM

On Mon, 15 Nov 2004 16:21:02 -0500, CTips <ct...@bestweb.net> wrote:

>Note that I have *tried* to do things TDD; unfortunately, it became
>apparent that TDD is a completely inadequate way of doing things.

For you, perhaps. It would be interesting to work with you on
something to see what we'd discover. I'd bet we'd each learn something
new.

>
>- With TDD you write code to pass test #1, then #1 & #2 and so on. If,
>at some point along the process, you discover that the best way to do
>#1...#n is to use a different approach (say, to use a table-driven
>approach instead of a switch statement) then you have to rewrite the
>code ("refactor"). If, however, you had written it using a table-driven
>approach in the first place you would have saved yourself a lot of time.

Yes, if we had thought of it. But by your hypothesis, we didn't think
of it, in which case, we'd have to rewrite it no matter whether we
were doing TDD or not, n'est-ce pas?

>
>- There are many situations in which it is not possible to write tests.
>As an extreme example, consider implementing synchronization primitives.
>They have to be proved to be correct.

Yes, there are some situations where tests are not sufficient --
though I am aware of no situation where tests literally cannot be
written. And sync primitives /should/ probably be proven correct. But
I'd wager that many have been written that were not.

However, the number of such cases, where testing is not adequate,
while perhaps large in absolute numbers, is not in my experience, a
large percentage of the code that needs to be written.

>
>- The problem is quite often complex enough that the first non-trivial
>test requires most of the work.

I've felt that way sometimes, but what I've found is usually that I
just hadn't thought of the simple starting point yet.

>
>I suspect that TDD works better for tackling simpler problems in less
>complex domains, and at lower productivities.

Well, I don't know. I've worked in a lot of domains that people
consider complex and I'd use TDD most everywhere. As for productivity,
I know it raises mine, but perhaps if I were as good as you report
your teams to be, it wouldn't help me, or wouldn't help as much.

For me, TDD avoids defects, and saves time because I do far less
debugging. If a team avoided defects some other way -- especially by
just being incredibly smart, as opposed to some time-consuming way
such as extensive reviews -- then TDD might not help so much.

I can't remember the last time I encountered a team that had a very,
very low defect rate, but I'm sure they are out there somewhere. And
maybe they don't need TDD. Someday maybe I'll get to observe such a
team and find out.

regards,

CTips

unread,

Nov 15, 2004, 11:00:36 PM11/15/04

to

Let's take an example. I want to do a peephole optimization that converts
z = x + 0
into
z = x
OK, so that's test #1, and I write a bunch of C/C++/what-have-you to
implement that.

Now, test #2 is
t = x + K, K is a constant
z = t + L, L is a constant
----------
z = x + (K+L)

and so on and so forth, recognizing more patterns. Each pattern
generates, say, an average of about 30 lines of C. There are
eventually going to be more than 500 such patterns, resulting in about
15kloc of code.

At some point, it should become clear that it is much more efficient and
much less error-prone to write a tool that will automatically convert
descriptions such as
t = x + y
z = t - y
---------
z = x
into the C code required to implement them. This tool takes less than
2kloc, and the descriptions total another 2 kloc, resulting in a net
savings of 75% of the total effort (actually, it's much more, because the
descriptions are much harder to get wrong than the code).

At this point you throw away all the implementation code which you wrote
by hand.

Now, if you had been doing TDD you would either have written 15kloc or
you would have written some amount of code, then after realizing the
right approach, had to throw it all away.

If you had been smart, thought about the problem, and decided to
implement the solution the right way from the beginning, you'd have to
write about 700loc or so of code before test#1 passed, and probably
another 700loc before test#2 passed.
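
To make the "patterns as data" idea concrete, here is a minimal sketch. It is not the description-to-C generator discussed above, only a much simpler table lookup that handles identities of the form z = x OP constant. The instruction layout, the opcode names and the function peephole() are all hypothetical, chosen just for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-address instruction: dst = src OP imm,
 * or dst = src when op is OP_MOV. */
enum op { OP_MOV, OP_ADD, OP_SUB, OP_MUL, OP_SHL };

struct insn {
    enum op op;
    int dst;   /* destination register number */
    int src;   /* source register number */
    long imm;  /* constant operand; unused for OP_MOV */
};

/* One row per pattern: "dst = src OP identity" can become "dst = src". */
struct identity_rule {
    enum op op;
    long identity;
};

static const struct identity_rule rules[] = {
    { OP_ADD, 0 },  /* z = x + 0  */
    { OP_SUB, 0 },  /* z = x - 0  */
    { OP_MUL, 1 },  /* z = x * 1  */
    { OP_SHL, 0 },  /* z = x << 0 */
};

/* Rewrite matching instructions in place; return how many were simplified. */
size_t peephole(struct insn *code, size_t n)
{
    size_t changed = 0;
    size_t i, r;

    for (i = 0; i < n; i++) {
        for (r = 0; r < sizeof rules / sizeof rules[0]; r++) {
            if (code[i].op == rules[r].op && code[i].imm == rules[r].identity) {
                code[i].op = OP_MOV;
                code[i].imm = 0;
                changed++;
                break;
            }
        }
    }
    return changed;
}
```

Adding another identity here means adding one row to the table rather than another hand-written matcher. Multi-instruction patterns such as test #2 would need a richer description format than this sketch has.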

CTips

unread,

Nov 15, 2004, 11:05:35 PM11/15/04

to

Laurent Bossavit wrote:
>>These appears to say that test-first limits scope-creep by using the
>>tests to specify the functionality. [...] If we can keep adding tests
>>(=> function, scope), then we will never "know when we are done" since
>>we can always add more tests.
>

<snip>

> Tests are involved in limiting scope creep at more than one level of
> abstraction; system-level tests pin down the "big picture" view of what
> the system does - while smaller (unit) tests pin down the implementation
> details. Think of drawing an outline, then filling it in.
>

That's specious. Tests specify the functionality that the module
implements. If you can add more tests, then you can increase the
functionality/scope of the module (or vice-versa). TDD does not specify
when to stop adding tests. Therefore, TDD does not address scope creep
in any way.

Phlip

unread,

Nov 15, 2004, 11:52:29 PM11/15/04

to

David wrote:

> I'm not sold on TDD itself, but still interested in the concept.
> Since I have had good teachers and experience, almost everything you
> have mentioned is already part of my work plan. I tend to think
> and design rather deep though and most of the refactoring is done
> in my head before the base concepts are written down.

Everyone can take a given program some distance in their head. Call this
distance X. I'm the first to admit my average X is less than most
programmers. But when X runs out, that's where TDD comes in. You have to do all of X
via TDD so that when you run off the end of your ability to design in your
head, you can still keep going.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Phlip

unread,

Nov 16, 2004, 12:10:58 AM11/16/04

to

CTips wrote:

> That's specious. Tests specify the functionality that the module
> implements. If you can add more tests, then you can increase the
> functionality/scope of the module (or vice-versa). TDD does not specify
> when to stop adding tests. Therefore, TDD does not address scope creep
> in any way.

I think Ron once said, "Add tests until fear turns into boredom".

Each test must show an unbroken chain: Requirement->feature->test->code. If
you can't think of a new test to fulfill a requirement, you are allowed to
add more test-last, but you are not allowed to

Tests, alone, naturally cannot stop the scope creep. However, they provide a
very powerful system for engineers to stop it. Only with wall-to-wall tests
can you _remove_ lines, and see if they weren't needed.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Vladimir Levin

unread,

Nov 16, 2004, 1:06:07 AM11/16/04

to

CTips <ct...@bestweb.net> wrote in message news:<10pi7f8...@corp.supernews.com>...

> Here's a question - remember the pseudo-hangman example you posted early
> this year? It took me about 3 hours to code and debug a _complete_
> hangman, including dictionary management and I/O, _NOT_ using TDD. (
> source is at http://users.bestweb.net/~ctips/hangman.c )
>
> How long had you been working on the problem, using TDD? Do you think
> if you hadn't tried doing it with TDD but just gone ahead and written
> it, adding tests as and when it seemed appropriate, you might have been
> able to get it done quicker?

I don't remember how long it took me at the time (2, 3 hours?). I have
written several similar types of programs since then, games like
blackjack, pong, breakout, bouncing ball in a box... I teach some
classes in my spare time, so that gives me a chance to play with
simple programs a fair bit... I'd estimate my productivity with these
to be about the same as without TDD. I'd say the time taken up writing
the tests seems to roughly make up for the dumb mistakes I introduce
into the code as I modify it, which the tests help me to notice. In
the case of the simple ball games, I found the tests to be a very
convenient way to check my collision detection code. This is just my
own experience, but I'd say the code I write with TDD is more "robust"
than without it. I think of more things to test with TDD than I would
if I just sat down, wrote the code, then debugged it. Also, I find
that the code breaks down into smoother separation of implementation
and interface, which is a quality I consider to be desirable...
Anyway, while I am sure you're a very productive programmer, much
better than average, I am not so sure that your productivity would
really suffer from TDD once you got used to it. I think you'd find
benefits such as being able to slide in changes where before you'd
have had to re-write the module entirely to accommodate...

> If, however, you had written it using a table-driven
> approach in the first place you would have saved yourself a lot of time.

This is the kind of situation where I would expect TDD to really
shine. You re-write the implementation, and the tests still pass. Maybe I am
mistaken... I do think XP allows for just hacking out a bunch of code,
but you have to throw it away and re-write it using TDD before you can
actually use it as part of the production codebase. So if you're not
sure how to do something, you can hack some code until you know you're
going in the right direction, then quickly write the "correct"
implementation using TDD. That way you have tests that back up the
fact stuff still works when you start making some changes/enhancements
months later.

> - There are many situations in which it is not possible to write tests.
> As an extreme example, consider implementing synchronization primitives.
> They have to be proved to be correct.

Fair enough. The whole area of concurrency is pretty difficult. I am
fairly comfortable with the idea that you can abandon TDD if it's too
hard or just not worthwhile in a specific case. But in the end, no
matter what you're developing, I believe that 85% of the time or more,
TDD should not be a problem.

> - The problem is quite often complex enough that the first non-trivial
> test requires most of the work.

> - I suspect that TDD works better for tackling simpler problems in less

> complex domains, and at lower productivities.

I am inclined to disagree with the 2 comments above. I really believe
you ought to be able to break things down to smaller units and that
this is a desirable goal (see Laurent's posting...). Also, no matter
what you are developing, it is ultimately made up of layers of
components of relatively simpler complexity... Robert Martin pointed
out at a talk he gave recently that the 1,000,000 line program was
once a humble 1,000 line program. There is probably a degenerate case
somewhere that violates this principle. I imagine a true AI will be
some horrible, fundamentally un-scramblable knot of code, but that's
not what the vast majority of programs are about...

Phlip

unread,

Nov 16, 2004, 1:24:23 AM11/16/04

to

Phlip wrote:

> CTips wrote:
>
> > That's specious. Tests specify the functionality that the module
> > implements. If you can add more tests, then you can increase the
> > functionality/scope of the module (or vice-versa). TDD does not specify
> > when to stop adding tests. Therefore, TDD does not address scope creep
> > in any way.
>
> I think Ron once said, "Add tests until fear turns into boredom".
>
> Each test must show an unbroken chain: Requirement->feature->test->code. If
> you can't think of a new test to fulfill a requirement, you are allowed to
> add more test-last, but you are not allowed to

add more failing tests and make them pass.

Laurent Bossavit

unread,

Nov 16, 2004, 3:12:05 AM11/16/04

to

> Lets take an example. I want to do a peephole optimization that converts

> [...]

> Now, if you had been doing TDD you would either have written 15kloc or
> you would have written some amount of code, then after realizing the
> right approach, had to throw it all away.

If I had done that, then I would not have been doing TDD.

The rules of TDD require me to write dead-simple, "naive" code for the
first peephole optimization. You had just the right instinct for which
optimization to start with, by the way - the simplest one possible.

The rules of TDD require me to again write dead-simple code for the
second peephole optimization, *but* they also require me to ensure that
there is no code duplication at all in the analysis module that results.
I'll repeat that: no duplication at all.

To get rid of duplication, I might well have to start moving toward a
process driven by data tables, or some such. What matters is that the
edict, "no duplication", tends to drive the code to ever greater levels
of abstraction. By the 100th pattern I might be nowhere near 3000 lines.

If you had already done this before, you might choose to drive the
implementation toward the approach you mention, a code generator. I'm
not fond of code generation myself, so I might choose differently.

By this time, a distinction will have started to emerge between the
coarser-grained tests that match an input pattern of register moves and
operations to an expected output pattern - and true unit tests, which
exercise the low-level functionality of the code generator, or pattern
matcher, or whatever our implementation of choice is. The former
category describes the functionality of the analysis module, the latter
describe its design. It's quite likely that by now the test harness is
capable of using the transformation specs *themselves* as an input
format.

At any rate, by this time the functionality of the analysis module is
completely covered by tests, so that any regression introduced, say, in
modifying the generator or matcher to be able to handle a whole new
class of optimization is detected right away.

I would expect that someone with experience in designing peephole
optimizers would have no major difficulty driving the design toward a
solution known to be serviceable, but they would also end up with code
much less likely to suffer from regressions (more robust in that sense)
than if they'd written 700LOC then one test, then 700 further LOC and
one further test.

Laurent
http://bossavit.com/thoughts/

Ronald E Jeffries

unread,

Nov 16, 2004, 8:36:14 AM11/16/04

to

One might similarly say that reviewers might miss defects, and
therefore reviews do not address defects in any way.

I find, and many people with whom I talk find, that because a test
/specifies/ a requirement, whereas the code /implements/ a
requirement, TDD helps us stop when it's time to stop.

I suspect it's because as we contemplate writing the next test, "OK,
what if he would like to remove three elements at once and then get
the next element", it's easier to recognize that we're going beyond
our current need. As we write the code, we see that if we do it just
this way, with only a little more effort, we can arrange it so that
you can have an extra parameter that says how many elements to remove
before doing next, so we just do it.

Developing the habit of writing the test is equivalent to writing down
a new requirement, which seems to be enough to get us, frequently, to
realize that we don't really need that thing.

Logically, it might seem that we could write tests forever and
therefore scope creep would not be addressed. In practice, that's not
what happens. I think it's a psychology thing, not a logic thing.

Phlip

unread,

Nov 16, 2004, 9:58:55 AM11/16/04

to

Laurent Bossavit wrote:

> If I had done that, then I would not have been doing TDD.
>
> The rules of TDD require me to write dead-simple, "naive" code for the
> first peephole optimization. You had just the right instinct for which
> optimization to start with, by the way - the simplest one possible.

Okay, I'm seventeen times smarter than Laurent, however I do that too. I'm
so smart that I know a good design for everything, and I'm smart enough to
_refrain_ from implementing it. The code that passes my tests is nothing but

SCAFFOLDING TO SUPPORT THE TESTS.

Put another way, given a choice between losing the code or the tests, I
would lose the code and keep the tests. They are where all the features and
all the design decisions are stretched out and visible, not crammed together
via refactoring.

> The rules of TDD require me to again write dead-simple code for the
> second peephole optimization, *but* they also require me to ensure that
> there is no code duplication at all in the analysis module that results.
> I'll repeat that: no duplication at all.

Now the fun begins. The tests are like huge magnets forming a confinement
field, and the code is like a droplet of Bose-Einstein condensate in the
middle. The more tests we turn on, the smaller the droplet gets, and the
more superfluid becomes its trapped nuclear matter.

Okay, that's /too/ smart. Try again.

What we mean is that to seek the minimal but most elegant code to pass the
tests, we put the code thru repeated passes of adding features and removing
duplication. This anneals the code. And even if I planned the design it
eventually arrives at, TDD will most likely arrive at a simpler design in a
shorter amount of time than I could have predicted.

> To get rid of duplication, I might well have to start moving toward a
> process driven by data tables, or some such. What matters is that the
> edict, "no duplication", tends to drive the code to ever greater levels
> of abstraction. By the 100th pattern I might be nowhere near 3000 lines.
>
> If you had already done this before, you might choose to drive the
> implementation toward the approach you mention, a code generator. I'm
> not fond of code generation myself, so I might choose differently.

Okay, here the idea was there's a quantum leap (or possibly a very long
leap) between the design that simple tests lead to, and the best design for
all of them. That's still good.

SOMETIMES YOU THROW THAT ELEGANT DESIGN AWAY.

But you do it by following TDD's other rule: "no more than 10 edits before
passing tests". After you grow a design, and get it reviewed by your peers
(even those with 1/17th of your intellijence), you might decide to replace
it.

You do that by adding a test that forces the beginning of the new system to
exist. You leave the other system online while the new system grows. Then
you start replacing the old system with the new one, at its call sites, one
by one. Then you erase the old system, and you then seek opportunities to
refactor _everywhere_ based on the features of the new system. (This is
Substitute Algorithm Refactor.)
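
As a rough illustration of keeping the old system online while the new one grows, the sketch below runs one set of tests against both an old and a new implementation. The function names (sum_to_old, sum_to_new, check_sum_to) are hypothetical and the problem is deliberately trivial; the point is only that the tests do not care which design sits behind the interface.

```c
#include <assert.h>

typedef unsigned long (*sum_to_fn)(unsigned long);

/* Old system: adds 1..n one term at a time. */
unsigned long sum_to_old(unsigned long n)
{
    unsigned long total = 0;
    unsigned long k;

    for (k = 1; k <= n; k++)
        total += k;
    return total;
}

/* New system: closed form, grown alongside the old one.
 * Assumes n is small enough that n * (n + 1) does not overflow. */
unsigned long sum_to_new(unsigned long n)
{
    return n * (n + 1) / 2;
}

/* The same tests run against whichever implementation is passed in. */
void check_sum_to(sum_to_fn f)
{
    assert(f(0) == 0);
    assert(f(1) == 1);
    assert(f(10) == 55);
    assert(f(100) == 5050);
}
```

While both versions pass check_sum_to(), call sites can be moved from sum_to_old() to sum_to_new() one at a time, and the old function can be deleted once nothing refers to it.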

TDD gives these benefits:

- lots of tests, most of which don't care what the design is
- the ability to remove code and see if tests pass
- the ability to deploy, release, or deliver, _during_ a refactor
- the ability to review the design based on its testage
- the ability to escalate testage into customer tests
- the force to rapidly find a minimal and elegant design
- the ability to replace that design without a blackout
- the ability to continuously integrate
- the ability to swap modules with colleagues

Other systems give those benefits too, but not so easily.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

CTips

unread,

Nov 16, 2004, 10:30:35 AM11/16/04

to

Laurent Bossavit wrote:
>>Even if we break it down to the minimal amount of analysis per transform kind,
>>it will still amount to 1kloc+. (BTW: this is based on recent experience,
>>not hypothetical).
>
>
> I suspect we differ on what we call a "unit" test, or a "real" test.
>

<snip: description of method to have analysis phase dump output and
then check that output for correctness>

Every now and then we have someone who suggests doing something like that.

However, here's the problem: the analysis phase is broken down into
several routines. A 1kloc phase might have between 10 and 25 such
routines; let's call it 20 routines averaging 50 loc.

Each analysis routine takes in a graph and either annotates the graph
or creates auxiliary data-structures (or both). So, if the input graph
has 25 nodes + 25 edges, and creates 4 pieces of information for each,
that's 200 output values.

So, per test, we'd have to figure out and write 200 lines of test to test 50
loc. Since these routines have multiple paths, I'd expect that the total
testing effort following the suggested strategy would be, oh, let's say
several thousand entries - probably a 20x to 50x effort compared to actually
writing the function.

If you wait until the transformation phase, you get to check the
analysis phase by seeing if the transforms actually happened correctly.
That is relatively straight-forward - compile and run the code to see
if it gives the same result with and without the optimization.

Specifically, we can usually exercise a phase "completely" with about
1x-4x the effort of actually writing it. (Completely means at the very
least branch level coverage, but usually includes correlated paths -
i.e. if what happens in A0 impacts B0 then some test will exercise a
path containing A0 and B0). The extra overhead includes time to
- add extra self-check code
- add extra debug trace code
- write phase-specific tests/test-drivers/test-generators etc.
- find bugs and isolate them

There are other techniques you can use to build confidence in program
correctness and to elicit bugs other than simply testing (much less TDD).
They are also _much_ higher return for effort. Unfortunately, people who
have worked on simple problems have never been forced into situations
where they need to develop skill in those kinds of techniques.

CTips

unread,

Nov 16, 2004, 11:26:56 AM11/16/04

to

Phlip wrote:

> David wrote:
>
>
>> I'm not sold on TDD itself, but still interested in the concept.
>>Since I have had good teachers and experience, almost everything you
>>have mentioned is already part of my work plan. I tend to think
>>and design rather deep though and most of the refactoring is done
>>in my head before the base concepts are written down.
>
>
> Everyone can take a given program some distance in their head. Call this
> distance X. I'm the first to admit my average X is less than most
> programmers. But when X runs out, that's where TDD comes in. You have to do all of X
> via TDD so that when you run off the end of your ability to design in your
> head, you can still keep going.
>

Next time you have to develop a module do the following: think about the
problem. What routines need to be implemented? What data-structures are
needed? What algorithms will be needed? Try and visualize what the final
code will look like. Don't write a single line of code until you're
confident that you have a good idea about the final picture. In fact,
don't even sit at the computer - just think. At most, use a pencil and
paper.

Now, when you're going to start coding write out all the external
functions that module is going to contain, and the description for them
(If it was C, I'd say write the .h file first). When you're going to
write a function, start off by writing comments of the form

/* first do this ... */
/* then do this ... */
/* finally do this ... */

and then start filling in the code.
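
For illustration only, here is what that might look like for one small function once the comments have been filled in. The function parse_uint() and its contract are hypothetical, invented for this sketch; the prototype and its description stand in for what would go in the .h file.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* What would go in the .h file: the external interface and its description.
 *
 * Parse a decimal unsigned integer from s into *out.
 * Returns 0 on success, -1 on empty input, a non-digit, or overflow. */
int parse_uint(const char *s, unsigned *out);

int parse_uint(const char *s, unsigned *out)
{
    unsigned value = 0;

    /* first reject missing or empty input ... */
    if (s == NULL || *s == '\0')
        return -1;

    /* then accumulate digits, refusing anything else and checking overflow ... */
    for (; *s != '\0'; s++) {
        unsigned digit;

        if (!isdigit((unsigned char)*s))
            return -1;
        digit = (unsigned)(*s - '0');
        if (value > (UINT_MAX - digit) / 10)
            return -1;
        value = value * 10 + digit;
    }

    /* finally store the result ... */
    *out = value;
    return 0;
}
```

The three step comments were written before any of the code under them, so the shape of the function was fixed before the details were.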

After you've done this a few times, you'll get a handle on how to write
a module without having to resort to sub-optimal intermediate points.

sted...@yahoo.com

unread,

Nov 16, 2004, 11:41:37 AM11/16/04

to

What about TDD as regression testing? Isn't that worth something CTips?
Stede

CTips

unread,

Nov 16, 2004, 12:02:04 PM11/16/04

to

sted...@yahoo.com wrote:

> What about TDD as regression testing? Isn't that worth something CTips?
> Stede
>

It's better than nothing. But is it better than the alternatives?

For instance, consider testing to achieve a particular level of coverage.
- will TDD give you that level of coverage? It won't give path coverage,
that's for sure. It should give statement coverage (if you write tests to
cover error/undefined scenarios).
- will TDD give a minimal set of tests for that level of coverage? In
many cases, I don't expect it to.

In fact, post-facto white-box testing will probably yield a "better" (in
the sense of coverage) and smaller test suite than that developed by TDD.

sted...@yahoo.com

unread,

Nov 16, 2004, 12:10:06 PM11/16/04

to

Makes sense CTip. What about Design by Contract? Is that TDD? Do you
feel that will help? Maybe a combination of Design by Contract,
Regression Testing and maybe a little TDD for finding the simplest way
to build a class?

sted...@yahoo.com

unread,

Nov 16, 2004, 12:09:50 PM11/16/04

to

CTips

unread,

Nov 16, 2004, 1:49:52 PM11/16/04

to

sted...@yahoo.com wrote:

Design by contract is an interesting idea. Unfortunately, it's also more
overhead than it's worth in most situations.

Let's look at the whole idea of preconditions, postconditions and
invariants. They are expressions on the state of the program that are
(supposed to be) true before, after, and during the execution of a
piece of code. Quite often, the actual condition ends up being very hard
to express correctly.

For instance, how do I specify the post-condition for the factorial
function?
n = fact(i);
/* at the end of this n is i! */
To actually check this would require rewriting the factorial function
some other way.
n = fact(i);
assert( n == fact_1(i) );
So, in practice, one would just leave the post-condition as a comment.
Which kind of defeats the purpose.

When conditions get very complex, then one has several options:
- leave them as documentation only, in a somewhat fuzzy (non-executable)
form. This allows any number of misunderstandings to creep through (I
thought you meant *that*! No, I didn't)
- convert them to an executable form, which may require more work than
other approaches
- use simpler executable conditions and leave the rest as documentation.
e.g. replace
assert( i > 0 && n == fact_1(i) );
with
assert( i > 0 ); /* n == i! */
- change the behavior of the function so that the conditions can be
simplified.
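
A minimal sketch of the third option, assuming a loop-based fact() restricted to inputs whose result fits in 32 bits (that upper bound is an assumption added for the sketch): the precondition is executable and exact, while the executable postcondition is deliberately weaker than n == i!, which stays in a comment.

```c
#include <assert.h>

/* Computes i! for 1 <= i <= 12; 12! is the largest factorial
 * that fits in a 32-bit unsigned value. */
unsigned long fact(unsigned i)
{
    unsigned long n = 1;
    unsigned k;

    assert(i > 0 && i <= 12);      /* precondition: cheap and exact */

    for (k = 2; k <= i; k++)
        n *= k;

    /* Full postcondition, documentation only: n == i!
     * Executable part: necessary for n == i!, but not sufficient. */
    assert(n >= i && n % i == 0);
    return n;
}
```

The weak check would catch a result of zero or a dropped final factor, but it would still accept many wrong answers, which is the trade-off the list above describes.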

Another problem with pre-, post- and invariant conditions in complex
situations is that coming up with the "right" conditions is pretty
difficult (particularly the invariants). Here "right" means that they
correctly capture the state, are weak enough to be written succinctly,
and are strong enough to detect many bugs.

Laurent Bossavit

unread,

Nov 16, 2004, 3:01:33 PM11/16/04

to

> <snip: description of method to have analysis phase dump output and
> then check that output for correctness>
> Every now and then we have someone who suggests doing something like that.

That was a thought experiment, not a suggestion. The point was to show
that it can't be necessary to write 1000 lines of code before you write
one "real" test, if by "real" you mean "can expose a bug".

In practice I would test-drive the whole thing from scratch, not start
from a less than ideally testable design as I did in the thought
experiment.

> So, per test, we'd have to figure out and write 200 lines of test to test 50
> loc.

I refer you to my passages on refactoring, especially the bit about
removing duplication. Reread the bit about removing duplication. Your
unit testing effort does not increase linearly with the number of code
chunks, neither is it a combinatorial function of the dimensions along
which input data can vary.

Rather, think of the unit tests produced by TDD as that many lemmas in a
very long proof. Or think of them as executable examples, chosen for
pedagogical value in the problem domain.

Laurent

CTips

unread,

Nov 16, 2004, 3:46:58 PM11/16/04

to

Laurent Bossavit wrote:

>> <snip: description of method to have analysis phase dump output and
>>then check that output for correctness>
>>Every now and then we have someone who suggests doing something like that.
>
>
> That was a thought experiment, not a suggestion. The point was to show
> that it can't be necessary to write 1000 lines of code before you write
> one "real" test, if by "real" you mean "can expose a bug".
>
> In practice I would test-drive the whole thing from scratch, not start
> from a less than ideally testable design as I did in the thought
> experiment.

What's the largest program you wrote yourself? How long did it take?
What's the most complex algorithm you've implemented? How long was it?
Did you do either of them TDD?

I tend to write about 100klocs of production/production-quality code
every year (I'm not counting things like the test code itself). It's
slipped now that I'm doing customer/sales/management kind of stuff, but
I'm still trying to get to about half that. And it's not exactly
simple stuff either - at this point a large part of what I write
involves development of new algorithms as well as the coding.

So permit me to say this - if you can hit those kinds of productivities
using TDD and the kinds of testing approaches you're advocating, more
power to you. If you can't, then maybe you should consider how to change
your approach to get there.

>
>>So, per test, we'd have to figure out and write 200 lines of test to test 50
>>loc.
>
>
> I refer you to my passages on refactoring, especially the bit about
> removing duplication. Reread the bit about removing duplication. Your
> unit testing effort does not increase linearly with the number of code
> chunks, neither is it a combinatorial function of the dimensions along
> which input data can vary.

Sorry, I think you're not getting it. If I have to check for the
correctness of one run through a function, and that function generates
200 values, then I have to check 200 values to ensure that it is
correct. If I am very lucky I might be able to check it using less than
200 lines of code/test/result-specification {say, for instance, that all
4 of these nodes must have the same annotation}, but in practice it
doesn't work that way.

If I have a typical 50 line function then it probably will need 8
different tests to get minimal (branch) coverage. That means that to
test a single function adequately, I need to write 1600 lines for the
tests - that is a 32-to-1 ratio of test code to actual code.

Now, in a 1 kloc block, I have 20 such functions. To test each one
separately, I'd have to write 32kloc.

Note: no combinatorial explosion anywhere. Just simple multiplication:
200 output values/function/test x 8 tests/function x 20 functions =
32kloc of test values.

CTips

unread,

Nov 16, 2004, 3:49:33 PM11/16/04

to

Ronald E Jeffries wrote:

> I can't remember the last time I encountered a team that had a very,
> very low defect rate, but I'm sure they are out there somewhere. And
> maybe they don't need TDD. Someday maybe I'll get to observe such a
> team and find out.
>

You know you have a standing invitation. Swing by whenever you're in the
area. I'll make sure you get access to our bug database, and see if I
can arrange access to our customer help e-mails.

sted...@yahoo.com

unread,

Nov 16, 2004, 3:53:44 PM11/16/04

to

Your thoughts make a lot of sense CTip. Now for the really big
question. How do I get a low defect rate like yours without doing TDD?

Stede

John Roth

unread,

Nov 16, 2004, 4:51:07 PM11/16/04

to

"CTips" <ct...@bestweb.net> wrote in message

news:10pkckb...@corp.supernews.com...

It depends on what you're looking for. TDD isn't
a testing methodology first. It's a design methodology
first and a testing methodology second.

TDD normally gives statement coverage in the high
90s, and branch coverage in the middle 90s. As far
as other coverage metrics, I haven't a clue - I haven't
seen either measurements or theoretical arguments.

We've had some comparisons of TDD and classical
testing methodologies in the past. The general conclusion
was that TDD gave a much *smaller* set of tests than
the testing gurus thought was an adequate set, by a
factor of at least five.

I normally don't worry overmuch about a "minimal"
set of tests. I understand that's a consideration in
classical testing, but it's irrelevant in XP for one very
simple reason: tests *must* be written to run as fast
as possible, because you will be running literally
hundreds of them on every edit and compile cycle.

As a very high level rule of thumb, I'd expect
an average of around 5 loc for each test written.
Since TDD does one test and then the code to
make it pass on each pass through the loop, the
problem must be constructible in that kind of
very small step. There may very well be areas
where this isn't possible, and compiler optimization
algorithms may very well be one of them.

Let's just say that I have my doubts. What
I'd consider as an adequate demonstration is
someone who has a lot of experience with TDD
trying it and deciding that it's not an appropriate
area to apply that technique.

John Roth

Laurent Bossavit

unread,

Nov 16, 2004, 5:27:36 PM11/16/04

to

> Whats the largest program you wrote yourself?

I haven't tended to use LOC as a metric. I've preferred counting how
many people used what I wrote, for instance.

Anyway, I could well be wrong about that, but I suspect the program I
alluded to, which included a compiler, might have been the largest
single program I wrote by myself. It ran to 30KLOC of Java code, which
according to Capers Jones is equivalent to about 72KLOC of C code.

There was a C++ project to which I was a contributor for a year and a
bit, where I might have produced quite a bit more, but I no longer have
access to the source.

> How long did it take?

I recall a little under three months. I did a number of other projects
that same year, for an aggregate size of, well, whatever it was. Again,
I wasn't counting.

> What's the most complex algorithm you've implemented? How long was it?

I couldn't say - depends on what you mean by "complex". The most
challenging at the time might have been that RTP protocol implementation
for an Internet telephony thingie, but it was tiny.

> Did you do either of them TDD?

Nope. I've tended to shift my focus even further away from "pissing
contest" statistics since I started doing TDD, for reasons semi-directly
related to my motivations in doing so.

One of my pleasant accomplishments using a mix of techniques derived
from a then-fresh understanding of TDD was *removing* 15KLoc from a Java
program of about 45K initially, *adding* functionality in the process
(and quite a bit of robustness).

By my own assessment, TDD has been a productivity boost, but I couldn't
care less about showing it in terms of LOC.

Laurent

Thad Smith

Nov 16, 2004, 7:03:10 PM

sted...@yahoo.com wrote:

> What about TDD as regression testing?

One of the things that occurs to me is that if you write a test, write
code, write a test, etc. cycle, you are basically designing your tests
with assumptions about the implementation. Now suppose you change the
implementation. Let's say you need to increase performance and can do
this by handling an input parameter x in two ranges: 0 - 15, and 16 or
greater, rather than a single range as originally designed. If the
original test didn't test these ranges separately, the test after change
might not catch a new bug. How do you prevent such cracks opening when
you refactor?
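
A minimal sketch of the kind of crack I mean, in C. The names `scale_v1`, `scale_v2` and `original_test_passes`, and the "multiply by three" behaviour, are all invented for illustration; the bug in the upper range is deliberate.

```c
#include <assert.h>

/* Original design: one range, one formula (hypothetical behaviour). */
unsigned scale_v1(unsigned x)
{
    return x * 3u;
}

/* Reworked for speed: 0..15 served from a table, 16 and up computed.
 * The computed branch carries a deliberate bug (x * 2 instead of x * 3). */
unsigned scale_v2(unsigned x)
{
    static const unsigned table[16] = {
        0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45
    };
    if (x <= 15u)
        return table[x];
    return x * 2u;   /* bug: should be x * 3u */
}

/* The test written during the original cycle: a single sample value. */
int original_test_passes(unsigned (*f)(unsigned))
{
    return f(5u) == 15u;
}
```

Both versions pass the original one-sample test, yet they disagree at x = 16, which that test never looks at.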

Thad

Stede Troisi

Nov 16, 2004, 6:33:57 PM

"John Roth" <newsg...@jhrothjr.com> wrote in message:

> It depends on what you're looking for. TDD isn't
> a testing methodology first. It's a design methodology
> first and a testing methodology second.

That is really true. I didn't get that until the Fibonacci example in Kent's
book. I guess I am slow.

>
> TDD normally gives statement coverage in the high
> 90s, and branch coverage in the middle 90s. As far
> as other coverage metrics, I haven't a clue - I haven't
> seen either measurements or theoretical arguments.

I don't know what metric you are using, but you and CTips have vastly
different numbers. How is something like this provable? How can we have
scientific data?

John, you must understand how frustrating this can get when you hear two
experts with such vastly different data sets. It is like hearing 50% of
economists saying outsourcing is good and another 50% saying it is bad. It
just gives people who want to learn so much less faith in the whole
process.

> We've had some comparisons of TDD and classical
> testing methodologies in the past. The general conclusion
> was that TDD gave a much *smaller* set of tests than
> the testing gurus thought was an adequate set, by a
> factor of at least five.

Can you show it? Prove it? I am not saying this in an adversarial tone.

> I normally don't worry overmuch about a "minimal"
> set of tests. I understand that's a consideration in
> classical testing, but it's irrelevant in XP for one very
> simple reason: tests *must* be written to run as fast
> as possible, because you will be running literally
> hundreds of them on every edit and compile cycle.
>
> As a very high level rule of thumb, I'd expect
> an average of around 5 loc for each test written.
> Since TDD does one test and then the code to
> make it pass on each pass through the loop, the
> problem must be constructible in that kind of
> very small step. There may very well be areas
> where this isn't possible, and compiler optimization
> algorithms may very well be one of them.
>
> Let's just say that I have my doubts. What
> I'd consider as an adequate demonstration is
> someone who has a lot of experience with TDD
> trying it and deciding that it's not an appropriate
> area to apply that technique.
>

That certainly makes sense.

Stede

> John Roth
>

CTips

Nov 16, 2004, 7:41:08 PM

Laurent Bossavit wrote:
>>What's the largest program you wrote yourself?
>
>

<snip> ... It ran to 30KLOC of Java code [in 3 months]

>
> There was a C++ project to which I was a contributor for a year and a
> bit, where I might have produced quite a bit more, but I no longer have
> access to the source.

<snip>

>
>>What's the most complex algorithm you've implemented? How long was it?
>
> I couldn't say - depends on what you mean by "complex". The most
> challenging at the time might have been that RTP protocol implementation
> for an Internet telephony thingie, but it was tiny.
>
>>Did you do either of them TDD?
>
> Nope.

> By my own assessment, TDD has been a productivity boost, but I
> couldn't care less about showing it in terms of LOC.

This set of questions wasn't meant to question whether TDD was or was
not a productivity boost, but to see what size of program you'd
applied it to. The answer appears to be: small programs only.

Now, the question is what makes you think that it is better than (or
even remotely comparable to) other approaches for building medium and
large sized programs with a "reasonable" degree of correctness.

It can't be your personal experience, because you've never used TDD on
even a medium-sized or medium-complexity program.

Let's hear it from other people - what is the largest or most complex or
most challenging (or some combination) program that you wrote using TDD?

[Note that I have never worked on a large software program - most of my
experience is with medium-sized programs, in the 100k to 300kloc range.
I typically consider 50-100kloc small, 100kloc - 500kloc medium,
500kloc+ large]

Andrew McDonagh

Nov 16, 2004, 7:58:17 PM

CTips wrote:
snipped

> Let's hear it from other people - what is the largest or most complex or
> most challenging (or some combination) program that you wrote using TDD?

We are using TDD (and XP in fact) on (IMO) a medium sized application.
It's a telecomms equipment Network Management System. Multiple Servers
managing telecomms equipment and providing that data and manageability
to multiple clients.

We use CORBA as the RPC mechanism between each (n equipment <-> n
servers <-> n clients)

The system currently supports Fault and Configuration Management and in
due course will provide Auditing, Performance and Security management of
the telecomms equipment.

>
> [Note that I have never worked on a large software program - most of my
> experience is with medium-sized programs, in the 100k to 300kloc range.
> I typically consider 50-100kloc small, 100kloc - 500kloc medium,
> 500kloc+ large]

These numbers are IMO meaningless. It's totally dependent upon the
language, persistence mechanisms, RPC technology, etc., which are always
going to be too different for people here on this NG to be able to compare.

For example, a C++ program can usually be much larger than its Java
equivalent, yet the same program developed in Ruby can be even smaller.

Then there's auto-generated code. Our system's CORBA IDL is only a few
hundred LOC, but generates thousands of lines of Java code.

Also, anyone can write lots of code. If productivity is measured by LoC,
then people tend to just write lots of code, rather than well-factored
code that does the same job, but in fewer LoC.

How about we compare functionality instead?

CTips

Nov 16, 2004, 8:25:13 PM

Stede Troisi wrote:

> "John Roth" <newsg...@jhrothjr.com> wrote in message:
>
>
>>It depends on what you're looking for. TDD isn't
>>a testing methodology first. It's a design methodology
>>first and a testing methodology second.
>
>
> That is really true. I didn't get that until the Fibonacci example in Kent's
> book. I guess I am slow.
>
>
>>TDD normally gives statement coverage in the high
>>90s, and branch coverage in the middle 90s. As far
>>as other coverage metrics, I haven't a clue - I haven't
>>seen either measurements or theoretical arguments.
>
>
> I don't know what metric you are using, but you and CTips have vastly
> different numbers. How is something like this provable? How can we have
> scientific data?
>
> John, you must understand how frustrating this can get when you hear two
> experts with such vastly different data sets. It is like hearing 50% of
> economists saying outsourcing is good and another 50% saying it is bad. It
> just gives people who want to learn so much less faith in the whole
> process.
>

Actually, we're not disagreeing about the coverage - I do agree that it
is possible to use strict TDD and get 90% statement coverage. Branch
coverage will depend on the kinds of programs one writes, but for what
TDD gets used for, it's probably reasonable.

I'm also not arguing about TDD giving a productivity boost. Given how
non-systematically many (most?) programmers write code, any structure
will show a productivity boost.

However, after a point, TDD starts getting in the way. Consider an ideal
scenario - you spend time figuring out what the final code should look
like, you code it, and then generate a minimal set of tests. Clearly,
this is faster than coding a test, writing the code, possibly
refactoring, and so on. You avoid all the overhead of extra tests and of
extra refactoring.

Advocates of TDD don't believe that this ideal is achievable (or at
least, not achievable by most programmers). However, all of the most
productive programmers I know actually work this way.

I don't think that using TDD will cause long-term damage to a
programmer's development. I'd probably say it's like training wheels -
eventually you need to move beyond them.

CTips

Nov 16, 2004, 8:38:06 PM

Andrew McDonagh wrote:

Absolutely - I'd consider optimizing compilers, database kernels,
operating system kernels to be medium functionality programs. Large
functionality programs would be fighter avionics packages, air-traffic
control programs and telco switches. Note that many of these tend to be
written in C (or C++ or assembler (and maybe Ada, though I think many
avionics packages used to get waivers)).

I suspect that I'd call your program on the small side. Unless of
course, the program is a distributed fault-tolerant program with support
for things like recovery after network partitioning. Or unless it
requires switch-level 5-9s reliability. In which case it definitely
counts as a medium sized program.

Stede Troisi

Nov 16, 2004, 8:37:24 PM

I get your point now. So I guess you don't use TDD at all because you are
past the training wheel stage? If you do use a little TDD what is the best
way to determine when to use it and when not?

Thanks,
Stede

"CTips" <ct...@bestweb.net> wrote in message

news:10pla3r...@corp.supernews.com...

John Roth

Nov 16, 2004, 9:41:05 PM

"Stede Troisi" <st...@verizon.net> wrote in message
news:F7wmd.7314$063.6686@trndny03...

>
> "John Roth" <newsg...@jhrothjr.com> wrote in message:
>
>> It depends on what you're looking for. TDD isn't
>> a testing methodology first. It's a design methodology
>> first and a testing methodology second.
>
> That is really true. I didn't get that until the Fibonacci example in
> Kent's book. I guess I am slow.

I wouldn't say slow. The word "test" seems to mislead
a lot of people, and it's not helped by [name withheld] insisting
that it's the right word because we usually use xUnit
as the vehicle.

As far as testing goes, I (and quite a few others)
would describe the test suite as a regression test,
not a unit or integration test suite in the classical
sense.

>> TDD normally gives statement coverage in the high
>> 90s, and branch coverage in the middle 90s. As far
>> as other coverage metrics, I haven't a clue - I haven't
>> seen either measurements or theoretical arguments.
>
> I don't know what metric you are using, but you and CTips have vastly
> different numbers. How is something like this provable? How can we have
> scientific data?
>
> John, you must understand how frustrating this can get when you hear two
> experts with such vastly different data sets. It is like hearing 50% of
> economists saying outsourcing is good and another 50% saying it is bad. It
> just gives people who want to learn so much less faith in the whole
> process.

Well, empirically it comes out of a number of measurements
that were reported on the XP mailing list. Then we looked
at why we were getting such high numbers, and discovered
that if you do TDD ***exactly*** by the book, not adding
one more keystroke than you needed to make each test
pass, you _should_ get 100% statement and branch coverage.

As a practical matter, though, it's just soooooo easy to slide
in one more statement because you "know you're going to
need it", and there goes your 100% mark. Things like
Java's exception mechanism also tend to force you to write
untested code unless you really take care to do things like
write the tests for exception handlers first.

CTips, however, was talking about a number of other
coverage metrics. I have no data on them, and have
no particular reason to think that TDD would do particularly
well, although the tendency to produce lots of little straight-line
methods should have a good effect on path coverage.

>
>> We've had some comparisons of TDD and classical
>> testing methodologies in the past. The general conclusion
>> was that TDD gave a much *smaller* set of tests than
>> the testing gurus thought was an adequate set, by a
>> factor of at least five.
>
> Can you show it? Prove it? I am not saying this in an adversarial tone.

You'd have to find the original e-mail set. I believe
that the discussion was between either Kent or Ron
and a rather highly respected testing guru. Kent said
that a test set of about 5 was adequate, the guru said
it needed about 80!

I can see where they were both coming from;
it's simply a different mindset.

John Roth

>
> Stede
>
>> John Roth
>>
>
>

John Roth

Nov 16, 2004, 9:46:03 PM

"Thad Smith" <Thad...@acm.org> wrote in message
news:419a8807$1...@omega.dimensional.com...

That is a truly excellent question, to which there is,
unfortunately, no equally excellent answer.

The best I can say is that there are a number of factors
that have to be considered.

One is that refactoring is, technically, behavior preserving
on some level. Tests written outside of that level shouldn't
be affected, and code written inside that level should have
new tests written for it.

A second is that you have to keep refactoring and
reworking the tests as well as the code. Programmer
tests are not static and you can get quite a boost by
refactoring them occasionally.

John Roth
>
> Thad
>

CTips

Nov 16, 2004, 9:49:05 PM

Stede Troisi wrote:

> I get your point now. So I guess you don't use TDD at all because you are
> past the training wheel stage? If you do use a little TDD what is the best
> way to determine when to use it and when not?

A good question to which I unfortunately don't necessarily have a good
answer. Maybe the following will help.

I'm assuming you're using TDD for designing your modules. Next time you
have to, try and work out the design, preferably in your head, but
possibly with paper and pencil, and see how far you can get. See how
much of the module you can actually visualize. Definitely _think_ about
all the test cases you'd expect to write. Think about all the invariants
you'd expect to hold. Think about how you expect the module to be used.
Think about the data-structures you'd use. Think about any algorithms
that will be required.

Now, if at the end of this process, you think you have a fairly good
handle on what the final code will look like, go ahead and code it up.
You may want to code it in phases, and test each phase separately.

If you find that you are not going back and rewriting code, then you
know you don't need TDD. If you find that you're rewriting code because
your understanding of the problem was at fault, then maybe TDD is still
a good design technique. If you find that you're rewriting code because
the code becomes cleaner that way, you probably don't need TDD, but
maybe you do.

Thad Smith

Nov 17, 2004, 1:33:35 AM

CTips wrote:

> sted...@yahoo.com wrote:
>
>> What about Design by Contract?
>

> Design by contract is an interesting idea. Unfortunately, it's also more
> overhead than it's worth in most situations.
>
> Let's look at the whole idea of preconditions, postconditions and
> invariants. They are expressions on the state of the program that are
> (supposed to be) true before, after, and during the execution of a
> piece of code. Quite often, the actual condition ends up being very hard
> to express correctly.
>
> For instance, how do I specify the post-condition for the factorial
> function?
> n = fact(i);
> /* at the end of this n is i! */
> To actually check this would require rewriting the factorial function
> some other way.
> n = fact(i);
> assert( n == fact_1(i) );
> So, in practice, one would just leave the post-condition as a comment.
> Which kind of defeats the purpose.

A partial solution, which I might use, is
assert ((i == 0 && n == 1) || (i > 0 && n == i * fact(i-1)));

That requires computing another factorial and doesn't fully test the
function, but does do one consistency check. If tested over enough
values, it should actually confirm correct behavior (assuming no hidden
states in the function).
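
To show what "tested over enough values" could look like, here is a rough sketch in C. The iterative `fact` and the helper `recurrence_holds` are my own stand-ins, not anyone's real code. The idea is that the base case plus the recurrence, checked for every i up to some bound, pins down each value by induction.

```c
#include <assert.h>

/* Function under test: iterative factorial (hypothetical implementation). */
unsigned long fact(unsigned i)
{
    unsigned long n = 1;
    while (i > 1)
        n *= i--;
    return n;
}

/* Checks the base case fact(0) == 1 and then n == i * fact(i-1)
 * for every i in 1..max. Returns 1 if all checks hold, 0 otherwise.
 * Keep max at 12 or below so results fit in a 32-bit unsigned long. */
int recurrence_holds(unsigned max)
{
    unsigned i;
    if (fact(0) != 1)
        return 0;
    for (i = 1; i <= max; i++) {
        if (fact(i) != i * fact(i - 1))
            return 0;
    }
    return 1;
}
```

A test run would simply assert `recurrence_holds(12)`. It still costs a second factorial per check, as noted above.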

> Another problem with pre-, post- and invariant conditions in complex
> situations is that coming up with the "right" conditions is pretty
> difficult (particularly the invariants). Here "right" means that they
> correctly capture the state, are weak enough to be written succinctly,
> and are strong enough to detect many bugs.

Difficult, yes, but I think the exercise of doing it builds confidence
in a correct approach.

Thad

Ronald E Jeffries

Nov 17, 2004, 5:18:48 AM

On Tue, 16 Nov 2004 15:46:58 -0500, CTips <ct...@bestweb.net> wrote:

>I tend to write about 100klocs of production/production-quality code
>every year (I'm not counting things like the test code itself). It's
>slipped now that I'm doing customer/sales/management kind of stuff, but
>I'm still trying to get to about half that. And it's not exactly
>simple stuff either - at this point a large part of what I write
>involves development of new algorithms as well as the coding.
>
>So permit me to say this - if you can hit those kinds of productivities
>using TDD and the kinds of testing approaches you're advocating, more
>power to you. If you can't, then maybe you should consider how to change
>your approach to get there.

Most programmers are not up to your standards, no matter what
techniques they use. It's problematical to measure any process against
you and people like you, don't you think?

--
Ron Jeffries
www.XProgramming.com
I'm giving the best advice I have. You get to decide if it's true for you.

Ronald E Jeffries

Nov 17, 2004, 5:21:45 AM

On Tue, 16 Nov 2004 23:33:57 GMT, "Stede Troisi" <st...@verizon.net>
wrote:

>> TDD normally gives statement coverage in the high
>> 90s, and branch coverage in the middle 90s. As far
>> as other coverage metrics, I haven't a clue - I haven't
>> seen either measurements or theoretical arguments.
>
>I don't know what metric you are using, but you and CTips have vastly
>different numbers. How is something like this provable? How can we have
>scientific data?
>
>John, you must understand how frustrating this can get when you hear two
>experts with such vastly different data sets. It is like hearing 50% of
>economists saying outsourcing is good and another 50% saying it is bad. It
>just gives people who want to learn so much less faith in the whole
>process.

Stede, let me offer two notions:

First, someone else's data tells us very little about what will happen
to us. Something, but very little. It might encourage us to try
something, or discourage us from trying it, but our own results are
what matters.

Second, CTips reports productivity from himself and his team which is
almost unprecedentedly high. Taking their figures at face value, their
experience is inapplicable to ordinary mortals. I'd like to get
someone in there to look at what they really do, but so far we haven't
been able to set that up.

Regards,

Ronald E Jeffries

Nov 17, 2004, 5:23:06 AM

On Tue, 16 Nov 2004 21:49:05 -0500, CTips <ct...@bestweb.net> wrote:

> If you find that you are not going back and rewriting code, then you
>know you don't need TDD. If you find that you're rewriting code because
>your understanding of the problem was at fault, then maybe TDD is still
>a good design technique. If you find that you're rewriting code because
>the code becomes cleaner that way, you probably don't need TDD, but
>maybe you do.

I'd suggest that while doing this experiment one would also want to
note time spent debugging, and final defect rates in the released
code.

Ronald E Jeffries

Nov 17, 2004, 5:26:51 AM

On Tue, 16 Nov 2004 17:03:10 -0700, Thad Smith <Thad...@acm.org>
wrote:

>One of the things that occurs to me is that if you write a test, write
>code, write a test, etc. cycle, you are basically designing your tests
>with assumptions about the implementation. Now suppose you change the
>implementation. Let's say you need to increase performance and can do
>this by handling an input parameter x in two ranges: 0 - 15, and 16 or
>greater, rather than a single range as originally designed. If the
>original test didn't test these ranges separately, the test after change
>might not catch a new bug. How do you prevent such cracks opening when
>you refactor?

Wouldn't we want to write new tests for each of the ranges, given that
we are trying to (a) improve performance and (b) will very likely be
creating two new methods?

Certainly if we don't, we're in the danger you mention. Therefore ...
we need to think, and write new tests as we see fit. And if we get a
defect later, then, as always, we need to reflect on what we've
learned. (Which will be something like: if we write range-dependent
code, be sure to write tests for all ranges.)
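
As a rough illustration of "tests for all ranges", here is a small C sketch. Everything in it is hypothetical: `triple_ref` stands in for the original single-range code, `triple_fast` for the two-range rewrite, and `triple_broken` for a rewrite with a slip in the upper range. The helper probes both sides of the 15/16 split rather than one arbitrary value.

```c
#include <assert.h>
#include <stddef.h>

/* Reference behaviour: the simple single-range version. */
unsigned triple_ref(unsigned x)
{
    return x * 3u;
}

/* Two-range version: shift-and-add for 0..15, multiply for 16 and up. */
unsigned triple_fast(unsigned x)
{
    if (x <= 15u)
        return (x << 1) + x;
    return x * 3u;
}

/* Two-range version with a slip in the upper range (doubles instead). */
unsigned triple_broken(unsigned x)
{
    if (x <= 15u)
        return (x << 1) + x;
    return x << 1;
}

/* Returns 1 if f matches the reference on both sides of the split. */
int agrees_at_boundaries(unsigned (*f)(unsigned))
{
    static const unsigned probes[] = { 0u, 1u, 15u, 16u, 17u, 1000u };
    size_t i;
    for (i = 0; i < sizeof probes / sizeof probes[0]; i++) {
        if (f(probes[i]) != triple_ref(probes[i]))
            return 0;
    }
    return 1;
}
```

With probes at 15 and 16, the broken variant fails the check while the correct rewrite passes. The probe list is a judgment call, not a proof.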

Does that help? Does it raise new questions or answers for you?

Ronald E Jeffries

Nov 17, 2004, 5:27:55 AM

I'm sure it will be fascinating. Did Ralph wind up deciding there was
no fit for a visit on his part?

Phlip

Nov 16, 2004, 8:14:02 PM

Thad Smith wrote:

> One of the things that occurs to me is that if you write a test, write
> code, write a test, etc. cycle, you are basically designing your tests
> with assumptions about the implementation. Now suppose you change the
> implementation. Let's say you need to increase performance and can do
> this by handling an input parameter x in two ranges: 0 - 15, and 16 or
> greater, rather than a single range as originally designed. If the
> original test didn't test these ranges separately, the test after change
> might not catch a new bug. How do you prevent such cracks opening when
> you refactor?

By making the tests more hyperactive than they need to be (to incidentally
test more details than the end-result needs), and by running all the tests
after the fewest possible edits. If refactoring fails a test, you hit undo
and try again. Either your code will squirm around within the very small
space the tests allow it to, or you will give up and make the most
frequently failing tests fuzzier. These are both good - especially if you
include a dose of "thinking about the problem space" along with the
relentless testing.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

CTips

Nov 17, 2004, 7:00:13 AM

If you rewrite your test as:
assert( n == ( i == 0 ? 1 : i*fact(i-1)) );
it becomes more apparent that you're really just rewriting the factorial
function, recursively.

>> Another problem with pre-, post- and invariant conditions in complex
>> situations is that coming up with the "right" conditions is pretty
>> difficult (particularly the invariants). Here "right" means that they
>> correctly capture the state, are weak enough to be written succinctly,
>> and are strong enough to detect many bugs.
>
>
> Difficult, yes, but I think the exercise of doing it builds confidence
> in a correct approach.

Oh, this exercise is useful. But DBC is by no means as generally
applicable, or as powerful as its advocates suggest. Like many ideas it
works better on small examples, and starts to show limitations once you
try to apply it to bigger or more complex situations.

CTips

Nov 17, 2004, 7:12:43 AM

Ronald E Jeffries wrote:

> On Tue, 16 Nov 2004 15:46:58 -0500, CTips <ct...@bestweb.net> wrote:
>
>
>>I tend to write about 100klocs of production/production-quality code
>>every year (I'm not counting things like the test code itself). It's
>>slipped now that I'm doing customer/sales/management kind of stuff, but
>>I'm still trying to get to about half that. And it's not exactly
>>simple stuff either - at this point a large part of what I write
>>involves development of new algorithms as well as the coding.
>>
>>So permit me to say this - if you can hit those kinds of productivities
>>using TDD and the kinds of testing approaches you're advocating, more
>>power to you. If you can't, then maybe you should consider how to change
>>your approach to get there.
>
>
> Most programmers are not up to your standards, no matter what
> techniques they use. It's problematical to measure any process against
> you and people like you, don't you think?
>

Why are they not at those standards? Some of it may be talent + desire +
experience, but some of it is definitely the practices we adopt.

Now, if you look at those practices, and see that they are different
from the ones you use/advocate, then you have to ask yourself - are your
current practices holding you back? Are you at your limit, or can you
get better? Obviously, switching to a new set of practices means that
your productivity may go down before it goes up again, and it may be
that the new practices don't work as well for you - but unless you try,
you'll basically plateau.

For example - I've picked up splint, and am toying with the idea of
writing my next program splint -strict warning free. I'd like to see its
impact on my overall productivity. Unfortunately, for the next few
months most of my coding is going to be VHDL, so it's unlikely I'll be
able to get around to it soon.

For some of the simpler practices I use, have a look at
http://users.bestweb.net/~ctips/

CTips

Nov 17, 2004, 7:14:07 AM

Ronald E Jeffries wrote:

> On Tue, 16 Nov 2004 15:49:33 -0500, CTips <ct...@bestweb.net> wrote:
>
>
>>Ronald E Jeffries wrote:
>>
>>
>>
>>>I can't remember the last time I encountered a team that had a very
>>>very low defect rate, but I'm sure they are out there somewhere. And
>>>maybe they don't need TDD. Someday maybe I'll get to observe such a
>>>team and find out.
>>>
>>
>>You know you have a standing invitation. Swing by whenever you're in the
>>area. I'll make sure you get access to our bug database, and see if I
>>can arrange access to our customer help e-mails.
>
>
> I'm sure it will be fascinating. Did Ralph wind up deciding there was
> no fit for a visit on his part?
>

He never got back to me. And I got caught up in work, and didn't push him.

John Roth

Nov 17, 2004, 8:46:19 AM

"CTips" <ct...@bestweb.net> wrote in message

news:10pmfap...@corp.supernews.com...

> Thad Smith wrote:
>> CTips wrote:
>>
>>> sted...@yahoo.com wrote:
>>>
>>>> What about Design by Contract?

DBC is an interesting beast. From my viewpoint, it's
an attempt to implement a lot of the formal methods
material from the '70s in a testing environment rather
than a formal or informal proof environment.

Its advocates miss a number of things, one of which
is that it's essentially opportunistic testing. It tests
based on whatever values happen to be generated
during the test run, rather than the carefully selected
values of classical testing or the different (but equally
carefully selected) values of TDD.

And of course, just about everyone who's tried
it seriously points out that you can't write what
you'd really like for postconditions much of the
time.

What I'd really like is a program verifier. I know
that it seems to be a really "hard" research topic,
but I suspect that's part of the problem. What
little I've seen on the subject is that they're trying
to verify pre-written modules, and we know
(from an XP standpoint) that we don't really
want to do that. Integrating testing with coding
has really pervasive effects on how we code;
integrating verification with coding should have
equally pervasive effects.

John Roth

Laurent Bossavit

Nov 17, 2004, 11:13:26 AM

> For instance, how do I specify the post-condition for the factorial
> function?
> n = fact(i);
> /* at the end of this n is i! */

I'm not sure why you refer to "the" post-condition - there could be any
number, depending on which characteristics of fact were important in the
program.

Factorial would have a precondition of "i >= 0", and offhand we may
suppose that the postcondition "n > 0" is relevant.

The point of contracts is that they avoid having to resort to defensive
programming - if your factorial gets passed a negative integer, then
that's a bug in client code, not in factorial. You can leave out all
error-handling code related to that condition. By the same token, a
violated postcondition says that the bug was in your function.
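
A small sketch of what I mean, in C, using plain `assert` as a stand-in for real contract support. The function name `fact_checked` is made up for the example, and the loop is just one possible implementation. Note there is no error handling for a negative argument, only the assertion.

```c
#include <assert.h>

/* Factorial with its contract spelled out as assertions.
 * Precondition:  i >= 0  (a violation is a bug in the caller).
 * Postcondition: n > 0   (a violation is a bug in this function).
 * Callers should keep i at 12 or below so the result fits in a
 * 32-bit long; that limit is an assumption of this sketch. */
long fact_checked(int i)
{
    long n = 1;
    int k;

    assert(i >= 0);              /* precondition */
    for (k = 2; k <= i; k++)
        n *= k;
    assert(n > 0);               /* postcondition */
    return n;
}
```

Calling `fact_checked(-1)` aborts at the precondition, which points the finger at the caller rather than at factorial.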

Laurent

Laurent Bossavit

Nov 17, 2004, 11:33:05 AM

> Now, the question is what makes you think that it is better than (or
> even remotely comparable to) other approaches for building medium and
> large sized programs with a "reasonable" degree of correctness.

I don't remember stating that I think that, so it most certainly isn't
"the" question. :)

Your experience with large programs will yield valuable insights as to
how TDD might fare in such a context, and I'm interested in these
insights, should I ever work on a large program. However, I'd like to be
sure that your scrutiny bears upon TDD as I practice it, rather than on
some misunderstanding of it.

Has the conversation thus far cleared up some of the points that were
unclear for you about TDD?

In particular, what differences has it revealed between the way you
attempted to use TDD and the way practitioners use it?

Laurent

CTips

Nov 17, 2004, 11:41:02 AM

Laurent Bossavit wrote:

> Has the conversation thus far cleared up some of the points that were
> unclear for you about TDD?
>
> In particular, what differences has it revealed between the way you
> attempted to use TDD and the way practitioners use it?
>

There are no differences. I did it exactly the way it's advocated. It
just turns out to be inefficient - extremely so. I'd estimate that I'd
probably be about 2x to 10x less efficient if I used TDD to program.

>>Now, the question is what makes you think that it is better than (or
>>even remotely comparable to) other approaches for building medium and
>>large sized programs with a "reasonable" degree of correctness.
>
>
> I don't remember stating that I think that, so it most certainly isn't
> "the" question. :)
>
> Your experience with large programs will yield valuable insights as to
> how TDD might fare in such a context, and I'm interested in these
> insights, should I ever work on a large program. However, I'd like to be
> sure that your scrutiny bears upon TDD as I practice it, rather than on
> some misunderstanding of it.
>

Like I said, most of my work has been with medium sized programs. I have
never really worked on a 500kloc+ sized program. So I don't really have
any insights about large programs - except that I'd try not to write a
large program - either break it up into medium sized programs, or try to
find a way of reducing the size, possibly through the use of a little
language.

I think that someone who has worked on more than one such project in a
non-management but non-grunt position is really qualified to talk about
practices in programs of that scale. And it would help if a couple of
those projects were after the introduction of workstations (mainframe
style practices might be a little dated). I don't really think that
there are too many such people around, unfortunately.

Laurent Bossavit

Nov 17, 2004, 12:32:48 PM

> There are no differences. I did it exactly the way it's advocated. It
> just turns out to be inefficient - extremely so.

OK - that clears up doubts I had from some of your comments, such as
about having to write 1000 lines before you could write a real test.

I'm assuming you tried TDD on a problem you selected specifically as
being amenable to it, rather than what you would consider a "real"
programming task ?

Inefficient as full-on TDD turned out to be for you when you tried it,
did you notice any particular effect on the resulting design of your
code, or its defect density, or any other characteristic ?

Laurent

CTips

Nov 17, 2004, 1:02:23 PM

Laurent Bossavit wrote:

I used it formally to develop a toy example (parsing a formatted text
file, probably ~200loc) and informally to develop a data-structure
(lists with length, I believe, also about that length).

Unfortunately, at that size/complexity, I don't inject too many bugs -
coupled with the kind of defensive programming I do, I can identify any
bugs with very little testing.

As a matter of fact, in the toy example, because I tried consciously to
keep from doing anything unnecessary, the time to isolate all bugs
probably went up.

And the time to implement went way up. I can usually code and debug a
200 line C module in under an hour (unless it has some hidden
complexity), but with TDD, I think the time doubled or tripled.

CTips

Nov 17, 2004, 1:25:33 PM

While we're on the subject of productivity, I'd like to point to a
(sub-)thread in comp.arch; in particular look at the following articles
(I hope this works; I've never tried cut and paste of google archived
news messages)

http://groups.google.com/groups?selm=3DFAA2EA.FEE1135E%40bestweb.net
http://groups.google.com/groups?selm=atmhu3%24r5h%241%40vkhdsu24.hda.hydro.com
http://groups.google.com/groups?selm=3DFEDB59.34CB27B3%40bestweb.net
http://groups.google.com/groups?selm=nheqta-ih1.ln%40cohen.paysan.nom

Michael Mendelsohn

Nov 17, 2004, 4:11:08 PM

CTips wrote:

> For instance, how do I specify the post-condition for the factorial
> function?
>     n = fact(i);
>     /* at the end of this n is i! */
> To actually check this would require rewriting the factorial function
> some other way.
>     n = fact(i);
>     assert( n == fact_1(i) );
> So, in practice, one would just leave the post-condition as a comment.
> Which kind of defeats the purpose.

I think the purpose of the postcondition is to prevent your code from
causing bugs in other code. Specifying that the function result should
be correct as a postcondition is useless. If you want to compute the
result with two different algorithms (or two different implementations
of the same algorithms), that will certainly increase bug detection
rates, but is the postcondition the right place to do that?

Other code can expect the factorial to return a positive integer; it
should be clear how big this integer should be, and I'd expect either a
precondition that puts a limit on the input number size, or a
postcondition that mentions that the output can be NaN. I think assert(
i > 0 ) would fail for NaN?

If I replace fact(i) with code that always returns 1, will code outside
break? If it doesn't break, but merely returns incorrect results, that
is no problem.
One expectation of the outside could be that for i>2, fact(i) > i.

If I put fact(i) > fact(i-1), I must rely on the language to compute
fact(i-1) without invoking the assertions for fact() yet again.
Languages with retrofitted assert() don't do that, do they?
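The cheap postconditions discussed above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the names fact_raw and FACT_MAX_INPUT are hypothetical, and the sketch uses unsigned 64-bit integers, so the NaN case does not arise and the input limit becomes an explicit precondition. The unchecked core fact_raw() is what lets a check refer to fact(i-1) without re-entering the assertions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 20! is the largest factorial that fits in a uint64_t. */
#define FACT_MAX_INPUT 20u

/* Unchecked core (hypothetical name): contains no assertions, so the
   checks in fact() can call it without triggering further checks. */
static uint64_t fact_raw(unsigned i)
{
    uint64_t n = 1;
    for (unsigned k = 2; k <= i; k++)
        n *= k;
    return n;
}

uint64_t fact(unsigned i)
{
    assert(i <= FACT_MAX_INPUT);                 /* precondition: no overflow */

    uint64_t n = fact_raw(i);

    assert(n >= 1);                              /* result is a positive integer */
    assert(i == 0 || n % i == 0);                /* i divides i! */
    assert(i == 0 || n == i * fact_raw(i - 1));  /* recurrence holds */
    return n;
}
```

None of these checks proves the result is i!, but a replacement that always returns 1 would trip the divisibility check for any i >= 2.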

Just my 2c
Michael
--
Still an attentive ear he lent Her speech hath caused this pain
But could not fathom what she meant Easier I count it to explain
She was not deep, nor eloquent. The jargon of the howling main
-- from Lewis Carroll: The Three Usenet Trolls

Michael Mendelsohn

Nov 17, 2004, 4:19:19 PM

Ronald E Jeffries wrote:

> On Tue, 16 Nov 2004 17:03:10 -0700, Thad Smith <Thad...@acm.org>
> wrote:
> >One of the things that occurs to me is that if you write a test, write
> >code, write a test, etc. cycle, you are basically designing your tests
> >with assumptions about the implementation.

I'd do that with Unit tests.

> >Now suppose you change the
> >implementation. Let's say you need to increase performance and can do
> >this by handling an input parameter x in two ranges: 0 - 15, and 16 or
> >greater, rather than a single range as originally designed. If the
> >original test didn't test these ranges separately, the test after change
> >might not catch a new bug. How do you prevent such cracks opening when
> >you refactor?
>
> Wouldn't we want to write new tests for each of the ranges, given that
> we are trying to (a) improve performance and (b) will very likely be
> creating two new methods?

I'd decide to do a second implementation.
I'd write a test to check the two implementations return equal results.
I'd copy the existing implementation and write a second interface to it
to make the test pass.

I'd decide to improve performance for range 0-15.
I'd write a test that tests performance for the main implementation.
I'd implement code that does it.

If the test that compares implementations still passes,
where's the worry?

The tests show distinctly that
a) there are indeed 2 different implementations
b) the performance requirement you've set has been fulfilled
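A minimal sketch of the "compare the two implementations" test, assuming a made-up function (a bit counter) since the thread never says what the real one computes. The names bits_ref, bits_fast and impls_agree are hypothetical. The second implementation handles 0-15 with a table and 16 or greater with a separate path, mirroring the two ranges in the question; the performance test is left out.

```c
#include <assert.h>

/* Original implementation: count set bits one at a time. */
unsigned bits_ref(unsigned x)
{
    unsigned n = 0;
    while (x) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

/* Second implementation: table lookup for 0..15, nibble by nibble above. */
unsigned bits_fast(unsigned x)
{
    static const unsigned char nibble[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };
    if (x < 16)
        return nibble[x];

    unsigned n = 0;
    while (x) {
        n += nibble[x & 15u];
        x >>= 4;
    }
    return n;
}

/* Returns 1 if both implementations agree on every input in [0, limit]. */
int impls_agree(unsigned limit)
{
    for (unsigned x = 0; ; x++) {
        if (bits_ref(x) != bits_fast(x))
            return 0;
        if (x == limit)
            break;
    }
    return 1;
}

/* The comparison test: covers both ranges and the 15/16 boundary. */
void test_impls_agree(void)
{
    assert(impls_agree(1000));
}
```

Because the comparison sweeps across the boundary, a bug confined to one of the new ranges shows up even though the original tests never distinguished the ranges.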

Cheers

Christophe Thibaut

Nov 17, 2004, 4:55:52 PM

CTips wrote:

> Stede Troisi wrote:
>
>> I get your point now. So I guess you don't use TDD at all because you are
>> past the training wheel stage? If you do use a little TDD what is the
>> best
>> way to determine when to use it and when not?
>
>
> A good question to which I unfortunately don't necessarily have a good
> answer. Maybe the following will help.
>
> I'm assuming you're using TDD for designing your modules. Next time you
> have to, try and work out the design, preferably in your head, but
> possibly with paper and pencil, and see how far you can get. See how
> much of the module you can actually visualize. Definitely _think_ about
> all the test cases you'd expect to write. Think about all the invariants
> you'd expect to hold. Think about how you expect the module to be used.
> Think about the data-structures you'd use. Think about any algorithms
> that will be required.
>
> Now, if at the end of this process, you think you have a fairly good
> handle on what the final code will look like, go ahead and code it up.
> You may want to code it in phases, and test each phase separately.

But when I do that :

- whole design
- coding phase
- testing phase

I observe that during the coding phase, I can't modify my design until
my testing phase is done (i.e. until I have honest test coverage for the
code); doing otherwise would be hacking.

So either I continue coding until testing is done (continuing to code
while acknowledging some design defects... hmm), or I stop coding
and restart the design for this phase (with a risk of analysis paralysis
lurking).

For this to work I'd have to do very very good thinking in the design
phase. A design phase where design flaws would not be an option, in fact.

Given the usual pressure for delivering not too late a bug-free system,
this way of trying to make it right at the outset sounds just stressful
to me.

>
> If you find that you are not going back and rewriting code, then you
> know you don't need TDD. If you find that you're rewriting code because
> your understanding of the problem was at fault, then maybe TDD is still
> a good design technique. If you find that you're rewriting code because
> the code becomes cleaner that way, you probably don't need TDD, but
> maybe you do.

*my* probability of not going back and not rewriting code because my
design is so good = near 0.
*my* probability of having a less-than-perfect understanding of the
problem at the outset of the project = near 1.
*my* probability of having to rewrite code because it gets dirty = near 1.

So I'm glad I picked TDD after all. What I hear from what you say is
that really bright programmers don't need TDD, that it could decrease
their velocity instead of increasing it. I agree. For the rest of us (a
majority of my coworkers are like me, not very good at design) TDD just
rocks !

Regards --ct

Michael Mendelsohn

Nov 17, 2004, 5:17:44 PM

CTips wrote:

> get better? Obviously, switching to a new set of practices means that
> your productivity may go down before it goes up again, and it may be
> that the new practices don't work as well for you - but unless you try,
> you'll basically plateau.

This works both ways - if you're as good as you are, it could be a
result of combining the experiences made using both sets of practices:
and in this case, it wouldn't matter whether you used set A and switched
to B, or vice versa: you'd improve in both cases.

If a new set B is introduced, there won't be any people switching from B
to A, so you could get the impression that B is an improvement overall,
even if the improvement is due to the abovementioned effect.

To be really accurate, you'd have to identify the groups you apply the
new methodology to: "To practitioners of practice X, TDD promises a
development speedup of Y %, on the average, after a spinup time of circa
6 weeks". "When used in college-level introductory programming courses,
method B produced on average better statistics (show details) than
method A".

I've not seen statements like this.

Laurent Bossavit

Nov 18, 2004, 2:19:53 AM

> Unfortunately, at that size/complexity, I don't inject too many bugs -

I wouldn't call that unfortunate. :)

> And the time to implement went way up. I can usually code and debug a
> 200 line C module in under an hour (unless it has some hidden
> complexity), but with TDD, I think the time doubled or tripled.

It's a new technique for you - it would be surprising if you had the
same performance on your first few runs. "Trying consciously to keep
from doing anything unnecessary" would slow you right down - as if you
were a virtuoso player of some instrument, switching to a rather
different one; it would be a while before you could stop thinking of
where what finger goes, and so on.

Also I would be looking for gains from TDD a little later in the
complexity curve - when that little module has to take on one, two,
three further features.

Laurent

CTips

Nov 18, 2004, 8:52:20 AM

In my considered judgement, as complexity goes up, the amount of
work/time to do TDD goes up (you're definitely writing more tests, and
you're interrupting the coding of the module). If the time to design
does not go down, or the time to debug the resulting code does not go
down enough to counter-balance it, it's a net loss.

Given that going through a sequence of simple steps quite often
pushes one into a sub-optimal design and requires extensive rewriting,
it seems quite inefficient. This is going to be more likely in complex
situations than in simple ones. Alternatively - if you already know
up-front a near-optimal design, why bother with the TD_D_?

The number of tests required to get statement coverage when written
post-facto will usually be less than those generated by TDD. (I'm
talking about white-box style testing). If we're trying to get more than
statement coverage (which you should, for anything other than the
simplest modules), you're going to have to generate tests that will
subsume the TDD generated tests anyway.
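To make the distinction between statement coverage and stronger criteria concrete, here is a small sketch. The function abs_diff and the two test functions are hypothetical and not taken from any code discussed in the thread, and integer overflow is ignored for brevity.

```c
#include <assert.h>

/* Hypothetical example function: absolute difference of two ints. */
int abs_diff(int a, int b)
{
    int d = a - b;
    if (d < 0)
        d = -d;
    return d;
}

/* One call executes every statement, including the body of the if,
   so this alone reaches 100% statement coverage. */
void test_statement_coverage(void)
{
    assert(abs_diff(1, 3) == 2);
}

/* Branch coverage also requires the path where the if is NOT taken. */
void test_branch_coverage(void)
{
    assert(abs_diff(1, 3) == 2);   /* if taken */
    assert(abs_diff(3, 1) == 2);   /* if not taken */
    assert(abs_diff(2, 2) == 0);   /* boundary: d == 0 */
}
```

The second test function subsumes the first, which is the sense in which tests written for a stronger coverage criterion end up containing the simpler ones.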

So, where is the benefit of TDD? Basically, it's for programmers who
can't figure out the design of modules by thinking about the problem.

This sparks another thought - TDD may work for figuring out small module
design. But how are you going to architect & design a large program?
Either you can try and do it incrementally, or do a decent job upfront.

If you do it incrementally via TDD, and you find that TDD drives you
through sub-optimal intermediate architectures, you will have to
rewrite major portions of the program, every time you have to switch
architectures.

If you try and do a decent job upfront, you're going to have to think
about the program as a whole, taking into account as much as you know
about the possible requirements (and their stability/uncertainty) at
that point. But that means that you have to have the skill to do the
design up-front for a whole program. If you have that skill, then why
are you scared of designing a module? If you don't have that skill, then
how can you ever hope to tackle anything but small, simple projects?

IMO, anyone who aims to tackle medium/large programs had better out-grow
TDD.

Phlip

Nov 18, 2004, 9:05:37 AM

CTips wrote:

> So, where is the benefit of TDD? Basically, it's for programmers who
> can't figure out the design of modules by thinking about the problem.

That is another way to say "how to write programs that scale to more
complexity and functionality than can fit in programmers' brains all at the
same time".

> IMO, anyone who aims to tackle medium/large programs had better out-grow
> TDD.

You contradict yourself.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Dominic Williams

Nov 18, 2004, 9:40:56 AM

CTips wrote:

> In my considered judgement, as complexity goes up,
> the amount of work/time to do TDD goes up (you're
> definitely writing more tests, and you're
> interrupting the coding of the module). If the time
> to design does not go down, or the time to debug the
> resulting code does not go down enough to
> counter-balance it, it's a net loss.

My (5 years) experience with TDD shows quite the
opposite. I would go so far as to say that the benefits
of TDD become more and more apparent as complexity
grows.

What made you imagine that TDD takes more work as
complexity grows?

> If you do it incrementally via TDD, and you find that
> TDD drives you through sub-optimal intermediate
> architectures, you will have to rewrite major
> portions of the program, every time you have to
> switch architectures.

TDD does indeed drive one through intermediate
architectures which may be sub-optimal for a
hypothetical final program, but they are optimal for
each, real, intermediate program. This makes a lot of
sense, both technically and in business terms, in
situations where the exact scope and details of the
final program are not certain, or when the customer
benefits from getting usable software sooner.

Do you work in an area where the exact scope and
details of the final program are known from the start?

> If you try and do a decent job upfront, you're going
> to have to think about the program as a whole, taking
> into account as much as you know about the possible
> requirements (and their stability/uncertainty) at
> that point. But that means that you have to have the
> skill to do the design up-front for a whole
> program. If you have that skill, then why are you
> scared of designing a module? If you don't have that
> skill, then how can you ever hope to tackle anything
> but small, simple projects.

I believe I have that skill to a reasonable degree, and
I am certainly not scared. I used to work that way, in
fact I took great pleasure and pride in it.

But I found TDD even more pleasurable, intellectually
interesting, effective, and better adapted to the
reality of changing requirements and of working with
teams. There are many people out there who are far more
brilliant designers than I who have also adopted TDD.

> IMO, anyone who aims to tackle medium/large programs
> had better out-grow TDD.

I think anyone who aims to tackle large programs should
first learn TDD.

Dominic Williams
http://dominicwilliams.net

----

CTips

Nov 18, 2004, 10:19:25 AM

Dominic Williams wrote:

> CTips wrote:
>
> > IMO, anyone who aims to tackle medium/large programs
> > had better out-grow TDD.
>
> I think anyone who aims to tackle large programs should
> first learn TDD.
>

Out of curiosity, what kinds of programs would you call large?

What is the "largest" program you've worked on personally as a coder (a
description and a rough LOC would be nice)? What %age of the code was yours?

What is the largest module you've written solo?

Laurent Bossavit

Nov 18, 2004, 1:13:05 PM

> If the time to design does not go down, or the time to debug the
> resulting code does not go down enough to counter-balance it, it's
> a net loss.

Where I get a net productivity gain is in the latter - it reduces
debugging time handsomely.

> Given that going through a sequence of simple steps quite often
> pushes one into a sub-optimal design and requires extensive rewriting,

I can see that this can be a concern, but in practice that turns out not
to be the case.

> So, where is the benefit of TDD? Basically, it's for programmers who
> can't figure out the design of modules by thinking about the problem.

Exactly. If you're not able to solve the entire design problem (and I
mean *entire*), you're better off having a way of evolving it
incrementally *safely*. My tiny brain has trouble handling a loop, so
TDD starts to help from that point onwards.

Your massive brain can handle up to whatever it is, I'll pull the number
of 10KLOC out of thin air, then TDD will become a net benefit shortly
beyond that point.

What is evidence that you are not able to solve the *entire* problem ?
Whenever you find you have introduced a defect in your code. If you have
bugs - if you spend any time debugging - you are tackling problems whose
complexity is beyond you.

> If you don't have that skill, then how can you ever hope to tackle
> anything but small, simple projects.

TDD *is* a skill that allows you to tackle more complexity.

Laurent

Ronald E Jeffries

Nov 18, 2004, 1:34:13 PM

On Thu, 18 Nov 2004 08:52:20 -0500, CTips <ct...@bestweb.net> wrote:

>So, where is the benefit of TDD? Basically, it's for programmers who
>can't figure out the design of modules by thinking about the problem.

Well, actually, I /can/ figure out the design of modules by thinking
about the problem, and I've been doing it for many years. And still I
find TDD to be very useful. So I'm not sure the conclusion above is
quite right.

>
>This sparks another thought - TDD may work for figuring out small module
>design. But how are you going to architect & design a large program?
>Either you can try and do it incrementally, or do a decent job upfront.

Yes ...

>
>If you do it incrementally via TDD, and you find that TDD drives you
>through sub-optimal intermediate architectures, you will have to
>rewrite major portions of the program, every time you have to switch
>architectures.

That would only be true if (a) you go through suboptimal
architectures, which would only happen if you weren't up to upfront
design, since otherwise you'd know what was suboptimal, and

(b) if you found out that a decision was suboptimal only after writing
tons of code -- and very non-modular code at that. I find that when I
pay good attention to modularity, I never have to "rewrite" "major
portions", because in a modular program, every idea is implemented in
just one place -- so I just change that one.

>
>If you try and do a decent job upfront, you're going to have to think
>about the program as a whole, taking into account as much as you know
>about the possible requirements (and their stability/uncertainty) at
>that point. But that means that you have to have the skill to do the
>design up-front for a whole program. If you have that skill, then why
>are you scared of designing a module? If you don't have that skill, then
>how can you ever hope to tackle anything but small, simple projects.

Again, logic suggests one thing and experience discovers another. I
think that the issue is that the logic is trying to divide the world
into two cases, and the reality is that a good developer is designing
all the time, not just at the beginning XOR never.

>
>IMO, anyone who aims to tackle medium/large programs had better out-grow
>TDD.

Well, that's interesting advice, but it doesn't match my experience.
I've built some rather large stuff and I would certainly use TDD to do
it were I to do it again.

But TDD doesn't mean "don't think about the design". It means "think
about the design in a very concrete form". I wonder if that's part of
our disconnect.

Regards,

Andrew McDonagh

Nov 18, 2004, 2:41:25 PM

CTips wrote:

snipped

> I suspect that I'd call your program on the small size. Unless of
> course, the program is a distributed fault-tolerant program with support
> for things like recovery after network partitioning. Or unless it
> requires switch-level 5-9s reliability. In which case it definitely
> counts as a medium sized program.
>

It is a distributed fault tolerant system, and requires 5-9s
availability - not reliability.

Strange, I'd never use 5-9s availability as a measurement of application
size, cause I can get that with a simple HelloWorld app.

Andrew McDonagh

Nov 18, 2004, 2:42:42 PM

There you go again with the LoC... :-)

CTips

Nov 18, 2004, 3:25:39 PM

Remember the phrase "switch-level 5-9s reliability" - that means that the
service will be unavailable for no more than about 5 minutes per year
(99.999% uptime); this will include any scheduled maintenance on the
servers, as well as any software updates, apart from the usual possibility
of actual computer and/or network failures.
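For reference, the downtime budget implied by a given number of nines is simple arithmetic. A minimal sketch, assuming a 365-day year; the function name downtime_minutes_per_year is hypothetical:

```c
#include <assert.h>

/* Allowed downtime, in minutes per 365-day year, for an availability
   written with `nines` nines (nines = 5 means 99.999%). */
double downtime_minutes_per_year(int nines)
{
    assert(nines >= 0);

    double unavailable = 1.0;            /* fraction of time allowed down */
    for (int k = 0; k < nines; k++)
        unavailable /= 10.0;

    return 365.0 * 24.0 * 60.0 * unavailable;   /* 525600 minutes per year */
}
```

Three nines (99.9%) works out to about 525.6 minutes, i.e. roughly 8.8 hours per year, while five nines (99.999%) allows only about 5.3 minutes per year.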

It's in the context of what stresses it puts on the system; if you're
designing for 5-9 availability on a distributed fault-tolerant system
[even if you're building on top of some framework like Horus/Isis], it
gets pretty complex.

Some of the nastier problems
- what happens if the network partitions, and someone makes an update?
How do you reconcile the information when they rejoin?
- How about things like a bad router table somewhere?
- What kind of failure detectors are you using?
- Are you dealing with Byzantine failure models? What resilience are you
targeting?
- What happens when A can talk to B and C (and vice versa), but the link
between B & C is very slow?

Hats off to you if you've already built the infrastructure to test some
of those corner cases, let alone derive the code from them.

CTips

Nov 18, 2004, 3:41:27 PM

Andrew McDonagh wrote:

BTW: you can only get 5-9s availability on a simple HelloWorld app if
your machine does not go down for more than about 5 minutes a year. Good
luck, particularly if you were in Florida this year, or in the North-east
last year.

CTips

Nov 18, 2004, 8:43:24 PM

Laurent Bossavit wrote:
>>If the time to design does not go down, or the time to debug the
>>resulting code does not go down enough to counter-balance it, it's
>>a net loss.
>
>
> Where I get a net productivity gain is in the latter - it reduces
> debugging time handsomely.
>
>
>>Given that going through a sequence of simple steps quite often
>>pushes one into a sub-optimal design and requires extensive rewriting,
>
>
> I can see that this can be a concern, but in practice that turns out not
> to be the case.
>
>
>>So, where is the benefit of TDD? Basically, it's for programmers who
>>can't figure out the design of modules by thinking about the problem.
>
>
> Exactly. If you're not able to solve the entire design problem (and I
> mean *entire*), you're better off having a way of evolving it
> incrementally *safely*. My tiny brain has trouble handling a loop, so
> TDD starts to help from that point onwards.
>
> Your massive brain can handle up to whatever it is, I'll pull the number
> of 10KLOC out of thin air, then TDD will become a net benefit shortly
> beyond that point.

The reason we can handle the design of large systems is because of
layers of abstraction. When we're thinking about a 100kloc program, we
don't think about the 100kloc simultaneously, we just think of the top
level functions/data-structures/abstractions. Then, separately, we think
of how each of the components of the top abstraction layer is to be
implemented. And so on and so forth.

Of course, it isn't that clean in practice, but the point is still this
- if you come up with the right kinds of abstractions, you don't have to
think about the whole program at the same time. System architecture, to
a large extent, is coming up with the right kinds of top-level
abstractions. ADTs are just another (lower-level) abstraction mechanism.

On a somewhat different note, the largest program for which I didn't
really have too many abstraction layers was about 30 kloc. For
performance reasons, it was exactly one function with the only
abstraction mechanism being macros. That was quite an experience.
However, it wasn't too bad; I think it took about 4 months total.

> What is evidence that you are not able to solve the *entire* problem ?
> Whenever you find you have introduced a defect in your code. If you have
> bugs - if you spend any time debugging - you are tackling problems whose
> complexity is beyond you.

I don't see how that follows. You can introduce "defects" into a "hello,
world" program.

I'd say if you keep changing the structure (architecture) of the
program, then the complexity is beyond you.

>
>>If you don't have that skill, then how can you ever hope to tackle
>>anything but small, simple projects.
>
>
> TDD *is* a skill that allows you to tackle more complexity.

If you're starting from nothing, certainly TDD is a skill that allows
you to tackle more complexity. I'm not saying that TDD is not
appropriate at a certain skill level. It's just that, I suspect, most
good programmers will outgrow it.

The funny thing is that none of the more productive programmers I know
use TDD. And a few of them are pretty passionate about trying to
increase their productivity and probably would have looked at it in the
past. I haven't actually discussed the issue with them, but I'd be
surprised if their reasons for not using TDD are different from mine.

CTips

Nov 18, 2004, 9:01:21 PM

Ronald E Jeffries wrote:

>
> Well, that's interesting advice, but it doesn't match my experience.
> I've built some rather large stuff and I would certainly use TDD to do
> it were I to do it again.

Do you know of anyone who's using TDD to do something medium/large
sized? I know of a couple of people who say they are, but without
knowing more details, I can't be sure. You're more likely to know.

> But TDD doesn't mean "don't think about the design". It means "think
> about the design in a very concrete form".

But it also prescribes how to implement code - in an incremental
fashion. Remember an old thread about how to implement the isvowel()
function?

You add an 'a' and the function looks like

    int
    isvowel(unsigned char c)
    {
        if( c == 'a' ) {
            return 1;
        }
        else {
            return 0;   /* integer 0; the character '0' is nonzero */
        }
    }

You add 'e', and the condition looks like

    if( c == 'a' || c == 'e' )
        ...

At some point you figure out that you want to use a table (or one of the
other compact mechanisms) and the code becomes (using C99 designated
initializers, with UCHAR_MAX from <limits.h>):

    #include <limits.h>

    static const int isvowel_tbl[UCHAR_MAX+1] = {
        ['a'] = 1, ['e'] = 1, ['i'] = 1, ['o'] = 1, ['u'] = 1,
        ['A'] = 1, ['E'] = 1, ['I'] = 1, ['O'] = 1, ['U'] = 1
    };
    int isvowel(unsigned char c) { return isvowel_tbl[c]; }

Now, suppose I started off with the table based implementation right
up-front, but just filled in the 'a' position - would that be TDD? Or
would that violate the "do the simplest thing" rule?

If I can't start off with the table based implementation in TDD (and
based on the contents of the previous thread, I seem to remember that
you said you can't), then isn't all the prior coding a waste of time?

Now extend that to something more complex, where you may have to write
more code before the "right" design emerges, and the amount of code you
have to throw away increases. Or what if there are multiple intermediate
designs that have to be discarded?
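As a rough illustration of what survives the switch from the if-chain to the table, the sketch below keeps both forms behind the same signature and checks that they agree. It assumes C99 designated initializers and UCHAR_MAX from <limits.h>; the names isvowel_chain, isvowel_impls_agree and test_isvowel are hypothetical and do not come from the earlier thread.

```c
#include <assert.h>
#include <limits.h>

/* Table-driven implementation. */
static const int isvowel_tbl[UCHAR_MAX + 1] = {
    ['a'] = 1, ['e'] = 1, ['i'] = 1, ['o'] = 1, ['u'] = 1,
    ['A'] = 1, ['E'] = 1, ['I'] = 1, ['O'] = 1, ['U'] = 1
};

int isvowel(unsigned char c) { return isvowel_tbl[c]; }

/* The if-chain that adding one vowel at a time would eventually reach. */
static int isvowel_chain(unsigned char c)
{
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u' ||
           c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U';
}

/* Returns 1 if both implementations agree for every unsigned char value. */
int isvowel_impls_agree(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++) {
        if (isvowel((unsigned char)c) != isvowel_chain((unsigned char)c))
            return 0;
    }
    return 1;
}

/* Tests written against the interface only; they pass for either form. */
void test_isvowel(void)
{
    assert(isvowel('a') == 1);
    assert(isvowel('E') == 1);
    assert(isvowel('b') == 0);
    assert(isvowel('0') == 0);
}
```

The tests never mention how isvowel() is implemented, so the code thrown away when moving to the table is the body of the function, not the tests.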

Vladimir Levin

Nov 18, 2004, 10:27:34 PM

CTips <ct...@bestweb.net> wrote in message

> > > IMO, anyone who aims to tackle medium/large programs
> > > had better out-grow TDD.

Quite frankly, I am finding your repeated claims to be somewhat
grating. On one hand, you have a right to your opinion, but on the
other hand, you disregard all evidence presented by people such as
Laurent, Iljya, Phlip, Ron, RCM, with respect to fairly large projects
that have been successfully run using XP and TDD. It is one thing to
say "My experience suggests TDD is not worthwhile." It is another to
imply TDD is for poor developers who need "training wheels."

Also, your position is not very logically consistent in my view. You
seem to point in the direction of writing large amounts of code based
on an initial visualized design, yet you also admit that the right
design emerges only after a significant amount of code has been
written. Finally, VERY large projects are inherently difficult. This
is just my opinion, but a VERY large project, by which I mean about
anything in the 500,000+ LOC range, is just plain difficult, regardless
of the methodology used. If I were managing a truly enormous project,
I would be much more concerned about the long term maintainability of
my code than about the lines of code being developed per week. I would
want a robust application, and I would expect enhancements and changes
to be done quickly and without adding a lot of new bugs. Also, I would
consider constant wholesale rewrites from scratch to be a bad thing.

As for the training wheels analogy, I'd say TDD is much more like a
harness for mountain climbing. You may find it uncomfortable
initially if you're not used to it, but at the end of the day, you're
much better off learning to live with it.

Ronald E Jeffries

Nov 18, 2004, 11:05:11 PM

On Thu, 18 Nov 2004 20:43:24 -0500, CTips <ct...@bestweb.net> wrote:

>If you're starting from nothing, certainly TDD is a skill that allows
>you to tackle more complexity. I'm not saying that TDD is not
>appropriate at a certain skill level. It's just that, I suspect, most
>good programmers will outgrow it.

My programming is actually moderately good, and I hang with TDD
programmers who can kick my butt. I'm not claiming to be as good as
you are (by the way, I read a bunch of your tips stuff, and it's
great) but there seem to me to be a lot of programmers lower in the
pyramid.

>
>The funny thing is that none of the more productive programmers I know
>use TDD. And a few of them are pretty passionate about trying to
>increase their productivity and probably would have looked at it in the
>past. I haven't actually discussed the issue with them, but I'd be
>surprised if their reasons for not using TDD are different from mine.

If I understand your postings on TDD, I have the impression that
you're not as yet very good at it. For sure, it sounds like you are
doing something very different from what I do, and the experiences you
describe are quite different as well. Just looking at a new technique,
as you suggest these other folks have done, really isn't enough to
qualify us to judge its value, it seems to me.

Ronald E Jeffries

Nov 18, 2004, 11:15:38 PM

I didn't say you can't. I said that //I// wouldn't. I always start
with the absolute dumbest code I can think of, because I want to find
out how much hassle it is to change things. I need to push TDD to the
limits to be able to assess how best to use it. I feel that I owe that
to the people I recommend things to.

What's interesting is that it doesn't seem to take me much longer, and
I find that the "shape" of the program comes out better when I define
it with tests than when I define it by speculative design.

>
>Now extend that to something more complex, where you may have to write
>more code before the "right" design emerges, and the amount of code you
>have to throw away increases. Or what if there are multiple intermediate
>designs that have to be discarded?

Well, in the example above, we threw away, what, this many characters:
" if( c == 'a' || c == 'e' )" to which we had never felt any love or
commitment. By writing that simple code, we were able to focus on the
interface to our code (in this case not very interesting but still
more important than the implementation) and it was all at the cost of
about a dozen extra characters of writing.

I find that that sort of ratio generally holds: I throw away very
little code that took very little time to write; I preserve the
interface; I replace the little chunk of code with something bigger
and better.

So it's not like rewriting 10,000 lines. It's like writing 10,100
lines instead of 10,000, with the benefit that the desirable shape of
the code comes to my mind more readily, and I write extra stuff less
frequently. Net, I come out ahead in elapsed time to working code that
I'm proud of.
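
A minimal sketch of that trade, assuming the example under discussion was a vowel test (only the discarded line `if( c == 'a' || c == 'e' )` appears in the thread; the function names and test cases below are hypothetical). The first cut is the dumbest code that passes the early tests, and the replacement keeps the same interface so the tests carry over unchanged:

```python
# Hypothetical reconstruction: only the discarded line
# `if( c == 'a' || c == 'e' )` appears in the thread; the names and
# test cases here are assumed for illustration.

VOWELS = frozenset("aeiou")

def is_vowel_first_cut(c):
    # Dumbest code that satisfies the first few tests.
    return c == 'a' or c == 'e'

def is_vowel(c):
    # Later replacement: same interface, more complete body.
    return c in VOWELS

def passes(candidate, cases):
    # Run the same test cases against any implementation of the interface.
    return all(candidate(ch) == expected for ch, expected in cases)

early_tests = [('a', True), ('e', True), ('b', False)]
later_tests = early_tests + [('i', True), ('o', True), ('u', True)]

print(passes(is_vowel_first_cut, early_tests))  # first cut is enough early on
print(passes(is_vowel_first_cut, later_tests))  # a new test forces a change
print(passes(is_vowel, later_tests))            # replacement passes all tests
```

What gets thrown away is the one-line body of `is_vowel_first_cut`; the test list and the call signature survive.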

If you are using TDD and actually encountering big rewrites, I'd like
to observe what you're doing, e.g. in an article like the ones I
write, or just by watching. If you are speculating about what would
happen, I respectfully believe that your speculation is not going to
be borne out in reality.

CTips

Nov 18, 2004, 11:58:07 PM

Vladimir Levin wrote:
> CTips <ct...@bestweb.net> wrote in message
>
>
>>> > IMO, anyone who aims to tackle medium/large programs
>>> > had better out-grow TDD.
>
>
> Quite frankly, I am finding your repeated claims to be somewhat
> grating. On one hand, you have a right to your opinion, but on the
> other hand, you disregard all evidence presented by people such as
> Laurent, Iljya, Phlip, Ron, RCM, with respect to fairly large projects
> that have been successfully run using Xp and TDD. It is one thing to
> say "My experience suggests TDD is not worthwhile." It is another to
> imply TDD is for poor developers who need "training wheels."

Also, what evidence do you have that there are any large/medium sized
projects that have used TDD/XP? Or that XP/TDD leads to decent productivity?
I've looked at the literature. If there is any such published example, I
have yet to see it. And I have asked for some such example repeatedly.
Every time I am pointed to a paper, it turns out that the project was
small and/or the productivity was abysmal. Worse yet, a lot of them seem
to be written _after_ the project was canceled.

Now, let's look at some of the medium sized projects whose history is
well known, and which are successfully in use - things like the gcc
compiler, the linux kernel (I'm talking about only the kernel), emacs,
the apache server etc. Definitely _NOT_ done using anything like XP or
TDD. Also, done by relatively small teams of fairly competent
programmers. As far as evidence goes, the preponderance of evidence
suggests that writing medium-sized complex programs is best done by
small teams of competent programmers using practices other than XP.

> Also, your position is not very logically consistent in my view. You
> seem to point in the direction of writing large amounts of code based
> on an initial visualized design, yet you also admit that the right
> design emerges only after a significant amount of code has been
> written.

I never said that. I said that in _TDD_ the right design emerges only
after you have thrown away a lot of code.

Of course, with any approach, you will always encounter situations where
you have to rewrite major portions of the code, or, worse, change the
entire architecture. This is likely to happen due to requirement
changes, or performance issues, but can happen because your initial
visualization was faulty.

>
> Finally, VERY large projects are inherently difficult. This
> is just my opinion, but a VERY large project, which I would say is about
> anything in the 500,000+LOC range, is just plain difficult, regardless
> of the methodology used. If I were managing a truly enormous project,
> I would be much more concerned about the long term maintainability of
> my code than about the lines of code being developed per week.

Possibly - however, in my experience productive programmers also tend to
be the ones that write robust, maintainable programs. It's probably
because they write robust programs that they are productive :)

Also, if you spend time up front designing the program and ask yourself
"What can change? How would I make this aspect flexible?", you tend to
get a much more extensible design than otherwise.

> I would
> want a robust application, and I would expect enhancements and changes
> to be done quickly and without adding a lot of new bugs. Also, I would
> consider constant wholesale rewrites from scratch all the time to be a
> bad thing.

So do I - but isn't that what you can get with TDD?

> As for the training wheels analogy, I'd say TDD is much more like a
> harness for mountain climbing. You may find it uncomfortable
> initially if you're not used to it, but at the end of the day, you're
> much better off learning to live with it.

Based on the current evidence, you're at least 1 order of magnitude behind
the top-end programmers in productivity. How do you think you can
improve 10x? If TDD/XP will give you that improvement, then yes, TDD is
a safety net. If they can only get you part of the way there, then
perhaps you'll have to look beyond them.

CTips

Nov 19, 2004, 12:21:58 AM

Ronald E Jeffries wrote:

> On Thu, 18 Nov 2004 21:01:21 -0500, CTips <ct...@bestweb.net> wrote:

>
> I didn't say you can't. I said that //I// wouldn't. I always start
> with the absolute dumbest code I can think of, because I want to find
> out how much hassle it is to change things. I need to push TDD to the
> limits to be able to assess how best to use it. I feel that I owe that
> to the people I recommend things to.

Yes, I know - if you're teaching groups which include some not-so-good
people, then you have to restrict yourself to techniques that are
generally applicable. If you went to a manager and said, "I can double
productivity, but only for 10% of your team", they're probably not going
to be very happy.

And I guess you have to use those techniques yourself. You really don't
want to tell a client, "I don't actually use these techniques myself."

> What's interesting is that it doesn't seem to take me much longer, and
> I find that the "shape" of the program comes out better when I define
> it with tests than when I define it by speculative design.

Why? What do you think you miss? Do you think it's because the
code/requirements ratio is low and you tend to miss things? Or something
else?

> If you are using TDD and actually encountering big rewrites, I'd like
> to observe what you're doing, e.g. in an article like the ones I
> write, or just by watching. If you are speculating about what would
> happen, I respectfully believe that your speculation is not going to
> be borne out in reality.

No, I am speculating, since I have only done TDD on small programs, so
the rewrites are small.

However, I'd like to reiterate the example about peephole transforms.
Each transform (written with everything factored out) is about 20-40
lines of code. If you did it incrementally, you'd end up with ~30*N for
N transforms, and each step would only add 30 lines. However, by using a
little language you have to pay 2000 + 4*N lines. The point at which you
decide to abandon the incremental approach for the little language
approach determines the number of lines you throw away and the total
effort.

If you follow the "simplest possible change to pass the next test" rule
of TDD, I think you would _never_ switch to a little language approach,
since the incremental effort would always be ~30 lines, but the cost of
switching would be ~2000 lines. If so, and there were 500 transforms,
you would end up with 15,000 loc.

If on the other hand you had implemented it as a little language from
the beginning, you would end up with a total solution of 4000 lines - a
significantly smaller solution.
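
The arithmetic above can be checked with a small sketch. It assumes the cost model is exactly the figures quoted (about 30 lines per hand-written transform, a 2000-line fixed cost for the little language plus 4 lines per transform); the function names are hypothetical, and the switching-cost function further assumes that transforms written before the switch are discarded and re-expressed in the little language:

```python
# Cost model taken from the figures in the post; names are illustrative.

def incremental_loc(n, per_transform=30):
    # Hand-written transforms: roughly 30 lines each.
    return per_transform * n

def little_language_loc(n, fixed=2000, per_transform=4):
    # Little language: fixed infrastructure plus ~4 lines per transform.
    return fixed + per_transform * n

def break_even():
    # Smallest N at which the little language is no larger than
    # the hand-written approach.
    n = 0
    while incremental_loc(n) < little_language_loc(n):
        n += 1
    return n

def lines_written_when_switching(switch_at, total_n):
    # Assumption: the transforms written before the switch are thrown
    # away and rewritten in the little language.
    return incremental_loc(switch_at) + little_language_loc(total_n)

print(incremental_loc(500))                    # never switching
print(little_language_loc(500))                # little language from the start
print(break_even())                            # where the two curves cross
print(lines_written_when_switching(100, 500))  # switching after 100 transforms
```

Under these assumptions the two approaches cross at roughly 77 transforms, so the later the switch happens after that point, the more discarded code it costs.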

Phlip

Nov 19, 2004, 12:46:15 AM

CTips wrote:

> Also, what evidence do you have that there are any large/medium sized
> projects that have used TDD/XP?

XP and TDD suck, and barely last long enough to sustain writing a book about
whatever your project is. But the books seem to sell. Actually, XP and TDD
obviously are failing to live up to their claims in all sectors where we
tried it, because the kafluffle you hear on all kinds of forums (mailing
lists, Wikis, blogs, USENET, the bus stop, etc.) must really just be part of a
vast righto-leftist conspiracy to get you to post dumb questions.

> Or XP/TDD leads to decent productivity?

RCM keeps claiming "10x reduction in defects released to the field". But he
probably works with the kinds of Fortune 100 companies whose pointy haired
bosses had no direction to go but up.

> I've looked at the literature.

Ah, then you must have read /Agile and Iterative Development: A Manager's
Guide/, by Craig Larman. Its main conclusion (besides "waterfall sucks") is
that given our industry's 70% failure rate for large projects, simply
failing less often would be a better goal than increased productivity.

> If there is any such published example, I
> have yet to see it. And I have asked for some such example repeatedly.
> Every time I am pointed to a paper, it turns out that the project was
> small and/or the productivity was abysmal. Worse yet, a lot of them seem
> to be written _after_ the project was canceled.

That's right. A _scientific_ paper must have a large number of matched
populations of controls and subjects. Then you kill all the programmers and
dissect their brains, looking for evidence of damage. Nope - the Diet
Mountain Dew affected both groups the same. Start again.

> Now, let's look at some of the medium sized projects whose history is
> well known, and which are successfully in use - things like the gcc
> compiler, the linux kernel (I'm talking about only the kernel), emacs,
> the apache server etc. Definitely _NOT_ done using anything like XP or
> TDD. Also, done by relatively small teams of fairly competent
> programmers. As far as evidence goes, the preponderance of evidence
> suggests that writing medium-sized complex programs is best done by
> small teams of competent programmers using practices other than XP.

Oh, my god! Somebody, somewhere, wrote a successful project without XP!!

> I never said that. I said that in _TDD_ the right design emerges only
> after you have thrown away a lot of code.

It "emerges" as a combination of you discovering it and you already knowing
what it is. Kind of like the end of The Wizard of Oz, when the Wikid Witch
of the North told Dorothy that she had the answer with her all along, but
she had to learn it for herself. You just tap the heels of your Ruby
language slippers together three times, and hit the One Test Button.

> Of course, with any approach, you will always encounter situations where
> you have to rewrite major portions of the code, or, worse, change the
> entire architecture. This is likely to happen due to requirement
> changes, or performance issues, but can happen because your initial
> visualization was faulty.

In real life, what typically happens is, after the twister, your house lands
on top of the previous lead architect, and you peer tremulously out the door
at a twisted landscape of big balls of crufty mud, written by short
programmers, their growth stunted by long working hours and junk food, under
the spell of some lesser process than XP.

This leaves you wondering where to start. Then Mike Feathers, wearing a
Wikid Witch of the North costume, appears with a copy of /The Joy of Legacy
Code/, and smacks you across the forehead with it.

DOROTHY
But -- after I add tests to all this legacy code, how do I fix it? Do I
split it down the middle? Do I look for the big common patterns ---

MIKE
Just refactor the low hanging fruit.

DOROTHY
But that sounds too easy! Shouldn't I make a plan too...

But MIKE floats away inside a soap bubble, leaving you all alone. With the
short programmers crowding around you.

DOROTHY
My..! People turnover so quickly here!
...Refactor the low hanging fruit? Refactor the
low hanging fruit?

DBA
Refactor the low hanging fruit.

TOOLS GUY
Refactor the low hanging fruit!

GUI GUY
Refactor the low hanging fruit.

MATH GUY
Refactor the low hanging fruit.

ALL
Refactor the low hanging fruit.
Refactor the low hanging fruit.
Refactor, factor, factor, factor,
Refactor the low hanging fruit.

Refactor the low-hanging
refactor the...

So, by finding the simplest possible fixes to the lowest level code within
that ball of mud, and by isolating the effects of your changes with
characterization tests, you can begin to tease apart its design.

... And ... you ... are ...

Testing your way to dee-velopment
Developing tests for the cause
You'll find it a whiz, so give it a squiz
Each test just gives one little pause
The tests you test are bestests tests
The bests are tests to test the best
Because ... because
because, because, because,
Because of the wonderful code they does.
You're testing your way to dee-velopment
The wonderful tests for the cause!

--
[Phlip2004]
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Phlip

Nov 19, 2004, 6:18:53 AM

Laurent Bossavit wrote:

> Also I would be looking for gains from TDD a little later in the
> complexity curve - when that little module has to take on one, two,
> three further features.

Or colleagues. TDD is a great way to help folks with less exposure to your
module than you change it while you are doing something else.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Gerry Quinn

Nov 19, 2004, 6:36:30 AM

In article <HMfnd.25927$5b1....@newssvr17.news.prodigy.com>,
phli...@yahoo.com says...

> CTips wrote:
>
> > Also, what evidence do you have that there are any large/medium sized
> > projects that have used TDD/XP?
>
> XP and TDD suck, and barely last long enough to sustain writing a book about
> whatever your project is. But the books seem to sell. Actually, XP and TDD
> obviously are failing to live up to their claims in all sectors where we
> tried it, because the kafluffle you hear on all kinds of forums (mailing
> lists, Wikis, blogs, USENET, the bus stop, etc.) must really just be part of a
> vast righto-leftist conspiracy to get you to post dumb questions.

But answer came there none.

> Ah, then you must have read /Agile and Iterative Development: A Manager's
> Guide/, by Craig Larman. Its main conclusion (besides "waterfall sucks") is
> that given our industry's 70% failure rate for large projects, simply
> failing less often would be a better goal than increased productivity.

So THAT'S where Ed's figure comes from!

- Gerry Quinn

Phlip

Nov 19, 2004, 6:41:12 AM

Gerry Quinn wrote:

> > Ah, then you must have read /Agile and Iterative Development: A Manager's
> > Guide/, by Craig Larman. Its main conclusion (besides "waterfall sucks") is
> > that given our industry's 70% failure rate for large projects, simply
> > failing less often would be a better goal than increased productivity.
>
> So THAT'S where Ed's figure comes from!

His figure comes from eating too many Krispy Kremes.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Dominic Williams

Nov 19, 2004, 10:00:52 AM

CTips wrote:

I don't attach very much importance to these metrics,
and as a consequence I have not measured or kept such
statistics throughout my 11 years of programming. If it
can induce you to attach more weight to my opinion and
those of others, here are a few indications:

The largest program I worked on was about as big as
they come. It was AICES, the commercial survivor of the
ICES system developed at MIT in the 60s, then continued
by IBM before being developed and used by Bureau
Veritas to certify offshore oil rigs. It's an
Integrated Civil Engineering System, quite complex
because in addition to doing quite complex numerical
analysis it includes its own specialized languages
(ICETRAN, CDL, STRUDL...), compilers, interpreters, its
own memory management etc. Developed in C, FORTRAN,
Assembly.

Arriving as I did in the mid 90's, "my" code was only a
very small part of this of course. At the time,
compiling the whole system (which was almost never
necessary) took 2 or 3 days on an IBM RISC-6000 AIX
system.

Between '98 and 2003 I was developing real-time
distributed mission- and safety-critical automatic
train control systems. The last two were developed
basically from scratch, I was technical leader, wrote
quite a lot of code. They were both done in full XP and
TDD; I evolved from being the principal architect to
being an XP coach who helped the team of developers
agree on an evolving design. Four years, approx. 400
man-months. The second of those alone was 240 KLOC
(physical), 1200 classes.

I've done a number of smaller things in between. I
can't remember what is the biggest module I developed
solo. But apart from an operating system, about which I
don't know very much, I can't think of any kinds of
software projects that I would find too daunting. Yet
given the choice, I would work in TDD with an XP team
every time. Better, faster, more fun.

Regards,

Phlip

Nov 19, 2004, 10:08:59 AM

Dominic Williams wrote:

> I don't attach very much importance to these metrics,
> and as a consequence I have not measured or kept such
> statistics throughout my 11 years of programming.

Well, CTips is getting lots of statistics like "in my {10, 15, 30} years of
programming..." here.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

CTips

Nov 19, 2004, 10:28:17 AM

I know about it. It's actually a collection of "programs" (loosely
speaking) rather than a single app.

> Between '98 and 2003 I was developing real-time
> distributed mission- and safety-critical automatic
> train control systems. The last two were developed
> basically from scratch, I was technical leader, wrote
> quite a lot of code. They were both done in full XP and
> TDD; I evolved from being the principal architect to
> being an XP coach who helped the team of developers
> agree on an evolving design. Four years, approx. 400
> man-months. The second of those alone was 240 KLOC
> (physical), 1200 classes.

7200 lines per man-year. Though, of course, it's probably not the entire
story - you're not counting the other program - so let's say about
10kloc/year productivity. That's about 4x the productivity claimed in the
available XP literature. Better.
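
For reference, the figures above work out as follows. This is a rough sketch using only the numbers quoted (240 KLOC for the second system, 400 man-months for both); the function names are made up for illustration, and the 10 kloc/year figure is the rounded-up guess from this post, not a measured value:

```python
def loc_per_man_year(loc, man_months):
    # Lines of code per man-year, given total lines and effort in man-months.
    return loc * 12 / man_months

def implied_total_loc(rate_per_year, man_months):
    # Total lines implied by a yearly rate sustained over the given effort.
    return rate_per_year * man_months / 12

# Second system alone, charged against the full 400 man-months.
print(loc_per_man_year(240_000, 400))

# Total size both systems would need for the 10 kloc/year guess to hold.
print(round(implied_total_loc(10_000, 400)))

# Improvement factor needed to go from 10 kloc/year to 50 kloc/year.
print(50_000 / 10_000)
```

Dividing the second system's 240 KLOC by the whole 400 man-months understates the rate, which is why the estimate gets rounded up to about 10 kloc/year.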

So, now, how would you push your productivity up to say about
50kloc/year? Or do you think that would be unachievable?

> I've done a number of smaller things in between. I
> can't remember what is the biggest module I developed
> solo. But apart from an operating system, about which I
> don't know very much, I can't think of any kinds of
> software projects that I would find too daunting.

Try high-performance WAN distributed fault-tolerant systems - with
support for things like network partitioning/recovery and
semi-synchronous byzantine failure. *shudder*.