
John Roth dolt ( Re: A challenge to proponents of Unit Testing. )


Thaddeus L Olczyk

Dec 1, 2001, 8:46:42 AM

Background.

For a long time I've been writing unit tests with my code.
I have a reputation among my friends for writing meticulous
code and for meticulously testing it. I have been so determined
to not release code until it was well tested, that at times I have
suffered professionally from it.

In all that time, I have never once been able to produce a system
of tests that meets three criteria: the tests run automatically, the
tests are comprehensive, and the tests are easily maintainable.
At this point I am strongly beginning to wonder whether such a
system is possible.

So I offered a challenge to those that support heavy unit testing.
In particular to the XP community. Produce a "proof-of-concept"
for unit testing by taking some open source project and adding unit
tests. I also suggested that if the person needed to he could rewrite
the project to facilitate testing. ( Let me say that this also can be
loosely interpreted to include writing an open source project. Some
advocates say that the only way to implement a unit test system is to
write it before you write the code. )


On Fri, 30 Nov 2001 20:06:09 -0800, "John Roth"
<john...@ameritech.net> wrote:

>
>"Tom Adams" <TADA...@AOL.COM> wrote in message
>news:708f3b7f.01113...@posting.google.com...
>> Comprehensive unit testing is not practical. It was shown decades
>> ago that comprehensive unit testing is equivalent to correctness
>> proof of the program being tested. Therefore it is no more
>> practical than correctness proof for large programs.
>>
>> Where you guys been? See "Partition testing does not inspire
>> confidence" by Dick Hamlet. I'll go home tonight at dig up
>> the old references if you like.
>
>I think I've come to the conclusion that the original post on
>this thread was a troll, so I've snipped it.
A troll is supposed to generate a lot of responses.
The original question generated more than 250 responses.
However less than 10 of the responses directly tackled the question.
Until Tom Adams posted his reply there were less than 5.
Many of the rest had no relevance to the original post.

Thus there is no way that I can take credit for the volume of posts.
Nor do I want a lot of posts. I prefer quality over quantity. Still
waters run deep, but then I don't expect you to understand what that
means. You just don't have the intelligence to understand it.

Of all the posts only two actually tried to meet my challenge.
( I'm still evaluating them, and forming my questions. ) Kent Beck's
post, which suggested I look at ant and junit, and Max Ischenko's,
which suggested ruby. ( Which is why I added comp.lang.ruby. I believe in
giving credit where credit is due. All you "rubiers" who aren't
interested in the rest can go, that was the whole relationship to
ruby. )

You on the other hand have done just the opposite. Instead of proactively
rising to meet the challenge, you chose instead to denigrate me. Now
you call the challenge a troll. That was one post requesting that
XPers ( and others who believe in unit testing ) offer a
"proof-of-concept" versus a flood of off-topic ( relative to thread )
posts from XPers. 230 to 1 and the 1 is the one wasting bandwidth.
You've got a lot of balls to even suggest it, and a lot of stupidity
to think you wouldn't get called for it.

More amazing is the fact that you ( perhaps in conjunction with others
) don't jump at the chance to create a "proof-of-concept". Such a
thing would be invaluable ( if done correctly ) to hold up to
managers and senior developers to convince them to do unit testing.

Something which BTW I am surprised you would find objectionable,
since it is something that you could hold up to managers as proof of
unit testing. People supporting the less popular programming languages
beg for precisely such a proof for their languages.

I suspect the reason is that you cannot do it. That you really
don't have the development skills to do it for yourself and
instead go into a company and sell the latest religion disguised as
methodology to the management. You don't pay attention when
the developers tell you something is impossible, and instead blame them
when things turn out wrong. As PJ Plauger once put it, your
wonderful method should have worked, but the programmers just
didn't try hard enough or were using the method wrong etc.

>
>Nobody does "comprehensive unit tests" of the type mentioned,
>for the simple reason that the number of tests vastly outweighs
>any possible benefit, even if it was possible to characterize
>the actual tests required, which in most cases of any practical
>interest, it isn't.
>
"You should test things that might break."
eXtreme Programming eXplained: Embrace Change
Kent Beck

You are now running away from that statement by saying it
does not pay to do that ( after all what does comprehensive mean? )

As for the rest: well, I made one mistake. A long time ago I swore
that I would reply to the intelligent posts first then the stupid
ones. Your post upset me enough that I violated that rule. I will now
go back to make my detailed analysis of junit as a response to Kent
Beck's post.

When I said "consultants that give consultants a bad name" in my
original post, you said the phrase was nonsensical. Well let me
explain what I meant ( it's obvious to those with an IQ ). Most of
the SD managers and senior people I speak to associate certain words
with consultants "liar", "incompetent", and "con-man" are just a few.
One manager once said to me "We need someone with OLE experience.
This guy claims to be one of the best OLE developers around. Even
though all consultants lie, I expect this one has enough experience
to meet our needs. Even though I know he's lying." Other managers have
said similar things. This comes not from some sort of prejudice, but
from previous experience of these managers. I doubt, however, that you
want to admit such a thing exists because it would make looking in the
mirror hard.

In conclusion. I was so serious about this question that I removed
people from my kill file to get their responses. For the most part
that was a mistake because these people don't seem serious about
wanting to prove unit testing is good. Instead they want to spend
their time dancing around the maypole and sending 230 chants...that
is... irrelevant posts trying to get more converts to their cult.

So...
<plonk>
There is nothing that I can learn from you except that you are an
asshole. I already know that so there is really nothing that I can
learn from you.

I expect that you will say nasty things being the person that you are,
but I hope people take into consideration that they are being said
behind my back.

Sorry, you're just too stupid for me to learn anything from you.
Goodbye.
</plonk>

yet another bill smith

Dec 2, 2001, 10:11:15 AM

Thaddeus L Olczyk wrote: (snipped in places, my comments interspersed)

>
> Background.
>
> For a long time I've been writing unit tests with my code.
> I have a reputation among my friends for writing meticulous
> code and for meticulously testing it. I have been so determined
> to not release code until it was well tested, that at times I have
> suffered professionally from it.

For a long time (starting in 1966) I've been developing a reputation for
delivering working code faster than expected. I'm sorry you have
suffered professionally, though judging from this post, the original
post with your "challenge," and your posts that I've seen on c.s-eng
over the last year or so (let's face it, your name is somewhat more
memorable than mine), I suspect your suffering is more due to your
being a jerk of the first water.


>
> In all that time, I have never once been able to produce a system
> of tests that meets three criteria: the tests run automatically, the
> tests are comprehensive, and the tests are easily maintainable.
> At this point I am strongly beginning to wonder whether such a
> system is possible.

I've never produced one either. So whoever told you that engineering is
easy? Some tests will always have to be checked manually (such as many
things that appear on a monitor). Even if you redefine 'comprehensive'
to mean something more like 'adequate' rather than 'exhaustive' there is
still the question of what you think qualifies as easy to maintain. If
testing were easy, it wouldn't have grown into a separate career path
from programming.

>
> So I offered a challenge to those that support heavy unit testing.
> In particular to the XP community. Produce a "proof-of-concept"
> for unit testing by taking some open source project and adding unit
> tests. I also suggested that if the person needed to he could rewrite
> the project to facilitate testing. ( Let me say that this also can be
> loosely interpreted to include writing an open source project. Some
> advocates say that the only way to implement a unit test system is to
> write it before you write the code. )

How generous of you. But what makes you think that the proponents of
test-first coding owe you a response? Particularly when issued in such
an impolite and ungrammatical format, in such imprecise terms? Hey, if
your ego grows much more, you'll have to start calling yourself
'Universe'! [And notice how few people take Elliott's garbage seriously,
either.]


>
> On Fri, 30 Nov 2001 20:06:09 -0800, "John Roth"
> <john...@ameritech.net> wrote:
>
> >
> >"Tom Adams" <TADA...@AOL.COM> wrote in message
> >news:708f3b7f.01113...@posting.google.com...
> >> Comprehensive unit testing is not practical. It was shown decades
> >> ago that comprehensive unit testing is equivalent to correctness
> >> proof of the program being tested. Therefore it is no more
> >> practical than correctness proof for large programs.
> >>
> >> Where you guys been? See "Partition testing does not inspire
> >> confidence" by Dick Hamlet. I'll go home tonight at dig up
> >> the old references if you like.
> >
> >I think I've come to the conclusion that the original post on
> >this thread was a troll, so I've snipped it.
> A troll is supposed to generate a lot of responses.
> The original question generated more than 250 responses.
> However less than 10 of the responses directly tackled the question.
> Until Tom Adams posted his reply there were less than 5.
> Many of the rest had no relevance to the original post.
>
> Thus there is no way that I can take credit for the volume of posts.
> Nor do I want a lot of posts. I prefer quality over quantity. Still
> waters run deep, but then I don't expect you to understand what that
> means. You just don't have the intelligence to understand it.

Aside: rare personal attack: if he really preferred quality over
quantity, why is this post so long? Also, intelligence has little or
nothing to do with understanding highly idiomatic expressions, and who
ever said that still waters were higher quality than fast? In my
neighborhood, still waters get a coating of scum.


>
> Of all the posts only two actually tried to meet my challenge.
> ( I'm still evaluating them, and forming my questions. ) Kent Beck's
> post, which suggested I look at ant and junit, and Max Ischenko's,
> which suggested ruby. ( Which is why I added comp.lang.ruby. I believe in
> giving credit where credit is due. All you "rubiers" who aren't
> interested in the rest can go, that was the whole relationship to
> ruby. )
>
> You on the other hand have done just the opposite. Instead of proactively
> rising to meet the challenge, you chose instead to denigrate me. Now
> you call the challenge a troll. That was one post requesting that
> XPers ( and others who believe in unit testing ) offer a
> "proof-of-concept" versus a flood of off-topic ( relative to thread )
> posts from XPers. 230 to 1 and the 1 is the one wasting bandwidth.
> You've got a lot of balls to even suggest it, and a lot of stupidity
> to think you wouldn't get called for it.

Imagine yourself in a small boat puttering along on a lake, with some
people dangling fishing lines off the stern. If a fish rises to the
bait, is it being proactive or reactive? If someone has risen to meet
your challenge, he/she would also have been reactive. BTW, do you know
the name for that type of fishing? Can you say 'trolling'? I knew you
could.


>
> More amazing is the fact that you ( perhaps in conjunction with others
> ) don't jump at the chance to create a "proof-of-concept". Such a
> thing would be invaluable ( if done correctly ) to hold up to
> managers and senior developers to convince them to do unit testing.

Maybe they've got something more useful to do, like paid work? Or maybe
even a life? (Note, I'm not suggesting that you don't have a life, I'm
just stating that if they did they might prefer it to your challenge.
Hey, if I had a life I'd prefer it to answering your post =-)

From my list of good tag lines: "I have never met a man so ignorant that
I couldn't learn something from him."--Galileo Galilei, physicist and
astronomer (1564-1642)

John Roth

Dec 2, 2001, 10:54:15 AM

"Thaddeus L Olczyk" <olc...@interaccess.com> wrote in message
news:3c1247a2....@nntp.interaccess.com...

<snip>

It is better to remain silent and be thought a fool,
than to speak and remove all doubt.

Saying from the common wisdom tradition.

John Roth


Panu Viljamaa

Dec 2, 2001, 4:05:40 PM

Thaddeus L Olczyk wrote:

> In all that time, I have never once been able to produce a system of tests that meets three criteria: the tests run automatically, the tests are comprehensive, and the tests are easily maintainable. At this point I am strongly beginning to wonder whether such a system is possible.

0.
I'm with you Thaddeus, and I think you raise worthwhile questions, plus several alternative answers, about what I perceive as the over-emphasis on unit tests. (Where's the integration part ?)

For instance, I've heard the claim that "We don't need interface specifications in Smalltalk - because unit tests guarantee the quality of my code". I don't buy this, and thus welcome your skepticism.

The questions that trouble me are: "When can we say we have enough unit tests? How can we know our unit tests are 'correctly written' ? ".

To take your concrete example, how can we know there are an adequate number of unit tests in the Ruby implementation, and how can we know that the ones in there are 'correct' ? How can we know we have 95% of the needed unit tests in place ? Or is it more like 30% ?

1.
If we start system building from tests first, we could declare that the tests *are* the "Specification" of the system. Then the question turns into: "How can we know all assumptions other components make about my component are expressed by my unit tests?". I.e. how can we know nobody is violating the contract/specification expressed by my tests-code?

But trying to reason about violation of contracts between components, we have already stepped outside the realm of unit tests, into integration testing. So I think unit testing is over-emphasized by some methodologies, since we have no "S-Integration".

2.
If on the other hand unit tests are *not* taken as the 'specification' of a component, we should ask: "How can we know the existing unit tests prove all specified requirements are being fulfilled?" And if we can't prove this, how can we have any qualitative assurance about the 'correctness' or even 'quality' of our code ?

3.
I believe (automated) unit tests are a good technique for discovering bugs due to changes made to the system, early on. But it is misleading to imply that since you have written and run *some* tests, your system now works "correctly". Yet this is what a consultant might tell a customer: "Look, we have all these unit tests in place, and we run them all without errors! We don't need interface contracts. We don't need specifications. Our code is simply the best since it passes all unit tests we have written!"

-Panu Viljamaa
P.S.
Wouldn't it intuitively make sense to have another set of people writing the tests, than the ones writing the code which must pass those tests?


Stefan Schmiedl

Dec 2, 2001, 5:00:55 PM

unit tests in isolation are as much evil as every other part
of a software building process in isolation.

Panu Viljamaa (2001-12-03 06:00):

> 1. If we start system building from tests first, we could
> declare that the tests *are* the "Specification" of the system.
> Then the question turns into: "How can we know all assumptions
> other components make about my component are expressed by my
> unit tests?". I.e. how can we know nobody is violating the
> contract/specification expressed by my tests-code?

if someone does, you will receive a bug report. you can add
another test thus enhancing the interface or tell the other to
rtfm and leave things as they are. without feedback things won't
work.

> 2. If on the other hand unit tests are *not* taken as the
> 'specification' of a component, we should ask: "How can we know
> the existing unit tests prove all specified requirements are
> being fulfilled?" And if we can't prove this, how can we have
> any qualitative assurance about the 'correctness' or even
> 'quality' of our code ?

you have functional/acceptance/user tests controlling the
behaviour on a larger scale. and why do you need to "prove"
something? the ultimate criterion of usefulness is whether
something works and if it does not, fails gracefully. there is a
well-known saying of don knuth regarding this matter.

>
> 3. I believe (automated) unit tests are a good technique for
> discovering bugs due to changes made to the system, early on.
> But it is misleading to imply that since you have written and
> run *some* tests, your system now works "correctly". Yet this is
> what a consultant might tell a customer: "Look, we have all
> these unit tests in place, and we run them all without errors!
> We don't need interface contracts. We don't need specifications.
> Our code is simply the best since it passes all unit tests we
> have written!"

shoot the consultant, if you meet him next time. ;>

this is the same consultant that sells you other methodologies
with other inherent flaws by promising other stuff that cannot be
guaranteed. it's a fault of the consultant, not of the unit test.

unit tests help you immensely in diagnosing problems, both early
and late in development. they are no silver bullet. you still need
to apply whatever knowledge you have acquired.

but do you believe that you are completely healthy, only because a
doctor told you so?

>
> -Panu Viljamaa
> P.S.
> Wouldn't it intuitively make sense to have another set of people
> writing the tests, than the ones writing the code which must
> pass those tests?

do it and report problems and successes to the rest of us, please.
meanwhile you might want to break your lines around column 70 to
improve readability of your posts.

s.
--
Stefan Schmiedl
EDV-Beratung, Programmierung, Schulung
Loreleystr. 5, 94315 Straubing, Germany
Tel. (0 94 21) 74 01 06
Public Key: http://xss.de/stefan.public

shhhh ... I can't hear my code!

Ron Jeffries

Dec 2, 2001, 3:28:06 PM

On Sat, 01 Dec 2001 13:46:42 GMT, olc...@interaccess.com (Thaddeus L
Olczyk) wrote:

>For a long time I've been writing unit tests with my code.
>I have a reputation among my friends for writing meticulous
>code and for meticulously testing it. I have been so determined
>to not release code until it was well tested, that at times I have
>suffered professionally from it.

I'm guessing that suffering was due to holding code back when people
wanted it. If my guess is wrong, the following may not apply:

Have you tried testing as or before you code, and releasing your
tested code incrementally, say every day? When I do that, I find that
by far the bulk of the value of my code is available for use any time
the other guy wants it.


>
>In all that time, I have never once been able to produce a system
>of tests that meets three criteria: the tests run automatically, the
>tests are comprehensive, and the tests are easily maintainable.
>At this point I am strongly beginning to wonder whether such a
>system is possible.

That's interesting. I would say that never before test-first have I
accomplished it, and I wrote a lot of software and a lot of tests the
other way. Yet test-first makes it easy. Here's a sketch of how it
happens.

1. You start with a clean text buffer: no code.
2. You have something in mind you want your program to do. Something
tiny: the first little bit.
3. You write an automated test to prove that your clean text buffer
doesn't have that feature.
4. For a while the test doesn't compile. You add just enough code to
the buffer to make it compile (typically a class definition or
function declaration).
5. For a while the test doesn't work. You add just enough code to the
buffer to make the test run.

At this point, you have tests (1) that run automatically, that are
comprehensive, and (it turns out, I have no proof for this) they're
easy to maintain.

Then you think of one more little bit the program might do. You write
another automated test to see if the program does it ...

As odd as it sounds, this really works. It does take practice, and you
do have to go in TINY steps. No, tinier than you are thinking.
T_I_N_Y.
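
To make the cycle concrete, here is a minimal sketch in Ruby (the
class, the feature, and the Test::Unit harness are illustrative
assumptions, not part of the steps above):

  require 'test/unit'

  # Step 3: an automated test for a tiny feature the empty buffer lacks.
  class GreeterTest < Test::Unit::TestCase
    def test_greets_by_name
      assert_equal("hello, world", Greeter.new.greet("world"))
    end
  end

  # Step 4: just enough code that the test file loads without error;
  # step 5: just enough body to make the test pass.
  class Greeter
    def greet(name)
      "hello, #{name}"
    end
  end

Run the file: the one test fails until Greeter exists and greets,
then passes, and you move on to the next tiny bit.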

Since you like tests and tested code, you might like to give it a try.
There's a small example in XP Installed, and Chet and I give
demonstrations at various conferences. The Spring Software Development
is a strong possibility for the next public session.

Regards,


>
>So I offered a challenge to those that support heavy unit testing.
>In particular to the XP community. Produce a "proof-of-concept"
>for unit testing by taking some open source project and adding unit
>tests. I also suggested that if the person needed to he could rewrite
>the project to facilitate testing. ( Let me say that this also can be
>loosely interpreted to include writing an open source project. Some
>advocates say that the only way to implement a unit test system is to
>write it before you write the code. )


Ronald E Jeffries
http://www.XProgramming.com
http://www.objectmentor.com

Ron Jeffries

Dec 2, 2001, 3:33:12 PM

On Sun, 02 Dec 2001 16:05:40 -0500, Panu Viljamaa <pa...@fcc.net>
wrote:

>
>For instance, I've heard the claim that "We don't need interface specifications in Smalltalk - because unit tests guarantee the quality of my code". I don't buy this, and thus welcome your skepticism.

Have you used Smalltalk extensively? What happened when you did that
made you feel that you needed interface specs?

Ron Jeffries

Dec 2, 2001, 3:34:03 PM

On Sun, 02 Dec 2001 16:05:40 -0500, Panu Viljamaa <pa...@fcc.net>
wrote:

>If we start system building from tests first, we could declare that the tests *are* the "Specification" of the system. Then the question turns into: "How can we know all assumptions other components make about my component are expressed by my unit tests?". I.e. how can we know nobody is violating the contract/specification expressed by my tests-code?

Their tests run?

Panu Viljamaa

Dec 2, 2001, 6:03:12 PM

Ron Jeffries wrote:

> Have you used Smalltalk extensively? What happened when you did that
> made you feel that you needed interface specs?

Yes, I have. What happened was that I found it hard to understand how to use a method written by someone else when there is no indication of what kind of arguments should be passed to it. It is also hard to say if it is right or wrong when there is no clearly written contract on "what it should do". This might be specified by some test-case-class somewhere, but I'd rather have the 'contract' of the method visible as soon as I view the method's source, so I don't have to go test-case hunting.

If I could clearly -speedily- see that the method does not behave as advertised, I could quickly disregard it, rather than start thinking whether the error is somewhere in my code. The presumptions about the types of arguments could of course be specified by a test case somewhere in some test-case-class, but the original author might well have forgotten to test for the argument being of the correct type. If the required type was indicated by some other means such as a naming convention for arguments, I could more easily see if the original author forgot to specify it.

-Panu Viljamaa

Panu Viljamaa

Dec 2, 2001, 6:24:41 PM

Ron Jeffries wrote:

> On Sun, 02 Dec 2001 16:05:40 -0500, Panu Viljamaa <pa...@fcc.net>
> wrote:
> >If we start system building from tests first, we could declare that the tests *are* the "Specification" of the system. Then the question turns into: "How can we know all assumptions other components make about my component are expressed by my unit tests?". I.e. how can we know nobody is violating the contract/specification expressed by my tests-code?
>
> Their tests run?

Not necessarily. If all their tests run, it does not mean these tests exercise my components in all the same ways as the 'real application' will.

BTW. Are you saying that this is how we should interpret unit tests, as specifications of each individual component ?

We do not know if *their* test-cases represent every way their component is used in the real application, and we don't know if their test-cases call my components at all. So I'd rather rely on 'my software being correct' than assuming that a) their test cases run and b) their test run all the code that is run in the production application.

Note that their test cases are explicitly meant to "unit-test" their component, not *my* component, nor what happens when their component is integrated with my component.

If you're doing 'continuous integration', this may not seem such a big problem, because a flaw in one component may hide a flaw in another. But continuous integration then is not really very "component-based development". Unit testing and integration testing seem to be merged into one, right ?

It's a bit like saying: "It's fine with us as long as the airplane flies. We need not make sure our components are of the highest quality, or even know how high quality they are. The airplane flies! ".

-Panu Viljamaa

Stefan Schmiedl

Dec 2, 2001, 7:39:42 PM

Panu Viljamaa (2001-12-03 08:20):

> Ron Jeffries wrote:
>
> > On Sun, 02 Dec 2001 16:05:40 -0500, Panu Viljamaa <pa...@fcc.net>
> > wrote:
> > >If we start system building from tests first, we could declare
> > >that the tests *are* the "Specification" of the system. Then the question
> > >turns into: "How can we know all assumptions other components make about my
> > >component are expressed by my unit tests?". I.e. how can we know nobody is
> > >violating the contract/specification expressed by my tests-code?
> >
> > Their tests run?
>
> Not necessarily. If all their tests run, it does not mean these tests
> exercise my components in all the same ways as the 'real application' will.

if they worked "test-first", it does.

>
> Note that their test cases are explicitly meant to "unit-test" their
> component, not *my* component, nor what happens when their component is
> integrated with my component.

then they are not doing their homework ... because your component might
change over time and after some "harmless" upgrade their software might
just stop working ... if i use an axe, i make sure that its head is fixed,
because i don't want to lose mine.

>
> It's a bit like saying: "It's fine with us as long as the airplane flies. We
> need not make sure our components are of the highest quality, or even know
> how high quality they are. The airplane flies! ".
>

on the other hand, high quality components don't guarantee that a
plane can take off (concorde) or land (mars lander) successfully.

i prefer unit-tested software over other software every day.

Wilkes Joiner

Dec 2, 2001, 9:12:15 PM

Just my 2 cents...

> I.e. how can we know nobody is
> violating the contract/specification expressed by my
> tests-code?
> >
> > Their tests run?
>
> Not necessarily. If all their tests run, it does not
> mean these tests exercise my components in all the
> same ways as the 'real application' will.

Does documentation prevent someone from using your
component improperly?


> BTW. Are you saying that this is how we should
> interpret unit tests, as specifications of each
> individual component ?

It is certainly more concrete and less ambiguous than
documentation. Especially when you consider that a
lot of programmers prefer writing code over writing
"interface contracts". I have seen a few
"specifications" that time and lack of maintenance
have rendered not only obsolete but downright
dangerous to use.

> We do not know if *their* test-cases represent every
> way their component is used in the real application,

They almost certainly don't, but some tests are better
than no tests, and running code is better than a
document that says, "this is how it *should* run."

> and we don't know if their test-cases call my
> components at all. So I'd rather rely on 'my
> software being correct' than assuming that a) their
> test cases run and b) their test run all the code
> that is run in the production application.

OK, so how do you *know* that your software is
correct? How do you know that you were able to do a
perfect translation of the specification into running
code? How do you know that the person using your
component understood your specification perfectly?


I must be missing something.

- Wilkes Joiner


__________________________________________________
Do You Yahoo!?
Buy the perfect holiday gifts at Yahoo! Shopping.
http://shopping.yahoo.com

Phlip

Dec 2, 2001, 11:16:46 PM

A call to arms!

Stefan Schmiedl wrote:

> unit tests in isolation are as much evil as every other part
> of a software building process in isolation.

The Point is to write the tests in lockstep with writing the code. That's
not "isolation".

(And don't nobody _dare_ mention the word "isolation" to a hardcore eXtremo
who works AllEngineersInOneRoom, OnsiteCustomer and two developers to each
workstation!)

> if someone does, you will receive a bug report. you can add
> another test thus enhancing the interface or tell the other to
> rtfm and leave things as they are. without feedback things won't
> work.

Feedback forms a dynamic attractor that targets good, clean, solid code.

> you have functional/acceptance/user tests controlling the
> behaviour on a larger scale. and why do you need to "prove"
> something? the ultimate criterion of usefulness is whether
> something works and if it does not, fails gracefully. there is a
> well-known saying of don knuth regarding this matter.

Try to test at any scale necessary. Every function should be short, and
should have a longer function or seven testing it. Doing things like this
lets you go as fast as you possibly can, because you know when everything
works and you know exactly what broke when it broke. No stops to manually
test to see if a refactor worked; you just know.

Have I mentioned here recently that I did Flea in my spare time; less than
an hour a day?

http://flea.sourceforge.com



>> 3. I believe (automated) unit tests are a good technique for
>> discovering bugs due to changes made to the system, early on.
>> But it is misleading to imply that since you have written and
>> run *some* tests, your system now works "correctly". Yet this is
>> what a consultant might tell a customer: "Look, we have all
>> these unit tests in place, and we run them all without errors!
>> We don't need interface contracts. We don't need specifications.
>> Our code is simply the best since it passes all unit tests we
>> have written!"

Real consultants engage with a real customer in real time. This is a Tom
Peters concept that I doubt anyone here's brave enough to cross him over.

The developers do not keep them in the dark until the great unveiling. The
customer sees the project grow in real-time, and steers it in real time.
They see each bug that gets off the bench, and they see how short its life
is. They know a project's risk profile.

> this is the same consultant that sells you other methodologies
> with other inherent flaws by promising other stuff that cannot be
> guaranteed. it's a fault of the consultant, not of the unit test.

Agreed.

> unit tests help you immensely in diagnosing problems, both early
> and late in development. they are no silver bullet. you still need
> to apply whatever knowledge you have acquired.

They are a bullet with a very high albedo. That's why they are taught to
freshpersons in their first semester of programming classes.

> but do you believe that you are completely healthy, only because a
> doctor told you so?

I'll play the odds here. Biological organisms weren't invented TestFirst.

>> Wouldn't it intuitively make sense to have another set of people
>> writing the tests, than the ones writing the code which must
>> pass those tests?

Yes, of course. It would make perfect sense to do things the way that great
bastion of rock-solid code and thrifty processes, Microsoft, does things.

--
Phlip

http://www.greencheese.org/HatTrick

Universe

Dec 2, 2001, 11:42:02 PM

Panu Viljamaa <pa...@fcc.net> wrote:

>>>If you're doing 'continuous integration', this may not seem such a big problem, because a flaw in one component may hide a flaw in another. But continuous integration then is not really very "component-based development". Unit testing and integration testing seem to be merged into one, right ?
>>>

Right. It's the generally improper "touch and feel" that
characterizes the XP/Alliance way of doing things. Everything is
merged and "squashed" by them into one, because to them everything is
the same, and there is no objective truth. So we can just do
everything at the same time without sequencing and see how things
evolve. Hackery pure and simple.

>>>It's a bit like saying: "It's fine with us as long as the airplane flies. We need not make sure our components are of the highest quality, or even know how high quality they are. The airplane flies! ".>>>

Bip, bop, boom, zing! Another blockbuster!

My man, Panu! {- :

Elliott

Matt Armstrong

Dec 3, 2001, 12:06:47 AM

Phlip <phli...@yahoo.com> writes:

> Have I mentioned here recently that I did Flea in my spare time; less than
> an hour a day?
>
> http://flea.sourceforge.com

You apparently did not unit test your URL. :-) This one seems to
work:

http://sourceforge.net/projects/flea/

--
matt

Stefan Schmiedl

Dec 3, 2001, 2:00:32 AM

Phlip (2001-12-03 13:20):

> A call to arms!

whom shall we fight?
we don't disagree, Phlip.

> > unit tests help you immensely in diagnosing problems, both early
> > and late in development. they are no silver bullet. you still need
> > to apply whatever knowledge you have acquired.
>
> They are a bullet with a very high albedo. That's why they are taught to
> freshpersons in their first semester of programming classes.

what good is a silver bullet if it can't fly? the unit test gun is a healthy
combination of other (xp preferred here) practices. unit tests without
refactoring would become unmaintainable quite fast, i imagine.


>
> Yes, of course. It would make perfect sense to do things the way that great
> bastion of rock-solid code and thrifty processes, Microsoft, does things.

thanks for a grin so soon in the morning :-)

Patrick May

Dec 3, 2001, 4:37:14 PM

> Right. It's the generally improper "touch and feel" that
> characterizes the XP/Alliance way of doing things. Everything is
> merged and "squashed" by them into one, because to them everything is
> the same, and there is no objective truth. So we can just do
> everything at the same time without sequencing and see how things
> evolve. Hackery pure and simple.

This seems to be a bit of a distortion.

When you are regularly unit testing, you use the tests to describe the
known and expected behaviour of the system and the components. Each
project has different levels of expectations -- little might be
expected of a text munching script, while a mission-critical business
app is expected to do a lot more.

The depth of the tests determines the depth of the code that needs to
be written. It isn't hackery to solve a simple problem appropriately.

Getting back to Panu's concern:

Unit tests in XP are for internal use, to encode expectations into
something that can be regularly tested. Documenting those
expectations is another issue, though well written tests can help.

Tom Adams

Dec 6, 2001, 10:32:23 AM

Ron Jeffries <ronje...@REMOVEacm.org> wrote in message news:<176717160028CE03.51B6AF6E...@lp.airnews.net>...

> 1. You start with a clean text buffer: no code.
> 2. You have something in mind you want your program to do. Something
> tiny: the first little bit.
> 3. You write an automated test to prove that your clean text buffer
> doesn't have that feature.
> 4. For a while the test doesn't compile. You add just enough code to
> the buffer to make it compile (typically a class definition or
> function declaration).
> 5. For a while the test doesn't work. You add just enough code to the
> buffer to make the test run.
>
> At this point, you have tests (1) that run automatically, that are
> comprehensive, and (it turns out, I have no proof for this) they're
> easy to maintain.
>
> Then you think of one more little bit the program might do. You write
> another automated test to see if the program does it ...
>
> As odd as it sounds, this really works. It does take practice, and you
> do have to go in TINY steps. No, tinier than you are thinking.
> T_I_N_Y.
>
> Since you like tests and tested code, you might like to give it a try.
> There's a small example in XP Installed, and Chet and I give
> demonstrations at various conferences. The Spring Software Development
> is a strong possibility for the next public session.
>

I am interested in your use of the word "comprehensive". Beck gave an
example of test first coding where he wrote the test case:

sum = Calculator.sum(2,2)

assert: equals(sum,4)

Then, he said we could proceed to code the sum method of Calculator.
Is this an example of a comprehensive test? I guess Beck could have
just been giving an example that was less complex than real-world
TFD, so he did not mean it to be comprehensive.

What would the comprehensive test that should be written first look
like for Calculator sum?

Patrick May

Dec 6, 2001, 4:20:37 PM

> What would the comprehensive test that should be written first look
> like for Calculator sum?

A more comprehensive test might include type checking on the
variables, etc. Anything that the Customer needs should be tested.

I like to think of unit testing as a variant of design-by-contract.
Before anything goes into the code base, the test/contract has to be
changed to reflect new expectations.

It's impossible to test every possible situation -- it's more
important to define your behaviour in terms of tests. If you unit
test before every new bit of code, you will have a comprehensive test
of your expectations.

This is, of course, much more useful than tests that ensure that the
software works as expected in every possible situation. With the
exception of things like flight avionics, most real world software has
a more limited range of expectations.
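
To make the test-as-contract idea concrete, the Calculator example
upthread might be pinned down like this in Ruby (the Calculator class
and its behaviour for bad input are illustrative assumptions, not
anything Beck specified):

  require 'test/unit'

  # A hypothetical contract for Calculator.sum, expressed as tests.
  class CalculatorContractTest < Test::Unit::TestCase
    def test_sums_two_numbers
      assert_equal(4, Calculator.sum(2, 2))
    end

    def test_rejects_non_numeric_arguments
      # Type checking as part of the contract, per the suggestion above.
      assert_raise(TypeError) { Calculator.sum("2", 2) }
    end
  end

Nothing new goes into the code base until the contract is extended to
say what the new behaviour should be.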

Steve Hill

Dec 7, 2001, 5:04:08 AM

You might actually be surprised at how many tests are needed. In "The
art of software testing" (Myers, 1979) the 1st chapter starts with a
quiz....

What tests would you write for a function that takes 3 numbers (the
side lengths of a triangle), and returns whether the triangle is
equilateral, scalene or isosceles.

I think the answer is over 20.... certainly close on 20

Steve

patri...@monmouth.com (Patrick May) wrote in message news:<3b3ad3b4.01120...@posting.google.com>...

Keith Ray

Dec 7, 2001, 10:09:00 AM

In article <c230c758.01120...@posting.google.com>,
stephe...@motorola.com (Steve Hill) wrote:

> You might actually be surprised at how many tests are needed. In "The
> art of software testing" (Myers, 1979) the 1st chapter starts with a
> quiz....
>
> What tests would you write for a function that takes 3 numbers (the
> side lengths of a triangle), and returns whether the triangle is
> equilateral, scalene or isosceles.
>
> I think the answer is over 20.... certainly close on 20
>
> Steve

Kent Beck's response to that example:

http://groups.yahoo.com/group/extremeprogramming/message/37242

"The biggest problem is that it doesn't balance the cost and benefits
of tests. He has a triangle example which he writes (I think) 29
tests for. The test-first version can be done confidently with 5
tests."

David B Lightstone

Dec 7, 2001, 10:28:15 AM

"Keith Ray" <k1e2i3t...@1m2a3c4.5c6o7m> wrote in message
news:k1e2i3t4h5r6a7y-4F...@netnews.attbi.com...

It is rather hard to assess either your response or Mr Beck's intent.
What do you believe?

(1) Mr Beck may have read the 29 test cases and concluded that
the 5 tests he favors are adequate. (the original 29 had redundancy)

(2) Mr Beck may have read the 29 test cases and concluded that
pruning a number of them was appropriate (impossible situations
by virtue of a priori knowledge about other aspects of the system).
(Hence he is doing integration testing, rather than unit testing)

(3) Mr Beck may have determined that the customer can afford
to take the risk because the expected loss associated with
the pruned test cases is acceptable.

Any other possible alternative explanations for Mr Beck's intent?

Which is applicable to your situation?


Darrin Thompson

Dec 7, 2001, 11:42:17 AM

David B Lightstone wrote:

>(3) Mr Beck may have determined that the customer can afford
>to take the risk because the expected loss associated with
>the pruned test cases is acceptable.
>

From my reading of XP, this is probably correct. XP Explained opens
with an explanation of risk management and the use of an "options
calculator". (It's been a little while since I read it.)

When you approach the problem as risk management I think you get closer
to Beck's intent and what customers want. When customers can't define
their needs well, or keep changing their minds, or their business is
changing under them, we programmers need to be able to change what we
are writing for them.

There's a certain amount of unit testing that allows us as programmers
to more confidently change what we've already done for the customer.
Actual practice shows that a combination of coding standards, good use
of OO, unit testing, and acceptance testing together lower the cost of
making changes for the customer. And some other stuff that has worked
for the big XP proponents.

Getting too focused on any particular aspect of the process, like unit
testing, results in a loss of focus on the big picture: delivering
exactly what the customer wants, even though they never know exactly
what they want before you start.

Unit testing is about delivering more quality for less money than you
could without it. It's about knowing when you broke something. It's not
about a 100% correctness guarantee.

The kind of testing Beck advocates is more art than science. It requires
some gut instinct to know what to test and what to assume is unlikely to
break. Sometimes you are wrong. You learn. You fix it. There's places
maybe like NASA where more discipline is needed. For the small to medium
sized business app or web thingie, that kind of discipline is overkill
and very expensive.

IMHO. :-)

Darrin


Thaddeus L Olczyk

Dec 6, 2001, 3:12:42 PM

On Fri, 07 Dec 2001 15:28:15 GMT, "David B Lightstone"
<david.li...@prodigy.net> wrote:

>
>"Keith Ray" <k1e2i3t...@1m2a3c4.5c6o7m> wrote in message
>news:k1e2i3t4h5r6a7y-4F...@netnews.attbi.com...
>> In article <c230c758.01120...@posting.google.com>,
>> stephe...@motorola.com (Steve Hill) wrote:
>>
>> > You might actually be surprised at how many tests are needed. In "The
>> > art of software testing" (Myers, 1979) the 1st chapter starts with a
>> > quiz....
>> >
>> > What tests would you write for a function that takes 3 numbers (the
>> > side lengths of a triangle), and returns whether the triangle is
>> > equilateral, scalene or isosceles.
>> >
>> > I think the answer is over 20.... certainly close on 20
>> >
>> > Steve
>>
>> Kent Beck's response to that example:
>>
>> http://groups.yahoo.com/group/extremeprogramming/message/37242
>>
>> "The biggest problem is that it doesn't balance the cost and benefits
>> of tests. He has a triangle example which he writes (I think) 29
>> tests for. The test-first version can be done confidently with 5
>> tests."

...doesn't balance the cost and benefits...
That's exactly what the anti-testing people say.
So what does he appear to do? In typical managerial fashion he
tries to balance the equation by lowering the costs, but doesn't
consider that he is also lowering the benefits. One might be tempted to say
that he is now getting caught up in the general XP cult-like
behaviour; but I say: everyone has an off day, and this might have
been one for him. Let us give him a chance to clarify his statement.

Oh yeah. And describe the five tests.

>
> It is rather hard to access either your response or Mr Beck's intent.
>What to you believe?
>
>(1) Mr Beck may have read the 29 test cases and concluded that
>the 5 tests he favors are adequate. (the original 29 had redundancy)
>
>(2) Mr Beck may have read the 29 test cases and concluded that
>pruning a number of them was appropriate (impossible situations
>by virtue of a priori knowledge about other aspects of the system).
>(Hence he is doing integration testing, rather than unit testing)
>
>(3) Mr Beck may have determined that the customer can afford
>to take the risk because the expected loss associated with
>the pruned test cases is acceptable.
>

4) He merges several test cases into one test case. Thus requiring the
same amount of work.

The one thing I notice about each case is that they require you to
think of and dispose of cases. In other words, even though you don't write
them, you do have to expend considerable effort on them.

The thing is though ( and a major part of my original post ) that
"only 5" sounds good until you think about it.

Assume you have a small project of 1000 units. That means 5000 tests
( and that is assuming that 5 still holds; I would expect other units
to be more complex, requiring even more unit tests, the number
increasing exponentially with complexity. )

So now you need an infrastructure to handle 1000 units + 5000 tests.
That is what I see as the main problem with unit testing.

Curt Hibbs

Dec 7, 2001, 12:19:54 PM

Very well put. I'm going to save your message to use the next time I have
this discussion with my colleagues.

Curt

Darrin Thompson

Dec 7, 2001, 12:57:17 PM

Thaddeus L Olczyk wrote:

>The things is though ( and a major part of my original post ) is that
>"only 5" sounds good until you think about it.
>
>Assume you have a small project of 1000 units. That means 5000 tests
>( and that is assuming that 5 still holds; I would expect other units
>to be more complex, requiring even more unit tests, the number
>increasing exponentially with complexity. )
>
>So now you need an infrastructure to handle 1000 units + 5000 tests.
>That is what I see as the main problem with unit testing.
>

Email is really an awful way to discuss things. I really doubt that we
all are thinking the same thing when we say "unit" or "unit testing".

If you are minded to write 1000 units, 5000 tests and then deliver the
results to your customer, the tests seem like overkill.

If you are keeping your customer in the loop and delivering, say, 100
units at a time and deploying them into production every time, those 5*u
tests ARE the infrastructure.

Also, if you keep releasing changes into production, the customer will
come up with new ideas and scrap some old ones. Her new ideas are going
to require you to scrap some of your old ideas too, and maybe even some
fundamental early ones that would be expensive to change.

The 5*u tests tell you at a glance what areas of the program have
changed their behavior as a result of rippling changes. Also, they tell
you about the effects of your changes in the entire program at once.

But, again, testing doesn't live in a vacuum. It's part of a bigger
picture, and that bigger picture is providing benefit to the customer as
early as possible. It's just a part of that.

Darrin


John Roth

Dec 7, 2001, 1:20:22 PM

"David B Lightstone" <david.li...@prodigy.net> wrote in message
news:je5Q7.350$lw2.92...@newssvr16.news.prodigy.com...

Let's think about this for a moment. 5 test cases appear to
be adequate for a black box test. You need one for equilateral,
three for isosceles and one for scalene. If you're deriving
more test cases, you're doing an open box test, based on the
actual implementation (or on some set of assumptions about
frequent defects, or some such.)

Since XP writes the test cases before writing the code,
it's not possible to write an open box test, based on inspecting
the code that doesn't exist yet.

Assuming I write fairly simple code (I'm not going to
attempt to do "simplest" because that's kind of subjective)
I might implement this in Python as:

def kindOfTriangle(a, b, c):
    if (a == b) and (b == c) and (a == c):
        print "equilateral"
    elif (a == b) or (b == c) or (a == c):
        print "isosceles"
    else:
        print "scalene"

John Roth


l...@provida.no

Dec 7, 2001, 1:28:33 PM

In comp.object John Roth <john...@ameritech.net> wrote:

> Let's think about this for a moment. 5 test cases appear to
> be adequate for a black box test. You need one for equilateral,
> three for isosceles and one for scalene. If you're deriving
> more test cases, you're doing an open box test, based on the
> actual implementation (or on some set of assumptions about
> frequent defects, or some such.)

How about bad data? Won't you need tests to ensure that the code can
handle bad (zero or negative) values for any combination of the sides?
Isn't that 7 more tests (illegal data for: {a, b, c, a & b, a & c, b &
c, a & b & c })?

--
Leif Roar Moldskred

David B Lightstone

Dec 7, 2001, 1:36:48 PM

"Thaddeus L Olczyk" <olc...@interaccess.com> wrote in message
news:3c0fca7e....@nntp.interaccess.com...

Your suggestion that it be treated as an off-the-cuff remark (my
interpretation) is meaningful.

Paul Brannan

Dec 7, 2001, 2:01:39 PM

On Sat, Dec 08, 2001 at 03:35:11AM +0900, John Roth wrote:
> Let's think about this for a moment. 5 test cases appear to
> be adequate for a black box test. You need one for equilateral,
> three for isosceles and one for scalene. If you're deriving
> more test cases, you're doing an open box test, based on the
> actual implementation (or on some set of assumptions about
> frequent defects, or some such.)

I think you need more tests than that, but more so because the
requirements are wrong than anything else. A combination of three input
values does not necessarily represent a valid triangle, so the function
really cannot return "scalar" just because the three values do not
represent an isosceles or equilateral triangle.

So that's six more test cases (a+b=c, a+b<c, a+c=b, a+c<b, b+c=a,
b+c<a).

I don't think the requirements state that negative values should be
disallowed (a negative value may simply represent a triangle that is
upside-down). So we can't add more bad data tests without having
better requirements.
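
For illustration, the degenerate and impossible cases could be
screened with a guard like this (a sketch in Ruby; the name
valid_triangle? and the decision to reject rather than classify are
assumptions, since the requirements don't say):

  def valid_triangle?(a, b, c)
    # Strict triangle inequality: each side is shorter than the sum
    # of the other two. Equality (a + b == c) is the degenerate case.
    a + b > c && a + c > b && b + c > a
  end

  valid_triangle?(1, 2, 3)   #=> false (degenerate: a + b == c)
  valid_triangle?(1, 2, 5)   #=> false (impossible: a + b < c)
  valid_triangle?(3, 4, 5)   #=> true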

Paul

Stefan Schmiedl

Dec 7, 2001, 3:21:54 PM

l...@provida.no (2001-12-08 03:35):

The user story (msg 27796) said:

"What tests would you write for a function that takes 3 numbers (the
side lengths of a triangle), and returns whether the triangle is
equilateral, scalene or isosceles."

So I *know* that the inputs are valid.

Input validation would be done in another method, anyway.
Right now we are focusing *only* on deciding what kind of triangle
we have.

If you check that the numbers make a valid triangle, the Gods of
Refactoring will find ye, and lo, show no mercy on thy code.

BTW: a solution without if could look like

def classifyTriangle(*sides)
  ["equilateral", "isosceles", "scalene"][sides.uniq.size - 1]
end

Now, this raises another question. Given that you wrote the method
this way ... what could possibly break in this method? I'm not talking
about callers here: If a caller sends two or four sides instead of
three, it's his fault.
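
As a sketch, the five black-box cases John Roth described earlier in
the thread would look like this against the method above (assuming
it is already loaded):

  require 'test/unit'

  class TriangleTest < Test::Unit::TestCase
    def test_equilateral
      assert_equal("equilateral", classifyTriangle(3, 3, 3))
    end

    def test_isosceles_each_pair
      assert_equal("isosceles", classifyTriangle(3, 3, 4))
      assert_equal("isosceles", classifyTriangle(3, 4, 3))
      assert_equal("isosceles", classifyTriangle(4, 3, 3))
    end

    def test_scalene
      assert_equal("scalene", classifyTriangle(3, 4, 5))
    end
  end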

John Roth

Dec 7, 2001, 10:18:43 PM

<l...@provida.no> wrote in message
news:lT7Q7.9614$yB2.1...@news1.oke.nextra.no...

The specification did not say: test for bad data and return
an exception (or whatever). If the specification said that,
you would need three additional tests, one for each
of a, b, c not being positive. (The cases are independent.)
It also didn't say what you should do in that case. If
you were firewalling the code against invalid data
created by another function that shouldn't be doing it,
then the implementation should be an assertion. If the
function can be called from user code without additional
checking, then more needs to be done, but that needs
to be specified. Otherwise, you're violating the
injunction against doing more than you are asked.

Of course, it's your right to say that the specification
doesn't make sense as written, and get it changed
by the customer.

Even if you're testing for invalid data, most testing
theorists would suggest separate tests for zero and
negative, on the perfectly reasonable grounds that
bugs are likely to hide here. However, that's making
an assumption about the implementation.
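
In Ruby terms, such a firewall might be no more than this (a sketch;
the choice to raise ArgumentError, and the reuse of the uniq trick
from upthread, are illustrative assumptions):

  def classify_triangle(a, b, c)
    # Firewall: fail loudly on data the caller should never send.
    unless [a, b, c].all? { |s| s > 0 }
      raise ArgumentError, "sides must be positive"
    end
    ["equilateral", "isosceles", "scalene"][[a, b, c].uniq.size - 1]
  end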

John Roth


Robert C. Martin

Dec 8, 2001, 1:35:43 AM

On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
wrote:

>What tests would you write for a function that takes 3 numbers (the
>side lengths of a triangle), and returns whether the triangle is
>equilateral, scalene or isosceles.

Set<int> sides; // data structure allows no duplicates.
sides.add(a);
sides.add(b);
sides.add(c);
return sides; // 1 = equilateral, 2=isoceles, 3=scalene.

What am I missing?


Robert C. Martin | "Uncle Bob" | Software Consultants
Object Mentor Inc. | rma...@objectmentor.com | We'll help you get
PO Box 5757 | Tel: (800) 338-6716 | your projects done.
565 Lakeview Pkwy | Fax: (847) 573-1658 | www.objectmentor.com
Suite 135 | | www.XProgramming.com
Vernon Hills, IL, | Training and Mentoring | www.junit.org
60061 | OO, XP, Java, C++, Python|

"One of the great commandments of science is:
'Mistrust arguments from authority.'" -- Carl Sagan

rmol...@online.no

Dec 8, 2001, 4:00:32 AM

In comp.object John Roth <john...@ameritech.net> wrote:


> The specification did not say: test for bad data and return
> an exception (or whatever).

But then this behaviour is undefined - and that can't be a good thing,
can it?

> If the specification said that,
> you would need three additional tests, one for each
> of a, b, c not being positive. (The cases are independent.)

But doesn't the assumption that the cases are independent break the
black box? How do you know that it's enough to check the three cases
where just one of the sides have illegal values without knowing or
making assumptions about the implementation?

> It also didn't say what you should do in that case. If
> you were firewalling the code against invalid data
> created by another function that shouldn't be doing it,
> then the implementation should be an assertion. If the
> function can be called from user code without additional
> checking, then more needs to be done, but that needs
> to be specified. Otherwise, you're violating the
> injunction against doing more than you are asked.

> Of course, it's your right to say that the specification
> doesn't make sense as written, and get it changed
> by the customer.

But doesn't focusing on testing the specification mean that you might
overlook vagueness in the specification and with that undefined
behaviour in the product?

If nobody realizes that the specification doesn't cover a value that
might arise in a real deployment, then you're not going to discover
that as you are always only testing by the specification. Or am I
misunderstanding something?

[SNIP]
--
Leif Roar Moldskred

Stefan Schmiedl

Dec 8, 2001, 4:50:49 AM

rmol...@online.no (2001-12-08 18:16):

>
> But doesn't the assumption that the cases are independant break the
> black box? How do you know that it's enough to check the three cases
> where just one of the sides have illegal values without knowing or
> making assumptions about the implementation?

well, not even Marvin's brain the size of a planet would be able
to test *all* possible combinations of numbers. you just draw a line
somewhere.

there is a change of terminology in progress over in xp-land:
replace unit tests with programmer tests and acceptance tests
with customer tests.

so unit tests are written by the programmers who know what they
implement.

>
> But doesn't focusing on testing the specification mean that you might
> overlook vagueness in the specification and with that undefined
> behaviour in the product?

but by writing tests for the specs you often discover the vagueness
before writing something you thought the customer implied but did not.

>
> If nobody realizes that the specification doesn't cover a value that
> might arise in a real deployment, then you're not going to discover
> that, as you are only ever testing by the specification. Or am I
> misunderstanding something?

yes. you are allowed to apply common sense :-)
test first shows its full power when applied together with all of
the other xp practices. you just don't sit in a cubicle with a spec
binder and translate the words into a test suite all by yourself.
you have a whole network of fellow programmers and a customer at
hand who you can ask, if a problem arises. Basically, you are not
left alone.

Stefan Schmiedl

unread,
Dec 8, 2001, 4:56:08 AM12/8/01
to
Robert C. Martin (2001-12-08 15:36):

> On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
> wrote:
>
> >What tests would you write for a function that takes 3 numbers (the
> >side lengths of a triangle) , and returns whether the triangle is
> >equilateral, scalene or isoceles.
>
> Set<int> sides; // data structure allows no duplicates.
> sides.add(a);
> sides.add(b);
> sides.add(c);
> return sides; // 1 = equilateral, 2=isoceles, 3=scalene.
>
> What am I missing?

you're writing C++ on a ruby mailing list? :-)

David B Lightstone

unread,
Dec 8, 2001, 9:42:17 AM12/8/01
to

"Robert C. Martin" <rmartin @ objectmentor . com> wrote in message
news:6nj21uo76gu2nfq92...@4ax.com...

> On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
> wrote:
>
> >What tests would you write for a function that takes 3 numbers (the
> >side lengths of a triangle) , and returns whether the triangle is
> >equilateral, scalene or isoceles.
>
> Set<int> sides; // data structure allows no duplicates.
> sides.add(a);
> sides.add(b);
> sides.add(c);
> return sides; // 1 = equilateral, 2=isoceles, 3=scalene.
>
> What am I missing?

Fault: a poorly defined enumeration - missing case NOT_A_TRIANGLE.
Failure to consider the default requirement of all applications - bullet
proofing.

Take a problem, view it differently, and unknowingly
change it. Your analysis is perfectly correct for your
simplified problem. It just isn't the problem which
was originally presented.

John Roth

unread,
Dec 8, 2001, 9:45:09 AM12/8/01
to

<rmol...@online.no> wrote in message
news:QEkQ7.10066$yB2.1...@news1.oke.nextra.no...

It's always possible to overlook vagueness in the specification;
however, that's what inspections and so forth are for. If you
discover that the specification is too vague for you, by all means
bring the issue up to whoever wrote it. You may or may not
be right, but going off on your own is always wrong.

> If nobody realizes that the specification doesn't cover a value that
> might arise in a real deployment, then you're not going to discover
> that as you are always only testing the by the specification. Or am I
> misunderstanding something?

No, you're not missing something. I made a distinction between
an internal and external routine for a reason. An internal routine is
one that can trust its callers to pass only valid values, an external
routine is one that needs to edit its inputs for validity.

To bring this back to XP, there's a basic misunderstanding of the
function of unit tests in XP. In normal development, unit tests
are created to test an implementation. In XP, they are to
test a specification, and then the implementation is written to
that specification. It takes some experience to understand that
if you put something in the implementation that is not covered
by the unit tests you've just written, you've made a mistake.

If you feel that you need to test for invalid values, then, after
discussing it with your pair, and you agree that it is reasonable,
then you rewrite the specification. Then you write the necessary
unit tests.
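
A minimal sketch of that internal/external distinction in Ruby (the
names and the classification logic are hypothetical, just to make the
point concrete):

# internal routine: callers are trusted, so an assertion suffices
def classify_internal(a, b, c)
  raise "bug in caller" unless [a, b, c].all? { |s| s > 0 }
  [a, b, c].uniq.size   # 1 = equilateral, 2 = isosceles, 3 = scalene
end

# external routine: the specification says to validate the inputs
def classify_external(a, b, c)
  unless [a, b, c].all? { |s| s > 0 }
    raise ArgumentError, "sides must be positive"
  end
  classify_internal(a, b, c)
end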


John Roth


John Roth

unread,
Dec 8, 2001, 9:53:24 AM12/8/01
to

"Robert C. Martin" <rmartin @ objectmentor . com> wrote in message
news:6nj21uo76gu2nfq92...@4ax.com...
> On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
> wrote:
>
> >What tests would you write for a function that takes 3 numbers (the
> >side lengths of a triangle) , and returns whether the triangle is
> >equilateral, scalene or isoceles.
>
> Set<int> sides; // data structure allows no duplicates.
> sides.add(a);
> sides.add(b);
> sides.add(c);
> return sides; // 1 = equilateral, 2=isoceles, 3=scalene.
>
> What am I missing?

Nice piece of code - a lot more elegant than my approach.
I'm going to have to remember it.

However, the question was about the unit tests, not
about the implementation.

It does bring up an interesting point (which I think
someone else commented on, at least in passing.)
This implementation really only requires three unit
tests, not five, since it doesn't require that the
three parameters be distinct.

If this is the case, I'm at a loss to know what Kent's
other two unit test cases were.
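
Spelled out, those three tests might be (a sketch in Ruby's Test::Unit,
against a hypothetical classify(a, b, c) returning 1, 2 or 3):

require 'test/unit'

class TriangleTest < Test::Unit::TestCase
  def test_equilateral
    assert_equal 1, classify(2, 2, 2)
  end

  def test_isosceles
    assert_equal 2, classify(5, 5, 8)
  end

  def test_scalene
    assert_equal 3, classify(3, 4, 5)
  end
end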

John Roth

David B Lightstone

unread,
Dec 8, 2001, 10:16:07 AM12/8/01
to

"John Roth" <john...@ameritech.net> wrote in message
news:u149pbf...@news.supernews.com...

This is probably the key to understanding the whole argument.
There are 2 separate problems being addressed. Nobody
can sort out which problem is being addressed at a given time.
Some assume an internal routine, others assume an external one.

Stefan Schmiedl

unread,
Dec 8, 2001, 1:00:29 PM12/8/01
to
John Roth (2001-12-08 23:57):

>
> "Robert C. Martin" <rmartin @ objectmentor . com> wrote in message
> news:6nj21uo76gu2nfq92...@4ax.com...
> > On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
> > wrote:
> >
> > >What tests would you write for a function that takes 3 numbers (the
> > >side lengths of a triangle) , and returns whether the triangle is
> > >equilateral, scalene or isoceles.
> >
> > Set<int> sides; // data structure allows no duplicates.
> > sides.add(a);
> > sides.add(b);
> > sides.add(c);
> > return sides; // 1 = equilateral, 2=isoceles, 3=scalene.
> >

> This implementation really only requires three unit tests

Does it? This implementation, like the Ruby one I posted, moves
the solution to language features. If you want to make sure down
to this level, you need just two tests, one for two different
numbers, one for two equal numbers.

If you do this in Ruby, how low-level will you go for testing?

def classify(*sides)
  ["equilateral", "isosceles", "scalene"][sides.uniq.size - 1]
end

I think you will trust Ruby to
- collect the arguments into an array
- count the items in an array
- subtract 1 from a number
- extract an element from an array given an index

The only "magic" here is uniq, but even this is built-in, hence
I tend to trust it unless proven wrong.

So: what does the programmer need to test with this implementation?

The user will still provide three triangles for checking out
return values.
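
Spelled out, those two low-level tests might be (a sketch; they pin
down the only behaviour classify adds on top of the language):

require 'test/unit'

class UniqTrustTest < Test::Unit::TestCase
  def test_equal_values_collapse
    assert_equal 1, [5, 5].uniq.size
  end

  def test_distinct_values_survive
    assert_equal 2, [5, 7].uniq.size
  end
end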

Robert C. Martin

unread,
Dec 8, 2001, 1:31:10 PM12/8/01
to
On Sat, 08 Dec 2001 00:35:43 -0600, Robert C. Martin <rmartin @
objectmentor . com> wrote:

>On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
>wrote:
>
>>What tests would you write for a function that takes 3 numbers (the
>>side lengths of a triangle) , and returns whether the triangle is
>>equilateral, scalene or isoceles.
>
>Set<int> sides; // data structure allows no duplicates.
>sides.add(a);
>sides.add(b);
>sides.add(c);

>return sides.length(); // 1 = equilateral, 2=isoceles, 3=scalene.
^^^^^^^^^--oops.

Kent Beck

unread,
Dec 9, 2001, 1:33:07 PM12/9/01
to

Thaddeus L Olczyk <olc...@interaccess.com> wrote in message
news:3c0fca7e....@nntp.interaccess.com...
> On Fri, 07 Dec 2001 15:28:15 GMT, "David B Lightstone"
> <david.li...@prodigy.net> wrote:
>
> >
> >"Keith Ray" <k1e2i3t...@1m2a3c4.5c6o7m> wrote in message
> >> Kent Beck's response to that example:
> >>
> >> http://groups.yahoo.com/group/extremeprogramming/message/37242
> >>
> >> "The biggest problem is that it doesn't balance the cost and benefits
> >> of tests. He has a triangle example which he writes (I think) 29
> >> tests for. The test-first version can be done confidently with 5
> >> tests."
> ...doesn't balance the cost and benefits...
> That's exactly what the anti-testing people say.
> So what does it appear he does? In typical managerial fashion he
> tries to balance the equation by lowering the costs, but doesn't
> consider that he is lowering the benefits. One might be tempted to say
> that he is now getting caught up in the general XP cult like
> behaviour; but I say: Everyone has an off day, and this might have
> been one for him. Let us give him a chance to clarify his statement.
>
> Oh yeah. And describe the five tests.

I didn't want to respond until I'd done the experiment myself. I apologize
for the length of what follows, but simple sequences of words appear to be
failing me, communication-wise.

I assume the interface is a function that returns 1 if equilateral, 2 if
isosceles, and 3 if scalene. Here is the first test. I don't know where to
put the function yet, so I'll just implement it in the test class and
refactor later.

TriangleTest>>testScalene
    self assert: (self evaluate: 1 side: 2 side: 3) = 3

Using Uncle Bob's excellent trick (immortalized as Duplicate Removing Set in
the Smalltalk Best Practice Patterns), this can be implemented as:

TriangleTest>>evaluate: aNumber1 side: aNumber2 side: aNumber3
    | sides |
    sides := Set
        with: aNumber1
        with: aNumber2
        with: aNumber3.
    ^sides size

Since I know Sets are insensitive to the order in which elements are added,
I conclude I don't have to test alternative orders for the arguments.

The implementation should suffice for isosceles and equilateral triangles
without modification. However, I want to communicate that to posterity, so I
write:

TriangleTest>>testEquilateral
    self assert: (self evaluate: 2 side: 2 side: 2) = 1

TriangleTest>>testIsoceles
    self assert: (self evaluate: 1 side: 2 side: 2) = 2

They both run first time.

Now I suppose we want to "idiot-proof" our code, so an exception is thrown
if bad arguments are passed. My code does not seem to suffer from the lack
of such checking code, but I have two tests to burn.

Now the implementation is:

TriangleTest>>evaluate: aNumber1 side: aNumber2 side: aNumber3
    | sides |
    sides := Set
        with: aNumber1
        with: aNumber2
        with: aNumber3.
    sides asSortedCollection first <= 0 ifTrue: [self fail].
    ^sides size

This implementation will also work if a nil is passed, because the
asSortedCollection will throw an exception trying to compare a number and a
nil. The truly pathological case is if someone were to pass all strings,
which can compare to each other. I'll implement this one, but we're already
wandering in Silly-land.

TriangleTest>>testStrings
    [self evaluate: 'a' side: 'b' side: 'c']
        on: Exception
        do: [:ex | ^self].
    self fail

That one already works, too, because of the comparison with 0.

There you have it--five tests that give me great confidence in the MTBF of
my function.
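
For the ruby-talk readers, the Set trick above translates roughly to
the following (a rough Ruby rendering, not Kent's code; the string
case raises just as in the Smalltalk, because 'a' <= 0 is an invalid
comparison in Ruby):

def evaluate(a, b, c)
  sides = [a, b, c].uniq
  raise ArgumentError, "bad side" if sides.min <= 0
  sides.size   # 1 = equilateral, 2 = isosceles, 3 = scalene
end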

>
> >
> > It is rather hard to access either your response or Mr Beck's intent.
> >What to you believe?
> >
> >(1) Mr Beck may have read the 29 test cases and concluded that
> >the 5 tests he favors are adequate. (the original 29 hasd redundancy)
> >
> >(2) Mr Beck may have read the 29 test cases and concludes that
> >pruning a number of them was appropriate (impossible situations
> >by virtue of a priori knowledge about other aspects of the system).
> >(Hence he is doing integration testing, rather than unit testing)
> >
> >(3) Mr Beck may have determined that the customer can afford
> >to take the risk because the expected loss associated with
> >the pruned test cases is acceptable.
> >
> 4) He merges several test cases into one test case. Thus requiring the
> same amount of work.
>
> The one thing I notice about each case is that they require you to
> think of and dispose of cases. In other words, even though you don't write
> them, you do have to expend considerable effort on them.
>
> The things is though ( and a major part of my original post ) is that
> "only 5" sounds good until you think about it.

What defects am I missing with the above 5 tests? Put another way, what
further tests could improve the MTBF of my function?

> Assume you have a small project of 1000 units. That means 5000 unit
> tests ( and that is assuming that 5 still holds; I would expect other
> units to be more complex, requiring even more unit tests, the number
> increasing exponentially with complexity. )
>
> So now you need an infrastructure to handle 1000 units + 5000 tests.
> That is what I see as the main problem with unit testing.

If your unit tests are large and complicated, you have a design problem, not
a testing problem. If you have an exponential unit testing problem, again,
you have a design problem, not a testing problem.

Mr. Beck
P.S. How does any of this make me anti-testing?


Ron Jeffries

unread,
Dec 10, 2001, 7:04:37 AM12/10/01
to
On Sun, 02 Dec 2001 18:03:12 -0500, Panu Viljamaa <pa...@fcc.net>
wrote:

>Yes I have. What happened was I find it hard to understand how to use a method written by someone else when there is no indication of what kind of arguments should be passed to it. It is also hard to say if it is right or wrong when there is no clearly written contract on "what it should do". This might be specified by some test-case-class somewhere, but I'd rather have 'contract' of the method visible as soon as I view the method's source, so I don't have to go test-case hunting.

The usual Smalltalk convention is to write a method definition like
this:

insert: anObject into: aCollection

which means about the same thing as

void insert( Object o, Collection c)

The name of the method, insert:into: is chosen to reflect what the
method should do.

Was the code you wrote and read using those conventions? What language
are you using that expresses contracts? Eiffel?

Ronald E Jeffries
http://www.XProgramming.com
http://www.objectmentor.com

David B Lightstone

unread,
Dec 10, 2001, 8:21:31 AM12/10/01
to

"Kent Beck" <kent...@my-deja.com> wrote in message
news:9v1ss3$adi$1...@suaar1aa.prod.compuserve.com...

Given the infrastructure available, 5 tests it is.

Unfortunately, the production implementation does not have the
infrastructure, so we will have to port it to the production version.

How much additional testing will be needed to determine such things as
(1) The exception system functions correctly (maybe there is an
interaction problem to be exposed)
(2) That set insertion is order-independent

From my perspective a Unit test has been transformed into an
Integration test by means of assuming pretested component
availability. Complexity can be shifted around, but not eliminated

Vladimir

unread,
Dec 10, 2001, 9:10:52 AM12/10/01
to
tada...@aol.com (Tom Adams) wrote in message news:<793af3df.01120...@posting.google.com>...
> Ron Jeffries <ronje...@REMOVEacm.org> wrote in message news:<176717160028CE03.51B6AF6E...@lp.airnews.net>...

> I am interested in your use of the word "comprehensive". Beck gave an
> example of test first coding where he wrote the test case:
>
> sum = Calculator.sum(2,2)
>
> assert: equals(sum,4)

also, this will never reveal a multiplication written where addition
was meant in calculator::sum(), since 2*2 == 2+2...
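
A sketch of the fix: pick operands that separate the two operations
(Calculator is the hypothetical class from the quoted example):

sum = Calculator.sum(2, 3)
assert_equal 5, sum   # a multiplying sum() returns 6 and fails;
                      # 2 and 2 could never tell the difference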

Vladimir

unread,
Dec 10, 2001, 9:17:40 AM12/10/01
to
"David B Lightstone" <david.li...@prodigy.net> wrote in message news:<je5Q7.350$lw2.92...@newssvr16.news.prodigy.com>...

> "Keith Ray" <k1e2i3t...@1m2a3c4.5c6o7m> wrote in message
> (1) Mr Beck may have read the 29 test cases and concluded that
> the 5 tests he favors are adequate. (the original 29 hasd redundancy)
>
> (2) Mr Beck may have read the 29 test cases and concludes that
> pruning a number of them was appropriate (impossible situations
> by virtue of a priori knowledge about other aspects of the system).
> (Hence he is doing integration testing, rather than unit testing)
>
> (3) Mr Beck may have determined that the customer can afford
> to take the risk because the expected loss associated with
> the pruned test cases is acceptable.
>

(3) is my impression of testing in XP.

-------
Best Wishes,
Vladimir

Vladimir

unread,
Dec 10, 2001, 9:22:21 AM12/10/01
to
"David B Lightstone" <david.li...@prodigy.net> wrote in message news:<je5Q7.350$lw2.92...@newssvr16.news.prodigy.com>...

> (3) Mr Beck may have determined that the customer can afford


> to take the risk because the expected loss associated with
> the pruned test cases is acceptable.

also in case (3) Mr Beck should give the customer the worst-case
depiction, which would state:

"if you'll try to anything our tests didn't do our high quality system
may crash on you."

Vladimir

unread,
Dec 10, 2001, 9:27:45 AM12/10/01
to
l...@provida.no wrote in message news:<lT7Q7.9614$yB2.1...@news1.oke.nextra.no>...

also, do not forget the test-blindness effect: use distinct values for
a, b and c, chosen so that even a faulty implementation will not happen
to produce the correct result.

Peter Hickman

unread,
Dec 10, 2001, 9:57:18 AM12/10/01
to
Vladimir wrote:

> also do not forget testing blindeness effect and use distinct values
> for a, b and c which when used will not lead to a correct result even
> in faulty implementation.

How faulty an implementation?

For example sin(5) should not return 120, but a sin function that is
in fact a factorial function would return exactly that (5! = 120). OK,
this is a very big error, but I've seen some programmers come close.

For any function there can be an implementation that gives a dumb
answer:

def sin(a)
  return 120
end

Testing, black-box stylee, for bad implementations leads to, well,
infinite tests! Ever heard of the 'knocking demon' hypothesis?


Bob Binder

unread,
Dec 10, 2001, 12:35:08 PM12/10/01
to
Robert C. Martin <rmartin @ objectmentor . com> wrote in message news:<h2n41ucnnc8ipnsee...@4ax.com>...

> On Sat, 08 Dec 2001 00:35:43 -0600, Robert C. Martin <rmartin @
> objectmentor . com> wrote:
>
> >On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
> >wrote:
> >
> >>What tests would you write for a function that takes 3 numbers (the
> >>side lengths of a triangle) , and returns whether the triangle is
> >>equilateral, scalene or isoceles.
> >
> >Set<int> sides; // data structure allows no duplicates.
> >sides.add(a);
> >sides.add(b);
> >sides.add(c);
> >return sides.length(); // 1 = equilateral, 2=isoceles, 3=scalene.
> ^^^^^^^^^--oops.
> >
> >What am I missing?

In this example, wouldn't using an int set for the sides prevent
representing any triangle with two or more equal lengths -- I think a
bag would do the job. Assuming that your set would reject identical
values, any test suite which did not include tests like 3,3,4 or 3,3,3
would miss that bug.

What are the bounds of the ints? If this is the typical signed int, zero
and negative values are allowed. Should I take it on faith that the
implementation will reject values less than or equal zero? (granted,
if you have the code in front of you, an inspection would probably
suffice to answer that question.) I see nothing in the interface
definition to warrant such confidence, however. If you'd defined a
bag of natural numbers (not ints), I would be willing to take it on
faith.

A typical algorithm for classifying the lengths as a triangle sums the
lengths. Using the maximum value for at least one int (e.g.
(2**31)-1 for a 32-bit signed int) will usually cause an arithmetic overflow.

Another common bug is to classify the lengths based on only checking
two sides -- this is why Myers insisted on permuting the sides for
each type.

All of this is listed on page 7 of my book.
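
A sketch of that permutation idea in Ruby (classify(a, b, c) is the
hypothetical function under test, returning 2 for isosceles):

[[3, 3, 4], [3, 4, 3], [4, 3, 3]].each do |a, b, c|
  # an implementation that only compares two of the sides will
  # misclassify at least one of these orderings
  raise "bug for #{[a, b, c].inspect}" unless classify(a, b, c) == 2
end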

Your example is much closer to Myers' original three numbers on a
punched card problem than the class hierarchy I used to highlight the
differences of testing oo and procedural implementations, which
include complex data types, inheritance, and polymorphism. The larger
number of tests in my example (65) are a direct result of that stuff,
but are only a first stab to suggest the kind of problems in finding
bugs in oo implementations.

BTW, the apparent meaning of "adequate testing" in this thread is not
consistent with definition used in the testing community for over 15
years: adequate testing must *at least* achieve statement coverage.
There are many other possible definitions of "adequate testing", but
if your tests don't at least reach all the code in the IUT, you can't
claim to be doing adequate testing. Although any definition of
"adequate testing" which attempts a more stringent criterion can be
falsified, many unambiguous, non-subjective criteria for adequacy have
been useful in practice -- but they don't include "I say so".
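
Statement coverage can be measured mechanically; in modern Ruby, for
instance, the built-in coverage library does it. A sketch, where
triangle.rb and its test file are hypothetical:

require 'coverage'

Coverage.start
load 'triangle.rb'        # the implementation under test
load 'triangle_test.rb'   # the suite that exercises it
p Coverage.result         # per-file execution counts, line by line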


___________________________________________________________________
Bob Binder http://www.rbsc.com RBSC Corporation
rbinder@ r b s c.com Advanced Test Automation 3 First National Plz
312 214-3280 tel Process Improvement Suite 1400
312 214-3110 fax Test Outsourcing Chicago, IL 60602

Joe Pallas (pallas at cs dot stanford dot edu)

unread,
Dec 10, 2001, 1:23:12 PM12/10/01
to
Kent Beck wrote:

An astounding indictment of the way XP approaches Unit Testing.

> Since I know Sets are insensitive to the order in which elements are added,
> I conclude I don't have to test alternative orders for the arguments.

So, you have made decisions about what cases to test based on the
implementation. That implies two things:

1) You have violated the XP rule that tests should be written before the
code that they test.

2) Your test cases are specific to the implementation, and will not
adequately test a different implementation.

This means that some future refactoring may make your tests inadequate,
but the coder doing the refactoring would rely on your tests and check
in broken code.

> Now I suppose we want to "idiot-proof" our code, so an exception is thrown
> if bad arguments are passed. My code does not seem to suffer from the lack
> of such checking code, but I have two tests to burn.

...


> There you have it--five tests that give me great confidence in the MTBF of
> my function.

> What defects am I missing with the above 5 tests? Put another way, what
> further tests could improve the MTBF of my function?

What a bizarre notion. It is not meaningful to talk about the MTBF of
software, because software does not fail. Software is either correct or
incorrect, and that does not change over time. Requirements change, and
execution environments change, but the program itself is not subject to
"bit rot."

Since you've implied that your function will always throw an exception
for bad arguments, let me point out that the MTBF of your function is
zero if given either of the inputs (5, 5, 11), which it will incorrectly
report to be an isosceles triangle, or (1, 3, 7), which it will
incorrectly report to be a scalene triangle.

> If your unit tests are large and complicated, you have a design problem, not
> a testing problem.

Again, this can only be true if your testing is dictated by the design,
*rather than by the requirements.* That's not just backwards, it is
doubly dangerous in the XP milieu, where frequent changes to the design
are encouraged, and the only validation of those changes comes from the
pre-existing tests.

joe

Tom Adams

unread,
Dec 10, 2001, 1:27:59 PM12/10/01
to
"Kent Beck" <kent...@my-deja.com> wrote in message news:<9v1ss3$adi$1...@suaar1aa.prod.compuserve.com>...

> (Ken's proposed 5 test case solution to Myers Triangle problem was here)

1. What about oddities in the source code? Like:

if (any side ==4) {
return 1;
}

This would cause it to fail for the test case 1,2,4 and your tests
would miss that.

You would need white-box testing to catch this, or you would need
more, perhaps many more, black-box tests to cover all the possible
internals of the function's code.

2. There is also the issue of very large integer values. At least, you
can call the function with literal integers where it will fail. Actually,
it is not possible to code the function to handle integers of unbounded
size without using some special integer representations. Anyway, a
function would pass your tests with undefined behaviour for very large
integers. And, code using the function could work on one compiler or
machine and fail on another, depending on the language, because of issues
related to integer size.

Thaddeus L Olczyk

unread,
Dec 9, 2001, 5:13:51 PM12/9/01
to
On 10 Dec 2001 06:22:21 -0800, trued...@my-deja.com (Vladimir)
wrote:

That's not the worst case. Several others come to mind if the untested
bug turns out to be very nasty.

Stefan Schmiedl

unread,
Dec 10, 2001, 2:32:41 PM12/10/01
to
Joe Pallas (pallas at cs dot stanford dot edu) (2001-12-11 03:40):

> Since you've implied that your function will always throw an exception
> for bad arguments, let me point out that the MTFB of your function is
> zero if given either of the inputs (5, 5, 11), which it will incorrectly
> report to be an isosceles triangle, or (1, 3, 7), which it will
> incorrectly report to be a scalene triangle.
>

I would definitely prefer to factor deciding the validity into
a separate method which is tested separately.
Classification is so much easier when you know the species.

Besides, I understood the message which started this discussion
as asking to classify a triangle described by its sides. So it is up to
the environment to decide whether (5,5,11) is passed into
classification (which would be possible in non-Euclidean geometries, I
think), or whether Euclidean borderline cases such as (5,5,10) are
considered triangles or not.

Seeing it this way makes for smaller methods which I can understand
more easily. Of course, your mileage may vary.

Robert Oliver

unread,
Dec 10, 2001, 2:56:34 PM12/10/01
to

> Kent Beck wrote:
>
> > What defects am I missing with the above 5 tests? Put another way, what
> > further tests could improve the MTBF of my function?

"Joe Pallas (pallas at cs dot stanford dot edu)"
<pallas@cs_dot_stanford.edu> wrote in message
news:3C14FD90.9F8987B3@cs_dot_stanford.edu...


> What a bizarre notion. It is not meaningful to talk about the MTBF of
> software, because software does not fail. Software is either correct or
> incorrect, and that does not change over time. Requirements change, and
> execution environments change, but the program itself is not subject to
> "bit rot."

Joe,

I'd just like to discuss the idea of (Mean Time Between Failure) MTBF for
software.

Firstly, software fails, it just fails differently than hardware. A few
examples:

A function is tested and meets all functional requirements, but it
accumulates a count internally which overflows about once per week in normal
use. The function is always broken, but since it only fails once per week,
it has an MTBF of 1 week.
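
The kind of latent fault described there, sketched in Ruby (the wrap
is explicit here; in C it would happen silently):

class EventCounter
  def initialize
    @count = 0
  end

  def tick
    # wraps after 65_536 events: fine in every test run,
    # a failure about once a week under production load
    @count = (@count + 1) & 0xFFFF
  end
end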

Another function was tested but one obscure test case was overlooked. The
data with which this function is used contains the obscure combination which
shows the failure about once per month. The MTBF of the software is about 1
month. It could be different with different data. It's still broken.

Yet another function has overlooked a test case. This time, the developers
have not used the function in a way to cause it to fail. Yes, it's still
broken, and someday we will probably find out how. Its MTBF is (for now) a
long time.

Secondly, MTBF is a useful concept that relates to observed behaviour.
Which is more reliable, Windows or Linux? Many of us have an opinion on
this question, and it is almost always defined in terms of how long the
system will run before it crashes, in other words, its MTBF.

Thirdly, MTBF can be used to weed out bad systems. We may not be able to
prove a given good system has a high MTBF, but we can almost surely show a
bad system has a poor MTBF. In all likelihood, both systems have some bugs.

Regards,

Bob Oliver


Gerold Keefer

unread,
Dec 10, 2001, 3:29:54 PM12/10/01
to
Robert Oliver wrote:

> Secondly, MTBF is a useful concept that relates to observed behaviour.

actually, reliability considerations do make sense for software *systems*.
they do not make sense for trivial functions or methods.
MTBF makes sense if there is a corresponding usage profile given.
the basic measure should be transactions/operations per failure.
the most comprehensive work done in this area is what john musa
is publishing.

regards,

gerold

--
=====================================================================
AVOCA GmbH - Advanced Visioning of Components and Architectures
Kronenstrasse 19
D-70173 Stuttgart
fo +49 711 2271374 fa +49 711 2271375
http://www.avoca-vsm.com mailto:in...@avoca-vsm.com
=====================================================================

"Move the chair" - Frank Lloyd Wright's response to a client who
phoned him to complain of rain leaking through the roof of the
house onto the dining table.


Hal E. Fulton

unread,
Dec 10, 2001, 6:47:03 PM12/10/01
to
----- Original Message -----
From: Robert Oliver <oli...@hfl.tc.faa.gov>
Newsgroups:
comp.object,comp.software.extreme-programming,comp.software-eng,comp.lang.ruby
To: ruby-talk ML <ruby...@ruby-lang.org>; <undisclosed-recipients: ;>
Sent: Monday, December 10, 2001 2:00 PM
Subject: [ruby-talk:28132] Re: John Roth dolt ( Re: A challenge to
proponents of Unit Testing. )


>
> Joe,
>
> I'd just like to discuss the idea of (Mean Time Between Failure) MTBF for
> software.
>
> Firstly, software fails, it just fails differently than hardware. A few
> examples:

[snip]

Thanks, Bob. These were the points I wanted to make,
but you said it first (and better).

Hal

Keith Ray

unread,
Dec 10, 2001, 9:08:21 PM12/10/01
to
In article <ded6b237.01121...@posting.google.com>,
rbi...@rbsc.com (Bob Binder) wrote:

> BTW, the apparent meaning of "adequate testing" in this thread is not
> consistent with definition used in the testing community for over 15
> years: adequate testing must *at least* achieve statement coverage.
> There are many other possible definitions of "adequate testing", but
> if your tests don't at least reach all the code in the IUT, you can't
> claim to be doing adequate testing. Although any definition of
> "adequate testing" which attempts a more stringent criteria can be
> falsified, many unambiguous non-subjective criteria for adequacy have
> been useful in practice, but they don't include "I say so".

Didn't the tests in the examples by Kent and Uncle Bob have 100%
statement coverage, so by your definition, they are "adequate"?

John Roth

unread,
Dec 10, 2001, 10:50:58 PM12/10/01
to

"Tom Adams" <tada...@aol.com> wrote in message
news:793af3df.01121...@posting.google.com...

> "Kent Beck" <kent...@my-deja.com> wrote in message
news:<9v1ss3$adi$1...@suaar1aa.prod.compuserve.com>...
>
> > (Ken's proposed 5 test case solution to Myers Triangle problem was
here)
>
> 1. What about oddities in the source code? Like:
>
> if (any side ==4) {
> return 1;
> }
>
> This would cause it to fail for the test case 1,2,4 and your tests
> would miss that.
>
> You would need white-box testing to catch this, or you would need
> more, perhaps many more, black-box tests to cover all the possible
> internals of the function's code.

You're violating the assumption that we're talking about XP. All
production code should be written from the unit tests, so such a
randomly idiotic check shouldn't be in there in the first place. If it
was, the pair should have caught it, and the next person to look
at the function should catch it.

If all of that fails, of course you're right. It won't be caught until
it fails somewhere else, and someone has to debug it.

> 2. There is also the issue of very large integer values. At least, you
> can call the function with literal integers where it will fail. Actually,
> it is not possible to code the function to handle integers of unbounded
> size without using some special integer representations. Anyway, a
> function would pass your tests with undefined behaviour for very large
> integers. And, code using the function could work on one compiler or
> machine and fail on another, depending on the language, because of issues
> related to integer size.

That actually depends on whether the language allows you to
pass something that the function/method will convert to an integer.
In the stated problem (the Myers Triangle problem) the only
operations are comparisons, so the problem of integer size
is irrelevant unless you're using something like Perl, which can
take a string and attempt to convert it to an integer to do an
integer comparison.

None of the three implementations I've seen (my rather pedestrian
procedural implementation in Python, RM's implementation or
KB's implementation) has this problem as long as the language
doesn't contain any way of failing on the parameter passing.

Of course, if the problem also said something like:

The user will pass the sides as strings. Distinguish whether
the user input is valid, and whether it constitutes a valid
triangle.

we would need more test cases and code.

John Roth


Universe

unread,
Dec 10, 2001, 10:56:40 PM12/10/01
to
Gerold Keefer <gke...@avoca-vsm.com> wrote:

> "Move the chair" - Frank Lloyd Wright's response to a client who
> phoned him to complain of rain leaking through the roof of the
> house onto the dining table.

Likely with regard to "Fallingwater", no doubt. (c :

[I've been there recently and to more agreeably "popping" Knob Hill
close by. Though "cool", I like the ideas embodied in FW more than
execution.]

Elliott

Robert C. Martin

unread,
Dec 10, 2001, 11:08:03 PM12/10/01
to
On Mon, 10 Dec 2001 10:23:12 -0800, "Joe Pallas (pallas at cs dot
stanford dot edu)" <pallas@cs_dot_stanford.edu> wrote:

>Kent Beck wrote:
>
>An astounding indictment of the way XP approaches Unit Testing.
>
>> Since I know Sets are insensitive to the order in which elements are added,
>> I conclude I don't have to test alternative orders for the arguments.
>
>So, you have made decisions about what cases to test based on the
>implementation. That implies two things:
>
>1) You have violated the XP rule that tests should be written before the
>code that they test.

Not quite. Tests, in XP, are written concurrently with the production
code. It is quite possible that Kent wrote the first test case, then
wrote the code that passed it, and then realized that ordering was
independent, and that no test cases needed to be written to test it.


>
>2) Your test cases are specific to the implementation, and will not
>adequately test a different implementation.

That's correct. Unit tests are white box tests. They are fragile to
significant changes of design. Acceptance tests, on the other hand,
are black box tests that are completely insensitive to changes in
design. You need both kinds of tests.

>This means that some future refactoring may make your tests inadequate,
>but the coder doing the refactoring would rely on your tests and check
>in broken code.

One would hope that the programmers who were making the change would
review the tests. One would also hope that there was an acceptance
test or two that would catch the bug. But in the end, you are right,
there are some design changes that might produce bugs that could pass
through the tests. I don't consider this a severe problem for most
projects since the incidence is not likely to be high.

>> What defects am I missing with the above 5 tests? Put another way, what
>> further tests could improve the MTBF of my function?
>
>What a bizarre notion. It is not meaningful to talk about the MTBF of
>software, because software does not fail.

The term is in use in this thread because someone else used it.

>Software is either correct or
>incorrect, and that does not change over time. Requirements change, and
>execution environments change, but the program itself is not subject to
>"bit rot."

And still software can have an MTBF.


>> If your unit tests are large and complicated, you have a design problem, not
>> a testing problem.
>
>Again, this can only be true if your testing is dictated by the design,
>*rather than by the requirements.* That's not just backwards, it is
>doubly dangerous in the XP milieu, where frequent changes to the design
>are encouraged, and the only validation of those changes comes from the
>pre-existing tests.

Again, remember that XP has two kinds of tests. White box unit tests
and black box acceptance tests.

Robert C. Martin

unread,
Dec 10, 2001, 11:17:31 PM12/10/01
to
On 10 Dec 2001 10:27:59 -0800, tada...@aol.com (Tom Adams) wrote:

>"Kent Beck" <kent...@my-deja.com> wrote in message news:<9v1ss3$adi$1...@suaar1aa.prod.compuserve.com>...
>
>> (Ken's proposed 5 test case solution to Myers Triangle problem was here)
>
>1. What about oddities in the source code? Like:
>
>if (any side ==4) {
> return 1;
>}

If you accept this as a legitimate testing target, then you must
exhaustively test every possible triplet, since there might be some if
statement that catches one particular triplet and returns the wrong
result. I think such testing is unreasonable in most cases.


>
>2. There is also the issue of very large integer values.

In the language that Kent was using (Smalltalk) integers have no
maximum size.

Robert C. Martin

unread,
Dec 10, 2001, 11:18:44 PM12/10/01
to
On 10 Dec 2001 09:35:08 -0800, rbi...@rbsc.com (Bob Binder) wrote:

>Robert C. Martin <rmartin @ objectmentor . com> wrote in message news:<h2n41ucnnc8ipnsee...@4ax.com>...
>> On Sat, 08 Dec 2001 00:35:43 -0600, Robert C. Martin <rmartin @
>> objectmentor . com> wrote:
>>
>> >On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
>> >wrote:
>> >
>> >>What tests would you write for a function that takes 3 numbers (the
>> >>side lengths of a triangle) , and returns whether the triangle is
>> >>equilateral, scalene or isoceles.
>> >
>> >Set<int> sides; // data structure allows no duplicates.
>> >sides.add(a);
>> >sides.add(b);
>> >sides.add(c);
>> >return sides.length(); // 1 = equilateral, 2=isoceles, 3=scalene.
>> ^^^^^^^^^--oops.
>> >
>> >What am I missing?
>
>In this example, wouldn't using an int set for the sides prevent
>representing any triangle with two or more equal lengths -- I think a
>bag would do the job. Assuming that your set would reject identical
>values, any test suite which did not include tests like 3,3,4 or 3,3,3
>would miss that bug.

The intent was to return a 1 for equilateral, 2 for isosceles, and 3
for scalene. Representing the triangle was not part of the job.

Universe

unread,
Dec 10, 2001, 11:42:43 PM12/10/01
to
Ron Jeffries <ronje...@REMOVEacm.org> wrote:

>On Sun, 02 Dec 2001 18:03:12 -0500, Panu Viljamaa <pa...@fcc.net>
>wrote:
>
>>Yes I have. What happened was I find it hard to understand how to use a method written by someone else when there is no indication of what kind of arguments should be passed to it. It is also hard to say if it is right or wrong when there is no clearly written contract on "what it should do". This might be specified by some test-case-class somewhere, but I'd rather have 'contract' of the method visible as soon as I view the method's source, so I don't have to go test-case hunting.
>
>The usual Smalltalk convention is to write a method defintion like
>this:
>
> insert: anObject into: aCollection
>
>which means about the same thing as
>
> void insert( Object o, Collection c)
>
>The name of the method, insert:into: is chosen to reflect what the
>method should do.

Insert when full, held for deletion, being used in 1 transaction, 2
transactions, 3, 4, none? That's my drift.

Elliott
--
http://www.radix.net/~universe ~*~ Enjoy! ~*~
Hail OO Modelling! * Hail the Wireless Web!
@Elliott 2001 my comments ~ newsgroups+bitnet quote ok

Universe

unread,
Dec 11, 2001, 1:11:26 AM12/11/01
to
Robert C. Martin <rmartin @ objectmentor . com> wrote:

>On Mon, 10 Dec 2001 10:23:12 -0800, "Joe Pallas (pallas at cs dot

>>2) Your test cases are specific to the implementation, and will not
>>adequately test a different implementation.

>That's correct. Unit tests are white box tests. They are fragile to
>significant changes of design.

Why? Why wouldn't unit testing only worry about robustly meeting
behavioral requirements?

>Acceptance tests, on the other hand,
>are black box tests that are completely insensative to changes in
>design. You need both kinds of tests.

Please explain why unit tests are white box, while acceptance tests -
unit or otherwise - are not.

You all don't even get one of the big bennies of testing by units:
being able to plug working units together purely on the basis of the
behavioral design contract.

Your units - being tested and doing the testing - appear to be
writhing worm balls of LoD violating, encapsulation slap happiness.

You better have courage with XP cause your sanity is going to be truly
"encapsulatedly" tested (you don't care how one thinks rationally,
just lose it).

>>This means that some future refactoring may make your tests inadequate,
>>but the coder doing the refactoring would rely on your tests and check
>>in broken code.

Am I the only one who thinks this happening is awful?

>One would hope that the programmers who were making the change would
>review the tests.

All you have for this travesty is "one would hope"?

> One would also hope that there was an acceptance
>test or two that would catch the bug. But in the end, you are right,
>there are some design changes that might produce bugs that could pass
>through the tests. I don't consider this a severe problem for most
>projects since the incidence is not likely to be high.

Really? In which Bizarro world? This goes against the most basic
ideas and proven practices with regard to both encapsulation and
testing.

>>> If your unit tests are large and complicated, you have a design problem, not
>>> a testing problem.

>>Again, this can only be true if your testing is dictated by the design,
>>*rather than by the requirements.*

Then you *do* have a design problem, because we established up above
that for unit tests you "white box" test based upon the unit
implementation. ???

>>That's not just backwards, it is
>>doubly dangerous in the XP milieu, where frequent changes to the design
>>are encouraged, and the only validation of those changes comes from the
>>pre-existing tests.

>Again, remember that XP has two kinds of tests. White box unit tests
>and black box acceptance tests.

How does that obviate the dangers of your unit testing process in this
regard?

Kent Beck

unread,
Dec 10, 2001, 6:23:21 PM12/10/01
to

Vladimir <trued...@my-deja.com> wrote in message
news:4837660.01121...@posting.google.com...

> "David B Lightstone" <david.li...@prodigy.net> wrote in message
news:<je5Q7.350$lw2.92...@newssvr16.news.prodigy.com>...
>
> > (3) Mr Beck may have determined that the customer can afford
> > to take the risk because the expected loss associated with
> > the pruned test cases is acceptable.
>
> also in case (3) Mr Beck should give a customer the worst case
> depiction which would state:
>
> "if you'll try to [do] anything our tests didn't do our high quality

system
> may crash on you."

How many tests would I need to write so I was willing to say that the system
*will* crash if they try a scenario not covered by the tests, and they
should pay me anyway? Enough tests so the MTBF of the system as a whole is
dramatically higher than the competition's.

Kent


Kent Beck

unread,
Dec 11, 2001, 2:38:57 AM12/11/01
to

Tom Adams <tada...@aol.com> wrote in message
news:793af3df.01121...@posting.google.com...
> "Kent Beck" <kent...@my-deja.com> wrote in message
news:<9v1ss3$adi$1...@suaar1aa.prod.compuserve.com>...
>
> > (Ken's proposed 5 test case solution to Myers Triangle problem was here)
>
> 1. What about oddities in the source code? Like:
>
> if (any side ==4) {
> return 1;
> }
>
> This would cause it to fail for the test case 1,2,4 and your tests
> would miss that.

Don't put oddities like this in your source code. Working strictly test
first, there would have to be a test that was failing before the conditional
above was written.

> You would need white-box testing to catch this, or you would need
> more, perhaps many more, black-box tests to cover all the possible
> internals of the function's code.
>
> 2. There is also the issue of very large integer values. At least, you
> can call the function with literal integers where it will fail. Actually,
> it is not possible to code the function to handle integers of unbounded
> size without using some special integer representations. Anyway, a
> function would pass your tests with undefined behaviour for very large
> integers. And, code using the function could work on one compiler or
> machine and fail on another, depending on the language, because of issues
> related to integer size.

Use a sensible language where integers act like integers, not like a 32-bit
counter with rollover. If you want to write a Java version of the function,
and show what happens with 2^31, feel free to write those test cases.

The earlier note that I ignored ill-formed triangles (5,5,11) is absolutely
correct. I missed those test cases. I'll post the updated code and tests
later.

Kent


Stefan Schmiedl

unread,
Dec 11, 2001, 3:25:02 AM12/11/01
to
Universe (2001-12-11 15:22):

> Robert C. Martin <rmartin @ objectmentor . com> wrote:
> >That's correct. Unit tests are white box tests. They are fragile to
> >significant changes of design.
>
> Why? Why wouldn't unit testing only worry about robustly meeting
> behavioral requirements.

because that would leave implementation flaws undetected.
with white-box unit tests you could ensure that third-party
libraries show the same behaviour over releases; since the
environment never sees those internals, black-box tests alone
would not notice such changes.

>
> >Acceptance tests, on the other hand,
> >are black box tests that are completely insensative to changes in
> >design. You need both kinds of tests.
>
> Please explain, why unit tests are white box, while acceptance - unit,
> or other? - are not.

because unit tests are written by the programmer while she
implements the user story. acceptance tests are provided by
the user, who is interested in the macroscopic behaviour of
the program.

In the triangle classification example, the user need not know
if we use a Set to automatically eliminate duplicates or store
the sides in three variables and do everything in some horribly
nested if statements.

>
> You all don't even get one of the big bennies of testing by units.
> Being able to plug working units together purely based upon behavioral
> design contract.

why not? we still have an external interface to our objects, which
is used for plugging.

> ... polemic snipped ...


>
> >>This means that some future refactoring may make your tests inadequate,
> >>but the coder doing the refactoring would rely on your tests and check
> >>in broken code.
>
> Am I the only one who thinks this happening is awful?

why is it awful? by refactoring you are effectively changing the
implementation, hence you need to adapt your tests to the situation.

>
> >One would hope that the programmers who were making the change would
> >review the tests.
>
> All you have for this travesty is "one would hope"?

if you like it more, "the XP process requires it". but the xp-way
tends to rely on people more than other processes.

>
> > One would also hope that there was an acceptance
> >test or two that would catch the bug. But in the end, you are right,
> >there are some design changes that might produce bugs that could pass
> >through the tests. I don't consider this a severe problem for most
> >projects since the incidence is not likely to be high.
>
> Really? In which Bizzaro world? This goes against the most basic
> ideas and proven practices with regard to both encapsulation and
> testing.

would you please provide an example of design changes that are proven not
to create bugs which may go unnoticed for some time?

>
> >>> If your unit tests are large and complicated, you have a design problem, not
> >>> a testing problem.
>
> >>Again, this can only be true if your testing is dictated by the design,
> >>*rather than by the requirements.*
>
> Then you *do* have a design problem Because we established up above
> that for unit tests, you "white box" test based upon unit
> implementation. ???

we have a design problem when our code refuses to be changed.
we don't have one when the implementation tests change
as the implementation changes.

>
> >>That's not just backwards, it is
> >>doubly dangerous in the XP milieu, where frequent changes to the design
> >>are encouraged, and the only validation of those changes comes from the
> >>pre-existing tests.
>
> >Again, remember that XP has two kinds of tests. White box unit tests
> >and black box acceptance tests.
>
> How does that obviate the dangers of your unit testing process in this
> regard?

i still fail to see the dangers of our approach.

Stefan Schmiedl

unread,
Dec 11, 2001, 3:25:02 AM12/11/01
to
Kent Beck (2001-12-11 16:42):


> The earlier note that I ignored ill-formed triangles (5,5,11) is absolutely
> correct. I missed those test cases. I'll post the updated code and tests
> later.

would you mind using ruby, as this is ruby-talk? :-)

Hal E. Fulton

unread,
Dec 11, 2001, 3:35:41 AM12/11/01
to
----- Original Message -----
From: Stefan Schmiedl <s...@xss.de>
To: ruby-talk ML <ruby...@ruby-lang.org>
Sent: Tuesday, December 11, 2001 2:23 AM
Subject: [ruby-talk:28185] Re: John Roth dolt ( Re: A challenge to
proponents of Unit Testing. )

> Kent Beck (2001-12-11 16:42):
> > The earlier note that I ignored ill-formed triangles (5,5,11) is
absolutely
> > correct. I missed those test cases. I'll post the updated code and tests
> > later.
>
> would you mind using ruby, as this is ruby-talk? :-)

I'll second that, Stefan... though I was wondering whether
this was crossposted from elsewhere. Is it?

Anyone want to propose a comp.lang.ruby.john.roth.dolt
newsgroup? Just kidding...

Hal

Piers Cawley

unread,
Dec 11, 2001, 4:07:46 AM12/11/01
to
Stefan Schmiedl <s...@xss.de> writes:
> Universe (2001-12-11 15:22):
>> Robert C. Martin <rmartin @ objectmentor . com> wrote:
>> >That's correct. Unit tests are white box tests. They are fragile to
>> >significant changes of design.
>>
>> Why? Why wouldn't unit testing only worry about robustly meeting
>> behavioral requirements.
>
> because that would leave implementation flaws undetected.
> with white-box unit tests you could ensure that third-party
> libraries show the same behaviour over releases. since the
> environment never sees them,

If you write your unit tests before you write your code how can they
be 'white box'? After all, if the box doesn't exist...

--
Piers

"It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
-- Jane Austen?

Stefan Schmiedl

unread,
Dec 11, 2001, 4:55:30 AM12/11/01
to
Piers Cawley (2001-12-11 18:01):

> If you write your unit tests before you write your code how can they
> be 'white box'? After all, if the box doesn't exist...

because i build the box to exactly fit the contents as described
by the unit test.

if this is done all the time there is much less spare room
in the box, where you didn't look for bugs.

Stefan Schmiedl

unread,
Dec 11, 2001, 5:00:51 AM12/11/01
to
Hal E. Fulton (2001-12-11 17:33):

> > would you mind using ruby, as this is ruby-talk? :-)
>
> I'll second that, Stefan... though I was wondering whether
> this was crossposted from elsewhere. Is it?
>

don't know ... but it shows us that there is *lots* of territory
to be covered ... one solution used C++, another one Smalltalk,
and it seems that my posted ruby solution went unnoticed in the
main discussion.... maybe it *is* crossposted and they just
don't see what I am replying here :-P

s.

> Anyone want to propose a comp.lang.ruby.john.roth.dolt
> newsgroup? Just kidding...

nice idea to make your name publicly known ;)

suresh vv

unread,
Dec 11, 2001, 5:39:53 AM12/11/01
to
Robert C. Martin <rmartin @ objectmentor . com> wrote in message news:<d62b1ukh61hal3170...@4ax.com>...

> >> >Set<int> sides; // data structure allows no duplicates.
> >> >sides.add(a);
> >> >sides.add(b);
> >> >sides.add(c);
> >> >return sides.length(); // 1 = equilateral, 2=isoceles, 3=scalene.
> ^^^^^^^^^--oops.
> >> >
> >> >What am I missing?
> >
>
> > The intent was to return a 1 for equilateral, 2 for isosceles, and 3
> for scalene. Representing the triangle was not part of the job.
>

Sounds like a cop out. You forgot/ignored the "triangle" part of the problem
and just dealt with 3 numbers, returning the number of unique numbers.

There is a lesson in there somewhere for what happens when you
get away from the "real world".

suresh

Stefan Schmiedl

unread,
Dec 11, 2001, 6:36:38 AM12/11/01
to
suresh vv (2001-12-11 19:42):

> Robert C. Martin <rmartin @ objectmentor . com> wrote in message news:<d62b1ukh61hal3170...@4ax.com>...
>

> > The intent was to return a 1 for equilateral, 2 for isosceles, and 3
> > for scalene. Representing the triangle was not part of the job.
> >
>
> Sounds like a cop out. You forgot/ignored the "triangle" part of the problem
> and just dealt with 3 numbers, returning the number of unique numbers.

well, yes, basically that's what the classification of triangles
is about: the number of different sides :-)

>
> There is a lesson in there somwhere for what happens when you
> get away from the "real world".

yup. the lesson is that communication is necessary beyond written specs.
one part of us understood the requirements such that it was *given* that
the numbers are sides of a triangle, hence no validity testing was required.
a second part started with three numbers coming out of nowhere, while
a third fraction started with three arguments that could possibly be
any kind of object.

the problem is definitely not testing, but communication.

Ron Jeffries

unread,
Dec 11, 2001, 8:08:20 AM12/11/01
to
On 11 Dec 2001 02:39:53 -0800, sure...@hotmail.com (suresh vv) wrote:

>> The intent was to return a 1 for equilateral, 2 for isosceles, and 3
>> for scalene. Representing the triangle was not part of the job.
>>
>
>Sounds like a cop out. You forgot/ignored the "triangle" part of the problem
>and just dealt with 3 numbers, returning the number of unique numbers.
>
>There is a lesson in there somwhere for what happens when you
>get away from the "real world".

One of the things we often forget in our zeal to be object oriented is
that the problem is best solved by the most appropriate abstraction.

If the problem is to identify whether a triangle is equilateral,
isosceles, or scalene, it is ONLY the values of the lengths of the
sides that matter. Therefore a collection of three numbers is very
likely the right abstraction.

Just my opinion, of course ...

Ronald E Jeffries
http://www.XProgramming.com
http://www.objectmentor.com

Thaddeus L Olczyk

unread,
Dec 10, 2001, 11:45:19 AM12/10/01
to
On Mon, 10 Dec 2001 23:38:57 -0800, "Kent Beck" <kent...@my-deja.com>
wrote:

>> 1. What about oddities in the source code? Like:
>>
>> if (any side ==4) {
>> return 1;
>> }
>>
>> This would cause it to fail for the test case 1,2,4 and your tests
>> would miss that.
>
>Don't put oddities like this in your source code. Working strictly test
>first, there would have to be a test that was failing before the conditional
>above was written.

Tsk. Tsk. Where have you been? This technique for fixing a bug is one
of the most attractive techniques for fixing a bug there is. Even
highly intelligent programmers who do know better occasionally become
tempted. Albeit under more subtle circumstances.

In fact one of the most renowned bugs in Visual C++ 4.2 was caused by
precisely this attitude. There was a bug in the string class that
caused the string to crash on deletion. The conditions of the bug were
that you assigned a string of length 32 or more, then assigned a
small string. When the string assigned to was deleted, your program
would crash. Someone fixed this problem by changing the string
class so that the assigned-to string's reference count was not
decremented on assignment, resulting in a large memory leak.

While much of the reason for this problem is the shear stupidity of
some programmers and the insistence on *doing the stupid thing*,
some of the reason smarter programmers do it is that it sometimes
makes sense. ( E.g. in a routine where you calculate the cube root
of a number, a value which sits on a strange repulsor should be
handled in this way. )


It should be pointed out that by XP criteria the tests are the specs
and that if those lines of code caused the tests to run successfully,
then the code is up to spec, and the code is not an oddity.

Robert C. Martin

unread,
Dec 11, 2001, 9:13:53 AM12/11/01
to
On Mon, 10 Dec 2001 23:38:57 -0800, "Kent Beck" <kent...@my-deja.com>
wrote:

>The earlier note that I ignored ill-formed triangles (5,5,11) is absolutely
>correct. I missed those test cases. I'll post the updated code and tests
>later.

I think it's just this:
sides.sum() > 2*sides.max()
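
In Java terms, a minimal sketch of that check (the method name and the
int sides are illustrative assumptions, not Martin's code):

class TriangleCheck {
    // the sides form a proper triangle iff their sum exceeds twice the
    // largest side, i.e. the two smaller sides together exceed the
    // largest; note this also rejects zero and negative lengths, since
    // then the other two sides could never exceed the largest
    static boolean isTriangle(int a, int b, int c) {
        int max = Math.max(a, Math.max(b, c));
        return a + b + c > 2 * max;
    }
}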

Robert C. Martin

unread,
Dec 11, 2001, 9:22:08 AM12/11/01
to
On 11 Dec 2001 02:39:53 -0800, sure...@hotmail.com (suresh vv) wrote:

A cop out? Why is a simplification a cop out?

Someone had asserted that it took 20+ test cases to verify a function
that evaluated a triangle. I proposed an algorithm that would likely
require fewer than 22 tests. Kent used that algorithm to propose five
tests. Someone else pointed out that certain illegal triangles pass
the tests, so one more test case is needed.

The whole point here is to show that white box unit test coverage is
strongly dependent upon the design of the system. If you know the
design, then you can significantly prune the tests without losing
coverage.

Jason Voegele

unread,
Dec 11, 2001, 9:42:32 AM12/11/01
to
> Hal E. Fulton (2001-12-11 17:33):
>
>> > would you mind using ruby, as this is ruby-talk? :-)
>>
>> I'll second that, Stefan... though I was wondering whether
>> this was crossposted from elsewhere. Is it?
>>

It's being crossposted from comp.object

> don't know ... but it shows us that there is *lots* of territory
> to be covered ... one solution used C++, another one Smalltalk,
> and it seems that my posted ruby solution went unnoticed in the
> main discussion.... maybe it *is* crossposted and they just
> don't see what I am replying here :-P
>

Presumably, posts to ruby-talk are not being propagated to comp.object, so
it's likely the primary contributors to this thread never saw your message.

Jason


Bob Binder

unread,
Dec 11, 2001, 9:55:24 AM12/11/01
to
Keith Ray <k1e2i3t...@1m2a3c4.5c6o7m> wrote in message news:<k1e2i3t4h5r6a7y-7C...@news.attbi.com>...

There is no evidence that this has happened (or would happen) for an
actual implementation. Measuring coverage requires using a coverage
analyzer or equivalent instrumentation. Simply calling every method in
a (sub)class interface does not guarantee coverage of the statements
in the method implementations. Excluding toy problems, it is
completely impractical to attempt to assess coverage without using an
automated coverage analyzer.

Lest there be any confusion, I should also say that achieving
statement coverage is "adequate" testing *only* in a very narrow
technical sense. It is usually easy to achieve statement coverage and
miss lots of nasty bugs. But it is also very easy to incorrectly
conclude that a "black box" (behaviorally-based) test suite has
reached all statements. In fact, repeated experiments have shown that
without the feedback of a coverage analyzer, you rarely get to more
than two-thirds of the code using only a "black-box" testing approach.

The narrow technical sense of "adequate" is based on the fact that you
*must* at least exercise buggy code to reveal a bug. You have zero
chance of revealing a bug if you don't exercise buggy code. On the
other hand, exercising buggy code does not equal a 100% chance of
revealing a bug.

So, if you can't demonstrate by coverage analysis that your test suite
has reached all the code in the scope of the test, then you have
almost certainly not gotten to everything. This untouched code is
often *more* likely to be buggy. You've probably tried all the
obvious stuff, so what remains can only be reached under unusual,
non-obvious conditions, and you may not have completely worked out how
your app should deal with these conditions (if you had, you'd probably
have run a test to show it off.) The chance of lurking bugs in
un-exercised segments usually increases for mystery code that has been
"reused" or maintained for a long time.

The fact that it may be infeasible or impractical to reach some code
segments, meaning that you can't always get 100% coverage, doesn't
diminish this basic result.

Once you get the hang of it, running a coverage analyzer is about as
hard as running a spell checker -- it is not the god-awful monster
that many people seem to think it is. But just like a document with no
spelling errors is not necessarily a good document, getting 100%
statement coverage with a coverage analyzer doesn't mean you've
necessarily done a good job of testing.

Achieving 100% branch coverage is one of the exit criteria in almost
all of the test design patterns presented in my book. The additional
exit criteria are pattern-specific, and determined by the test model
and probable bugs of the implementation under test. Achieving these
exit criteria is what I consider to be adequate testing.

_________________________________________________________________
Bob Binder                              RBSC Corporation
rbinder@ r b s c.com                    http://www.rbsc.com
312 214-3280 tel / 312 214-3110 fax
Advanced Test Automation - Process Improvement - Test Outsourcing
3 First National Plz, Suite 1400, Chicago, IL 60602

Bob Binder

unread,
Dec 11, 2001, 10:35:20 AM12/11/01
to
Robert C. Martin <rmartin @ objectmentor . com> wrote in message news:<d62b1ukh61hal3170...@4ax.com>...

> On 10 Dec 2001 09:35:08 -0800, rbi...@rbsc.com (Bob Binder) wrote:
>
> >Robert C. Martin <rmartin @ objectmentor . com> wrote in message news:<h2n41ucnnc8ipnsee...@4ax.com>...
> >> On Sat, 08 Dec 2001 00:35:43 -0600, Robert C. Martin <rmartin @
> >> objectmentor . com> wrote:
> >>
> >> >On 7 Dec 2001 02:04:08 -0800, stephe...@motorola.com (Steve Hill)
> >> >wrote:
> >> >
> >> >>What tests would you write for a function that takes 3 numbers (the
> >> >>side lengths of a triangle) , and returns whether the triangle is
> >> >>equilateral, scalene or isosceles.
> >> >
> >> >Set<int> sides; // data structure allows no duplicates.
> >> >sides.add(a);
> >> >sides.add(b);
> >> >sides.add(c);
> >> >return sides.length(); // 1 = equilateral, 2=isosceles, 3=scalene.
> ^^^^^^^^^--oops.
> >> >
> >> >What am I missing?
> >
> >In this example, wouldn't using an int set for the sides prevent
> >representing any triangle with two or more equal lengths -- I think a
> >bag would do the job. Assuming that your set would reject identical
> >values, any test suite which did not include tests like 3,3,4 or 3,3,3
> >would miss that bug.
>
> The intent was to return a 1 for equilateral, 2 for isosceles, and 3
> for scalene. Representing the triangle was not part of the job.
>

OK, this is a clever way to do the classification. But it seems to me
this is only a fragment of the solution to the complete triangle
problem, and conclusions drawn about testing this fragment versus
testing a complete solution are skewed.

Unless Set objects silently reject dups, the client code will have to
have a catch block, which means part of the classification scheme
resides in the client. The client will have to constrain lengths to 1
or greater, and ensure that the sides are constructible for the
isosceles and scalene cases. If not, your code will claim that 1,1,10
and 1,2,10 are sides of a triangle, which they are not. As the
implementation of these constraints is necessary for producing correct
answers, it should be included in the scope of the test suite.
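
For concreteness, here is a hedged Java sketch (ours -- Binder posted
no code) of what folding those constraints into the classifier might
look like, so that the validity rules land inside the scope of the
same test suite:

class Triangle {
    // returns 1 = equilateral, 2 = isosceles, 3 = scalene; rejects
    // nonpositive lengths and non-triangles such as 1,1,10 and 1,2,10
    static int classify(int a, int b, int c) {
        int max = Math.max(a, Math.max(b, c));
        if (a + b + c <= 2 * max)
            throw new IllegalArgumentException(
                "not a triangle: " + a + "," + b + "," + c);
        // a set collapses duplicates, so its size is the classification
        return new java.util.HashSet<>(java.util.List.of(a, b, c)).size();
    }
}

A test suite for this version must then include cases like 1,1,10 and
1,2,10 alongside the equilateral, isosceles, and scalene trio.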

Stefan Schmiedl

unread,
Dec 11, 2001, 10:35:43 AM12/11/01
to
Jason Voegele (2001-12-11 23:39):

> > Hal E. Fulton (2001-12-11 17:33):
> >
> >> > would you mind using ruby, as this is ruby-talk? :-)
> >>
> >> I'll second that, Stefan... though I was wondering whether
> >> this was crossposted from elsewhere. Is it?
> >>
>
> It's being crossposted from comp.object
>

> Presumably, posts to ruby-talk are not being propagated to comp.object, so
> it's likely the primary contributors to this thread never saw your message.
>

is there a special reason why the replies to news messages
do not get forwarded back to the originating newsgroups?

Thaddeus L Olczyk

unread,
Dec 11, 2001, 11:05:13 AM12/11/01
to

You must also add a reporting mechanism for the violation of these
tests. And then you need to add more tests for the reporting
mechanism.

Kent Beck

unread,
Dec 11, 2001, 6:57:07 PM12/11/01
to

Kent Beck <kent...@my-deja.com> wrote in message
news:9v4dab$m80$1...@suaar1ab.prod.compuserve.com...

> The earlier note that I ignored ill-formed triangles (5,5,11) is absolutely
> correct. I missed those test cases. I'll post the updated code and tests
> later.

The test is:

TriangleTest>>testIrrational
    [self evaluate: 1 side: 2 side: 3]
        on: Exception
        do: [:ex | ^self].
    self fail

The fix is:

TriangleTest>>evaluate: aNumber1 side: aNumber2 side: aNumber3
    | sides |
    sides := SortedCollection
        with: aNumber1
        with: aNumber2
        with: aNumber3.
    sides first <= 0 ifTrue: [self fail].
    (sides at: 1) + (sides at: 2) <= (sides at: 3) ifTrue: [self fail].
    ^sides asSet size

Which points out that the original test for scalene triangles is not
well formed, so we change the data to:

TriangleTest>>testScalene
    self assert: (self evaluate: 2 side: 3 side: 4) = 3

More test cases, anyone?

Kent


suresh vv

unread,
Dec 12, 2001, 3:33:16 AM12/12/01
to
Robert C. Martin <rmartin @ objectmentor . com> wrote in message:

> A cop out? Why is a simplification a cop out?

Here is your quote:

> >> The intent was to return a 1 for equilateral, 2 for isosceles, and 3
> >> for scalene. Representing the triangle was not part of the job.

Each of the return values 1, 2, and 3 indicates a type of triangle.
Checking that the inputs actually form a triangle is definitely part
of the job.

Although the original thread was talking about testing, my comment
was on implementing the "detect triangle type" method as you had done it.

suresh

Tom Adams

unread,
Dec 12, 2001, 8:07:45 AM12/12/01
to
"John Roth" <john...@ameritech.net> wrote in message news:<u1b0ic1...@news.supernews.com>...

> "Tom Adams" <tada...@aol.com> wrote in message
> news:793af3df.01121...@posting.google.com...
> > "Kent Beck" <kent...@my-deja.com> wrote in message
> news:<9v1ss3$adi$1...@suaar1aa.prod.compuserve.com>...
> >
> > > (Kent's proposed 5-test-case solution to Myers' Triangle problem was
> here)
> >
> > 1. What about oddities in the source code? Like:
> >
> > if (any side ==4) {
> > return 1;
> > }
> >
> > This would cause it to fail for the test case 1,2,4 and your tests
> > would miss that.
> >
> > You would need white-box testing to catch this, or you would need
> > more, perhaps many more, black-box tests to cover all the possible
> > internals of the function's code.
>
> You're violating the assumption that we're talking about XP. All
> production code should be written from the unit tests, so such a
> randomly idiotic test shouldn't be in there in the first place. If it
> was, the pair should have caught it, and the next person to look
> at the function should catch it.

Well, "written from unit test" can't be the whole answer. I could
easily write code from Ken's unit tests that would pass only those
test. The is something else. It has to be written more in a more
general fashion to pass the test and meet the specification.


My specific example does seem unlikely, but such things can happen in
more subtle ways. For instance, one could be trying to reuse some
function that has limitations. Your code passes Kent's tests but
fails some other valid test.

I started this thread to question the use of the term "comprehensive
tests" for the testing that Kent proposed. Perhaps his set of tests
implies the existence of an implementation meeting the spec that is
completely tested by those test cases; he is assuming that
implementation. That might work, but proving that the tests, the
specification, and the code have this relationship is probably as
difficult as proving that the program is correct without any testing.

Tom Adams

unread,
Dec 12, 2001, 8:15:50 AM12/12/01
to
rbi...@rbsc.com (Bob Binder) wrote in message news:<ded6b237.01121...@posting.google.com>...

> Keith Ray <k1e2i3t...@1m2a3c4.5c6o7m> wrote in message news:<k1e2i3t4h5r6a7y-7C...@news.attbi.com>...
> > In article <ded6b237.01121...@posting.google.com>,
> > rbi...@rbsc.com (Bob Binder) wrote:
> >
> > > BTW, the apparent meaning of "adequate testing" in this thread is not
> > > consistent with the definition used in the testing community for over 15
> > > years: adequate testing must *at least* achieve statement coverage.
> > > There are many other possible definitions of "adequate testing", but
> > > if your tests don't at least reach all the code in the IUT, you can't
> > > claim to be doing adequate testing. Although any definition of
> > > "adequate testing" which attempts a more stringent criteria can be
> > > falsified, many unambiguous non-subjective criteria for adequacy have
> > > been useful in practice, but they don't include "I say so".
> >
> > Didn't the tests in the examples by Kent and Uncle Bob have 100%
> > statement coverage, so by your definition, they are "adequate"?
>
> There is no evidence that this has happened (or would happen) for an
> actual implementation.

I have a theory about Kent's test-first method. I think that he is
trying to write a set of tests that is adequate, even complete, for
some implementation. The idea is that there exists such an
implementation and he is assuming that that is the one that will be
written after the tests. That is the only way I can make sense out of
the fact that XPers keep referring to this kind of testing as
comprehensive.

Piers Cawley

unread,
Dec 12, 2001, 7:06:46 AM12/12/01
to
"Kent Beck" <kent...@my-deja.com> writes:

> Kent Beck <kent...@my-deja.com> wrote in message
> news:9v4dab$m80$1...@suaar1ab.prod.compuserve.com...
>> The earlier note that I ignored ill-formed triangles (5,5,11) is absolutely
>> correct. I missed those test cases. I'll post the updated code and tests
>> later.
>
> The test is:
>
> TriangleTest>>testIrrational
> [self evaluate: 1 side: 2 side: 3]
> on: Exception
> do: [:ex | ^self].
> self fail
>
> The fix is:
>
> TriangleTest>>evaluate: aNumber1 side: aNumber2 side: aNumber3
> | sides |
> sides := SortedCollection
> with: aNumber1
> with: aNumber2
> with: aNumber3.
> sides first <= 0 ifTrue: [self fail].
> (sides at: 1) + (sides at: 2) <= (sides at: 3) ifTrue: [self fail].

                                ^^
Shouldn't that be '<'? Otherwise (1,1,2) is a valid triangle, which
seems unexpected.
> ^sides asSet size

Bob Binder

unread,
Dec 12, 2001, 8:54:33 AM12/12/01
to
"Kent Beck" <kent...@my-deja.com> wrote in message news:<9v4dab$m80
[snip]
> Use a sensible language where integers act like integers, not like 32-bit
> counter with rollover. If you want to write a Java version of the function,
> and show what happens with 2^31, feel free to write those test cases.
>
Surely there's a limit at some size -- could you report to us how many
milliseconds and the value of i at which your Smalltalk of choice
complains if you do something like

int i, j;
int maxInt = 0x7FFFFFFF;
j = 2;
for (i = 1; i < maxInt; ++i) { j = j * 2; }

Please excuse my use of this barbaric tongue to state the
requirements, but I was never able to abide the mental contortion that
Smalltalk syntax imposed on my feeble brain. And please feel free to
run up maxInt to infinity and beyond, if my guess doesn't do the job.

When you've found out what the limit L is in your environment, then
add 3 tests: {L,L,L}, {L-1,L,L}, {L,L-1,L-2}. All are valid triangles
in the bounds of your representation, and should therefore be
correctly classified.
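
To see why such boundary cases earn their keep, here is a small Java
illustration (ours) of the 32-bit rollover Kent alluded to; with
Smalltalk's arbitrary-precision integers the first comparison would
simply be true:

class Boundary {
    public static void main(String[] args) {
        int L = Integer.MAX_VALUE;
        // L-1, L, L is a perfectly valid triangle, but the naive int
        // check overflows: L + (L - 1) wraps around to -3
        System.out.println(L + (L - 1) > L);          // prints false
        System.out.println((long) L + (L - 1) > L);   // prints true
    }
}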

Bob Binder

unread,
Dec 12, 2001, 9:08:32 AM12/12/01
to
rbi...@rbsc.com (Bob Binder) wrote in message news:<ded6b237.01121...@posting.google.com>...
> Keith Ray <k1e2i3t...@1m2a3c4.5c6o7m> wrote in message news:<k1e2i3t4h5r6a7y-7C...@news.attbi.com>...
> > In article <ded6b237.01121...@posting.google.com>,
> > rbi...@rbsc.com (Bob Binder) wrote:
> >
> > > BTW, the apparent meaning of "adequate testing" in this thread is not
> > > consistent with the definition used in the testing community for over 15
> > > years: adequate testing must *at least* achieve statement coverage.
> > > There are many other possible definitions of "adequate testing", but
> > > if your tests don't at least reach all the code in the IUT, you can't
> > > claim to be doing adequate testing. Although any definition of
> > > "adequate testing" which attempts a more stringent criteria can be
> > > falsified, many unambiguous non-subjective criteria for adequacy have
> > > been useful in practice, but they don't include "I say so".
> >
> > Didn't the tests in the examples by Kent and Uncle Bob have 100%
> > statement coverage, so by your definition, they are "adequate"?

On further reflection, I'd agree that the proposed test suites achieve
statement coverage for the example code. The examples have no
selection or iteration, so any single test will do the job. This
provides a good example of why statement coverage is a necessary but
not a sufficient condition for adequate testing.

In typical situations with lots of branching and iteration, the tests
needed to achieve statement coverage are not obvious, hence the need
for a coverage analyzer to determine that statement coverage (better
yet, branch coverage) has actually been achieved.
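
A toy Java example (ours) of that gap, echoing the <=-versus-<
question raised elsewhere in this thread: a single call executes every
statement below, so statement coverage is 100%, yet the boundary bug
is only revealed by a degenerate case the coverage figure never asks
for:

class Guard {
    // intended: true iff the sides form a proper triangle;
    // the bug: >= should be >, so the degenerate 1,1,2 slips through
    static boolean wellFormed(int a, int b, int c) {
        int max = Math.max(a, Math.max(b, c));
        return a + b + c - max >= max;
    }
    public static void main(String[] args) {
        System.out.println(wellFormed(3, 4, 5)); // true; every statement
                                                 // has now been executed
        System.out.println(wellFormed(1, 1, 2)); // true, but should be
                                                 // false -- the missed bug
    }
}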

Bob Binder

unread,
Dec 12, 2001, 9:59:14 AM12/12/01
to
Robert C. Martin <rmartin @ objectmentor . com> wrote in message news:<i85c1u0fe2mhlsvjl...@4ax.com>...

> On 11 Dec 2001 02:39:53 -0800, sure...@hotmail.com (suresh vv) wrote:
[snip]

>
> Someone had asserted that it took 20+ test cases to verify a function
> that evaluated a triangle. I proposed an algorithm that would likely
> require less than 22 tests. Kent used that algorithm to propose five
> tests. Someone else pointed out that certain illegal triangles pass
> the tests, so one more test case is needed.
>
> The whole point here is to show that white box unit test coverage is
> strongly dependent upon the design of the system. If you know the
> design, then you can significantly prune the tests without losing
> coverage.

I agree with this notion in general. Effective testing must be guided
by a guess about what bugs are likely, what kind of bugs we're going
after, or what we suspect might go wrong. If we know that the
implementation under test (IUT) does not have any branching, then we
can "prune" tests predicated on the assumption that a bug exists in
conditional logic. This is the only rational strategy for dealing
with the fundamental problem of test design: out of the infinite
number of tests (input and sequence combinations) we could run, which
should we do? Since most of those combinations will work correctly,
we make best use of available testing time when we focus on tests
which have the best chance of revealing a bug.

If knowledge about the IUT helps, that's fine. But there are dangers
in using that knowledge. First, it is hard for any one person to be
the judge, jury, prosecutor, and defendant -- the developer mindset
usually is rooted in the defense. Software has a unique ability to
make fools and liars out of even the best developers. Second, code
does what it does -- it is not necessarily correct and complete, even
if it runs without crashing. Third, code is a notoriously bad
repository of requirements and implementation assumptions -- this
information is critical for designing tests. Fourth is what Beizer
calls the Pesticide Paradox: effective test suites and test strategies
quickly kill all the bugs they can, but are then no good at preventing
or revealing residual bugs. A common result of this is that a
once-stable system, for which a regression suite is frequently run,
suddenly has a big increase in bug rates.

So, I think the goal should not be to minimize the test suite, but to
maximize your chance of finding bugs.

Patrick May

unread,
Dec 12, 2001, 12:56:52 PM12/12/01
to
> You must also add a reporting mechanism for the violation of these
> tests. And then you need to add more tests for the reporting
> mechanism.

Do you mean JUnit?

Panu Viljamaa

unread,
Dec 13, 2001, 6:10:13 AM12/13/01
to
Ron Jeffries wrote:

> The usual Smalltalk convention is to write a method definition like
> this:
> insert: anObject into: aCollection

This is exactly the same convention I'm talking about, and am
advocating. I'm simply saying we should take it a bit more seriously,
as a semi-official standard for dynamically typed languages like
Smalltalk and Ruby.

A proposed formalization of this convention is at
http://members.fcc.net/panu/4rules.htm , which expands it a bit by
adding ways to express *return types* and 'generic types' as well.

-Panu Viljamaa

Tom Adams

unread,
Dec 13, 2001, 8:12:38 AM12/13/01
to
rbi...@rbsc.com (Bob Binder) wrote in message news:<ded6b237.01121...@posting.google.com>...
> "Kent Beck" <kent...@my-deja.com> wrote in message news:<9v4dab$m80
> [snip]
> > Use a sensible language where integers act like integers, not like 32-bit
> > counter with rollover. If you want to write a Java version of the function,
> > and show what happens with 2^31, feel free to write those test cases.
> >
> Surely there's a limit at some size -- could you report to us how many
> milliseconds and the value of i at which your Smalltalk of choice
> complains if you do something like
>
> int i, j;
> int maxInt = 0x7FFFFFFF;
> j = 2;
> for (i = 1; i < maxInt; ++i) { j = j * 2; }
>
> Please excuse my use of this barbaric tongue to state the
> requirements, but I was never able to abide the mental contortion that
> Smalltalk syntax imposed on my feeble brain. And please feel free to
> run up maxInt to infinity and beyond, if my guess doesn't do the job.
>
> When you've found out what the limit L is in your environment, then
> add 3 tests: {L,L,L}, {L-1,L,L}, {L,L-1,L-2}. All are valid triangles
> in the bounds of your representation, and should therefore be
> correctly classified.
>

With the triangle problem, the function might run out of memory when it
tries to sum two of the sides for very large integers.

Kent Beck has not addressed this issue in his "comprehensive" set of tests.

Phlip

unread,
Dec 13, 2001, 5:05:39 PM12/13/01
to
> While much of the reason for this problem is the shear stupidity...

I thought it was caused by "sheer stupidity"...



> It should be pointed out that by XP criteria the tests are the specs
> and that if those lines of code caused the tests to run successfully,
> then the code is up to spec, and the code is not an oddity.

The rules (presented for anyone who actually cares what they are):

"The right design for the software at any given time is the one that

1. Runs all the tests.
2. Has no duplicated logic.
3. States every intention important to the programmers.
4. Has the fewest possible classes & methods."

Code may easily suck but pass item 1. That's why the others exist.
Code that back-patches to cover over a bug (in any way conceivable)
violates 2 and 4.

This is an absolutely correct example of "emergent design". Overly
anal architects may spend their time adding rules, such as "No
backpatching to cover bugs." But a set of reduced instructions beats a
set of deliberately enhanced ones every time.
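
For instance (our sketch, reusing the "any side == 4" oddity from
earlier in this thread), a back-patch can pass rule 1 while flunking
rules 2 and 4:

class BackPatched {
    // 1 = equilateral, 2 = isosceles, 3 = scalene
    static int classify(int a, int b, int c) {
        if (a == 4 || b == 4 || c == 4)   // back-patch: makes one failing
            return 1;                     // report "pass" while duplicating
                                          // logic and adding an unmotivated
                                          // special case
        return new java.util.HashSet<>(java.util.List.of(a, b, c)).size();
    }
}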

--
Phlip phli...@yahoo.com
http://www.greencheese.org/AndNowaNitelyLitelyUrbanOne
-- Who needs a degree when we have Web e-search engines? --

Marc Gluch

unread,
Dec 14, 2001, 10:06:26 AM12/14/01
to
Ron Jeffries <ronje...@REMOVEacm.org> wrote in message news:<74CECE8B145DE9C3.0F450343...@lp.airnews.net>...
> On Sun, 02 Dec 2001 18:03:12 -0500, Panu Viljamaa <pa...@fcc.net>
> wrote:
>
> >Yes I have. What happened was I find it hard to understand how to
> >use a method written by someone else when there is no indication of
> >what kind of arguments should be passed to it. It is also hard to say
> >if it is right or wrong when there is no clearly written contract on
> >"what it should do". This might be specified by some test-case-class
> >somewhere, but I'd rather have the 'contract' of the method visible as
> >soon as I view the method's source, so I don't have to go test-case
> >hunting.

>
> The usual Smalltalk convention is to write a method definition like
> this:
>
> insert: anObject into: aCollection
>
> which means about the same thing as
>
> void insert( Object o, Collection c)
>
> The name of the method, insert:into: is chosen to reflect what the
> method should do.

I'm all for intention-revealing selectors (and more generally,
intention-revealing names), but you left out a discussion
of the tradeoffs in naming parameters after their types
vs. their roles in the method.

Furthermore, your example raises a question about how revealing
this particular selector is: the Collection class already has a
selector for inserting elements. Seeing a method such as

insert: anObject into: aCollection
    aCollection add: anObject

makes me question the quality of the design.
More likely #insert:into: does more than insert,
but then it should have a different name --
one that reflects the semantics of the domain.

Marc Gluch
Mindtap Inc.

Dave Harris

unread,
Dec 14, 2001, 3:36:00 PM12/14/01
to
rbi...@rbsc.com (Bob Binder) wrote (abridged):

> There is no evidence that this has happened (or would happen) for an
> actual implementation. Measuring coverage requires using a coverage
> analyzer or equivalent instrumentation. Simply calling every method in
> a (sub)class interface does not guarantee coverage of the statements
> in the method implementations. Excluding toy problems, it is
> completely impractical to attempt to assess coverage without using an
> automated coverage analyzer.

I mostly agree with this, and it bothers me that XP advocates don't seem
to pay much attention to coverage tools.

However, it may not be as bad as you think because "test first"
programming tends to lead to much higher coverage than after-the-fact
testing. In XP, production code is only written to fix a failing unit
test. Therefore the test should execute all parts of the new code.

This takes discipline, and is part of what they mean when they talk about
programming in tiny steps, and avoiding anticipation. Don't write code
which you think will be needed to satisfy the next test; do not write any
code which will not be executed by the current set of tests. Always write
the test before the code, and verify that the test /fails/ without the
code so you know the code must be executed for the test to pass.
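
In JUnit 3 terms the cycle might look like this (a sketch; the
Triangle.classify method is the hypothetical example discussed in this
thread, not an existing library):

import junit.framework.TestCase;

public class TriangleTest extends TestCase {
    public void testIsosceles() {
        // written first, and seen to fail, before classify handles it
        assertEquals(2, Triangle.classify(5, 5, 8));
    }
    public void testNotATriangle() {
        try {
            Triangle.classify(1, 2, 10);
            fail("expected an ill-formed triangle to be rejected");
        } catch (IllegalArgumentException expected) {
            // this rejection path exists only because this test demanded it
        }
    }
}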

I don't expect this gives 100% coverage, and as I say it bothers me that
we don't seem to have hard figures about what percentage is actually
reached in practice. It will give much higher coverage than "test last"
and I wouldn't be surprised if the coverage was higher than 95%. (I
appreciate your comments that a remaining 5% will have disproportionately
more bugs.)

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Ron Jeffries

unread,
Dec 14, 2001, 7:16:27 PM12/14/01
to
On 11 Dec 2001 06:55:24 -0800, rbi...@rbsc.com (Bob Binder) wrote:

>There is no evidence that this has happened (or would happen) for an
>actual implementation. Measuring coverage requires using a coverage
>analyzer or equivalent instrumentation. Simply calling every method in
>a (sub)class interface does not guarantee coverage of the statements
>in the method implementations. Excluding toy problems, it is
>completely impractical to attempt to assess coverage without using an
>automated coverage analyzer.

Bob, thanks for a comprehensive report on coverage. I am a bit
surprised, however, because you seem to me to be (sort of) saying
that theory says you can't accomplish what Beck actually accomplished
in practice.

What seems to me to happen when one goes strictly test-first is that
one gets very high coverage, of all branches. There are exceptions ...
and exceptions are one of them. Often in Java you can't even get a
statement to compile unless you embed it in a try block. This forces
the programmer to code both sides of the block while he only has a
test for one side.

Overall, however, the practice gets very high coverage, which (while I
agree it is at best the beginning, not the end of good testing) is
very much higher than programmers "generally" accomplish.

What you didn't much address is whether this little example needs more
tests or not. In my view, it does not (after the one for not a
triangle, which has been provided). It certainly seems not to require
22.

With all due respect, your theoretical answer seems to me to try to
sweep aside an interesting practical result, of which this toy is an
example: the test-first practice seems to produce code which needs far
fewer tests than black-box theory would suggest. It might just be an
artifact of the example, except that many of us who are using the
practice observe that it works well all around.

My belief is that test-first is in some sense a different kind of
testing and programming from the "Joe codes it and Jack tests it" kind
that we have used in the past.

I could be wrong ... I frequently am,
