New article: What Really Is TDD?


AlexBolboaca

May 18, 2015, 10:29:12 AM
to software_cr...@googlegroups.com
Hi,

Given the periodic discussions about TDD, I felt the urge to put down my definitive answer on what TDD is. You can read it here: http://www.alexbolboaca.ro/wordpress/my-take-on/what-is-really-tdd

Here's the short version:
  • Design is intentionally conceiving and giving form to artifacts that solve problems
  • Computer code is such an artifact, therefore any piece of code that intentionally solves a problem is designed
  • Therefore TDD is a method for obtaining design
  • Good design means design that has certain qualities. The most common quality we seek today is changeability.
  • TDD offers some built-in qualities: testability and improved mistake-proofing. The developer has to work to improve other qualities such as changeability. This is why practitioners use SOLID principles to guide their design decisions.
  • Therefore the qualities of the design obtained through TDD largely depend on the skills of the designer
  • When doing TDD, the developer designs before starting (e.g. by choosing an MVC web framework) and throughout the TDD cycles: when writing the test (picking class and method names, deciding which types of classes to use, etc.), when implementing the code (variable names) and when refactoring.
  • I propose that TDD is a method for incremental design, since the solution grows step by step. This also relates to problem solving, and the circle closes – because design means solving a problem.
As usual, your comments and questions are most welcome.

I hope we can all build on this and have a clearer view on why we do what we do.

Thanks,
Alex

Tim Ottinger

May 18, 2015, 12:47:18 PM
to software_cr...@googlegroups.com
Not to mention the frequently recurring opportunity (if not *requirement*) to integrate and refactor what you have written. 


Philip Schwarz

May 19, 2015, 2:22:58 AM
to software_cr...@googlegroups.com
>Given the periodic discussions that appear about TDD

One such discussion was started recently by Sandro Mancuso when he tweeted the following:

"I believe software design should be taught before TDD. TDD can’t lead to good design if we don’t know what good design looks like."


Sandro and J.B. Rainsberger later had a Google Hangout: TDD and Software Design.

Later, Sandro blogged: Does TDD really lead to good design? 

Philip

Philip Schwarz

May 19, 2015, 3:49:35 AM
to software_cr...@googlegroups.com

I don't understand why in his blog post Sandro says the following:

"4 Rules of Simple Design are NOT part of TDD and I’m purely discussing TDD here. 4 Rules of Simple Design is normally the design guidelines that many experienced TDD practitioners use (including myself, among other techniques) during the refactoring phase."

A couple of days earlier, I tried challenging that notion in the following twitter exchange (the two extracts are from the 1st edition of 'XP explained'):

@sandromancuso @RonJeffries just to make sure I understand your point: do you consider the 4Rules PART of TDD?

@sandromancuso @RonJeffries I'm asking because I understand them as one of the many great recommendations from XP. Something you combine with TDD.

@sandromancuso @RonJeffries and I'll believe in whatever you say, since you were at the core of it. ;)

@RonJeffries @sandromancuso no they are the rules of simple design and nearly any design difficulty shows up as a problem with one or more of them

@sandromancuso @RonJeffries OK. I agree with that. Just wanted to make sure they are not part of TDD but definitely important during refactor.

@philip_schwarz .@sandromancuso @RonJeffries The design strategy advocated by XP is TDD: 1/4  


@philip_schwarz @sandromancuso @RonJeffries The link between TDD and 4RSD is step four: "If you ever see the chance to make the design simpler, do it." 2/4

@philip_schwarz @sandromancuso @RonJeffries XP defines the 'best' design as the 'simplest' design that runs all the test cases 3/4

@philip_schwarz @sandromancuso @RonJeffries XP defines 'simplest' as the following four constraints, in priority order: 4/4 

Sandro didn't respond. What is also strange (for me) is that Ron Jeffries (who is an XP authority) replied to Sandro's question with a 'no'.

Philip

Ron Jeffries

May 19, 2015, 5:21:01 AM
to software_cr...@googlegroups.com
Hi Philip,

On May 19, 2015, at 3:49 AM, Philip Schwarz <philip.joh...@googlemail.com> wrote:

> Sandro didn't respond. What is also strange (for me) is that Ron Jeffries (who is an XP authority) replied to Sandro's question with a 'no'.

I may have lost context. I think the question you refer to was whether the four rules of simple design are part of TDD, and my answer was “no, they are not”. And you find it confusing. I’ll respond to this. If this is the wrong question or answer that you find confusing, we’ll have to try again.

TDD is RED / GREEN / REFACTOR. 
RED is write a failing test
GREEN is make it work.
REFACTOR is clean up the design.

In essence, that’s all that TDD is.
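
One pass through that cycle might look like the sketch below. It is only an illustration in Python: the `leap_year` function, the `check` helper and the intermediate `leap_year_green` version are hypothetical names that appear nowhere else in this thread.

```python
# RED: write the test first. Run at this point it fails, because
# leap_year does not exist yet.
def check(leap_year):
    assert leap_year(2016)
    assert not leap_year(2015)
    assert not leap_year(1900)
    assert leap_year(2000)


# GREEN: make it work, however clumsily.
def leap_year_green(year):
    if year % 400 == 0:
        return True
    if year % 100 == 0:
        return False
    if year % 4 == 0:
        return True
    return False


# REFACTOR: clean up the design while the same test keeps passing.
def leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Both versions satisfy the test; only the design differs.
check(leap_year_green)
check(leap_year)
```

Nothing in the cycle itself says which design criteria drove the clean-up step; that choice is left to the programmer.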

Beck's rules of simple design are the same old 1,2,3,4 as usual. There are other rules of design. SOLID, for example. And there are different rules / guidelines / ideas from different authors and applicable to different architectures and languages. At base, all those ideas are about one impossible-to-define thing: a good design. We have to learn what a good design is. We can learn it by study, by applying rules, by all kinds of means. In the end, we have to learn to recognize a good design the same way we recognize a valid English sentence: we internalize a vast amount of detail.

When we clean up the design in REFACTOR, we look at the design we have, we imagine one that would be better, and we apply a number of small code changes, to move the code from the design we don’t like toward the one that we do, without breaking it. 

We might use the four rules 1,2,3,4 to decide what to do. If so, we’d be using the four rules inside TDD. But we might use other design rubrics, any design ideas that we have.

So the four rules can be used in TDD, but they are not part of TDD. They might be part of your practice: they definitely are part of mine. But they are not TDD. TDD is RED / GREEN / REFACTOR. Rules 1,2,3,4 can be used in there but TDD doesn’t know or care.

Make sense?

Ron Jeffries
www.XProgramming.com
Sometimes you just have to stop holding on with both hands, both feet, and your tail, to get someplace better. 
Of course you might plummet to the earth and die, but probably not: you were made for this.

Sleepyfox

May 19, 2015, 6:08:17 AM
to software_cr...@googlegroups.com
I think there are different developer cultures, and differing ideas of
what constitutes 'design' vs. what constitutes 'programming'.

I personally agree with Jack Reeves that programming software *IS*
design, and that asking 'where does your programming end and your
designing begin', or vice versa, is a clear case of 'mu'.

http://www.hacker-dictionary.com/terms/mu

I suspect that the part of the conversation between Sandro and J.B. on
Twitter that is linked above is a good example of this: where you see
one person say "A cannot be followed by B" and another say "But I did
A and then B happened" and then the two argue about whether experience
actually does match theory or whether it cannot be counted because of
condition X or whether theory needs extension Y in order to account
for experience or or or or...

This is generally a good indicator of 'mu': that there are incorrect
assumptions about the world at play and rethinking our presuppositions
and our model of the world can be helpful.

"All models are wrong, but some are useful" -- George Box

Nigel Runnels-Moss
@sleepyfox
--

Adam Sroka

May 19, 2015, 6:45:05 PM
to software_cr...@googlegroups.com
I'd like to add a caveat: TDD is also not about object-oriented design. In theory you can do it with any kind of design. However, if you search the literature on TDD and refactoring you will have a hard time finding advice on anything but OO. You will have to figure out a lot of stuff on your own. 

Likewise, nearly every trainer or coach that I know uses Kent's rules (or some derivative.) I don't find a lot of easily accessible information on ways to apply it differently. 

I guess what I am saying is that while it is technically true that these things aren't part of TDD they are the de facto implementation of TDD that has been handed down to the masses. Which is not, I think, a bad thing. 

Raoul Duke

May 19, 2015, 6:47:44 PM
to software_cr...@googlegroups.com
There was a blog post a while back I can never find again, mentioning
a TDD workshop that was trying to enforce "the simplest thing that
could possibly work" with an underlying motive to teach (pure)
functional programming as a design meme.

Adam Sroka

May 19, 2015, 6:54:00 PM
to software_cr...@googlegroups.com
The exception that proves the rule. There are guys doing it, conference talks, mentions in books, etc. But when Joe Manager calls to get his Scrum team trained in "Engineering Practices" they do it in Java and see slides on the Four Rules. Or when I go out and buy a book to teach myself the examples are in Java (or C# or Ruby) and the Four Rules are described (or alluded to) repeatedly. 

Raoul Duke

May 19, 2015, 7:12:31 PM
to software_cr...@googlegroups.com
> The exception that proves the rule.

Yeah, I meant to agree: people should consider using it in other ways and
for other purposes and goals.

Philip Schwarz

May 25, 2015, 1:22:24 PM
to software_cr...@googlegroups.com

Hello Ron,

thanks for your reply.

First of all, can I state what is hopefully obvious and say that what drove me to contribute to this thread the way I did is not a desire to prove anyone wrong, but the desire to learn more about TDD and the four rules.

What I find confusing is that there is an assertion being made that the four rules of simple design are not part of TDD, but at the same time I believe XP Explained 1st edition asserts the opposite.

The book says the XP design strategy is the following:

The above seems to me to be the RED-GREEN-REFACTOR cycle, the last step being the refactoring step and having simple design as its target.

The book also defines the 'best' design as the 'simplest' design that runs all the test cases. So the best design that the TDD cycle can arrive at is the simplest design that passes the first of the four rules of simple design, i.e. 'Pass all tests':

The book defines simple design as follows:

Points 1 and 2 are the second and third of the four rules of simple design, and points 3 and 4 are the fourth of the four rules.

I find that logic dictates that from the above we must conclude that the 4 rules of simple design are part of TDD.

What do you think?

Philip

Philip Schwarz

May 25, 2015, 2:33:24 PM
to software_cr...@googlegroups.com
Ron,

one more attempt at arguing that the 4 rules of simple design (or at least some of them) are part of TDD. 

In what is effectively the first page of Kent Beck's Test Driven Development book (the first page of the Preface), we read the following:

In TDD we:
  • Write new code only if an automated test has failed
  • Eliminate duplication
and a few lines later:

The two rules imply an order to the tasks of programming.
  1. Red: Write a little test that doesn't work, and perhaps doesn't even compile at first.
  2. Green: Make the test work quickly, committing whatever sins necessary in the process.
  3. Refactor: Eliminate all of the duplication created in merely getting the test to work.
Red/green/refactor—the TDD mantra.

and on page seven:

Remember, the cycle is as follows:
  • Add a little test.
  • Run all tests and fail.
  • Make a little change.
  • Run the tests and succeed.
  • Refactor to remove duplication.
and in the Pluggable Object pattern section of the book (page 172), we read:

the second imperative of TDD is the elimination of duplication

I think the above excerpts make it clear that the removal of duplication is central to TDD.
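
A small sketch of what "eliminate duplication" can mean in the refactor step, loosely modelled on the multiplication example that opens Beck's book. The Python classes below are an illustration, not a quotation, and the names `DollarGreen`, `Dollar` and `check` are assumptions made for this sketch.

```python
# Test written first: five dollars times two should be ten.
def check(dollar_class):
    five = dollar_class(5)
    assert five.times(2).amount == 10


# Green, committing sins: the 10 is hard-coded, so the same fact
# (5 * 2 == 10) now lives in both the test and the code.
class DollarGreen:
    def __init__(self, amount):
        self.amount = amount

    def times(self, multiplier):
        return DollarGreen(10)


# Refactor: the duplicated constant is replaced by the computation
# it stood for, so the knowledge lives in one place.
class Dollar:
    def __init__(self, amount):
        self.amount = amount

    def times(self, multiplier):
        return Dollar(self.amount * multiplier)


# The single test cannot tell the two versions apart.
check(DollarGreen)
check(Dollar)
```

The duplication here is between the test data and the production code rather than between two pieces of production code, which is the kind the preface excerpt points at.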

By the way, as a very brief reminder of why duplication removal is so important, let me just quote the following two statements from the 'Dependency and Duplication' section on page 8:

 
Dependency is the key problem in software development at all scales
...
If dependency is the problem, duplication is the symptom

Next, I'd like to quote the following two excerpts from J.B.Rainsberger's blog post at http://blog.thecodewhisperer.com/2013/12/07/putting-an-age-old-battle-to-rest/

I don’t think it matters whether you focus first on removing duplication or on revealing intent/increasing clarity, because these two guidelines very quickly form a rapid, tight feedback cycle. By the time the guidelines guide you to any useful results, you’ll have probably used them both. Therefore, order the rules however you like, because you’ll get to the same place either way.

 
The Four Elements of Simple Design Revisited
I have been teaching for years about how to reduce the four elements of simple design to two: after several months, I don’t think about writing tests any more—I simply call that “programming”—and I’ve never seen a well-factored code base that had an order of magnitude too many elements. With these two points out of the way, I guide my design with two basic rules: remove duplication and improve names. I’ve started thinking about these guidelines a little differently.
Now, I think of them as a single guideline: remove duplication and improve names in small cycles. When I do this, I produce a higher proportion of well-factored code compared to all the code I write. I use tests to clarify the goal of my code and to put strict limits on how much code I write.

Maybe when Kent Beck wrote that rules 2 (expresses intent) and 3 (no duplication) together are the Once and Only Once Rule he was in part expressing a similar notion to Rainsberger's, that the two rules are so tightly related that they can be considered one.

So maybe it can be argued that the first of the four rules (passes all tests) is clearly part of TDD; that the last rule (as few elements as possible) is part of TDD because no one using TDD to produce well-factored code can hope to do so while breaking that rule; and that the two middle rules are part of TDD because the third one (remove duplication) clearly is, by Kent Beck's definition of TDD, and the second one (expresses intent) is so complementary and tightly related to it that it cannot help also being part of TDD.

Philip

Ron Jeffries

May 25, 2015, 3:09:31 PM
to software_cr...@googlegroups.com
Hi Philip,

On May 25, 2015, at 1:22 PM, Philip Schwarz <philip.joh...@googlemail.com> wrote:

> I find that logic dictates that from the above we must conclude that the 4 rules of simple design are part of TDD.
>
> What do you think?

I think they are two things, linked. 

In addition, let’s look at what they are, as well. One is a testing ritual, the other is a set of design criteria. They seem to me to be quite independent in both focus and utility.

Ron Jeffries
ronjeffries.com
In times of stress, I like to turn to the wisdom of my Portuguese waitress,
who said: "Olá, meu nome é Marisol e eu serei sua garçonete."





Ron Jeffries

May 25, 2015, 3:15:17 PM
to software_cr...@googlegroups.com
Hi Philip,

One could always ask Kent what he intended. The ideas seem to me to be independently useful and so even if he wrote them together for some didactic purpose, the single responsibility principle argues for separating them. 

On May 25, 2015, at 2:33 PM, Philip Schwarz <philip.joh...@googlemail.com> wrote:

> So maybe it can be argued that the first of the four rules (passes all tests) is clearly part of TDD; that the last rule (as few elements as possible) is part of TDD because no one using TDD to produce well-factored code can hope to do so while breaking that rule; and that the two middle rules are part of TDD because the third one (remove duplication) clearly is, by Kent Beck's definition of TDD, and the second one (expresses intent) is so complementary and tightly related to it that it cannot help also being part of TDD.

I see no advantage to binding the ideas together as a matter of course, since they’re perfectly useful separately. That said, if you find it useful, go for it. :)

Ron Jeffries
You never know what is enough unless you know what is more than enough. -- William Blake

Gregory Salvan

May 25, 2015, 6:16:25 PM
to software_cr...@googlegroups.com
Hi,
maybe a case where you can follow the 4 rules of simple design without doing TDD is embedded software.
You don't necessarily have an emulator or simulator to run your tests fast and get the rapid feedback required for designing from tests.

Matteo Vaccari

May 25, 2015, 10:46:16 PM
to software_cr...@googlegroups.com

But but but..... is it always good to remove duplication?  For instance, when you have two separate applications or services, is it always a good idea to create a "commons" lib of shared code? I usually find that the shared lib gets in the way, making it more difficult to deploy the apps separately. I think that in some cases it's better to just have two separate copies of some code and let them evolve separately.

I have some support for this idea from Sam Newman's microservices book, and Dan North's motto "DRY is the enemy of decoupled".

What do you all think?

--
Sent from my mobile phone.

Gregory Salvan

May 26, 2015, 5:03:19 AM
to software_cr...@googlegroups.com
Hi Matteo,

There are a lot of concepts behind TDD, and I think it is useless to bind them together except when teaching.
Duplicated code is tightly coupled to code smells and their corresponding refactorings; at least that's what I have in mind when reading "remove duplication". That means there are identified cases where it is better to avoid duplication, and other cases where you have to make a choice. It's a question of context and experience, not a dogma.

Ron Jeffries

May 26, 2015, 6:07:18 AM
to software_cr...@googlegroups.com
Matteo,

On May 25, 2015, at 10:46 PM, Matteo Vaccari <matteo....@gmail.com> wrote:

> But but but..... is it always good to remove duplication?  For instance, when you have two separate applications or services, is it always a good idea to create a "commons" lib of shared code? I usually find that the shared lib gets in the way, making it more difficult to deploy the apps separately. I think that in some cases it's better to just have two separate copies of some code and let them evolve separately.
>
> I have some support for this idea from Sam Newman's microservices book, and Dan North's motto "DRY is the enemy of decoupled".
>
> What do you all think?

In a single program, I always remove duplication as soon as I find it, if I see how. Often, if I have forked the program into two locations or services, I do not. I find that this creates an undesirable situation for me, namely that I have two or more programs that need maintenance, and only the one I’m presently working on gets the benefit.

If the change is a bug fix or speed improvement, it is troubling to lose that in the applications containing the older version. Often the bug really needs to be fixed in the other versions, even if it has not shown up. Often, the performance improvement would improve the whole ecosystem. But, usually, I do not take the time to do it, because all the old versions would have to be built from source.

Now, on the other hand, I have been bitten more than once when I’ve allowed a system to be built with an external library. (Jekyll, I’m looking at you.) Changes sometimes break my system. This is why we have Bundler and similar packages.

The real issue with the change in Jekyll version, however, is that they do not have a sufficient suite of tests. If they did, and in particular if they had some tests pertaining to my use of their library, they’d not release something that would break me.
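
The same idea can also be applied from the consumer's side: keep a few tests that pin down exactly the library behaviour one's own code relies on, and run them before accepting a new version. A minimal sketch in Python, using the standard library's `string.Template` purely as a stand-in for an external dependency such as Jekyll; the `render_title` helper and the test name are hypothetical.

```python
from string import Template


def render_title(page_title, site_name):
    """Our code: depends on Template's $-substitution behaviour."""
    return Template("$title | $site").substitute(title=page_title, site=site_name)


def test_library_behaviour_we_rely_on():
    # Pins the substitution syntax our templates depend on.
    assert Template("$a-$b").substitute(a="x", b="y") == "x-y"
    # Pins the error raised when a placeholder has no value.
    try:
        Template("$a").substitute()
    except KeyError:
        pass
    else:
        raise AssertionError("expected KeyError for a missing value")


# Run against the currently installed version of the library.
test_library_behaviour_we_rely_on()
```

Such tests do not stop an upstream release from changing behaviour, but they make the breakage show up at upgrade time rather than in production.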

There are probably times when removing duplication may not be the thing to do. When code is evolving, removing duplication is so close to always the thing to do that I just do it always. I find that that works just fine and saves my brain for more important things.

Ron Jeffries
The seemingly easy way of learning — by asking — is not necessarily the best.
When you eventually understand, you will understand fully.
— Dragon
   The Line War
   (Neal Asher)

Matteo Vaccari

May 26, 2015, 1:07:36 PM
to software_cr...@googlegroups.com
Hi Gregory,

I don't understand what you mean by "to bind them except when teaching".

Context and experience certainly trump a lot of things, but I'm concerned with the idea that all, or almost all, important design ideas are a consequence of the Four Rules.  I think that the descriptions that J.B. Rainsberger and Corey Haines made of the Rules made them easier to understand.  All the same, being expressed as absolutes makes them a bit dangerous.  The way they are expressed, they SEEM to imply that

"Keep removing duplication and you'll be fine in the end"

While they would probably be better expressed as

"Keep removing duplication, observe your results, and if they are not good then try something else"

or

"Removing duplication is usually the good thing to do IN CONTEXTS X and Y"

What is often lacking when people speak of DRY or "remove duplication" is the observation that

"I remove duplication from A and B into a shared thing C.  Now A and B are coupled through C".  

Matteo






Matteo Vaccari

May 26, 2015, 1:22:41 PM
to software_cr...@googlegroups.com
Hi Ron,

I hear you say that there are situations (like when working on code of a single service) where removing duplication as far as you can is beneficial; and other situations where it is not, like when you have separate services, especially when one of them sees more active development than the others.  

I notice that you seem to be contradicting yourself, because in the first paragraph of your response you say that sometimes, it's OK not to remove duplication.  In the last one, you seem to say the opposite.  But if I read carefully, I notice that you say "when the code is evolving", by which I think you mean that removing duplication within code that is actively evolving is almost certainly good; while going out of your way to remove duplication from code that works and is not being actively changed may not be a good idea.  So I think that you are not contradicting yourself; you are making a statement that removing duplication is almost always good *in a certain context*.

So my question is: why do we need to be so subtle?  It's very easy to misunderstand statements like this and think "I heard Ron say that removing duplication is the right thing, 99.999% of the time, period."  We miss "... in this context".  I think that we as a community could be a bit more explicit in the discussion of the contexts, limits of applicability, examples and counterexamples of our favourite Rules.

Matteo


Ron Jeffries

May 26, 2015, 2:20:14 PM
to software_cr...@googlegroups.com
Matteo,

On May 26, 2015, at 1:07 PM, Matteo Vaccari <matteo....@gmail.com> wrote:

> The way they are expressed, they SEEM to imply that
>
> "Keep removing duplication and you'll be fine in the end"
>
> While they would probably be better expressed as
>
> "Keep removing duplication, observe your results, and if they are not good then try something else"
>
> or
>
> "Removing duplication is usually the good thing to do IN CONTEXTS X and Y"
>
> What is often lacking when people speak of DRY or "remove duplication" is the observation that
>
> "I remove duplication from A and B into a shared thing C.  Now A and B are coupled through C".

This particular observation is really close to false. Let’s look at the kinds of coupling:

  • Content coupling, relying on internal implementation? No, reliance is on external behavior only.
  • Common coupling, sharing global data? No, extracting a method does not result in data sharing.
  • External coupling, sharing a protocol? Maybe, in the sense that using SQRT instead of writing one in line does so. If it’s one method, all it really shares is a calling sequence.
  • Control coupling, one module telling another what to do? Only in the trivial sense that calling SQRT tells SQRT to calculate square root.
  • Stamp Coupling, sharing parts of a data structure? Not generally. One could do that but it’s a choice, not an inevitable result.
  • Data coupling, sharing data? Yes, in the exact sense that a float is shared with SQRT, which is no issue at all.
  • Message coupling, sending messages? Yes. That’s how you program objects, sending messages about.
  • way down the list …
  • These two things are “coupled” because they both do the same operation the same way. But that’s what we WANT: everyone who does the same operation SHOULD do it the same way.
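
To make the SQRT comparison concrete, here is a small sketch in Python. The two callers and the extracted `distance` helper are hypothetical names chosen for illustration, not code from anyone in this thread.

```python
import math


def distance(x1, y1, x2, y2):
    """Extracted from two places that each computed this inline."""
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


def shipping_cost(warehouse, customer):
    # Caller A: shares only the calling sequence of distance().
    return 0.5 * distance(*warehouse, *customer)


def is_nearby(a, b, radius=10.0):
    # Caller B: passes in floats and gets a float back, nothing more.
    return distance(*a, *b) <= radius
```

Neither caller knows about the other or about how `distance` works; what they share is the operation itself.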

Bottom line ...

I disagree that they are dangerous. What is dangerous is the kind of programming so many people do, where they copy and paste code, then modify it in the new location.

I see many people who have not removed enough duplication. I can’t remember ever seeing a real situation where too much duplication removal had taken place. And if it occurred, it takes just one copy-and-paste to back it out. If you want to. Which, generally, there’s no reason to do.

It might be true that in some cases, if you keep removing duplication, you’ll get in trouble. I challenge you, and everyone, to try to get in trouble that way. It’s very difficult to do. And remember that there isn’t just one rule, there are four.

When you get in trouble, publish the code and let us take a look at it and see what we think.

Thanks,

Ron Jeffries
Perfectionism is the voice of the oppressor -- Anne Lamott

Raoul Duke

May 26, 2015, 2:25:57 PM
to software_cr...@googlegroups.com
> I see many people who have not removed enough duplication. I can’t remember
> ever seeing a real situation where too much duplication removal had taken
> place. And if it occurred, it takes just one copy-and-paste to back it out.
> If you want to. Which, generally, there’s no reason to do.

There's a pessimal thing that can happen where people extract "common"
code but fail to do a good job of it, and that "duplicated" code is
complicated, serving too many masters. So, as with all things, effort
can be expended in the name of X and yet not be done well or really be
seriously X, even though that's what people think it is.

Ron Jeffries

May 26, 2015, 2:26:09 PM
to software_cr...@googlegroups.com
Hello Matteo,

On May 26, 2015, at 1:22 PM, Matteo Vaccari <matteo....@gmail.com> wrote:

> I hear you say that there are situations (like when working on code of a single service) where removing duplication as far as you can is beneficial; and other situations where it is not, like when you have separate services, especially when one of them sees more active development than the others.

What I tried to say was that if I have two separate programs, I do not always remove duplication that occurs between them. (It happens that if I could do so conveniently, I’d probably make the same improvements to all the code I have ever written at once. I simply do not have the time to open and edit many files.)


> I notice that you seem to be contradicting yourself, because in the first paragraph of your response you say that sometimes, it's OK not to remove duplication.  In the last one, you seem to say the opposite.  But if I read carefully, I notice that you say "when the code is evolving", by which I think you mean that removing duplication within code that is actively evolving is almost certainly good; while going out of your way to remove duplication from code that works and is not being actively changed may not be a good idea.  So I think that you are not contradicting yourself; you are making a statement that removing duplication is almost always good *in a certain context*.

Correct.


> So my question is: why do we need to be so subtle?  It's very easy to misunderstand statements like this and think "I heard Ron say that removing duplication is the right thing, 99.999% of the time, period."  We miss "... in this context".  I think that we as a community could be a bit more explicit in the discussion of the contexts, limits of applicability, examples and counterexamples of our favourite Rules.

You’re the one who is asking for subtlety with things like “DRY causes coupling”, which then requires us to make some kind of tradeoff.

I believe that removing duplication is so often the right thing that we should develop the habit of doing it almost by reflex. We should, of course, always think, and if we actually SEE a REAL difficulty due to the removal, we should undo it.

I think you’ll get better code by removing duplication whenever you can’t see an immediate issue, rather than removing it only when you’re sure everything will be OK in the future.

Of course, you get to do as you wish, and I certainly wish you all success as you go forward.

Ron Jeffries
Sometimes people just don't want the flower. Sometimes you have to let them walk away.
— Amanda Palmer

Adam Sroka

May 26, 2015, 2:37:00 PM
to software_cr...@googlegroups.com
I can't recall a situation where I removed some duplication and wasn't happy with the result. I can recall many, many situations where I thought that removing duplication was hard and so I didn't. 

...

The root word "simple" and its derivatives appear often in the XP literature (and in Kent's writing in general.) The thing about "simple" is that everyone is sure they know what it means until you ask them to explain it. Having one of your four values be something you can't explain might be a problem. Fortunately Kent did explain it in a subtly brilliant way. All of the things that are the opposite of his "rules" - not enough tests, too much duplication, unclear intent, and extra nonsense - are the root causes of maintenance nightmares that every professional programmer has faced at some time in his or her career.

Matteo Vaccari

May 30, 2015, 11:50:23 AM
to software_cr...@googlegroups.com
Hi Ron,

TL;DR: you can get in trouble by removing duplication when the factored out part is not a good abstraction.  You would then be better off by keeping the duplication.

Examples of getting in trouble by removing duplication come easily to my mind.  Two or more applications that share a data schema on a shared database, is a common example.  At a particular client, the teams and the tech leaders insist that it's a good thing that the "shared model" is common.  They went all the way to sharing the "domain objects", as they call them, in a library.  One of the consequences is that all applications are deployed in sync every three weeks, taking 2-4 days.  I suggested that separate apps need separate parts of the data, and that perhaps if the apps communicated through APIs instead of through shared data-model-plus-library, the organization would be able to deploy the apps independently of one another, more frequently and with less labor.  But my argument was rejected on the basis that I advocate "introducing duplication".

At another client, two apps, maintained by two teams in different cities, must access the same data sources.  Since these databases are a bit of a legacy mess, and require non-trivial algorithms to find which database is the right one in a given circumstance, the teams decided to share a "common lib".  The problem is that the two apps have different needs and different programming styles.  One consequence is that the classes in the library become more complicated (for example, the factories accept parameters that specify a different behaviour for one or the other application).  Another consequence is that the library becomes a dumping ground for anything that could be remotely useful on both sides; therefore the library lacks cohesion and changes for many different reasons. This pushes developers to retest and redeploy both apps for changes that should impact only one of them.  A lot of time is spent (really wasted) synchronizing the lib on both sides.  I argue that even though a part of the behaviour is common to the two apps, there would be little harm in having separate copies of the database access code.  Some parts will be similar, some parts will evolve differently.

Another illustration is from page 20 of "Reliable software through composite design", by Glenford Myers:

A simplistic example ... suppose the following sequence of instructions appears several times in a module or in several modules

    A = B + C
    GET CARD
    PUT OUTPUT
    IF B=4, THEN E=0
    
A well-intentioned programmer may analyze the situation and decide to replace all such sequences with a CALL to module X, and then create a module X containing these four instructions. 

Module X now probably has coincidental strength, since there are no apparent relationships among these four instructions. That is, these instructions probably have different meanings in the original modules.

Suppose in the future a need arises in one of the modules originally containing these instructions to say GET TAPERECORD instead of GET CARD.  The programmer is now faced with a problem.  If the instruction in module X is modified, module X is unusable by all of its other callers.  He has another alternative, to place a test in module X to determine the calling module in order to decide whether to issue the GET TAPERECORD or GET CARD instruction.  This alternative is equally bad.

The latter very simple example shows how it is possible that modules A and B are coupled through a refactored module X.  A new need in A forces the programmer to choose between introducing complexity in X, or breaking B, or undoing the refactoring completely by inlining module X.  All of this because we don't want to break B.
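
The same dilemma, sketched in Python under the assumption that Myers's four instructions map onto the hypothetical functions below. The names `module_x`, `get_card` and `get_tape_record` are mine, the I/O is faked with lists so the sketch can run, and what PUT OUTPUT writes is a guess.

```python
def get_card(state):
    # Stand-in for GET CARD: read the next card.
    return state["cards"].pop(0)


def get_tape_record(state):
    # Stand-in for GET TAPERECORD: read the next tape record.
    return state["tape"].pop(0)


def module_x(state, caller):
    """The extracted 'common' sequence: four unrelated steps."""
    state["a"] = state["b"] + state["c"]   # A = B + C
    # The second alternative from the quote: test who is calling.
    # X now has to know about its callers, which is the coupling.
    if caller == "A":
        record = get_tape_record(state)
    else:
        record = get_card(state)
    state["output"].append(record)         # PUT OUTPUT (assumed to write the record)
    if state["b"] == 4:                    # IF B=4, THEN E=0
        state["e"] = 0
    return state
```

Inlining the sequence back into A and B removes the `caller` parameter, at the price of restoring the duplication.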

Getting back to my original point... an unqualified claim that "removing duplication is almost always the right thing" is simplistic.  I would rather that programmers reflect on good and bad ways to remove duplication, and on the forces of cohesion and coupling that change as a result of removing duplication.

With all due respect and admiration for your work... as someone else said, I'm in this also because of reading a book that you wrote!

Matteo


