[SC] Is Intent Revelation More Important than DRY?


Philip Schwarz

Nov 26, 2011, 12:29:12 AM
to software_craftsmanship
Were you at Software Craftsmanship North America (http://scna.softwarecraftsmanship.org/)? Were you at Jim Weirich's "Code Kata and Analysis"?

I ask because earlier this week I saw the following tweets:

@zspencer It appears that @jimweirich is recommending reordering the 4
rules of simple design. Is intent revelation more important than dry?
#scna
@jimweirich @zspencer I've seen both orderings. See
http://c2.com/cgi/wiki?XpSimplicityRules and the original vs the
@ronjeffries version.
@RonJeffries @jimweirich @zspencer in the original white book, i
believe it appears twice with 2 and 3 in each order. i prefer DRY
first. makes u think
@jbrains @RonJeffries @jimweirich @zspencer I remove dup'n to uncover
good structure, then improve names to distribute responsibilities
better.

I think I also saw a tweet saying that talks were not recorded.

If you were at the talk: what was Weirich's message?

Jim Weirich

Nov 26, 2011, 2:59:30 AM
to software_cr...@googlegroups.com

The talk was a live performance of (a portion of) the Roman Numeral Converter kata, followed by an analysis of the decisions made during the kata. The code portion of the kata is more or less reproduced in a static form here: https://gist.github.com/1095310

I made the following points during the talk:

* Know where to start (put some thought into picking your first test)
* Know where to continue (the right sequence of tests can make finding the solution easier)
* Allow the solution to drive the tests (counterpoint to letting the tests drive the code)
* Know what to skip (sometimes delaying some tests until later is helpful)
* Recognize duplication (even when it doesn't look like duplication)
* Know when to not remove duplication (when it impacts readability)
* Know the edge cases of your problem space

I think the second to the last point is what caused the twitter stream. At some point in the kata, we reached a data structure that looked like this:

ROMAN_REDUCTIONS = [
  [1000, "M"],

  [900, "CM"],
  [500, "D"],
  [400, "CD"],
  [100, "C"],

  [90, "XC"],
  [50, "L"],
  [40, "XL"],
  [10, "X"],

  [9, "IX"],
  [5, "V"],
  [4, "IV"],
  [1, "I"],
]

There is some obvious duplication in the 1,4,5,9 pattern that repeats for each power of ten. I suggested you *could* remove that duplication with something like this:

def self.power_of_ten(power, ones_glyph, fives_glyph, tens_glyph)
  [
    ["#{ones_glyph}#{tens_glyph}", 9 * power],
    [fives_glyph, 5 * power],
    ["#{ones_glyph}#{fives_glyph}", 4 * power],
    [ones_glyph, 1 * power],
  ]
end

CONVERSION_TABLE = [ ["M", 1000] ] +
  power_of_ten(100, 'C', 'D', 'M') +
  power_of_ten( 10, 'X', 'L', 'C') +
  power_of_ten(  1, 'I', 'V', 'X')
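
(For reference, expanding one call -- this expansion was not shown in the talk -- gives:

  power_of_ten(100, 'C', 'D', 'M')
  # => [["CM", 900], ["D", 500], ["CD", 400], ["C", 100]]

so the generated CONVERSION_TABLE carries the same thirteen entries as the literal ROMAN_REDUCTIONS, just glyph-first.)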

In the talk, I suggested that the change was not an improvement. The complexity introduced by the power_of_ten method overshadowed the rather minor duplication in the data table. Since the duplication was (1) minor, (2) well communicated by the existing table, and (3) unlikely to be subject to change, I felt the explicit data table was preferable. The audience seemed to agree with the assessment.

I also said that when balancing removing duplication vs. adding complexity, I tend to weigh readability and understandability more heavily in tests than I do in regular code.

That's the meat of the talk and what generated the twitter conversation. I wasn't advocating reordering the 4 rules for simple design (although there seems to be some confusion on the exact ordering anyways). I was merely suggesting that sometimes (not always) removing duplication adds complexity and that in some cases the benefit of removing duplication is outweighed by the added complexity. You need to be aware of the trade-off and make well informed decisions.

I hope this helps.

--
-- Jim Weirich
-- jim.w...@gmail.com


RonJeffries

Nov 26, 2011, 7:41:59 AM
to software_cr...@googlegroups.com
Hi Jim,

On Nov 26, 2011, at 2:59 AM, Jim Weirich wrote:

There is some obvious duplication in the 1,4,5,9 pattern that repeats for each power of ten. I suggested you *could* remove that duplication with something like this:

 def self.power_of_ten(power, ones_glyph, fives_glyph, tens_glyph)
   [
     ["#{ones_glyph}#{tens_glyph}", 9*power],
     [fives_glyph, 5*power],
     ["#{ones_glyph}#{fives_glyph}", 4*power],
     [ones_glyph, 1*power],
   ]
 end

 CONVERSION_TABLE = [ ["M", 1000] ] +
   power_of_ten(100, 'C', 'D', 'M') +
   power_of_ten( 10, 'X', 'L', 'C') +
   power_of_ten(  1, 'I', 'V', 'X')

In the talk, I suggested that the change was not an improvement. The complexity introduced by the power_of_ten method overshadowed the rather minor duplication in the data table.  Since the duplication was (1) minor, (2) well communicated by the existing table, and (3) unlikely to be subject to change, I felt the explicit data table was preferable. The audience seemed to agree with the assessment. 

I would certainly agree with the assessment that the particular method chosen to remove that duplication was not an improvement at first glance, and maybe not at all. However, that doesn't tell us that no better way exists.

Chet and I teach the Simple Design order this way:
  1. Runs all the tests;
  2. Contains no duplication;
  3. Expresses all the programmer's design ideas;
  4. Minimizes programming artifacts.

We also "teach the controversy" by pointing out that some people want to place expression above duplication removal. We say that if we ever encountered a place where removing duplication inevitably horked up expression, we'd favor expression.

We go on to say that we suspect that when rules 2 and 3 seem to be in competition, the code is telling us something that we do not yet hear. A very common occurrence is that there is a new idea in the code which has not yet materialized out of the fog. A new class, perhaps.

For me, I leave duplication above expression because duplication is easy to spot and it needs to be heeded so often that it deserves high priority. In addition, many programmers, though of course none of us here, think that their code is perfectly expressive as it stands. I fear that if we rank expression over duplication, the normal tendency to think our code is great will cause us to miss opportunities to improve.

It's a very interesting list. In some ways, the most interesting thing Kent ever did. For sure, 2 and 3 are close together in priority. In theory, I think expression trumps duplication removal. In practice, I think that happens very very rarely, if ever.

Ron Jeffries
www.XProgramming.com
I try to Zen through it and keep my voice very mellow and low.
Inside I am screaming and have a machine gun.
Yin and Yang I figure.
  -- Tom Jeffries

RonJeffries

Nov 26, 2011, 7:48:31 AM
to software_cr...@googlegroups.com
Hi again, Jim,

On Nov 26, 2011, at 2:59 AM, Jim Weirich wrote:

The talk was a live performance of (a portion of) the Roman Numeral Converter kata, followed by an analysis of the decisions made during the kata. The code portion of the kata is more or less reproduced in a static form here: https://gist.github.com/1095310

Delightful, by the way, even in written form. I like how you bring out the discoveries. The "IF is a WHILE" one is delicious!

I wonder about a recursive approach. Would it eliminate the whiles and turn them all back into IF? Would a big outer loop WHILE n > 0 remove the IFs?
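
(For concreteness, one recursive shape -- a sketch against the value-first ROMAN_REDUCTIONS table above, not code from the talk:

  def to_roman(n)
    return "" if n.zero?
    value, glyph = ROMAN_REDUCTIONS.find { |v, g| v <= n }
    glyph + to_roman(n - value)
  end

The outer WHILE becomes the recursion, and the inner WHILEs vanish because each call re-finds the largest reduction that still fits.)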

The "duplication" in the table remains interesting, but in the grand scheme, not VERY interesting :)

Good stuff! Thanks!

Ron Jeffries
If it is more than you need, it is waste. -- Andy Seidl

Esko Luontola

Nov 26, 2011, 9:24:04 AM
to software_cr...@googlegroups.com, ronje...@acm.org


On Saturday, November 26, 2011 2:41:59 PM UTC+2, Ron Jeffries wrote:

On Nov 26, 2011, at 2:59 AM, Jim Weirich wrote:
 CONVERSION_TABLE = [ ["M", 1000] ] +
   power_of_ten(100, 'C', 'D', 'M') +
   power_of_ten( 10, 'X', 'L', 'C') +
   power_of_ten(  1, 'I', 'V', 'X')

In the talk, I suggested that the change was not an improvement. The complexity introduced by the power_of_ten method overshadowed the rather minor duplication in the data table.  Since the duplication was (1) minor, (2) well communicated by the existing table, and (3) unlikely to be subject to change, I felt the explicit data table was preferable. The audience seemed to agree with the assessment. 

I would certainly agree with the assessment that the particular method chosen to remove that duplication was not an improvement at first glance, and maybe not at all. However, that doesn't tell us that no better way exists.

What if we took removing the duplication further? For example, in the above code X, C and M are each repeated twice. Maybe we would end up with code like this:

ROMAN_REDUCTIONS = [
    [1000, "M"],
    [500, "D"],
    [100, "C"],
    [50, "L"],
    [10, "X"],
    [5, "V"],
    [1, "I"],
  ]

Jim Weirich

Nov 26, 2011, 9:47:51 AM
to software_cr...@googlegroups.com
On Nov 26, 2011, at 7:48 AM, RonJeffries <ronje...@acm.org> wrote:

> Delightful, by the way, even in written form. I like how you bring out the discoveries. The "IF is a WHILE" one is delicious!

Delicious is a good word to describe that.

> I wonder about a recursive approach. Would it eliminate the whiles and turn them all back into IF? Would a big outer loop WHILE n > 0 remove the IFs?

There is an alternate solution where we turn the WHILEs into IFs by calculating the number of glyphs needed using a bit of modulo math. Then we don't need the inner whiles at all and they can become IFs. (Hmm, as I write this, it occurs to me that the IFs might not be needed either. I might explore that the next time I do the kata.)
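
(A sketch of that modulo variant -- a reconstruction, not the talk's code, again assuming the value-first ROMAN_REDUCTIONS table:

  def to_roman(n)
    ROMAN_REDUCTIONS.reduce("") do |roman, (value, glyph)|
      count = n / value   # how many copies of this glyph are needed
      n %= value          # the remainder carries to the next entry
      roman + glyph * count
    end
  end

When count is zero, glyph * count contributes nothing, which is why the IFs turn out to be unnecessary as well.)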

> The "duplication" in the table remains interesting, but in the grand scheme, not VERY interesting :)

Agreed.

--
-- Jim Weirich

Jim Weirich

Nov 26, 2011, 9:51:28 AM
to software_cr...@googlegroups.com
On Nov 26, 2011, at 9:24 AM, Esko Luontola <esko.l...@gmail.com> wrote:

> What if we took removing the duplication further? For example, in the above code X, C and M are each repeated twice. Maybe we would end up with code like this:
>
> ROMAN_REDUCTIONS = [
> [1000, "M"],
> [500, "D"],
> [100, "C"],
> [50, "L"],
> [10, "X"],
> [5, "V"],
> [1, "I"],
> ]

It would be interesting to see how that choice affects the code.

--
-- Jim Weirich

Jim Weirich

Nov 26, 2011, 9:54:42 AM
to software_cr...@googlegroups.com
On Nov 26, 2011, at 7:41 AM, RonJeffries <ronje...@acm.org> wrote:

> For me, I leave duplication above expression because duplication is easy to spot and it needs to be heeded so often that it deserves high priority. In addition, many programmers, though of course none of us here, think that their code is perfectly expressive as it stands. I fear that if we rank expression over duplication, the normal tendency to think our code is great will cause us to miss opportunities to improve.
>
> It's a very interesting list. In some ways, the most interesting thing Kent ever did. For sure, 2 and 3 are close together in priority. In theory, I think expression trumps duplication removal. In practice, I think that happens very very rarely, if ever.

Thanks for the insight on the tension between expression and duplication. I like this explanation.

--
-- Jim Weirich

Philip Schwarz

Nov 27, 2011, 4:10:12 PM
to software_craftsmanship
>The code portion of the kata is more or less reproduced in a static form here: https://gist.github.com/1095310
I find the gist very readable.

>I hope this helps.
That was very helpful and much appreciated: thank you.

Philip


Philip Schwarz

Nov 27, 2011, 4:41:34 PM
to software_craftsmanship
Thank you for sharing your insights: I found them very instructive.

Philip

George Dinwiddie

Nov 27, 2011, 6:46:25 PM
to software_cr...@googlegroups.com
Jim,

That's all very reasonable. I have no problem with that, though I would
probably explain this particular situation differently.

I noted today that Kent's list in XPE1 (p. 57) says "Has no duplicated
logic." I would say that the 1,4,5,9 pattern is not duplicated logic,
but just a pattern found in Roman Numerals. Minimizing the space of
this table requires some more complicated logic.

I call this "incidental duplication" and think it's not worthy of
removal. It's not that the V, L, and D share something special. They
just happen to appear similar after being converted to base 10.

I liken the impetus to remove them to, in C, declaring "#define ONE 1"
and using "ONE" everywhere a 1 might appear. If the 1s are unrelated,
this is not duplication and the substitution adds an unnecessary coupling.

Like Ron, I've never found steps 2 & 3 to be in conflict. Considering
apparent conflicts such as this one helped me to see duplication as more
than similar numbers or words.

- George

--
----------------------------------------------------------------------
* George Dinwiddie * http://blog.gdinwiddie.com
Software Development http://www.idiacomputing.com
Consultant and Coach http://www.agilemaryland.org
----------------------------------------------------------------------

Olivier Azeau

Nov 27, 2011, 9:49:22 PM
to software_cr...@googlegroups.com
An easy move is to compute the complete reductions set from the DRY one.
For example:

-==============-

ROMAN_SINGLE_CHAR_REDUCTIONS = [
  ["M", 1000],
  ["D", 500],
  ["C", 100],
  ["L", 50],
  ["X", 10],
  ["V", 5],
  ["I", 1],
]

def self.add_double_char_reductions_to(roman_single_char_reductions)
  roman_reductions = roman_single_char_reductions.dup

  # For C, X and I (indexes 2, 4, 6), insert the two subtractive pairs:
  # e.g. for C, insert ["CM", 900] before D and ["CD", 400] before C.
  [2, 4, 6].each do |reduction_index|
    unit      = roman_single_char_reductions[reduction_index]
    fiveunits = roman_single_char_reductions[reduction_index - 1]
    tenunits  = roman_single_char_reductions[reduction_index - 2]
    roman_reductions.insert(reduction_index * 2 - 3, [unit[0] + tenunits[0], tenunits[1] - unit[1]])
    roman_reductions.insert(reduction_index * 2 - 1, [unit[0] + fiveunits[0], fiveunits[1] - unit[1]])
  end

  return roman_reductions
end

ROMAN_REDUCTIONS = add_double_char_reductions_to(ROMAN_SINGLE_CHAR_REDUCTIONS)

-==============-

Actually, this left me with more questions and few answers:

- Minimization of programming artifacts has a lower priority than removing duplication. Is this rule still valid if removing duplication by introducing a new artifact means more code than keeping the duplication?
In this example, I would probably keep the original array with duplications.

- Which one is more intent-revealing about the construction of Roman numbers?
    ["CM", 900],  ["XC", 90],  ["IX", 9]
    ["CD", 400],  ["XL", 40],  ["IV", 4],
or
    roman_reductions.insert(reduction_index*2-3,[unit[0]+tenunits[0], tenunits[1]-unit[1]])
    roman_reductions.insert(reduction_index*2-1,[unit[0]+fiveunits[0], fiveunits[1]-unit[1]])
?

- The two-step construction of ROMAN_REDUCTIONS (single char, then double char) clearly shows that the double char entries cannot be inserted just anywhere in the list (the insertion index almost looks like a code trick). The order of reductions is important, but this information is not visible in the original design.
Should it be visible?

Olivier



RonJeffries

Nov 27, 2011, 10:02:28 PM
to software_cr...@googlegroups.com
Hi Olivier,

On Nov 27, 2011, at 9:49 PM, Olivier Azeau wrote:

An easy move is to compute the complete reductions set from the DRY one.
For example:

This must be some new kind of easy that I wasn't previously familiar with. I found the resulting code to be quite opaque. But it could be me, my IQ is only 160-something and I've only been programming for a half-century. 

But seriously ... I would not say that transformation was an improvement. :)

Ron Jeffries
I'm really pissed off by what people are passing off as "agile" these days.
You may have a red car, but that does not make it a Ferrari.
  -- Steve Hayes

Olivier Azeau

Nov 28, 2011, 4:35:59 AM
to software_cr...@googlegroups.com
On 28/11/2011 04:02, RonJeffries wrote:
Hi Olivier,

On Nov 27, 2011, at 9:49 PM, Olivier Azeau wrote:

An easy move is to compute the complete reductions set from the DRY one.
For example:

This must be some new kind of easy that I wasn't previously familiar with. I found the resulting code to be quite opaque. But it could be me, my IQ is only 160-something and I've only been programming for a half-century. 

But seriously ... I would not say that transformation was an improvement. :)

I totally agree, no improvement here! :-)

Just keep in mind how we got here.
We have this code:

ROMAN_REDUCTIONS = [
    [1000, "M"],

    [900, "CM"],
    [500, "D"],
    [400, "CD"],
    [100, "C"],

    [90, "XC"],
    [50, "L"],
    [40, "XL"],
    [10, "X"],

    [9, "IX"],
    [5, "V"],
    [4, "IV"],
    [1, "I"],
  ]

and then we say "what if we defined only an equivalent array, with no char duplication?"

The first idea that comes to my mind is doing something like

ROMAN_SINGLE_CHAR_REDUCTIONS = [
    [1000, "M"],

    [500, "D"],
    [100, "C"],

    [50, "L"],
    [10, "X"],

    [5, "V"],
    [1, "I"],
  ]

ROMAN_REDUCTIONS = add_double_char_reductions_to(ROMAN_SINGLE_CHAR_REDUCTIONS) 


That is indeed an *easy* move because you do not have to think about anything else in the code.
It turns out that the "add_double_char_reductions_to" function is complex, but it was indeed easy to *write* (not to read!). And, as I stated in my comments on this move, I would probably keep the original array: until I get a simple and DRY implementation of this function, I would favor repetition over complexity.

Besides this, the move has one virtue: it showed me that the order of reductions in the array is essential to this piece of code, yet this information is unfortunately not visible.
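
(One sketch of how it could be made visible -- an idea, not part of the original code: build the pairs in whatever order is convenient, then state the ordering once, e.g.

  ROMAN_REDUCTIONS =
    add_double_char_reductions_to(ROMAN_SINGLE_CHAR_REDUCTIONS)
      .sort_by { |glyph, value| -value }

so the descending-by-value invariant is written down instead of being implied by carefully chosen insertion indexes.)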

Olivier





J. B. Rainsberger

Dec 14, 2011, 4:04:51 PM
to software_cr...@googlegroups.com
On Mon, Nov 28, 2011 at 00:46, George Dinwiddie <li...@idiacomputing.com> wrote:

> I noted today that Kent's list in XPE1 (p. 57) says "Has no duplicated
> logic."  I would say that the 1,4,5,9 pattern is not duplicated logic, but
> just a pattern found in Roman Numerals.  Minimizing the space of this table
> requires some more complicated logic.
>
> I call this "incidental duplication" and think it's not worthy of removal.
> It's not that the V, L, and D share something special. They just happen to
> appear similar after being converted to base 10.

Agreed. This reminds me of the circumference/area problem with Circle:
we don't extract pi*r to its own function, because we wouldn't know
what to name it. I don't know many examples like this, and I consider
them somewhat pathological.
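
(Spelled out as an illustrative sketch -- the class and names here are hypothetical, not anyone's production code:

  class Circle
    def initialize(radius)
      @radius = radius
    end

    def circumference
      2 * Math::PI * @radius
    end

    def area
      Math::PI * @radius ** 2
    end
  end

Both methods contain Math::PI * @radius, but that product has no name in the domain, so extracting it would remove textual repetition without adding meaning.)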
--
J. B. (Joe) Rainsberger :: http://www.jbrains.ca ::
http://blog.thecodewhisperer.com
Author, JUnit Recipes
Free Your Mind to Do Great Work :: http://www.freeyourmind-dogreatwork.com
Find out what others have to say about me at http://nta.gs/jbrains

RonJeffries

Dec 14, 2011, 5:08:36 PM
to software_cr...@googlegroups.com
Hi JB,

On Dec 14, 2011, at 4:04 PM, J. B. Rainsberger wrote:

Agreed. This reminds me of the circumference/area problem with Circle:
we don't extract pi*r to its own function, because we wouldn't know
what to name it. I don't know many examples like this, and I consider
them somewhat pathological.

true, but TWO_PI = OnceAround. :)

Jon Jagger

Dec 15, 2011, 1:42:53 AM
to software_cr...@googlegroups.com
What about if the arabic to roman numeral solution does this...

units = [ "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" ]
tens = [ "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC" ]
hundreds = [ ... ]

return hundreds[arabic / 100] + tens[arabic / 10 % 10] + units[arabic % 10]

Is it now "too much duplication" ?
Is it worth it to avoid the looping?
When is duplication duplication that counts?

Cheers
Jon


--
CyberDojo - a game-server for practising the collaborative game called
software development.
Explained at http://jonjagger.blogspot.com/p/cyberdojo.html
Open-sourced at http://github.com/JonJagger/cyberdojo
Server probably at http://www.cyber-dojo.com
Video of Roman Numerals kata in Ruby at http://vimeo.com/15104374

Jim Weirich

Dec 15, 2011, 2:42:21 AM
to software_cr...@googlegroups.com

On Dec 15, 2011, at 1:42 AM, Jon Jagger wrote:

> What about if the arabic to roman numeral solution does this...
>
> units = [ "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" ]
> tens = [ "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC" ]
> hundreds = [ ... ]
>
> return hundreds[arabic / 100] + tens[arabic / 10 % 10] + units[arabic % 10]
>
> Is it now "too much duplication" ?
> Is it worth it to avoid the lopping?
> When is duplication duplication that counts?

These are all good questions. In this particular case, my judgement is that

(a) The duplication is small ... only three repetitions.
(b) The probability of a change is small, I don't think Roman numerals have changed much in, oh, say 2000 years.

If either of (a) or (b) were not the case, I would be much more inclined to do something about the "just barely duplicated" duplication.

As an aside, it is interesting to note that the tens pattern given above is incorrect, in that it omitted the initial "". Also, when I transcribed your code (to run it locally to verify it gave good numbers), I had a typo in the hundreds array (typed "DC" instead of "CD"). So in the end, between you and me, we got two errors in duplicating the units array.

Another aside, removing duplication is (IMHO) easier in your formulation than my original. It could be written like this:

units = [ "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" ]

tens = units.map { |glyph| glyph.tr("IVX", "XLC") }
hundreds = units.map { |glyph| glyph.tr("IVX", "CDM") }

Simple enough to do rather than explicit arrays? Maybe.
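
(Expanded as a check -- the result below was computed from the snippet above, not quoted from the message:

  units.map { |glyph| glyph.tr("IVX", "XLC") }
  # => ["", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"]

which is the tens table complete with the leading "" that the hand-written version omitted.)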

RonJeffries

Dec 15, 2011, 4:43:56 AM
to software_cr...@googlegroups.com
Hi Jim,

On Dec 15, 2011, at 2:42 AM, Jim Weirich wrote:

Another aside, removing duplication is (IMHO) easier in your formulation than my original.  It could be written like this:

   units    = [ "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" ]
   tens     = units.map { |glyph| glyph.tr("IVX", "XLC") }
   hundreds = units.map { |glyph| glyph.tr("IVX", "CDM") }

Simple enough to do rather than explicit arrays?  Maybe.

Delicious as that is, it seems to me there's no chance it's better, as one has to figure out what the map glyph tr stuff is about and even if one is totally au courant with those, it still has to be kind of executed in the mind. Whether it actually works, my mind, limited as it is at 0445, cannot discern.

The literal arrays are quite expressive. The example above may be a place where expression trumps removal of duplication ... if one even cares about duplication in data, which this one rat cheer does not.

Regards,

Ron Jeffries
I know we always like to say it'll be easier to do it now than it
will be to do it later. Not likely. I plan to be smarter later than
I am now, so I think it'll be just as easy later, maybe even easier.
Why pay now when we can pay later?

Doug Bradbury

Dec 18, 2011, 1:22:22 PM
to software_cr...@googlegroups.com
I was cleaning off my camera and just remembered that I recorded Jim performing the Roman Numeral kata at SCNA. It's not a great recording, but I thought that those of you not present might like to see it anyway.


Enjoy!
Doug

J. B. Rainsberger

Dec 18, 2011, 4:46:41 PM
to software_cr...@googlegroups.com
On Thu, Dec 15, 2011 at 01:42, Jon Jagger <j...@jaggersoft.com> wrote:
> What about if the arabic to roman numeral solution does this...
>
> units = [ "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" ]
> tens = [ "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC" ]
> hundreds = [ ... ]
>
> return  hundreds[arabic / 100] + tens[arabic / 10 % 10] + units[arabic % 10]
>
> Is it now "too much duplication" ?

Probably not.

> Is it worth it to avoid the looping?

I like the other looping opportunity this opens up:

arabic = 901
n = arabic # Need a better name
parts = [units, tens, hundreds, ...].map { |each| part = each[n % 10]; n /= 10; part }
parts.should == ["I", "", "CM"]
roman = parts.reverse.join("")
roman.should == "CMI"

Now you can extend it for thousands, myriads, … as long as you have
the glyphs you want.

> When is duplication duplication that counts?

If I don't just see it, then I follow this rule of thumb: if either there are 3 copies, or I can name it, then I extract it.

Matteo Vaccari

Dec 19, 2011, 8:29:36 AM
to software_cr...@googlegroups.com
On Thu, Dec 15, 2011 at 8:42 AM, Jim Weirich <jim.w...@gmail.com> wrote:

On Dec 15, 2011, at 1:42 AM, Jon Jagger wrote:

> What about if the arabic to roman numeral solution does this...
>
> units = [ "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" ]
> tens = [ "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC" ]
> hundreds = [ ... ]
>
> return  hundreds[arabic / 100] + tens[arabic / 10 % 10] + units[arabic % 10]
>
> Is it now "too much duplication" ?
> Is it worth it to avoid the looping?
> When is duplication duplication that counts?

These are all good questions.  In this particular case, my judgement is that

(a) The duplication is small ... only three repetitions.
(b) The probability of a change is small, I don't think Roman numerals have changed much in, oh, say 2000 years.

Actually they have :-)

It turns out that ancient Romans wrote "XXXX".  It was medieval accountants that, wishing to save parchment, invented IV, XL etc.

This bit of trivia gives me a hint on how to improve the solution, by removing the duplication in the table *and* making it more intention-revealing.  Suppose that we use the simpler table

ROMAN_SYMBOLS = [
    [1000, "M"],

    [500, "D"],
    [100, "C"],

    [50, "L"],
    [10, "X"],

    [5, "V"],
    [1, "I"],
  ]
This would produce numbers such as XXXXIIII for 44.  Then apply this transformation:

def save_parchment_in(a_text)
  while a_symbol = appears_four_consecutive_times_in(a_text)
    a_text.gsub!(a_symbol * 4, a_symbol + successor_of(a_symbol))
  end
end

# (A possible body for the helper the message leaves undefined:
# returns a glyph that appears four times in a row, or nil.)
def appears_four_consecutive_times_in(a_text)
  glyphs = ROMAN_SYMBOLS.map { |value, glyph| glyph }
  glyphs.find { |glyph| a_text.include?(glyph * 4) }
end

# The next larger symbol; since the table is ordered by descending
# value, the successor sits one index earlier.
def successor_of(a_symbol)
  index = ROMAN_SYMBOLS.index { |value, glyph| glyph == a_symbol }
  ROMAN_SYMBOLS[index - 1][1]
end

(Yes, it breaks for MMMM :-)


Philip Schwarz

Dec 14, 2013, 2:46:28 AM
to software_cr...@googlegroups.com, ronje...@acm.org
Hi Ron,

just reading this reply of yours again, two years after you wrote it. Last week, in "Putting an Age Old Battle to Rest" [1], J. B. Rainsberger blogged about "the controversy" over the relative ordering of simple design rules 2 and 3. These great thoughts of yours on the ordering of the rules are very interesting, so I left a comment pointing out how you teach the Simple Design order.

Philip

Ron Jeffries

Dec 14, 2013, 6:51:53 AM
to software_cr...@googlegroups.com
Philip,

On Dec 14, 2013, at 2:46 AM, Philip Schwarz <philip.joh...@googlemail.com> wrote:

just reading this reply of yours again, two years after you wrote it. Last week, in "Putting an Age Old Battle to Rest" [1], J. B. Rainsberger blogged about "the controversy" over the relative ordering of simple design rules 2 and 3. These great thoughts of yours on the ordering of the rules are very interesting, so I left a comment pointing out how you teach the Simple Design order.

Thanks :)

Philip Schwarz

Apr 20, 2014, 1:17:40 AM
to software_cr...@googlegroups.com, ronje...@acm.org
Hello Ron (RIP Jim Weirich),

>It's a very interesting list. In some ways, the most interesting thing Kent ever did. For sure, 2 and 3 are close together in priority. In theory, I think expression trumps duplication removal. In practice, I think that happens very very rarely, if ever.

I finally found the time to start reading "Understanding the Four Rules of Simple Design" [1] (Corey Haines' recently published book). It starts with a foreword in which Kent Beck says:

My approach to communicating complex ideas at the time was to
formulate a simple set of rules, the emergent property of which
was the complex outcome I was aiming at (cf patterns). (I have
since become disenchanted with this strategy.) I thought about how
I recognized simplicity, turned those criteria into actions, sorted
them by priority (it’s no coincidence that human communication
is number two)...

Goes straight into my scrapbook, not far from your thoughts above.

Philip


