APOSD vs. Clean Code

John Ousterhout

unread,

Feb 23, 2025, 2:59:41 AMFeb 23

to software-design-book

As many of you may know, I have significant disagreements with the software design advice given by Robert "Uncle Bob" Martin in his book "Clean Code". Over the last several months he and I have been discussing our disagreements and we have recorded the discussion in a document on GitHub:

https://github.com/johnousterhout/aposd-vs-clean-code/blob/main/README.md

If you wish to comment on that discussion, I suggest doing so here.

-John-

The Future of Programming

unread,

Feb 23, 2025, 4:59:26 AMFeb 23

to software-design-book

Hello John

>> "Over the last several months he and I have been discussing our disagreements and we have recorded the discussion in a document on GitHub"

Thanks for sharing the discussion.

>> "If you wish to comment on that discussion, I suggest doing so here"

As a developer who respect the different ideas/background behind the books

I think the guidelines provided in the books are context sensitive and depends on the domain and the general goal

1- High-level programming vs. Low-level programming

2- Programming in the large vs. Programming in the small

3- System development vs. Application development

4- Large team vs. Small team

5- Using new programming tools vs. Using old programming tools

6- Writing new software vs. Maintaining old software

7- Developing for Programmers vs Developing for Users

So, I think as programmers we need to read more and really understand the advice (pros/cons) and the perfect context to apply it.

Greetings,

Mahmoud

Artie Shevchenko

unread,

Feb 23, 2025, 5:24:35 AMFeb 23

to The Future of Programming, software-design-book

Thanks, John! I think your point on module depth can be summarized as, “Typically, interface simplicity is way more important than implementation size.” Of course, as you mentioned, it's good to have a small implementation as long as it improves overall readability, but I agree with you that focusing on the implementation size (instead of interface simplicity and depth) is quite a misleading distraction.

That said, I find the following Martin's note on naming to be a very important decomposition guardrail:

> "Meaningfully" means that the extracted functionality can be given a descriptive name, and that it does less than the original method.

And I disagree, John, with your comment that “anything can be named”—at least not if you aim for both clarity and brevity.

But overall, I feel like you are from slightly different universes with Martin (as Mahmoud may be hinting at above as well), and I personally think that John, your universe is more real. From my experience at Google and other companies, I haven't seen much room for 5-line-long methods in production code. And I don’t think I’ve ever seen anybody writing a test before code, although I believe I had pretty good exposure to great software engineers to work with. But maybe that’s just me, and there are useful real-life products from Martin’s universe as well. I would expect AI agents to be good at working on those, then, though.

--

Kind regards

Artie Shevchenko
https://codehealthguardian.com/

--
You received this message because you are subscribed to the Google Groups "software-design-book" group.
To unsubscribe from this group and stop receiving emails from it, send an email to software-design-...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/software-design-book/90887c66-b727-48d0-8bec-8033388d4acbn%40googlegroups.com.

Jason Neal

unread,

Feb 23, 2025, 1:12:53 PMFeb 23

to software-design-book

I want to start by thanking you both for publishing this! It was great to hear your unique perspectives.

I think you both know this better than me - but I found this so interesting because we all arrive at our own conclusions based on our own experiences. In my career, I've worked for small 2-person marketing shops, larger regional marketing agencies, and large enterprises with thousands of software engineers. Each of these positions required a vastly different skill set for me to be successful. I've read dozens of books on this subject and what I find is that it's all valuable information. Hearing each person's unique perspective based on their experiences helps inform my own personal opinion when combined with my own personal experiences. I don't treat any of these books as "bibles" that must be followed in exact detail as if they were a set of commandments from God himself. Far from that!

I love that you two are able to work together respectfully and discuss this topic without too much ego getting in the way. Kudos!!

Thanks,

Jason

Lucas de Oliveira

unread,

Feb 23, 2025, 2:32:29 PMFeb 23

to software-d...@googlegroups.com

Firstly John, I would like to thank you for the material, I made a small post on my Linkedin with the link because I believe that this debate should be publicized to the community.

For me, these words spoken by Artie summarize everything I think about the APOSD x Clean Code debate: "I personally think that John, your universe is more real. From my experience at Google and other companies, I haven't seen much room for 5-line-long methods in production code"

Thanks

Lucas

To view this discussion visit https://groups.google.com/d/msgid/software-design-book/b4a6559b-f342-44b7-bf58-2758e2c55ec1n%40googlegroups.com.

Artem Zakirullin

unread,

Feb 24, 2025, 1:53:13 AMFeb 24

to software-design-book

Thank you so much for sharing the discussion! Really valuable insights.

I couldn't agree more on "One Thing approach". It's too vague. Take 5 developers and ask them to extract methods based on this principle - you'd get different results, because everyone's mental models are different. By operating on such vague principles we create more harm that good.

Some time ago I've written just about that, word to word!
https://minds.md/zakirullin/cognitive#srp

I've put a link to your discussion here if you don't mind: https://github.com/zakirullin/cognitive-load?tab=readme-ov-file#too-many-small-methods-classes-or-modules

Eivind Jahren

unread,

Feb 25, 2025, 1:42:15 PMFeb 25

to software-design-book

Fantastic and much anticipated discussion, thank you so much!

On making sure TDD is described correctly in the next edition of the book:

Kent Beck (as the coiner of the term) wrote Canon TDD [1] for exactly this purpose. In

his own words: "If you’re going to critique something, critique the actual thing." Especially

step 1 in his text was not brought up in the discussion with Robert Martin. I think it is

relevant. It's not a step I have associated with TDD, but I do like having a short

description/discussion of the proposed solution before coding.

[1]: https://tidyfirst.substack.com/p/canon-tdd

Eivind Jahren

unread,

Feb 25, 2025, 1:44:06 PMFeb 25

to software-design-book

Fantastic read! On fixing the description of TDD in the next edition;

Kent Beck wrote a piece called canon TDD to make sure critics critique

what it was he coined. In his words: "If you’re going to critique something,

critique the actual thing".

You can find it here: https://tidyfirst.substack.com/p/canon-tdd

I think the variation you discussed with Uncle Bob was pretty close,

but still not exactly this due to missing step 1. Honestly, not a step I have done

explicitly, but I tend to write some text about the proposed solution before I start at

least.

On Monday, 24 February 2025 at 07:53:13 UTC+1 art...@gmail.com wrote:

William Adams

unread,

Feb 26, 2025, 11:55:05 AMFeb 26

to software-design-book

The discussion of this on various other forums has been quite interesting:

https://old.reddit.com/r/programming/comments/1iwlgzq/a_discussion_between_john_ousterhout_and_robert/

https://news.ycombinator.com/item?id=43166362

Finally broke down and bought the Kindle version, and will be ordering another print copy (so as to be able to critique the typesetting) and will write further presently.

William

Nmhor Kmhor

unread,

Feb 28, 2025, 6:28:52 PMFeb 28

to software-design-book

I think Uncle Bob left a couple of important arguments out of his discussion on small methods and those are the SOLID principles of "Open-Closed" and "Dependency Inversion". Smaller classes and smaller methods help with both of these. In the discussion, John said " How is the library stored (e.g. is it entirely in memory? saved on disk?)" when referring to the addSongToLibrary method. While this is a high-level concern, it is not a documentation concern at this level. The way the library is stored is irrelevant to the add method. It may even be irrelevant to the containing [SongLibrary] class. The class may use dependency injection to establish the storage mechanism so even the class cannot document how the library is stored. I think while one needs to choose a simple example for a discussion like this, the choice of a numerical recipe colored the discussion. I think if the example were a song library, some of Uncle Bob's comments could have made more sense. I have run into real world situations where I wish I had broken methods/classes down into a finer granularity of single responsibility so that they would be more open to extension (or dependency inversion via polymorphism [see "Off Topic' note below]. In the prime number example, John's refusal to split the method into smaller ones does not appear to cause any harm because of the nature of the problem being solved. There are plenty of other examples where a monolithic method would cause problems. I could imagine John saying "Well, if I had a (functional or design) requirement for, say different storage methods in the song library, then I would account for that in my design". The problem is that we rarely know all of the requirements up front; the requirements evolve with the software. The song library, at its core, provides CRUD type database operations. One could image writing a BookLibrary or RecipeLibrary reusing a good bit of the code from SongLibrary if it were properly designed using SOLID principles and that includes small, single responsibility classes and methods. This kind of problem comes up frequently in the world I live in. The "BookLibrary" or "RecipeLibrary" are not conceived of when the initial "SongLibrary" requirements are written but emerge shortly after SongLibrary is put into production. I believe Uncle Bob's experience (and mine) is to anticipate scope creep (which occurs more frequently than not in our experience) by designing according to SOLID principles at the outset.

[Off topic: I have not read APOSD and I apologize for responding to the discussion between John and Uncle Bob without having read both books first. I did not see SOLID principles in the table of contents for APOSD but they may be discussed in the book therefor I do not know if John supports them or not. I believe they are very valuable. For a good example of dependency inversion through polymorphism, watch this recent video. In it the author defines a function f(x) = x*x + 3*x +2 without specifying what x is. x could be an int or float, or as in the video, the author adds Dual numbers and the function works, unmodified, on Dual numbers so long as addition and multiplication are defined for them. I also noticed in the APOSD chapter "Define Errors out of Existence" the first section is "Why exceptions add complexity". I've written two Microsoft Windows services in python than can never crash (and haven't for the couple of years they have been running). They perform database access, numerous computations, and access (sometimes flaky) network storage. Python, as you probably know, embraces exceptions. I find not cluttering code with "if" statements to check for parameter correctness and proper function returns makes the code much simpler. I can bubble up an exception from deep within the code to the top level, report the exception, abort the current task, and not have to litter my code with many, many checks.]

co...@the784.org

unread,

Feb 28, 2025, 6:54:56 PMFeb 28

to software-d...@googlegroups.com

SOLID is not a fact it's just a thing some people made up.

Factoring code into methods is quite like organising a written argument into paragraphs. It helps. The length is less important than how the chosen composition helps the reader.

--
You received this message because you are subscribed to the Google Groups "software-design-book" group.
To unsubscribe from this group and stop receiving emails from it, send an email to software-design-...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/software-design-book/35c6620f-c91d-4ae5-961b-d98ff545b00bn%40googlegroups.com.

Dan Cross

unread,

Feb 28, 2025, 9:06:55 PMFeb 28

to John Ousterhout, software-design-book

An interesting discussion; thank you for posting it. I do have some
comments, but I should admit my biases up front: I don't think
particularly highly of the "Agile" people, and I have an especially
low opinion of Robert Martin, who is frankly not a very good
programmer. I am mildly surprised that you chose to give him this
opportunity, as he strikes me as someone who peddles bad advice from a
position of unearned authority. Also, I'm not going to call him "Uncle
Bob": he is not my uncle, and he is not some venerable elder of
software development who automatically commands respect; I feel no
need to indulge him by participating in that pompous and ridiculous
marketing conceit. That said....

I am struck by how he is both unable to defend his positions, and
simultaneously (generally) unable to admit their flaws. An example is
his complaint about the labeled continue from the inner loop of the
prime number generator; he describes it as "_awful_" (emphasis his),
but when challenged on this he doesn't bother to explain himself, and
just goes on to repeat the assertion in the next exchange. Based on
reading some of his books, seeing his writing online (blogs, social
media, etc), and having a few very unpleasant interactions with him, I
think this is typical of his approach: state an opinion as an
assertion of fact, and don't bother to provide evidence or a
persuasive argument. If challenged, either claim authority (usually
based on length of experience) or simply ignore and repeat the
original assertion. Appeals to authority are bad enough, appeals to
length of experience are worse: just because someone has been doing
something for a long time does not mean they are good at it. I've been
shooting baskets for more than 40 years, but I'll never be a point
guard in the NBA.

We see this again in the discussion around test-driven development. He
makes vague allusions to his experience, but never really presents any
evidence for his stance on the practice, nor does he seem capable of
acknowledging the potential downsides. There is no excuse for someone
like Martin, who advocates heavily for TDD, to not be aware of, or
able to readily cite, relevant empirical evidence for his claims. And
in fact, there have been some studies of TDD: Nachi Nagapan at MSR
published a study in 2009 where the findings were that defect rates
decreased when using TDD, up to almost a factor of two, but at a
significant cost in development time: 15-35% longer
(https://www.microsoft.com/en-us/research/blog/exploding-software-engineering-myths/;
I do wish that there should be more studies like Nagapan's: perhaps a
thesis topic for a student working at the intersection of social
psychology and software engineering?).

Going further here, either Martin or one of the other agile influencer
types once wrote a blog post about TDD where they TDD'd their way into
"designing" bubble sort; of course, I can't find it now (Martin's
material, as with much of the material from the consultant types,
often moves around or disappears). The conclusion was weirdly
exultant: "look at how bubble sort just falls out of using TDD!" But
this reinforces the thesis that the methodology often leads to poor
results by favoring shallow, tactical thinking over deeper inquiry and
insight: why would I want to use a methodology that drives design
towards bubble sort? For that matter, why wouldn't I pop open a book
on algorithms and implement something better? Or, why wouldn't I look
up the documentation of my language and libraries and pick one that's
already been implemented for my environment?

As an aside, personally, I've found that TDD _can_ be useful as a way
to overcome writer's block when starting a project. If I find myself
stuck in the throes of analysis paralysis, sometimes writing some
failing tests and making them work after the fact can help move past
the mental block. But the "rules" of TDD seem ridiculous to me, and
the way the whole practice is often wrapped in martial arts metaphors
is just weird: "katas"? Really? Implementing something trivial like
the rules of bowling over and over again does not make me a better
engineer the way throwing a thousand punches against a heavy bag may
make me a better boxer; the two things are not alike. This, and the
"software craftsmanship" stuff, are mostly shallow marketing fluff.
It's interesting to me that he didn't try to bring any of that up; I
suspect he realized it wouldn't fly.

When Martin does admit he's wrong about something, he tends to
minimize or excuse it with comments of the form, "I wrote this as an
afterthought 18 years ago...". Ok, but he also published it in a book
that he claims exemplifies best practices for writing software. One
can reasonably expect better.

In fairness to Martin, he does have a point about names and
decomposition enhancing understanding and making comments redundant.
_If_ a comment is obviated by a better name, that seems useful to me,
and if a comment is entirely redundant with the code, I favor
eliminating the comment. But Martin misses the mark by focusing on
comments as being a "necessary evil" instead of a useful tool, and
once again resorts to dogma instead of producing a convincing
argument. I suppose creating a convincing argument is hard and that as
a self-styled "master software craftsman" he may not feel it is
necessary to explain himself to lesser beings.

I was disappointed that neither of you touched on either types or
tooling, and the discussion about comments went too far to either
extreme as a result.

For instance, John you say that comments are an "fundamental and
irreplaceable" part of building systems. I would rephrase this and say
that _documentation_ in the form of written prose in a natural
language is the essential thing, but whether such documentation is
expressed in the form of comments or something else is an
implementation detail. As an alternative, Common Lisp has the notion
of "documentation string" that can be attached to e.g. functions,
methods in classes, and so on
(https://www.lispworks.com/documentation/HyperSpec/Body/26_glo_d.htm#documentation_string).
Such strings are separate from comments, and I would argue that they
are more useful as they can be inspected from the repl for any object
in the image that defines one, potentially parsed and interpreted, and
so on.

Related, other languages make a distinction between "doc comments" and
other kinds of comments; the former can usually be processed by tools
that generate documentation from source code, while the latter may act
as guides to aspects of the implementation. Consider the documentation
for the Rust standard library, most of which is generated from the
source code itself (e.g. https://doc.rust-lang.org/std/index.html;
note the link to the source at the top of the page). Without this sort
of documentation, using the standard library would be significantly
harder.

Rust even has a facility for running _documentation tests_ embedded in
comments, which helps to ensure that comments stay up to date as
interfaces change
(https://doc.rust-lang.org/rustdoc/write-documentation/documentation-tests.html).
Martin had quipped that comments are not checked by the
compiler...except that sometimes they are. It's odd to me that Martin
focused on the color settings in his IDE, asserting that the default
subdued color in whatever tool he's using is an invitation to skip
comments entirely (incidentally, as he himself implicitly
acknowledged, this is obviously a configurable setting. And perhaps
the default is to make them easier to read as prose?), without also
acknowledging that many IDEs can use doc comments to generate
documentation directly from source. Perhaps he doesn't know how to use
his tools to their full effect?

I think you alluded to the semantic difference by making the header
comment for the prime generator a doc comment; Martin seemed befuddled
by this, and it would have been worth explicitly hammering home that
such comments are processed by tools to generate comprehensive
documentation for a module that can be inspected _outside of the
source code_, aiding understanding and discoverability. By focusing on
comments, as the means of expression of documentation, instead of
documentation as the desired artifact, I think the point was lost. On
the flip side, Martin seems averse to any sort of documentation _at
all_, believing that the code _is_ the documentation. This is just
silly, and doesn't scale beyond toy examples.

The point here is that not all comments are created equal, and
different kinds serve different purposes. Moreover, the
undifferentiated focus on comments specifically detracts from the
critical and essential role of documentation generally. It would have
been useful to highlight the different types of comments and their
different uses, perhaps mention other facilities, and show how these
things interact with external tooling to produce system-level
documentation.

A related point that I think deserved explicit attention is
discoverability. Navigating a large system and finding the information
needed to work with it effectively is dramatically easier with good
documentation, but the focus mostly seemed confined to localized use
of an interface or understanding of the implementation of something (a
method, class, whatever). In the discussion of the prime number
generator, Martin talks about "intimacy" with the code, which I
interpreted to mean deep knowledge and understanding. But he gives no
consideration to how a programmer becomes "intimate" with code, and
how on real-world teams people come and go and must to acquire that
intimacy to be effective at their jobs. This is where documentation
and discoverability really shine and endless decomposition hurts.

With respect to types, Martin is infamously averse to them, so much so
that Shriram Krishnamurthi called him out about it on twitter
(https://x.com/shriramkmurthi/status/1136411753590472707). Shriram is
right: Martin has no clue what he's talking about. Obviously, with a
statically typed language, I don't have to "test" whether a function
that expects an integer behaves properly when passed a list or vice
versa; such programs are not representable in the first place, and
this is enforced by the language environment. As such, my "testing
burden" is reduced by orders of magnitude since the language
eliminates entire categories of errors. And yes, these sorts of things
happen all the time in production systems implemented in dynamic
languages (I've personally encountered both of the examples I
mentioned above). But in this exchange, better use of a type system
could also be used to define better interfaces.

Martin writes that he thinks, `addSongToLibrary(String title, String[]
authors, int durationInSeconds);` is a fine, self-documenting
interface. His words:

|This seems like a very nice abstraction to me, and I cannot
|imagine how a comment might improve it.

John, you rightly take him to task about this, asking among other
things: "Is there any expected format for an author string?" A great
question, but one that could be obviated by passing an instance of an
`Author` object that abstracts that away. Here, the problem may not be
a lack of comments, but rather a lack of structure in the types
accepted by the function.

Two other things jumped out to me about this example: first, just as
using a string for authors is insufficiently precise to effectively
use the interface, using an integer for "durationInSeconds" is
unnecessarily low-level, and complects the interface with an
irrelevant detail (the time unit). Moreover, the interface is less
flexible as a result: I, as a listener of music, probably only care
about song durations in units of seconds. But a recording engineer may
well want something more granular. Similarly, what if this is called
from somewhere that has the duration as milliseconds, or a
(minutes,seconds) pair? So instead of taking an `int` for duration and
giving it a name that ties it to a particular unit, why not take an
instance of a more generic `Duration` type, instead? Many languages
have them; this example is in Java, so passing a `java.time.Duration`
here would be superior to the `int`.

Second, I'm no musician, but I don't think one usually refers to the
creators or performers of a song as the "authors"; "writers" seems
more common in the domain (at least given my teenage experience
pouring over liner notes). So in this case, the interface is poor
because it doesn't match the domain's expected nomenclature. And while
one often reads the term "songwriter", songs are also written by
producers or composers or others in the studio, so perhaps this should
take instances of a "Creator" object. The point here is that the
abstraction ought to match the domain, and it doesn't. Is this unfair
to Martin? After all, he's a consultant and author, not (as far as I'm
aware) a musician. Perhaps, but it's also an engineer's professional
obligation to make sure they're familiar with the domain they're
working in: if I were designing that interface, at a minimum I'd ask a
couple of musicians what I should call my arguments.

In one of Martin's books, there's a chapter where he sits down with
another agile consultant type and they "pair" to write a program that
implements the rules for scoring a bowling match. Something that
struck me about this is that they don't seem to start with a written
specification of those rules, but rather just feel their way through
based on their intuition; they conclude by saying something like,
"Yeah, I think that's about it", perhaps drawing from a memory of a
childhood birthday party? That may be fine for a consultant pedding
trivial "katas", but it's no way to write professional software. I see
a similarly flippant attitude with the presentation of
`addSongToLibrary`.

Related to the first point, Alexis King wrote a wonderful blog post
several years ago about what she called, "type-driven design"
(https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/).
If I were to try and distill this to its essence, it says that a type
system can and should be usefully employed to encode semantic meaning
into data in such a way that it cannot be misused. Such encoding
should happen at the edges of a system, so that preconditions on data
need not be re-checked when the data is used.

This abstraction Martin proposed is bad not only because it's
undocumented, but because it entangles the interface with irrelevant
implementation details about the format of the arguments that a caller
must now know about and take care to enforce: the burden is now on the
caller to convert to the expected format at the call site. So while
documenting the interface would be an improvement, using King's model
would be even better. Internally, the system should pass around
Duration objects to represent lengths of time, only converting to and
from at the boundaries of the system. In fact, in this case, the
interface should be both improved _and_ documented so that it's easily
discoverable and more usable.

I would like to turn to the prime number generator program now. On my
first reading, I somewhat preferred Martin's final version of the
prime number program, but looking at it again now, I see that it is
deficient in several significant ways.

First, there's the whole matter of attribution. The algorithm is not
due to Knuth, but rather to Dijkstra (this is the example used in
section 9, "A First Example of Step-Wise Program Composition" from
"Structured Programming"). Knuth himself acknowledges this in the
source paper on literate programming that this whole thing is based on
(http://www.literateprogramming.com/knuthweb.pdf); Martin should do
better.

Next, Martin leaves it entirely undocumented, even after repeated
pleas to, at least, explain the rationale for squaring numbers. I find
the insistence on not documenting particularly odd given that he got
the algorithm from a paper on literate programming. If they go back to
Knuth's paper, the curious reader will find that the whole business
around squares and multiples and the properties of both with respect
to this algorithm is actually very well explained in a way that is
highly accessible to someone with a lay knowledge of the mathematics
involved. No need for hour-long bike rides or long contemplative sits
in one's chair with one's eyes closed when one could spend 15 minutes
reading a program's documentation. Seriously; the irony here is
deafening.

As an aside, note that Dijkstra also explains use of squares and the
multiples array; however, he quotes Knuth (who presumably read a draft
copy of "Structured Programming") and commented, "Here you [Dijkstra]
are guilty of a serious omission! Your program makes use of a deep
result of number theory, namely that if p_n denotes the $n$th prime
number, we _always_ have $p_{n+1} < p_n^2$." Dijkstra responds,
"Peccavi." Note that careful reading of the original could also let us
make a _memory_ optimization; observe that the "multiples" array
doesn't need to be the same size as the output array (length n), but
rather can be ~sqrt(n) entries; Knuth notes that, for n=1000, the
index into the multiples array is never more than 30; he's using
Pascal, and the array bounds go from 2..30, so the size is actually
smaller.

Martin also fails to validate input: the way it is defined, a caller
could ask for 0 primes and the program would generate an exception (I
believe the same is true of your version as well, John). I get that
this is a pedagogical example, but that actually amplifies the point:
this is a pedagogical example in a book targeted towards
professionals, not a first example in an introductory programming
text. In this context, I should reasonably expect to see best
practices, while in an introductory book one may introduce those later
so as not to overwhelm the reader early on.

His choices around decomposition lead to "spooky action at a
distance": he makes `candidate` a data member of the class. So it's
incremented in the loop, but not passed to `candidateIsPrime` or
`registerTheCandidateAsPrime` methods. To understand what the main
loop does, I also have to understand the other two methods. Frankly, I
don't see why so many things need to be data members of a class.
Indeed, even the structure of the `for` loop he uses is misleading, in
that the initialization and increment parts are manipulating
`candidate`, but the test portion is against `primesFound`.

Still, I rather liked the concision of his main loop and the semantic
pairing of, "if the number is prime, add it to the list of primes". It
struck me that the program might be better written by keeping that as
the central structure of the loop while separating out the primality
testing state into an auxiliary object. But because of the tight
coupling between that state and the iteration through candidates, I
became convinced that this was hopeless. What little benefit being
able to write something like:

candidate = 3;
for (i = 0; i < n; i++) {
while (!auxState.isPrime(candidate))
candidate += 2;
primes[i] = candidate;
}

...just isn't repaid by the complexity of maintaining that state
separately. Moreover, it's no longer easy to verify that the state
stays in sync with the context of the rest of the program. Trying to
decompose this program into smaller functions that share state in the
parent object is not useful, and just increases the cognitive load on
the programmer, who as you pointed out must now read and internalize
the decomposed methods to convince themselves both that they work, but
also to ensure that they maintain the invariants of the generator
across invocations. It occur to me that this complication may be
somewhat alleviated by passing that state to a function by argument,
and allowing the function to update the state as needed, but I wonder
if, instead, more of an "iterator-like" interface wouldn't be better:
provide a 'nextPrime()' method and just wrap all the state up into
that. Repeated invocations would simply return the next prime. This
would let the programmer certainly detangle the outer loop, which is
accumulating primes in a buffer, from the details of what prime we're
searching for.

Also, with respect to comments, this is the sort of program that is
better served with an interface doc comment that describes how to use
it, and a "theory statement" comment near the top of the
implementation that explains the algorithm, while omitting most of the
inline comments.

I spent some time poking at this, and my "iterator" idea, and have
attached the result. I'm still not happy with some aspects of it (one
will note that the "OddPrimesIterator" class I implemented duplicates
the `primes` array; this isn't strictly necessary, as that could be
passed as a parameter to `next`, but then we're back to duplicating
state in an awkward way; this feels like the least-bad way to approach
it). Also, one _could_ in theory call a large enough number of times
that it, but given that it's private, this is unlikely.

As a very minor quibble: you import `java.util.ArrayList` in your
version but don't seem to use it.

Thanks, and apologies for the length of this email.

- Dan C.

Primes.java

Dan Cross

unread,

Mar 1, 2025, 2:47:21 PMMar 1

to John Ousterhout, software-design-book

Perhaps I should have sat on my excessively long email yesterday until
this morning. Or, at least, waited on attaching my modification of the
prime number generator program until then. On review this morning, I
saw a number of embarrassing omissions and errors in my theory
statement comment, and I was unhappy about the program anyway.
Attached is a revision.

I'm still somewhat disappointed in the way that state is distributed
across methods in a class; I'd rather it was all localized to one
function. However, I found the main loop in both versions from the
exchange with Martin confusing in that it iterated over candidates and
tested against the number of factors found; I really wanted to
separate those. So what I have now, while imperfect, at least does
that. A careful reader may note that this code resurrected Knuth's
(and Dijkstra's) `square` variable, and reduced the size of the prime
multiples array; I found going back to Knuth's paper very useful, and
studying Martin's refactoring far less so.

In support of comments, I did find them extremely valuable for both
explaining what was going on, and also documenting the expectations
for what can be called when.

I readily admit that I am not a Java programmer, nor am I really set
up to profile Java programs. But I did write a little driver around
the generator to test it, and captured timestamps around a run to get
some timing numbers relative to the other versions. I took care to
warm up the JVM before timestamping the final run, which calculates
the first 10,000 primes. Mine is slightly faster than John's, but not
significantly so; I imagine the delta is well within the margin of
error. This was all on a 2022 Mac Studio with OpenJDK 23.0.2
(installed from Homebrew).

term% diff -u a b
--- a 2025-03-01 14:16:42
+++ b 2025-03-01 14:16:59
@@ -9998,4 +9998,4 @@
104717
104723
104729
-Primes.getFirst(10000) took 1019959 ns
+Primes.getFirst(10000) took 1013000 ns
term%

Note that in the exchange, Martin claimed that his final version of
the prime generator had a significant performance advantage over your
version. I see no reason to believe that his version would be faster,
though, so I ran and timed it as well; on my first go, in my warmup
routine, his version threw an exception because of lack of checking
preconditions as I mentioned yesterday (the warmup just ran a loop
from 1 to 10000, computing that number of primes and checked the size
of the returned array). Easy enough to fix, of course: just skip that
case in the warmup for his version.

Anyway, the best I saw his clock in at on my machine is 1028334 ns;
insignificantly slower than either, but certainly not faster, let
alone faster by ~30% or whatever he claimed. Increasing the number of
calculated primes to 1,000,000 yields similar results, though of
course the whole thing takes longer for all three versions. In this
case, your version was fastest, then Martin's, then mine. But the
absolute delta between all three was very small, around 1.2% between
yours and mine.

term% tail -1 a b c
==> a <==
Primes.getFirst(1000000) took 419105625 ns

==> b <==
Primes.getFirst(1000000) took 424275167 ns

==> c <==
Primes.getFirst(1000000) took 421752292 ns
term%

Put bluntly, I don't trust that Martin performed whatever benchmarking
he did competently; he certainly gave no details about his
experimental setup or execution environment.

- Dan C.

Primes.java

Dan Cross

unread,

Mar 4, 2025, 10:41:54 PMMar 4

to John Ousterhout, software-design-book

Another addendum on this. In fact, Martin's version of the prime
number generator has yet more deficiencies, some of which I read about
elsewhere ("Hacker News" and Reddit).

I had mentioned that when I ran my benchmark against Martin's version,
my warmUp() routine threw an exception due to an out-of-bounds array
access. I didn't think much about it at the time, but the specific
circumstance was when n==1; this means that Martin's version cannot be
used to return a list consisting of only the first prime. One can see
this trivially by trying to run Martin's version with a simple
exerciser:

public static void main(String[] args) {
int[] ps = generateFirstNPrimes(1);
System.out.println(ps[0]);
}

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at rcmLiteratePrimes.PrimeGenerator.initializeTheGenerator(PrimeGenerator.java:23)
at rcmLiteratePrimes.PrimeGenerator.generateFirstNPrimes(PrimeGenerator.java:30)
at rcmLiteratePrimes.PrimeGenerator.main(PrimeGenerator.java:80)

I also mentioned that Knuth's reference program did not append to the
prime multiples array until a candidate exceeded the currently tracked
square, while Martin's does so eagerly (John's does, as well).
However, what I didn't realize at the time was that this means that
the prime multiples overflow their signed, 32-bit representation once
the algorithm gets to 46349, which is only the 4793th prime. In some
sense, this doesn't matter much, since they won't likely be used, but
in addition to being wasteful, it is an error waiting to happen.

Finally, Martin's version puts state into a class, in order to
facilitate method decomposition. However, that state is stored in
static members, meaning that the prime generator is not thread safe.
(I also keep track of state via a class abstraction, but in member
variables in an instance of the generator object.)

In short, not only does Martin's version have the problems I mentioned
in my earlier notes, it is incorrect.

- Dan C.

Matteo Docci

unread,

Apr 15, 2025, 11:30:49 AMApr 15

to software-design-book

I tried to refactor Knuth's code to see where I would end up.
My guess is that explaining parts of the algorithm before presenting the code benefits readers more than fragmenting cohesive code into excessive functions or scattering lengthy comments throughout operations.

I renamed every variable to better reflect its purpose. Notably I renamed the multiples array to nextOddMultiples, putting -1 as a placeholder for multiples of 2. While this choice is as arbitrary as using the number 2 or 4, it allows the array name to include 'Odd' and makes it clearer that the value is a placeholder. An alternative could be using null, but that might depend heavily on the language of choice.
I wrote my version of the class in C#, since that's what I use on a daily basis. It should be almost identical to the Java equivalent.

PrimeGenerator.cs

Reply all

Reply to author

Forward