Rethinking Literate Programming

Gregg Reynolds

May 8, 2014, 8:57:51 AM

to clo...@googlegroups.com

The thread on documentation that Val started (https://groups.google.com/forum/?hl=en#!topic/clojure/oh_bWL9_jI0) is getting a little long so I'm starting a related one specific to litprog.

I've made a start on rethinking LP at https://github.com/mobileink/codegenres/wiki/Rethinking-Literate-Programming.

A few key points:

* Knuth's main early LP tool (WEB) was to a certain extent an attempt to fix deficiencies in Pascal, as Knuth himself explicitly acknowledged. Some aspects of Knuthian LP (KLP) may make sense for imperative languages with side effects; since it's hard to reason about programs written in such languages, added commentary is needed. But if you already know that functions are side-effect free and data are immutable, you no longer need that.

* Programming language design has not evolved in the direction of LP, as we might have expected from some of Knuth's more grandiose pronouncements; instead, languages have evolved in the direction of greater expressivity, which obviates the need for many kinds of documentation. You can argue that LP was a fine thing in its day, but the world has moved on.

* KLP is largely based on the personal aesthetic and psychological preferences of DE Knuth involving issues such as the proper order and mode of presentation of code. Those are normative issues, and there is no reason to take Knuth's preferences as gospel. In particular there is no justification for his claim that using LP "methods" leads to "better" code. It not only depends on what "better" means, it depends on what other methods are available. Just because writing Pascal in LP was better (for Knuth et al.) than writing plain Pascal does not mean this will always be the case in all languages. It doesn't generalize. (To a hammer, everything looks like a goto.)

* There is (demonstrably) no reason to think that there is any "natural" or "best" order of presentation for code; there are only preferences, and everybody has one, if you catch my drift. The point again being that Knuth was not on to some kind of laws of programming. KLP is all about his preferred style, not about the Way Things Are.

* KLP is sometimes contrasted with "self-documenting code". To get a grip on what that is and what we can expect from it we need to examine the notions of function, algorithm, and code. Then it looks like code does not in fact "self-document", if "documentation" is taken to mean explanation. But it does express meaning, and sometimes expressivity is preferable to explanation. Maybe that's terminological nit-picking, but sometimes coming up with the right terminology makes all the difference (see "lambda abstraction").

* Speaking of which, Knuth himself admitted that his choice of "literate programming" as the name of his "new method" was tongue in cheek, since it makes anybody who doesn't use it an "illiterate programmer". (The citation is in one of the essays in his book "Literate Programming".) So maybe we should stop using it and come up with a more accurate name. Howsabout "Knuthian Programming"?

* Knuth's model for program text is the literary essay, read from beginning to end. This is obviously in tension with the way code actually works. Library code usually does not have a beginning or end, for example. This is a little ironic, since hypertext has liberated us from the tyranny and oppression of linear narrative. A better literary analog to program text is The Book of Lists, or Commonplace books, whose contents can be read in any order.

* Finally, a whiff of a hint of a soupcon of a concrete proposal: instead of supporting some kind of structured markdown-style syntax in comments and docstrings, add support for the Z specification notation, so that we can express in clear, concise, formally defined, standard set-theoretic notation the exact meaning of code. That's the general idea, I don't have a concrete suggestion yet.
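
As a rough illustration of what "the exact meaning of code" in set-theoretic terms might look like, here is a minimal sketch. It is not Z notation and not part of the proposal above; it only uses Clojure's existing `:pre`/`:post` condition maps, and the function name `add-member` is made up for the example. The comments show the set-theoretic reading of each condition.

```clojure
(require '[clojure.set :as set])

;; Hypothetical example function; the conditions play the role of a spec.
(defn add-member
  "Adds x to the set s."
  [s x]
  {:pre  [(set? s)]                     ; s is a set
   :post [(contains? % x)               ; x ∈ s'
          (set/subset? s %)             ; s ⊆ s'
          (= % (set/union s #{x}))]}    ; s' = s ∪ {x}
  (conj s x))

(add-member #{1 2} 3)
```

The conditions are checked at run time rather than proved, so this is weaker than a formal specification, but it keeps the statement of meaning next to the code.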

There's more stuff on the wiki, and more to be said, but I'll stop here.

Cheers,

Gregg

Gregg Reynolds

May 8, 2014, 9:00:26 AM

to clo...@googlegroups.com

PS. Just to be clear, my purpose is neither to attack nor to defend LP, just to get clear about exactly what it is, what its presuppositions are, what its implications are, etc.

-G

Mark Engelberg

May 8, 2014, 2:02:17 PM

to clojure

Gregg,

I can tell by the amount of work you've put into this document that this is an earnest attempt at analysis and not trolling, so I'm going to give you my earnest response: you are wrong on so many levels.

First, you seem to have several misconceptions about literate programming in general, and Knuth-style literate programming specifically, which makes me wonder whether you've ever actually read Knuth's code. For example, you say, "Knuth's model for program text is the literary essay, read from beginning to end. This is obviously in tension with the way code actually works." Yes, it is true that one goal of literate programming is to free the programmer to choose an order to describe the code that is independent of the order and structure that the compiler needs to see it. But a Knuth-style literate program is far more than a linear essay. Although you *can* often read a literate program from beginning to end in order to understand the full system, the detailed hyperlinking and indexing makes reading code almost more like a choose-your-own-adventure story, making it easy to understand the parts of code you care about and understand how the different parts relate to one another. I suggest you sit down and read some actual Knuth code if you want to try to understand what that approach does and does not accomplish.

Second, you repeatedly make the case that programming expressiveness (and presumably Clojure's expressiveness specifically) is far better than in Knuth's time. This is nonsense. LISP is one of the oldest languages, and Clojure isn't profoundly different in its expressiveness. In fact, Clojure has a number of features that actively hurt its expressiveness relative to other modern languages:

1. Definitions must precede their usage. Can usually work around this with forward-declarations, but even that small extra burden causes programmers to tend towards a bottom-up style of writing Clojure code, even if that is not desirable for a certain program.
2. Very strict limitations on ways that different files/namespaces relate to one another (e.g., no cyclic dependencies), so very often, things need to be organized for the convenience of the compiler rather than for understanding.
3. Limited convenient notation for expressing that a function is merely a "helper function" (only defn has a convenient notation for this, defn-).
4. The tooling convention of having tests in separate files places even more obstacles in the way of using things like defn- in order to express the distinction between primary and secondary functions.
5. Clojure's inability to handle local recursive references (i.e., no letrec) provides obstacles for clearly expressing that certain things are merely local functions/data for another function. Some things have to be made global that conceptually aren't.
6. The "sea of sameness" problem -- no visual distinction between functions, macros, variables, control constructs.
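
A few of the items above can be seen in a short sketch. The function names are invented for illustration. It shows the forward declaration from item 1 and the `defn-` helper notation from item 3. For item 5, note that `clojure.core/letfn` does allow local, even mutually recursive, functions, so the limitation applies mainly to other kinds of local recursive definitions.

```clojure
;; Item 1: a name must be defined or declared before it is used,
;; so writing top-down requires a forward declaration.
(declare circle-area)

(defn describe-circle [r]
  (str "circle with area " (circle-area r)))

;; Item 3: defn- marks a helper as private to its namespace.
(defn- square [x] (* x x))

(defn circle-area [r]
  (* 3 (square r)))   ; integer stand-in for pi keeps the sketch exact

;; Item 5: letfn binds local functions that may refer to each other.
(defn even-length? [coll]
  (letfn [(ev? [xs] (if (empty? xs) true  (od? (rest xs))))
          (od? [xs] (if (empty? xs) false (ev? (rest xs))))]
    (ev? coll)))
```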

You state that functional programs are so much easier to comprehend than mutable ones, they don't really require explanation. This also is silly.

I challenge you to buy this book: http://www.amazon.com/Pearls-Functional-Algorithm-Design-Richard/dp/0521513383, a collection of literate programs (in the academic article sense of the word, not really in the Knuth sense) written in Haskell, arguably the "most functional language". Most of the chapters conclude with the entire source of the program under discussion. I challenge you to pick any one of those chapters and look just at the final program, then try to figure out how and why it works. Good luck!

There is a reason why academic computer science journals are not just books of raw source code. Innovative code and complex code require explanation. I would argue that Clojure would not exist were it not for a long tradition of code-embedded-in-detailed-explanation. Clojure is founded on the use of cutting-edge functional data structures, and I would wager that Rich Hickey would likely not have understood these structures well enough to implement them so successfully had he not been able to read articles explaining their construction and why they work.

So this notion that the world is trending away from literate programs just isn't true. We see them all the time in the form of articles and blog posts designed to elucidate, we see them all the time in the context of real-world large, complex systems designed to outlive the people who created them, and in the context of literate programs written to teach and influence a new generation of programmers. For example, did you know that the book/literate program "Physically Based Rendering" recently won a Scientific and Technical Academy Award? (Yes, that's right, a literate program won an Academy Award -- the "Hollywood movie" kind.)

"Physically based rendering has transformed computer graphics lighting by more accurately simulating materials and lights, allowing digital artists to focus on cinematography rather than the intricacies of rendering. First published in 2004, Physically Based Rendering is both a textbook and a complete source-code implementation that has provided a widely adopted practical roadmap for most physically based shading and lighting systems used in film production."

If you don't ever write algorithms that would be easier for others to understand if you could include a picture or a mathematics formula, that's fine. If you don't ever write code that would benefit from being presented in an order independent of the constraints of the Clojure compiler, that's also fine. If you don't write code where the reasoning behind the code needs to be understood deeply by other programmers who will follow you, that's great. The truth is that there is a lot of programming out there that needs to be done -- programming which is so obvious that almost any programmer would sit down and approach it the same way -- programming that requires little explanation. Maybe that's the programming world you inhabit, but you need to recognize that not everyone inhabits that world.

For example, I personally don't inhabit the world of programmers who need to *prove* their programs are correct. But I don't go around denying the value of formal verification systems. I could truthfully make claims like, "One obstacle to formal verification is that it is difficult to constantly update the verification proofs in real-world programs with aggressive schedules and constantly changing requirements. There's a reason why formal verification hasn't caught on in everyday use -- it's hard. You have to understand programming *and* proofs." However, none of those claims would eliminate the value and validity of that approach to programming.

Similarly, it is true that literate programming faces genuine obstacles to mainstream adoption: it's hard and doesn't (currently) integrate well with the other tooling we use. In fact, I've heard exactly the same claims leveled against functional programming (it's hard and doesn't integrate well with debugging and other analysis and refactoring tools that are commonplace in languages like Java).

But that doesn't mean there isn't value there. Even if you don't personally see the need for literate programming, anything that moves us in the direction of incorporating richer text, images, hyperlinks, video, formulas, auto-updating examples, etc. into our programs is a good thing. One problem with the "journal article" model of literate programming is that it tends to involve prose written around a finished, unchanging program. Any new ideas that help strengthen our ability to build explanations around living programs are a good thing.

I recognize there are challenges: it's very hard to figure out how to do this in a way that will work across all the tools and IDEs that are common in the Clojure community, but I applaud the efforts of those who want to make these problems easier in Clojure.

u1204

May 8, 2014, 3:13:31 PM

to clo...@googlegroups.com, clo...@googlegroups.com

> PS. Just to be clear, my purpose is neither to attack nor to defend LP,
> just to get clear about exactly what it is, what its presuppositions are,
> what its implications are, etc.

I also do not want to get into defending LP yet again. But I do think
you might have missed the key point by focusing on presentation rather
than communication.

> * Knuth's main early LP tool (WEB) was to a certain extent an attempt to
> fix deficiencies in Pascal, as Knuth himself explicitly acknowledged. Some
> aspects of Knuthian LP (KLP) may make sense for imperative languages with
> side effects; since it's hard to reason about programs written in such
> languages, added commentary is needed. But if you already know that
> functions are side-effect free and data are immutable, you no longer need
> that.

Having worked in string-free Pascal, the only real way to "fix it" would
be to take it outside and burn it. :-)

> * Programming language design has not evolved in the direction of LP, as we
> might have expected from some of Knuth's more grandiose pronouncements;
> instead, languages have evolved in the direction of greater expressivity, which
> obviates the need for many kinds of documentation. You can argue that LP
> was a fine thing in its day, but the world has moved on.

"The nature of functional programming is to build, Russian doll-style,
functions that use functions that use functions etc. But without
something like a literate style, your efforts are quickly lost in the
details. You do stuff -- and unless you have a phenomenal memory,
you've simply dug a nice, deep tunnel that is, at the same time,
collapsing behind you. YOU may know what you've done, but how to make
others aware and get them involved? All they see is some collapsed
tunnel with a sales pitch about how you should go re-dig that very same
tunnel." -- Lawrence Bottorff, February 2014

> * KLP is largely based on the personal aesthetic and psychological
> preferences of DE Knuth involving issues such as the proper order and mode
> of presentation of code. Those are normative issues, and there is no
> reason to take Knuth's preferences as gospel. In particular there is no
> justification for his claim that using LP "methods" leads to "better"
> code. It not only depends on what "better" means, it depends on what other
> methods are available. Just because writing Pascal in LP was better (for
> Knuth et al.) than writing plain Pascal does not mean this will always be
> the case in all languages. It doesn't generalize. (To a hammer,
> everything looks like a goto.)

While the "proper order and mode of presentation" is a matter of
personal preference it is not really germane to the problem.

When code is written and decorated with comments about what it does
we feel we have communicated the important part.

The problem is that we missed the "why". Sure, we have immutable, log32,
red-black trees (ILRB trees). Yes, we documented what the arguments mean.
But you'll notice that nowhere in the github tree is there any answer to
"why?".

A "literate programming style" isn't really the issue. The loss of
"why?" is the issue. Answering "why?" means that you have to
build up the background problem so people can understand "why?" the
code is a solution. In other words, you need to communicate the ideas
in some linear fashion so they have a common background understanding.

> * There is (demonstrably) no reason to think that there is any "natural" or
> "best" order of presentation for code; there are only preferences, and
> everybody has one, if you catch my drift. The point again being that Knuth
> was not on to some kind of laws of programming. KLP is all about his
> preferred style, not about the Way Things Are.

As an author I agree that there is no "natural" or "best" order of
presentation. But there are clear preferences. Pick up any "book" which
is a collection of conference papers and you can see that presentation
choice is vital. Random, unorganized piles of ideas are not
communication.

The real focus of literate programming is actually about communication
from one person to another. "Ideas" are missing from "The Way Things
Are". At best we document what something does but not why we want to use
it.

If you look at the mailing list, a lot of the answers are of the form
"this is why you should write it this way".

Rich has been pretty good about communicating his ideas and his videos
can be found on the web, assuming you know where to look and what you
are looking for. Which video would I watch to get the ILRB tree concept?
Which one would I watch to get the ideas behind "conj vs cons"? Or the
details of software transactional memory? Where is the explanation of
that long block of code I posted last week?

And how do those ideas relate to the code? After all, I end up staring
at the code.

> * KLP is sometimes contrasted with "self-documenting code". To get a grip on
> what that is and what we can expect from it we need to examine the notions
> of function, algorithm, and code. Then it looks like code does not in fact
> "self-document", if "documentation" is taken to mean explanation. But it
> does express meaning, and sometimes expressivity is preferable to
> explanation. Maybe that's terminological nit-picking, but sometimes coming
> up with the right terminology makes all the difference (see "lambda
> abstraction").

"Self-documenting code" is known as Cobol. Expressivity IS preferable
to explanation as anyone who does comedy or art knows. Explaining a joke
ruins it. I can't say the same for explaining code though.

The "right terminology" is definitely important. But reducing the idea
expressed by the terminology to a single symbol assumes that there is
shared semantics between the author and the audience. A sliding_window
variable is useful in TCP code but meaningless unless you have the
background knowledge to capture the semantics.

> * Speaking of which, Knuth himself admitted that his choice of "literate
> programming" as the name of his "new method" was tongue in cheek, since it
> makes anybody who doesn't use it an "illiterate programmer". (The citation
> is in one of the essays in his book "Literate Programming".) So maybe we
> should stop using it and come up with a more accurate name. Howsabout
> "Knuthian Programming"?
> * Knuth's model for program text is the literary essay, read from beginning
> to end. This is obviously in tension with the way code actually works.
> Library code usually does not have a beginning or end, for example. This
> is a little ironic, since hypertext has liberated us from the tyranny and
> oppression of linear narrative. A better literary analog to program text
> is The Book of Lists, or Commonplace books, whose contents can be read in
> any order.

I have several "books of lists". My chemistry list will tell me
everything about hydrogen. My biology list will tell me all about
glycogen. My physics list will tell me all about momentum. They can be
read in any order. The problem is that communication requires orderly
and logical development to build concept upon concept.

I see you've created a wiki. The wiki idea is that everyone can
contribute and that the information will self-organize. In my experience
with two projects that were wiki-based I have found that the information
may be somewhere on the wiki. Unfortunately the information doesn't
"self-organize" in any rational way. Both wiki efforts reached a "wiki
horizon" where they became chaotic. Anyone could point at the wiki
saying "the information is here", since they wrote it. But no other
person could find it.

As any author of a textbook knows the hard part is linearizing the
information so it can build upon itself. Everybody needs to know
everything to understand anything and it falls to the author to choose
a path, lay the foundations, and communicate the ideas in an orderly
and logical fashion.

> * Finally, a whiff of a hint of a soupcon of a concrete proposal: instead
> of supporting some kind of structured markdown-style syntax in comments and
> docstrings, add support for the Z specification notation, so that we can
> express in clear, concise, formally defined, standard set-theoretic
> notation the exact meaning of code. That's the general idea, I don't have
> a concrete suggestion yet.

I have used ACL2 and Coq to write formal specifications of code. If you
think literate programming is painful and opaque you have to try to read
these proofs. I do support the idea of program proofs but you REALLY
don't want me to climb on THAT horse. I'm being enough of a pain
already. Curry-Howard would get me lynched. :-)

I think that there is a hierarchy of things we can do to make Clojure
code better and these can be ordered by how much "machinery" we need
to make it work. A possible order might be

Geekhood
0) read the code

Motherhood
1) good variable name choice

Javadoc
2) docstrings
3) argument type decorations
4) API coverage

Markdown
5) paragraphs explaining code

??
6) sections explaining ideas

??
7) chapters introducing areas

??
8) table of contents
9) index
10) bibliography

??
11) hyperlinks

??
12) inline executing examples (e.g. running Ant inline)
13) audio
14) video
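
The lower rungs of this list (1 through 5) can be sketched on a single function. This is only an illustration: the function name and arguments are invented, borrowing the TCP sliding-window example mentioned earlier in this message. The point is that the name, docstring, and type hints say what the code does, and only the comment says why.

```clojure
;; Hypothetical helper; names and semantics are assumptions for the sketch.
(defn window-bytes-available
  "Returns how many more bytes may be sent before the receiver's
  advertised window is full."                          ; 2) docstring
  ^long [^long window-size ^long bytes-in-flight]     ; 3) type decorations
  ;; 5) Why clamp at zero: the receiver may shrink its window while data
  ;; is still in flight, and a negative allowance would confuse callers.
  (max 0 (- window-size bytes-in-flight)))

(window-bytes-available 1000 700)
```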

Having travelled this path I found that LP using LaTeX covered all of
these cases using only one tool. Clearly other people would like a
collection of tools aimed at the specific problems. The choice isn't really
important but the goal of better communication is.

Tim

u1204

May 8, 2014, 3:28:37 PM

to clo...@googlegroups.com, clo...@googlegroups.com

> For example, did you know that
> the book/literate program "Physically Based Rendering" recently won a
> Scientific and Technical Academy Award? (Yes, that's right, a literate
> program won an Academy Award -- the "Hollywood movie" kind.)

An awesome book, by the way. I WISH I could write such a literate
program. Like "Lisp in Small Pieces", they wrote a masterpiece.

I hope it shows the next generation of programmers what beautifully
written programs look like.

Tim

Mark Engelberg

May 8, 2014, 4:59:46 PM

to clojure

On Thu, May 8, 2014 at 11:02 AM, Mark Engelberg <mark.en...@gmail.com> wrote:

> In fact, Clojure has a number of features that actively hurt its expressiveness relative to other modern languages:

BTW, that list was by no means exhaustive. In the past couple of hours I've thought of a couple more, I'm sure others could easily add to the list:

7. Use of prefix notation means that math formulas look dramatically different in Clojure than in math form, and therefore, it is difficult to determine at a glance whether a formula as implemented in Clojure matches.

8. Arrays in many domains are more naturally expressed as 1-based, but in Clojure, they are 0-based. I've encountered a lot of code that was confusing because of lots of increments/decrements to shift back and forth between the problem as specified with 1-based implementation and the 0-based implementation imposed by Clojure. Lots of opportunities for off-by-one errors and/or later confusion when other readers try to make sense out of the code.

9. Clojure's ease of functional composition can result in deeply nested calls that are far easier to write than they are to read.

10. Unlike most other languages, every time you give names to local variables with let, you add a level of indentation. Especially with alternations of let and if/cond, you can easily end up with "rightward drift" that makes code harder to read.
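
Items 7 through 10 can each be seen in a few lines. The functions below are invented for illustration and are not taken from any code discussed in this thread.

```clojure
;; Item 7: the discriminant b^2 - 4ac, written in prefix form.
(defn discriminant [a b c]
  (- (* b b) (* 4 a c)))

;; Item 8: a 1-based "k-th element" needs a dec to reach the 0-based vector.
(defn kth [v k]
  (nth v (dec k)))

;; Item 9: nested composition is read inside-out...
(defn sum-of-odd-squares-nested [xs]
  (reduce + (map #(* % %) (filter odd? xs))))

;; ...while the ->> threading macro lets the same steps read top to bottom.
(defn sum-of-odd-squares-threaded [xs]
  (->> xs
       (filter odd?)
       (map #(* % %))
       (reduce +)))

;; Item 10: alternating let and if adds an indentation level at each step.
(defn ratio-or-nil [m]
  (let [total (:total m)]
    (if total
      (let [part (:part m)]
        (if part
          (/ part total))))))
```

The threading macro in the second version is one common way to soften item 9; it does not change what the code computes.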

These are things we learn to live with. If these were show-stoppers, I'd be using another language, but they are not, so on balance I prefer Clojure with its other many strengths. My only point is that by no means is Clojure a pinnacle of expressiveness where all code is miraculously obvious.

Erlis Vidal

May 9, 2014, 8:59:18 AM

to clo...@googlegroups.com

Guys, you really are into the Literate part, those emails are huge! let me catch up and then I'll reply...

Interesting discussion!

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Erlis Vidal

May 9, 2014, 9:33:45 AM

to clo...@googlegroups.com

In the past I've used a java tool to write "acceptance tests". Concordion [http://concordion.org/]. The idea is simple yet effective. You write your documentation in HTML, and later you can run your code that will interact with that documentation and generate a new documentation, marking the portions of the text that are implemented and right (in green) vs the portion that's not yet implemented or failed (in red).

This was an excellent communication tool. We can design the documentation in a way that the information flows and anyone could understand. I think the idea could be used in Clojure also, actually I was thinking about this for a while, it shouldn't be hard to use from clojure, it's a Java tool in the end.
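
The general idea described above (prose paired with runnable examples, each marked green or red after a run) can be sketched in plain Clojure. This is not Concordion's API; the `spec` data shape and the function names are assumptions made up for the illustration.

```clojure
(require '[clojure.string :as str])

;; Hypothetical "specification": each entry pairs a sentence of
;; documentation with an executable example and its expected result.
(def spec
  [{:text     "Splitting a full name yields the first name."
    :actual   #(first (str/split "Jane Smith" #" "))
    :expected "Jane"}
   {:text     "Splitting a full name yields the last name."
    :actual   #(second (str/split "Jane Smith" #" "))
    :expected "Smith"}
   {:text     "Middle initials are extracted (not implemented yet)."
    :actual   #(throw (UnsupportedOperationException. "pending"))
    :expected "Q."}])

(defn check-example
  "Runs one example; tags it :green if it matched, :red if it failed or threw."
  [{:keys [text actual expected]}]
  (let [result (try (actual) (catch Exception e e))]
    {:text   text
     :status (if (= expected result) :green :red)}))

(defn render
  "Produces the annotated documentation as plain text."
  [results]
  (str/join "\n" (for [{:keys [text status]} results]
                   (str "[" (name status) "] " text))))

(def report (mapv check-example spec))
(println (render report))
```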

After reading this discussion I was wondering if a tool like this could be used to do LP; if not, I would like to know why.

Thanks!

Gregg Reynolds

May 9, 2014, 10:17:15 AM

to clo...@googlegroups.com

On Fri, May 9, 2014 at 8:33 AM, Erlis Vidal <er...@erlisvidal.com> wrote:

> In the past I've used a java tool to write "acceptance tests". Concordion [http://concordion.org/]. The idea is simple yet effective. You write your documentation in HTML, and later you can run your code that will interact with that documentation and generate a new documentation, marking the portions of the text that are implemented and right (in green) vs the portion that's not yet implemented or failed (in red).
>
> This was an excellent communication tool. We can design the documentation in a way that the information flows and anyone could understand. I think the idea could be used in Clojure also, actually I was thinking about this for a while, it shouldn't be hard to use from clojure, it's a Java tool in the end.
>
> After reading this discussion I was wondering if a tool like this could be used to do LP, if not, I would like to know why.

Hi Erlis,

That looks like an excellent tool, thanks for bringing it up! Years ago, when I first started dinking around with LP, I wanted to get the good documentation of LP but without mixing code and documentation in one file and without reordering; in other words, a way to reliably attach documentation to code from the outside, keep it up to date, etc. Something akin to the way XSLT relates to XML source docs. I think something like that is doable, but it's complicated and my initial enthusiasm eventually petered out. But it never occurred to me to use tests in this way. It looks like a great way to document APIs (internal and external), and I see no reason why it wouldn't work just fine for Clojure. If not Concordion, it probably would not be too tremendously difficult to add similar capabilities to one of the pure Clojure unit test frameworks.
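
One small piece of "attaching documentation from the outside" is already possible with var metadata. The sketch below is an assumption-laden illustration, not an existing tool: `external-docs` stands in for documentation kept in a separate file, and `attach-docs!` is a made-up helper built on the real `alter-meta!` and `ns-resolve` functions.

```clojure
;; Source code, written without any docstring.
(defn rect-area [w h] (* w h))

;; Hypothetical external documentation, e.g. loaded from a separate file.
(def external-docs
  {'rect-area "Returns the area of a w-by-h rectangle."})

(defn attach-docs!
  "Copies each documentation string onto the matching var in target-ns.
  Symbols that do not resolve to a var are skipped."
  [target-ns docs]
  (doseq [[sym doc] docs]
    (when-let [v (ns-resolve target-ns sym)]
      (alter-meta! v assoc :doc doc))))

(attach-docs! *ns* external-docs)

(:doc (meta #'rect-area))
```

Keeping such external text up to date as the code changes is the hard part mentioned above, and this sketch does nothing to address it.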

One thing I might change (having spent no more than a few minutes looking at Concordion) would be to write Concordion specs in XML instead of HTML. That would make it easier to repurpose the output to PDF, Eclipse helpfiles, Windows helpfiles, etc. (see http://dita-ot.github.io/1.8/readme/AvailableTransforms.html for a list of output formats in active use by tech documentation specialists). In fact, now I think of it, it looks like Concordion specification would be a good candidate for a DITA specialization.

Personally I would not call this LP, just to avoid confusion. Knuth's original notion of LP pretty clearly means, among other things, code and documentation in the same LP source text, and free ordering. Most other LP systems that I've looked at follow those norms, so calling something like Concordion an LP tool would likely lead to gnashing of teeth, not to mention theological debates about True LP. I'm not sure what I would call it, other than a documentation tool. Maybe live documentation? Integrated external documentation? Test-based documentation?

Thanks,

Gregg


Erlis Vidal

May 9, 2014, 11:51:08 AM

to clo...@googlegroups.com

I've always seen this to document what the system does, as a way to gather requirements. And the name used is similar to what you propose. Live Specification or Specification by Example among other names.

It never occurred to me that this could be used for API documentation, and I'm a complete n00b to LP, that's why I asked if we could use something like that. I see that the definition of LP involves the word "programming" so basically we have to bind the code with the "literate" part.

Maybe Concordion could be an interesting idea to present in the discussion we have around a new way of documentation for Clojure. It's nice what you can do with it. We can even use it to document how the future version of the language is progressing, we can go to the "live" page and see what's done and what's pending.

I'll see if I find some time to create something in Clojure that's documented using Concordion.

Anyway, thanks for the answer and keep up the great work, everyone!


Gary Johnson

May 9, 2014, 11:55:31 AM

to clo...@googlegroups.com

puzzler and Tim,

Well said, gentlemen. As someone who has been using LP heavily for the past two years, I have certainly reaped many if not most of the benefits regularly argued in its favor (and of course, I've wrestled with all the usual tooling issues as well). While I agree with puzzler that many programmers probably don't write sufficiently novel software that would benefit from an LP style, quite a few of us do.

In my case, much of my programming is aimed at environmental research, developing models and algorithms for describing and predicting natural processes and human interactions with them. In order for my work to be accepted and used for decision making (usually around land planning), it is absolutely crucial that I can transparently explain all the formulas used, literature cited, and conceptual steps taken to get from empirically measured data to modelled outputs. A literate programming style not only helps me to organize my thoughts better (both hierarchically and sequentially), but it provides me with a living (tangled) document that I can share with my non-programmer colleagues to get their domain-specific feedback about my choice of model assumptions, formulas, etc. This enables a level of collaboration that I simply could not achieve if I simply wrote the code directly.

Finally, as a thoroughly unexpected side effect, some of my most complicated programs actually became much, much shorter when I rewrote them in an LP style (in terms of lines of code, of course). Part of this had to do with the available tooling (Org-mode's polyglot literate programming and reproducible research facilities are outstanding) and part of it simply came from having to write down my ideas in English first. This kept me from rushing into writing code and possibly getting lost in those "collapsing tunnels" to which Tim alluded. Instead, the additional "hammock time" that I took to think my way through how to present my solutions frequently led to "Aha!" moments in which I realized a simpler way to express the problem. Cliche, I know, but still results are results.

Get Literate! (use only as necessary; LP may not be recommended for some patients due to increased blood pressure and carpal tunnel risk)
~Gary

u1204

May 9, 2014, 12:32:46 PM

to clo...@googlegroups.com, clo...@googlegroups.com

With respect to "documentation" of open source software...

"You keep using that word. I don't think it means what you think it
means." -- "The Princess Bride"

The notion that "reading the code" is the ultimate truth for
"documentation" is based on a misunderstanding at so many levels it is
hard to explain. In fact, most of the ideas don't begin to cover
"documenting the system". Fortunately, Robert Lefkowitz absolutely
illuminates the scope of the problem in these delightful talks.

For those who have not heard it, this is truly a treat.
For those who "document" this is a must-hear.

Robert Lefkowitz -- The Semasiology of Open Source
http://web.archive.org/web/20130729214526id_/http://itc.conversationsnetwork.org/shows/detail169.html
http://web.archive.org/web/20130729210039id_/http://itc.conversationsnetwork.org/shows/detail662.html

Tim Daly

Mars0i

May 10, 2014, 2:04:18 PM

to clo...@googlegroups.com

I think we all know this, but just to make sure the point is clear (in some of the discussion here, it doesn't seem that it is), the alternatives are not only:

(a) Source code with docstrings (or fancy formatted docstrings with links, etc.) and sparse comments, but no other explanatory text anywhere.

(b) Literate programming.

Of course long chunks of text are needed to explain algorithms, motivation, paths not taken, etc. Literate programming requires that those chunks be inserted into the source file, and that you run the source file through a filter to get rid of them.

Here's how I have been working lately:

Early in the process of developing a section of code, I often insert long comments to explain what I'm doing or what I intend to do, etc. Like literate code, but with comment characters in front of it. Eventually, I look at the source file and think, "That long comment is cluttering up the code. I have to scroll down too often to see what's going on in the code." Then I put the long comment into a separate file (perhaps a file that already contains some explanatory text), strip the comment characters, and maybe add some markdown or write some additional text. I leave some essential comments in the text, or put information into docstrings if it seems appropriate.

The point is that these days, at least, I don't *want* a lot of text in the file that contains my source code, even though I think that explanatory documentation is *very* important, and even though I'd guess that I've written more of it per line of code than the average programmer. But that's me. Others find LP extremely beneficial, and I support that strategy for those who like it.

(The desire to be able to see a lot of code at once is also one reason why C-style code formatting is undesirable in a lisp language.)

u1204

May 10, 2014, 3:18:57 PM

to clo...@googlegroups.com, clo...@googlegroups.com

> I think we all know this, but just to make sure the point is clear (in some
> of the discussion here, it doesn't seem that it is), the alternatives are
> not only:
>
> (a) Source code with docstrings (or fancy formatted docstrings with links,
> etc.) and sparse comments, but no other explanatory text anywhere.
>
> (b) Literate programming.

Actually, lisp has a long tradition of semicolon-style comments where

;;;; Chapter
;;; Section
;; Subsection
; Paragraph or inline

With some Emacs hacking it would be possible to fold/unfold these
comments. I worked on a Transputer editor that had fold/unfold and it
was reasonably useful. I believe Emacs org-mode can also hide comments
on command. Michael Fogus (The Joy of Clojure) is a better org-mode
resource. John Kitchin (CMU professor) shows org-mode in his talk:

http://www.youtube.com/watch?v=1-dUkyn_fZA

I'm a "software blacksmith" and tend to create my own tools from
scratch so I can't give advice on org-mode, IDEs or other "store-bought"
solutions. :-)

> Of course long chunks of text are needed to explain algorithms, motivation,
> paths not taken, etc. Literate programming requires that those chunks be
> inserted into the source file, and that you have to run the source file
> through a filter to get rid of them.
>

...[snip]...

>
> The point is that these days, at least, I don't *want* a lot of text in the
> file that contains my source code, even though I think that explanatory
> documentation is *very* important, and even though I'd guess that I've
> written more of it per line of code than the average programmer. But
> that's me. Others find LP extremely beneficial, and I support that
> strategy for those who like it.
>
> (The desire to be able to see a lot of code at once is also one reason why
> C-style code formatting is undesirable in a lisp language.)

I used this comment-style documentation method in my youth. I created a
language, called KROPS, which was the implementation language for a
large expert system. It was pure lisp on a Symbolics machine so there
were no other documentation tools. This limited the style of comments to
the conventions mentioned above.

When I returned to the code many years later (which is where LP really
pays off), the comments were extremely helpful but not sufficient. KROPS
uses a circular, self-modifying data structure which prints as a single,
solid block of code. Documenting and diagramming this structure is
necessary to understand it. ASCII tools are not sufficient for the
diagrams but the current tools are.

The code sat in long non-comment stretches. The comments that did exist
tended to follow the semicolon style mentioned above. Overall the code
follows a book-like convention but uses pure source code. Snippets are
attached below to show the style.

Things to note are
* The use of #|..|# (Common Lisp has 2 comment delimiter styles)
* Higher organization of the comments
- an intro to check that it works
- a table of contents
- the use of "levels" of semicolon structure
- the use of docstrings on functions
- the use of UPPERCASE in docstrings to highlight symbol names
- the use of inline semicolons

Not shown are long uninterrupted stretches of code containing
only the docstring/uppercase and a few "inline" comments.

So it is possible to do some form of reasonably well documented
programming that is somewhat structured, involving only some
discipline on commenting style.

====================================================================
#|
;; KROPS in common lisp
;;
;; Trivial test:
;;
;; initialize the system:
;; (PS-INITIALIZE)
;;
;; define the class TEST:
;; (LITERALIZE TEST A=)
;;
;; create one rule:
;; (P ASDF WHEN (TEST A= 1) THEN (PRINT "IT WORKS"))
;;
;; create one working memory element:
;; (MAKE TEST A= 1)
;;
;; look for the rule to fire:
;; (CS) ==> NIL.ASDF
;;
;; run one rule:
;; (RUN) ==> "IT WORKS"
;;

; STRUCTURE OF THIS FILE:

; 1.0 PACKAGE INFORMATION
; 2.0 VERSION VARIABLE
; 3.0 CHANGE LIST
; 4.0 IMPLEMENTATION PATCHES
; 5.0 DATA STRUCTURE DOCUMENTATION
;   5.1 RETE DATA STRUCTURES
;     5.1.1 R-NODE
;     5.1.2 NODES IN THE ALPHA RETE
;       5.1.2.1 R-A-PLAIN
;       5.1.2.2 R-A-OR
;       5.1.2.3 R-A-TRIG
;       5.1.2.4 R-B-A-DISTR
;     5.1.3 NODES IN THE BETA RETE
;       5.1.3.1 R-B-JOIN
;       5.1.3.2 R-B-SORT (sorted memory node)
;       5.1.3.3 R-B-P (PRODUCTION) NODES
;     5.1.4 OTHER DATA STRUCTURES
;       5.1.4.1 R-CS (A conflict set element)
;       5.1.4.2 R-B-COMMON (the vector portion of an R-B-JOIN node)
;       5.1.4.3 R-B-P-C (the vector portion of an R-B-P node)
;       5.1.4.4 R-B-ACC (an access vector)
;     5.1.5 MEMORIES
;       5.1.5.1 R-MEM (Rplacdable memory header)
;       5.1.5.2 R-MEM-ITEM (an element of R-MEM)
;       5.1.5.3 R-MEM-SORT (sorted memory)
;       5.1.5.4 R-UNIQUE
;   5.2 PARSER DATA STRUCTURES
;     5.2.1 INPUT SYNTAX
;     5.2.2 *RULE*
;     5.2.3 INTERNAL DATA STRUCTURES
;       5.2.3.1 PREDICATE
;       5.2.3.2 ORLIST
;       5.2.3.3 ANDLIST
;       5.2.3.4 TERM
;       5.2.3.5 TERMLIST
;       5.2.3.6 CENODE
;       5.2.3.7 ANDNODE
;       5.2.3.8 NOTNODE
;       5.2.3.9 SORTNODE
;     5.2.4 OUTPUT SYNTAX
;   5.3 COMPILER DATA STRUCTURES
;     5.3.1 *RHS-MAKES* ALIST
;     5.3.2 *PS-EXPRS* HASHTABLE
;     5.3.3 THE TOKEN
;     5.3.4 ACCESS FUNCTIONS
;     5.3.5 ACCESS PATHS
;     5.3.6 VARIABLES
;   5.4 OTHER DATA STRUCTURES
; 6.0 DEFVARS
; 7.0 DEFMACROS
;   7.1 GENERAL
;   7.2 PARSER
;   7.3 COMPILER
;   7.4 RETE FUNCTIONS
;   7.5 USER COMMANDS
;   7.6 RIGHT HAND SIDE MACROS
; 8.0 DEFUNS
;   8.1 GENERAL
;   8.2 PARSER
;   8.3 COMPILER
;   8.4 RETE FUNCTIONS
;   8.5 USER COMMANDS
; 9.0 TODO LIST

|#

; 1.0 PACKAGE INFORMATION

(provide :krops)

(make-package :krops)
(make-package :krep)

;(in-package 'krops :nicknames '(kb) :use :cl-user)

(export '(attributep attributes-of cestack context context= cs
          defrule excise excise-module famo
          findrule firecount for-all-matches-of
          ge gt le lisp-value literalize lt make makev
          make-using make-usingv match matches maximize minimize
          modify modifyv ne nomatch opserror p p-context
          p-remove-context pprule ppwm
          priority *ps-all-rules* ps-initialize ps-such-that
          ps-remove ps-reset pwatch
          remove-all remove-match rjust run say same-type
          ps-select select-set strategy tabto
          then watch when window-hook wm wme-extract wme-class=
          wme-time-tag= wnltt & ! +=))

(use-package 'cl-user)

; 2.0 VERSION VARIABLE

(defvar *ps-version* 0)

(setq *ps-version* 2)

(eval-when (eval load)
  (setf *features*
        (cons
         (intern (format nil "KROPS-VERSION-~a" *ps-version*) 'keyword)
         *features*)))

#|

; 3.0 CHANGE LIST

;***********************************************************************
;version 03: dos2unix
;            untabify
;            kb: to krep:
;            remove krep package from shadowing-use-package call
;version 02: change use-package from lisp to cl-user
;            add (make-package 'krep)
;            remove use of ki, change to krep
;            use keyword for package references
;            remove symbolic, lucid, gclisp conditional code
;version 01: SBCL translation
;version 00: Conversion for Tires
;***********************************************************************
;version 86: add version info to *features* list
;version 85: change eval-when to include eval

; 5.0 DATA STRUCTURE DOCUMENTATION

; 5.1 RETE DATA STRUCTURES

; The RETE is composed of two parts, the alpha part and the beta part.
; Alpha tests are tests that can be performed by referencing only one
; working memory element. Beta tests are tests that require more than
; one working memory element. Thus,
; (class attr= 1)
; would generate an alpha test because we need only index into the
; current working memory element to decide if some field in it has
; the value 1. But,
; (class1 attr= <x>)
; (class2 attr= <x>)
; would generate a beta test because we must look at two working memory
; elements to decide that their attr= fields match.
;
; There is one overall structure to a node in the rete network.
; This is called an R-NODE.

; 6.0 DEFVARS

(defvar *attr-id* nil)
(defvar *attr-ndx* nil)
(defvar *beta-tests* nil)
(defvar *ce-vars* nil)

; 7.0 DEFMACROS

; 7.1 GENERAL

(defmacro add1 (arg)
  `(the fixnum (+ 1 (the fixnum ,arg))))

(defun parse-symbol (symbol)
  "PARSE-SYMBOL takes a krops symbol and breaks it into three parts if
it contains a trailing = sign. The purpose is to return the ROLE
FACET and ACCESS-FUNCTION for the symbol. There are two cases:
ROLE.FACETNAME= form and the ROLE= form. In the first case the
ROLE is assigned to the ROLE field, the FACET is assigned the
FACETNAME and the access function is assigned a constructed
GET-facetname form. In the second case the ROLE is assigned the
ROLE, the facet defaults to VALUE and the ACCESS-FUNCTION
is assigned GET-VALUE."
  (let ((role symbol) facet name access pos)
    (cond
      ((colonp symbol)
       (setq name (symbol-name symbol))
       (cond
         ((eq symbol '*=)
          (setq role nil)
          (setq facet nil)
          (setq access nil))
         ((setq pos (position #\. name))
          (setq role (intern (subseq name 0 pos)))
          (setq facet (intern (subseq name (1+ pos) (1- (length name)))))
          (setq access
                (intern (concatenate 'string "GET-" (string-upcase facet))
                        (find-package 'krep))))
         ('else
          (setq role (intern (subseq name 0 (1- (length name)))))
          (setq facet 'value)
          (setq access 'get-value)))))
    (values role facet access)))

... acres of code follow...

Gary Johnson

May 11, 2014, 6:37:40 PM

to clo...@googlegroups.com

Emacs org-mode provides a markdown-like language, which can be organized into a foldable outline (e.g., chapters, sections, subsections, subsubsections). Syntax is provided for headers, ordered/unordered lists, tables, inline images/figures, hyperlinks, footnotes, and (most importantly for LP) code blocks. In order to avoid having to scroll up and down forever to see your code spread through the document, you simply use TAB to fold/unfold the outline sections you are currently interested in. Pressing C-c ' within any code block automatically switches to Emacs' major mode for that language, showing only the code in its own temporary buffer. When you want to see all of your code at once, just tangle the document to a *.clj file and look at it in another buffer. Using auto-refresh on the tangled buffer provides an easy way to keep checking code changes in this way with minimal effort. When you are ready to weave the document into a nicely readable format, org-mode provides output filters to auto-generate LaTeX articles, HTML web pages, OpenDocument files, LaTeX Beamer presentations, and quite a few others as well.

This is just meant to clarify some of the LP-related features of this platform. Obviously, some Emacs Lisp hacking can extend it to do whatever else people want.
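
For anyone who has not seen such a file, here is a minimal sketch of what one can look like. The headings, the prose, the file name convert.clj, and the function celsius->kelvin are all made up for illustration; only the outline stars, the #+BEGIN_SRC / #+END_SRC block markers, and the :tangle header argument are org-mode's own syntax.

```org
* Temperature handling
** Converting units
Field sensors report degrees Celsius, but the model equations
expect Kelvin, so every reading is converted on the way in.

#+BEGIN_SRC clojure :tangle convert.clj
;; Hypothetical example function, not from any real project.
(defn celsius->kelvin
  "Convert a temperature C in degrees Celsius to Kelvin."
  [c]
  (+ c 273.15))
#+END_SRC
```

With point anywhere in this buffer, C-c C-v t (org-babel-tangle) should write the code block out to convert.clj, and C-c C-e opens the export dispatcher for weaving the whole document to LaTeX, HTML, and so on.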
