Large Literate Programs

Aaron W. Hsu

Oct 6, 2011, 4:32:12 AM

Does anyone have examples of large literate programs other than TeX and
Metafont? Specifically ones that are available to be read?

Aaron W. Hsu

--
Programming is just another word for the Lost Art of Thinking.

Adam Russell

Oct 7, 2011, 12:27:33 AM

I would recommend the book "Weaving a Program" by Sewell.
This book is a bit old, but a copy should be easy to find.
It contains several non-trivial examples that are enlightening to read.
The only downside, again, is that the book is kind of old and the examples
are in the original Pascal WEB. However, the book does contain excellent
examples, and it is easy enough to get a sense of LP style for large
programs, which you can extend to whatever LP system you plan on using.

Aaron W. Hsu

Oct 7, 2011, 2:42:05 AM

On Fri, 07 Oct 2011 00:27:33 -0400, Adam Russell <ac.ru...@live.com>
wrote:

> I would recommend the book "Weaving a Program" by Sewell.
> This book is a bit old, but a copy should be easy to find.
> It contains several non-trivial examples that are enlightening to read.
> The only downside, again, is that the book is kind of old and the examples
> are in the original Pascal WEB. However, the book does contain excellent
> examples, and it is easy enough to get a sense of LP style for large
> programs, which you can extend to whatever LP system you plan on using.

So, I guess, now I'm wondering what people might consider a large literate
program? Is there, say, a page limit you need to go over?

I have my own preferences when it comes to literate programs, but I'd like
to see more precedent when it comes to large software artifacts,
especially in terms of modern development practice.

Aharon Robbins

Oct 7, 2011, 2:06:57 PM

In article <j6jp2c$fq4$1...@labrador.cs.tufts.edu>,

See the libavl source (GNU libavl), written using literate techniques:
http://adtinfo.org/index.html.

Arnold
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381
Nof Ayalon Cell Phone: +972 50 729-7545
D.N. Shimshon 99785 ISRAEL

Aaron W. Hsu

Oct 7, 2011, 2:47:05 PM

On Fri, 07 Oct 2011 14:06:57 -0400, Aharon Robbins <arn...@skeeve.com>
wrote:

> In article <j6jp2c$fq4$1...@labrador.cs.tufts.edu>,
> Aaron W. Hsu <arc...@sacrideo.us> wrote:
>> Does anyone have examples of large literate programs other than TeX and
>> Metafont? Specifically ones that are available to be read?
>>
>> Aaron W. Hsu
>>
>> --
>> Programming is just another word for the Lost Art of Thinking.
>
> See the libadt source, written using literate techniques:
> http://adtinfo.org/index.html.

Wow, that's great. Thanks!

Ben Pfaff

Oct 7, 2011, 11:54:57 PM

arn...@skeeve.com (Aharon Robbins) writes:

> In article <j6jp2c$fq4$1...@labrador.cs.tufts.edu>,
> Aaron W. Hsu <arc...@sacrideo.us> wrote:
>>Does anyone have examples of large literate programs other than TeX and
>>Metafont? Specifically ones that are available to be read?
>>
>> Aaron W. Hsu
>>
>>--
>>Programming is just another word for the Lost Art of Thinking.
>
> See the libadt source, written using literate techniques:
> http://adtinfo.org/index.html.

It's a large book but I'm not sure that it's a large program.
There are fewer than 25,000 lines of code and some of that is
mechanically generated by macro substitution.
--
"[Modern] war is waged by each ruling group against its own subjects,
and the object of the war is not to make or prevent conquests of territory,
but to keep the structure of society intact."
--George Orwell, _1984_

Aaron W. Hsu

Oct 8, 2011, 2:07:12 AM

On Fri, 07 Oct 2011 23:54:57 -0400, Ben Pfaff <b...@cs.stanford.edu> wrote:

> arn...@skeeve.com (Aharon Robbins) writes:
>
>> In article <j6jp2c$fq4$1...@labrador.cs.tufts.edu>,
>> Aaron W. Hsu <arc...@sacrideo.us> wrote:
>>> Does anyone have examples of large literate programs other than TeX and
>>> Metafont? Specifically ones that are available to be read?
>>>
>>> Aaron W. Hsu
>>>
>>> --
>>> Programming is just another word for the Lost Art of Thinking.
>>
>> See the libadt source, written using literate techniques:
>> http://adtinfo.org/index.html.
>
> It's a large book but I'm not sure that it's a large program.
> There's less than 25,000 lines of code and some of that is
> mechanically generated by macro substitution.

I've never actually counted, but how many lines of code is TeX?

Aaron W. Hsu

Oct 8, 2011, 2:27:04 AM

On Fri, 07 Oct 2011 23:54:57 -0400, Ben Pfaff <b...@cs.stanford.edu> wrote:

> It's a large book but I'm not sure that it's a large program.
> There's less than 25,000 lines of code and some of that is
> mechanically generated by macro substitution.

Answering my own question, the tangled source for TeX seems to be around
7000 lines of compressed Pascal? I imagine that if it were written
normally that would be closer to 8000 or 9000. My ChezWEB LP system is
about 45 pages of woven output, and about 700 lines of formatted Scheme
code. That seems to line up with the line count to page ratio in TeX, too.

This leads me to conclude that, theoretically, the largest TeX programs
are in the range of 10,000 to 30,000 lines of code, which would be
considered a large program by normal standards, too, I think, but wouldn't
be anywhere close to the really large programs. I just heard about a
system the other day that had about 15,000 or 20,000 lines of code in it
that actually generated about 150,000 lines of Java code as the end
product. Even that is smaller than other programs I know of, which have
500,000+ lines of code.

Extrapolating this out, if my math is correct, that means that such a
program might reasonably be a 35,000+ page book! That's an order of
magnitude larger than some of the largest books in print that I know of.
That's just mind-bogglingly big. The real question is, could such a task
be managed successfully? Could one reasonably scale literate programming
to this magnitude? Would you want to? Would it be better than the systems
already in place around such large projects?
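
A rough check of that extrapolation, as a minimal sketch: it assumes the ChezWEB ratio quoted above (about 700 lines of code in about 45 woven pages) scales linearly to any program size, which is a big assumption. The function name estimate_pages is made up for this illustration.

```c
#include <assert.h>

/* Estimates the woven page count of a program with `lines` lines of code by
   scaling linearly from a sample in which `sample_lines` lines of code
   filled `sample_pages` pages.  Rounds up to a whole page.
   Assumption: the lines-to-pages ratio stays constant at any size. */
long long estimate_pages(long long lines, long long sample_lines,
                         long long sample_pages)
{
    assert(lines >= 0 && sample_lines > 0 && sample_pages >= 0);
    return (lines * sample_pages + sample_lines - 1) / sample_lines;
}
```

With the ChezWEB figures, estimate_pages(500000, 700, 45) comes to 32143 pages and estimate_pages(150000, 700, 45) to 9643, so the order of magnitude of the 35,000+ page figure holds under this ratio.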

Simon Wright

Oct 10, 2011, 9:38:30 PM

b...@cs.stanford.edu (Ben Pfaff) writes:

>> See the libadt source, written using literate techniques:
>> http://adtinfo.org/index.html.

(I should be referencing the original article but my newsreader can't
find it, sorry)

This is from 'Reading the Code':

A TexiWEB document is a C program that has been cut into sections,
rearranged, and annotated, with the goal to make the program as a
whole as comprehensible as possible to a reader who starts at the
beginning and reads the entire program in order.

Do people in general find that LP is a way of documenting a program that
has already been (designed and) implemented rather than part of the
process of designing the program? I'd (naively) hoped for the latter!

gratefulfrog

Oct 10, 2011, 10:12:47 PM

On Oct 7, 8:42 am, "Aaron W. Hsu" <arcf...@sacrideo.us> wrote:
> On Fri, 07 Oct 2011 00:27:33 -0400, Adam Russell <ac.russ...@live.com>

This was my 1st literate program: http://home.scarlet.be/asld0009/Lisp/sal.pdf

It's 180 pages of PDF. The use of the literate paradigm has made the
program actually useful and increased its longevity considerably as a
result. The client has considered it "bug-free" practically since
delivery. I would say that the use of the literate paradigm was
hugely beneficial to me, the writer, as well as to the client, the user.
The program has run nearly continuously for 5 years now...

I hope this helps someone!

Ciao,
Bob

Palo Vlcek

Oct 10, 2011, 10:13:16 PM

Aaron W. Hsu

Oct 11, 2011, 12:22:05 PM

On Mon, 10 Oct 2011 22:13:16 -0400, Palo Vlcek <pavol...@gmail.com>
wrote:

Sorry, I didn't see any body for this message other than quotes?

Aaron W. Hsu

Oct 11, 2011, 1:32:06 PM

On Mon, 10 Oct 2011 21:38:30 -0400, Simon Wright <si...@pushface.org>
wrote:

> Do people in general find that LP is a way of documenting a program that
> has already been (designed and) implemented rather than part of the
> process of designing the program? I'd (naively) hoped for the latter!

I almost always do a significant part of the writing with the programming.
That is, I see them as mutually beneficial. In this sense, it's part of
designing. However, I always attempt to gain some sort of grasp on the
concept and design before I just randomly sit down to program. Sometimes I
experiment with ideas, but in those cases, I'm usually encoding that
thought process into my program anyway, at the documentation level.

So, I use it for both. What I don't ever do is start with a piece of
program text that I've just finished writing, and then think, "Now I'll
fill out the documentation." I might do this with some piece of
non-literate code that I have inherited, but whenever I start a program
from scratch, the prose and code grow together, and never separately.

Aaron W. Hsu

Oct 11, 2011, 1:32:07 PM

On Mon, 10 Oct 2011 22:12:47 -0400, gratefulfrog <gratef...@gmail.com>
wrote:

> This was my 1st literate program:
> http://home.scarlet.be/asld0009/Lisp/sal.pdf

> It's 180 pages of pdf. The use of the literate paradigm has made the
> program actually useful and the longevity increased considerably as a
> result. The client considers it "bug-free" practically since
> delivery. I would say that the use of the literate paradigm was
> hugely beneficial to me the writer as well as the client, the user.
> The program has run nearly continuously for 5 years now...

Interesting stuff. It's nice to find other LP projects around.

Palo Vlcek

Oct 12, 2011, 1:57:05 AM

> Sorry, I didn't see any body for this message other than quotes?
>
> Aaron W. Hsu
>
> --
> Programming is just another word for the Lost Art of Thinking.

Hi,

see the Axiom algebra system (written in Noweb):
http://en.wikipedia.org/wiki/Axiom_%28computer_algebra_system%29

Palo

Aaron W. Hsu

Oct 12, 2011, 4:52:07 PM

On Wed, 12 Oct 2011 01:57:05 -0400, Palo Vlcek <pavol...@gmail.com>
wrote:

>> Sorry, I didn't see any body for this message other than quotes?

> Hi,
>
> see the Axiom algebra system (written in Noweb):
> http://en.wikipedia.org/wiki/Axiom_%28computer_algebra_system%29

I love seeing all of these programs coming out of the woodwork. It's hard
to search for these. This one is really a large system, and done in Common
Lisp too, which is impressive.

Ben Pfaff

Oct 13, 2011, 6:36:16 PM

Simon Wright <si...@pushface.org> writes:

[quoting what I wrote in the libavl book]

> This is from 'Reading the Code':
>
> A TexiWEB document is a C program that has been cut into sections,
> rearranged, and annotated, with the goal to make the program as a
> whole as comprehensible as possible to a reader who starts at the
> beginning and reads the entire program in order.
>
> Do people in general find that LP is a way of documenting a program that
> has already been (designed and) implemented rather than part of the
> process of designing the program? I'd (naively) hoped for the latter!

Maybe advanced practitioners do it the way that you prefer, but
I'm too used to the way that I already write code, and so I wrote
libavl the other way. Also, the libavl book was an evolution of
an existing library that had already been written in the ordinary
way, so it was natural to me to write it that way.

I was also implementing the TexiWEB language in parallel with
writing libavl with it.
--
Ben Pfaff
http://benpfaff.org

Simon Wright

Oct 14, 2011, 3:47:04 AM

Ben Pfaff <b...@cs.stanford.edu> writes:

Thanks for the info.

I implemented one project by design-first; it was an Ada control package
over a C driver for a radar/graphics card, and it was really helpful to
have the mathematics, the C headers and glue, and the matching Ada
declarations in the same source document, not just for me as the
implementer but (I hope!) for the rest of the team.

The current project is much more like yours, for the same reasons. I'm
hoping that even if I'm not doing it "right" I'm at least going to end
up with a better product!

dtopham

Oct 14, 2011, 6:12:06 PM

Ben, Is TexiWEB available for others to use? I couldn't find any links
to it on the internet--even on your homepage! -Dave

Ben Pfaff

Oct 14, 2011, 6:57:07 PM

dtopham <dto...@gmail.com> writes:

> Ben, Is TexiWEB available for others to use? I couldn't find any links
> to it on the internet--even on your homepage! -Dave

It's integrated into libavl. I've never bothered to separate it,
but I don't think that it would be too hard.

So, you can find it in the libavl repository here:
http://git.savannah.gnu.org/cgit/avl.git

There is a little documentation for how to use it in the TEXIWEB
file in that distribution. That file isn't very long, and maybe
it is of interest to the newsgroup, so I include it here:

----------------------------------------------------------------------
Notes on the TexiWEB language:

TexiWEB is Texinfo, with a few additions. You can use almost any
Texinfo you like in a TexiWEB document. Any exceptions are probably
due either to bugs or insufficient generality in the TexiWEB
translator program.

TexiWEB draws inspiration from Knuth's WEB language. If you know a
little bit about WEB, then you should be able to figure out TexiWEB
quickly.

The texiweb program processes TexiWEB. In "weave" mode, it translates
TexiWEB input into Texinfo output. In "tangle" mode, it extracts code
segments into the source files specified in the TexiWEB input.

TexiWEB assumes that code segments are written in C. The "texiweb"
program formats C keywords and operators in its output based on this
assumption. It also parses code segments to find out function
definitions, global variables, structure definitions, and typedefs,
and uses this information to enter these definitions in the Texinfo
concept index (@cindex).

A TexiWEB code segment looks like this:

@<Hello world@> =
#include <stdio.h>

@

Use += instead of = to extend a segment:

@<Hello world@> +=
int main(void)
{
  @<Print hello@>
  return 0;
}
@

In "tangle" mode, each of the lines in @<Print hello@> will be
expanded above indented by two spaces, since the reference above is
itself indented by two spaces.

TexiWEB can also do word prefix substitutions in references; see near
the end of http://adtinfo.org/libavl.html/Reading-the-Code.html for an
explanation.

A segment introduced with @( specifies the contents of a file:

@(hello.c) =
@<Hello world@>
@

The segment @<Anonymous@> is special. This segment may not be
included in other segments, so it is never written to a file. It is
useful for example code in expository text. In "weave" mode, the
section number is left off anonymous segments. The texiweb program,
with --unused, warns about ordinary segments that are not directly or
indirectly included in a file, but it will not warn about anonymous
segments.

A TexiWEB document should include a @setheaderfile command near its
top, perhaps just before @titlepage. The command should specify the
name of a file, which will be created by the texiweb program as part
of the "weave" process. In the Texinfo output @setheaderfile will be
translated into @include.

Outside a code segment, TexiWEB treats text surrounded by |
characters as C code and applies the same kind of formatting to it as
it does to code segments. TexiWEB does the same thing for comments
within code segments, so that within a code segment:
/* Asserts that |tbl_insert()| succeeds at inserting |item| into |table|. */
becomes a nicely formatted comment when it passes through the "weave"
process.

TexiWEB implements @iftangle...@end iftangle and @ifweave...@end
ifweave directives to allow some input sections to be processed only
in the specified mode. These can be used inside or outside code
segments.

TexiWEB implements a feature called "catalogues" that libavl uses to
document the algorithms that it implements, e.g.:
http://adtinfo.org/libavl.html/Catalogue-of-Algorithms.html
I can document this more fully if you like.

TexiWEB has special support for exercises and their answers, which I
found useful in libavl since it is really a textbook of sorts.
I can document this more fully if you like.

The @deftypedef command declares to the TexiWEB parser that its
argument is a typedef name, which makes the formatting of the name
clearer. Usually the TexiWEB parser can figure this out on its own,
but if a typedef name is only created by a word prefix substitution in
a segment reference, then it will not figure it out.
----------------------------------------------------------------------

--
"Implementation details are beyond the scope of the Java virtual
machine specification. One should not assume that every virtual
machine implementation contains a giant squid."
--"Mr. Bunny's Big Cup o' Java"

Yannick Duchêne (Hibou57)

Oct 22, 2011, 11:07:12 AM

On Tue, 11 Oct 2011 04:12:47 +0200, gratefulfrog <gratef...@gmail.com>
wrote:

> This was my 1st literate program:
> http://home.scarlet.be/asld0009/Lisp/sal.pdf

What did you use to edit it? And how did you convert it to PDF?

--
“Syntactic sugar causes cancer of the semi-colons.” [Epigrams on
Programming — Alan J. — P. Yale University]
“Structured Programming supports the law of the excluded muddle.” [Idem]
Java: Write once, Never revisit