There is no reason that this has to be true. We now have a variety of
systems that have become quite popular for marking up source, using
only formatted comments. For example, kdoc, doxygen for C++.
To my mind, a system that combines the power of the literate
programming tools (particularly, the ability to include mathematical
TeX markup) with a non-intrusive approach (using only formatted
comments in the compilation source file) would be ideal and would lead
to a much wider acceptance.
How is that an obstacle?
> There is no reason that this has to be true. We now have a variety of
> systems that have become quite popular for marking up source, using
> only formatted comments. For example, kdoc, doxygen for C++.
The problem with that approach is that it loses the benefits of reordering and
only gives you prettyprinting. Personally, I find that reordering is more
important to a clear program presentation than prettyprinting.
Matthias
--
Matthias Neeracher <ne...@iis.ee.ethz.ch> http://www.iis.ee.ethz.ch/~neeri
"The only debatable issue, it seems to me, is whether it is more ridiculous
to turn to experts in social theory for general well-confirmed propositions,
or to the specialists in the great religions and philosophical systems for
insights into fundamental human values." -- Noam Chomsky
> To my mind, a system that combines the power of the literate
> programming tools (particularly, the ability to include mathematical
> TeX markup) with a non-intrusive approach (using only formatted
> comments in the compilation source file) would be ideal and would
> lead to a much wider acceptance.
You should look at the functional language Haskell, whose standard
includes a literate programming specification. All implementations of
the language that I know of implement this spec, so you don't need to
do any separate preprocessing of the file.
Literate Haskell supports the "default is comment" style, and it
allows one to use anything to make the doc: e.g., HTML, SGML, LaTeX.
Reordering is not needed, since Haskell functions are very unobtrusive
and where you'd define a chunk in a traditional litprog tool, you
define a function in literate Haskell.
Other programming languages could adopt similar support for literate
programming. Is anyone aware of any language besides Haskell that
does this?
--
%%% Antti-Juhani Kaijanaho % ga...@iki.fi % http://www.iki.fi/gaia/ %%%
""
(John Cage)
The extra step shouldn't be a problem (it never has been for me),
especially if you're using make correctly. It doesn't alter the make
command line one bit, except for printing, and only requires a short
addition to the makefile (and that should be added to your master
template makefile anyway). If, on the other hand, the complaint is,
"But I don't want to use make", then I have no sympathy.
> There is no reason that this has to be true. We now have a variety of
> systems that have become quite popular for marking up source, using
> only formatted comments. For example, kdoc, doxygen for C++.
Yes, but, (1) rendering is significantly more important than markup, and
(2) chunking and indexing are both significantly more important than
either markup or rendering. And, *NO*, chunking and function
declaration are not equivalent; they're not even the same type of
activity. <IMHO> And chunking is (for me, at least) the most helpful
part of literate programming. </IMHO>
> To my mind, a system that combines the power of the literate
> programming tools (particularly, the ability to include mathematical
> TeX markup) with a non-intrusive approach (using only formatted
> comments in the compilation source file) would be ideal and would lead
> to a much wider acceptance.
I agree that having no difference between the literate language and
the regular language would be a better solution, but it would
require designing a new language altogether, as things like chunking
are not currently possible in a non-intrusive manner.
I'm just curious about why the extra step seems like such an obstacle
to you? I just started looking at LitProg a few weeks ago, and think it
seems like a wonderful idea. What I've done for my project tree is
modify my top-level makefile so that it knows how to deal with .nw
(noweb) files.
Now, if there is a .nw that has been updated since the corresponding
.java file, a new .java file gets written automatically when I type
"make." Then the make program detects that the .java file is newer
than the .class file, and recompiles that. So for me, it's still just
a one-line command -- "make"
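For the record, the makefile hook described above can be as small as a
few pattern rules. This is just a sketch: it assumes noweb's standard
notangle/noweave commands and a single <<*>> root chunk per .nw file, so
adjust to taste:

```make
# Tangle the compilable source out of the noweb file.
%.java: %.nw
	notangle -R'*' $< > $@

# Weave the printable (TeX) document out of the same file.
%.tex: %.nw
	noweave -delay $< > $@

%.class: %.java
	javac $<
```

With rules like these in place, typing "make" notices the .nw is newer
than the .java, regenerates it, and then recompiles, so the command line
never changes.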
> There is no reason that this has to be true. We now have a variety of
> systems that have become quite popular for marking up source, using
> only formatted comments. For example, kdoc, doxygen for C++.
In the book I'm reading right now, Literate Programming by Donald Knuth,
the author writes (in his first published paper on LitProg):
Thus the WEB language allows a person to express programs in a "stream
of consciousness" order. TANGLE is able to scramble everything up into
the arrangement that a Pascal compiler demands. This feature of WEB is
perhaps its greatest asset; it makes a WEB-written program much more
readable than the same program written purely in Pascal.
What he's talking about is splitting the file up into chunks, and I love
the idea. You take a very complicated method, and split it up into
the parts that make sense -- this is something someone who works on a
complex program does in their mind (and on flowcharts, etc.), but with a
LitProg tool you can actually have those parts be distinct entities. I
just wrote the following simple example this weekend when I was working
on my
teach-myself-the-guts-of-java program (it's a truly simplistic example):
\subsection{delete}
Delete will remove [[len]] bytes from a file, starting at position
[[off]]. As with [[insert]], we simply find the appropriate
[[BFileNode]] and pass on the request.
<<BFile.delete>>=
/**
* Deletes <I>len</I> bytes from this file, starting at offset
* <I>off</I>.
*
* @param off the start offset to delete at
* @param len the number of bytes to delete
*
* @returns number of bytes deleted
*
* @exception IOException if an I/O error occurs.
*/
public int
delete(long off, int len)
throws IOException
{
int results = 0; // bytes deleted from file
<<get [[pNode]] for [[pos]]; failure returns results immediately>>
results = pNode.delete(off, len);
return results;
}
@
...
\subsection{getNode}
The following is a generic snippet of code that is used throughout
this program. It simply creates [[pNode]], a reference to the
[[BFileNode]] which corresponds to position [[pos]] in the file. If
[[pNode]] is null, we will immediately return [[results]]. The idea is
that [[results]] will be zero, or empty, depending on what kind of
variable it is. Another idea is to throw an IOException; for now we
will do this and see if it is flexible enough to work with.
<<get [[pNode]] for [[pos]]; failure returns results immediately>>=
BFileNode pNode = getNode(off);
if ( pNode == null )
{
return results;
}
@
Using a chunk ensures that I can do all my error processing in one
section. The label clearly defines what it is doing, and as long as
I keep that in mind I can have the "results" variable be anything I
want. So my error handling can be in that chunk, but it is invisible to
me in all the places I use it when I am writing my code. I don't think
I can do this with a method.
The REAL advantage of a chunk comes out when you try to write a complex
loop. I've not yet done that, but looking at Knuth's examples shows that
it can really help one understand the flow of the code. Yes, you can
delimit parts of a complex loop with comments, but I find that I still
try to hold the entire thing in my head at once.
If it's split up into distinct parts, i.e., visually AND physically
separate, I can follow everything without getting a headache.
> To my mind, a system that combines the power of the literate
> programming tools (particularly, the ability to include mathematical
> TeX markup) with a non-intrusive approach (using only formatted
> comments in the compilation source file) would be ideal and would
> lead to a much wider acceptance.
At work, my co-workers and I are supposed to use JavaDoc to comment all
our methods. Someone at SUN wrote a paper claiming that JavaDoc was a
literate programming tool (at least I think that was the way they put
it).
It's a very nice comment system that lets you write the API at the same
time as you write the code. But even if it let you write mathematical
equations (say it could take an SGML DTD instead of just assuming the
HTML subset), I don't think it would be as useful as literate
programming.
Programmers just don't like to write API comments. I force myself to
when I can, but sometimes I just don't feel like it -- it's a pain in
the butt to line up and reformat everything by hand when writing the
javadoc comment (and the convention is of course to have those lead '*',
which makes the problem worse).
Literate programming splits explanation from coding in a very definite
way. And you don't have to worry about formatting to the proper
indentation level or whatever. Everything happens automatically for you.
So, in summary, from my perspective LitProg has a few advantages over
just a more powerful comment parser:
LitProg allows you to restructure the way you approach problems.
It lets you make complex code clear by splitting out sections in
an absolutely clear manner (at least to you, the author)
It lets you write your explanations/comments without worrying about
how it will look inside the code.
Jim
It's interesting that until I printed off and read the
source file today, I hadn't noticed that the chunk name
didn't agree with the source. I think I like this litprog
stuff -- I didn't notice in my editor or when I was posting,
but it stood out on paper. =)
Norman
Probably, though some tweaking might need to be done. The
rules for javadoc are very simple.
It is a comment just above a class or a method
It begins with '/**' and continues until the next '*/'
It may contain HTML markup
It may contain multiple 'Tag' entries.
Tags start with '@' and may be one of:
@see
@param
@return
@exception
@deprecated
@author
@version
@since
Since the tags are optional, the only thing really
needed is to enclose the text in /** ... */
I guess people could just as easily write @@author
and so on, to not confuse noweb, right?
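To make those rules concrete, here is a minimal javadoc-commented method
(the class and its clamping behaviour are made up just for illustration;
the comment shows the /** ... */ form and a few of the tags). And, as far
as I can tell, noweb only treats '@' at the very start of a line as a
chunk marker, so the leading " * " should already keep the tags safe:

```java
import java.io.IOException;

public class BFileSketch {
    private final byte[] data = new byte[16];  // pretend file contents

    /**
     * Deletes <I>len</I> bytes from this file, starting at offset
     * <I>off</I>.
     *
     * @param off the start offset to delete at
     * @param len the number of bytes to delete
     * @return the number of bytes actually deleted
     * @exception IOException if the offset is out of range
     */
    public int delete(long off, int len) throws IOException {
        if (off < 0 || off >= data.length)
            throw new IOException("offset " + off + " out of range");
        // Clamp to the bytes that actually exist past the offset.
        return Math.min(len, data.length - (int) off);
    }
}
```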
Oh, in other news. I've been looking at cdefs.nw,
and have to admit that I'm a bit lost... Is there
any *defs.nw that has more commenting on what it
is doing? I don't know Icon, so I have to pull up
the source, the not-very-good online documentation
I found on Icon, and look up each new snippet of
code (like, what does '~==' do, and so on and so forth).
It's an obstacle because of 2 things:
1) There is a vast array of tools that understand the C or C++
(for example) source, and not, e.g., the .nw source.
2) There is a vast array of programmers that would be happy to use C
or C++ code I write that has literate programming comments, but
would have no interest in using my .nw file.
Yes, I know how to use make, and write makefiles. Unfortunately, not
everyone wants to bother, or has the same environment as I do.
: 2) There is a vast array of programmers that would be happy to use C
: or C++ code I write that has literate programming comments, but
: would have no interest in using my .nw file.
Wouldn't the "nountangle-trick" help?
-# Georg
You know, as I wrote my second literate program today,
I thought something just as useful would be a way to auto-insert
the code-chunk name as a comment at the top of each code
chunk. It is a concise, useful description of exactly
what is going on in the next "section" of code.
<<set up shared memory segment [[shm_id]]>>=
/*
* set up shared memory segment [[shm_id]]
*/
...
% Look at the real world: POD and Javadoc, as well as other C++/C tools,
% are the only successful literate programming tools I've seen. Python
% has support for literate programming too, and its model isn't Knuth's
% old and outdated model. Docstrings a la emacs lisp are very useful too.
Of course, javadoc and [what? doc++? Microsoft's thing that nobody ever used
and I can't remember the name?] doesn't have the same objectives as litprog,
and I haven't seen any evidence of its being used by the development teams
I've seen creating java applications. What's more, the idea of sticking
documentation in comments certainly predates Knuth's work on litprog,
so I claim that that approach is failed, old, and outdated.
% Successful == used by joe programmer
But, successful by this measure certainly doesn't mean `useful'.
Litprog is definitely a different approach to programming, and while
an approach which is not used can't provide any benefit, much, if not
all, of the benefit of litprog is lost when you take the cnoweb approach
(which has also been quite unsuccessful, if you measure success by
the number of people actually using it).
The joe programmer equivalent to litprog is the UML, although I don't
know how many people have tried to use Rational Rose to maintain code
once it's past the early design stages. Anyway, the UML is absolutely
the hottest and most exciting new thing in the history of ideas, but
its use in code generators is just an application of Knuth's tired
old ideas.
Joe programmer has followed a lot of trends, and he's generally managed
to do so without meeting his schedules or producing acceptable quality
code.
--
Patrick TJ McPhee
East York Canada
pt...@interlog.com
If you don't care about parameterizing over the comment convention,
you can do this with noweb using something like a 2-line awk script as
a filter. Filters are covered in the Hackers' Guide.
N
> In the book I'm reading right now, Literate Programming by Donald
> Knuth, the author writes (in his first published paper on LitProg):
>
> Thus the WEB language allows a person to express programs in a
> "stream of consciousness" order. TANGLE is able to scramble
> everything up into the arrangement that a Pascal compiler
> demands. This feature of WEB is perhaps its greatest asset; it
> makes a WEB-written program much more readable than the same
> program written purely in Pascal.
>
> What he's talking about is splitting the file up into chunks, and I
> love the idea. You take a very complicated method, and split it up
> into the parts that make sense -- this is something someone who works
> on a complex program does in their mind (and on flowcharts, etc.), but
> with a LitProg tool you can actually have those parts be distinct
> entities. I just wrote the following simple example this weekend
> when I was working on my teach-myself-the-guts-of-java program (it's
> a truly simplistic example):
Why don't you all use inline functions to do chunking? I ordered the
CWEB book because I think Knuth is so fly, but I think CWEB is not
helpful. I already do my own chunking--functions should easily fit on
my emacs screen and in a reader's head.
Any respectable programming language supports chunking already, and
any respectable compiler supports inlining of trivial functions.
Other programmers already know the programming language, so why
introduce an unnecessary and foreign means of achieving the level of
abstraction that you can get with the programming language itself?
A new maintainer of your code will be able to read it easily if you
use the programming language itself, but if you use an unpopular
system, they have to find it and learn it first in order to maintain
your code.
Pretty printing coupled with fancy comments is really great, but I
think that taking the chunking out of the programming language is a
mistake.
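In case it helps to see what I mean by chunking with the language
itself, here is a rough sketch (names like RecordLoader are invented):
each piece a litprog tool would make a named chunk becomes a small
private method with its own scope, so the reader needs nothing but the
language:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RecordLoader {
    private final Map<String, String> db = new HashMap<>();
    private int updated, inserted;

    // The top level reads like a chunk outline, but in plain Java.
    public void load(List<String[]> records) {
        for (String[] r : records)
            applyRecord(r[0], r[1]);
    }

    // "check the record against the db" -- a method instead of a chunk
    private void applyRecord(String id, String name) {
        if (db.containsKey(id))
            updateIfChanged(id, name);
        else
            insert(id, name);
    }

    private void updateIfChanged(String id, String name) {
        if (!name.equals(db.get(id))) {
            db.put(id, name);
            updated++;
        }
    }

    private void insert(String id, String name) {
        db.put(id, name);
        inserted++;
    }

    public int updated()  { return updated; }
    public int inserted() { return inserted; }
}
```

A compiler that inlines trivial methods makes this cost nothing, and a
new maintainer can follow it with no extra tooling.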
--
--Ed Cashin PGP public key:
eca...@coe.uga.edu http://www.coe.uga.edu/~ecashin/pgp/
> > when I was working on my teach-myself-the-guts-of-java program (it's
> > a truly simplistic example):
>
> Why don't you all use inline functions to do chunking? [...]
>
> Any respectable programming language supports chunking already, and
> any respectable compiler supports inlining of trivial functions.
Do you mean "you all" as in those of us using literate programming tools,
or "you all" as in "hey y'all" just me? If you mean the former, I can't
speak for anyone else. If you mean me, there are a couple of reasons.
Problem number one is that I'm using code chunking for things other than
what inline functions are good at. I'm not using it for "a safer sort
of #define," rather I'm using it as an excellent reference marker. More
on this at the end of my post. =)
The example I gave could, I suppose, be done as an inline function but
for problem number two. That code was in Java, and you can't do inline
functions in Java. I'm not going to get in a debate about C++ vs. Java
(my homepage already has a bit on that), but I have to program in perl,
C, Java, and <shudder>SQL</shudder> at my job. Saying "any respectable
programming language supports [it]" isn't, well, reasonable. Different
tools for different jobs means some nice features are in one tool and
not another.
> Other programmers already know the programming language, so why
> introduce an unnecessary and foreign means of achieving the level of
> abstraction that you can get with the programming language itself?
>
> A new maintainer of your code will be able to read it easily if you
> use the programming language itself, but if you use an unpopular
> system, they have to find it and learn it first in order to maintain
> your code.
How foreign is <<name of your language>> to you? If you are talking
about DEK's WEB or CWEB, then I agree that the syntax is ... about as
easy to learn as TeX. =) However, if you look at Norman Ramsey's
noweb, it's simplicity itself:
@ begin a document chunk
<<id>> reference a code chunk
<<id>>= begin code chunk "id"
And that's all there is to it. Plus, reading the actual weaved product
is, for me, easier than reading a straight printout of the raw code.
I actually DID leave work today thinking to myself "programming in noweb
is great fun!" just like DEK wrote in one of his essays.
> Pretty printing coupled with fancy comments is really great, but I
> think that taking the chunking out of the programming language is a
> mistake.
I don't know if I'm all alone here, but I also rather dislike the pretty
printing -- I don't enjoy having to reread the pseudo-code operator
symbols until my feeble mind groks what a sequence of them is really
doing. But lots of literate programming tools print the code WYSIWYG.
I think many people enjoy pretty printing because they go further
than I do in the litprog concept. They almost make it pseudo-code because
they really ARE writing a manuscript for publication instead of just code.
The code algorithm is retained, but there is an attempt to hide the syntax
of the language actually being used (which may be considered unimportant)
under the hood. I'm not anywhere near that level.
> I ordered the CWEB book because I think Knuth is so fly, but I think
> CWEB is not helpful. I already do my own chunking--functions should
> easily fit on my emacs screen and in a reader's head.
I also ordered CWEB, and didn't like the looks of it very much. I actually
do know TeX, or rather I used to use it all the time and I could probably
polish off those neurons if I had to. But I still don't like the rather
odd syntax that DEK chose for WEB/CWEB.
However, I disagree with your implication that functions are just as
easy to fit in one's head and do just as good a job showing how the
code works.
Say you are given a program "specification":
Take a file, split it into records
Check a db to see if the records are modified or new
Update the db if necessary
Well, that's the sort of thing you get from your manager. You
need to expand on it:
Parse command line options
Set up database connection
For each line in input parse into a record
Check the record against the db
If the record exists, update it
If the record is new, insert it
Well, that's still not complete. How about this:
Import external library functions
Define variables used
Parse command line options
Set up database connection
For each line of input
Split line into fields based off our <specification>
Check that the fields are reasonable
Check the record against the db
If the record exists
If the record is different
Update the record
If the record is new, insert it.
That's close to a real spec, and I'm beginning to realize that it's
something I *should* have been spelling out in my programs from the
beginning. I usually went with that first description, or maybe the
second if I was feeling particularly productive. With litprog techniques
I have it almost automatically. And best of all, it comes naturally.
At the end is a snippet from a program I just finished writing tonight.
It's from my second literate program, and it's the one that had me
so happy with using noweb when I left the office today.
Actually, if experienced literate programmers are still reading this, I'd
love it if some of you could take a look at my first two literate programs
and say something about how you would have done things differently in
terms of the code formatting, verbiage, etc. And of course if I'm making
any glaring mistakes in my code.
1st lit program (in c): http://highwire.stanford.edu/~jimr/mod_warmup.pdf
2nd lit program (in perl): http://highwire.stanford.edu/~jimr/qip.pdf
The 1st one needs some polish. I have some ideas on improving the 2nd
as well, but I'm pretty happy with it. I should also explain that these
two fall into the "utilities" category: I haven't had a chance to tackle
one of my big projects with litprog techniques. =(
Here's that snippet of qip.pl which explains in some detail about how
the program works. I think the specification just falls into your lap
when you're writing with litprog techniques...
\section{Program structure}
The structure of this program is fairly simple. \program\ will read
in a text file, either from \textsc{stdin} or from a file specified on
the command line, and split each line into a record. Each record will
be checked against the [[orgs]] table; if the record has been updated
or is new, \program\ will modify the table or insert the new record.
<<*>>=
#!/usr/gnu/bin/perl
#
# Read in a comma-separated list of records for nextwave's [[orgs]]
# table, and update or insert any altered or new records.
# The format of the list is:
# [[org_id]], [[org_name]], [[pgm_id]], [[pgm_name]]
# Any commas embedded in the field must be escaped with a '\'
#
use strict;
<<import perl modules>>
<<declare global variables>>
<<parse command line arguments>>
<<set up database connection [[dbh]]>>
while(<>) # for each line of input
{
<<declare local variables>>
<<split line into [[org_id]], [[org_name]], [[pgm_id]], and [[pgm_name]]>>
<<get [[rowcount]] for [[org_name]], [[pgm_name]] matching [[org_id]], [[pgm_id]]>>
if ($rowcount == 1)
{
<<update db record if [[org_name]] or [[pgm_name]] has changed>>
}
elsif ($rowcount == 0)
{
<<insert new db record>>
}
else
{
my $error = "Database results from query on record $n returned "
          . "a rowcount of $rowcount.\n"
          . "This may indicate a bad database entry.\n";
die($error);
}
}
...
So if anyone is looking at qip.pl and thinking "that's
not splitting on unescaped commas," here is what the
line really looks like:
@field = ($_ =~ /^(.+?[^\\]),(.+?[^\\]),(.+?[^\\]),(.+)$/);
The '^'s appear properly in acroread, but I guess xpdf doesn't
even warn that it's dropping a symbol! =(
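For anyone following along without perl handy, the same unescaped-comma
split can be sketched with java.util.regex (the field values are made up;
note the doubled backslashes a Java string literal needs):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitSketch {
    // Same idea as the qip.pl regex: each of the first three fields must
    // end in a non-backslash character, so an escaped comma ("\,") can
    // never terminate a field.
    private static final Pattern RECORD =
        Pattern.compile("^(.+?[^\\\\]),(.+?[^\\\\]),(.+?[^\\\\]),(.+)$");

    public static String[] fields(String line) {
        Matcher m = RECORD.matcher(line);
        if (!m.matches())
            return new String[0];
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        // The escaped comma in "Acme\, Inc." stays inside field two.
        String[] f = fields("12,Acme\\, Inc.,77,Widgets");
        System.out.println(f[1]);   // prints "Acme\, Inc."
    }
}
```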
Jim
> [Loooong post warning. Windbag alert! Just me babbling away...]
>
> > > when I was working on my teach-myself-the-guts-of-java program (it's
> > > a truly simplistic example):
> >
> > Why don't you all use inline functions to do chunking? [...]
> >
> > Any respectable programming language supports chunking already, and
> > any respectable compiler supports inlining of trivial functions.
>
> Do you mean "you all" as in those of us using literate programming
> tools, or "you all" as in "hey y'all" just me? If you mean the
> former, I can't speak for anyone else. If you mean me, their are a
> couple of reasons.
Hmm. You know that "y'all" is always and only plural, right? I meant
"all the literate programming advocates in this thread", and you
constitute most of them, weighted by postings per (this) thread, so
speak away!
> Problem number one is that I'm using code chunking for things other
> than what inline functions are good at. I'm not using it for "a
> safer sort of #define," rather I'm using it as an excellent
> reference marker. More on this at the end of my post. =)
You seem to be using it in situations where trivial subroutines or
functions with their own local variables would make your program
easier to read without requiring your audience to know anything
besides the programming language.
I notice in your examples that it is difficult to see scoping
directly, since the scopes are broken across chunks. I really like
the way local variables in perl and C (and others), coupled with
braces and indentation, let you easily see exactly what data is
relevant to a particular snippet of code.
> The example I gave could, I suppose, be done as an inline function
> but for problem number two. That code was in Java, and you can't do
> inline functions in Java. I'm not going to get in a debate about
> C++ vs. Java (my homepage already has a bit on that), but I have to
> program in perl, C, Java, and <shudder>SQL</shudder> at my job.
> Saying "any respectable programming language supports [it]" isn't,
> well, reasonable. Different tools for different jobs means some nice
> features are in one tool and not another.
No, it is not reasonable taken absolutely. But not all of the ones
you mention are the kinds of languages where inlining is appropriate,
and of course they don't need to support it to be respectable. I was
unclear:
I meant that serious languages provide a means of chunking that
doesn't seriously degrade performance--inline functions are just one
way to do that. SQL is a language that I only use in bits that are a
few lines long anyway.
> > Other programmers already know the programming language, so why
> > introduce an unnecessary and foreign means of achieving the level of
> > abstraction that you can get with the programming language itself?
> >
> > A new maintainer of your code will be able to read it easily if you
> > use the programming language itself, but if you use an unpopular
> > system, they have to find it and learn it first in order to maintain
> > your code.
>
> How foreign is <<name of your language>> to you? If you are talking
> about DEK's WEB or CWEB, then I agree that the syntax is ... about as
> easy to learn as TeX. =)
I must admit I wuv TeX and was talking about CWEB, which I would
hesitate to thrust upon my workplace successors.
> However, if you look at Norman Ramsey's
> noweb, it's simplicity itself:
>
> @ begin a document chunk
> <<id>> reference a code chunk
> <<id>>= begin code chunk "id"
>
> And that's all there is to it.
Very nice, but still something extra for my successors to learn and
understand, and perhaps not necessary given the chunkability of many
languages.
> Plus, reading the actual weaved product is, for me, easier than
> reading a straight printout of the raw code. I actually DID leave
> work today thinking to myself "programming in noweb is great fun!"
> just like DEK wrote in one of his essays.
I would certainly have more fun using noweb or CWEB, but I think that
the resulting product would be a little less readable (I'm thinking of
DEK's own sources as viewed from a CWEB beginner's eyes) than code
well-chunked using the language itself.
> > Pretty printing coupled with fancy comments is really great, but I
> > think that taking the chunking out of the programming language is a
> > mistake.
>
> I don't know if I'm all alone here, but I also rather dislike the
> pretty printing -- I don't enjoy having to reread the pseudo-code
> operator symbols until my feeble mind groks what a sequence of them
> is really doing. But lots of literate programming tools print the
> code WYSIWYG,
Oh, I don't like the weird "pretty" symbols--I was thinking of, e.g.,
using italics for keywords, using proportional fonts where possible,
etc.
...
> However, I disagree with your implication that functions are just as
> easy to fit in one's head and do just as good a job showing how the
> code works.
> Say you are given a program "specification":
>
> Take a file, split it into records
> Check a db to see if the records are modified or new
> Update the db if necessary
Now why break it down further? You have a general and understandable
strategy that fits nicely into three functions. You'll have to break
them down further in your code ... functions calling functions calling
functions.
...
> That's close to a real spec, and I'm beginning to realize that it's
> something I *should* have been spelling out in my programs from the
> beginning. I usually went with that first description, or maybe the
> second if I was feeling particularly productive. With litprog
> techniques I have it almost automatically. And best of all, it
> comes naturally.
Of course design is important, but I do that with pen and ink
drawings.
> At the end is a snippet from a program I just finished writing
> tonight. It's from my second literate program, and it's the one
> that had me so happy with using noweb when I left the office today.
I like the look of it, but I feel that if the programs were regular C
and perl sources, I'd be able to tell at a glance what they do. The
extra structuring is fine for a book, but as a programmer who sees a
lot of C and perl and little noweb output, I feel hindered by noweb's
woven presentation.
A big problem is that it's more difficult to see variable scoping
graphically.
...
> <<*>>=
> #!/usr/gnu/bin/perl
> #
> # Read in a comma-separated list of records for nextwave's [[orgs]]
> # table, and update or insert any altered or new records.
> # The format of the list is:
> # [[org_id]], [[org_name]], [[pgm_id]], [[pgm_name]]
> # Any commas embedded in the field must be escaped with a '\'
> #
> use strict;
> <<import perl modules>>
This is simple enough not to warrant breaking down.
> <<declare global variables>>
Similarly, perl programmers expect globals here. I notice too that I
cannot see your "use vars()" statement next to your global variables.
You might not even mean package globals at all but lexically scoped
variables. I don't know because I have to go look it up rather than
seeing it at a glance. That is part of what makes the code seem
unnecessarily foreign.
> <<parse command line arguments>>
Could be in a subroutine. I should probably rephrase my question:
How is the chunking provided by third-party software like noweb
superior to the chunking provided by the languages everyone knows?
... Or is literate programming just a clever way to get programmers to
do what they could have been doing all along by offering them
something "new"?
Oh no, I've had some people say "Hey Y'all" to me when I was all by my
lonesome! And yes, this newsgroup doesn't seem very active -- probably
because we're all just rehashing the same old same old. =)
[General Query: I notice the litprog mail archive stops around 1995,
is that when it all went to Usenet?]
> You seem to be using it in situations where trivial subroutines or
> functions with their own local variables would make your program
> easier to read without requiring your audience to know anything
> besides the programming language.
I think we have a fundamental difference of opinion here. I found the
weaved product very easy to understand, much more so than the straight
code that I've programmed over the years. Also, you seem to be talking
about the actual .nw file, and not the weaved paper. The point of the
weaved product is that one should first read it and gain an
understanding of the program and its structure before diving into the
code. More on the "where trivial subroutines ... would make your
program easier to read" below. =)
> I notice in your examples that it is difficult to see scoping
> directly, since the scopes are broken across chunks. I really like
> the way local variables in perl and C (and others), coupled with
> braces and indentation, let you easily see exactly what data is
> relevant to a particular snippet of code.
What do you mean? I think that for both of the examples I had local
variables appropriate to the area. I guess you're probably talking
about the same thing that someone else mentioned in e-mail -- the fact
that I'm not taking real advantage of perl's
declare-a-variable-where-you-like to make the program clearer? I think
it's just that I, personally, don't find that it makes things clearer.
In raw code, I prefer to be able to just jump back to one area and
look for the variable declaration. For literate programming, I can
define the variables local to the chunk, and have them appear at the
top of the major region when it is tangled. And of course it goes back
to "why not use a function and move some of those variables out?" --
it's because it doesn't make sense to me for such a small program.
> SQL is a language that I only use in bits that are a few lines long
> anyway.
Yeah, I try and use SQL in tiny bits as well. Sometimes it gets
hairy though (like there's this one 50+ line monstrosity that
someone left squatting in our access control database...)
I still don't think perl, C, or Java have any really nice way of doing
chunking (I had to use eval() in perl a little while ago, and I almost
became ill). Now I'm not talking about functions here, since litprog
doesn't affect your ability to use functions.
> Very nice, but still something extra for my successors to learn and
> understand, and perhaps not necessary given the chunkability of many
> languages.
Could you maybe post (on your website or here?) some example of
how you could "chunk" code using, say, something in perl similar
to what I posted?
> I would certainly have more fun using noweb or CWEB, but I think that
> the resulting product would be a little less readable (I'm thinking of
> DEK's own sources as viewed from a CWEB beginner's eyes) than code
> well-chunked using the language itself.
So you are talking about reading the code directly? Reading the raw
.nw or .w file? I think one must take care to be very precise when
writing those files; I put 4 newlines before the start of new chunks,
ensure that tabbing is kept consistent, etc. I think that helps make
the raw code very readable.
> Similarly, perl programmers expect globals here. I notice too that I
> cannot see your "use vars()" statement next to your global variables.
> You might not even mean package globals at all but lexically scoped
> variables. I don't know because I have to go look it up rather than
> seeing it at a glance. That is part of what makes the code seem
> unnecessarily foreign.
Ah, but at some point this afternoon, when I was reading the weaved
output, I changed the noweb file to read:

We need to import [[Highwire::Util::Config]], and [[Getopt::Std]].
We will also want to declare the [[opt_c]] and [[opt_d]] here,
instead of in the normal global variable section. The reason is that
we are using [[use strict]], which requires that all variables are
either lexical or fully declared globals. Because those variables are
defined by [[Getopt::Std]], perl will complain about them unless we
declare them in [[use vars]].
<<import perl modules>>=
use Highwire::Util::Config;
use Getopt::Std;
use vars qw($opt_d $opt_c);
@ %def $opt_d $opt_c
I just prefer to have all the "use ..." in one location. I imagine
other people would think this horrid or something, but it's something
I do with my "raw code" anyway. So the litprog hasn't changed that.
This goes to the point you make later about how I might be just too
lazy. =)
> Now why break it down further? You have a general and understandable
> strategy that fits nicely into three functions. You'll have to break
> them down further in your code ... functions calling functions calling
> functions.
> [...] Of course design is important, but I do that with pen and ink
> drawings.
> [...] I like the look of it, but I feel that if the programs were
> regular C and perl sources, I'd be able to tell at a glance what
> they do.
> [...] How is the chunking provided by third-party software like noweb
> superior to the chunking provided by the languages everyone knows?
>
> ... Or is literate programming just a clever way to get programmers to
> do what they could have been doing all along by offering them
> something "new".
I think that you may a) be better at reading code directly and keeping
it all in your head, and b) be better disciplined in terms of sitting
down and designing code before you sit down at the keyboard. I'm not
very good at the former, and though I try to do the latter I often
find myself in the position of "oh ... I forgot about that case,
didn't I. damn." Keep in mind that nothing about litprog denies you
the ability to code normal functions, or take advantage of whatever
macro facility already exists in the native language.
I find it much easier to split the code into very distinct regions for
*some* tasks. I am not limited by function names that can't use spaces
and that have conventions on how they should look. If I tried to use a
function, the call would look something like

    if ( update_db(org_id, pgm_id, &org_name, &pgm_name) == -1 )
    {
        // stuff to figure out exactly why it failed
    }
I could have a reasonable idea of what it does, and I could shift down
in the source code to read the header comment of the function to see
that it only updates if the values are different. But it is so much
clearer (to me) with a full explanation reading:

    <<update db record if [[org_name]] or [[pgm_name]] has changed>>

If I tried to name the function with anywhere near the same amount of
annotation I'd be shot by my co-workers with full justification. =)
As I said above, I'm not at all against using functions where
appropriate. In the case of my first two litprog attempts, I didn't
see the need for function calls in any of the code. I'm not repeating
them anywhere, so why inflict the overhead of a function call? That
very first example I wrote, the one in Java, I couldn't use a function
for, but I still had to do the same thing over and over again. So I
preferred to use a code chunk.
I think litprog gives you the flexibility to program using a chunk, or
a function, or the native language's macro facility, or whatever. It
expands your ability to write the code in the way that makes the most
sense to you.
Jim
[snip]
>Why don't you all use inline functions to do chunking? I ordered the
>CWEB book because I think Knuth is so fly, but I think CWEB is not
>helpful. I already do my own chunking--functions should easily fit on
>my emacs screen and in a reader's head.
Chunks need not be complete functions -- they are, well, chunks.
Sure you can try to break your code into functions that fit on one
screen (25 lines over my VT100 connexion) but it makes for a lot of
overhead unless the compiler is really good. So, you'll have different
performance on different platforms. With a litprog tool you'll end up
compiling the same source code and chances are that even a compiler that
is less sophisticated generates good code.
Also, function calls may -- and very often do -- kill maximum
optimization in tight loops. Imagine a compiler encountering something
like:
if <some test>
do <blah>
else
do <blub>
in a tight loop, or better yet rewritten as
if <some test>
do <blah-loop>
else
do <blub-loop>
may have problems with advanced instruction scheduling etc. if <blah>
and <blub> are function calls. At the same time, the stuff in <> may
be too complicated to be easily put in a macro as in C (and Fortran
has no macros, anyway).
>Any respectable programming language supports chunking already, and
>any respectable compiler supports inlining of trivial functions.
Ah. In any respectable ideal world, yes. What planet are you on :-)
>Other programmers already know the programming language, so why
Frankly, I have seen people work as programmers who knew amazingly
little about the language and the tools they invoked to earn their
living. The IT industry is moving too fast to have only highly qualified
people working for it (see MicroSoft products). I just hope they'll
always have enough good ones for flight control systems and other really
essential stuff.
>introduce an unnecessary and foreign means of achieving the level of
>abstraction that you can get with the programming language itself?
Foreign, yes. Unnecessary? No.
To some extent, even if you put the documentation side aside (and that's
a _huge_ advantage to ignore), you can consider litprog tools an
advanced preprocessor that gets right what cpp doesn't. Speaking of C ...
most I/O stuff is, AFAIK, written as macros in lower level C.
Yet it's standardized. Why have it if every respectable programmer could
do it himself? :-)
>A new maintainer of your code will be able to read it easily if you
>use the programming language itself,
Check with altavista for the Obfuscated C contest. :-)
An alternative is to examine tons of legacy Fortran code in use all over
the place in engineering, weather prediction, physics, which is, if at
all, documented with blank lines most of the time. Sure it's something
intelligible if you speak the language, but a clearly documented (set
of) file(s), properly type set and cross-indexed by means of a litprog
tool will make it much easier to work through and maintain programmes of
several hundred thousand lines of code.
>but if you use an unpopular system, they have to find it and learn it
>first in order to maintain your code.
(S)he will have to find and learn it, true. The problem you raise is a
real one, though: passing on my fweb files to people who still consider
WordPad a programmer's editor and ftp their codes to the UNIX box after
editing, to compile them there w/o bothering to even turn on
optimization -- that is often a stop gap.
If I consider the endless hours I have wasted looking at code inherited
from other sources trying to find out what the heck was going on, I
think that learning a web tool is nothing compared to working on a
legacy program w/o adequate docs.
Much of this kind of code is not written with portability in mind, you
find stuff like
x=x+1d-16
y=1d0/x
and have to figure out that that was done to avoid testing for x.eq.0
(or better, |x| within a certain \epsilon larger than zero) which was
slower on that hardware than just adding some small number and going on.
This one is obvious still, but many other things are not.
>Pretty printing coupled with fancy comments is really great, but I
Pretty or not, the real features -are- chunking and indexing and
documentation while coding, in the code.
>think that taking the chunking out of the programming language is a
>mistake.
But chunking at that level is implemented in very few languages. I
assume you haven't yet written a project using litprog tools. If I am
right -- why don't you try some litprog tool, and be it the strikingly
small yet efficient nuweb?
Cheers, Stefan
--
=========================================================================
Stefan A. Deutscher | (+33-(0)1) voice fax
Laboratoire des Collisions Atomiques et | LCAM : 6915-7699 6915-7671
Mol\'{e}culaires (LCAM), B\^{a}timent 351 | home : 5624-0992 call first
Universit\'{e} de Paris-Sud | email: s...@utk.edu
91405 Orsay Cedex, France (Europe) | (forwarded to France)
=========================================================================
Do you know what they call a quarter-pounder with cheese in Paris?
> On Fri, 12 Nov 1999 02:51:27 GMT, Ed L. Cashin <eca...@coe.uga.edu> wrote:
...
> >Any respectable programming language supports chunking already, and
> >any respectable compiler supports inlining of trivial functions.
>
> Ah. In any respectable ideal world, yes. What planet are you on :-)
The planet of snobby arrogant programmers who make a scene in
newsgroups without fully knowing the subject about which the scene is
being made.
...
> If I consider the endless hours I have wasted looking at codes inherited
> from other source trying to find out what the heck was going on I think
> that learning a web tool is nothing compared to working to a legacy
> program w/o adequate docs.
Yep. I am starting to think, "Hey, maybe it is not so bad if the
chunking is not all done at the level of the programming language. At
least it's understandable either way."
> Much of this kind of code is not written with portability in mind, you
> find stuff like
>
> x=x+1d-16
> y=1d0/x
I don't envy you.
> and have to figure out that that was done to avoid testing for x.eq.0
> (or better, |x| within a certain \epsilon larger than zero) which was
> slower on that hardware than just adding some small number and going on.
> This one is obvious still, but many other things are not.
>
> >Pretty printing coupled with fancy comments is really great, but I
>
> Pretty or not, the real features -are- chunking and indexing and
> documentation while coding, in the code.
>
> >think that taking the chunking out of the programming language is a
> >mistake.
>
> But chunking at that level is implemented in very few languages. I
> assume you haven't yet written a project using litprog tools. If I am
> right -- why don't you try some litprog tool, and be it the strikingly
> small yet efficient nuweb?
That is a very good suggestion, but I think it will probably be CWEB,
since:
1) I always learn a lot by doing it Knuth's way
2) I like the non-monospaced font printable version that
you get from CWEB.
I honestly don't understand why you find the syntax of CWEB difficult. It
consists of a rather limited list of control codes, even very limited if one
only considers those used in ordinary situations (say: `@*', `@ ', `@c', `@<',
`@d', `@(', `@@'), all of them consist of `@' followed by one character, most
of them do not take any arguments, and for those that do, the single
argument is terminated by `@>'. Well yes, one must also remember to
write `=' after `@<...@>' in case of a module definition (as opposed to
reference). To me this
seems like a rather minimalistic syntax. (I admit it makes the sources look a
bit uglier, but that's a different point.) And the noweb syntax also gets a
bit hairier if you want to do things like indexing variable uses (but it too
is quite properly considered minimalistic). In any case I find the CWEB syntax
much easier to remember and use than any of the 20+ different syntaxes used
for regular expressions (and their close relatives) in different tools (not to
mention the difficulties caused by having to guide those expressions through
the shell's or emacs' or whatnot's lexical munchers). Why, you uttered the
following in a followup to your own post:
@field = ($_ =~ /^(.+?[^\\]),(.+?[^\\]),(.+?[^\\]),(.+)$/);
And you find CWEB difficult?
> I don't know if I'm all alone here, but I also rather dislike the pretty
> printing -- I don't enjoy having to reread the pseudo-code operator
> symbols until my feeble mind groks what a sequence of them is really
> doing.
If you mean you don't like the way CWEB uses special symbols to render
expressions like `p&&q' `p||q', `a!=b', `a==b' etc. in a manner that is closer
to the mathematical tradition than is possible in C (due to ASCII
limitations), then I can sympathise with you (I assume that unlike Knuth, you
are not a mathematician); note however that by a few simple TeX macro
definitions you can make these operators come out any way you like, including
a representation identical to the source. (I don't think that CWEB documents
this possibility very well, but CWEBx does so very explicitly.) Personally I
can report though that the fact that the (usually erroneous) statement `if
(x=0) ...' gets prettyprinted something like `if (x<-0) ...' has often helped
me correct such errors before I even compile for the first time (I usually get
my prettyprinted code looking properly first before compiling; it greatly
helps me understand what I am doing).
> I think many people enjoy pretty printing because they go further
> than me in the litprog concept. They almost make it pseudo-code because
> they really ARE writing a manuscript for publication instead of just code.
Being a supporter of free software, to me the code _is_ a manuscript for
publication, whatever its form; using litprog the manuscript will be more
readable.
In a followup Ed Cashin wrote:
> I should probably rephrase my question:
> How is the chunking provided by third-party software like noweb
> superior to the chunking provided by the languages everyone knows?
The main reason is: it allows you to identify the chunk by exactly the right
amount of information needed to understand the code from which the chunk was
abstracted away. Even if function names of arbitrary length can be used, I
think this is not possible by function abstraction.
A secondary reason is that this chunking is a light-weight operation: if some
part of code is getting too bulky for comfort, no other administration or
penalty is involved in isolating it as a chunk than devising a proper name for
it. This is why chunking must have a macro-like semantics, without any change
to the scoping context. (I don't particularly like macro semantics, but here
it is essential; I might add that existing macro systems may serve as a
warning of the slippery slope we would get onto by allowing for instance
parametrised chunks.) To preserve readability, this does require any variables
used exclusively in the chunk to be locally declared (so the reader does not
need to hunt for the declaration); as a rule of thumb, any variable that
communicates information between the chunk and its context should be mentioned
in its name.
If separation from the context is desirable (one wishes to free one's mind from
the goings-on outside) then one can always replace the chunk by a function;
let's hope that in these cases the purpose of the chunk is sufficiently
self-contained that it can be described by a properly chosen function name.
Marc van Leeuwen
Universite de Poitiers
http://wwwmathlabo.univ-poitiers.fr/~maavl/
>> Ah. In any respectable ideal world, yes. What planet are you on :-)
>The planet of snobby arrogant programmers who make a scene in
>newsgroups without fully knowing the subject about which the scene is
>being made.
Uhhhm --- you weren't striking in my direction overlooking the smiley,
were you? In any case -- a programmer I am not. I just use programming
as a tool. And I try to make that as bearable, as maintainable, and as
efficient as possible. Hence, litprog.
>...
>> If I consider the endless hours I have wasted looking at codes
>> inherited from other source trying to find out what the heck was
>> going on I think that learning a web tool is nothing compared to
>> working to a legacy program w/o adequate docs.
>
>Yep. I am starting to think, "Hey, maybe it is not so bad if the
>chunking is not all done at the level of the programming language. At
>least it's understandable either way."
Right on!
>> >Pretty printing coupled with fancy comments is really great, but I
>> Pretty or not, the real features -are- chunking and indexing and
>> documentation while coding, in the code.
>> >think that taking the chunking out of the programming language is a
>> >mistake.
>> But chunking at that level is implemented in very few languages. I
>> assume you haven't yet written a project using litprog tools. If I am
>> right -- why don't you try some litprog tool, and be it the
>> strikingly small yet efficient nuweb?
>
>That is a very good suggestion, but I think it will probably be CWEB,
>since: 1) I always learn a lot by doing it Knuth's way 2) I like the
>non-monospaced font printable version that you get from CWEB.
Great! Any of these -- cweb, nuweb, noweb, fweb -- are fine tools to get
the job done. I just suggested nuweb because it is _really_ small and
easy to build and the docs are something like three pages. I myself use
fweb most of the time, given that I use fortran most of the time.
Anyway, nuweb also produces the docs to be typeset with LaTeX, so that
part will be pretty printed, but the source part, AFAIR, will be spaced
and indented as in the source file, and type set using a typewriter
font.
Have fun & good luck!
>> > Why don't you all use inline functions to do chunking? [...]
>>
>> Do you mean "you all" as in those of us using literate programming
>> tools, or "you all" as in "hey y'all" just me? If you mean the
>> former, I can't speak for anyone else. If you mean me, their are a
>> couple of reasons.
>
>Hmm. You know that "y'all" is always and only plural, right?
Well, not in TN. I've been "y'all" more often than y'all would imagine,
just like I was "yousguys" or "youns" up North.
Y'all come back now!
Cheers, Stefan
--
========================================================================
Stefan A. Deutscher | (+1-423-) voice fax
The University of Tennessee, Knoxville | UTK : 974-7838 974-7843
Department of Physics and Astronomy | ORNL : 574-5897 574-1118
401, A. H. Nielsen Building |
Knoxville, T.N. 37996-1200, USA | email: s...@utk.edu
========================================================================
> On 12 Nov 1999 22:56:25 GMT, Ed L. Cashin <eca...@coe.uga.edu> wrote:
> >"James A. Robinson" <jimro...@my-deja.com> writes:
>
> >> > Why don't you all use inline functions to do chunking? [...]
> >>
> >> Do you mean "you all" as in those of us using literate programming
> >> tools, or "you all" as in "hey y'all" just me? If you mean the
> >> former, I can't speak for anyone else. If you mean me, their are a
> >> couple of reasons.
> >
> >Hmm. You know that "y'all" is always and only plural, right?
>
> Well, not in TN. I've been "y'all" more often than y'all would imagine,
> just like I was "yousguys" or "youns" up North.
A HA!! Now I know who is corrupting the "y'all" from within the South
itself. It is the Tennesseans! Maybe they are into some kind of
religion where each person is everything, so that one could be all.
If that is the case I rescind my unkind remarks.
<aside>Although I have derived much fun from being silly here, I feel
for those who are not as silly, and so I am going to finish my
participation in the "y'all" thread now. Thanks to everyone for their
patience.</aside>
> "James A. Robinson" wrote:
...
> > easy to learn as TeX. =) However, if you look at Norman Ramsey's
> > noweb, it's simplicity itself:
> >
> > @ begin a document chunk
> > <<id>> reference a code chunk
> > <<id>>= begin code chunk "id"
Please note that James A. Robinson is the person you are quoting here.
> I honestly don't understand why you find the syntax of CWEB
> difficult.
...
> lexical munchers). Why, you uttered the following in a followup to
> your own post:
>
> @field = ($_ =~ /^(.+?[^\\]),(.+?[^\\]),(.+?[^\\]),(.+)$/);
>
> And you find CWEB difficult?
And note again that Ed Cashin is the person who wrote that regular
expression. James was responding to my point that a programmer would
have to learn something extra, something beyond the difficulties of
the programming language itself, in order to understand sources that
rely on third-party litprog software. His point was that some of the
litprog packages are easy to learn.
I *like* learning new things, and I like the CWEB syntax. It's fine.
I also like regular expressions. But while I do not have a problem
using regular expressions in my perl code or expecting future
maintainers of my perl code to understand perl, I don't expect them to
know things over and above perl, like noweb and CWEB.
...
> In a followup Ed Cashin wrote:
> > I should probably rephrase my question:
> > How is the chunking provided by third-party software like noweb
> > superior to the chunking provided by the languages everyone knows?
>
> The main reason is: it allows you to identify the chunk by exactly
> the right amount of information needed to understand the code from
> which the chunk was abstracted away. Even if function names of
> arbitrary length can be used, I think this is not possible by
> function abstraction.
Hmm. You certainly have a point, but I am uncomfortable with any
technique that breaks up the relationship between the braces I see in
the code and how that relates to the scopes. I'll give nuweb and
noweb and CWEB a try, though, and see what it's like.
> On 14 Nov 1999 08:20:04 GMT, Ed L. Cashin <eca...@coe.uga.edu> wrote:
> >The planet of snobby arrogant programmers who make a scene in
> >newsgroups without fully knowing the subject about which the scene is
> >being made.
>
> Uhhhm --- you weren't striking in my direction overlooking the smiley,
> were you? In any case -- a programmer I am not. I just use programming
> as a tool. And I try to make that as bearable, as maintainable, and as
> efficient as possible. Hence, litprog.
No, no--it certainly needed a smiley! First, I was full of hubris as
I pronounced pedantically, "any respectable ... functions," and then
when I was caught, I indulged in silly self-deprecation. The
silliness warranted a smiley, and I apologize for the omission.
...
> Great! Any of these -- cweb, nuweb, noweb, fweb -- are fine tools to get
> the job done. I just suggested nuweb because it is _really_ small and
> easy to build and the docs are something like three pages. I myself use
> fweb most of the time, given that I use fortran most of the time.
> Anyway, nuweb also produces the docs to be typeset with LaTeX, so that
> part will be pretty printed, but the source part, AFAIR, will be spaced
> and indented as in the source file, and type set using a typewriter
> font.
I'll look at nuweb too, then. Cheers!
: Also, you seem to be talking
: about the actual .nw file, and not the weaved paper. The point of the
: weaved product is that one should first read it and gain the
: understanding
: of the program and it's structure before diving into the code.
And you definitely cannot take the usual monitor to your
favourite chair and read in silence. :-)
The paper is like an "in-brain code browser" for me
that, by personal empirical evidence, shows me much more clearly
if I have written messy nonsense ;-)
With Index and TOC
# Georg