One of the reasons why I program less now is that the abilities of
programming languages are so far behind my expectations. What I mostly
miss are good test suite systems, and, most of all, excellent
ingrained documentation.
Most experienced programmers would say that in fact there are many
tools and libraries for writing good test suites and ingrained
documentation. You can even see a good example of this in my field
tree implementation, using the doctest module in python:
http://paste.lisp.org/display/81455
http://gist.github.com/386234
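For anyone who has not met doctest, here is a minimal sketch of the idea. The function mean() is a made-up illustration and is not taken from the field tree code linked above; the point is only that the examples in the docstring are documentation and tests at the same time.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    The examples below are both documentation and tests.

    >>> mean([1, 2, 3])
    2.0
    >>> mean([])
    Traceback (most recent call last):
        ...
    ValueError: mean() needs at least one value
    """
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Runs every example found in the docstrings of this module.
    doctest.testmod()
```

Running the file as a script stays silent when every example passes.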
Others will add that if you want to be more advanced about this, you
can use approaches such as literate programming. The idea of literate
programming was developed by Donald Knuth as a way to structure
information with instructions for separate human and computer oriented
representations, to then either weave documentation out of it, or
tangle code.
And herein lies the crux. Literate programming does represent the
state of the art for ingrained documentation, and this is the problem,
because literate programming is rubbish. Take the following snippet
from the example in Wikipedia's article about literate programming:
We must include the standard I/O definitions, since we want to send
formatted output to stdout and stderr.
<<Header files to include>>=
#include <stdio.h>
@
This exposes the biggest problem with literate programming, which is
that people tend to use it as a mere excuse to embed a banal
programming language tutorial inline and macroise the complicated
parts out. This is crazy. Whenever I explore literate programming, I
find this problem. Literate programming is only used for creating
abysmal documentation.
Perl is well known as a language which tends to get very messy. There
is a mantra that this isn't due to the language at all but rather due
to the programmers and their style. This is true. You can see
excellent proof of that here, in these scripts by Morbus Iff, or in
pretty much anything that he writes:
http://www.disobey.com/detergent/code/itunes2html.txt
http://www.disobey.com/detergent/code/wowquests.pl
http://www.disobey.com/detergent/code/leecharoo/leechgrounds.pl
So you could make the same case for literate programming. Would it be
possible to create beautiful literate programs if you just tried hard
enough?
I don't believe that one can create beautiful literate programs for
two reasons. The first is that in literate programming, a
human-computer hybrid is the primary encoding of the program
information, whereas in reality the code always starts out in a woven,
human state and the programmer's job is to translate that to a
tangled, computer state. The second is that the woven state is so far
removed from the structure and syntax of an actual syntactic computer
program that I think it's almost impossible to meld the two in a
satisfying way, at least with the dull edge of existing literate
programming systems.
In other words, I think that programmers tend to like to solve things
on the syntactic level because that for them is the goal. That is what
they are always trying to achieve, to get from their ideas to an
actual syntactic program. But the starting points of this process, the
ideas, are something much more complex and human.
The way that I would like to represent a program is something more
like a hypertext story. Let's use an actual example here, say a
program for merging log files which have become out of sync. This is a
real problem that I have.
The first thing that I would like to do is to describe the problem
that I have thoroughly. Why do I want to act upon this? This would
involve talking about why I have log files, why the log files are
important, why they are out of sync, and why it's important to put
them back into sync.
Then I should like to examine some high level solutions for getting
the log files back into sync. Can I fix whatever is getting them out
of sync in the first place? If I could do that, then I could probably
just fix them by hand and not worry about it. The program is meant to
make a repeatable job easier.
This background would normally be omitted by a programmer from the
actual finished program. In other words, though the finished code is
affected by the initial requirements, these kinds of things are
usually not explained in any kind of detail in the code itself, and
are rarely explained properly in the documentation either.
But when you think about it, this is the first part of the lifecycle
of code, and the most important part too.
Then of course I would like to solve the problem itself, so I would
need to think about ways to process the logs. I would like to merge
the logs into one new log. I would also possibly like to change the
output format, for reasons separate to needing to merge the logs in
the first place. Does this mean that I should create a modular system
which has two main components?
Somebody who is coming to the program for the first time probably
doesn't want to know about this in as much detail. There ought
to be a summary version of this background so that people can check
whether the problem that they are facing is the same as the one that
the code solves. And if their problem is similar but not exactly the
same, then a good literate programming system, a hypercode system,
ought to make it very clear to them what they need to change.
Once the manner of the solution has been worked through, then comes
the part that makes a programmer a programmer: actually thinking about
the code. This is where the huge divergence between the human logic
and the code gets introduced or becomes apparent. What I would like to
do is to model how I actually think about a program, and then
gradually tease the code itself out of that. The code is almost
redundant, secondary to the original logic. This could be imagined as
a kind of tree structure where the logic branches off into specifics
which get blown about by the gusty winds of the syntax of a specific
language.
There should be a kind of rule of thumb that language specific
constructs should be most thoroughly avoided in hypercode. You are not
embedding a programming language tutorial, you are describing how your
problem can be solved. So for example, in terms of my log files, the
logs are broken down into sessions. You might have two log files, call
them (1) and (2), which are arranged as follows:
(1) A, B, D, E, F, I, J
(2) C, G, H, K, L
Each of the letters represents a session, and the alphabetical order
represents the chronological order of the sessions. Sessions do not
overlap chronologically. What the code needs to do is to parse the
logs into component sessions, order them, and then write them out in a
single stream.
This means that there must be two, always two and never more, input
streams which we read from. To keep memory consumption low, what we
should do is to buffer only as much of each file in memory as we must.
So for example, we would read A from (1) and C from (2) and realise
that C is later than A. We would then check (1) to see if there are
any more sessions between A and C that we need to add. So we would
read B from (1), and again D from (1). Since we realise that D is
later than C, we can write out A, B, and C to output. Then we must
check what comes after C to make sure that there is nothing after C
which is before D. And so we end up with a kind of dance performed by
two partners.
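The dance can be sketched in a few lines of python. This is a minimal sketch under some assumptions: the name merge_two is hypothetical, the letters stand in for parsed sessions, and sessions are assumed to compare with "<" by start time. It writes sessions out one at a time rather than in the batch "A, B, and C" described above, but it makes the same comparisons and never holds more than one pending session per input.

```python
def merge_two(first, second):
    """Yield items from two already-sorted iterables in sorted order.

    Only one pending item per input is held at any time.
    """
    missing = object()  # sentinel meaning "this input is exhausted"
    first, second = iter(first), iter(second)
    a, b = next(first, missing), next(second, missing)
    while a is not missing and b is not missing:
        if b < a:
            yield b
            b = next(second, missing)
        else:
            yield a
            a = next(first, missing)
    # One input has run dry; drain whatever is left of the other.
    if a is not missing:
        yield a
        yield from first
    if b is not missing:
        yield b
        yield from second


log_1 = ["A", "B", "D", "E", "F", "I", "J"]
log_2 = ["C", "G", "H", "K", "L"]
merged = list(merge_two(log_1, log_2))  # A through L in order
```

Swapping the two arguments gives the same result, so nothing here depends on which file holds the earliest session.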
One of the most important things in writing good code is to make sure
that there are no errors, which involves taking care of all edge cases
and making sure that all cases work equally well. To do this, one
needs to understand the code as fully as possible and make it very
clear what is going on. What we want to do is the opposite of
obfuscation. We want to make a program so clear that you could publish
it in a tabloid newspaper.
One problem with the letter representation of this system is that to
us it is clear that not only is C later than A, but also that there is
a session missing between A and C. The program doesn't know this. It
has to inspect B before it can know that B comes between A and C. So
in a way the labelling is cheating. Perhaps that could be solved by
using arrow annotations to show which sessions are adjacent to which
other sessions. This is a problem we need to think about in the
exposition of the program.
As an example of a problem in the program design itself, we could
consider for example what would happen if the inputs were reversed.
The code shouldn't rely on input source (1) having the earliest
session in it. And what about strange situations? Would we want to
handle the case where, for example, a file starts "B, A" with the
sessions in the wrong order? Presumably we would like to fail with a
very clear error message, but perhaps we would also like to simply
assume that the dates are correct and to output the sessions as "A,
B". This cannot actually happen in the system that I have unless there
is some very strange corruption, so I think the best answer would be
to fail with an error. The error message should explain not only what
the error is, but why this counts as an error at all. This is
important, but is usually missed out of error messages.
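A minimal sketch of such a check follows. The names SessionOrderError and in_order are hypothetical, and the wording of the message is only one guess at what "why this counts as an error" might look like. Sessions are again assumed to compare with "<" by start time.

```python
class SessionOrderError(ValueError):
    """Raised when a log file contains sessions out of chronological order."""


def in_order(sessions, source):
    """Yield sessions unchanged, failing loudly if one is earlier than the last."""
    previous = None
    for session in sessions:
        if previous is not None and session < previous:
            raise SessionOrderError(
                f"{source}: session {session!r} appears after {previous!r} "
                "but is earlier than it. The merger holds only one session "
                "per file in memory and relies on each file already being in "
                "chronological order, so it cannot place this session "
                "correctly and stops rather than guessing."
            )
        previous = session
        yield session
```

Wrapping each input stream in in_order() before merging keeps the check out of the merging logic itself.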
Recently I was chatting to my good friend Tav about our experimental
programming languages, his being called naaga and mine pluvo2, and we
were thinking about what the most natural expression is for variables.
I realised that we tend not to say that the act of interpersonal
tickling is where a subject, x, tickles an object, y, where the object
y will laugh. We tend instead to use concrete examples. We might say
that if John tickles Mary, then Mary will laugh. We then abstract
backwards from this to realise that John and Mary are in a sense
universally quantifiable. Not universally quantified, but
quantifiable. This is an important distinction.
To go back to my log file example, imagine an SVG (say raphaeljs)
representation of the dance that I described. What you could have is a
coloured circle for each session, arranged into two stacks
representing the two input files. Then you could have a machine which
represents the code, some kind of a Heath Robinson or Rube Goldberg
style contraption that grabs the circles from each of the stacks, and
throws them out in the right order. You should be able to see not only
the dance happening, but the reason for each stage of the dance.
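As a very rough starting point, the two stacks of circles can be produced as static SVG markup from python. This is only a sketch under assumptions: the function name stacks_svg, the colours, and the layout are all invented for illustration, the labels are assumed to be plain letters needing no escaping, and there is no animation or contraption here, just the starting picture.

```python
def stacks_svg(first, second, radius=12):
    """Return SVG markup drawing each session as a labelled circle in one of two stacks."""
    gap = radius * 3
    parts = []
    for column, (colour, sessions) in enumerate([("gold", first), ("skyblue", second)]):
        x = gap * (column + 1)
        for row, label in enumerate(sessions):
            y = gap * (row + 1)
            parts.append(f'<circle cx="{x}" cy="{y}" r="{radius}" fill="{colour}"/>')
            parts.append(f'<text x="{x}" y="{y + 4}" text-anchor="middle">{label}</text>')
    height = gap * (max(len(first), len(second)) + 1)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{gap * 3}" height="{height}">'
        + "".join(parts)
        + "</svg>"
    )


svg = stacks_svg("ABDEFIJ", "CGHKL")  # one circle per session, two columns
```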
That would be the program working on a specific set of inputs. Now
what you need to see is that it could work on any possible input. This
is the challenge of programming. There is a comparable situation with
my proof of the pythagorean theorem:
http://inamidst.com/stuff/notes/pythag
The theorem is only proved, on that page, for a specific instance of
right angled triangle. But it would be possible to prove it for all
right angled triangles by having an animation of the triangle changing
shape, with its most acute non-right-angled angle becoming the least
acute and vice versa. Similarly, we want to be able to show that our
Robinson-Goldberg contraption is going to be able to handle all
inputs, but I'm not sure that this would be as easy to represent as in
the proof of the pythagorean theorem.
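Short of a proof, one weaker thing that can be done is to throw many random inputs at a merger and check the output each time. The sketch below assumes a hypothetical helper called check_merger, and uses heapq.merge from the standard library purely as a stand-in for whatever merger is being examined. Passing shows only that no counterexample was found, which is not the same as showing that it works on all inputs.

```python
import heapq
import random


def check_merger(merge, trials=1000, seed=0):
    """Try a merger on many random splits of an ordered run of sessions.

    Passing is evidence, not a proof: it shows no counterexample was found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        sessions = list(range(rng.randint(0, 20)))  # 0..n-1 stand for session times
        first, second = [], []
        for session in sessions:
            # Deal each session to one of the two files at random.
            (first if rng.random() < 0.5 else second).append(session)
        result = list(merge(first, second))
        assert result == sessions, (first, second, result)
    return trials


checked = check_merger(heapq.merge)
```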
When we get to the details, the nature of the problems changes quite a
lot. For example, we might get as far down as the details of the
encoding of the input files. Are they utf-8, or should we detect the
encoding? Since they're a form of instant messaging logs, in fact each
line could be in an independent encoding. This can make things very
tricky. But in the general case it would be reasonable to assume
they're utf-8. How should this be represented? When we get down to
this stage of details, we're talking about code like the following, in
python:
with open(first, encoding='utf-8') as a:
    ...
This does not need explanation if you know python, and if you don't
know python and can't work out what this block opener means, then you
probably shouldn't be reading the code anyway. So, unlike in regular
literate programming, it probably doesn't need explaining at the
syntactic level. You would have to say why you are decoding the file
as utf-8, however, and it may also be useful to mention that reading
from the file after this point will not give a stream of bytes, but of
unicode characters.
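The difference between the two kinds of stream can be seen with a small throwaway file. This is a sketch only: the file contents are invented, and the temporary file exists just so that the example is self-contained.

```python
import os
import tempfile

# Write a small file containing a non-ASCII character, encoded as UTF-8.
with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as handle:
    handle.write("café\n".encode("utf-8"))
    path = handle.name

try:
    with open(path, "rb") as raw:
        as_bytes = raw.readline()  # bytes: the "é" takes two bytes
    with open(path, encoding="utf-8") as decoded:
        as_text = decoded.readline()  # str: the "é" is one character
finally:
    os.remove(path)
```

The same line is six bytes long when read raw and five characters long when decoded, which is exactly the kind of thing worth saying once in the documentation.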
This is about as far down the stack into the details as we get, but
there are of course middle levels too. Consider for example the
parsing of the file into sessions. What we would be doing for that is
to iterate over the input lines from the unicode character stream, and
inspect them to see when a particular token that we recognise as a
session starter will appear. In this case the session starter lines
will look like the following:
**** BEGIN LOGGING AT Sun Apr 25 13:34:29 2010
No other line will start with a "*" except those lines that start
sessions, so line.startswith("*"), in python, would be an adequate way
of splitting the sessions up.
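A minimal sketch of that splitting, with the timestamp parsed so that sessions can later be ordered. The names split_sessions and session_start are hypothetical, the chat lines and the second timestamp are invented, and the date parsing assumes English day and month names in the current locale.

```python
from datetime import datetime

BEGIN = "**** BEGIN LOGGING AT "


def split_sessions(lines):
    """Group lines into sessions, starting a new one at each line beginning with "*"."""
    session = []
    for line in lines:
        if line.startswith("*") and session:
            yield session
            session = []
        session.append(line)
    if session:
        yield session


def session_start(session):
    """Parse the timestamp from a session's first line.

    Assumes the first line really is a BEGIN LOGGING line.
    """
    stamp = session[0].strip()[len(BEGIN):]
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


lines = [
    "**** BEGIN LOGGING AT Sun Apr 25 13:34:29 2010\n",
    "an invented line of chat\n",
    "**** BEGIN LOGGING AT Sun Apr 25 18:02:10 2010\n",
    "another invented line\n",
]
sessions = list(split_sessions(lines))
```

Because split_sessions is a generator, it can be fed directly from an open file and only ever holds one session in memory.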
The representation of session parsing could be seen as somewhat
between the sessions dance and the encoding level. It would be
possible, for example, to represent this in an SVG diagram just like
the sessions dance. You would perhaps have a file looking more or less
like a normal file in a text editor, only arranged a bit more like a
table so that each line is put in a rectangle. Each of the lines is
vertically stacked. At the left hand side you could imagine a pointer
going down, and it inspects the first letter of each line. When it
finds a "*" there, what it does is to break the line apart from the
line above it.
You can imagine how this could integrate into the session dance. When
a session is broken apart, it could be compacted into a circle with a
code in it representing the date, or with annotational arrows to give
the order relative to other known sessions. These could then be thrown
into the area which represents the stack of sessions parsed from the
file. This is important since it would be worth showing that the files
don't actually come pre parsed. We won't be having all the sessions in
memory, so though the files might be structured like this on the disk:
(1) A, B, D, E, F, I, J
(2) C, G, H, K, L
We would actually find A emerging first, then C, then B, then D, then
G, and so on. These sessions are being pulled from the input streams.
So both parts of the code are working simultaneously, the parser
giving input to the session dance.
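The order in which sessions emerge can be observed by wrapping each input in a generator that records what has been read. In this sketch the name traced is hypothetical, and heapq.merge from the standard library is swapped in for the merger; the order in which it reads from its inputs is what the CPython implementation does in practice rather than something its documentation promises.

```python
import heapq


def traced(sessions, pulled):
    """Yield sessions one at a time, recording each in `pulled` as it is read."""
    for session in sessions:
        pulled.append(session)
        yield session


pulled = []
merged = list(
    heapq.merge(
        traced(["A", "B", "D", "E", "F", "I", "J"], pulled),
        traced(["C", "G", "H", "K", "L"], pulled),
    )
)
# `pulled` now holds the sessions in the order they were read from the inputs,
# while `merged` holds them in chronological order.
```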
On the other hand, you could also argue that this is such a simple
part of the code that it's not even really worth making a diagram of.
After all, it should work if you just read the whole file in and did
an input.split("\n*") on it. In this view, the representation would be
more like the encoding level, where you don't bother explaining, but
do explain the rationale behind the design. So for example it would be
important to note the following two points:
* Sessions only start with lines starting with "*"
* All lines starting with "*" start sessions
In other words if it starts with "*" then it starts a session, and all
sessions will be found this way. Neither of these points by itself is
sufficient to justify treating "*" as the beginning of a session,
so you would have to document both of them. Of course we don't have an
either-or proposition here, and we could have an animation of this but
would still need to explain why the inspector is looking for the "*"
and why it can know that it will get all sessions that way.
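For comparison, the whole-file approach might look like the sketch below. The name split_whole is hypothetical, and the approach gives up the low memory goal in exchange for brevity, since the entire log has to be read in first. It leans on both of the documented points: it would split in the wrong place if an ordinary line ever started with "*".

```python
def split_whole(text):
    """Split an entire log held in memory into sessions using str.split."""
    head, *rest = text.split("\n*")
    # split() swallows the "\n*" separator, so put both characters back:
    # the newline ends the previous session and the "*" starts the next.
    pieces = [head] + ["*" + piece for piece in rest]
    return [piece + "\n" for piece in pieces[:-1]] + pieces[-1:]
```

Joining the pieces back together gives the original text, so nothing is lost in the split.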
There will probably come an intuitive balance between these levels. In
programming practice, the details of the program are the most
important part. The individual lines of code and the calls which are
being made are the meat of the system. But on the documentation level,
the more specific we get, the more "solved" the overall problem has
become, so we can afford to pay less attention to it. So as we move
out to the leaves of the system, the amount of documentation that we
will give to them will become smaller and less rich.
At this point it may be a good idea to explain more about why I would
like to create a program in this way. Obviously the main feature of
programming in this way is that it would take a lot of effort. And in
fact it could become somewhat recursive. If you created an animation
of the sessions dance in raphaeljs, then that would itself be code, so
then you would have to document the code that you're using to document
the original code! And all this would bloat up the code to be a much
larger project than if you perhaps just merged the logs by hand when
they're out of sync.
There are two responses to this. The first is that to some extent what
we are missing are tools to make good documentation like this, though
also to perhaps a lesser extent we are also missing patterns,
practices, and a culture around it. If these things were supplied,
then of course the effort required to make excellent code like this
would be diminished. But the second response is that actually this is
in a sense separate from wanting to solve the original problem. This
is adding the extra dimension of wanting to make the solution to the
original problem itself somewhat of an art form. I don't consider
beauty in this program to be a side effect of wanting to solve the
original problem. I consider it something which is a first class
product in itself, something that is of itself enjoyable.
This is one of the primary things that makes programmers do stupid
things. They love programming so much that they start to invent
problems that don't even need solving, and go about solving them,
often fooling themselves meanwhile into thinking that the ostensible
thing that they're working on is the productive one. But the love of
programming is productive in a sense too, so there isn't really any
need to fool oneself in this way. In fact it ought to be liberating to
realise that you can just program for fun.
What I've done in this explanation is to be careful to pick an example
that is actually useful. The log merger is the only code at the moment
that I'm thinking about writing, and since I actually need it I'd
probably just bang out a short script first and then if I actually did
do this nice documentation it would be done as a second stage.
The problem with programming small scripts like this is that
programming in such a frustrating language makes it feel as though
you're losing a lot of the essential thinking behind the program,
which is much of the enjoyment of the whole thing. Though you do get a
usable product, which is the main aim, you probably wouldn't be
learning from it later on. When I look at code that I wrote five years
ago, I just wonder what's going on in there, and usually don't really
bother inspecting it too much. Very beautiful hypercode on the other
hand would perhaps inspire us to learn more about the exciting aspects
that make a program valuable.