Hypercode

Sean B. Palmer

May 1, 2010, 8:58:46 AM
to Gallimaufry of Whits
One of the reasons why I program less now is that the abilities of
programming languages are so far behind my expectations. What I mostly
miss are good test suite systems, and, most of all, excellent
ingrained documentation.

Most experienced programmers would say that in fact there are many
tools and libraries for writing good test suites and ingrained
documentation. You can even see a good example of this in my field
tree implementation, using the doctest module in python:

http://paste.lisp.org/display/81455
http://gist.github.com/386234

Others will add that if you want to be more advanced about this, you
can use approaches such as literate programming. The idea of literate
programming was developed by Donald Knuth as a way to structure
information with instructions for separate human and computer oriented
representations, to then either weave documentation out of it, or
tangle code.

And herein lies the crux. Literate programming does represent the
state of the art for ingrained documentation, and this is the problem,
because literate programming is rubbish. Take the following snippet
from the example in Wikipedia's article about literate programming:

We must include the standard I/O definitions, since we want to send
formatted output to stdout and stderr.
<<Header files to include>>=
#include <stdio.h>
@

This exposes the biggest problem with literate programming, which is
that people tend to use it as a mere excuse to embed a banal
programming language tutorial inline and macroise the complicated
parts out. This is crazy. Whenever I explore literate programming, I
find this problem. Literate programming is only used for creating
abysmal documentation.

Perl is well known as a language which tends to get very messy, but
there is a mantra that this isn't due to the language at all but rather
due to the programmers and their style. This is true. You can see
excellent proof of that here, in these scripts by Morbus Iff, or in
pretty much anything that he writes:

http://www.disobey.com/detergent/code/itunes2html.txt
http://www.disobey.com/detergent/code/wowquests.pl
http://www.disobey.com/detergent/code/leecharoo/leechgrounds.pl

So you could make the same case for literate programming. Would it be
possible to create beautiful literate programs if you just tried hard
enough?

I don't believe that one can create beautiful literate programs for
two reasons. The first is that in literate programming, a
human-computer hybrid is the primary encoding of the program
information, whereas in reality the code always starts out in a woven,
human state and the programmer's job is to translate that to a
tangled, computer state. The second is that the woven state is so far
removed from the structure and syntax of an actual syntactic computer
program that I think it's almost impossible to meld the two in a
satisfying way, at least with the dull edge of existing literate
programming systems.

In other words, I think that programmers tend to like to solve things
on the syntactic level because that for them is the goal. That is what
they are always trying to achieve, to get from their ideas to an
actual syntactic program. But the starting points of this process, the
ideas, are something much more complex and human.

The way that I would like to represent a program is something more
like a hypertext story. Let's use an actual example here, say a
program for merging log files which have become out of sync. This is a
real problem that I have.

The first thing that I would like to do is to describe the problem
that I have thoroughly. Why do I want to act upon this? This would
involve talking about why I have log files, why the log files are
important, why they are out of sync, and why it's important to put
them back into sync.

Then I should like to examine some high level solutions for getting
the log files back into sync. Can I fix whatever is getting them out
of sync in the first place? Then if I could do that I could probably
just fix them by hand and not worry about it. The program is meant to
make a repeatable job more easy.

This background would normally be omitted by a programmer from the
actual finished program. In other words, though the finished code is
affected by the initial requirements, these kinds of things are
usually not explained in any kind of detail in the code itself, and
are rarely explained properly in the documentation either.
But when you think about it, this is the first part of the lifecycle
of code, and the most important part too.

Then of course I would like to solve the problem itself, so I would
need to think about ways to process the logs. I would like to merge
the logs into one new log. I would also possibly like to change the
output format, for reasons separate to needing to merge the logs in
the first place. Does this mean that I should create a modular system
which has two main components?

Somebody who is coming to the program for the first time probably
doesn't want to know about this in as much detail. There ought
to be a summary version of this background so that people can check
whether the problem that they are facing is the same as the one that
the code solves. And if their problem is similar but not exactly the
same, then a good literate programming system, a hypercode system,
ought to make it very clear to them what they need to change.

Once the manner of the solution has been worked through, then comes
the part that makes a programmer a programmer: actually thinking about
the code. This is where the huge divergence between the human logic
and the code gets introduced or becomes apparent. What I would like to
do is to model how I actually think about a program, and then
gradually tease the code itself out of that. The code is almost
redundant, secondary to the original logic. This could be imagined as
a kind of tree structure where the logic branches off into specifics
which get blown about by the gusty winds of the syntax of a specific
language.

There should be a kind of rule of thumb that language specific
constructs should be most thoroughly avoided in hypercode. You are not
embedding a programming language tutorial, you are describing how your
problem can be solved. So for example, in terms of my log files, the
logs are broken down into sessions. You might have two log files, call
them (1) and (2), which are arranged as follows:

(1) A, B, D, E, F, I, J
(2) C, G, H, K, L

Each of the letters represents a session, and the alphabetical order
represents the chronological order of the sessions. Sessions do not
overlap chronologically. What the code needs to do is to parse the
logs into component sessions, order them, and then write them out in a
single stream.

This means that there must be two, always two and never more, input
streams which we read from. To keep memory consumption low, what we
should do is to buffer only as much of each file in memory as we must.
So for example, we would read A from (1) and C from (2) and realise
that C is later than A. We would then check (1) to see if there are
any more sessions between A and C that we need to add. So we would
read B from (1), and again D from (1). Since we realise that D is
later than C, we can write out A, B, and C to output. Then we must
check what comes after C to make sure that there is nothing after C
which is before D. And so we end up with a kind of dance performed by
two partners.
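
A minimal sketch of this dance in python, assuming each input is
already an iterator of sessions in chronological order and that
sessions can be compared with <=. The name merge_sessions is
hypothetical. This is the standard two-way merge, which writes a
session out as soon as it is known to be the earliest one buffered, a
slight simplification of the read-ahead steps described above.

```python
def merge_sessions(first, second):
    """Merge two chronologically ordered inputs into one ordered stream.

    Only one session from each input is held in memory at any time.
    """
    first, second = iter(first), iter(second)
    done = object()  # marks an exhausted input
    a = next(first, done)
    b = next(second, done)
    while a is not done and b is not done:
        if a <= b:
            yield a
            a = next(first, done)
        else:
            yield b
            b = next(second, done)
    # One input has run out, so drain whatever remains of the other.
    while a is not done:
        yield a
        a = next(first, done)
    while b is not done:
        yield b
        b = next(second, done)


print("".join(merge_sessions("ABDEFIJ", "CGHKL")))
```

With the letters standing in for sessions this prints ABCDEFGHIJKL.
The standard library's heapq.merge does the same job for any number of
sorted inputs.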

One of the most important things in writing good code is to make sure
that there are no errors, which involves taking care of all edge cases
and making sure that all cases work equally well. To do this, one
needs to understand the code as fully as possible and make it very
clear what is going on. What we want to do is the opposite of
obfuscation. We want to make a program so clear that you could publish
it in a tabloid newspaper.

One problem with the letter representation of this system is that to
us it is clear that not only is C later than A, but also that there is
a session missing between A and C. The program doesn't know this. It
has to inspect B before it can know that B comes between A and C. So
in a way the labelling is cheating. Perhaps that could be solved by
using arrow annotations to show which sessions are adjacent to which
other sessions. This is a problem we need to think about in the
exposition of the program.

As an example of a problem in the program design itself, we could
consider for example what would happen if the inputs were reversed.
The code shouldn't rely on input source (1) having the most early
session in it. And what about strange situations? Would we want to
handle the case where, for example, a file starts "B, A" with the
sessions in the wrong order? Presumably we would like to fail with a
very clear error message, but perhaps we would also like to simply
assume that the dates are correct and to output the sessions as "A,
B". This cannot actually happen in the system that I have unless there
is some very strange corruption, so I think the best answer would be
to fail with an error. The error message should explain not only what
the error is, but why this counts as an error at all. This is
important, but is usually missed out of error messages.
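
One way the fail-with-an-error choice could look, as a sketch only. The
names SessionOrderError and check_order are hypothetical, and the
wording of the message is just an illustration of saying why the
condition counts as an error, not only what it is.

```python
class SessionOrderError(ValueError):
    """Raised when a single log file is not in chronological order."""


def check_order(sessions, source):
    """Pass sessions through unchanged, failing on the first one out of order."""
    previous = None
    for session in sessions:
        if previous is not None and session < previous:
            raise SessionOrderError(
                f"{source}: session {session!r} comes after {previous!r} "
                "in the file but is dated earlier. The merge keeps only one "
                "session per file in memory and relies on each file already "
                "being in chronological order, so carrying on would write "
                "the sessions out in the wrong order."
            )
        yield session
        previous = session
```

Wrapping each input in check_order before merging keeps the check
independent of which file happens to hold the earliest session.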

Recently I was chatting to my good friend Tav about our experimental
programming languages, his being called naaga and mine pluvo2, and we
were thinking about what the most natural expression is for variables.
I realised that we tend not to say that the act of interpersonal
tickling is where a subject, x, tickles an object, y, where the object
y will laugh. We tend instead to use concrete examples. We might say
that if John tickles Mary, then Mary will laugh. We then abstract
backwards from this to realise that John and Mary are in a sense
universally quantifiable. Not universally quantified, but
quantifiable. This is an important distinction.

To go back to my log file example, imagine an SVG (say raphaeljs)
representation of the dance that I described. What you could have is a
coloured circle for each session, arranged into two stacks
representing the two input files. Then you could have a machine which
represents the code, some kind of a Heath Robinson or Rube Goldberg
style contraption that grabs the circles from each of the stacks, and
throws them out in the right order. You should be able to see not only
the dance happening, but the reason for each stage of the dance.

That would be the program working on a specific set of inputs. Now
what you need to see is that it could work on any possible input. This
is the challenge of programming. There is a comparable situation with
my proof of the pythagorean theorem:

http://inamidst.com/stuff/notes/pythag

The theorem is only proved, on that page, for a specific instance of a
right angled triangle. But it would be possible to prove it for all
right angled triangles by having an animation of the triangle changing
shape, with its most acute angle growing until it becomes the less
acute of the two non-right angles, and vice versa. Similarly, we want
to be able to show that our
Robinson-Goldberg contraption is going to be able to handle all
inputs, but I'm not sure that this would be as easy to represent as in
the proof of the pythagorean theorem.

When we get to the details, the nature of the problems changes quite a
lot. For example, we might get as far down as the details of the
encoding of the input files. Are they utf-8, or should we detect the
encoding? Since they're a form of instant messaging logs, in fact each
line could be in an independent encoding. This can make things very
tricky. But in the general case it would be reasonable to assume
they're utf-8. How should this be represented? When we get down to
this stage of details, we're talking about code like the following, in
python:

with open(first, encoding='utf-8') as a:
    ...

This does not need explanation if you know python, and if you don't
know python and can't work out what this block opener means, then you
probably shouldn't be reading the code anyway, so unlike regular
literate programming it probably doesn't need explaining at the
syntactic level. You would have to say why you are decoding the file
as utf-8, however, and it may also be useful to mention that reading
from the file after this point will not give a stream of bytes, but of
unicode characters.
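
If the case where each line may be in its own encoding did have to be
handled, one possible sketch is to read bytes and decode line by line.
The helper name decode_line and the latin-1 fallback are assumptions
for illustration, not something the logs are known to need.

```python
def decode_line(raw, fallback="latin-1"):
    """Decode one raw log line, trying utf-8 before the fallback encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid utf-8 is in the fallback
        # encoding. latin-1 accepts every byte, so this cannot fail, but it
        # may guess wrongly.
        return raw.decode(fallback)


def read_lines(path):
    """Yield decoded lines from a log file opened in binary mode."""
    with open(path, "rb") as f:
        for raw in f:
            yield decode_line(raw)
```

The cost is that the file has to be opened in binary mode, so the
decision about encoding moves out of open() and into the code, where
its rationale would need documenting.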

This is about as far down the stack into the details as we get, but
there are of course middle levels too. Consider for example the
parsing of the file into sessions. What we would be doing for that is
to iterate over the input lines from the unicode character stream, and
inspect them to see when a particular token that we recognise as a
session starter will appear. In this case the session starter lines
will look like the following:

**** BEGIN LOGGING AT Sun Apr 25 13:34:29 2010

No other line will start with a "*" except those lines that start
sessions, so line.startswith("*"), in python, would be an adequate way
of splitting the sessions up.
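
A minimal sketch of that parsing step, assuming every line that starts
with "*" has exactly the header format shown above and that the day and
month names are in English. The names HEADER and parse_sessions are
hypothetical.

```python
from datetime import datetime

HEADER = "**** BEGIN LOGGING AT "


def parse_sessions(lines):
    """Yield (start_time, session_lines) for each session in a log."""
    start, body = None, []
    for line in lines:
        if line.startswith("*"):
            if start is not None:
                yield start, body
            stamp = line[len(HEADER):].strip()
            # Raises ValueError if the line is not a well formed header.
            start = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
            body = [line]
        elif start is not None:
            body.append(line)
        # Lines before the first session header are dropped.
    if start is not None:
        yield start, body
```

Because this is a generator, only the session currently being
collected is held in memory, and the datetime in each pair gives the
merge something to compare.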

The representation of session parsing could be seen as somewhat
between the sessions dance and the encoding level. It would be
possible, for example, to represent this in an SVG diagram just like
the sessions dance. You would perhaps have a file looking more or less
like a normal file in a text editor, only arranged a bit more like a
table so that each line is put in a rectangle. Each of the lines is
vertically stacked. At the left hand side you could imagine a pointer
going down, and it inspects the first letter of each line. When it
finds a "*" there, what it does is to break the line apart from the
line above it.

You can imagine how this could integrate into the session dance. When
a session is broken apart, it could be compacted into a circle with a
code in it representing the date, or with annotational arrows to give
the order relative to other known sessions. These could then be thrown
into the area which represents the stack of sessions parsed from the
file. This is important since it would be worth showing that the files
don't actually come pre parsed. We won't be having all the sessions in
memory, so though the files might be structured like this on the disk:

(1) A, B, D, E, F, I, J
(2) C, G, H, K, L

We would actually find A emerging first, then C, then B, then D, then
G, and so on. These sessions are being pulled from the input streams.
So both parts of the code are working simultaneously, the parser
giving input to the session dance.
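
The order in which sessions emerge can be watched with a small tracing
sketch. Here heapq.merge from the standard library stands in for the
sessions dance, the letters stand in for parsed sessions, and the
traced helper is a hypothetical name.

```python
import heapq


def traced(sessions, pulled):
    """Pass sessions through unchanged, recording each one as it is pulled."""
    for session in sessions:
        pulled.append(session)
        yield session


pulled = []
first = traced("ABDEFIJ", pulled)   # stands in for the parser on file (1)
second = traced("CGHKL", pulled)    # stands in for the parser on file (2)
merged = list(heapq.merge(first, second))

print("".join(merged))
print("".join(pulled[:5]))
```

The merged output is in alphabetical order, while the first five
sessions pulled from the inputs are A, C, B, D, G, as described.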

On the other hand, you could also argue that this is such a simple
part of the code that it's not even really worth making a diagram of.
After all, it should work if you just read the whole file in and did
an input.split("\n*") on it. In this view, the representation would be
more like the encoding level, where you don't bother explaining, but
do explain the rationale behind the design. So for example it would be
important to note the following two points:

* Every session starts with a line starting with "*"
* Every line starting with "*" starts a session

In other words if it starts with "*" then it starts a session, and all
sessions will be found this way. Neither of these points by itself is
sufficient to be able to treat "*" as the beginning of a session,
so you would have to document both of them. Of course we don't have an
either-or proposition here, and we could have an animation of this but
would still need to explain why the inspector is looking for the "*"
and why it can know that it will get all sessions that way.
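
For the read-it-all-in approach, one detail worth noting is that
str.split removes the separator, so a bare split on "\n*" takes the
leading "*" off every session but the first, and the final newline off
every session but the last. A sketch that puts them back, assuming the
whole log fits in memory and begins with a session header (the name
split_sessions is hypothetical):

```python
def split_sessions(text):
    """Split a whole log into sessions, keeping each session's text intact."""
    chunks = text.split("\n*")
    sessions = []
    for i, chunk in enumerate(chunks):
        if i > 0:
            chunk = "*" + chunk   # restore the "*" eaten by the separator
        if i < len(chunks) - 1:
            chunk = chunk + "\n"  # restore the newline eaten by the separator
        sessions.append(chunk)
    return sessions
```

Joining the result back together gives the original text, which makes
for an easy check that nothing was lost.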

There will probably come an intuitive balance between these levels. In
programming practice, the details of the program are the most
important part. The individual lines of code and the calls which are
being made are the meat of the system. But on the documentation level,
the more specific we get, the more "solved" the overall problem has
become, so we can afford to pay less attention to it. So as we move
out to the leaves of the system, the amount of documentation that we
will give to them will become smaller and less rich.

At this point it may be a good idea to explain more about why I would
like to create a program in this way. Obviously the main feature of
programming in this way is that it would take a lot of effort. And in
fact it could become somewhat recursive. If you created an animation
of the sessions dance in raphaeljs, then that would itself be code, so
then you would have to document the code that you're using to document
the original code! And all this would bloat up the code into a much
larger project than if you perhaps just merged the logs by hand when
they're out of sync.

There are two responses to this. The first is that to some extent what
we are missing are tools to make good documentation like this, though
also to perhaps a lesser extent we are also missing patterns,
practices, and a culture around it. If these things were supplied,
then of course the effort required to make excellent code like this
would be diminished. But the second response is that actually this is
in a sense separate from wanting to solve the original problem. This
is adding the extra dimension of wanting to make the solution to the
original problem itself somewhat of an art form. I don't consider
beauty in this program to be a side effect of wanting to solve the
original problem. I consider it something which is a first class
product in itself, something that is of itself enjoyable.

This is one of the primary things that makes programmers do stupid
things. They love programming so much that they start to invent
problems that don't even need solving, and go about solving them,
often fooling themselves meanwhile into thinking that the ostensible
thing that they're working on is the productive one. But the love of
programming is productive in a sense too, so there isn't really any
need to fool oneself in this way. In fact it ought to be liberating to
realise that you can just program for fun.

What I've done in this explanation is to be careful to pick an example
that is actually useful. The log merger is the only code at the moment
that I'm thinking about writing, and since I actually need it I'd
probably just bang out a short script first and then if I actually did
do this nice documentation it would be done as a second stage.

The problem with programming small scripts like this is that
programming in such a frustrating language makes it feel as though
you're losing a lot of the essential thinking behind the program,
which is much of the enjoyment of the whole thing. Though you do get a
usable product, which is the main aim, you probably wouldn't be
learning from it later on. When I look at code that I wrote five years
ago, I just wonder what's going on in there, and usually don't really
bother inspecting it too much. Very beautiful hypercode on the other
hand would perhaps inspire us to learn more about the exciting aspects
that make a program valuable.

--
Comment at http://groups.google.com/group/whits/topics
Subscribe to http://inamidst.com/whits/feed

Noah Slater

May 1, 2010, 10:47:32 AM
to wh...@googlegroups.com

On 1 May 2010, at 13:58, Sean B. Palmer wrote:

> Once the manner of the solution has been worked through, then comes
> the part that makes a programmer a programmer: actually thinking about
> the code. This is where the huge divergence between the human logic
> and the code gets introduced or becomes apparent. What I would like to
> do is to model how I actually think about a program, and then
> gradually tease the code itself out of that. The code is almost
> redundant, secondary to the original logic. This could be imagined as
> a kind of tree structure where the logic branches off into specifics
> which get blown about by the gusty winds of the syntax of a specific
> language.

I'm not so sure about this bit.

The syntax of a particular language may be entirely unimportant, or, it may be representative of some larger difference. Some languages, like Erlang or Scheme, approach problems in a completely different way than, say, Python or Perl. If we're thinking about instructions for a Rube Goldberg machine, or even just a human who's reading a paper — then the difference between Python and Erlang, say, is like the difference between the instructions being for one person, or a group of people who are concurrently working on the same problem.

Obviously, depending on the agents (human or otherwise) involved, the way you approach a problem is going to be radically different. I don't think we can toss this aside as a trivial syntax issue. There's a certain parallel with the Sapir-Whorf hypothesis for programming languages. Tim Bray's Wide Finder project has been quite illustrative for me in this regard.

Dave Pawson

May 1, 2010, 1:34:22 PM
to wh...@googlegroups.com
On 1 May 2010 13:58, Sean B. Palmer <s...@miscoranda.com> wrote:
> One of the reasons why I program less now is that the abilities of
> programming languages are so far behind my expectations. What I mostly
> miss are good test suite systems, and, most of all, excellent
> ingrained documentation.



Problems.
foreach 10 programmers only 1 is good
foreach 10 programmers only 1 can write good clear docs.

If the programmer is focussed on coding, documentation is generally
not considered.

What's 'ingrained documentation' Sean? I've seen dirt ingrained, but ....

Love litprog, just find it hard work. One approach I've used, and
documented, is the one that comes naturally with docbook, i.e.
embedding code within docbook. It won't help those who naturally write 'bad'
documentation. In fact I think that is orthogonal to having good,
current docs with code.

quote. I don't believe that one can create beautiful literate programs
for two reasons. The first is that in literate programming, a
human-computer hybrid is the primary encoding of the program
information, whereas in reality the code always starts out in a woven,
human state and the programmer's job is to translate that to a
tangled, computer state. endquote.

Taking your ideas from further on. Why not start writing your ideas in
X format, (I'll use xml from now on, you decide what it is to
be). Talk it out or write a requirement, TDD, whatever. That should
take you to the start coding point. Also provide a 'throwaway' topic,
to use DITA terminology, on the history and background.

quote.The second is that the woven state is so far
removed from the structure and syntax of an actual syntactic computer
program that I think it's almost impossible to meld the two in a
satisfying way, at least with the dull edge of existing literate
programming systems. endquote.

Not sure if I understand. I would like syntax directed editing to switch
from (name your language) to (name your markup), but so long as I can
get the two forms out, I'd favour supporting the markup form. Just a
personal view. I know I can pull code from docbook markup easily.

It does leave a ragged problem though. How to test code written inside
docbook markup? Big loop to go through for both syntax and semantic
errors. Only idea is to work in small enough chunks, loosely coupled,
such that each is testable with a tiny harness. Then continue in the
litprog mode. Larger chunk testing involves tangling then executing in
an appropriate environment. No big issue with docbook. Isolating the
code is no big deal. Isolation into separate units of code is
manageable with xslt.
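
As a rough illustration of the isolating step, here is a python sketch
rather than XSLT. It assumes DocBook without an XML namespace (as in
DocBook 4) and code held in programlisting elements, and the name
extract_listings is hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_listings(docbook_source):
    """Return the text of every programlisting element, in document order."""
    root = ET.fromstring(docbook_source)
    # itertext() also collects text nested inside inline markup.
    return ["".join(el.itertext()) for el in root.iter("programlisting")]


doc = """<article>
<para>Add two numbers.</para>
<programlisting>def add(a, b):
    return a + b
</programlisting>
</article>"""

chunks = extract_listings(doc)
```

Each chunk can then be handed to a tiny harness, or several chunks
joined to give the tangled program.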

quote. Then I should like to examine some high level solutions for
getting the log files back into sync. endquote.

In my experience the nicest way of doing this is in one of the Lisp
family of languages, possibly clojure today. Top down design is so
easy, with stubs representing what become huge chunks of code. I found
it to work well. I'm sure other languages can do the same.

The huge win for litprog is the requirements change. Go change the
markup. Make the code follow those ideas. Even bring in TDD
ideas. Keeping the documentation ahead of the code, then alongside the
code, is the discipline. Working this cycle is generally where the out
of date code or documentation comes from IMHO.

quote. When we get to the details, the nature of the problems changes
quite a lot. endquote.

Agreed. When detail reveals a necessary design change. I really don't
think anything other than discipline will overcome this. Keeping the
outline, detailed design and code in sync, within one litprog
structure.


quote . To somebody who is coming to the program for the first time,
they probably don't want to know about this in as much detail. There
ought to be a summary version of this background so that people can
check whether the problem that they are facing is the same as the one
that the code solves. endquote.

This is a hint to me to use something like a DITA structure where you
could isolate those 'early' history notes such that they are only
built with certain document stacks. Again selection is easy with XSLT.

quote. We want to make a program so clear that you could publish it in
a tabloid newspaper. endquote.

Litprog again has this covered. When/if you realise that piece X needs
heavy documentation to cover such and such a nuance it can be done
there and then, as part of the development.

The reference to a visualisation of the problem I guess is one person's
goal, another's blank stare? We aren't all visual.


quote. So as we move out to the leaves of the system, the amount of
documentation that we will give to them will become smaller and less
rich. endquote.

Interesting. Yet you state that programmers view the detail as the
important part? If you develop in smaller chunks of code, wrapping
them together into larger pieces is seldom a problem. Dissecting a
monolith is often a nightmare. Documentation in a good litprog
solution can be easily interspersed with the code, so put the weight
where you want it.

What's missing in this debate is the 'second time round' solution?
I've looked at the working solution and I hate X and Y and Z. I want
to tear it down and start again. Is this an opportunity for Sean's
'sessions dance' software? Quite possibly. How to change the dance
steps to do Y differently.



--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk

Sean B. Palmer

May 1, 2010, 6:58:58 PM
to wh...@googlegroups.com
On Sat, May 1, 2010 at 6:34 PM, Dave Pawson wrote:

> Problems.
>  foreach 10 programmers only 1 is good
>  foreach 10 programmers only 1 can write good clear docs.

Yeah, but I was just thinking about why I don't like programming.
Other people might not like this, they might be quite content with
programming in the regular way. Or they may not understand what I
mean. There are all kinds of reasons why people might not like to
program in the way I described.

Heck, even if this way of programming evolved into a common approach
with lots of tools and further ideas and development, I'm not sure I'd
do it all the time. I'd only do it if I wanted to program for the joy
of programming, or if it were clear that the benefits would be very
strong in whatever domain I was programming.

The only problem really is if I'm one of those 9 programmers who can't
write good code, or one of those 9 programmers who can't write good
documentation. But part of what this is all about is my own personal
development as a programmer, and me thinking about how I could produce
more top quality programs.

> If the programmer is focussed on coding, documentation is generally
> not considered.

Which is sometimes a great shame!

> What's 'ingrained documentation' Sean? I've seen dirt ingrained

Wiktionary says:

adj, ingrained (not comparable)
1. Being an element; present in the essence of a thing
2. Fixed, established

But for you I can use an analogy from the old WAI PF days! Remember
how Gregory always says that accessibility isn't bolted on after, it
should be baked in from the very start? Well that's kind of the same
thing. You know how in TDD you're trying to document the effects
beforehand, in essence. That's what I'd like to do in hypercode, to in
a way complete the program before even writing it.

As you go along writing it, you might change your mind and have
throwaways, but those would be throwaways of the hypercode, not of the
code in whatever programming language you have. In one way I think of
hypercode as a way of managing how to think about programming in
advance, how to design thoroughly.

I don't want to get too abstract about this though, because I've only
thought of what I'd do for this one specific log files example, and
even there I've left many things unresolved. For example, I don't know
how to prove that the sessions dance machine works for all input test
cases.

So there might be lots more philosophy that I could pack behind the
word "ingrained", and it might turn out to be all rubbish.

> It won't help those who naturally write 'bad' documentation.
> In fact I think that is orthogonal to having good, current docs
> with code.

Well, that's the perl argument. That's saying that just because people
make bad litprogs, that doesn't mean that litprog itself is broken.
But my contention is just the opposite, that litprog is in fact very
broken. When I say broken I don't mean that a good coder couldn't make
something good with it. Just that I believe we can go very far beyond
the current state of the art.

And I don't mean that we can exceed the capabilities of litprog only
by reaching for the stars and being really experimental. The really
strange thing about almost all the thoughts that I sketched in my
essay is that to me they seem very obvious and achievable. It would
only be a matter of effort, not so much inspiration, to get it to
work.

I'm surprised that I haven't seen anything exactly like this before,
though Joe Geldart has shown me many, many things that are in some way
related to the idea. I've thought of a few myself too, but there is
really nothing exactly like this that I can think of.

I guess this is because programmers are so focussed on syntax.

> Why not start writing your ideas in X format, (I'll use xml from
> now on, you decide what it is to be). Talk it out

Sure, there's no reason why any of this would be difficult to do. In
fact, for my X format I was thinking of just using good old fashioned
HTML! And I already mentioned that I would probably do the animations
in SVG using the raphaeljs library. There are probably some things
that I would need to innovate, but I don't think any of this is far
beyond the trivial level. It just takes effort to think about the
program really clearly, in such detail.

I suppose what I'm trying to do is to elide the intuition from
programming in a way. Usually if I'm going to write a thing like this
log script, I might take half an hour to do it. It might need a bit of
testing, and then it works. The whole thing might take an hour. I put
it in a script directory somewhere. I use it. People don't know about
it.

What I'm describing is the opposite in production terms. Think about
every damn aspect. Think through the greatest possible way to explain
to others how the code works. Come up with pretty diagrams and
embedded testable use cases in the documentation that you can run
live. Basically go utterly Web 2.0 on the problem.

The point is not so much to make the program any better, it would
probably already be okay as it is after an hour's programming. It
might break or have some serious bug, but then you just fix it.

But you wouldn't bother putting it on your website and saying, look
how beautiful this is. And I miss that excitement that I felt when I
first started programming and I was actually creating and exploring
interesting ideas, not merely gluing libraries together and fixing
other people's code (as some recent memorable quote went).

One of the most fun pieces of code that I ever wrote was an HTTP PUT
module in python. Why was it so interesting? That's one of the things
that could be answered if I used hypercode to explain how it works.
That code was extremely beautiful to me, but to anybody else it would
probably look just like any other bit of code that I produced. The
question of the beauty in the code is a more complex one than the end
result of how the code works. A Skoda and a Bugatti both get you
places, but I know which is the more beautiful.

> I would like syntax directed editing to switch from (name your
> language) to (name your markup), but so long as I can get the
> two forms out, I'd favour supporting the markup form.

You're still thinking entirely in the literate programming approach I
think. What I was describing wasn't really a literate programming
approach at all. You wouldn't call it literate programming, it's that
far removed.

In literate programming, you have the human/machine information in one
document, and then you pull out the human aspect and the machine
aspects separately. What I was saying, though I didn't say it clearly,
is that I would rather start from the human side and only document
things there, but then work slowly towards the code expression from
there. And that isn't just a change in syntax. That is a completely
different way of thinking.

So it's not so much the language that I would use for the development
which concerns me. As I said, I'd probably just use HTML and SVG. I
could use all sorts of things, some would be better than others, but
the choice of a format is pretty orthogonal to what I was describing.
It's not so much a technological question as a mental problem.

Programming for me goes on in the head. You know, sometimes I lie
awake at night imagining how a particular interface should work, what
arguments a particular constructor should use, how I can make some
class more lean and more easy to use and understand. Those sorts of
conversations with myself don't happen in code, really, they happen in
a kind of internal narrative that could be illustrated in all sorts of
ways.

I've already outlined one of them in my previous email, the idea of
using an SVG animated state diagram. Well there are all kinds of
things you can do like that. Some guy at MIT came up with a brilliant
tabular description of logic trees.

They don't have to be graphic either, there are all sorts of ways that
you can describe code to people. When I say that the queue sorting in
the log program is a "sessions dance", you can see how that metaphor
sticks in the mind. That was a particularly nice way of describing it.
Those are just simple things, there might be some really awesome
complex things one can do too.

> It does leave a ragged problem though. How to test code written inside
> docbook markup? Big loop to go through for both syntax and semantic
> errors. Only idea is to work in small enough chunks, loosely coupled,
> such that each is testable with a tiny harness.

Well, you'll notice that I only broached the topic of testing once in
my essay. That was when I said that we don't tend to model
abstractions but rather specific examples. Well a specific example is
a test case. When you show that that test case works, then you've
basically shown that a test case succeeds. What I said the problem was
is that you then need to abstract it over a range of cases.

And my analogy with the pythagorean proof was an interesting one
because you can see how in that case you can actually provide a
graphical representation which covers a whole *class* of test cases!
So there might be a way you could do that for the state machine for
the sessions dance that I described. Maybe there isn't, I don't know!
But that is one of the things that I'd try to do, and that would make
a test suite redundant in this case.

Basically it would be a kind of mathematical proof that the code
works. Well, when you have a proof you don't need test cases. You have
the proof. But you can't prove all code, only trivial code really. The
log file merger is a trivial bit of code.
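
To make the gap between one test case and a class of test cases concrete, here is a minimal Python sketch. It is not a proof, and it is not the sessions dance: it is an exhaustive check of a plain two-list merge over every small input, which sits somewhere between a single example and a proof. The names merge_sorted and check_all_small_inputs are made up for illustration.

```python
import heapq
from itertools import product


def merge_sorted(a, b):
    """Merge two already-sorted lists into one sorted list."""
    return list(heapq.merge(a, b))


def check_all_small_inputs(max_len=3, values=(0, 1, 2)):
    """Compare merge_sorted with sorted(a + b) for every small input pair.

    Returns the number of input pairs that were checked.
    """
    checked = 0
    for len_a in range(max_len + 1):
        for len_b in range(max_len + 1):
            for a in product(values, repeat=len_a):
                for b in product(values, repeat=len_b):
                    # The merge expects sorted inputs, so sort each side first.
                    a_sorted, b_sorted = sorted(a), sorted(b)
                    assert merge_sorted(a_sorted, b_sorted) == sorted(a + b)
                    checked += 1
    return checked


if __name__ == "__main__":
    print(check_all_small_inputs(), "input pairs checked")
```

This only covers lists up to a fixed length over three values, so it says nothing about larger inputs. A proof, graphical or otherwise, would have to carry the argument the rest of the way.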

> quote. Then I should like to examine some high level solutions
> for getting the log files back into sync. endquote.
>
> In my experience the nicest way of doing this is in one of the Lisp
> family of languages, possibly clojure today.

Actually I mean more like... Well the problem is caused by storing IRC
log files in Dropbox. When Dropbox syncs, it doesn't merge files if my
IRC client connects before Dropbox has merged the files. So it just
forks the file and says there's a conflict.

So a way to solve this would be to ask Dropbox if there's some way of
making it automatically merge. Or I could see if a DVCS has some kind
of tool for automatically merging files like this. I doubt either of
these solutions would work though. Another way would be to control my
IRC usage somehow so that it waits for Dropbox. All kinds of things
like that which can be considered on that level.
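
For comparison with those higher level options, the low level repair can be sketched in a few lines of Python. This is not the log script discussed in this thread, only a rough sketch under two assumptions: every log line starts with a sortable timestamp (ISO 8601, say), and each copy of the file is already in time order. The function name merge_forked_logs is hypothetical.

```python
import heapq


def merge_forked_logs(original_lines, conflicted_lines):
    """Merge two time-sorted copies of a log into one list of lines.

    Assumes each line begins with a timestamp whose string order matches
    its time order. Lines present in both copies are kept once.
    """
    merged = []
    for line in heapq.merge(original_lines, conflicted_lines):
        # Identical lines from the two copies arrive next to each other,
        # so comparing with the previous line is enough to drop the repeat.
        if not merged or line != merged[-1]:
            merged.append(line)
    return merged
```

One known weakness: a line that genuinely appears twice in a row in a single copy, with the same timestamp and text, would also be collapsed to one.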

> The huge win for litprog is the requirements change. Go change the
> markup. Make the code follow those ideas. Even bring in TDD
> ideas. Keeping the documentation ahead of the code, then alongside
> the code, is the discipline.

Yeah, very strongly agreed. I don't like writing documentation after
the fact and then having to change the code and realising I don't know
if I updated all my documentation. I really hate that. But that's not
a win for literate programming exactly, because even python docstrings
can give you that sort of functionality. Unless you consider those to
be literate programming, but I mean those are just comments. Comments
have been around in programming way longer than literate programming
has been around.
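
As a small illustration of what a docstring already gives you, here is a sketch using the standard doctest module. The function parse_nick and the '<nick> message' line format are invented for the example.

```python
import doctest


def parse_nick(line):
    """Return the nickname from a log line of the form '<nick> message'.

    The examples below are documentation and test cases at the same time.

    >>> parse_nick('<sbp> hello')
    'sbp'
    >>> parse_nick('no nick here') is None
    True
    """
    if line.startswith('<') and '>' in line:
        return line[1:line.index('>')]
    return None


if __name__ == '__main__':
    # Runs every example found in the docstrings of this module.
    doctest.testmod()
```

If the code changes and an example in the docstring stops being true, running the module reports the mismatch, which is the "did I update all my documentation" check in miniature.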

But that's what I mean, I don't think literate programming is all that
revolutionary anyway. The main idea is to make the comments have the
same kind of structural level as the code, so that the documentation
is a first class citizen and perhaps the code is slightly second
class. Also the macroisation, but again, not exactly unique to
literate programming is it?

> When detail reveals a necessary design change. I really don't
> think anything other than discipline will overcome this. Keeping
> the outline, detailed design and code in sync, within one litprog
> structure.

Actually that's the one big thing that's causing me problems. You know
how people write code sometimes and then they can't be bothered to
write the documentation? Well for me, I can imagine writing extremely
detailed documentation and then not bothering to write the code!

Well we'd all like English to be executable. But you know, I've seen a
criticism that pseudocode shouldn't be used in academic papers because
it's often a source of great confusion over the formality of what's
going on. That's why we have a mantra, "prove it in python!", on Tav's
IRC channel.

But the hypercode doesn't replace the code. This is just as true as
saying that a program is no good if there's no manual for it. Well a
manual is no good if there's no code for it!

But this is incidental.

> This is a hint to me to use something like a DITA structure where you
> could isolate those 'early' history notes such that they are only
> built with certain document stacks. Again selection is easy with XSLT.

Yeah, maybe. I hadn't seen DITA before; it's interesting.

> quote. We want to make a program so clear that you could publish it in
> a tabloid newspaper. endquote.
>
> Litprog again has this covered. When/if you realise that piece X needs
> heavy documentation to cover such and such a nuance it can be done
> there and then, as part of the development.

Yeah, but it's not so much the idea that you can make it clear, it's
*how clear* you can make it. What's the best way of making a
particular thing clear? Depends on the case. Those are the kinds of
things I want to explore, the extents of the clarity.

> The reference to a visualisation of the problem I guess is one person's
> goal, another's blank stare? We aren't all visual.

That's another benefit of using SVG. I'm not sure if raphaeljs can
produce accessible diagrams, but I'd hope that it would be possible to
furnish various ideas across different modalities. The whole "choice
not echo" point from XAG!

> quote. So as we move out to the leaves of the system, the amount of
> documentation that we will give to them will become smaller and less
> rich. endquote.
>
> Interesting. Yet you state that programmers view the detail as the
> important part? If you develop in smaller chunks of code, wrapping
> them together into larger pieces is seldom a problem.

Of course programmers view the detail as the most important part,
because the lines of code are what they are ultimately producing.
Again I'm not saying that the actual running code is inessential. Just
that the actual running code tends to start to come down to things
which are quite dependent on local factors. So Python has one way of
opening files using a certain encoding, and Perl has another way of
doing it. Well, Perl probably has a million ways of doing it.

We don't have to care about variations like that on the documentation
level, usually. Sometimes we do... for example it's nice to explain
that after decoding, Python then uses characters not bytes. But I mean
at this level we just have to give less and less rich documentation.
The code starts speaking for itself.
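
A minimal sketch of that one detail, assuming Python 3 (where the built-in open takes an encoding argument) and a made-up file name:

```python
import os
import tempfile

text = "caf\u00e9"  # four characters, the last one is non-ASCII

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.log")

    # Write the text out as UTF-8 encoded bytes.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Reading in binary mode gives the raw bytes on disk.
    with open(path, "rb") as f:
        raw = f.read()

    # Reading in text mode with an encoding gives decoded characters.
    with open(path, encoding="utf-8") as f:
        decoded = f.read()

print(len(raw))      # length counted in bytes
print(len(decoded))  # length counted in characters
```

The two lengths differ because the accented letter takes two bytes in UTF-8 but is a single character once decoded.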

> Dissecting a monolith is often a nightmare.

But a monolith too is made of single lines of code.

So I think I'm talking about a different kind of grouping.

> What's missing in this debate is the 'second time round' solution?
> I've looked at the working solution and I hate X and Y and Z. I want
> to tear it down and start again. Is this an opportunity for Sean's
> 'sessions dance' software? Quite possible. How to change the dance
> steps to do Y differently.

That's an interesting point that I hadn't considered. Until I work
through the details of how the dance actually works, and it's not all
that difficult but I haven't properly thought about it yet, I don't
know whether another dance might achieve the same results. Then it
would be interesting to compare the types of dances.

In fact, since this is a sorting algorithm, I suppose there might
well be different kinds of dance. There are certainly many, many
single-list sorting algorithms.

--
Sean B. Palmer, http://inamidst.com/sbp/

Sean B. Palmer

May 1, 2010, 7:00:58 PM5/1/10
to wh...@googlegroups.com
On Sat, May 1, 2010 at 3:47 PM, Noah Slater wrote:

> The syntax of a particular language may be entirely unimportant,
> or, it may be representative of some larger difference. Some
> languages, like Erlang or Scheme, approach problems in a
> completely different way than, say, Python or Perl.

Sure, but you don't choose the logic based on the programming
language. You choose the programming language based on the logic.

--
Sean B. Palmer, http://inamidst.com/sbp/

Noah Slater

May 1, 2010, 7:07:01 PM5/1/10
to wh...@googlegroups.com

On 2 May 2010, at 00:00, Sean B. Palmer wrote:

> On Sat, May 1, 2010 at 3:47 PM, Noah Slater wrote:
>
>> The syntax of a particular language may be entirely unimportant,
>> or, it may be representative of some larger difference. Some
>> languages, like Erlang or Scheme, approach problems in a
>> completely different way than, say, Python or Perl.
>
> Sure, but you don't choose the logic based on the programming
> language. You choose the programming language based on the logic.

I'm not convinced it's as simple as that.

Tim Bray's Wide Finder project is very similar to your example project, in that it is a "simple" log parsing exercise. The solutions to the problem have been provided in almost every major language, some of them more fit than others. We may choose a language because we want to learn a new one, or because we are familiar with an old one, and in both cases, our approach to the problem may be shaped by the language we choose. Solving a problem with Erlang is fundamentally different to solving the same problem with Python. We probably should choose languages to match the problem domain, but there is only one programmer I've ever met who does that.

Dave Pawson

May 2, 2010, 2:19:34 AM5/2/10
to wh...@googlegroups.com
On 2 May 2010 00:07, Noah Slater <nsl...@me.com> wrote:

> I'm not convinced it's as simple as that.
>
> Tim Bray's Wide Finder project is very similar to your example project, in that it is a "simple" log parsing exercise. The solutions to the problem have been provided in almost every major language, some of them more fit than others. We may choose a language because we want to learn a new one, or because we are familiar with an old one, and in both cases, our approach to the problem may be shaped by the language we choose. Solving a problem with Erlang is fundamentally different to solving the same problem with Python. We probably should choose languages to match the problem domain, but there is only one programmer I've ever met who does that.



+1

I have a hammer syndrome.


regards

--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk

Dave Pawson

May 2, 2010, 2:21:47 AM5/2/10
to wh...@googlegroups.com
On 1 May 2010 23:58, Sean B. Palmer <s...@miscoranda.com> wrote:
quote. As you go along writing it, you might change your mind and have
throwaways, but those would be throwaways of the hypercode, not of the
code in whatever programming language you have. endquote.

Love it. Hypercode throwaway. Even catchy. Isn't that what p-code was
all about, back in the 80's?

if stream 1 has higher element
    process stream 1
code written for the programmer, to be interpreted as code for the computer?
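
For comparison, one way that fragment might look once it is written for the computer is the Python sketch below. It is a generic two-stream merge, not Sean's sessions dance, and it assumes both streams are already sorted. It takes the lower head first so that the output comes out in ascending order. The name merge_two_streams is made up.

```python
def merge_two_streams(stream1, stream2):
    """Yield the items of two sorted iterables in ascending order."""
    it1, it2 = iter(stream1), iter(stream2)
    done = object()  # sentinel marking an exhausted stream
    head1, head2 = next(it1, done), next(it2, done)

    while head1 is not done or head2 is not done:
        # Process stream 1 when stream 2 is empty, or when stream 1's
        # head sorts first; otherwise process stream 2.
        if head2 is done or (head1 is not done and head1 <= head2):
            yield head1
            head1 = next(it1, done)
        else:
            yield head2
            head2 = next(it2, done)
```

Calling list(merge_two_streams(a, b)) on two sorted lists returns a single sorted list containing every item from both.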

quote. For example, I don't know how to prove that the sessions dance
machine works for all input test cases. endquote.

Which is no different from 'not knowing that this code I'm writing
will work for all corner cases'? Stands a better chance though, since
it is at a higher level? Equally, it's a model, don't expect perfection.

quote. But my contention is just the opposite, that litprog is in fact
very broken. When I say broken I don't mean that a good coder couldn't
make something good with it. Just that I believe we can go very far
beyond the current state of the art. endquote.

OK. I hadn't picked that up.

Concern. How to check if your visualisation ideas are viable beyond
sorting out your logs? Pick half a dozen programs you've worked on and
see if you can 'see' the algorithm flowing or moving? If you can't
visualise it, what else would you do that works for you? Words or an
audio stream?

quote. Come up with pretty diagrams and embedded testable use cases in
the documentation that you can run live. endquote.

So you are still intending to keep extensive documentation? As
litprog? Or as annotations to the visual 'dance'.

quote. And I miss that excitement that I felt when I first started
programming and I was actually creating and exploring interesting
ideas, endquote

How about the additional effort you put in when you know your code is
going to go public too? I think that makes me put more into the
documentation and the code.

quote. Programming for me goes on in the head. You know, sometimes I
lie awake at night imagining how a particular interface should
work,.... endquote.

I'm starting to see (sic) audio streams in this documentation? Where
you can discuss why the beauty you see is there, why that interface is
as it is. HTML5 +SVG?

quote. The main idea is to make the comments have the same kind of
structural level as the code, so that the documentation is a first
class citizen and perhaps the code is slightly second class. Also the
macroisation, but again, not exactly unique to literate programming is
it? endquote.

No, agreed. That puts the ideas into a new light for me. Using an xml
analogy, putting the documentation into a different namespace? No
impact on the code when run, but still there. Like it.


quote. Yeah, but it's not so much the idea that you can make it clear,
it's *how clear* you can make it. What's the best way of making a
particular thing clear? Depends on the case. endquote.

Again, shifting the documentation up a gear to become the first class
citizen it deserves. "How clear Documentation" is the catch phrase :-)


quote. > Dissecting a monolith is often a nightmare.
But a monolith too is made of single lines of code.
So I think I'm talking about a different kind of grouping. endquote.

Can you expand on that please Sean? New aspect to me?


regards


--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk

Noah Slater

May 2, 2010, 7:50:23 AM5/2/10
to wh...@googlegroups.com

On 1 May 2010, at 23:58, Sean B. Palmer wrote:

> And my analogy with the pythagorean proof was an interesting one
> because you can see how in that case you can actually provide a
> graphical representation which covers a whole *class* of test cases!
> So there might be a way you could do that for the state machine for
> the sessions dance that I described. Maybe there isn't, I don't know!
> But that is one of the things that I'd try to do, and that would make
> a test suite redundant in this case.
>
> Basically it would be a kind of mathematical proof that the code
> works. Well, when you have a proof you don't need test cases. You have
> the proof. But you can't prove all code, only trivial code really. The
> log file merger is a trivial bit of code.

What makes Z notation unsuitable for this?

That would provide the proof of the system you describe, and in essence, the test case that covers n test cases. If you could build a test harness in Z notation, for a particular part of the system, you could prove all possible test cases. I wonder what a Z visualisation would look like? I wonder if something like Mathematica would provide visualisations for running sets of test cases through an algorithm or mathematical function.