;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 0 AUTHOR and LICENSE
;;; Timothy Daly (da...@axiom-developer.org)
;;; License: Public Domain
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 1 ABSTRACT and USE CASES
;;; Don Knuth has defined literate programming as a combination of
;;; documentation and source code in a single file. The TeX language
;;; is documented this way in books. Knuth defined two functions
;;; tangle -> extract the source code from a literate file
;;; weave -> extract the latex from a literate file
;;; This seems unnecessarily complex. Latex is a full programming
;;; language and is capable of defining "environments" that can
;;; handle code directly in Latex. Here we define the correct environment
;;; macros. Thus, the "weave" function is not needed.
;;; If this "tangle" function were added to Clojure then Clojure could
;;; read literate files in Latex format and extract the code. We create
;;; the necessary "tangle" function here.
;;; This program will extract the source code from a literate file.
;;; A literate lisp file contains a mixture of latex and lisp source code.
;;; The file is intended to be in standard latex format. In order to
;;; delimit code chunks we define a latex "chunk" environment.
;;; Latex format files define a new environment so that code chunks
;;; can be delimited by \begin{chunk}{name} .... \end{chunk} blocks
;;; This is supported by the latex code shown in section 2.
;;; So a trivial example of a literate latex file might look like
;;; (ignore the prefix semicolons. that's for lisp)
; this is a file that is in a literate
; form it has a chunk called
; \begin{chunk}{first chunk}
; THIS IS THE FIRST CHUNK
; \end{chunk}
; and this is a second chunk
; \begin{chunk}{second chunk}
; THIS IS THE SECOND CHUNK
; \end{chunk}
; and this is more in the first chunk
; \begin{chunk}{first chunk}
; \getchunk{second chunk}
; THIS IS MORE IN THE FIRST CHUNK
; \end{chunk}
; \begin{chunk}{all}
; \getchunk{first chunk}
; \getchunk{second chunk}
; \end{chunk}
; and that's it
;;; From a file called "testcase" that contains the above text
;;; we want to extract the chunk named "second chunk". We do this with:
; (tangle "testcase" "second chunk")
; which yields:
; THIS IS THE SECOND CHUNK
;;; From the same file we might extract the chunk named "first chunk".
;;; Notice that this has the second chunk embedded recursively inside.
;;; So we execute:
; (tangle "testcase" "first chunk")
; which yields:
; THIS IS THE FIRST CHUNK
; THIS IS THE SECOND CHUNK
; THIS IS MORE IN THE FIRST CHUNK
;;; There is a third chunk called "all" which will extract both chunks:
; (tangle "testcase" "all")
; which yields
; THIS IS THE FIRST CHUNK
; THIS IS THE SECOND CHUNK
; THIS IS MORE IN THE FIRST CHUNK
; THIS IS THE SECOND CHUNK
;;; The tangle function takes a third argument which is the name of
;;; an output file. Thus, you can write the same results to a file with:
; (tangle "testcase" "all" "outputfile")
;;; It is also worth noting that all chunks with the same name will be
;;; merged into one chunk so it is possible to split chunks in multiple
;;; parts and have them extracted as one. That is,
; \begin{chunk}{a partial chunk}
; part 1 of the partial chunk
; \end{chunk}
; not part of the chunk
; \begin{chunk}{a partial chunk}
; part 2 of the partial chunk
; \end{chunk}
;;; These will be combined on output as a single chunk. Thus
; (tangle "testmerge" "a partial chunk")
; will yield
; part 1 of the partial chunk
; part 2 of the partial chunk
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 2 THE LATEX SUPPORT CODE
;;; The verbatim package quotes everything within its grasp and is used to
;;; hide and quote the source code during latex formatting. The verbatim
;;; environment is built in but the package form lets us use it in our
;;; chunk environment and it lets us change the font.
;;;
;;; \usepackage{verbatim}
;;;
;;; Make the verbatim font smaller
;;; Note that we have to temporarily change the '@' to be just a character
;;; because the \verbatim@font name uses it as a character
;;;
;;; \chardef\atcode=\catcode`\@
;;; \catcode`\@=11
;;; \renewcommand{\verbatim@font}{\ttfamily\small}
;;; \catcode`\@=\atcode
;;; This declares a new environment named ``chunk'' which has one
;;; argument that is the name of the chunk. All code needs to live
;;; between the \begin{chunk}{name} and the \end{chunk}
;;; The ``name'' is used to define the chunk.
;;; Reuse of the same chunk name later concatenates the chunks
;;; For those of you who can't read latex this says:
;;; Make a new environment named chunk with one argument
;;; The first block is the code for the \begin{chunk}{name}
;;; The second block is the code for the \end{chunk}
;;; The % is the latex comment character
;;; We have two alternate markers, a lightweight one using dashes
;;; and a heavyweight one using the \begin and \end syntax
;;; You can choose either one by changing the comment char in column 1
;;; \newenvironment{chunk}[1]{% we need the chunkname as an argument
;;; {\ }\newline\noindent% make sure we are in column 1
;;; %{\small $\backslash{}$begin\{chunk\}\{{\bf #1}\}}% alternate begin mark
;;; \hbox{\hskip 2.0cm}{\bf --- #1 ---}% mark the beginning
;;; \verbatim}% say exactly what we see
;;; {\endverbatim% process \end{chunk}
;;; \par{}% we add a newline
;;; \noindent{}% start in column 1
;;; \hbox{\hskip 2.0cm}{\bf ----------}% mark the end
;;; %$\backslash{}$end\{chunk\}% alternate end mark (commented)
;;; \par% and a newline
;;; \normalsize\noindent}% and return to the document
;;; This declares the place where we want to expand a chunk
;;; Technically we don't need this because a getchunk must always
;;; be properly nested within a chunk and will be verbatim.
;;; \providecommand{\getchunk}[1]{%
;;; \noindent%
;;; {\small $\backslash{}$begin\{chunk\}\{{\bf #1}\}}}% mark the reference
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 3 IMPORTS
(import [java.io BufferedReader FileReader BufferedWriter FileWriter])
(import [java.lang String])
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 4 THE TANGLE COMMAND
;;;
;;; The tangle command does all of the work of extracting code.
;;;
;;; In latex form the code blocks are delimited by
;;; \begin{chunk}{name}
;;; ... (code for name)...
;;; \end{chunk}
;;;
;;; and referenced by \getchunk{name} which gets replaced by the code
;;; There are several ways to invoke the tangle function.
;;;
;;; The first argument is always the file from which to extract code
;;;
;;; The second argument is the name of the chunk to extract
;;; (tangle "clweb.pamphlet" "name")
;;;
;;; The standard chunk name is ``*'' but any name can be used.
;;;
;;; The third argument is the name of an output file:
;;; (tangle "clweb.pamphlet" "clweb.chunk" "clweb.spadfile")
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 5 SAY
;;; This function will either write to the output file, or if null,
;;; to the console
(defn say [where what]
  (if where
    (do (.write where what) (.write where "\n"))
    (println what)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 6 READ-FILE
;;; Here we return a lazy sequence that will fetch lines as we need them
;;; from the file.
(defn read-file
  "Return a lazy sequence of the lines in the file named streamname"
  [streamname]
  (let [stream (BufferedReader. (FileReader. streamname))]
    (line-seq stream)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 7 ISCHUNK
;;; There is a built-in assumption (in the ischunk functions)
;;; that the chunks occur on separate lines and that the indentation
;;; of the chunk reference has no meaning.
;;;
;;; ischunk recognizes chunk names in latex convention
;;;
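;;; There are 3 cases to recognize:
;;; \begin{chunk}{thechunkname} ==> 'define thechunkname
;;; \end{chunk}                 ==> 'end nil
;;; \getchunk{thechunkname}     ==> 'refer and the whole trimmed line
;;;                                 (expand strips out the name later)
;;; Any other line              ==> nil and the trimmed line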
;;; The regex pattern #"^\\begin\{chunk\}\{.*\}$" matches
;;; \begin{chunk}{anything here}
;;; The regex pattern #"^\\end\{chunk\}$" matches
;;; \end{chunk}
;;; The regex pattern #"^\\getchunk\{.*\}$" matches
;;; \getchunk{anything here}
(defn ischunk
  "Find chunks delimited by latex syntax"
  [line]
  (let [begin   #"^\\begin\{chunk\}\{.*\}$"
        end     #"^\\end\{chunk\}$"
        get     #"^\\getchunk\{.*\}$"
        trimmed (.trim line)]
    (cond
      (re-find begin trimmed)
      ;; drop the 14 characters of \begin{chunk}{ and the closing brace
      (list 'define (apply str (butlast (drop 14 trimmed))))
      (re-find end trimmed)
      (list 'end nil)
      (re-find get trimmed)
      (list 'refer trimmed)
      :else
      (list nil trimmed))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 8 HASHCHUNKS
;;; hashchunks gathers the chunks and puts them in the hash table
;;;
;;; if we find the chunk syntax and it is a
;;; define ==> parse the chunkname and start gathering lines onto a stack
;;; end ==> stop gathering; the lines are already in the hash table
;;; otherwise ==> if we are gathering, push the line onto the stack
;;; a hash table entry is a single list of lines such as
;;; ("6" "5" "4" "3" "2" "1")
;;; the lines are held in reverse (stack) order.
;;; reuse of the same chunkname pushes more lines onto the same list
;;; Calls to ischunk can have 4 results (define, end, refer, nil) where
;;; define ==> we found a \begin{chunk}{...}
;;; end ==> we found a \end{chunk}
;;; refer ==> we found a \getchunk{...}
;;; nil ==> ordinary text or program text
;;;
;;; gather is initially false, implying that we are not gathering code.
;;; gather is true if we are gathering a chunk
(defn hashchunks
  "Gather all of the chunks and put them into a hash table"
  [lines]
  (loop [line      lines
         gather    false
         hash      (hash-map)
         chunkname ""]
    (if (not (empty? line))
      (let [[key value] (ischunk (first line))]
        (condp = key
          'define
          (recur (rest line) true hash value)
          'end
          (recur (rest line) false hash chunkname)
          'refer
          (if gather
            (recur (rest line) gather
                   (assoc hash chunkname (conj (get hash chunkname) value))
                   chunkname)
            (recur (rest line) gather hash chunkname))
          nil
          (if gather
            (recur (rest line) gather
                   (assoc hash chunkname (conj (get hash chunkname) value))
                   chunkname)
            (recur (rest line) gather hash chunkname))))
      hash)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 9 EXPAND
;;; expand will recursively expand chunks in the hash table
;;;
;;; latex chunk names are just the chunkname itself e.g. chunkname
;;; a hash table key is the chunk name and the value is a reverse
;;; list of all of the text in that chunk.
;;; To process the chunk we reverse the list and process the lines.
;;; If a chunk name reference is encountered in a line we call expand
;;; recursively to expand the inner chunkname.
(defn expand
  "Recursively expand latex getchunk tags"
  [chunkname where table]
  (let [chunk (reverse (get table chunkname))]
    (when chunk
      (loop [lines chunk]
        (when (not (empty? lines))
          (let [line (first lines)]
            (let [[key value] (ischunk line)]
              (if (= key 'refer)
                (do
                  ;; drop the 10 characters of \getchunk{ and the closing brace
                  (expand (apply str (butlast (drop 10 value))) where table)
                  (recur (rest lines)))
                (do (say where line)
                    (recur (rest lines)))))))))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; 10 TANGLE
;;; We expand all of the lines in the file that are surrounded by the
;;; requested chunk name. These chunk names are looked up in the hash
;;; table built by hashchunks, given the input filename.
;;; then we recursively expand the ``topchunk'' to the output stream
(defn tangle
  "Extract the source code from a pamphlet file, optional file output"
  ([filename topchunk] (tangle filename topchunk nil))
  ([filename topchunk file]
   (if (string? file)
     (with-open [where (BufferedWriter. (FileWriter. file))]
       (expand topchunk where (hashchunks (read-file filename))))
     (expand topchunk nil (hashchunks (read-file filename))))))
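As a cross-check on the expected output listed in section 1, here is a minimal sketch of the same chunk-gathering and expansion idea that works on a string and returns the lines instead of printing them. It is not the code above: the names parse-chunks, expand-lines, begin-re, get-re, and testcase are made up for illustration, it needs Clojure 1.7 or later for update, and it does nothing to guard against chunks that refer to each other in a cycle.

```clojure
(require '[clojure.string :as str])

;; Patterns with a capture group so the chunk name comes back directly.
(def begin-re #"^\\begin\{chunk\}\{(.*)\}$")
(def get-re   #"^\\getchunk\{(.*)\}$")

(defn parse-chunks
  "Map each chunk name to a vector of its body lines, in document order.
  Chunks that reuse a name are concatenated."
  [text]
  (loop [lines (str/split-lines text), current nil, table {}]
    (if (empty? lines)
      table
      (let [line (str/trim (first lines))
            [_ begin-name] (re-find begin-re line)]
        (cond
          begin-name              (recur (rest lines) begin-name table)
          (= line "\\end{chunk}") (recur (rest lines) nil table)
          current                 (recur (rest lines) current
                                         (update table current (fnil conj []) line))
          :else                   (recur (rest lines) current table))))))

(defn expand-lines
  "Return the lines of chunk `chunk-name`, expanding \\getchunk references."
  [table chunk-name]
  (mapcat (fn [line]
            (if-let [[_ target] (re-find get-re line)]
              (expand-lines table target)
              [line]))
          (get table chunk-name)))

;; The example file from section 1, without the comment prefixes.
(def testcase
  (str/join "\n"
            ["this is a file that is in a literate form"
             "\\begin{chunk}{first chunk}"
             "THIS IS THE FIRST CHUNK"
             "\\end{chunk}"
             "\\begin{chunk}{second chunk}"
             "THIS IS THE SECOND CHUNK"
             "\\end{chunk}"
             "\\begin{chunk}{first chunk}"
             "\\getchunk{second chunk}"
             "THIS IS MORE IN THE FIRST CHUNK"
             "\\end{chunk}"
             "\\begin{chunk}{all}"
             "\\getchunk{first chunk}"
             "\\getchunk{second chunk}"
             "\\end{chunk}"]))

(doseq [line (expand-lines (parse-chunks testcase) "all")]
  (println line))
```

Asking for "all" prints the same four lines that section 1 lists for (tangle "testcase" "all").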
Thanks so much for your work!
--Robert McIntyre
On 12/26/2010 9:56 PM, Praki Prakash wrote:
> Tim,
>
> This approach is very interesting. My choice of mode for LP has always
> been noweb-mode but it doesn't seem to work with my version of emacs
> anymore. My current approach is to embed prose and clojure code in a
> latex document and generate a .tex file with formatted clojure code
> and .clj containing only clojure code. Needless to say, it is a hack
> and I would like to see if I can adopt your approach.
Well, I'm something of a primitivist so I don't use modes in Emacs.
I know there is a noweb mode and a clojure mode and a latex mode but
I've never used any of them so I can't comment.
In the Knuth web approach (ala noweb) the chunk markup is not valid
latex. Therefore you need to run "weave" to extract valid latex.
I found this pointless so I wrote the latex environment macros to
wrap the code chunks. This means that the document you are writing
is always pure latex. So you might find it useful to use the chunk
environment and skip the "generate .tex file" step.
All that is left is to extract the clojure code. While writing the
code I have a REPL open so I can kill/yank changes directly. The
full build/test cycle uses the attached code to extract from the
literate document (which I call a "pamphlet").
>
> However, I have a question on mapping of line numbers in clojure
> stacktrace to its source. AFAIK, there is no support in clojure
> compiler for #LINE directive. In my case, a code is always in one
> location and I just replace latex lines with empty lines. How do you
> address this issue?
I don't use the #LINE directive. I usually use a split screen
with lisp or clojure in a shell buffer (not a slime setup) and
my code in the other buffer. I've hand-checked the code in the
REPL before adding to or changing the document. After each change
I do a complete system rebuild/test cycle.
Thus, if it fails I know exactly what I broke and where I broke it.
It is broken at the last change, which is likely still in a buffer.
When it works, the code is already documented in latex and pdf (since
the build/test automatically regenerates the pdf).
I would like to see Clojure move toward literate programming and
have direct support for reading and compiling latex documents.
It is surprisingly efficient and effective. In my last 3 year
project I implemented over 60k lines of lisp and 6k pages of
documentation (with the embedded lisp code in literate form, of
course). When the "program was done", the "documentation was done".
In fact, that was true from the first day of the project and was
an invariant throughout. I highly recommend it.
Tim
On 12/26/2010 8:33 PM, Robert McIntyre wrote:
> That's really cool. I was _just reading_ your comments from 2006 at
> http://www.mail-archive.com/gard...@lispniks.com/msg01006.html and
> wondering about how hard something like this would be to write. If
> possible, could you expand on how one might use this in a development
> work-flow with emacs or texmacs? Is there some sort of lisp-pamphlet
> mode that could be extended to clojure? I'm assuming you've already
> solved all these problems and more while working with Axiom. I'd
> appreciate any pointers or best practices you might have found.
Wow. That's an old reference. Most of those links are dead. Try
http://en.wikipedia.org/wiki/Axiom_computer_algebra_system
or http://axiom-developer.org
As I mentioned in my other email on this subject I don't use Emacs
modes. I don't want the computer editing my text. I just want it to
do exactly what I told it to do. But I'm a curmudgeon :-)
My style of programming is to change one thing and then do a complete
system rebuild and test, which takes about 1 hour on my fastest machine,
5 on my slowest. I "work a pipeline" of changes so the build/test time
is overlapped with changes. I can have up to 5 machines doing build/test
while I'm working on the 6th change. (Git helps a LOT here. I strongly
recommend git also.)
I usually use a split screen with the literate latex document (I call
it a "pamphlet") in one buffer (not a slime setup) and the REPL in the
other buffer. I use the REPL to hand-check the code before modifying
the pamphlet.
Thus, if (when :-)) it fails I know what I broke and where I broke it.
When it works, the code is already documented in latex and pdf.
The build/test automatically regenerates the pdf.
The goal is to make the smallest change possible and concentrate on getting
it right. It might require writing or updating test cases which are also
in the same literate document under a "test" chunk name. The automated build
extracts and runs all "test" chunks.
So the essential loop is:
edit the pamphlet in a split buffer
define the functions in the REPL buffer
change the function
hand-test the change in the REPL
update the pamphlet with the changed function
write/update the test code
save the pamphlet
do a full system build and test
check that the tests pass
check the pdf documentation
The second step would be a lot easier if Clojure knew how to load
a latex file. I'm thinking of writing a load-pamphlet function to
do this. It could specify a chunk name to extract as in:
(defn load-pamphlet
  ([file] (load-pamphlet file "*"))
  ([file chunk] (tangle file chunk)))
Thus, to do step 2 above I just need to (load-pamphlet...) the file.
I could run the tests with (load-pamphlet file "test")
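One way this might work, sketched under assumptions: the two-argument tangle prints the extracted code to *out*, so with-out-str can capture that text and load-string can evaluate it. Here tangle-fn is a hypothetical extra parameter standing in for tangle, and fake-tangle is a made-up stub so the sketch runs without a pamphlet on disk.

```clojure
(defn load-pamphlet
  "Hypothetical sketch: tangle `chunk` out of `file`, then evaluate the text.
  `tangle-fn` stands in for a two-argument tangle that prints to *out*."
  ([tangle-fn file] (load-pamphlet tangle-fn file "*"))
  ([tangle-fn file chunk]
   ;; with-out-str captures whatever tangle-fn prints;
   ;; load-string reads and evaluates it, returning the last value.
   (load-string (with-out-str (tangle-fn file chunk)))))

;; Made-up stand-in for tangle: prints one form per chunk name.
(defn fake-tangle [_file chunk]
  (println (if (= chunk "test") "(* 6 7)" "(+ 1 2)")))

(println (load-pamphlet fake-tangle "ignored.pamphlet"))
(println (load-pamphlet fake-tangle "ignored.pamphlet" "test"))
```

With the real tangle passed as tangle-fn, the value returned would be that of the last form in the extracted chunk.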
\begin{rant}{
\tl;dr{Let's move programming out of the dark ages.}
We of the Clojure community are trying hard to move Lisp out of
the 60s programming model. We can also move programming out of
the 60s code-then-maybe-document model. The Knuth technology is
30+ years old.
When I started 40 years ago (http://www.dilbert.com/fast/2010-12-23/)
we had limits like 8k of memory so no file was bigger than 4k. Thus,
C programs were "tiny piles of sand" with include/overlay linkers/etc.
Few of you have ever worked on a fully loaded PDP 11/40. Why do you
program like you only have 4k?
Now we can have a machine with 128Gig of memory. It's about time for
programmers to lose the "one function-one file" idea based on keeping
the machine happy. We need to think about one-program-one-document to
keep the humans happy. For a perfect example, see the book:
"Lisp In Small Pieces by Christian Queinnec" [1]
Queinnec's book is my ultimate example. He MOTIVATES every piece of
code that gets introduced and he talks about everything. The code
contains a complete lisp system including the interpreter, compiler,
and garbage collector. If you want to know lisp, read this book.
If you want to learn literate programming, study this book.
I want "Clojure In Small Pieces", a literate form of Clojure that
I can execute from the book. Then I can open to the chapter on
PersistentHashMap and read all about log32 tries, why they matter,
and how Clojure implements them, with literature references and a
good index. Oh, yeah, and the ACTUAL code that gets executed.
Forget the jar file. Send me the pamphlet containing Clojure.
Programming will come out of the dark ages when we employ English
majors as project leads with the title "Editor in Chief". If you
can explain it to an English major in text, you probably understand
the problem :-)
Let's move programming out of the dark ages.
\end{rant}
Tim Daly
[1] http://www.amazon.com/Lisp-Small-Pieces-Christian-Queinnec/dp/0521545668
Yes I did play with org-mode + babel for clojure.
It works great :-)
Just make sure you are using latest and greatest of org-mode.
Cheers,
Hubert.
sincerely,
--Robert McIntyre
At least in theory. I am stuck with running a couple
tests. The only real change I've made to the sources
is to make it fit a printable page which involves
changing a line to make it shorter.
I've run into a syntax for strings that I don't understand.
The string #"some string" is used in the test files. The
documentation on the reader does not list this as a possible
input case. What does it mean?
Once I cross this hurdle everything else works and I can
post a new version for your experiments.
Tim
It's reader syntax for a regular expression.
user=> (type #"some string")
java.util.regex.Pattern
It and its reader macro friends can be found here:
http://clojure.org/reader
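To make that concrete, here is a small sketch comparing the #"..." literal with the same pattern built from an ordinary string by re-pattern. The pattern is the \getchunk one from the tangle code with a capture group added; the names literal and built are only for illustration.

```clojure
;; In a regex literal, backslashes go to the regex engine unchanged.
(def literal #"^\\getchunk\{(.*)\}$")

;; In an ordinary string every backslash must be doubled again.
(def built (re-pattern "^\\\\getchunk\\{(.*)\\}$"))

;; Both are java.util.regex.Pattern objects with the same pattern text.
(println (= (str literal) (str built)))

;; With a capture group, re-find returns [whole-match group].
(println (second (re-find literal "\\getchunk{second chunk}")))
```

The literal form saves one level of escaping, which is why it shows up in the test files.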
> Once I cross this hurdle everything else works and I can
> post a new version for your experiments.
Thanks for your efforts on this. I'm quite interested in its potential!
- Jeff
hope that helps,
--Robert McIntyre
Note that 'literate programming' involves writing literature
for other people to read. The executable code is included as
a 'reduction to practice' but the emphasis is on describing
the ideas. Rich has some powerful ideas that he has reduced
to running code. What we need to do is start with a description
of the ideas and bridge the gap to the actual implementation.
Ideally you can read a literate program like a novel, from
beginning to end, and find that every line of code has a
'motivation' for being introduced. The side-effect is that
there is a reason why the idea is implemented in a particular
way rather than 'just because it worked'. Literate programming
tends to improve code quality because you have to explain it.
Emacs org-mode, on the other hand, is a useful development
technology but it really isn't literate programming.
Tim Daly
http://daly.axiom-developer.org/clojure.pdf
http://daly.axiom-developer.org/clojure.pamphlet
http://daly.axiom-developer.org/clojure.sty
This version of the literate document contains a
complete, working system. The steps for building
it are in the preface.
Essentially you compile the tangle function from
the document (or use the same source code here:
http://daly.axiom-developer.org/tangle.c )
Then you run tangle to extract the Makefile.
Then you type make.
Or, for the programmers:
1) edit the file, clip out and save tangle.c
2) gcc -o tangle tangle.c
3) tangle clojure.pamphlet Makefile >Makefile
4) make
It should extract the sources, build Clojure,
test it, build the pdf, and leave you at a
REPL prompt.
The source tree lives under the 'tpd' directory.
You can put it anywhere with an argument to make, e.g.
4) make WHERE=myplace
This means that you only need the latex document
to develop (resist the urge to edit the other
files).
Now the problem is to write the ideas and connect
them to the code. I started doing this for the
Red Black tree idea and PersistentTreeMap. Feel
free to pick an idea (or suggest one) and work
out the details.
I urge you to try the edit/build cycle using
literate tools as a possible different way to
work. My usual command line after every change is:
rm -rf tpd && tangle clojure.pamphlet Makefile >Makefile && make
A complete rebuild from scratch takes less than a
minute on a fast machine.
Tim
On 1/5/2011 9:27 AM, Seth wrote:
>> Just discovered org-mode myself --- does anyone know of guide to using
>> it with clojure for a total newbie?
> I havent actually used it for clojure per se. I was just imagining how
> it could be used. You have the ability to embed arbitrary code (from
> many different languages). You can edit the code in its own emacs
> major mode and then it will automatically be saved back once done. You
> can then document it using org-modes awesome abilities.
> However, this is sort of clumsy.
>
> I would rather be able to have all of my code in all of its 'little
> files' arranged in directories.
Just out of curiosity, what is the advantage of maintaining code
in 'little files'? The main reason people use an IDE is that they
get a whole-project view. The IDE lets you move around and find
things as though your project was effectively one big file. It really
is a kind of patch on the little-file organization. What is it about
'little file format' that is actually useful? Except for habit, is
there a real advantage?
Conceptually separate efforts, such as the Clojure-contrib effort
could be done in separate volumes. But you would expect this kind of
natural organization, just as you might expect a multi-volume story
like the Asimov Foundation series. Wouldn't a well organized contrib
literate document be useful? When you program and reach for your
books, don't you find a good index and cross-reference the best way
to find the idea you need?
> And when im editing the clojure files,
> i would like to be like 'oh, i want to document this better/introduce
> the motivation etc! And then automatically have the code, or parts of
> the code, copied to the org file and then i could document it. And
> then jump back to the code to continue developing.
Viewing the step of 'documenting the code' as just another step in
development is one of the reasons that documentation is rarely done
and even more rarely done well. The target audience is usually another
developer so tools like Javadoc exist to make it easy to look up class
details. What tools exist to support whole-project, idea-to-implementation
documentation?
Viewing documentation as a phase of development is conceptually and
actually very different from viewing the project as a literate effort.
Development targets the machine whereas a literate effort targets people.
The difference in target audience makes a qualitative difference in what
you write, when you write it, and why you write it. There is nothing
about the Javadoc organization (to pick on one tool) that encourages or
even supports the 'idea-to-implementation' flow. Are there open source
examples of documentation from the ideas to the implementation?
Literate programming allows you to reorganize your code by ideas. For
instance, in the clojure example, the PersistentTreeMap class is split
into its subclasses like Node. Okasaki's work starts with the idea of
Nodes so we highlight and explain the Node structure of PersistentTreeMap
before we get into the top level class details. In this way you motivate
the need for the Node class and 'bring the reader along' so that when
they get to PersistentTreeMap they already understand Nodes.
Because of the way Java forces you to organize your code you have to
introduce the PersistentTreeMap class before you introduce the Node class.
This is the late 90s and we ought to be able to organize our code any
way we want rather than be forced to organize it for the compilers,
linkers, and loaders. Why would we want to organize our code for the
convenience of our tools?
> And have changes in
> the clojure file automatically reflected in the org file. I was
> thinking that 'chunk' labels could be embedded in the source code
> (like in marginalia in github: just comments like ;;##Block Name) so
> that we wouldn't have to have all code in one file in one chunk, but
> could split it up.
>
Having literate code in more than one file is certainly possible because
Latex supports an 'include' command. You could include the code chunks or
you could include the chapters and keep them in 'little file format'.
I'm not sure what advantage this confers. Working in a single file or
working in multiple files is pretty transparent. Emacs lets me split
buffers in the same file as easily as having two files in split windows.
Finding things is SO much easier (hey, it's certainly in THIS file :-) )
The hardest initial part is using apt-get texlive.
The difference is not the physical organization but the mental organization.
A single file format gives the impression of a book and with that comes the
skills of organizing the material for presenting the ideas in a logical
fashion. It is this change of mental viewpoint that is the critical part
of literate programming rather than the tools like emacs-org-mode or latex.
The mental transition to this style of programming is at least as hard as
the mental transition from Object Oriented programming to Functional
programming.
In my experience, the gain is worth the effort.
Clojure code is nearly documentation-free at the moment. We could certainly
create some Javadoc-like tools or IDEs or Emacs-modes that fit our normal
routines or we could experiment with a whole new mindset. Clojure users
are early-adopting, risk-taking, leading-edge developers. Literate
programming is just the right kind of challenge for innovating away
from the past.
Besides, wouldn't it be great if the community could point at "the Clojure
book", like the Lisp community points to "the Steele book"? Steele managed
to organize an unreadable standard into a readable and very useful book.
Surely we can do the same.
Tim Daly
> The literate programming is actually a contrib to org-mode.
> http://orgmode.org/worg/org-contrib/babel/
>
This has been moved out of contrib and into the Org-mode core, so with
recent versions of Org-mode the code block "Literate Programming" and
"Reproducible Research" support is built in.
In fact when Emacs 24 is released this will be part of the Emacs core.
>
> Ive actually used it to create my emacs.el, by having code in
> emacs.org and have init.el tangle out the emacs code. Of course i
> never documented
> anything and did it for the novelty of being able to organize all that
> code in one file, instead of expanding it to other files :)
I do this myself and find it very convenient. In fact I maintain a
Literate fork of Phil Hagelberg's emacs-starter-kit which does exactly
this, allowing you to keep your emacs customizations in either .org files
or .el files.
The git repo for this is here
https://github.com/eschulte/emacs-starter-kit
and the documentation (exported from the literate .org files) is here
http://eschulte.github.com/emacs-starter-kit/
Cheers -- Eric
Seth <wbu...@gmail.com> writes:
>>Just discovered org-mode myself --- does anyone know of guide to using
>>it with clojure for a total newbie?
>
> I havent actually used it for clojure per se. I was just imagining how
> it could be used. You have the ability to embed arbitrary code (from
> many different languages). You can edit the code in its own emacs
> major mode and then it will automatically be saved back once done. You
> can then document it using org-modes awesome abilities.
> However, this is sort of clumsy.
>
There are a variety of options here
- you can write *all* of your code in a single large Org-mode file, and
tangle out .clj files for compilation.
- you can write *all* of your code in .clj files, and simply link to the
code from your .org files
- you can write some code in external .clj files, and some embedded in
.org files
- with current versions of Org-mode it is even possible to propagate
changes from a tangled .clj file back into the code blocks in a .org
file if e.g. you are working on a project with non-org users who would
rather edit the .clj files directly. See the `org-babel-detangle'
function.
>
> I would rather be able to have all of my code in all of its 'little
> files' arranged in directories. And when im editing the clojure files,
> i would like to be like 'oh, i want to document this better/introduce
> the motivation etc! And then automatically have the code, or parts of
> the code, copied to the org file and then i could document it. And
> then jump back to the code to continue developing. And have changes in
> the clojure file automatically reflected in the org file. I was
> thinking that 'chunk' labels could be embedded in the source code
> (like in marginalia in github: just comments like ;;##Block Name) so
> that we wouldn't have to have all code in one file in one chunk, but
> could split it up.
I am a grad student and spend much of my time writing code and running
experiments in Clojure. I do all of this in an environment of mixed
.org and .clj files. I find I prefer to write larger libraries directly
in .clj files, but then I often embed the snippets of code required for
running experiments, generating tables/graphs and analyzing experimental
results in code blocks embedded in Org-mode files. From these code
blocks I can either tangle the clojure code out into executable scripts,
or execute it /in situ/ in the .org file with the results dumped
directly into my org-mode buffer.
I find this to be a *very* comfortable research and development
environment, although as one of the main developers of the code block
support for Org-mode I'm certainly biased.
Cheers -- Eric
I'm confused as to what parts of LP practice are not supported by
Org-mode. Are you aware that Org-mode files can be exported to formats
more suitable for publication and human consumption (e.g. woven). See
http://orgmode.org/manual/Exporting.html
Tim Daly <da...@axiom-developer.org> writes:
> I looked at org-mode.
>
> Note that 'literate programming' involves writing literature
> for other people to read. The executable code is included as
> a 'reduction to practice' but the emphasis is on describing
> the ideas. Rich has some powerful ideas that he has reduced
> to running code. What we need to do is start with a description
> of the ideas and bridge the gap to the actual implementation.
>
> Ideally you can read a literate program like a novel, from
> beginning to end, and find that every line of code has a
> 'motivation' for being introduced. The side-effect is that
> there is a reason why the idea is implemented in a particular
> way rather than 'just because it worked'. Literate programming
> tends to improve code quality because you have to explain it.
>
> Emacs org-mode, on the other hand, is a useful development
> technology but it really isn't literate programming.
>
I would be interested to hear your thoughts as to why Org-mode is not a
literate programming tool.
Thanks -- Eric
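For the most up-to-date and comprehensive documentation of using
Org-mode to work with code blocks (e.g. Literate Programming or
Reproducible Research) the online manual is also very useful.
http://orgmode.org/manual/Working-With-Source-Code.html
Also, for a good review of Org-mode's support for the practices of
Literate Programming and Reproducible Research, see this draft
manuscript (currently in submission).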
http://cs.unm.edu/~eschulte/org-paper/
both the .org source and the exported .pdf are available
Cheers -- Eric
On 1/5/2011 7:37 PM, Eric Schulte wrote:
> Hi Tim,
>
> I'm confused as to what parts of LP practice are not supported by
> Org-mode. Are you aware that Org-mode files can be exported to formats
> more suitable for publication and human consumption (e.g. woven). See
> http://orgmode.org/manual/Exporting.html
I am truly impressed with the number of formats org-mode can
support. Of course, I expect nothing less as a heavy emacs user.
> Tim Daly<da...@axiom-developer.org> writes:
>
>> I looked at org-mode.
>>
>> Note that 'literate programming' involves writing literature
>> for other people to read. The executable code is included as
>> a 'reduction to practice' but the emphasis is on describing
>> the ideas. Rich has some powerful ideas that he has reduced
>> to running code. What we need to do is start with a description
>> of the ideas and bridge the gap to the actual implementation.
>>
>> Ideally you can read a literate program like a novel, from
>> beginning to end, and find that every line of code has a
>> 'motivation' for being introduced. The side-effect is that
>> there is a reason why the idea is implemented in a particular
>> way rather than 'just because it worked'. Literate programming
>> tends to improve code quality because you have to explain it.
>>
>> Emacs org-mode, on the other hand, is a useful development
>> technology but it really isn't literate programming.
>>
> I would be interested to hear your thoughts as to why Org-mode is not a
> literate programming tool.
I never said org-mode wasn't a 'literate programming tool'. It is clearly an
outstanding version of a literate programming tool. What I said was that
org-mode "really isn't literate programming".
I am trying to distinguish between tool and task.
Literate programming, as a tool, can be done with notepad.
Literate programming, as a task, is a radical change of mindset.
It is at least as difficult as going from Object Oriented programming
to Functional programming.
The point of the clojure.pamphlet file isn't to highlight how it
was created (emacs, fundamental mode). The point is to begin to
think about documentation as an "ideas to implementation", speaking
from "human to human", way of looking at the problem.
I made the machinery as simple as possible so people could experiment
with a new way of creating software. It is hardly new, and it isn't
my idea (see Knuth). I just have come to understand that it is a very
efficient and effective way to develop software that can "live".
Clojure breaks with the past in many ways. I am advocating breaking
with the past in terms of the 'little files' idea, 'javadoc', and
other ways of documenting. And, since Advocacy is Volunteering, I
pretty much put myself into a position where I had to demonstrate
what I was advocating. Thus, the Clojure in Small Pieces book.
In literate programming org-mode, will Clojure code be properly
highlighted and indented?
Is there a keystroke (like Ctrl-c,Ctrl-k) that will evaluate the
Clojure code in the entire file and send it to the swank REPL?
Will stacktraces point at the correct line number?
It seems that your real question is whether Clojure knows about
a literate document. It does not. But it would be possible to
modify the reader behavior when given a pamphlet file. The REPL
uses a line numbering reader. Anything between the last
\end{chunk} and the next \begin{chunk} could be considered
as comments to be ignored but the line numbers for the function
would be correct and therefore the stack traces would be correct.
I suppose it would be reasonably easy (there is no such thing
as a simple job) to write a literate reader for the REPL. All
it would need to know is where to turn-on and turn-off the normal
read semantics.
When I get to documenting the REPL I will look at how reading
is done and think about writing a LiterateReader class.
Tim
If we had custom reader macros in Clojure we wouldn't even be having
this discussion; you would probably have already implemented it by
now. :)
LispReader is a class that appears to have a read function that does
Clojure s-expression parsing. Wrapping that around a LiterateReader
stream would seem to do the job. The LiterateReader stream only has
to change any non-chunk line into a Clojure comment by prepending
a semicolon. The code to scan a line that begins with a \begin{chunk}
or \end{chunk} does not seem all that challenging.
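The comment-prepending idea above can be sketched in a few lines of Java.
This is only an illustration under stated assumptions, not code from
Clojure: the class name LiterateFilter and the method commentOutProse are
hypothetical, chunk delimiters are assumed to start in column 0, and the
delimiter lines themselves are also commented out so the s-expression
reader never sees them. Every input line maps to exactly one output line,
so line numbers seen by a downstream reader would still match the pamphlet.

```java
import java.util.Scanner;

// Hypothetical sketch (not part of Clojure): turn every line outside a
// chunk body into a ';' comment while keeping one output line per input line.
class LiterateFilter {
    static String commentOutProse(String pamphlet) {
        StringBuilder out = new StringBuilder();
        Scanner in = new Scanner(pamphlet);
        boolean inChunk = false;
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.startsWith("\\begin{chunk}")) {
                inChunk = true;
                out.append(';').append(line);   // delimiter is not code
            } else if (line.startsWith("\\end{chunk}")) {
                inChunk = false;
                out.append(';').append(line);   // delimiter is not code
            } else if (inChunk) {
                out.append(line);               // code passes through untouched
            } else {
                out.append(';').append(line);   // prose becomes a comment
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String doc = "Some prose.\n"
                   + "\\begin{chunk}{demo}\n"
                   + "(defn f [x] x)\n"
                   + "\\end{chunk}\n"
                   + "More prose.\n";
        System.out.print(commentOutProse(doc));
    }
}
```

The filtered text could then be handed to whatever reader is already in
use; only the (defn f [x] x) line would remain live, still on line 3.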
In fact, a slightly smarter LiterateReader stream could be given the
particular line as an argument and only call LispReader on that
s-expression. So a LiterateReader with an optional line number
argument would allow an editor to specify where to start .read.
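One possible way to honor a starting line without disturbing line numbers
is to neutralize every earlier line rather than skip it. The sketch below
is an assumption about how that could look, not an existing API: the class
LiterateStart and the method commentBefore are hypothetical names, and line
numbers are taken to be 1-based.

```java
import java.util.Scanner;

// Hypothetical helper: replace every line before startLine (1-based) with a
// bare ';' comment, so the first form a reader finds begins at startLine
// and the total line count is unchanged.
class LiterateStart {
    static String commentBefore(String text, int startLine) {
        StringBuilder out = new StringBuilder();
        Scanner in = new Scanner(text);
        int lineNo = 0;
        while (in.hasNextLine()) {
            String line = in.nextLine();
            lineNo++;
            out.append(lineNo < startLine ? ";" : line).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(commentBefore("(def a 1)\n(def b 2)\n(def c 3)\n", 2));
    }
}
```

An editor could pass the line under the cursor as startLine and then ask
the reader for a single form.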
A reader macro would require special syntax. This may be reasonable
but it seems that a simple (load-literate "filename" N) would be
all that is needed, requiring no special syntax.
Alternatively the reader could use the file extension so that
pamphlet files would invoke the LiterateReader stream automatically.
Or if the parse of the first line begins with \ then use LiterateReader.
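Both detection rules are easy to state in code. A minimal sketch, assuming
the extension is ".pamphlet" (as in clojure.pamphlet) and that the caller
has already read the first line; LiterateDetect and isLiterate are
hypothetical names. Note that the backslash test is only a heuristic,
since an ordinary Clojure file could in principle start with a character
literal such as \a.

```java
// Hypothetical sketch of the two detection rules: file extension, or a
// first line that begins with a backslash (as a Latex file normally does).
class LiterateDetect {
    static boolean isLiterate(String filename, String firstLine) {
        if (filename != null && filename.endsWith(".pamphlet")) {
            return true;                        // rule 1: file extension
        }
        // rule 2: the first line starts with '\'
        return firstLine != null && firstLine.startsWith("\\");
    }

    public static void main(String[] args) {
        System.out.println(isLiterate("clojure.pamphlet", ""));
        System.out.println(isLiterate("core.clj", "(ns core)"));
        System.out.println(isLiterate("notes.tex", "\\documentclass{book}"));
    }
}
```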
I'm documenting Bit-partitioned hash tries at the moment. I'll
see if I can document the reader next and get an idea of exactly
how it works and what it would take to change it.
Tim Daly
> On Wed, Jan 5, 2011 at 4:44 PM, Eric Schulte <schult...@gmail.com> wrote:
>> For the most up-to-date and comprehensive documentation of using
>> Org-mode to work with code blocks (e.g. Literate Programming or
>> Reproducible Research) the online manual is also very useful.
>
> In literate programming org-mode, will Clojure code be properly
> highlighted and indented?
yes, Clojure code is displayed using the Emacs major mode for Clojure
code, so the appearance is as you would expect. See this screenshot
from the file I am working on at the moment, the upper frame is a .clj
file and the lower is a .org file. http://i.imgur.com/kdbDp.png
>
> Is there a keystroke (like Ctrl-c,Ctrl-k) that will evaluate the
> Clojure code in the entire file and send it to the swank REPL?
Yes, C-c C-c evaluates the code block under the point, and many other
keystrokes bind to various other functions [1], specifically C-c C-v b
executes the entire buffer.
> Will stacktraces point at the correct line number?
No, this is one of the reasons that I currently tend to do large-scale
development in .clj files and reserve embedded code for shorter chunks
of code. That said I have successfully completed large clojure projects
in which the entirety of the code was tangled from a single literate
.org file.
Cheers -- Eric
Footnotes:
[1] M-x org-babel-describe-bindings
Major Mode Bindings Starting With C-c C-v:
key          binding
---          -------
C-c C-v a    org-babel-sha1-hash
C-c C-v b    org-babel-execute-buffer
C-c C-v d    org-babel-demarcate-block
C-c C-v e    org-babel-execute-maybe
C-c C-v f    org-babel-tangle-file
C-c C-v g    org-babel-goto-named-src-block
C-c C-v h    org-babel-describe-bindings
C-c C-v i    org-babel-lob-ingest
C-c C-v l    org-babel-load-in-session
C-c C-v n    org-babel-next-src-block
C-c C-v o    org-babel-open-src-block-result
C-c C-v p    org-babel-previous-src-block
C-c C-v r    org-babel-goto-named-result
C-c C-v s    org-babel-execute-subtree
C-c C-v t    org-babel-tangle
C-c C-v u    org-babel-goto-src-block-head
C-c C-v v    org-babel-expand-src-block
C-c C-v x    org-babel-do-key-sequence-in-edit-buffer
C-c C-v z    org-babel-switch-to-session-with-code
Ok, then it sounds like we're in agreement. I just was confused by the
use of "org-mode" as a verb to describe one particular task when it
supports many disparate tasks, including LP as you have defined it.
Thanks for the explanation.
Cheers -- Eric
On 1/5/2011 10:58 PM, Seth wrote:
> Now that I think of it, it is mostly a fear of having decreased
> productivity in writing code that affected my statement that I liked
> the little files. I'm used to, I suppose, developing code for a
> specific function in a file, being able to compile, goto line numbers
> where there are errors,
Try inserting a syntax error anywhere in the code. Then type
'make' to the shell. You'll get a traceback that shows you the exact
line in the file that failed since the literate document is really
feeding Java files to the compiler. This is forced by Java since
there is a (bogus) connection between filename and contents.
In any case, you still get the same traceback you always got.
> send code to slime, etc. Looking over your example made things much
> clearer. It's like you're guiding your reader to specific parts of the
> 'little files', describing the theory behind them, moving on, etc. And
> each code fragment has a chunk name associated with it, and all of
> them are combined into the final .clj file using the code fragment
> names (in a separate chunk).
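The combining step described here is what "tangle" does with the
\begin{chunk}{name} ... \end{chunk} and \getchunk{name} markers defined at
the top of this thread. A minimal sketch of that step, for illustration
only: the class and method names are hypothetical, delimiters are assumed
to sit on their own lines, and there is no guard against chunks that
reference each other in a cycle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical tangle sketch: gather named chunks, then expand one of them.
class Tangle {
    static final String BEGIN = "\\begin{chunk}{";
    static final String GET = "\\getchunk{";

    // Collect chunk bodies; a repeated name appends to the earlier body.
    static Map<String, List<String>> collectChunks(String doc) {
        Map<String, List<String>> chunks = new LinkedHashMap<>();
        List<String> current = null;
        Scanner in = new Scanner(doc);
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.startsWith(BEGIN) && line.endsWith("}")) {
                String name = line.substring(BEGIN.length(), line.length() - 1);
                current = chunks.computeIfAbsent(name, k -> new ArrayList<>());
            } else if (line.startsWith("\\end{chunk}")) {
                current = null;                 // back to prose
            } else if (current != null) {
                current.add(line);              // a line of chunk body
            }
        }
        return chunks;
    }

    // Write a chunk to out, replacing each \getchunk{name} line with the
    // expansion of that chunk. Unknown names expand to nothing.
    static void expand(String name, Map<String, List<String>> chunks, StringBuilder out) {
        for (String line : chunks.getOrDefault(name, List.of())) {
            String t = line.trim();
            if (t.startsWith(GET) && t.endsWith("}")) {
                expand(t.substring(GET.length(), t.length() - 1), chunks, out);
            } else {
                out.append(line).append('\n');
            }
        }
    }

    static String tangle(String doc, String root) {
        StringBuilder out = new StringBuilder();
        expand(root, collectChunks(doc), out);
        return out.toString();
    }

    public static void main(String[] args) {
        String doc = "prose\n"
                   + "\\begin{chunk}{first chunk}\nTHIS IS THE FIRST CHUNK\n\\end{chunk}\n"
                   + "\\begin{chunk}{second chunk}\nTHIS IS THE SECOND CHUNK\n\\end{chunk}\n"
                   + "\\begin{chunk}{first chunk}\n\\getchunk{second chunk}\n"
                   + "THIS IS MORE IN THE FIRST CHUNK\n\\end{chunk}\n";
        System.out.print(tangle(doc, "first chunk"));
    }
}
```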
The REPL makes it much easier to develop lisp code in a literate
style because you can kill/yank an s-expression into a shell buffer
(or use slime).
My usual method of working is to build and test a function in the
REPL. Once it works I have the source already in the file so I can
just save it, build the whole system, run the tests, and make sure
I didn't break anything (it takes less than a minute).
> At first, I thought this would be less productive than simply putting
> all of the code in one clj file, but now that I think about it I think
> it would, with the appropriate tools. And it wouldn't even be too
> difficult, with org-mode (prefer it over latex any day!)
You see all of the source as one file. The compiler sees all of the
source in little files. The beauty of literate programming is that you
no longer have to pay attention to the compiler's failings.
> I'm going to start transferring a subsection of my program to literate
> programming, using org-mode. See how it goes...
>
> Oh, and Tim, you might want to take a closer look at org-mode. Instead
> of having to tangle out the code that builds everything, you could
> create an executable shell script block in org-mode - the makefile
> script could be tangled into a string using noweb syntax, and then
> everything could go from there. You can execute the block by hitting
> C-c C-c in the org file (or something like that). Pretty cool, in my
> opinion!
I used noweb (ref: Norman Ramsey) for years in Axiom. It is a useful
tool. However I finally understood that I can get rid of the 'weave'
function with a couple latex macros I wrote (see a prior post) and
I could get rid of the 'tangle' function by modifying the reader.
I patched lisp and built tangle directly into the image.
Thus, with some simple changes I no longer need any special tools.
That makes life simple and I like simple.
Org-mode sounds great and from what I've seen from the docs it does
everything but cook rice. I would highly recommend it as a tool if you
like that sort of thing. It would integrate well in a slime environment
if you like that sort of thing.
ANYTHING that helps you write literate programs is a win in my book.
I'm afraid that I have two personal failings that make org-mode unlikely.
First, I don't like modes (my emacs mode table is smashed to be
fundamental mode for everything). Editors should not change my source code.
Thus, org-mode is "right-out", to quote Monty Python.
Second, I'm addicted to Latex. Latex is an amazing program, simply
stunning. I cannot imagine trying to write Clojure in Small Pieces
without it. It is just a markup language like HTML and thus not hard
to learn, but it is also a Turing-complete language that has a huge
ecosystem of tools and techniques. I am creating the graphics for
CISP at the moment in Latex. I could do it using some other tool such
as gimp but life would not be as simple and the clojure.pamphlet file
would now need image files (more 'little files' cruft).
But whatever works for you is great. Please post an example of
a literate document using org-mode. We can then compare and contrast,
as my English teacher used to say. It would be interesting to see another
example of a literate document for Clojure. Slime and org-mode may
be the proper way to go.
Tim Daly
On 1/5/2011 11:18 PM, Eric Schulte wrote:
> Mark Engelberg<mark.en...@gmail.com> writes:
>
>> On Wed, Jan 5, 2011 at 4:44 PM, Eric Schulte<schult...@gmail.com> wrote:
>>> For the most up-to-date and comprehensive documentation of using
>>> Org-mode to work with code blocks (e.g. Literate Programming or
>>> Reproducible Research) the online manual is also very useful.
>> In literate programming org-mode, will Clojure code be properly
>> highlighted and indented?
> yes, Clojure code is displayed using the Emacs major mode for Clojure
> code, so the appearance is as you would expect. See this screenshot
> from the file I am working on at the moment, the upper frame is a .clj
> file and the lower is a .org file. http://i.imgur.com/kdbDp.png
>
>> Is there a keystroke (like Ctrl-c,Ctrl-k) that will evaluate the
>> Clojure code in the entire file and send it to the swank REPL?
> Yes, C-c C-c evaluates the code block under the point, and many other
> keystrokes bind to various other functions [1], specifically C-c C-v b
> executes the entire buffer.
>
>> Will stacktraces point at the correct line number?
> No, this is one of the reasons that I currently tend to do large-scale
> development in .clj files and reserve embedded code for shorter chunks
> of code. That said I have successfully completed large clojure projects
> in which the entirety of the code was tangled from a single literate
> .org file.
Can you post examples of these? I'd love to see some other examples.
So my comment was unclear. I so badly need an Editor-in-Chief to
scan my emails for clarity, punctuation, and other things literate
people ought to have mastered by now. Sorry about that.
Sure thing, check out this old version of a file which tangles out into
the directory layout expected by lein.
http://gitweb.adaptive.cs.unm.edu/?p=asm.git;a=blob;f=asm.org;h=f043a8c8b0a917f58b62bdeac4c0dca441b8e2cb;hb=HEAD
Also, this project has an org-mode front page with code examples, the
html woven from this front page is shown at
http://repo.or.cz/w/neural-net.git
and the raw org file is available here
http://repo.or.cz/w/neural-net.git/blob/HEAD:/neural-net.org
I'll have to check out clojure.pamphlet, it sounds like an elegant
alternative. It's always interesting to see other solutions in this
space. For example I think scribble is a nice tool from the scheme
world. http://lambda-the-ultimate.org/node/4017
Best -- Eric