John MacFarlane
Aug 10, 2016, 5:13:45 PM
to pandoc-...@googlegroups.com

Pandoc is ten today! The first release, pandoc version 0.1,
was made public on August 10, 2006.

Metric                   0.1       current
----------------------   -------   --------
lines of Haskell code*   3420      34076
input formats            4         19
output formats           6         34
contributors             1         > 150
size of manual           17k       140k

* including pandoc-types, but not testing code or
subsidiary libraries like texmath, highlighting-kate, or
pandoc-citeproc.

Pandoc started out as a small practice project to mess
around with Haskell. After a while it became capable enough
for me to use in my own work. I released it under an
open-source license because I used other people's
open-source software, and because there was no reason not
to. But I didn't expect it would ever be used much by
anyone but me.

When I started working on pandoc, the Haskell ecosystem was
small. There was no central database of packages, and no
cabal or stack tool for managing dependencies. I had to
write my own libraries for dealing with zip archives
(zip-archive) and highlighting syntax (highlighting-kate).
Things have come a long way since then.

Because I was just a beginner in Haskell when I began the
project, and key libraries like text didn't yet exist, there
are a number of things about the project's design that are
less than ideal. If I were starting over, I'd use Text
everywhere instead of String. I'd also use a lot more
newtypes, and I'd use free monads or type classes so that
all of the readers and writers could be used outside of IO.
I'd use a data structure that allowed attributes to be
attached uniformly to all elements. I think we'll
ultimately need to make these changes to move forward, but
they're not simple changes to make in a 34k-line code base.
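
To make the type-class idea concrete, here is a minimal sketch of how a reader could be written against a small class of effects and then run either in IO or purely. All names here (SourcePath, MonadFetch, Pure, expandIncludes) are made up for illustration and are not pandoc's actual API; the "reader" is a toy that only expands include lines. It also uses Text rather than String and a newtype for paths, in the spirit of the changes listed above.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map as M
import           Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical names throughout; this is not pandoc's real interface.

-- A newtype keeps file paths from being confused with document text.
newtype SourcePath = SourcePath FilePath
  deriving (Eq, Ord, Show)

-- The only effect this toy reader needs: fetching an included file.
class Monad m => MonadFetch m where
  fetch :: SourcePath -> m Text

-- Normal use: read the file from disk.
instance MonadFetch IO where
  fetch (SourcePath p) = TIO.readFile p

-- Sandboxed or test use: look the file up in an in-memory map.
newtype Pure a = Pure { runPure :: M.Map SourcePath Text -> a }

instance Functor Pure where
  fmap f (Pure g) = Pure (f . g)

instance Applicative Pure where
  pure = Pure . const
  Pure f <*> Pure g = Pure (\env -> f env (g env))

instance Monad Pure where
  Pure g >>= k = Pure (\env -> runPure (k (g env)) env)

-- Missing files are treated as empty (an assumption made for brevity).
instance MonadFetch Pure where
  fetch p = Pure (M.findWithDefault T.empty p)

-- Replace each line of the form "!include NAME" with that file's contents.
-- The function never mentions IO, so it works in any MonadFetch instance.
expandIncludes :: MonadFetch m => Text -> m Text
expandIncludes src = T.unlines <$> mapM expand (T.lines src)
  where
    expand line = case T.stripPrefix "!include " line of
      Just name -> T.stripEnd <$> fetch (SourcePath (T.unpack (T.strip name)))
      Nothing   -> pure line
```

Calling runPure (expandIncludes input) files with an in-memory map runs the same code with no file system access, which is the property that matters for using readers and writers outside of IO.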

Fortunately, more and more people are becoming familiar with
the pandoc code. Over time the project has attracted many
excellent contributors. To them we owe the ZimWiki writer
(Alex Ivkin), the TEI writer (Chris Forster), the InDesign
ICML writer (Mauro Bieg), the FB2 writer (Sergey Astanin),
the Texinfo writer (Peter Wang), the org-mode writer
(Puneeth Chaganti and later Albert Krewinkel), the DokuWiki
writer (Clare Macrae), the Textile reader (Paul Rivier), the
Haddock reader (David Lazar), the org-mode reader (Albert
Krewinkel), the ODT reader (Martin Linnemann), the Twiki
reader (Alexander Sulfrian), the EPUB reader (Matthew
Pickering), and the docx reader (Jesse Rosenthal). Andrea
Rossato was primarily responsible for pandoc's excellent
citation support, through his citeproc-hs project (which
formed the basis of pandoc-citeproc). Matthew Pickering
dramatically improved math support and made many other
contributions to pandoc's architecture. Mauro Bieg added
attributes to links and images, and (together with Andrew
Dunning) helped improve multi-language support. Many others
have contributed important bug fixes and suggestions. Some
of these contributors had never touched Haskell before they
worked on pandoc.

One might wonder whether it matters that pandoc is written
in Haskell. Certainly it would have been possible to write a
program that does what pandoc does in any language. But the
security provided by Haskell's type system has kept me sane
when I have needed to make major changes or do large-scale
refactoring. I can make one very central change (say,
adding attributes to headers in the basic Pandoc document
model) and the compiler will direct me to everything else I
need to change. When the program compiles again, I can feel
reasonably confident that I've modified all of the code that
was affected. Purity also helps preserve sanity. When a
writer defines a pure function from a Pandoc document to a
String, I can be confident that changes to that function
won't have side effects in other parts of the program. I
doubt that, using a language like C or JavaScript, I would
have been able to manage an evolving code base this size in
my small scraps of free time without it turning into
unmaintainable spaghetti.
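
A small sketch of both points, using a toy document model rather than pandoc's real types (Attr, Block, Doc, and writeHtml below are simplified stand-ins chosen for illustration). Suppose Header originally had two fields and an Attr field was added later: every function that still pattern-matches on the two-field form stops compiling, so the compiler lists the places that need attention. The writer itself is a pure function from a document to a String.

```haskell
-- A toy document model; illustrative only, not pandoc's definitions.

-- Attributes: an identifier and a list of classes.
data Attr = Attr { attrId :: String, attrClasses :: [String] }
  deriving (Eq, Show)

data Block
  = Para String
  | Header Int Attr String  -- the Attr field is the "central change";
                            -- code matching on `Header n s` no longer compiles
  deriving (Eq, Show)

newtype Doc = Doc [Block]
  deriving (Eq, Show)

-- A pure writer: the same Doc always yields the same String, and nothing
-- else in the program can be affected by calling it.
-- (No HTML escaping is done here, to keep the sketch short.)
writeHtml :: Doc -> String
writeHtml (Doc blocks) = concatMap block blocks
  where
    block (Para s) = "<p>" ++ s ++ "</p>\n"
    block (Header n attr s) =
      "<h" ++ show n ++ idAttr attr ++ ">" ++ s ++ "</h" ++ show n ++ ">\n"

    -- Emit an id attribute only when the identifier is non-empty.
    idAttr attr
      | null (attrId attr) = ""
      | otherwise          = " id=\"" ++ attrId attr ++ "\""
```

Compiling with warnings for incomplete patterns enabled (for example -Wall in GHC) extends the same help to newly added constructors, since any function that forgets to handle one is flagged.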

It has been a good ten years. Happy birthday, pandoc, and
thanks to everyone who has helped to improve it.