ANN: pandoc 2.12

John MacFarlane

Mar 8, 2021, 4:01:07 PM
to pandoc-...@googlegroups.com, pandoc-...@googlegroups.com

I'm pleased to announce the release of pandoc 2.12,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/2.12

Source & API documentation:
http://hackage.haskell.org/package/pandoc-2.12

In addition to the usual bug fixes and minor improvements, there
are a number of notable user-facing changes in this release.

The `--resource-path` option can now be specified multiple times,
and the resource path components will accumulate. Resource paths
specified later on the command line are prepended to those specified
earlier. `resource-path` in defaults files behaves the same way: it
will be prepended to the resource path set by earlier command line options
or defaults files. This change facilitates the use of multiple defaults
files: each can specify a directory containing resources it refers to
without clobbering the resource paths set by the others.
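
The accumulation rule can be sketched in a few lines of Python. This is only an illustration of the ordering described above, not pandoc's implementation: the function name `accumulate_resource_paths` and the `:` separator are assumptions, and the handling of the default working-directory entry is not modeled.

```python
def accumulate_resource_paths(specs, separator=":"):
    """Combine resource-path values in the order they were specified.

    Each entry of `specs` is one search path as given on the command line
    or in a defaults file. Later entries are prepended to earlier ones.
    """
    combined = []
    for spec in specs:
        # Components of a later specification go in front of everything
        # accumulated so far, so they are searched first.
        combined = spec.split(separator) + combined
    return combined


# Two specifications, the second one given later:
print(accumulate_resource_paths(["foo:bar", "baz:bim"]))
```

With this ordering, a defaults file processed later can add its own resource directory without discarding the directories added before it.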

In defaults files, there is now a way to refer to the home directory, the
user data directory, and the directory containing the defaults file
itself. In fields that expect file paths (and only in these fields),

+ `${VARIABLE}` will expand to the value of the environment variable
  `VARIABLE` (and in particular `${HOME}` will expand to the path
  of the home directory). A warning will be raised for undefined
  variables.
+ `${USERDATA}` will expand to the path of the user data
  directory in force when the defaults file is being processed.
+ `${.}` will expand to the directory containing the defaults file.
  (This allows defaults files to be placed in a directory containing
  resources they make use of.)
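
The three expansion rules can be modeled roughly as follows. This is a minimal sketch, not pandoc's code: the function `expand_path` and its parameters are hypothetical, the order in which the special names are checked is an assumption, and replacing an undefined variable with an empty string is an assumption as well (the announcement only says that a warning is raised).

```python
import os
import re
import warnings

_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand_path(value, defaults_file, user_data_dir, env=None):
    """Expand ${...} placeholders in a path-valued defaults-file field."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name == ".":
            # Directory containing the defaults file itself.
            return os.path.dirname(defaults_file)
        if name == "USERDATA":
            return user_data_dir
        if name in env:
            return env[name]
        # Assumption: warn and substitute nothing for undefined variables.
        warnings.warn(f"undefined variable {name}")
        return ""

    return _PLACEHOLDER.sub(substitute, value)


print(expand_path("${.}/templates/letter.tex",
                  defaults_file="/proj/defaults.yaml",
                  user_data_dir="/home/me/.local/share/pandoc",
                  env={"HOME": "/home/me"}))
```

Because the expansion applies only to fields that expect file paths, a `${...}` sequence in any other field would be left alone; the sketch leaves that decision to the caller.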

When downloading content from URL arguments, pandoc is now sensitive to
the character encoding (#5600). We can properly handle UTF-8 and latin1
(ISO-8859-1); for others we raise an error. We fall back to latin1 if
no charset is given in the MIME type and UTF-8 decoding fails.
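
The decoding policy described here can be sketched as below. This is an illustration under stated assumptions, not pandoc's implementation: the helpers `get_charset` and `decode_body` and the exception `UnsupportedCharsetError` are hypothetical Python names, and the case where UTF-8 is declared but the bytes are invalid is not modeled (the sketch simply lets Python's own decoding error propagate).

```python
class UnsupportedCharsetError(ValueError):
    """Raised for a declared charset other than UTF-8 or latin1."""


def get_charset(mime_type):
    """Return the lowercased charset parameter of a MIME type, or None."""
    if not mime_type:
        return None
    for param in mime_type.split(";")[1:]:
        key, _, value = param.partition("=")
        if key.strip().lower() == "charset":
            return value.strip().strip('"').lower()
    return None


def decode_body(raw, mime_type=None):
    """Decode downloaded bytes according to the declared charset."""
    charset = get_charset(mime_type)
    if charset is None:
        # No charset given: try UTF-8 first, then fall back to latin1.
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return raw.decode("latin-1")
    if charset in ("utf-8", "utf8"):
        return raw.decode("utf-8")
    if charset in ("iso-8859-1", "latin1", "latin-1"):
        return raw.decode("latin-1")
    raise UnsupportedCharsetError(charset)
```

For example, `decode_body("café".encode("latin-1"))` takes the fallback branch, because the byte for `é` in latin1 is not valid UTF-8 on its own.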

With `--abbreviations`, a period is no longer required at the end
of an abbreviation (so, for example, `M` can be put in an abbreviation
file for French).

The `task_lists` extension now works for org input.

We have replaced xml-light with xml-conduit for all the XML
format readers. (This required some patches upstream.)
The change was required by some XML that xml-light could not
handle, but a tangible side benefit is a very significant speed
bump. xml-conduit is now also used in skylighting, which speeds
up parsing of syntax definitions by about 4X.

In addition to the XML-based readers, we have improved performance
in the HTML, LaTeX, and Markdown readers. Here are some benchmarks
comparing the previous version to this one:

Conversion    pandoc 2.11.4          pandoc 2.12
------------  ---------------------  --------------------
-f docbook    18 ms (35 MB alloc)    6 ms (20 MB alloc)
-f docx       72 ms (105 MB)         42 ms (98 MB)
-f epub       64 ms (161 MB)         35 ms (80 MB)
-f fb2        14 ms (26 MB)          4 ms (13 MB)
-f html       34 ms (95 MB)          24 ms (53 MB)
-f latex      29 ms (147 MB)         16 ms (54 MB)
-f odt        78 ms (110 MB)         27 ms (66 MB)
-f org        54 ms (183 MB)         44 ms (137 MB)

So many more milliseconds to enjoy life!
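
For readers who prefer ratios, the speedups implied by the table can be computed directly from the timings. The snippet below is just arithmetic on the figures quoted above; the names `timings_ms` and `speedup` are chosen for illustration.

```python
# (pandoc 2.11.4 ms, pandoc 2.12 ms), copied from the benchmark table.
timings_ms = {
    "docbook": (18, 6),
    "docx": (72, 42),
    "epub": (64, 35),
    "fb2": (14, 4),
    "html": (34, 24),
    "latex": (29, 16),
    "odt": (78, 27),
    "org": (54, 44),
}


def speedup(old_ms, new_ms):
    """Return how many times faster the new timing is than the old one."""
    return old_ms / new_ms


for reader, (old_ms, new_ms) in timings_ms.items():
    print(f"-f {reader:<8} {speedup(old_ms, new_ms):.1f}x")
```

By this measure the DocBook and FB2 readers gain the most (3.0x and 3.5x), while the org reader gains the least.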

We now provide arm64 binaries for linux. In addition, we have
started using an option in creating linux binaries that makes
them 30% smaller.

This release has a major version bump because of a number
of (mostly minor) API changes:

* Text.Pandoc.Readers.HTML:
  + Remove exported class `NamedTag(..)`.
  + The functions `isInlineTag` and `isBlockTag` are no longer polymorphic;
    they apply to a `Tag Text`.
* Text.Pandoc.Readers.LaTeX:
  + `tokenize` and `untokenize` are no longer exported.
* Text.Pandoc.Readers.LaTeX.Types is no longer exported.
* Text.Pandoc.Shared:
  + Remove formerly exported functions that are no longer used in the
    code base: `splitByIndices`, `splitStringByIndicies`, `substitute`,
    and `underlineSpan`.
  + Export `handleTaskListItem`.
  + Change `defaultUserDataDirs` to `defaultUserDataDir`, which returns
    a single directory.
* Text.Pandoc.MIME:
  + Export `getCharset`.
* Text.Pandoc.UTF8:
  + Change IO functions to return Text, not String. This affects
    `readFile`, `getContents`, `writeFileWith`, `writeFile`, `putStrWith`,
    `putStr`, `putStrLnWith`, `putStrLn`, `hPutStrWith`, `hPutStr`,
    `hPutStrLnWith`, `hPutStrLn`, `hGetContents`.
* Text.Pandoc.App:
  + Export `parseOptionsFromArgs`.
  + Add fields for CSL options to `Opt`:
    `optCSL`, `optBibliography`, `optCitationAbbreviations`.
* Text.Pandoc.Class:
  + Export `getTimestamp`.
* Text.Pandoc.Error:
  + Add `PandocUnsupportedCharsetError` constructor for `PandocError`.
  + Export `renderError`.

Thanks to all who contributed, especially Albert Krewinkel and
new contributors Loïc Grobol, Lorenzo, Nick Berendsen,
and Nixon Enraght-Moony. Olivier Benz deserves credit for helping
make the arm64 binaries possible by providing a Docker image
with recent ghc compiled for arm64.