ANN: pandoc 3.0

John MacFarlane

Jan 18, 2023, 4:28:00 PM
to pandoc-...@googlegroups.com, pandoc-...@googlegroups.com

I'm pleased to announce the release of pandoc 3.0,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.0

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.0

This is a monster release, with many changes (the changelog
is over 1000 lines long).

Significant user-facing changes
-------------------------------

* The main `pandoc` package now contains just the library;
the executable has been moved to `pandoc-cli`. (So, if you
used to `cabal install pandoc` you will now need to
`cabal install pandoc-cli` to get the command-line tool.)

* pandoc now behaves like a Lua interpreter when called as
`pandoc-lua` or when `pandoc lua` is used. The Lua API that
is available in filters is automatically available.
(See the `pandoc-lua` man page.)

* pandoc behaves like a server when called as `pandoc-server`
or when `pandoc server` is used. (See the `pandoc-server`
man page.)

* `pandoc-server` now returns a JSON object if JSON is accepted
(instead of just a JSON-encoded string, as previously). Properties
are `output` (string), `base64` (boolean), and `messages`
(array of string).

* New output format: `chunkedhtml`. This creates a zip file
containing multiple HTML files, one for each section,
linked with "next," "previous," "up," and "top" links.
(If `-o` is used with an argument without an extension,
it is treated as a directory and the zip file is automatically
extracted there, unless it already exists.) The top page will
contain a table of contents if `--toc` is used. A
`sitemap.json` file is also included. The option
`--split-level` determines the level at which sections are
to be split.

* A new command-line option, `--list-tables`, causes tables to be
formatted as list tables in RST.

* New command-line option: `--epub-title-page=true|false` allows
the EPUB title page to be omitted.

* `--reference-doc` can now accept a URL argument (#8535) and
load a remote reference doc.

* A new option `--split-level` replaces `--epub-chapter-level`
and affects both EPUB and chunked HTML output.
`--epub-chapter-level` will still work but is deprecated.

* These deprecated options have been removed: `--atx-headers`,
`--strip-empty-paragraphs`.

* Add new `mark` extension for highlighted text in Markdown,
using `==` delimiters.

* Add new extensions `wikilinks_title_after_pipe` and
`wikilinks_title_before_pipe` for `commonmark` and `markdown`.
The former enables links of style `[[Name of page|Title]]` and
the latter `[[Title|Name of page]]`.

* In a Markdown fenced code block, a "bare" language can now be
combined with attributes, e.g.

    ```haskell {.class #id}

* In grid tables, a table foot can be specified by enclosing it
with part separator lines (`+` and `=`):

    +------+-------+
    | Item | Price |
    +======+=======+
    | Eggs |    5£ |
    +------+-------+
    | Spam |    3£ |
    +======+=======+
    | Sum  |    8£ |
    +======+=======+

* Improved support for complex figures in many formats. There is
now a dedicated block-level constructor for Figure, and
figures can contain short captions and arbitrary block-level
content. In Markdown the `implicit_figures` extension
currently remains the only way to represent figures, but the
groundwork is there for a more flexible figure syntax.

* A new `tagging` extension is now supported for the `context`
writer, for producing tagged PDFs.

* Pandoc no longer looks in `readers` and `writers`
subdirectories of the user data directory to find custom
readers and writers. Scripts in those directories must be
moved to the `custom` folder.

* The `man` writer now uses UTF-8 by default; if you are
generating man pages for a system that does not support UTF-8,
use `--ascii`.

* The default HTML template now uses less opinionated CSS, not
specifying a font size, line height, or font family.
A new `maxwidth` variable sets `max-width`; if not set, 36em is
used as a default.

* We have reduced the use of inline CSS used for EPUBs. Almost
everything is now in the default EPUB CSS, which can be overridden
either by putting `epub.css` in the user data directory or by using
`--css` on the command line. Inline styles are only used for
syntax highlighting (which depends on the style specified, and
is only included on pages with highlighted code) and for bibliography
formatting (which can depend on the CSL style, and is only used in the
page containing the bibliography).

* The Lua subsystem now provides many new useful functions,
allowing even programmatic control over extensions and
formats, templates, zip files, text encodings, CLI options,
embedded media, and division into sections and chunks.

* Classic custom writers are deprecated. The global variables
`PANDOC_DOCUMENT` and `PANDOC_WRITER_OPTIONS` are no longer set
when the writer script is loaded. Both variables are still set in
classic writers before the conversion is started, so they can be used
when they are wrapped in functions. A new function
`pandoc.write_classic` can be used to convert a classic writer into
a new-style writer: see the changelog for an example of its use.

* It is now possible to have a custom reader and a custom writer for
a format together in the same file.

* It is now possible to create a bytestring writer (e.g. one
that produces a zip file).

* Custom readers and writers can now define the extensions that
they support via the global `writer_extensions`.

* Custom writers can define a default template via a global `Template`
function; the data directory is no longer searched for a default
template. See the changelog for instructions on restoring
the old behavior.
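
As an illustration of the new `pandoc-server` response object described in the list above, here is a minimal client-side sketch in Python. Only the property names `output`, `base64`, and `messages` come from this announcement. The function name `decode_server_response` is hypothetical, and the sketch assumes that `base64: true` means `output` holds base64-encoded binary data; check the `pandoc-server` man page before relying on that.

```python
import base64
import json


def decode_server_response(body: str) -> tuple[bytes, list[str]]:
    """Return (output bytes, messages) from a pandoc-server JSON reply."""
    reply = json.loads(body)
    output = reply["output"]  # a string, per the announcement
    # Assumption: when `base64` is true, `output` is base64-encoded binary
    # data; otherwise it is plain text, encoded here as UTF-8 bytes.
    if reply.get("base64", False):
        data = base64.b64decode(output)
    else:
        data = output.encode("utf-8")
    return data, list(reply.get("messages", []))


# Hand-written sample reply for illustration; not captured from a real server.
sample = '{"output": "aGVsbG8=", "base64": true, "messages": []}'
data, messages = decode_server_response(sample)
print(data, messages)
```

Returning bytes in both cases lets the caller write the result to a file without caring whether the target format was textual or binary.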

API changes
-----------

* pandoc-server, pandoc-cli, and pandoc-lua-engine have been
split off into separate packages (#8309). It is possible to
compile pandoc without Lua or server support, using cabal
flags.

* New module Text.Pandoc.Writers.ChunkedHTML.

* Rename Text.Pandoc.Readers.Odt -> Text.Pandoc.Readers.ODT,
for consistency with Writers.ODT. Rename readOdt -> readODT.

* Rename Text.Pandoc.Writers.Docbook -> Text.Pandoc.Writers.DocBook.
Rename writeDocbook -> writeDocBook, for consistency with
the DocBook reader's naming.

* Text.Pandoc.App:

+ parseOptionsFromArgs and parseOptions now return Either OptInfo Opt.
+ Add OptInfo type.
+ Add handleOptInfo function.
+ convertWithOpts: add argument for a ScriptingEngine.
+ New optEpubTitlePage field on Opt.
+ Remove optEpubChapterLevel, add optSplitLevel.
+ Export IpynbOutput(..).

* New exported module Text.Pandoc.Slides.

* New module Text.Pandoc.Format.

* Text.Pandoc.Sources: Add UpdateSourcePos instances for String and
strict and lazy ByteString.

* Text.Pandoc.Extensions:

+ Fix JSON decoding of Extensions.
+ Add new exported function readExtension.
+ Remove parseFormatSpec. This has been moved to Text.Pandoc.Format and
renamed as parseFlavoredFormat.
+ Add CustomExtension constructor to Extension.
+ Remove Bounded, Enum instances for Extension.
+ Add extensionsToList function.
+ Add showExtension.
+ Add Ext_mark, Ext_tagging, Ext_wikilinks_title_after_pipe,
Ext_wikilinks_title_before_pipe constructors for Extension.

* Text.Pandoc.XML: Export lookupEntity.

* Text.Pandoc.Parsing:

+ Remove gratuitous renaming of Parsec types. We were exporting
Parser, ParserT as synonyms of Parsec, ParsecT.
New (re-)exports: Stream(..), updatePosString, SourceName,
Parsec, ParsecT. Removed exports: Parser, ParserT.
+ Export errorMessages, messageString.
+ Export fromParsecError, which can be used to turn a parsec
ParseError into a regular PandocParseError.
+ Remove unused function nested.
+ Change characterReference and charsInBalanced so they now
return a Text (some named references don't correspond
to a single Char). Use the lookupEntity function from
commonmark-hs instead of the slow one from tagsoup.
+ charsInBalanced now takes a Text parser rather than a Char parser
as argument.

* Text.Pandoc.Shared:

+ Export textToIdentifier.
+ Remove deprecated crFilter.
+ Remove deprecated deLink.
+ Deprecate notElemText.
+ Deprecate makeMeta.
+ Remove pandocVersion (now available in Text.Pandoc.Version
as pandocVersionText).
+ Remove findM.
+ Remove deprecated makeMeta.
+ Remove ordNub. Use nubOrd from Data.Containers.ListUtils instead.
+ Remove mapLeft. This is just a synonym for Bifunctor.first.
+ Remove elemText, notElemText.
+ Drop export of pandocVersion and pandocVersionText,
which are now exported by Text.Pandoc.Version.
+ Remove escapeURI, isURI. These are now exported by Text.Pandoc.URI.
+ Use LineBreak as default block sep in blocksToInlines.
+ defaultUserDataDir is no longer exported (it has been
moved to Text.Pandoc.Data).
+ New function figureDiv, offering a standardized way
to convert a figure into a Div element.

* Text.Pandoc.Writers.Shared: export htmlAddStyle,
htmlAlignmentToString, and htmlAttrs.

* Text.Pandoc.Options:

+ WriterOptions now has a field writerListTables.
+ New writerEpubTitlePage field on WriterOptions.
+ Remove writerEpubChapterLevel, add writerSplitLevel.

* Text.Pandoc.Filter:

+ Export applyFilters, applyJSONFilter.
+ Parameterize applyFilters over scripting engine.

* New exported module Text.Pandoc.Chunks. This
module provides functions to split Pandoc documents into
chunks to be rendered in separate files, e.g. one per section.

* Text.Pandoc.Readers: change argument type of getReader, so it takes a
FlavoredFormat instead of a Text.

* Text.Pandoc.Writers: change argument type of getWriter, so it takes a
FlavoredFormat instead of a Text.

* New exported module Text.Pandoc.Scripting. The module contains the
central data structure for scripting engines (e.g., Lua).

* Text.Pandoc.Error:

+ Add new PandocError constructors PandocNoScriptingEngine,
PandocFormatError, PandocNoTemplateError.
+ Remove PandocParsecError constructor. Henceforth we just use
PandocParseError.

* New module Text.Pandoc.Version, exporting pandocVersionText
and pandocVersion. pandocVersion returns a Version instead
of a Text.

* Text.Pandoc.Class:

+ Remove exports readDataFile, readDefaultDataFile, setTranslations,
and translateTerm. (See Text.Pandoc.Data, Text.Pandoc.Translations
for these functions.)
+ Export checkUserDataDir.

* New exported module Text.Pandoc.Class.IO.

* Text.Pandoc.Data is now an exported module, providing readDataFile
and readDefaultDataFile (both formerly provided by
Text.Pandoc.Class), and also getDataFileNames (formerly
unexported in Text.Pandoc.App.CommandLineOptions)
and defaultUserDataDir (formerly provided by Text.Pandoc.Shared).

* Text.Pandoc.Translations is now an exported module
(along with Text.Pandoc.Translations.Types), providing readTranslations,
getTranslations, setTranslations, translateTerm, lookupTerm,
Term(..), and Translations.

* Text.Pandoc now exports Text.Pandoc.Data and setTranslations
and translateTerm.

* Remove modules Text.Pandoc.Writers.Custom and
Text.Pandoc.Readers.Custom. The functions writeCustom and
readCustom are available from module Text.Pandoc.Lua.

* Text.Pandoc.Server is now in a separate package,
pandoc-server. parseServerOpts has been removed.

* Text.Pandoc.Lua is now in a separate package,
pandoc-lua-engine.

+ Export applyFilter, readCustom, and writeCustom.
+ No longer export the lower-level function runFilterFile.
+ Change type of applyFilter:

    applyFilter :: (PandocMonad m, MonadIO m)
                => Environment -> [String] -> FilePath -> Pandoc -> m Pandoc

where Environment is defined in Text.Pandoc.Filter.Environment.
+ Export new function getEngine.
+ The writeCustom function has changed to return a Writer and
an ExtensionsConfig. This allows ByteString writers to be defined.
+ The readCustom function has changed to return a Reader and an
ExtensionsConfig.

---

For a full accounting of the many changes and bug fixes in this
release, see the changelog.

Thanks to all who contributed, including Akos Marton, Albert
Krewinkel, Alexander Batischev, Amar Al-Zubaidi, Amir Dekel,
Aner Lucero, Artem Pelenitsyn, Bastien Dumont, Francesco Occhipinti,
Ian Max Andolina, Ilona, Jeremie Knuesel, Justin Wood, Link Swanson,
Marcin Serwin, Mathias Walter, Olivier Benz, Pranesh Prakash, Prat,
R. N. West, Ruqi, Siphalor, Sven Wick, Terence Eden, TomBen,
Vladimir Alexiev, William Rusnack, Wout Gevaert, lifeunleaded,
nbehrnd, and vkraven.
