Makefile: in make prelease, add checks that pandoc-cli and pandoc have the same version, that pandoc-cli depends on this exact version of pandoc, that there is an entry for this version in the changelog, and that the version numbers in the generated man pages are correct.
T.P.RoffChar: escape - as \-. The groff_man (7) man page indicates that - characters will be treated as typographic hyphens and are not appropriate for cases where the output should be copy-pasteable as an ASCII hyphen-minus character. (E.g. in command line options.) However, until a recent update groff man did not actually do this; it treated - and \- the same. With the new update (1.23.0) the two are distinguished (see for background), so now it is important that pandoc escape -.
Fix regression on short boolean arguments (#8956). In 3.1.5 boolean arguments were allowed an optional argument (truefalse). This created a regression for uses of fused short arguments, e.g. -somyfile.html, which was equivalent to -s -omyfile.html, but now raised an error because pandoc attempted to parse o as a boolean true or false. This change allows the fused short arguments to be used again. Note that -strue will be interpreted as -s with an argument true, not as -s -t -rue. It is best to use long option names with the optional boolean values, to avoid confusion.
Improve errors for illegal output formats. Previously if you did pandoc -s -t bbb, it would give you an error about the missing bbb template instead of saying that bbb is not a supported output format.
RST writer: fix figure handling (#8930, #8871). This fixes a number of regressions from pandoc 2.x. Properly handle caption, alt attribute in figures. No longer treat a paragraph with a single image in it as a figure (we have a dedicated Figure element now).
Pandoc has a modular design: it consists of a set of readers, which parse text in a given format and produce a native representation of the document (an abstract syntax tree or AST), and a set of writers, which convert this native representation into a target format. Thus, adding an input or output format requires only adding a reader or writer. Users can also run custom pandoc filters to modify the intermediate AST.
Supported input and output formats are listed below under Options (see -f for input formats and -t for output formats). You can also use pandoc --list-input-formats and pandoc --list-output-formats to print lists of supported formats.
By default, pandoc will use LaTeX to create the PDF, which requires that a LaTeX engine be installed (see --pdf-engine below). Alternatively, pandoc can use ConTeXt, roff ms, or HTML as an intermediate format. To do this, specify an output file with a .pdf extension, as before, but add the --pdf-engine option or -t context, -t html, or -t ms to the command line. The tool used to generate the PDF from the intermediate format may be specified using --pdf-engine.
When using LaTeX, the following packages need to be available (they are included with all recent versions of TeX Live): amsfonts, amsmath, lm, unicode-math, iftex, listings (if the --listings option is used), fancyvrb, longtable, booktabs, graphicx (if the document contains images), bookmark, xcolor, soul, geometry (with the geometry variable set), setspace (with linestretch), and babel (with lang). If CJKmainfont is set, xeCJK is needed. The use of xelatex or lualatex as the PDF engine requires fontspec. lualatex uses selnolig. xelatex uses bidi (with the dir variable set). If the mathspec variable is set, xelatex will use mathspec instead of unicode-math. The upquote and microtype packages are used if available, and csquotes will be used for typography if the csquotes variable or metadata field is set to a true value. The natbib, biblatex, bibtex, and biber packages can optionally be used for citation rendering. The following packages will be used to improve output quality if present, but pandoc does not require them to be present: upquote (for straight quotes in verbatim environments), microtype (for better spacing adjustments), parskip (for better inter-paragraph spaces), xurl (for better line breaks in URLs), and footnotehyper or footnote (to allow footnotes in tables).
Write output to FILE instead of stdout. If FILE is -, output will go to stdout, even if a non-textual format (docx, odt, epub2, epub3) is specified. If the output format is chunkedhtml and FILE has no extension, then instead of producing a .zip file pandoc will create a directory FILE and unpack the zip archive there (unless FILE already exists, in which case an error will be raised).
Shift heading levels by a positive or negative integer. For example, with --shift-heading-level-by=-1, level 2 headings become level 1 headings, and level 3 headings become level 2 headings. Headings cannot have a level less than 1, so a heading that would be shifted below level 1 becomes a regular paragraph. Exception: with a shift of -N, a level-N heading at the beginning of the document replaces the metadata title. --shift-heading-level-by=-1 is a good choice when converting HTML or Markdown documents that use an initial level-1 heading for the document title and level-2+ headings for sections. --shift-heading-level-by=1 may be a good choice for converting Markdown documents that use level-1 headings for sections to HTML, since pandoc uses a level-1 heading to render the document title.
Filters may be written in any language. Text.Pandoc.JSON exports toJSONFilter to facilitate writing filters in Haskell. Those who would prefer to write filters in python can use the module pandocfilters, installable from PyPI. There are also pandoc filter libraries in PHP, perl, and JavaScript/node.js.
Preserve tabs instead of converting them to spaces. (By default, pandoc converts tabs to spaces before parsing its input.) Note that this will only affect tabs in literal code spans and code blocks. Tabs in regular text are always treated as spaces.
Specifies a custom abbreviations file, with abbreviations one to a line. If this option is not specified, pandoc will read the data file abbreviations from the user data directory or fall back on a system default. To see the system default, use pandoc --print-default-data-file=abbreviations. The only use pandoc makes of this list is in the Markdown reader. Strings found in this list will be followed by a nonbreaking space, and the period will not produce sentence-ending space in formats like LaTeX. The strings may not contain spaces.
Use the specified file as a custom template for the generated document. Implies --standalone. See Templates, below, for a description of template syntax. If no extension is specified, an extension corresponding to the writer will be added, so that --template=special looks for special.html for HTML output. If the template is not found, pandoc will search for it in the templates subdirectory of the user data directory (see --data-dir). If this option is not used, a default template appropriate for the output format will be used (see -D/--print-default-template).
Run pandoc in a sandbox, limiting IO operations in readers and writers to reading the files specified on the command line. Note that this option does not limit IO operations by filters or in the production of PDF documents. But it does offer security against, for example, disclosure of files through the use of include directives. Anyone using pandoc on untrusted user input should use this option.
Note: some readers and writers (e.g., docx) need access to data files. If these are stored on the file system, then pandoc will not be able to find them when run in --sandbox mode and will raise an error. For these applications, we recommend using a pandoc binary compiled with the embed_data_files option, which causes the data files to be baked into the binary instead of being stored on the file system.
Determine how text is wrapped in the output (the source code, not the rendered version). With auto (the default), pandoc will attempt to wrap lines to the column width specified by --columns (default 72). With none, pandoc will not wrap lines at all. With preserve, pandoc will attempt to preserve the wrapping from the source document (that is, where there are nonsemantic newlines in the source, there will be nonsemantic newlines in the output as well). In ipynb output, this option affects wrapping of the contents of markdown cells.
Specifies the coloring style to be used in highlighted source code. Options are pygments (the default), kate, monochrome, breezeDark, espresso, zenburn, haddock, and tango. For more information on syntax highlighting in pandoc, see Syntax highlighting, below. See also --list-highlight-styles.
Instructs pandoc to load a KDE XML syntax definition file, which will be used for syntax highlighting of appropriately marked code blocks. This can be used to add support for new languages or to use altered syntax definitions for existing languages. This option may be repeated to add multiple syntax definitions.
A stylesheet is required for generating EPUB. If none is provided using this option (or the css or stylesheet metadata fields), pandoc will look for a file epub.css in the user data directory (see --data-dir). If it is not found there, sensible defaults will be used.
For best results, the reference docx should be a modified version of a docx file produced using pandoc. The contents of the reference docx are ignored, but its stylesheets and document properties (including margins, page size, header, and footer) are used in the new docx. If no reference docx is specified on the command line, pandoc will look for a file reference.docx in the user data directory (see --data-dir). If this is not found either, sensible defaults will be used.
To produce a custom reference.docx, first get a copy of the default reference.docx: pandoc -o custom-reference.docx --print-default-data-file reference.docx. Then open custom-reference.docx in Word, modify the styles as you wish, and save the file. For best results, do not make changes to this file other than modifying the styles used by pandoc:
8d45195817