Dave Jarvis <dave....@gmail.com> started it with:
T. Kurt Bond <tkur...@gmail.com> made some changes to the filter
from that page and then noted:
> Unfortunately, there is a problem. See that .LP right after the
> .startpoem in the output? It turns out .LP is not allowed in a
> display, so the .LP cancels the display and the lines show up
> filled in the output.
John MacFarlane <j...@berkeley.edu> wrote:
> Just have your lua filter change the Para element inside the
> container into a Plain element.
That worked very well.
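A minimal sketch of that suggestion (not the filter from this thread)
might look like the following. It replaces each Para that sits directly
inside a div with the poem class by a Plain holding the same inlines.
The class name and the choice to touch only direct children are
assumptions made for illustration.

```lua
-- Sketch: turn Para children of a "poem" div into Plain elements so the
-- ms writer does not emit .LP in front of them.
function Div(div)
  if not div.classes:includes("poem") then
    return nil  -- leave other divs unchanged
  end
  local blocks = {}
  for _, block in ipairs(div.content) do
    if block.t == "Para" then
      blocks[#blocks + 1] = pandoc.Plain(block.content)
    else
      blocks[#blocks + 1] = block
    end
  end
  -- Assign a fresh list instead of mutating div.content in place.
  div.content = blocks
  return div
end
```

Building a new list and assigning it back avoids depending on whether
in-place changes to div.content are seen by pandoc.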
I work with ReStructuredText documents a lot, and wanted to try
something like this with one of them.
This filter wraps spans that have a class, such as those produced by
interpreted text roles defined in the source ReST (like
:program:`pandoc`), in calls to the user-defined groff strings
\*[start<class>] and \*[stop<class>]. The string definitions are
included in the source ReST as a raw block for ms output; they contain
groff escapes that change the font and the glyph color and then change
back to the previous font and glyph color.
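As a hedged sketch of the span handling described above (it is written
for illustration and is not the Span function from the filter itself),
one way to do it is to return the span's inlines bracketed by two
RawInline elements. Using only the first class of the span is an
assumption.

```lua
-- Sketch: wrap the contents of a classed span in \*[start<class>] and
-- \*[stop<class>] string calls for ms output.
function Span(span)
  local class = span.classes[1]  -- assumption: only the first class matters
  if not class then
    return nil  -- no class, leave the span alone
  end
  -- Keep only alphanumeric characters so the result is a safe groff name.
  local name = class:gsub("[^%w]", "")
  local inlines = { pandoc.RawInline("ms", "\\*[start" .. name .. "]") }
  for _, inline in ipairs(span.content) do
    inlines[#inlines + 1] = inline
  end
  inlines[#inlines + 1] = pandoc.RawInline("ms", "\\*[stop" .. name .. "]")
  -- Returning a list splices these inlines in place of the span.
  return inlines
end
```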
It also wraps divs that have classes in calls to the user-defined groff
macros .start<class> and .stop<class> (also defined in the source
ReST in a raw block for ms output).
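The div case can be sketched the same way, with RawBlock elements in
place of RawInline. This is an illustrative sketch under assumptions,
not the code from the filter: it uses only the first class, and it
checks pandoc's FORMAT global so that other output formats are left
untouched.

```lua
-- Sketch: surround the blocks of a classed div with .start<class> and
-- .stop<class> macro calls when writing ms.
function Div(div)
  if FORMAT ~= "ms" then
    return nil  -- the macros only exist for ms output
  end
  local class = div.classes[1]  -- assumption: only the first class matters
  if not class then
    return nil
  end
  local name = class:gsub("[^%w]", "")
  local blocks = { pandoc.RawBlock("ms", ".start" .. name) }
  for _, block in ipairs(div.content) do
    blocks[#blocks + 1] = block
  end
  blocks[#blocks + 1] = pandoc.RawBlock("ms", ".stop" .. name)
  -- Returning a list splices these blocks in place of the div.
  return blocks
end
```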
For divs with the poem class, it converts any contained LineBlock
elements into a list of Plain elements containing their contents,
avoiding the ms output for the LineBlock starting with .LP, which
would cancel the .DS (start display) macro we want to use in the
.startpoem macro definition. The .LP would also reset the font family
in use to the default, another reason to avoid it.
It also converts the empty element that occurs in the line block
as a result of a blank line in the line block input into a RawBlock
that creates a blank line in the ms output, to show the division into
stanzas of the poem.
Interestingly, the first Str element in each line of the line block's
content preserves the leading spaces from the input as Unicode
NO-BREAK SPACE characters, preserving the indentation of lines in the
line block. Unfortunately, the width of those spaces alone is not
enough to create a visually distinct indentation, so this filter
changes those Str elements into a RawInline that outputs a groff
horizontal movement whose width is based on the number of leading
NO-BREAK SPACE characters, and follows this with a new Str element that
has the leading NO-BREAK SPACE characters removed.
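The counting step can also be sketched in plain Lua 5.3 or later,
without a regex library. The helper names split_leading_nbsp and
groff_indent are hypothetical and do not appear in the filter; the
sketch only relies on NO-BREAK SPACE being a fixed byte sequence in
UTF-8.

```lua
NBSP = "\u{a0}"  -- U+00A0, encoded as two bytes in UTF-8

-- Count the leading NO-BREAK SPACE characters of text and return the
-- count together with the text that follows them.
function split_leading_nbsp(text)
  local count, pos = 0, 1
  while text:sub(pos, pos + #NBSP - 1) == NBSP do
    count = count + 1
    pos = pos + #NBSP
  end
  return count, text:sub(pos)
end

-- Build a groff horizontal movement of `count` en units.
function groff_indent(count)
  return string.format("\\h'%dn'", count)
end
```

For a Str whose text starts with three NO-BREAK SPACE characters,
split_leading_nbsp returns 3 and the remaining text, and
groff_indent(3) gives the \h'3n' escape seen in the ms output.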
Here is the Lua filter:
===== classify-rst-ms.lua ==================================
onig = require ("rex_onig")  -- Need a regex package that understands UTF8.

-- Text in a LineBlock preserves leading spaces as Unicode NO-BREAK SPACE.
leading_nobreakspace_rx = onig.new ("^(\u{a0}+)(.*)$", nil, "UTF8")

function Div( element )
  local annotation = element.classes:find_if( matches )
  local numPara = 0

  if annotation then
    annotation = annotation:gsub( "[^%w]*", "" )
    if annotation == "poem" then
      element = pandoc.walk_block (
        element, {
          -- Replace LineBlock element with a list of Plain elements
          -- containing the LineBlock's subelements.
          LineBlock = function (el)
            local l = {}
            for _, subel in ipairs (el.content) do
              if #subel == 0 then
                -- If subel is an empty table, output a raw empty line.
                table.insert (l, pandoc.RawBlock ("ms", "\n\n"))
              else
                -- Check for leading NO-BREAK SPACE characters.
                local m1, m2 = onig.match (subel[1].text,
                                           leading_nobreakspace_rx)
                if m1 then
                  -- Replace the NO-BREAK SPACE characters with a raw
                  -- groff horizontal movement, because the
                  -- NO-BREAK SPACE characters are too narrow.
                  table.insert (subel, 1,
                                pandoc.RawInline ("ms",
                                                  string.format ("\\h'%dn'", utf8.len (m1))))
                  -- Modify what used to be the first item to just
                  -- include the trailing characters of the match.
                  subel[2] = pandoc.Str (m2)
                  table.insert (l, pandoc.Plain (subel))
                else
                  -- Just put the subel in a Plain element.
                  table.insert (l, (pandoc.Plain (subel)))
                end
              end
            end
            return l
          end })
    end
Here is the ReST source of the document:
===== poem-plus.rst ========================================
Lua Filters For Massaging ``ms`` Output
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

.. raw:: ms

   .ds startprogram \\f[CW]\\m[red]
   .ds stopprogram \\m[]\\fP
   .de startpoem
   .ds OLDFAM \\*[FAM]
   .ds FAM BM
   .DS I 3
   ..
   .de stoppoem
   .DE
   .ds FAM \\*[OLDFAM]
   ..

.. role:: program

This is a sentence. This sentence talks about :program:`pandoc`.
This is
another sentence.

.. class:: poem

| Some say the world will end in fire,
|    Some say in ice.
| From what I've tasted of desire
|    I hold with those who favor fire.
| But if it had to perish twice,
|    I think I know enough of hate
|    To say that for destruction ice
|    Is also great,
| And would suffice.
|
| And another line,
|    And an indented line.

This is a final sentence.
============================================================
And here is the ms output:
===== poem-plus-rst.ms =====================================
.SH 1
Lua Filters For Massaging \f[CB]ms\f[B] Output
.pdfhref O 1 "Lua Filters For Massaging ms Output"
.pdfhref M "lua-filters-for-massaging-ms-output"
.ds startprogram \\f[CW]\\m[red]
.ds stopprogram \\m[]\\fP
.de startpoem
.ds OLDFAM \\*[FAM]
.ds FAM BM
.DS I 3
..
.de stoppoem
.DE
.ds FAM \\*[OLDFAM]
..
.LP
This is a sentence.
This sentence talks about \*[startprogram]pandoc\*[stopprogram].
This is
another sentence.
.startpoem
Some say the world will end in fire,
\h'3n'Some say in ice.
From what I\[aq]ve tasted of desire
\h'3n'I hold with those who favor fire.
But if it had to perish twice,
\h'3n'I think I know enough of hate
\h'3n'To say that for destruction ice
\h'3n'Is also great,
And would suffice.
And another line,
\h'3n'And an indented line.
.stoppoem
.LP
This is a final sentence.
============================================================
Being able to rewrite the tree and insert RawBlocks and RawInlines is
really powerful when it comes to customizing output for particular
output formats.
I hope this example is useful for others like me just learning to use
Lua filters.