New extension point - preprocess-prefilter

Tom Glastonbury

Jul 22, 2015, 6:18:36 AM
to DITA-OT Development
Hi all,

I have been trying to solve a particular problem, and have come up with a solution that involves adding a new extension point to the base OT. Further, one plugin that I have developed to use this extension point may well be worth absorbing into the base OT as built-in functionality (see (4) below). I wanted to discuss this here prior to firing off a pull request.

Feedback please!

Thanks,

Tom

My implementation is here:

1. The problem I needed to solve

I wanted to allow wiki-style references (primarily conkeyrefs) within text nodes in any DITA XML file: text like "[[SomeKey]]". From here on, I will call these "wikilinks". The wikilinks are transformed as appropriate into XML like "<ph conkeyref="SomeKey"/>". The content inside [[ ]] can be richer than just a key name; I'm simplifying here. The transformation is implemented using XSLT. The overall format of the source files remains fully DITA compliant.
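
To make the mapping concrete, here is a minimal sketch of the simplified case only (a bare key name inside [[ ]]). It is written in plain Java with a regular expression rather than the XSLT used in the actual transform, and the class name, method name and key-name pattern are all hypothetical.

```java
import java.util.regex.Pattern;

final class WikilinkSketch {
    // Assumption: a key is letters, digits, '_', '.' or '-'. The real syntax is richer.
    private static final Pattern WIKILINK =
            Pattern.compile("\\[\\[([A-Za-z0-9_.-]+)\\]\\]");

    /**
     * Rewrites each [[Key]] in the given text to <ph conkeyref="Key"/>.
     * Works on a raw string for illustration; a real filter would emit
     * element events instead of splicing markup into text.
     */
    static String expand(String text) {
        return WIKILINK.matcher(text).replaceAll("<ph conkeyref=\"$1\"/>");
    }
}
```

For example, WikilinkSketch.expand("See [[SomeKey]].") turns the wikilink into the ph element and leaves the surrounding text alone.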

2. Available extension points

Because the wikilinks transform creates conkeyrefs, the transform needs to run before gen-list and debug-filter. The original DITA source files are not copied to the temp directory before gen-list/debug-filter, so adding a new module that modifies the files is not an option: it would have to modify the original source files rather than the (not yet copied) temp files. The only possible extension point I could find was dita.parser, but it was not clear how this could be used for all DITA files, as it appears to have been designed for supporting additional non-DITA formats (e.g., Markdown). I also wanted the ability to chain several filters.

3. Solution

To solve this, I added a new extension point that I've called "dita.preprocess.prefilter". This adds XMLFilter objects to the start of the pipeline (in GenMapAndTopicListModule.getProcessingPipe(...) and DebugAndFilterModule.getProcessingPipe(...)). The XMLFilter objects can be specified as Java class names or as XSLT file names. Plugin dependency order is respected, so the transforms are added to the pipe in the expected order.
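
As an illustration of what "adding XMLFilter objects to the start of the pipeline" means in plain SAX terms, the sketch below chains filters in list order using only the standard org.xml.sax API. It does not use any DITA-OT classes; PrefilterChainSketch, TracingFilter and chain are hypothetical names, and how the OT modules actually assemble their pipe is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class PrefilterChainSketch {
    /** Hypothetical filter that records its name when the document starts. */
    static final class TracingFilter extends XMLFilterImpl {
        private final String name;
        private final List<String> log;

        TracingFilter(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void startDocument() throws SAXException {
            log.add(name);
            super.startDocument(); // forward the event to the next stage
        }
    }

    /** Chains filters so the first list entry sits closest to the parser. */
    static XMLReader chain(XMLReader parser, List<? extends XMLFilter> filters) {
        XMLReader head = parser;
        for (XMLFilter filter : filters) {
            filter.setParent(head);
            head = filter;
        }
        return head;
    }

    /** Parses a tiny document and returns the order in which filters saw it. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            XMLReader pipe = chain(parser, Arrays.asList(
                    new TracingFilter("first", log),
                    new TracingFilter("second", log)));
            pipe.setContentHandler(new DefaultHandler());
            pipe.parse(new InputSource(new StringReader("<topic id=\"t\"/>")));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }
}
```

Because each filter's parent is the previous stage, events reach the filters in list order, which is the property that matters when plugin dependency order decides the list.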

4. XSLT/DOCTYPE issue

There is a further XSLT/DOCTYPE issue. The wikilinks transform needs to handle XML documents of various doctypes, and emit documents with the same doctype as the input. My research shows that this is not possible without giving XSLT some help. So I implemented an XMLFilter that uses org.xml.sax.ext.LexicalHandler to examine DTD events during parsing, and emits processing instructions that expose the doctype's public and system identifiers. These can then be used by an XSLT 2.0 transform to handle input documents with arbitrary doctype. It is this injection of the doctype-public and doctype-system processing instructions that I think might be worth absorbing into the base OT. The implementation of this XMLFilter is not on GitHub (it's currently one of our in-house plugins). If it's deemed desirable to have it in base OT, I'll add it to my dita-ot fork.

The XMLFilter adds PIs like:

<?doctype-public -//OASIS//DTD DITA Composite//EN?>
<?doctype-system ditabase.dtd?>
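
A filter along these lines could be sketched as follows. This is not the in-house plugin's code, just a minimal illustration of the technique using the standard SAX API: the class name DoctypePiFilterSketch and the demo method are hypothetical, and the demo supplies an empty stand-in for the DTD so that nothing is read from disk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class DoctypePiFilterSketch extends XMLFilterImpl implements LexicalHandler {
    static final String LEXICAL_HANDLER = "http://xml.org/sax/properties/lexical-handler";

    /** Called by the parser on the DOCTYPE declaration, before the root element. */
    @Override
    public void startDTD(String name, String publicId, String systemId) throws SAXException {
        if (publicId != null) super.processingInstruction("doctype-public", publicId);
        if (systemId != null) super.processingInstruction("doctype-system", systemId);
    }

    // Other lexical events are ignored in this sketch (so they are not passed on).
    @Override public void endDTD() {}
    @Override public void startEntity(String name) {}
    @Override public void endEntity(String name) {}
    @Override public void startCDATA() {}
    @Override public void endCDATA() {}
    @Override public void comment(char[] ch, int start, int length) {}

    /** Parses a small document and returns the PIs seen downstream as "target data". */
    static List<String> demo() {
        final List<String> seen = new ArrayList<>();
        String xml = "<!DOCTYPE dita PUBLIC \"-//OASIS//DTD DITA Composite//EN\" "
                + "\"ditabase.dtd\"><dita/>";
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DoctypePiFilterSketch filter = new DoctypePiFilterSketch();
            filter.setParent(parser);
            // Lexical events are only delivered if the handler is registered on the parser.
            parser.setProperty(LEXICAL_HANDLER, filter);
            // Demo only: resolve the external DTD to an empty stream.
            filter.setEntityResolver((pub, sys) -> new InputSource(new StringReader("")));
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void processingInstruction(String target, String data) {
                    seen.add(target + " " + data);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

The PIs are emitted from startDTD, so they arrive after startDocument and before the root element, which is where a downstream stylesheet can read them as children of the document node.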

An XSLT transform can then be written like:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
    
    <!-- Identity template: copy everything through unchanged, in every mode -->
    <xsl:template match="@*|node()" mode="#all">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" mode="#current"/>
        </xsl:copy>
    </xsl:template>

    <!-- Re-emit the input doctype using the values carried by the injected PIs -->
    <xsl:template match="/">
      <xsl:result-document
        doctype-public="{processing-instruction('doctype-public')}"
        doctype-system="{processing-instruction('doctype-system')}">
        <xsl:apply-templates/>
      </xsl:result-document>
    </xsl:template>
    
</xsl:stylesheet>


Jarno Elovirta

Jul 22, 2015, 11:49:00 AM
to Tom Glastonbury, DITA-OT Development
To me this sounds like such advanced usage that a simple extension point may not cut it, especially since I'm planning to merge gen-list and debug-filter at some point. I need to think about this.

J


Tom Glastonbury

Jul 22, 2015, 12:20:18 PM
to Jarno Elovirta, DITA-OT Development
Hi Jarno,

Thanks for giving it some thought. 

I've also now put the DTD-to-processing-instruction plugin up on GitHub:

Many thanks,

Tom