Implications of setting preprocess.chunk.skip parameter to 'true'

Corinna Kinchin

Feb 17, 2021, 6:26:55 AM

to DITA-OT Users

Good day all,

I'm hoping one of you experts might be able to shed some light on the implications of setting the preprocess.chunk.skip parameter?

To give some background, one of our customers reported an issue with duplicated glossary data as follows:

We notice strange behavior when publishing a ditamap with the audience values stored in the dita topics (instead of the ditamap).

I attached a ZIP file with 2 glossary test maps:

_glossary_map_filtering.ditamap: this ditamap has the filtering values stored in the ditamap. This works fine when publishing to Miramo PDF.
_glossary_topic_filtering.ditamap: this ditamap has no filtering values stored in the ditamap. The audience attributes are stored in the individual DITA files. Publishing this map to Miramo PDF results in duplicated glossary entries.

The intermediate topicmerged DITA produced as part of the MiramoPDF DITA-OT plugin included a lot of duplicate glossary information, which duly found its way into the final PDF.

After some digging I found that setting the 'preprocess.chunk.skip' parameter to 'true' prevents the duplication of data.

My question is, is it safe to always set the parameter to 'true' or might that have repercussions? I notice the supplied glossary_topic_filtering.ditamap contains this @chunk attribute:

TBH I don't understand 'chunking' so any feedback would be gratefully received!

Many thanks in advance

Toshihiko Makita

Feb 18, 2021, 7:14:16 AM

to DITA-OT Users

Hi Corinna,

I don't have any information about the MiramoPDF DITA-OT plug-in. However, if you want to generate PDF from DITA instances, the @chunk attribute should always be ignored, because it is intended for HTML output.

Look at lines 25 to 28 of integrator.xml in my PDF5-ML plug-in:

https://github.com/AntennaHouse/pdf5-ml/blob/master/com.antennahouse.pdf5.ml/integrator.xml

In HTML output, each topic is converted into one HTML page. @chunk is frequently used to gather several topics into one page, but it is not intended for PDF output.
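For anyone driving the build from a script, here is a minimal sketch of passing the parameter only for PDF-type builds, so that HTML builds still honour @chunk. It assumes the standard `dita` command is on the PATH and that the transformation uses the classic preprocess targets that check preprocess.chunk.skip. The function name, the set of transtypes, and the output folder are hypothetical; substitute the transtype your own plug-in registers.

```python
from typing import List

# Assumption: transtypes that produce paged output and should ignore @chunk.
# Adjust this set to the transtypes your own plug-ins register.
PAGED_TRANSTYPES = {"pdf", "pdf2"}


def build_dita_command(input_map: str, transtype: str, output_dir: str,
                       dita_exe: str = "dita") -> List[str]:
    """Build a dita command line, skipping the chunk step for paged output."""
    cmd = [
        dita_exe,
        f"--input={input_map}",
        f"--format={transtype}",
        f"--output={output_dir}",
    ]
    if transtype in PAGED_TRANSTYPES:
        # The -D form passes a build parameter that is not declared
        # in a plug-in configuration file.
        cmd.append("-Dpreprocess.chunk.skip=true")
    return cmd


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run() to execute it.
    print(" ".join(build_dita_command("_glossary_topic_filtering.ditamap", "pdf", "out")))
```

Keeping the switch tied to the transtype means the same map can still be chunked when it is published to HTML.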

Hope this helps your understanding.

Regards,

--
/*-----------------------------------------------------------------------------------
Toshihiko Makita
Development Group. Antenna House, Inc. Ina Branch
Web site:
http://www.antenna.co.jp/
http://www.antennahouse.com/
------------------------------------------------------------------------------------*/

On Wednesday, February 17, 2021 at 20:26:55 UTC+9, ckin...@datazone.com wrote:

Corinna Kinchin

Feb 18, 2021, 7:47:08 AM

to DITA-OT Users

Hi Toshihiko,

thank you, that is good to know.

Many thanks for your response, much appreciated!
