Topics in different languages in a single PDF

Simcha Gralla

Jul 29, 2015, 9:38:54 AM

to DITA-OT Development

We are working on a document that contains 18 topics, each of which will be printed starting on a new page, and (most importantly) each of which is in a different language. In my tests with OT 2.1 so far, the transform only uses the default language's strings and fonts. We need both to change depending on the topic's language.

I've tried adding @xml:lang to the topicrefs and the topics, but it made no difference.

Did I miss something? If not, any leads on where we should add the necessary changes into the transform?

Simcha Gralla
IBM DITA Tools development

Jarno Elovirta

Jul 29, 2015, 1:30:40 PM

to Simcha Gralla, DITA-OT Development

Currently it cannot be done if you're using the I18N processing (which is on by default); it only works on one language at a time. (I've disabled the I18N processing in PDF2 for all my customers, since you really don't need it with current versions of PDF renderers.) I recommend you disable I18N and just generate the correct font-family based on the language in the DITA to FO stylesheets.

For variable text, that should already work, based on looking at the code. Does it really not use the xml:lang of the nearest ancestor?
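
A minimal sketch of the font-family idea, not taken from PDF2 itself: a standalone XSLT 2.0 stylesheet that picks a font stack from the nearest xml:lang. The template name lang-font-family, the mode name demo, and the font names are all assumptions for illustration; substitute fonts that your renderer actually has installed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical helper: returns a font stack for the language of the
       context node. The font names are placeholders, not PDF2 defaults. -->
  <xsl:template name="lang-font-family">
    <!-- Attributes come back in document order, so [last()] is the
         xml:lang of the nearest ancestor-or-self element. -->
    <xsl:variable name="lang"
        select="lower-case((ancestor-or-self::*/@xml:lang)[last()])"/>
    <xsl:choose>
      <xsl:when test="starts-with($lang, 'ja')">Noto Sans CJK JP, sans-serif</xsl:when>
      <xsl:when test="starts-with($lang, 'ko')">Noto Sans CJK KR, sans-serif</xsl:when>
      <xsl:when test="$lang = 'zh-tw'">Noto Sans CJK TC, sans-serif</xsl:when>
      <xsl:when test="starts-with($lang, 'zh')">Noto Sans CJK SC, sans-serif</xsl:when>
      <xsl:otherwise>sans-serif</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Demo entry point: wrap the whole input in one block. -->
  <xsl:template match="/">
    <fo:block>
      <xsl:apply-templates mode="demo"/>
    </fo:block>
  </xsl:template>

  <!-- Demo only: every element that sets xml:lang gets an inline
       carrying the font stack for that language. -->
  <xsl:template match="*[@xml:lang]" mode="demo">
    <fo:inline>
      <xsl:attribute name="font-family">
        <xsl:call-template name="lang-font-family"/>
      </xsl:attribute>
      <xsl:apply-templates mode="demo"/>
    </fo:inline>
  </xsl:template>

</xsl:stylesheet>
```

In a real customization the call to the helper would go wherever your FO stylesheets already emit font-family, rather than in a separate demo mode.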

Jarno

--
You received this message because you are subscribed to the Google Groups "DITA-OT Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dita-ot-dev...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Toshihiko Makita

Aug 1, 2015, 10:09:09 AM

to DITA-OT Development

> (most importantly) each that is in a different language.

I'm developing a DITA-OT PDF plug-in that can process DITA documents containing multiple languages specified by the xml:lang attribute.

As far as I know, no such PDF plug-in has been published so far.

Please find attached snapshots. The original authoring is as follows:

<p>We say "Hello" in English.</p>

<p>We say "<ph xml:lang="ja-JP">こんにちは！</ph>" in Japanese.</p>

<p>We say "<ph xml:lang="zh-CN">你好！</ph>" in Chinese.</p>

<p>We say "<ph xml:lang="ko-KR">안녕하세요！</ph>" in Korean.</p>

</section>

<note type="note" xml:lang="en">This is English note!</note>

<note type="note" xml:lang="ja-JP">これは日本語の備考です．This is Japanese note.</note>

<note type="note" xml:lang="zh-CN">简体字中国语的提示 This is Simplified Chinese note.</note>

<note type="note" xml:lang="zh-TW">繁体字中国语的註釋 This is Traditional Chinese note.</note>

<note type="note" xml:lang="ko-KR">이것은 한국어의 비고입니다 This is Korean note.</note>

</section>

The most important features are:

- We can automatically apply a language-specific font-family based on xml:lang.

- We can automatically apply language-specific literals and icons based on xml:lang.

- We can automatically apply language-specific styles based on xml:lang.

The plug-in is currently being prepared in a private GitHub repository. Once it is published, I will let you know.

Regards,

Toshihiko Makita

On Wednesday, July 29, 2015 at 22:38:54 UTC+9, Simcha Gralla wrote:

formatting-note.png

formatting-ph.png

Toshihiko Makita

Aug 31, 2015, 3:07:50 PM

to DITA-OT Development

Hi,

I've published it at the following location:

AntennaHouse/pdf5-ml

https://github.com/AntennaHouse/pdf5-ml

The multi-language sample is here:

https://github.com/AntennaHouse/pdf5-ml/tree/master/samples/sample_udhr

Regards,

--

/*--------------------------------------------------

Toshihiko Makita

Development Group. Antenna House, Inc. Ina Branch

E-Mail tma...@antenna.co.jp

Web site:

http://www.antenna.co.jp/

http://www.antennahouse.com/

--------------------------------------------------*/

On Saturday, August 1, 2015 at 23:09:09 UTC+9, Toshihiko Makita wrote:

Frank Ralf

Sep 24, 2015, 11:11:49 AM

to DITA-OT Development, sgr...@us.ibm.com

Hi Jarno,

I suspect I'm facing the same or a similar problem with DITA-OT 2.0.1. I've set @xml:lang on <note> elements, but the variables don't get replaced correctly. It looks like the @xml:lang setting of the top map is overriding everything else. However, we are using an RNG-based DITA customization and might have broken the language inheritance mechanism somehow. Do you have any pointers on how I could check this?

Kind regards,
Frank

BTW
Can you elaborate a bit on "disabling I18N processing in PDF2" you mentioned? How is this done and what is it good for?

Kendall Shaw

Sep 24, 2015, 1:10:52 PM

to DITA-OT Development

About the RELAX NG part, I can offer that it should have nothing to do with whether an xml:lang attribute is set on an element in the DITA document. RELAX NG intentionally does not modify the parsed result. James Clark said at the time that he thought it is not the place of a schema language to modify the result. This contrasts with W3C XML Schema, which creates a post-schema-validation infoset that modifies the parsed result.

DITA-NG does modify the parsed result, but only by adding the DTD compatibility default values, and none of those defaults should be a value for an xml:lang attribute.

A way to check is to look at the merged xml file and see if the xml:lang attributes are there.
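
One way to do that check, sketched as a small standalone stylesheet (an assumption for illustration, not part of DITA-OT): run it with any XSLT processor against the merged file in the temp directory, and it lists every element that carries its own xml:lang together with the value. If the list is empty or only shows the root, the attributes were lost before the PDF stylesheets ever ran.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain-text report, one line per element. -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Visit every element that has an xml:lang attribute of its own. -->
    <xsl:for-each select="//*[@xml:lang]">
      <xsl:value-of select="name()"/>
      <xsl:text> xml:lang=</xsl:text>
      <xsl:value-of select="@xml:lang"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```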

As far as pdf2 variables go, they work for me when xml:lang is specified on a note element (meaning the static text changes based on the language). The other thing that Simcha was referring to was fonts, as in:

org.dita.pdf2/cfg/fo/font-mappings.xml

I think.

Kendall


--
DITA/XML person, currently unemployed and seeking opportunities.

Frank Ralf

Sep 28, 2015, 3:07:00 AM

to DITA-OT Development

Hi Kendall,

Many thanks for this background information and the pointers. I will have another closer look and report back.

Kind regards,
Frank

Robert D Anderson

Sep 28, 2015, 9:51:13 AM

to Frank Ralf, DITA-OT Development

I was looking into this last week and the logic seems to be:

* Current language (for generated text) is determined with the named template getLowerCaseLang
* PDF imports the common code xsl/common/dita-utilities.xsl which has that function, returning the current language based on context
* But, PDF overrides that in org.dita.pdf2/xsl/common/vars.xsl to use the default locale, set by parameter - meaning that regardless of context, it returns the locale parameter
* To override PDF so that it works the same as XHTML (returning language of closest ancestor), I created a plugin that just copies the original template from dita-utilities. Not a lot of code, but it means I've copied code [A] to override code [B] which overrides code [A].

To get this working, I just created a plugin that extends dita.xsl.xslfo with the following stylesheet, taking the original code from dita-utilities.xsl:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="getLowerCaseLang">
    <xsl:variable name="ancestorlangUpper">
      <xsl:choose>
        <xsl:when test="ancestor-or-self::*/@xml:lang">
          <xsl:value-of select="ancestor-or-self::*[@xml:lang][1]/@xml:lang"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$DEFAULTLANG"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <xsl:value-of select="lower-case($ancestorlangUpper)"/>
  </xsl:template>

</xsl:stylesheet>
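
For anyone wanting to reproduce this, the plugin descriptor could look roughly like the sketch below. The plugin id com.example.pdf.nearest-lang and the file name nearest-lang.xsl are made-up placeholders, and the feature syntax is my assumption of how the dita.xsl.xslfo extension point is normally wired up, so check it against the plugin documentation for your toolkit version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical plugin.xml; the id and file name are placeholders. -->
<plugin id="com.example.pdf.nearest-lang">
  <!-- The override only makes sense when the PDF plugin is present. -->
  <require plugin="org.dita.pdf2"/>
  <!-- Add the stylesheet above to the topic-to-FO transformation. -->
  <feature extension="dita.xsl.xslfo" file="nearest-lang.xsl"/>
</plugin>
```

After adding the plugin folder, the toolkit's plugin integration step needs to be run again so that the stylesheet is picked up.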

I'd kind of like to see this as a configurable option, so that I don't need to extend it. Every multi-lang PDF I've worked with should get generated text from the closest element, rather than from the document, but the current behavior has been around so long that I'm reluctant to just change it.

Robert D Anderson
IBM Authoring Tools Development
Chief Architect, DITA Open Toolkit (http://www.dita-ot.org/)

