
Missing Japanese characters in PDF file


Alexander Landa

Jun 2, 2016, 9:25:25 AM
Hello all,

It's been many years since I worked intensively with TeX, GS, PDF and related stuff, but I have never done anything with Asian CJK fonts.

But today I have a requirement to put Japanese characters in our reports.
All reports are generated by our own specialized .NET library, which has worked well so far.

It has partly succeeded, but some characters are not rendered by Adobe Reader DC (fully updated).

What I have done is:
1. Use one of the standard Adobe fonts, "HeiseiMin-W3", which is installed with the Adobe Reader DC Extended Asian Font Pack. I checked to be sure: the font is in the right place, C:\Program Files (x86)\Adobe\Acrobat Reader DC\Resource\CIDFont\KozMinPr6N-Regular.otf.
2. Re-encode the Unicode text to big-endian Unicode (UTF-16BE) inside the program code.
3. Write /Encoding /UniJIS-UTF16-H. I don't know why, but this produces the best results, and it looks to me as if the UniJIS-UTF16-H CMap is being used. Checked: the CMap file is also installed in the right place, C:\Program Files (x86)\Adobe\Acrobat Reader DC\Resource\CMap\UniJIS-UTF16-H.

All of this is within a very simple PDF file.
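Steps 2 and 3 above can be sketched roughly as follows. This is only an illustration in Python, not the .NET library's actual code; the function name `to_pdf_hex_string` is made up for this example. It assumes the text is written into the content stream as a PDF hex string of UTF-16BE code units, which is the form of input a font using /Encoding /UniJIS-UTF16-H takes.

```python
def to_pdf_hex_string(text: str) -> str:
    """Return `text` as a PDF hex string of UTF-16BE code units.

    Hypothetical helper for illustration only. The result is what would be
    placed before a Tj operator when the selected font uses the
    UniJIS-UTF16-H CMap as its /Encoding.
    """
    utf16_be = text.encode("utf-16-be")  # step 2: big-endian Unicode
    return "<" + utf16_be.hex().upper() + ">"


if __name__ == "__main__":
    # The first two (problematic) characters plus a little ASCII text.
    print(to_pdf_hex_string("覎绨 xx"))
```

Comparing such a hex string with the one actually found in the generated PDF is a quick way to rule out an encoding mistake on the writing side.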

The following string should be the output:

覎绨 脂閙 xx スターゼン株式会社

But what I see is:

􈦎􇻨 脂閙 xx スターゼン株式会社

The first two characters are replaced by placeholders.

In Unicode (little-endian), byte[] =
{142, 137, 232, 126, 32, 0, ....}

or in big-endian Unicode, byte[] =

{137, 142, 126, 232, 0, 32, ....}

In other words, the characters 覎绨, with the Unicode byte pairs {142,137} and {232,126}, cannot be rendered.
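To double-check the byte values quoted above, here is a small Python sketch (not part of the original report code; the helper name `code_points` is made up) that decodes both byte orders and lists the resulting code points. It only uses the six bytes given in this post.

```python
import unicodedata


def code_points(data: bytes, encoding: str) -> list:
    """Decode `data` with the given UTF-16 variant and return its code points."""
    return [ord(ch) for ch in data.decode(encoding)]


little_endian = bytes([142, 137, 232, 126, 32, 0])
big_endian = bytes([137, 142, 126, 232, 0, 32])

if __name__ == "__main__":
    # Both byte orders should describe the same three characters.
    assert code_points(little_endian, "utf-16-le") == code_points(big_endian, "utf-16-be")
    for cp in code_points(big_endian, "utf-16-be"):
        print(f"U+{cp:04X}", unicodedata.name(chr(cp), "<unnamed>"))
```

The two problematic characters come out as U+898E and U+7EE8, followed by the space U+0020, so the byte order itself looks consistent.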

I've already racked my brain just getting some Japanese characters into the PDF at all, and now I'm very unhappy that it does not work properly.

Can somebody please help me resolve this problem?

Here is a link to an example document (without Flate encoding, for readability!):

https://drive.google.com/file/d/0B1tF6IyZkk3ES1BfZ0xfNVVGMHc/view?usp=sharing

P.S. Inside the PDF, the internal font name is "Meiryo-H" (I simply didn't change it!), but the experts know that this has nothing to do with the real font name. I mention it here to prevent confusion.

My problem is that I can't read or understand Japanese, so I can't even check whether some characters are simply missing from the installed Adobe font.
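One rough way to check this without reading Japanese is sketched below in Python. It does not inspect the Adobe font itself. It only asks whether each character can be encoded in a legacy Japanese character set (Python's built-in `shift_jis` codec), on the assumption that this approximates "is a common Japanese character". The function name `missing_from_charset` is made up, and the character collection of the real font may be larger than this codec covers, so a miss is only a hint.

```python
def missing_from_charset(text: str, codec: str = "shift_jis") -> list:
    """Return the characters of `text` that `codec` cannot encode.

    Rough proxy only: it tests a legacy Japanese character set, not the
    glyph coverage of any installed font.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    sample = "覎绨 脂閙 xx スターゼン株式会社"
    for ch in missing_from_charset(sample):
        print(f"not encodable in shift_jis: {ch} (U+{ord(ch):04X})")
```

Any character reported here is worth checking against the source data, since it may not be a Japanese character at all.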

Many, many thanks in advance!

Alex

Peter Flynn

Jun 6, 2016, 4:54:46 PM
On 06/02/2016 02:25 PM, Alexander Landa wrote:
> Hello all,
>
> It's been many years since I worked intensively with TeX, GS, PDF and
> related stuff, but I have never done anything with Asian CJK fonts.
>
> But today I have a requirement to put Japanese characters in our reports.

I wanted a Japanese phrase in my (LaTeX) thesis, so I used the CJK
package:

\documentclass{report}
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[encapsulated]{CJK}
\begin{document}
Following string must be the output:
\begin{CJK}{UTF8}{min}覎绨 脂閙 xx スターゼン株式会社\end{CJK}
\end{document}

That works (using pdflatex).

///Peter
