Hey Furjuk,
2013/8/27 Furjuk Mityos <mtmty...@gmail.com>:
> Hi,
>
> Thank you for trying.
>
> I knew some viewer cannot view utf8 (or other normal) text with custom CID
> map.
> You can select CID text or normal text by new TT loading option.
> At first I thought CID text should be default because many viewers can view,
> but I found we cannot use word-space in CID text, so normal text must be
> default for backward compatibility.

Perhaps it's important to clarify what is what (for the record and for
the ignorant).
The utf8 file defines a custom encoding that corresponds to UTF-8 and
writes the Unicode text to the byte stream UTF-8 encoded; is that
correct?

I did the previous (limited) implementation of Unicode support in
libharu and would be happy to see a more mature solution in place (and
you seem to have made many other improvements too). However, it was
also my experience that defining a UTF-8 encoding in the PDF file broke
some viewers (such as Preview), and that is why we settled on the
'identity encoding' (i.e. UCS-2/UCS-4) in the byte stream, which we got
to work in all viewers. I believe that crashing a PDF viewer (the
default one on Mac OS X) kills the UTF-8 approach in practical terms?
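
To make the distinction concrete (for the record), here is a small Python sketch of how the same text ends up in the byte stream under the two approaches. It is only an illustration: it does not call libharu, the function names are made up for this mail, and I am assuming that "identity encoding" means two big-endian bytes per character for text inside the BMP.

```python
# Illustration only: no libharu calls; the function names are hypothetical.

def utf8_stream(text: str) -> bytes:
    """Bytes as written when the font uses a custom UTF-8 encoding."""
    return text.encode("utf-8")


def ucs2_stream(text: str) -> bytes:
    """Two big-endian bytes per character (UCS-2 style), BMP only."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("UCS-2 cannot represent characters outside the BMP")
    # For BMP characters UTF-16BE and UCS-2 produce identical bytes.
    return text.encode("utf-16-be")


sample = "a \u00e9"  # 'a', space, e-acute

# UTF-8: ASCII stays one byte, so the space is the single byte 0x20.
print(utf8_stream(sample).hex())   # 6120c3a9

# UCS-2: every character is two bytes, so the space becomes 0x00 0x20.
print(ucs2_stream(sample).hex())   # 0061002000e9
```

Note that the space is a single byte 0x20 only in the UTF-8 stream; in the two-byte stream it is 0x00 0x20.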

Therefore I think the method used in the cidtext.pdf file should be the
default, or is there a downside I am missing? How does the
'word-space' issue manifest itself? Is the cidtext.pdf file using
Identity-H encoding, similar to what the previous implementation did?
Backward compatibility would rather point to the CID map than to the
UTF-8 encoding; is this what you mean?

Regards,
koen