Hi Markus (and greetings to Karlsruhe),
As you discovered, supporting this charset is not trivial because its text
encoding model is different from Unicode's. It is possible to add support
for such charsets under ICU's ucnv_* API as we have shown with ISCII, but
it is not easy.
There are many charsets that require more than 1:1 code point mappings,
but each requires different special handling. The problem is that these
charsets are all used much, much less frequently than the ones ICU already
supports, so we are getting into diminishing returns for ever-increasing
effort.
Our current thinking is to build a new API on top of existing
implementations: combining a converter and a transliterator to perform
complex conversions in two steps but with a single call.
It would be better for everyone to just work with Unicode charsets; that
would give us more time to add more interesting features, like beefing up
our regular expression support :-)
Having said this, I encourage you to do several things:
- You could file an RFE in our Jitterbug system for support of ISO 5426.
  + Note that there is already an RFE to marry an ICU converter with a
    transliterator in an easy API.
- You could use a converter+transliterator yourself in your code, as
  George suggested.
- Since ICU is open source, you could implement either a dedicated
  converter or the cnv+translit RFE.
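
For the second option, here is a minimal Python sketch of the two-step idea
(it is not ICU code). It assumes, purely for illustration, a charset in which
a diacritic byte precedes its base letter, which as far as I know is how
ISO 5426 handles accents. The byte values in TOY_TABLE are invented and are
not the real ISO 5426 table, and all function names are hypothetical. Step 1
is a plain 1:1 table lookup; step 2 moves each mark behind its base letter
and composes the result.

```python
import unicodedata

# HYPOTHETICAL byte values for illustration only; NOT the real ISO 5426 table.
TOY_TABLE = {
    0xC1: "\u0300",  # combining grave accent
    0xC2: "\u0301",  # combining acute accent
    0xC8: "\u0308",  # combining diaeresis
}


def convert_bytes(data: bytes) -> str:
    """Step 1 ("converter"): a plain 1:1 byte-to-code-point mapping."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))            # ASCII passes through unchanged
        elif b in TOY_TABLE:
            out.append(TOY_TABLE[b])      # diacritic byte -> combining mark
        else:
            out.append("\ufffd")          # unmapped byte -> replacement char
    return "".join(out)


def reorder_marks(text: str) -> str:
    """Step 2 ("transliterator"): move marks that precede a base letter
    so that they follow it, as Unicode expects."""
    out = []
    pending = []                          # marks seen before their base letter
    for ch in text:
        if unicodedata.combining(ch):     # non-zero combining class -> a mark
            pending.append(ch)
        else:
            out.append(ch)
            out.extend(pending)
            pending.clear()
    out.extend(pending)                   # trailing marks without a base letter
    return "".join(out)


def decode(data: bytes) -> str:
    """Both steps behind a single call, then compose to NFC."""
    return unicodedata.normalize("NFC", reorder_marks(convert_bytes(data)))
```

With this toy table, decode(b"Sch\xc8opflin") gives "Schöpflin". The point is
only that the table lookup stays 1:1 and all the reordering lives in the
second step.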
Best regards/Merry Christmas/Happy New Year,
markus
Markus Scherer IBM GCoC-Unicode/ICU San José, CA
markus....@us.ibm.com
Markus Schöpflin <markus.s...@ginit-technology.com>
Sent by: icu-chars...@www-124.southbury.usf.ibm.com
2002-12-18 04:05
To: icu-ch...@www-124.southbury.usf.ibm.com
cc:
Subject: ISO 5426:1983 mapping
...