Does anyone know if there are any Microsoft libraries that have the same
functionality as the iconv converter that is available on Unix?
I have found the GNU-licensed version but would prefer to use a
Microsoft-supplied library if one exists.
Thanks
Neil
--
MichKa
a new book on internationalization in VB at
http://www.i18nWithVB.com/
"Neil Dando" <shandy...@THISmyrealbox.com> wrote in message
news:enwuVMxUAHA.196@cppssbbsa04...
To give some more detail, I'm trying to convert the text of an email from an
LPSTR, as retrieved by an SDK we are using, into a Unicode format to be passed
on to a handheld device. The email has the usual MIME headers with
charset="ISO-2022-JP", and the content is:
I think you're a big bad ぽとあとろあと. So there.
When I view the email in Outlook Express it recognises the encoding as
Japanese (Autoselect) and displays it correctly as: (Hope this displays
correctly on the newsgroup post).
I think you're a big bad ???????. So there
N.B. I'm running Windows 2000 with the Japanese IME also loaded.
If I force the encoding being used to be Western European (Windows) the
content is displayed in the raw form.
Will using MultiByteToWideChar or MLang enable me to convert the string into a
Unicode CString, or is there another approach I need to use?
Thanks for any advice
Neil
"Michael (michka) Kaplan" <forme...@spamfree.trigeminal.nospam.com> wrote
in message news:OgcZljxUAHA.296@cppssbbsa04...
"Neil Dando" <shandy...@THISmyrealbox.com> wrote in message
news:uVbFjB8...@cppssbbsa02.microsoft.com...
In a nutshell, ISO-2022-JP (actually ISO-2022 in general) uses escape
sequences to switch character sets. Text is initially assumed to be in
ASCII (one byte) but the <ESC>$B sequence means that what
follows is in JIS X 0208-1983 (2 byte).
An <ESC>(J switches to JIS-Roman (one byte; the lower half of JIS X 0201,
which matches ASCII apart from the yen sign and overline. The Katakana in
the upper half needs 8 bits, so it does not appear in 7-bit ISO-2022-JP).
There are other escape sequences as well, such as <ESC>(B to return to ASCII.
Once you've figured out how many bytes in the charset (and byte order
for 2-byte stuff) it's just a matter of running them through a lookup
table.
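
As a rough illustration of the escape-sequence switching described above, here is a minimal Python sketch that splits an ISO-2022-JP byte stream into runs. The escape table covers only the four sequences listed in RFC 1468; the names ESCAPES, ESC_PATTERN and split_runs are made up for this example, and the sample bytes are a hand-built guess at the raw message body, not a capture from the original email. The lookup-table step itself is left to Python's built-in iso2022_jp codec.

```python
import re

# Escape sequences defined for ISO-2022-JP in RFC 1468:
# escape bytes -> (character set name, bytes per character).
ESCAPES = {
    b"\x1b(B": ("ASCII", 1),
    b"\x1b(J": ("JIS X 0201-1976 Roman", 1),
    b"\x1b$@": ("JIS X 0208-1978", 2),
    b"\x1b$B": ("JIS X 0208-1983", 2),
}
ESC_PATTERN = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")


def split_runs(data: bytes):
    """Return a list of (charset name, bytes per char, raw bytes) runs."""
    runs = []
    charset, width = ESCAPES[b"\x1b(B"]  # text starts out in ASCII
    pos = 0
    for match in ESC_PATTERN.finditer(data):
        if match.start() > pos:
            runs.append((charset, width, data[pos:match.start()]))
        # The escape sequence itself only changes state; it is not text.
        charset, width = ESCAPES[match.group()]
        pos = match.end()
    if pos < len(data):
        runs.append((charset, width, data[pos:]))
    return runs


# Hypothetical sample: ASCII, a run of two-byte kana, then ASCII again.
sample = b'big bad \x1b$B$]$H$"$H$m$"$H\x1b(B. So there.'

for name, width, raw in split_runs(sample):
    print(name, width, raw)

# The actual table lookup is done here by Python's own codec.
print(sample.decode("iso2022_jp"))
```

Each two-byte run can then be walked in pairs and mapped through a JIS X 0208 table, which is the part the codec does for you.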
I've now been quite successful using MLang and ConvertToUnicode, with 50220
and 1200 as the source and destination code pages respectively when
initializing the ConvertCharset object.
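
For readers without MLang to hand, the same source-to-destination conversion can be sketched in Python. This is only an analogy, not the MLang API: it assumes Python's iso2022_jp codec is a close enough stand-in for Windows code page 50220 (the two are not guaranteed to agree on every vendor extension), and it uses utf-16-le for code page 1200, which is little-endian UTF-16. The function name to_utf16le and the sample bytes are invented for the example.

```python
SOURCE_CODEC = "iso2022_jp"  # assumed stand-in for Windows code page 50220
DEST_CODEC = "utf-16-le"     # code page 1200 is little-endian UTF-16


def to_utf16le(raw: bytes) -> bytes:
    """Convert ISO-2022-JP bytes to UTF-16LE bytes, as a wide string holds them."""
    return raw.decode(SOURCE_CODEC).encode(DEST_CODEC)


# Hypothetical raw body, with the kana wrapped in <ESC>$B ... <ESC>(B.
body = b"I think you're a big bad \x1b$B$]$H$\"$H$m$\"$H\x1b(B. So there."

wide = to_utf16le(body)
print(len(wide) // 2, "UTF-16 code units")
print(wide.decode(DEST_CODEC))
```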
My next problem is with the mail subject line where the same approach does
not work.
If I look at the source of the message I see
Subject: =?ISO-2022-JP?B?QSBqYXBhbmVzZSBlbWFpbCAbJEIhSiRRJCkkPyQqIUsbKEo=?=
But when viewed in Outlook Express the Japanese characters are displayed
correctly and the charset info is removed.
Does anyone have any ideas how to convert this into Unicode for display on a
Japanese device?
Thanks
Neil
"David Williss" <dwil...@microimages.com> wrote in message
news:Ox9V5.1770$ie4....@nntp3.onemain.com...
RFC 1522, along with RFC 1521 and RFC 822, seems to indicate that I need to
parse the string to get the encoded word, decode it using Base64 for B encoding
or quoted-printable for Q encoding, then perform the charset conversion to
Unicode.
Time to experiment....
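
The three steps above (pick out the encoded word, undo the B or Q transfer encoding, convert the charset) might look like this in Python. It is a minimal sketch rather than a full RFC 1522 parser: it handles a single encoded word only, the names ENCODED_WORD and decode_encoded_word are made up for the example, and it relies on Python's codec for ISO-2022-JP instead of MLang.

```python
import base64
import quopri
import re

# =?charset?encoding?payload?=  with encoding B (Base64) or Q.
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([BbQq])\?([^?]*)\?=")


def decode_encoded_word(word: str) -> str:
    """Decode one encoded word; return the input unchanged if it is not one."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        return word
    charset, encoding, payload = match.groups()
    if encoding.upper() == "B":
        raw = base64.b64decode(payload)
    else:
        # Q encoding: like quoted-printable, plus "_" standing for a space.
        raw = quopri.decodestring(payload.encode("ascii"), header=True)
    # Final step: charset conversion to a Unicode string.
    return raw.decode(charset)


subject = "=?ISO-2022-JP?B?QSBqYXBhbmVzZSBlbWFpbCAbJEIhSiRRJCkkPyQqIUsbKEo=?="
print(decode_encoded_word(subject))
```

A real subject line can mix plain text with several encoded words, so production code would need to scan for each match and join the results.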
"Neil Dando" <shandy...@THISmyrealbox.com> wrote in message
news:Ob$qUFjWAHA.242@cppssbbsa03...
Decode QSBqYXBhbmVzZSBlbWFpbCAbJEIhSiRRJCkkPyQqIUsbKEo= using Base64.
The character set used is ISO-2022-JP.
-sm