One of our applications written in VC++ uses _MBCS (not _UNICODE) to support
international text. The application generates XML by "hand" in an MFC
CString and then POSTs that XML to an ASP page using WinInet. The ASP page
is hosted on Windows 2000 Server, and now has CODEPAGE=65001 set, which
means it expects to receive its data (notably this XML) in UTF-8 format.
The app supports Win98 and Win2000, and the Microsoft Layer for Unicode
(MSLU) is not currently being used. The plumbing works fine for English (of
course) but now I've been brought on to the team and asked to see that it
works with XML containing data in 17 other languages.
I'm trying to find the best way for them to convert the MBCS CString to
UTF-8 before posting it. I'm thinking:
1) Use the MSXML parser to load the XML in a designated code page, then
extract the XML in UTF-8. One lingering question is: Does MSXML 3.0 or 4.0
require the MSLU to do these translations on Win98?
2) Compile the application with the MSLU and do the conversions manually
(<code page x> to UCS-2 to UTF-8 using MultiByteToWideChar and
WideCharToMultiByte). This option, however, would require significant
re-testing of the application, which would be difficult as they are nearing
deployment.
3) Use a third-party library, like International Components for Unicode
(ICU), to do the conversions.
Can anyone give me some suggestions?
Galen Murdock
MSXML does not use MSLU at all. And since MSXML handles encoding conversion
itself, doing it yourself really does not make good sense, and I would
truly recommend rethinking this "roll your own" mentality, unless you are:
a) paid by the hour, and
b) not subject to regular review for productive use of time.
If either of these is NOT true, then sticking to appropriate usage makes a
lot more sense.
--
MichKa
Michael Kaplan
(principal developer of the MSLU)
Trigeminal Software, Inc. -- http://www.trigeminal.com/
the book -- http://www.i18nWithVB.com/
"Galen Earl Murdock" <ga...@veracitysolutions.nospam.com> wrote in message
news:uLmRQRjZBHA.1992@tkmsftngp03...
As for "best" development practices, that's exactly what I'm trying to come
up with. I've been charged with discovering, addressing, and helping to
resolve i18n issues on our team of 16 developers. So far I've written a
30-page document with a "Unicode Essentials" section and a "Best Practices"
section. This latter section recommends against the "roll your own"
mentality for the exact reasons you listed.
So, in my list of options that I originally posted, I'd definitely prefer
#1, especially now that I know MSXML doesn't depend on MSLU on Win98.
So the sample code I'll now write is "How to use the MSXML parser to load
XML from an MBCS CString and extract the XML in UTF-8". I just wanted a
sanity check so that I don't head down a dead end. I'm assuming, then, that
this is a recommended approach for this MBCS VC++ app that needs to
communicate in UTF-8 XML?
Galen
--
"Michael (michka) Kaplan" <forme...@spamfree.trigeminal.nospam.com> wrote
in message news:#FsbUzlZBHA.1868@tkmsftngp04...
For best practices, non-Unicode apps are their own form of dead end, at this
point -- may be worth considering this? <g>
--
MichKa
Michael Kaplan
"Galen Earl Murdock" <ga...@veracitysolutions.nospam.com> wrote in message
news:#E#tfEwZBHA.1704@tkmsftngp07...