Enabling NSPR UTF16 APIs for Win32

Jungshik Shin

Jul 20, 2003, 9:02:19 AM

I'm wondering what others (especially wtc) think about enabling the
NSPR UTF16 APIs for Win32 by default (or when it's built for the Mozilla
client). That seems to be the way to fix bug 162361
(http://bugzilla.mozilla.org/show_bug.cgi?id=162361) and other related
bugs involving Win32 'W' APIs [1].

Any comments?

Jungshik

[1] http://www.mozilla.org/releases/mozilla1.4/known-issues-int.html
See the 4th paragraph in the section named 'General'

Message has been deleted

Jungshik Shin

Jul 25, 2003, 1:37:58 AM

Wan-Teh Chang wrote:
> Jungshik Shin wrote:
> > I'm wondering what others (especially wtc) think about enabling
> > NSPR UTF16 APIs for Win32 by default (or when it's built for the Mozilla
> > client).
>
> I am still debating whether the NSPR Unicode APIs
> should use UTF16 or UTF8. I am inclined towards
> using UTF8 because NSPR doesn't have any string
> functions that operate on UTF16 strings.
>
> Once this issue is resolved we can enable the
> NSPR Unicode APIs by default.

Indeed, using UTF8 for the NSPR Unicode APIs may be a good idea. For instance,
my patch for bug 162361
(http://bugzilla.mozilla.org/show_bug.cgi?id=162361) has lines like
this:

mDir.dirW = PR_OpenDirUTF16(NS_ConvertUTF8toUCS2(filepath.get()).get());

I wouldn't have to call NS_ConvertUTF8toUCS2 if the NSPR Unicode APIs
took UTF-8. Further simplifications would be possible as well.

Some other consumers of the NSPR APIs may find UTF-16 more convenient
than UTF-8 (Win32, JS, and ICU use UTF-16), but converting between the
two is easy, so it shouldn't be a big deal.
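
To give a sense of how small that conversion is, here is a minimal sketch of
a UTF-8 to UTF-16 decoder in portable C++. Utf8ToUtf16 is a hypothetical
name, not an NSPR or XPCOM function, and replacing malformed input with
U+FFFD is an assumption of this sketch rather than anything decided in this
thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of NSPR or XPCOM): decodes UTF-8 bytes into
// UTF-16 code units. Each malformed byte is replaced with U+FFFD.
std::u16string Utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;     // code point being assembled
        std::size_t extra = 0;    // continuation bytes expected
        std::uint32_t minCp = 0;  // smallest value legal for this length
        if (lead < 0x80)                { cp = lead;        extra = 0; minCp = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minCp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minCp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minCp = 0x10000; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray or invalid lead byte

        bool ok = i + extra < in.size();  // false if the sequence is truncated
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, surrogate code points, and values > U+10FFFF.
        if (!ok || cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            out.push_back(0xFFFD);
            ++i;
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            assert(cp <= 0xFFFFF);
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return out;
}
```

On Win32 itself, the same job can be done with MultiByteToWideChar and the
CP_UTF8 code page (available on the NT line), so a hand-written decoder like
this would mainly matter where that is not an option.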

An alternative for Win32 (on POSIX systems, at least, we don't need to
worry about Unicode APIs) is to make the existing 'char'-based APIs behave
differently based on run-time detection of the OS (Win 9x/ME vs 2k/XP).
On Win9x/ME, a 'char *' is treated as an 'ANSI' string and passed to the
Win32 'A' APIs, while on Win2k/XP a 'char *' is treated as a UTF-8 string
and passed to the 'W' APIs after converting to UTF-16.

However, this alternative seems 'risky' in that it can break
backward compatibility, which is why I didn't mention it in my first
posting.
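
The run-time dispatch described above can be sketched in portable C++ as
follows. Everything here is hypothetical: PlatformHooks and OpenDirCompat are
illustrative names, and the hooks stand in for the real pieces (OS detection,
an 'A'/'W' API pair such as FindFirstFileA / FindFirstFileW, and a UTF-8 to
UTF-16 converter). It shows only the shape of the idea, not how NSPR would
actually implement it.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical bundle of platform hooks. On Windows these would wrap the
// real OS detection, an 'A'/'W' API pair, and a UTF-8 to UTF-16 converter.
struct PlatformHooks {
    bool isNT;  // result of run-time OS detection (true on 2k/XP)
    std::function<std::u16string(const std::string&)> utf8ToUtf16;
    std::function<bool(const std::string&)> openDirA;     // 'A' API stand-in
    std::function<bool(const std::u16string&)> openDirW;  // 'W' API stand-in
};

// One 'char'-based entry point whose reading of 'path' depends on the OS
// family, as in the alternative described above.
bool OpenDirCompat(const PlatformHooks& hooks, const char* path) {
    assert(path != nullptr);
    if (hooks.isNT) {
        // Win2k/XP: 'path' is taken to be UTF-8 and goes to the 'W' API.
        return hooks.openDirW(hooks.utf8ToUtf16(path));
    }
    // Win9x/ME: 'path' is taken to be in the ANSI code page and passed as is.
    return hooks.openDirA(path);
}
```

The compatibility risk is visible in the NT branch: a caller that still
passes an ANSI-encoded 'char *' would have its bytes reinterpreted as UTF-8.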

Anyway, it'd be nice to have NSPR Unicode APIs enabled on Win32 soon so
that we can fix a bunch of bugs on Win32.

Jungshik

Darin Fisher

Aug 2, 2003, 12:38:57 AM

to Jungshik Shin

Another alternative for Mozilla is to "fake" the native charset to be
UTF-8, and then under the hood in NSPR we can make a conversion from
UTF-8 to UTF-16 (WinNT) or the native charset (Win9x). This solution
would involve no API changes. We'd just need to have nsIPlatformCharset
and NSPR agree that the platform charset is actually (albeit a lie) UTF-8.

Thoughts?
Darin

Jungshik Shin

Aug 3, 2003, 8:03:24 AM

to Darin Fisher

Darin Fisher wrote:
> Jungshik Shin wrote:
>
>> Wan-Teh Chang wrote:
>>
>>> I am still debating whether the NSPR Unicode APIs
>>> should use UTF16 or UTF8. I am inclined towards
>>> using UTF8 because NSPR doesn't have any string
>>> functions that operate on UTF16 strings.
>>
>> Indeed, using UTF8 for NSPR Unicode APIs may be a good idea. For
>> instance,

......

>> An alternative for Win32 (on POSIX systems, at least, we don't need to
>> worry about Unicode APIs) is to make the existing 'char'-based APIs
>> behave differently based on run-time detection of the OS (Win 9x/ME vs
>> 2k/XP). On Win9x/ME, a 'char *' is treated as an 'ANSI' string and
>> passed to the Win32 'A' APIs, while on Win2k/XP a 'char *' is treated
>> as a UTF-8 string and passed to the 'W' APIs after converting to
>> UTF-16.
>>
>> However, this alternative seems 'risky' in that it can break
>> backward compatibility, which is why I didn't mention it in my first
>> posting.

> Another alternative for Mozilla is to "fake" the native charset to be
> UTF-8, and then under the hood in NSPR we can make a conversion from
> UTF-8 to UTF-16 (WinNT) or the native charset (Win9x). This solution
> would involve no API changes. We'd just need to have nsIPlatformCharset
> and NSPR agree that the platform charset is actually (albeit a lie) UTF-8.

Actually, this is more or less like my alternative (plus
nsIPlatformCharset), isn't it? Making nsIPlatformCharset return
'UTF-8' on NT may cut down the amount of work needed with my alternative
because we don't have to fix every place where NSPR file I/O is used. On
the other hand, it can break quite a few things because the platform
charset is supposed to provide the default value for several things (as a
fallback). Another downside of this approach (and of my alternative) is the
duplication of code to detect the OS. The OS detection code is already
duplicated in a few places throughout the tree. I want to consolidate
it all into a 'helper' function, nsWindowsAPI::IsNT() (that works more
or less like nsMemory::Alloc()/Free()). If we take your alternative (or
mine), we need the OS detection code in NSPR as well.
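
For illustration, a consolidated helper along those lines might look like the
sketch below. Only the name nsWindowsAPI::IsNT() comes from this message; the
class layout, the cached result, and the use of GetVersionExA are assumptions
of the sketch. On non-Windows builds it simply reports false so the code
still compiles.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Sketch of a single shared OS-detection helper. The body is a guess at
// what such a helper could do, not code from the Mozilla tree.
class nsWindowsAPI {
public:
    // True on the NT line (NT/2k/XP), false on Win9x/ME and non-Windows.
    static bool IsNT() {
        static const bool isNT = Detect();  // queried once, then cached
        return isNT;
    }

private:
    static bool Detect() {
#ifdef _WIN32
        OSVERSIONINFOA info;
        ZeroMemory(&info, sizeof(info));
        info.dwOSVersionInfoSize = sizeof(info);
        // GetVersionExA is deprecated in current SDKs but was the usual
        // way to tell the 9x and NT families apart.
        return GetVersionExA(&info) &&
               info.dwPlatformId == VER_PLATFORM_WIN32_NT;
#else
        return false;
#endif
    }
};
```

Callers would then write `if (nsWindowsAPI::IsNT()) { ... }` instead of
repeating the version query at each site.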

Another bit of 'bloat' is a separate UTF-8 <-> UTF-16 converter in NSPR.
Again, I'd like to avoid this duplication.

In conclusion, I'm inclined toward a separate set of UTF-8 (or UTF-16) NSPR
APIs. This should be all right as long as we don't have too many places
where NSPR APIs are used directly instead of xpcom file I/O.