
Re: Zip file while Chinese or Japanese Characters.


Jon Skeet [C# MVP]

Jun 9, 2005, 1:40:59 AM
Shrek <Sh...@discussions.microsoft.com> wrote:
> I wrote a C# class which uses the J# classes
> java.io.FileOutputStream, java.util.zip.ZipInputStream and so on to operate
> on zip files.
>
> For example:
> When I add a file '1.txt' to 'hah.zip', everything goes well.
> But if I add a file named '你好.txt', whose name contains Chinese or Japanese
> characters, the zip file is corrupted.
>
> How could I solve the problem?

I believe (although only on the basis of overhearing the conversations
of colleagues) that the zip format doesn't actually specify the
character encoding used for filenames, unfortunately. I think there's
only 8 bits per character though, which makes it impossible to reliably
get proper internationalisation :(
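
As a minimal sketch of why an 8-bit filename field is a problem: the same name turns into different bytes depending on the charset chosen, and a single-byte charset that lacks the characters loses them. The class and method names below are made up for illustration, and this uses plain Java strings rather than any zip library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class NameEncodingDemo {
    // Encode a name to bytes with the given charset, then decode it back.
    static String roundTrip(String name, Charset cs) {
        return new String(name.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // "\u4f60\u597d" is the two-character Chinese greeting "ni hao".
        String name = "\u4f60\u597d.txt";

        // UTF-8 needs 3 bytes for each Chinese character plus 4 for ".txt".
        System.out.println(name.getBytes(StandardCharsets.UTF_8).length); // 10

        // ISO-8859-1 has no mapping for them, so each one becomes '?'.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1)); // ??.txt
    }
}
```

Whoever reads the archive has to know (or guess) which charset produced the bytes, and that is the part the format leaves open.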

--
Jon Skeet - <sk...@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Shrek

Jun 9, 2005, 1:52:01 AM
Thx, Jon.

But when I use WinZip or WinRAR to add a file with a Chinese name to the
zip file, it works just fine.

How do they do it?

Thx.

Jon Skeet [C# MVP]

Jun 9, 2005, 2:21:40 AM
Shrek <Sh...@discussions.microsoft.com> wrote:
> But when I use WinZip or WinRAR to add a file with a Chinese name to the
> zip file, it works just fine.
>
> How do they do it?

I suspect they're either using proprietary extensions to the zip file
format, or they're making a "best guess" at the encoding based on which
characters are in the filenames, unfortunately :(
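
One possible "best guess" can be sketched as follows: try to decode the raw name bytes strictly as UTF-8, and fall back to a legacy charset if they are not valid UTF-8. This is only an illustration of the idea. It is not a description of what WinZip or WinRAR actually do, and the class name, method name and choice of fallback are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class NameGuesser {
    // Decode as strict UTF-8 if possible, otherwise use the fallback charset.
    static String guessName(byte[] raw, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the caller's legacy code page.
            return new String(raw, fallback);
        }
    }
}
```

A guess like this can still be wrong: some byte sequences are valid in more than one charset, so the fallback has to come from somewhere else, such as the user's locale.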

Jacky Kwok

Jun 9, 2005, 2:34:14 AM
Shrek wrote:
> HELP!
>
> I wrote a C# class which uses the J# classes
> java.io.FileOutputStream, java.util.zip.ZipInputStream and so on to operate
> on zip files.
>
> For example:
> When I add a file '1.txt' to 'hah.zip', everything goes well.
> But if I add a file named '你好.txt', whose name contains Chinese or Japanese
> characters, the zip file is corrupted.
>
> How could I solve the problem?
>
> Thx in advance.
>

Hi Shrek:

You could try another zip library for .NET, SharpZipLib:
http://www.icsharpcode.net/OpenSource/SharpZipLib/Default.aspx

I use it in my application. It handles Chinese filenames without problems.

However, you need to know that the zip file format does not store
filenames in Unicode. All the filenames stored in the zip file are
ANSI MBCS strings.

Hence, there is a conversion from Unicode (in .NET) to ANSI MBCS characters.
By default, your Windows OS must be set to use the ANSI code page for that
character set to support the conversion.
For example, on an English Windows XP OS, I need to set the "Language
for non-Unicode programs" to Chinese (Taiwan) to support Traditional
Chinese (Big5) filenames.
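
The conversion described above can be made explicit instead of depending on the OS setting. The sketch below uses standard Java's java.util.zip, not SharpZipLib or J#: the constructors that take a Charset were added in Java 7 and may not exist in the J# class library. The class and method names are made up for illustration, and UTF-8 stands in for the code page; something like Charset.forName("Big5") could be passed instead where that charset is installed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only.
class ZipNameRoundTrip {
    // Write a one-entry archive, encoding the entry name with `cs`.
    static byte[] writeZip(String entryName, byte[] content, Charset cs) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buf, cs)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Read the first entry's name back, decoding it with `cs`.
    static String firstEntryName(byte[] archive, Charset cs) {
        try (ZipInputStream zip =
                 new ZipInputStream(new ByteArrayInputStream(archive), cs)) {
            ZipEntry entry = zip.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Charset cs = StandardCharsets.UTF_8;   // assumption: stand-in for the code page
        String name = "\u4f60\u597d.txt";      // the Chinese filename from the question
        byte[] archive = writeZip(name, "hello".getBytes(StandardCharsets.US_ASCII), cs);
        System.out.println(name.equals(firstEntryName(archive, cs))); // true
    }
}
```

The name survives only because the writer and the reader agree on the charset, which is the same agreement the Windows code page setting provides implicitly.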


--
Jacky Kwok
jacky@alumni_DOT_cuhk_DOT_edu_DOT_hk
jacky@compose_DOT_com_DOT_hk

Shrek

Jun 10, 2005, 12:55:02 AM
Thx very much!