
Filename encoding on XP


kent sin

Jan 2, 2007, 8:08:27 PM
to pytho...@python.org
What encoding does NTFS use to store filenames?

I have some downloaded files, some with Chinese filenames, and I cannot
back them up to CD because the names are not accepted.

I use walk and print the filenames. Some contain ?, but some
Chinese characters are displayed with no problem. I suspect the
filenames are not encoded in Unicode. How do I find out more about
this?

--
Sin Hang Kin.

"Martin v. Löwis"

Jan 3, 2007, 10:50:36 AM
kent sin wrote:

> What encoding does NTFS use to store filenames?

In UTF-16LE. However, on-disk storage is mostly irrelevant; what
matters is which encoding is used at the OS API.

Windows has two forms of the file API: Wide (Unicode) and ANSI
(byte-oriented). On NT, the Wide API is "native"; the ANSI API
converts strings to and from the system's ANSI code page (introducing
? if the conversion fails).
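
To illustrate the conversion described above, here is a minimal sketch in
Python 3. It only approximates what the ANSI API does: the helper
ansi_round_trip is a hypothetical name, the code pages "cp1252" and "gbk"
are stand-ins for whatever ANSI code page a given system uses, and
Windows' own conversion may substitute characters differently from
Python's errors="replace".

```python
def ansi_round_trip(name, code_page):
    """Approximate an ANSI API round trip: Unicode -> bytes -> Unicode.

    Hypothetical helper for illustration only. Characters missing from
    the code page are replaced with "?", as in the question above.
    """
    as_bytes = name.encode(code_page, errors="replace")
    return as_bytes.decode(code_page)


name = "中文.txt"

# UTF-16LE uses two bytes for each of these six characters.
on_disk = name.encode("utf-16-le")

# A Western European code page has no Chinese characters,
# so both of them come back as "?".
western = ansi_round_trip(name, "cp1252")

# A Simplified Chinese code page can represent the whole name,
# so it survives the round trip unchanged.
chinese = ansi_round_trip(name, "gbk")
```

This would also explain why only some names show ?: a name survives only
if every character in it exists in the code page in use.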

> I use walk and print the filenames. Some contain ?, but some
> Chinese characters are displayed with no problem. I suspect the
> filenames are not encoded in Unicode.

You need to pass a Unicode string to walk as the directory; all
results, at every level of recursion, should then also be Unicode
strings.
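
A minimal sketch of this behaviour, assuming Python 3, where str is
already a Unicode string (in Python 2 the directory would need to be
written as a unicode literal such as u"..."). The helper walk_names and
the temporary directory are made up for the example; the point is only
that os.walk returns names in the same string type it was given.

```python
import os
import tempfile


def walk_names(top):
    """Collect every file name under top, in the same string type as top."""
    names = []
    for dirpath, dirnames, filenames in os.walk(top):
        names.extend(filenames)
    return names


with tempfile.TemporaryDirectory() as tmp:
    # Create one file with a Chinese name inside a scratch directory.
    open(os.path.join(tmp, "中文.txt"), "w").close()

    # Unicode (str) directory in, Unicode names out.
    text_names = walk_names(tmp)

    # Byte-string directory in, byte-string names out.
    byte_names = walk_names(os.fsencode(tmp))
```

Here text_names holds str objects and byte_names holds bytes objects, so
passing a Unicode directory is what keeps the whole traversal in Unicode.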

Regards,
Martin

