[ruby-core:19304] 1.9, encoding & win32 wide char support


Lloyd Hilaiel

Oct 12, 2008, 4:57:58 PM
to ruby...@ruby-lang.org, Lloyd Hilaiel
hello,

I've recently hacked 1.8.7 on win32 to use Wide char CRT & system
calls. The general strategy is to assume that all strings are
represented in utf8 (as on unix platforms) and to convert to/from
"wide" using MultiByteToWideChar at the lowest possible level.

In general, this has meant that the win32 wrappers that already exist
in win32.c are used everywhere, and we're using the win32/crt calls
that accept wchar_t.
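
As a rough illustration of the conversion step described above, here is a minimal Python sketch of what a UTF-8 to wide-string boundary does to the bytes. It is not the patch itself: the real work is done in C with MultiByteToWideChar, and the helper names `utf8_to_wide` and `wide_to_utf8` as well as the sample path are hypothetical, chosen only for this example.

```python
def utf8_to_wide(path_utf8: bytes) -> bytes:
    """Return the path as NUL-terminated UTF-16LE, the byte layout of a
    Windows wchar_t string. Raises UnicodeDecodeError on invalid UTF-8."""
    text = path_utf8.decode("utf-8")
    return text.encode("utf-16-le") + b"\x00\x00"


def wide_to_utf8(path_wide: bytes) -> bytes:
    """Inverse conversion: strip the NUL terminator and re-encode as UTF-8."""
    if len(path_wide) % 2 != 0 or not path_wide.endswith(b"\x00\x00"):
        raise ValueError("expected NUL-terminated UTF-16LE data")
    return path_wide[:-2].decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    # Hypothetical user-scoped install path with a non-English login name.
    home = "C:\\Users\\山田\\ruby\\lib".encode("utf-8")
    wide = utf8_to_wide(home)
    print(len(home), "UTF-8 bytes ->", len(wide) // 2, "UTF-16 code units (incl. NUL)")
    assert wide_to_utf8(wide) == home  # the round trip is lossless
```

The point of converting at the lowest level is that everything above the wrapper keeps seeing UTF-8 bytes, so only the wrapper needs to know about wide strings.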

After completing this work I took a peek at trunk, and noticed that it
still looks like the ANSI ("A") versions of these calls are in use.

I was wondering if there's any interest in this patch, and further
what the strategy is on win32 to allow correct handling of paths with
non-english characters?

This feature is especially important for our product because we're
embedding ruby and installing the standard library in a user-scoped
path. Given a non-english login name on vista or xp, most paths will
have non-english characters. If there is general interest, I'd love
to contribute some time.

best,
lloyd


Yukihiro Matsumoto

Oct 12, 2008, 5:59:13 PM
to ruby...@ruby-lang.org
Hi,

In message "Re: [ruby-core:19304] 1.9, encoding & win32 wide char support"
on Mon, 13 Oct 2008 05:57:58 +0900, Lloyd Hilaiel <ll...@hilaiel.com> writes:

|I was wondering if there's any interest in this patch, and further
|what the strategy is on win32 to allow correct handling of paths with
|non-english characters?

It's something we should do before the release, so we are VERY
interested in the patch.

|This feature is especially important for our product because we're
|embedding ruby and installing the standard library in a user-scoped
|path. Given a non-english login name on vista or xp, most paths will
|have non-english characters. If there is general interest, I'd love
|to contribute some time.

I have to ask Win32 maintainers first, but I believe they will
appreciate your contribution.

matz.

Bill Kelly

Oct 12, 2008, 7:49:09 PM
to ruby...@ruby-lang.org

From: "Lloyd Hilaiel" <ll...@hilaiel.com>

>
> I was wondering if there's any interest in this patch, and further
> what the strategy is on win32 to allow correct handling of paths with
> non-english characters?

VERY interested!!! :D

win32 unicode filename support came up again on ruby-talk
last month:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/313943


It would be wonderful to have an in-core solution.


You Rock,

Bill

Lloyd Hilaiel

Oct 13, 2008, 5:05:02 PM
to ruby...@ruby-lang.org, Lloyd Hilaiel
Hi Matz & Bill,

Great! I'm glad this work has the potential of being useful. I'll
clean up the patch a bit and post it. One thing that I haven't done
is to integrate my ad-hoc tests which exercise the creation and
manipulation of files under non-english pathing.

If anyone has a complete set of tests focused at non-english paths...
throw it my way.

I should be able to clean things up and post in the next couple
weeks... I'll hurry.

very best,
lloyd
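
For readers wondering what a test of "creation and manipulation of files under non-english pathing" might look like, here is a small Python sketch. It is not Lloyd's test suite, which was not posted in this thread; the function name `exercise_non_ascii_path` and the directory and file names are made up for illustration.

```python
import os
import tempfile


def exercise_non_ascii_path(base_dir, dir_name="目录", file_name="文件.txt"):
    """Create, list, read, and rename a file under a non-ASCII directory.

    Returns (directory listing, file content, whether the renamed file exists).
    """
    target_dir = os.path.join(base_dir, dir_name)
    os.mkdir(target_dir)

    path = os.path.join(target_dir, file_name)
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")

    listed = os.listdir(target_dir)          # names must come back unmangled
    with open(path, encoding="utf-8") as f:
        content = f.read()

    renamed = os.path.join(target_dir, "renamed-" + file_name)
    os.rename(path, renamed)
    return listed, content, os.path.exists(renamed)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(exercise_non_ascii_path(tmp))
```

A fuller suite would presumably also cover directory changes, globbing, and loading libraries from such a path, since those are the operations an embedded interpreter installed under a user profile depends on.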

Bill Kelly

Nov 25, 2008, 10:19:09 PM
to ruby...@ruby-lang.org

From: "Lloyd Hilaiel" <ll...@hilaiel.com>

>
> On Oct 12, 2008, at 3:59 PM, Yukihiro Matsumoto wrote:
>>
>> on Mon, 13 Oct 2008 05:57:58 +0900, Lloyd Hilaiel <ll...@hilaiel.com>
>> writes:
>>
>> |I was wondering if there's any interest in this patch, and further
>> |what the strategy is on win32 to allow correct handling of paths with
>> |non-english characters?
>>
>> It's something we should do before the release, so we are VERY
>> interested in the patch.
>
> Great! I'm glad this work has the potential of being useful. I'll
> clean up the patch a bit and post it. One thing that I haven't done
> is to integrate my ad-hoc tests which exercise the creation and
> manipulation of files under non-english pathing.
>
> If anyone has a complete set of tests focused at non-english paths...
> throw it my way.
>
> I should be able to clean things up and post in the next couple
> weeks... I'll hurry.

Hi,

Does anyone have information as to the current status of
adding Unicode-savvy path handling to 1.9 ruby?

It sounded like Lloyd may have most of the work already done.

I would be happy to help get this integrated into 1.9.

(I had inquired about the status about a month ago, on
October 28, but it appears my post never reached the list.)


Regards,

Bill

Bill Kelly

Nov 25, 2008, 10:26:53 PM
to ruby...@ruby-lang.org

From: "Bill Kelly" <bi...@cts.com>

>
> Does anyone have information as to the current status of
> adding Unicode-savvy path handling to 1.9 ruby?

Ugh. Sorry, I mean of course: Unicode-savvy path handling
on *win32* ruby 1.9.


Regards,

Bill

Yukihiro Matsumoto

Nov 25, 2008, 10:33:24 PM
to ruby...@ruby-lang.org
Hi,

In message "Re: [ruby-core:20109] Re: 1.9, encoding & win32 wide char support"
on Wed, 26 Nov 2008 12:26:53 +0900, "Bill Kelly" <bi...@cts.com> writes:

|> Does anyone have information as to the current status of
|> adding Unicode-savvy path handling to 1.9 ruby?
|
|Ugh. Sorry, I mean of course: Unicode-savvy path handling
|on *win32* ruby 1.9.

Every path encoding is UTF-8 and converted to UTF-16 internally. If
anything still uses the *A functions, it will eventually be replaced
by the *W functions. In short, if you're using UTF-8 for your program
encoding, you should not see any problems (if you do, it's a bug).
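
To make the *A versus *W distinction concrete, here is a rough Python model of why a code-page-based call cannot represent every file name while a UTF-16 call can. It only models the encodings involved and does not call the Windows API; the function names are hypothetical, and the real *A functions may substitute a replacement character rather than fail outright.

```python
def survives_code_page(name: str, code_page: str) -> bool:
    """True if `name` can be expressed in the given ANSI code page and
    comes back unchanged, as a *A-style call would require."""
    try:
        return name.encode(code_page).decode(code_page) == name
    except UnicodeEncodeError:
        return False


def survives_utf16(name: str) -> bool:
    """True if `name` round-trips through UTF-16, as a *W-style call uses."""
    return name.encode("utf-16-le").decode("utf-16-le") == name


if __name__ == "__main__":
    for name in ("José", "日本語"):
        print(name,
              "cp1252:", survives_code_page(name, "cp1252"),
              "utf-16:", survives_utf16(name))
```

On a machine whose ANSI code page is Western European (cp1252), the Japanese name has no representation at all, which is the class of failure the move to *W functions removes.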

matz.

Bill Kelly

Nov 25, 2008, 10:46:17 PM
to ruby...@ruby-lang.org
Hi matz,

Awesome. This is excellent news!

Thanks to everyone involved!


Regards,

Bill
