Fenix to conform latest requirements from Bug #6836

20 views
Skip to first unread message

Luis Lavena

unread,
Aug 18, 2012, 10:31:17 AM8/18/12
to theco...@googlegroups.com
Hello folks,

Resuming the conversation around faster File.expand_path for require
without breaking WEBrick...
http://bugs.ruby-lang.org/issues/6836

Latest comments from Usa point out to keep current File.expand_path
behavior but remove the path expansion from require and such.

I decided to update Fenix code to follow those requirements and put
again code from Hiroshi that expanded path:

https://gist.github.com/1734954

And updated the repository with the changes:

https://github.com/luislavena/fenix

Now File.expand_path solves short to long paths properly, but I
noticed some slowdowns compared to Ruby itself:

https://gist.github.com/3387055

Fenix ''
Fenix '.'
Fenix '', 'C:/'
Fenix '~'
Fenix '~', 'C:/Foo'

The two benchs where dir is not nil should have been faster than Ruby,
yet thanks to short-to-long conversion they became slower.

Any suggestion around that?

Thank you in advance for your time.
--
Luis Lavena
AREA 17
-
Perfection in design is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.
Antoine de Saint-Exupéry

Hiroshi Shirosaki

unread,
Aug 19, 2012, 12:57:54 PM8/19/12
to theco...@googlegroups.com
On Sat, Aug 18, 2012 at 11:31 PM, Luis Lavena <luisl...@gmail.com> wrote:
> Hello folks,
>
> Resuming the conversation around faster File.expand_path for require
> without breaking WEBrick...
> http://bugs.ruby-lang.org/issues/6836
>
> Latest comments from Usa point out to keep current File.expand_path
> behavior but remove the path expansion from require and such.
>
> I decided to update Fenix code to follow those requirements and put
> again code from Hiroshi that expanded path:
>
> https://gist.github.com/1734954
>
> And updated the repository with the changes:
>
> https://github.com/luislavena/fenix
>
> Now File.expand_path solves short to long paths properly, but I
> noticed some slowdowns compared to Ruby itself:
>
> https://gist.github.com/3387055
>
> Fenix ''
> Fenix '.'
> Fenix '', 'C:/'
> Fenix '~'
> Fenix '~', 'C:/Foo'
>

Hello Luis,

I looked commits. Nice work!

Above slowdown would be cause of calling GetFileAttributesW().

https://github.com/luislavena/fenix/blob/master/ext/fenix/file.c#L318

This makes performance slowdown if the path *exists*. All above cases
are existing path.
I guess performance will be same level as ruby if you remove
GetFileAttributesW().
Removing GetFileAttributesW() might be better.

See also:
https://github.com/luislavena/fenix/pull/2


> Fenix ''
> Fenix '.'
> Fenix '', 'C:/'

About above three cases, we could skip to call FindFirstFileW()
because we can expect they are already long name after
GetFullPathNameW().


> Fenix '~'
> Fenix '~', 'C:/Foo'

About above two cases, it depends on environment variable setting if
the path is short name or long name.


--
Hiroshi Shirosaki

Luis Lavena

unread,
Aug 19, 2012, 3:42:07 PM8/19/12
to theco...@googlegroups.com
On Sun, Aug 19, 2012 at 1:57 PM, Hiroshi Shirosaki
<h.shi...@gmail.com> wrote:
>
> Hello Luis,
>
> I looked commits. Nice work!
>
> Above slowdown would be cause of calling GetFileAttributesW().
>
> https://github.com/luislavena/fenix/blob/master/ext/fenix/file.c#L318
>
> This makes performance slowdown if the path *exists*. All above cases
> are existing path.
> I guess performance will be same level as ruby if you remove
> GetFileAttributesW().
> Removing GetFileAttributesW() might be better.
>
> See also:
> https://github.com/luislavena/fenix/pull/2
>

Oh my, it has been 6 months since we discussed this, sorry I
completely forgot all those details.

>
>> Fenix ''
>> Fenix '.'
>> Fenix '', 'C:/'
>
> About above three cases, we could skip to call FindFirstFileW()
> because we can expect they are already long name after
> GetFullPathNameW().
>

So we could say that if path is '' or '.' then long_name is not required?

>
>> Fenix '~'
>> Fenix '~', 'C:/Foo'
>
> About above two cases, it depends on environment variable setting if
> the path is short name or long name.
>

Understood, thank you for taking the time and look into this.

Going to work later today and tomorrow on a final path for Ruby-Core
and update my benchmarks.

Don't hesitate in let me know your thoughts or any other consideration on this.

Thank you!

Hiroshi Shirosaki

unread,
Aug 19, 2012, 6:09:52 PM8/19/12
to theco...@googlegroups.com
On Mon, Aug 20, 2012 at 4:42 AM, Luis Lavena <luisl...@gmail.com> wrote:
>>> Fenix ''
>>> Fenix '.'
>>> Fenix '', 'C:/'
>>
>> About above three cases, we could skip to call FindFirstFileW()
>> because we can expect they are already long name after
>> GetFullPathNameW().
>>
>
> So we could say that if path is '' or '.' then long_name is not required?
>

Yes.
If path is '', '.', '..' (not contains file name) or 'C:/' (root
directory), long_name is not required.


--
Hiroshi Shirosaki

Hiroshi Shirosaki

unread,
Aug 19, 2012, 8:49:54 PM8/19/12
to theco...@googlegroups.com
I'm sorry, above is not true. GetFullPathNameW() returns short name if
`cd` short name directory as below.

C:\>cd PROGRA~1

C:\PROGRA~1>ruby -ve "p File.expand_path('.')"
ruby 1.9.3p0 (2011-10-30) [i386-mingw32]
"C:/Program Files"

C:\PROGRA~1>pik tcs

C:\PROGRA~1>ruby -ve "p File.expand_path('.')"
tcs-ruby 1.9.3p179 (2012-04-11, TCS patched 2012-04-11) [i386-mingw32]
"C:/PROGRA~1"


--
Hiroshi Shirosaki

Luis Lavena

unread,
Aug 20, 2012, 5:56:26 PM8/20/12
to theco...@googlegroups.com
On Sun, Aug 19, 2012 at 9:49 PM, Hiroshi Shirosaki
<h.shi...@gmail.com> wrote:
>>
>> Yes.
>> If path is '', '.', '..' (not contains file name) or 'C:/' (root
>> directory), long_name is not required.
>>
>
> I'm sorry, above is not true. GetFullPathNameW() returns short name if
> `cd` short name directory as below.
>

Yeah, was going to respond that after testing this it didn't work as expected.

Not going to remove expansion in those cases.

I'll work on modify the patch based on latest code changes and run benchs again.

Thank you Hiroshi.

Hiroshi Shirosaki

unread,
Aug 21, 2012, 3:04:19 AM8/21/12
to theco...@googlegroups.com
Hello,

I noticed one issue about encoding.
Inspired by Luis's comment.
https://github.com/rubygems/rubygems/issues/357#issuecomment-6986579


File.expand_path("~").encoding.must_equal Encoding.find("filesystem")

From my tests, it seems ruby 1.9.3 and 2.0 have above spec.
The following line exists in rb_home_dir().

enc = rb_filesystem_encoding();
https://github.com/ruby/ruby/blob/trunk/file.c#L2848

I pushed the change to fenix. Please take a look.
https://github.com/luislavena/fenix/commit/556942239a638d7e55032e059adb840a4bf4ec95


--
Hiroshi Shirosaki

Luis Lavena

unread,
Aug 22, 2012, 10:06:39 AM8/22/12
to theco...@googlegroups.com
On Tue, Aug 21, 2012 at 4:04 AM, Hiroshi Shirosaki
<h.shi...@gmail.com> wrote:
> Hello,
>
> I noticed one issue about encoding.
> Inspired by Luis's comment.
>
> I pushed the change to fenix. Please take a look.
> https://github.com/luislavena/fenix/commit/556942239a638d7e55032e059adb840a4bf4ec95
>

Thanks, that worked as expected and fixed the encoding issue!

Now, I've created a patch against trunk:

https://gist.github.com/3425888

This time including long_name expansion, as commented in the Ruby ticket:
http://bugs.ruby-lang.org/issues/6836#note-11

But I found that require is failing:

C:\Users\Luis\Code\ruby\ruby\build32>miniruby.exe -I. -rrbconfig -e "puts :ok"
-e:1:in `require': method `to_path' called on hidden T_STRING object
(0x2ab6d48 flags=0x2c005 klass=0x0) (NotImplementedError)

Hiroshi, maybe you can take a look and tell me what I missed?

The only difference is that I removed "file_path_convert" from
coerce_to_path as was a no-op for Windows.

Thank you in advance,

Hiroshi Shirosaki

unread,
Aug 22, 2012, 11:47:17 AM8/22/12
to theco...@googlegroups.com
On Wed, Aug 22, 2012 at 11:06 PM, Luis Lavena <luisl...@gmail.com> wrote:
> Now, I've created a patch against trunk:
>
> https://gist.github.com/3425888
>
> This time including long_name expansion, as commented in the Ruby ticket:
> http://bugs.ruby-lang.org/issues/6836#note-11
>
> But I found that require is failing:
>
> C:\Users\Luis\Code\ruby\ruby\build32>miniruby.exe -I. -rrbconfig -e "puts :ok"
> -e:1:in `require': method `to_path' called on hidden T_STRING object
> (0x2ab6d48 flags=0x2c005 klass=0x0) (NotImplementedError)
>
> Hiroshi, maybe you can take a look and tell me what I missed?
>
> The only difference is that I removed "file_path_convert" from
> coerce_to_path as was a no-op for Windows.
>

Last part of expand_path needs some tweaks to merge into ruby.
https://gist.github.com/3425888#L753

result = rb_enc_str_new(fullpath, size, path_encoding);

This line doen't work. We need to push expanded string into
`result`(String Object).

Please see previous patch.
https://gist.github.com/3242245#L911


Would it be better that declaration of `void rb_w32_init_file(void)`
is in win32.h?
https://gist.github.com/3242245#L112


Thank you,
--
Hiroshi Shirosaki

Luis Lavena

unread,
Aug 22, 2012, 11:51:49 AM8/22/12
to theco...@googlegroups.com
On Wed, Aug 22, 2012 at 12:47 PM, Hiroshi Shirosaki
<h.shi...@gmail.com> wrote:
>
> Last part of expand_path needs some tweaks to merge into ruby.
> https://gist.github.com/3425888#L753
>
> result = rb_enc_str_new(fullpath, size, path_encoding);
>
> This line doen't work. We need to push expanded string into
> `result`(String Object).
>
> Please see previous patch.
> https://gist.github.com/3242245#L911
>

Ah, will take a look again.

>
> Would it be better that declaration of `void rb_w32_init_file(void)`
> is in win32.h?
> https://gist.github.com/3242245#L112
>

I wanted to avoid the function getting exported in the DLL, so moved
to internal.h but perhaps you're right.

Going to take a look again and update the patch.

Luis Lavena

unread,
Aug 22, 2012, 12:17:30 PM8/22/12
to theco...@googlegroups.com
On Wed, Aug 22, 2012 at 12:51 PM, Luis Lavena <luisl...@gmail.com> wrote:
> On Wed, Aug 22, 2012 at 12:47 PM, Hiroshi Shirosaki
> <h.shi...@gmail.com> wrote:
>>
>> Last part of expand_path needs some tweaks to merge into ruby.
>> https://gist.github.com/3425888#L753
>>
>> result = rb_enc_str_new(fullpath, size, path_encoding);
>>
>> This line doen't work. We need to push expanded string into
>> `result`(String Object).
>>
>> Please see previous patch.
>> https://gist.github.com/3242245#L911
>>
>
> Ah, will take a look again.
>

Updated my patch following your comments, however the issue with
"to_path" being called on a private string seems to be caused by
coerce_to_path, seems that rb_find_file* functions are the cause of
this.

Also, I think rb_get_expanded_load_path will need to be patched too,
but maybe not by using rb_file_expand_path_internal but a different
one that hides abs_mode and long_name along the buffer.

I'm a bit sick at home and will try work around this, but any help is
appreciated.

Thank you.

Luis Lavena

unread,
Aug 23, 2012, 12:48:34 AM8/23/12
to theco...@googlegroups.com
On Wed, Aug 22, 2012 at 1:17 PM, Luis Lavena <luisl...@gmail.com> wrote:
>
> Updated my patch following your comments, however the issue with
> "to_path" being called on a private string seems to be caused by
> coerce_to_path, seems that rb_find_file* functions are the cause of
> this.
>

I figured out the issue around this, I was coercing the path inside
rb_file_expand_path_internal and I shouldn't

I've updated the ticket, the gist and the patch:

https://bugs.ruby-lang.org/issues/6836
https://gist.github.com/3242245
https://github.com/luislavena/ruby/compare/improve-require-and-file-expand_path

Once again thank your for your help,

Jon

unread,
Aug 23, 2012, 12:00:47 PM8/23/12
to theco...@googlegroups.com
> > Updated my patch following your comments, however the issue with
> > "to_path" being called on a private string seems to be caused by
> > coerce_to_path, seems that rb_find_file* functions are the cause of
> > this.
> >
>
> I figured out the issue around this, I was coercing the path inside
> rb_file_expand_path_internal and I shouldn't
>
> I've updated the ticket, the gist and the patch:
>
> https://bugs.ruby-lang.org/issues/6836
> https://gist.github.com/3242245
> https://github.com/luislavena/ruby/compare/improve-require-and-file-expand_path


While I've been following your progress on this (I need to find better words than awesome or fabulous), I'd planned to use the older `win-file/ruby_1_9_3` codes for the next tcs-ruby 1.9.3 release.

But, given the latest status and Usa approving the commit to trunk, we should use your and Hiroshi's latest codes don't you think?

So let's do this:

1) I'll update our `ruby_1_9_3` branch and rebase all the `*/ruby_1_9_3` branches to ruby_1_9_3@36792
2) Once you see (1) complete, you or Hiroshi create a new `win-file-trunk/ruby_1_9_3` branch with the
new codes based on our `ruby_1_9_3` branch.
3) When Yui Naruse releases the next official 1.9.3 patchlevel, I'll do a final rebase and a tcs-ruby release.

Oh, and did you notice that http://bugs.ruby-lang.org/issues/6846 (Hiroshi and Dusan's stat patch) has just been officially backported to 1.9.3?

:) :) :) :) :)

Jon

---
Fail fast. Fail often. Fail publicly. Learn. Adapt. Repeat.
http://thecodeshop.github.com | http://jonforums.github.com/
twitter: @jonforums

Hiroshi Shirosaki

unread,
Aug 23, 2012, 4:48:34 PM8/23/12
to theco...@googlegroups.com
On Thu, Aug 23, 2012 at 1:48 PM, Luis Lavena <luisl...@gmail.com> wrote:
>
> I figured out the issue around this, I was coercing the path inside
> rb_file_expand_path_internal and I shouldn't

Indeed I changed that with previous patch. I missed that.
I looked approved to commit. Thank you for great work!


I noticed some small points. Please consider.


* Encoding with ascii8bit and us-ascii part is different from
previous, though I'm not completely sure which is better.

https://github.com/luislavena/ruby/compare/improve-require-and-file-expand_path#L6R209

https://github.com/thecodeshop/ruby/commit/bf6a82a551f6ee9aba58d85342d91ff03dffcb53#L7R232


* code format

else statement format is different from ruby code. We need to insert
newline just before `else`.

`//` style comment remains.
https://github.com/luislavena/ruby/compare/improve-require-and-file-expand_path#L6R535


* There are some specs which are not included in latest patch.

https://github.com/luislavena/fenix/blob/master/spec/fenix/file_spec.rb#L74
https://github.com/luislavena/fenix/blob/master/spec/fenix/file_spec.rb#L300


--
Hiroshi Shirosaki

Luis Lavena

unread,
Aug 23, 2012, 5:49:07 PM8/23/12
to theco...@googlegroups.com
On Thu, Aug 23, 2012 at 5:48 PM, Hiroshi Shirosaki
<h.shi...@gmail.com> wrote:
> On Thu, Aug 23, 2012 at 1:48 PM, Luis Lavena <luisl...@gmail.com> wrote:
>>
>> I figured out the issue around this, I was coercing the path inside
>> rb_file_expand_path_internal and I shouldn't
>
> Indeed I changed that with previous patch. I missed that.
>
>>
>> I've updated the ticket, the gist and the patch:
>>
>> https://bugs.ruby-lang.org/issues/6836
>> https://gist.github.com/3242245
>> https://github.com/luislavena/ruby/compare/improve-require-and-file-expand_path
>>
>
> I looked approved to commit. Thank you for great work!
>
>
> I noticed some small points. Please consider.
>
>
> * Encoding with ascii8bit and us-ascii part is different from
> previous, though I'm not completely sure which is better.
>
> https://github.com/luislavena/ruby/compare/improve-require-and-file-expand_path#L6R209
>
> https://github.com/thecodeshop/ruby/commit/bf6a82a551f6ee9aba58d85342d91ff03dffcb53#L7R232
>

Ahh, which one you prefer?

>
> * code format
>
> else statement format is different from ruby code. We need to insert
> newline just before `else`.
>

Will change that.
Ouch, missed that!
Will add those tests and then commit.

Thank you!

Hiroshi Shirosaki

unread,
Aug 23, 2012, 8:45:08 PM8/23/12
to theco...@googlegroups.com
On Fri, Aug 24, 2012 at 6:49 AM, Luis Lavena <luisl...@gmail.com> wrote:
>>
>> * Encoding with ascii8bit and us-ascii part is different from
>> previous, though I'm not completely sure which is better.
>>
>> https://github.com/luislavena/ruby/compare/improve-require-and-file-expand_path#L6R209
>>
>> https://github.com/thecodeshop/ruby/commit/bf6a82a551f6ee9aba58d85342d91ff03dffcb53#L7R232
>>
>
> Ahh, which one you prefer?
>

ascii-8bit == binary on ruby, which means encoding information is not specified.
But code page is required for MultiByteToWideChar.

CP437 seems modified ascii encoding.
http://en.wikipedia.org/wiki/Code_page_437#Difference_from_ASCII

20127 is plain ascii. So I think 20127 might be suitable not to do too
much conversion.
20127 — US-ASCII The classic US 7 bit character set with no char larger than 127


--
Hiroshi Shirosaki

Luis Lavena

unread,
Aug 23, 2012, 8:57:36 PM8/23/12
to theco...@googlegroups.com
On Thu, Aug 23, 2012 at 9:45 PM, Hiroshi Shirosaki
<h.shi...@gmail.com> wrote:
>
> ascii-8bit == binary on ruby, which means encoding information is not specified.
> But code page is required for MultiByteToWideChar.
>
> CP437 seems modified ascii encoding.
> http://en.wikipedia.org/wiki/Code_page_437#Difference_from_ASCII
>
> 20127 is plain ascii. So I think 20127 might be suitable not to do too
> much conversion.
> 20127 — US-ASCII The classic US 7 bit character set with no char larger than 127
>

Understood, I'm changing the branch then to set 20127 for both. Will
push to my fork in a bit (after compile and tests) and then will start
working on commit this into trunk.

Luis Lavena

unread,
Aug 23, 2012, 11:52:36 PM8/23/12
to theco...@googlegroups.com
On Thu, Aug 23, 2012 at 9:57 PM, Luis Lavena <luisl...@gmail.com> wrote:
>
> Understood, I'm changing the branch then to set 20127 for both. Will
> push to my fork in a bit (after compile and tests) and then will start
> working on commit this into trunk.
>

Merged in trunk:

https://github.com/ruby/ruby/commit/86df08dac6fc4642a6bac7bab6ccd5f44b7a293c

Added .time by mistake, which I got removed one commit after:

https://github.com/ruby/ruby/commit/3d8d595932bef78aa3a175ea2150c88ad3582cba

RubyInstaller CI is building now...

Hiroshi Shirosaki

unread,
Aug 24, 2012, 6:23:31 AM8/24/12
to theco...@googlegroups.com
On Fri, Aug 24, 2012 at 1:00 AM, Jon <jon.f...@gmail.com> wrote:
> 1) I'll update our `ruby_1_9_3` branch and rebase all the `*/ruby_1_9_3` branches to ruby_1_9_3@36792
> 2) Once you see (1) complete, you or Hiroshi create a new `win-file-trunk/ruby_1_9_3` branch with the
> new codes based on our `ruby_1_9_3` branch.
> 3) When Yui Naruse releases the next official 1.9.3 patchlevel, I'll do a final rebase and a tcs-ruby release.
>

I've created win-file-trunk/ruby_1_9_3 branch.
But sorry for misreading your mail. I've also updated win-file/ruby_1_9_3.
Do you need original win-file/ruby_1_9_3?

https://github.com/thecodeshop/ruby/commits/win-file-trunk/ruby_1_9_3

--
Hiroshi Shirosaki

Hiroshi Shirosaki

unread,
Aug 24, 2012, 7:47:45 AM8/24/12
to theco...@googlegroups.com
I had previous local branch on other box. So I reverted it.

https://github.com/thecodeshop/ruby/commits/win-file/ruby_1_9_3


--
Hiroshi Shirosaki

Jon

unread,
Aug 24, 2012, 1:52:22 PM8/24/12
to theco...@googlegroups.com
Perfect, thank you.

`win-file/ruby_1_9_3` is only useful as a quick fallback if we find something odd in the new `win-file-trunk/ruby_1_9_3` codes.

Sorry for the delay in updating all the branches. I'm in Seattle for a couple weeks on a "working vacation" (??) and will clean things up today or tomorrow.
Reply all
Reply to author
Forward
0 new messages