> Just yesterday I read Luis' article on engineyard:
> , and would love to help out. I noticed the discussion he linked to
> was about a month old, and mainly involved just getting Ruby built, so
> I'm starting this thread as a new direct offshoot from the article.
I think that's a great idea!
> In the article tenderlove mentions trying strace side by side with
> Linux, and the best system call viewer I could find (that actually
> even worked) was API Monitor v2. I have not built Ruby with any debug
> info, nor have I tried any side by side strace run with Linux, but I
> figured I'd give you guys my first pass anyway. This may or may not
> be a red herring :)
Interesting stuff that you found. However, I do recommend that you repost that data as a pastie or gist so that it is more readable. You may have noticed that it was line-wrapped at around 78 characters, so it's really hard to decipher. Putting it into a pastie will preserve the line length and make it easier to read.
Also, please post a link to that API monitor that you found.
> This is part of the problem, but the biggest culprit (in 1.9.2) is
> that it calls expand_path on things it tries to require. It does this
> a lot. Expand_path stats every single part of the path leading up to
> the actual file, which is excruciatingly slow on Windows.
This becomes clear when you run Process Monitor, by the way. I
provided an example of this on both this list and on ruby-core, where
there was something close to but got no responses. Here’s my summary:
For some reason require does an insane amount of I/O. I’ve attached
logs that I generated from Process Monitor. (dtruss/strace will
generate similar, less verbose, results.) Not only does require stat
every directory component leading to the found file, once found, it,
if I understand things correctly, makes sure that specification.rb
actually exists 148 times!
It seems, from some slight debugging, that something funky is going on
in rb_get_expanded_load_path/rb_file_expand_path. These results have
to be cached to minimize I/O.
The awesome part is that 1.9.3 (trunk) already solved this issue, as I
indicated in the article:
"Something to note is that Ruby 1.9.2 really dropped loading
performance, which seems got fixed in 1.9.3 (trunk, not released
Perfection in design is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.
Antoine de Saint-Exupéry
Dne 27.1.2011 18:14, Jason Frey napsal(a):
That is 1.8.7 :-P
Breathe... we need to look at 1.9.3, all the methods should be W ;-)
> The awesome part is that 1.9.3 (trunk) already solved this issue, as I
> indicated in the article:
> "Something to note is that Ruby 1.9.2 really dropped loading
> performance, which seems got fixed in 1.9.3 (trunk, not released
Sorry, I must have missed that paragraph (in my rather quick
read-through). That’s excellent news. Perhaps now I can finally move
on to 1.9.2.
Amazing isn't it when you do something very simple like filter by "Process Name is ruby.exe" in Process Monitor and do:
ruby --disable-gems -ve ""
...hey, what are you doing traversing my "C:\ruby19trunk\lib\ruby\gems\1.9.1\gems" dir with CreateFile, SetBasicInformationFile, etc??
> It seems, from some slight debugging, that something funky is going on
> in rb_get_expanded_load_path/rb_file_expand_path. These results have
> to be cached to minimize I/O.
Ah, here comes the clone the GitHub Ruby repo, patch-o-rama, and "rake ruby19 local=c:\ruby-trunk-git" dance ;)
Awesome findings, and I appreciate your interest in help us!
I've started to look into this but using ruby-trunk as base.
I would suggest we build ruby from source and clone Ruby repository,
so we have a working environment to look at.
1) Have Ruby 1.8.7 installed (ehem, you can download RubyInstaller)
CHECK, you already have that ;-)
2) Clone RubyInstaller repository:
git clone git://github.com/oneclick/rubyinstaller.git rubyinstaller
3) Clone Ruby repository:
git clone git://github.com/ruby/ruby.git ruby
4) Build Ruby using RubyInstaller:
rake ruby19 LOCAL=C:\Path\To\Where\I\Cloned\ruby
5) Get some tea, this is going to take some time ;-)
6) Profile against sandbox\ruby19_mingw\bin\ruby.exe directory
If you used the simple-bench-ruby-io, then you will need to install
isolate into that sandboxed ruby.
Open a new console and add sandbox\ruby19_mingw\bin to the PATH, and
then "gem install isolate --pre"
From there, you can run the profiling and let's see what is the result ;-)
Another trick is that you can:
rake ruby19:dependencies devkit:sh
And inside bash, go into sandbox/ruby19_build
And do: make
If you updated files in Ruby source code, these changes will trigger
new file compilation, then you can "make install" which will target
the sandbox directory we had...
Oh la la.. hope I'm not causing you dizziness!
> It seems that rb_require eventually calls rb_find_file_ext, to
> determine the file extension of the file to require. It does this by
> trying every expected extension in every directory in the $LOAD_PATH,
> using the API call FindFirstFileA (which is technically used for Dir
> globbing), and then if that fails it tries GetFileAttributesA, until
> it finds the file. These two calls seem to be made in winnt_stat
> which is called by rb_w32_stat.
This is part of the problem, but the biggest culprit (in 1.9.2) is
> Incidentally, there were a couple issues that I made it through.
> - I kept getting connection timeouts when downloading things, but just
> rerunning the rake worked perfect.
co> - openssl.org is down...had to manually download from a mirror
Ouch, OpenSSL is on and off to often. Definitely we need to change to
a better mirror.
Can you provide me a link where you got it?
> - After the build, running the simple-bench-ruby-io I'm getting an
> error in lib\ruby\1.9.1\rubygems\format.rb: Uninitialized constant
> Gem::UserInteraction. I just commented out the line and was able to
Yeah, format is not depending anymore on User interaction:
> Ready for the next steps!
I would love get your thoughts after running against trunk.
Specially the excessive lookup you mention on your post, if we can
track it down and reduce it, will be the first win.
For the time being, you can add a "gets" to the start of app.rb so you
can hook the running process with API Monitor and only trace loading
after the whole VM is initialized, that will help us find out what to
optimize outside of the VM startup stuff.
Keep the feedback coming, I'm loving it!
Thank you for your time and eagerly looking forward your next steps email!
I'm not certain if I'm interpreting this data correctly, but it
looks like FindFirstFileA is getting used in some places. I'm
basically just loading rubygems and exiting, using:
ruby 1.9.3dev (2011-01-28 trunk 30684) [i386-mingw32]
I did find a call to FindFirstFile in: file.c:3148: h = FindFirstFile(b, &wfd);
I believe that's part of File#expand_path.
Also, it seems like when FindFirstFileW is used, it does what looks
like a recursive descent down the path, even though we should know the
exact location of the file from the previous FindFirstFileA
Luis and I dug into this a bit this evening. Here is my understanding
of the issue:
When a ruby file is required, first the files is located in the
load_path, (several stats to every dir in the load path until the file
is found). These stats seem necessary, although Dusan Majkic has some
teriffic ideas on how to optimize stat itself.
Once the ruby file is found, it's loaded via a call to
rb_load_internal, which makes a call to rb_realpath_internal.
rb_real_path_internal is used to find the real path of a file. I was
thinking this had to do with relative vs. absolute paths, but Luis
thinks it is more to do with symlinks. rb_real_path_internal makes a
call to rb_real_path_rec, which resolves by doing a stat to every
directory in the path to the file to make sure it's real. If a file
is 10 directories deep, that is 10 stats, and 20 API calls, for a file
whose absolute path we already know.
Here's a gist to show you what I mean. Basically, I fired up IRB and
attached to it with API Monitor V2. I then did a single method call
My hope is that we can short-circuit this (seemingly) unneeded
path-traversal on Windows.
How does it work with hardlinks on Windows? Could it be issue? How often
are hardlinks used on Windows? It seems that they are used more often
than before, eg "Users" and "Documents and Settings" on W7 ...
Ok, the below gist shows my findings. It doesn't seem to be aware of
links on Windows. On my machine, C:\repo is a soft link to
C:\users\gthiesfeld\Work\repo. And "C:\Documents and Settings" is a
hard link to C:\Users. It seems to have the same behavior as
File#expand_path, but goes about it differently.
I tried to line up the stuff I was doing in IRB with what I was seeing
in API Monitor. The lines from API mon starts with ###.
I will say no issue.
realpath_rec, which is used by rb_realpath_internal, traverse
directory by directory and build a hash of each of the elements
This is to avoid possible infinite recursion.
Now, realpath_rec traverse each part of the path requested and stat
it. It will only resolve links if readlink() is present (#ifdef
That is the only recursive part that resolve symlinks, and only if
readlink() is present, which is not under Windows.
So, no matter if you have a Junction, Hardlink or Symlink, the results
are the same:
ruby 1.9.3dev (2011-01-30 trunk 30723) [i386-mingw32]
realpath_rec do not resolve symlinks on Windows, IMO, and it will not
avoid infinite recursion due a bad symlink/hardlink/junction point.
So, to Gordon's point:
realpath_rec needs an optimized version for Windows that do not
require stat every single directory, since stat information is not
cached for already stated directories.
To Dusan's point:
winnt_stat needs to introduce newer Windows API to the cost of Windows
2000 or greater, dropping support for previous versions.
IF NOT "%~f0" == "~f0" GOTO :WinNT
ECHO.This version of Ruby has not been built with support for Windows 95/98/Me.
rubyinstaller.org Windows OS drilldown:
Server 2003: 1%