File operations leak handles?


pete

Mar 18, 2010, 12:31:43 AM
to RubyInstaller
I have a long-running script that does a lot of file operations, and
it's been crashing with mysterious errors pointing to memory/reference
leaks. The memory usage while it was under observation seemed like a
lot, but acceptable for what it was doing. I then noticed that in the
Windows Task Manager the number of File Handles reported for my script
grew steadily over time. I decided to take one particular operation
(md5 generation), run it repeatedly, and sure enough it "leaked"
handles.

This code leaves the process at roughly 230 file handles when finished:

require 'digest/md5'
$test_file = "lightin2.mp3"
100.times do
  Digest::MD5.file( $test_file ).digest
end

puts "done"
sleep

Changing the above to .digest! or using Digest::MD5.digest didn't make
a difference.

To attempt to narrow it down I tried seeing what happened when I did a
simple buffered read of a file, similar to what I expect
Digest::MD5.file is doing, and got similar results (~230 file handles
in Task Manager):

$test_file = "lightin2.mp3"

def buffered_read(path)
  data = ""
  File.open(path, "rb") do |f|
    until f.eof?
      data << f.read(16384)
    end
  end
  data
end

100.times do
  buffered_read( $test_file )
end

puts "done"
sleep

I was able to work around this (i.e., get it down to ~30 file handles no
matter how many iterations I did) with the following code:

require 'digest/md5'
$test_file = "lightin2.mp3"

class EmDee5 < Digest::MD5
  def file( path )
    buffer(path) do |data|
      update( data )
    end
    self
  end

  def buffer(path)
    n = 0
    size = File.size( path )
    chunk = 2**16
    File.open(path, "rb") do |f|
      while n < size
        n += chunk
        yield f.sysread( chunk )
      end
    end
  end
end

# generated with md5sum from GNU's coreutils
known_good_value = "c597dbc0eaf066b9b0ab0febcca7f178"

# sanity check
md5 = EmDee5.file( $test_file ).digest
raise unless md5.unpack("H*").first == known_good_value

100.times do
  EmDee5.file( $test_file ).digest
end

puts "done"
# pause so Task Manager can be checked
sleep
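
For comparison, here is a minimal sketch of the same chunked-sysread idea without subclassing Digest::MD5. The helper name md5_hex_chunked and its chunk_size parameter are made up for illustration, and whether this variant also keeps the handle count flat on XP is an assumption that has not been checked here.

```ruby
require 'digest/md5'

# Hypothetical helper (not part of Digest): hash a file in fixed-size
# chunks read with sysread, the same read strategy as the workaround above.
def md5_hex_chunked(path, chunk_size = 2**16)
  md5 = Digest::MD5.new
  File.open(path, "rb") do |f|
    begin
      # sysread returns up to chunk_size bytes per call
      loop { md5.update(f.sysread(chunk_size)) }
    rescue EOFError
      # sysread raises EOFError at end of file; nothing left to hash
    end
  end
  md5.hexdigest
end
```

Calling md5_hex_chunked("lightin2.mp3") should return the same hex string as md5sum. Stopping on EOFError rather than comparing against File.size means the loop does not depend on the size reported before the file was opened.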

Is this expected behavior, or did I stumble onto a bug in ruby?

BTW, this is with Windows XP SP3 (two different machines) and using
ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-mingw32].

Luis Lavena

Mar 18, 2010, 5:15:23 AM
to rubyin...@googlegroups.com
On Thu, Mar 18, 2010 at 5:31 AM, pete <pe...@peterhiggins.org> wrote:
>
> I have a long-running script that does a lot of file operations, and
> it's been crashing with mysterious errors pointing to memory/reference
> leaks. The memory usage while it was under observation seemed like a
> lot, but acceptable for what it was doing. I then noticed that in the
> Windows Task Manager the number of File Handles reported for my script
> grew steadily over time. I decided to take one particular operation
> (md5 generation), run it repeatedly, and sure enough it "leaked"
> handles.
>
> This code reports around ~230 file handles when finished:
>

File handles, like any object coming from the OS, are not released
immediately. The Ruby object holding them needs to be garbage collected first.
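
As a minimal sketch of the distinction in play here (the temp file and variable names are assumptions for illustration, not from the thread): the block form of File.open closes the file as soon as the block returns, while a File opened without a block keeps its handle until close is called or the object is garbage collected.

```ruby
require 'tempfile'

# Hypothetical demo file, created only so the sketch is self-contained.
tmp = Tempfile.new("handle_demo")
tmp.write("example data")
tmp.close

# Block form: the file is closed when the block returns.
block_file = File.open(tmp.path, "rb") { |f| f.read(4); f }
puts "block form closed?     #{block_file.closed?}"

# Non-block form: the handle stays open until close is called
# or the File object is garbage collected.
bare_file = File.open(tmp.path, "rb")
puts "non-block form closed? #{bare_file.closed?}"
bare_file.close
puts "after explicit close?  #{bare_file.closed?}"

tmp.unlink
```

The File.open calls in the scripts earlier in this thread use the block form, so by this reasoning they should not need the garbage collector to release their handles.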

Trying this:

require 'digest/md5'
test_file = "m.mp3"

puts "Pid: #{Process.pid}"
STDIN.gets

100.times do
  Digest::MD5.file(test_file).digest
end

GC.start
puts "done"
STDIN.gets

===


This shows me 43 handles at the first gets, grows to 44 during the
100.times loop, and then returns to 43 at the second gets.

Tested on Ruby 1.8.6 too: handles are 34 at the start and grow to 35
during the 100.times loop.

--
Luis Lavena
AREA 17
-
Perfection in design is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.
Antoine de Saint-Exupéry

pete

Mar 18, 2010, 3:06:01 PM
to RubyInstaller
On Mar 18, 2:15 am, Luis Lavena <luislav...@gmail.com> wrote:

I had tried using GC.start in my experiments and found it didn't make
a difference.

Running your script I get results similar to what I saw before; the
process uses 30 handles while waiting for the first gets, then 230
waiting for the second gets.
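
One way to check whether Ruby itself is still holding File objects open, as opposed to the handles piling up somewhere below Ruby, is to count unclosed File objects with ObjectSpace. This is only a rough sketch: open_file_count is a made-up helper, the temp file stands in for the mp3, and it does not report the OS handle count that Task Manager shows.

```ruby
require 'digest/md5'
require 'tempfile'

# Hypothetical helper: number of File objects Ruby still considers open.
def open_file_count
  ObjectSpace.each_object(File).count do |f|
    begin
      !f.closed?
    rescue IOError
      false # skip File objects that were never fully initialized
    end
  end
end

# Stand-in for the test file, so the sketch is self-contained.
tmp = Tempfile.new("count_demo")
tmp.write("x" * 100_000)
tmp.close

before = open_file_count
100.times { Digest::MD5.file(tmp.path).digest }
GC.start
after = open_file_count

puts "open File objects before: #{before}, after: #{after}"
tmp.unlink
```

If this count stays flat while the handle count in Task Manager climbs, the growth is probably happening below the level of Ruby's File objects.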

pete

Luis Lavena

Mar 18, 2010, 3:07:50 PM
to rubyin...@googlegroups.com
On Thu, Mar 18, 2010 at 8:06 PM, pete <pe...@peterhiggins.org> wrote:
>
> I had tried using GC.start in my experiments and found it didn't make
> a difference.
>
> Running your script I get results similar to what I saw before; the
> process uses 30 handles while waiting for the first gets, then 230
> waiting for the second gets.
>

Then I would say that your environment is leaking.

Windows 7 Ultimate x64 here, worked like a charm.

What size is the file you're digesting? Mine was ~4MB

pete

Mar 18, 2010, 7:39:39 PM
to RubyInstaller
On Mar 18, 12:07 pm, Luis Lavena <luislav...@gmail.com> wrote:

My test file (lightin2.mp3) was similar, about 4.5MB.

I was able to reproduce the leaking file handles on a third XP
machine, but not on Vista or Win7 (32-bit) machines.

Looks like I will be using my workaround for now.

pete

Luis Lavena

Mar 19, 2010, 3:07:28 AM
to rubyin...@googlegroups.com
On Fri, Mar 19, 2010 at 12:39 AM, pete <pe...@peterhiggins.org> wrote:
>
> My test file (lightin2.mp3) was similar ~4.5MB or so.
>
> I was able to reproduce the leaking file handles on a third XP
> machine, but not on Vista or Win7 (32-bit) machines.
>
> Looks like I will be using my workaround for now.
>

If this can be reproduced with 1.8.7 and 1.9.1, I would suggest raising
this as a bug in the Ruby redmine:

http://redmine.ruby-lang.org/

Include a link to this thread also.

Thank you.

Octagon

Mar 19, 2010, 7:09:05 AM
to RubyInstaller
> If this can be reproduced with 1.8.7 and 1.9.1, I would suggest raising
> this as a bug in the Ruby redmine:
>
> http://redmine.ruby-lang.org/
>
> Include a link to this thread also.

I confirm that this happens on my fully patched XP, with both the mingw
build and the VS-compiled official ruby-1.9.1-p376-i386-mswin32.zip.

No handle leak on Windows 7, no reaction to GC.start.

My PC is single core; possibly this matters.

Luis Lavena

Mar 19, 2010, 7:20:21 AM
to rubyin...@googlegroups.com

I don't think it is a core-count thing, but rather an OS version thing.

Please report this issue to RubyCore.

You can also build trunk and see if the issue is present with that
specific version too.

pete

Mar 19, 2010, 7:13:21 PM
to rubyin...@googlegroups.com
Octagon,

Thanks for confirming I'm not crazy!

Luis,

I will submit this to RubyCore this weekend. Thanks for your help debugging.

pete

pete

Mar 20, 2010, 4:35:32 PM
to RubyInstaller
On Mar 19, 4:13 pm, pete <p...@peterhiggins.org> wrote:
>
> Octagon,
>
> Thanks for confirming I'm not crazy!
>
> Luis,
>
> I will submit this to RubyCore this weekend. Thanks for your help debugging.
>
> pete

Here's my bug report:
http://redmine.ruby-lang.org/issues/show/2992

pete
