I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?
def Dir.size(dirname)
  Dir.chdir(dirname)
  entries = Dir.entries(".").reject { |x| %w(. ..).include? x }
  entries.collect! { |filename| File.expand_path(filename) }
  size = 0
  entries.each do |filename|
    begin
      if File.file?(filename)
        size += File.size(filename) rescue 0
      else
        size += Dir.size(filename)
      end
    rescue
      next
    end
  end
  size
end
Thanks,
Vincent.
> Hello guys,
>
> I wrote this little method to return the size of a given directory,
> but
> I think it's very ugly. Could anyone help me clean it up a bit?
File.size(dirname) seems to be working on my system. Am I missing
something obvious?
James Edward Gray II
File.size(dirname) only tells you how many bytes the directory's own
inode is using. It doesn't include the bytes for the directory's
files.
How about this:
def Dir.size(dname)
  Dir.new(dname).inject(File.size(dname)) {|total,name|
    begin
      exname = File.expand_path(name,dname)
      if File.file?(exname)
        total + File.size(exname)
      elsif File.directory?(exname) and name != '.' and name != '..'
        total + Dir.size(exname)
      else
        total
      end
    rescue
      total
    end
  }
end
-Ed
Hi guys,
I managed to get rid of the file names discovery by using Dir's globbing
facilities. The size calculation is then a matter of a single inject call:
def Dir.size(name)
  Dir.chdir(name)
  files = Dir["**/*"]
  files.inject(0) do |total, name|
    if File.file?(name)
      total + File.size(name)
    else
      total
    end
  end
end
puts Dir.size(".")
puts Dir.size("D:/tmp/ruby")
puts Dir.size("C:/Windows")
I don't like the "if" statement inside the block that gets injected. Is
there a better, idiomatic way to express the same thing?
Hristo Deshev
|def Dir.size(name)
|  Dir.chdir(name)
|  files = Dir["**/*"]
|  files.inject(0) do |total, name|
|    if File.file?(name)
|      total + File.size(name)
|    else
|      total
|    end
|  end
|end
Hristo Deshev
|class Dir
|  def self.size(name)
|    Dir.chdir(name)
|    Dir["**/*"].inject(0) do |total,name|
|      total + (File.file?(name) ? File.size(name) : 0)
|    end
|  end
|end
Also, be aware that you're changing the working dir and not changing it
back.
Are there situations where Dir[name + "/**/*"] wouldn't work?
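A minimal sketch of both suggestions, assuming a current Ruby: the block form of Dir.chdir switches back to the previous working directory when the block returns, and globbing with File.join avoids changing directory at all. The method names dir_size_chdir and dir_size_glob are made up for illustration and are not from the thread.

```ruby
# Sum file sizes under +name+, restoring the working directory afterwards.
# The block form of Dir.chdir changes back when the block exits.
def dir_size_chdir(name)
  Dir.chdir(name) do
    Dir["**/*"].inject(0) do |total, path|
      total + (File.file?(path) ? File.size(path) : 0)
    end
  end
end

# Same sum without touching the working directory at all.
def dir_size_glob(name)
  Dir[File.join(name, "**/*")].inject(0) do |total, path|
    total + (File.file?(path) ? File.size(path) : 0)
  end
end
```

Both versions skip dotfiles, as a plain "**/*" glob does, and neither accounts for links.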
On 9/5/05, Daniel Sheppard <dan...@pronto.com.au> wrote:
>
> This is cleaner to my eye, but YMMV.
>
> |class Dir
> |  def self.size(name)
> |    Dir.chdir(name)
> |    Dir["**/*"].inject(0) do |total,name|
> |      total + (File.file?(name) ? File.size(name) : 0)
> |    end
> |  end
> |end
The ?: operator is indeed shorter. I am not sure if I like it better, but I
will stick with it. Maybe extracting the file or dir size calculation to a
separate method would be best.
> Also, be aware that you're changing the working dir and not changing it
> back.
>
> Are there situations where Dir[name + "/**/*"] wouldn't work?
I could swear I tried that first and it did not return any files. Maybe I was
missing a slash somewhere. It now works like a charm. Thanks for pointing it
out.
Hristo Deshev
================ Errno::EACCES =====================
C:\Documents and Settings\All Users\Documents\test.rb:10:in `chdir'
Dir.chdir('c:/System Volume Information')
C:\Documents and Settings\All Users\Documents\test.rb:10
Dir.chdir('c:/System Volume Information')
c:\ruby\lib\ruby\site_ruby\1.8/rubygems/custom_require.rb:21:in `require__'
require__ path
c:\ruby\lib\ruby\site_ruby\1.8/rubygems/custom_require.rb:21:in `require'
require__ path
=============================================
Exception: Permission denied - c:/System Volume Information
Wayne Vucenic
No Bugs Software
Ruby and C++ Contract Programming in Silicon Valley
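The trace above shows Dir.chdir raising Errno::EACCES on a directory the process may not enter. A rough sketch of one way to tolerate that, assuming it is acceptable to leave unreadable entries out of the total. The name safe_dir_size is hypothetical. In current Ruby versions Find.find also appears to skip directories it cannot read by default (its ignore_error option), so the rescue here mainly covers the lstat call.

```ruby
require 'find'

# Sum the sizes of regular files under +dirname+, skipping any entry
# whose lstat raises a system error (e.g. Errno::EACCES, Errno::ENOENT).
def safe_dir_size(dirname)
  size = 0
  Find.find(dirname) do |path|
    begin
      stat = File.lstat(path)
      size += stat.size if stat.file?
    rescue SystemCallError
      next # unreadable or vanished entry: leave it out of the total
    end
  end
  size
end
```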
> Hello guys,
>
> I wrote this little method to return the size of a given directory,
> but
> I think it's very ugly. Could anyone help me clean it up a bit?
require 'find'
def Dir.size(dirname)
  size = 0
  Find.find dirname do |name|
    next unless File.file? name
    size += File.size name rescue 0
  end
  return size
end
--
Eric Hodel - drb...@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04
def Dir.size(name)
  Dir[ name + "/**/*" ].select{|x| File.file?(x)}.inject(0){|sum,f|
    sum + File.size(f) }
end
One more safety addition
def Dir.size(name)
  Dir[ File.join(name, "**/*") ].select{ | f | File.file?(f) }.inject(0){ | sum, f |
    sum + File.size(f)
  }
end
regards,
Brian
--
http://ruby.brian-schroeder.de/
Stringed instrument chords: http://chordlist.brian-schroeder.de/
just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered - in the best cases linked files are
counted twice or linked dirs are not checked, in the worst case infinite loops
occur. the methods using 'Dir[glob]' vs. 'Find::find' suffer from the link
issue but will also perform badly on large file systems. unfortunately ruby's
built-in 'Find::find' cannot deal with links - for that you have to rely on
motoyuki kasahara's Find2 module, which you can get off of the raa. i have it
inlined in my personal library (alib - also on the raa) with a few small bug
fixes and interface additions. to use it you would do something like:
require 'alib'
def dir_size dir
  size = 0
  totalled = {}
  ALib::Util::find2(dir, 'follow' => true) do |path, stat|
    begin
      next if totalled[stat.ino]
      next unless stat.file?
      size += stat.size
    ensure
      totalled[stat.ino] = true
    end
  end
  size
end
p dir_size('.')
this handles huge directories, duplicate files (links) in a directory, linked
directories, and potential infinite loops. i think this is about as simply as
one can write this without introducing subtle, or not so subtle, bugs.
cheers.
-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells among the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================
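For readers without alib, a rough stdlib-only approximation of the same idea: walk the tree with Find, use lstat so symbolic links are not followed (which sidesteps loops, but also means linked directories are not descended into, unlike the 'follow' => true call above), and remember each [dev, ino] pair so hard-linked files count once. The name dir_size_once is hypothetical, and the sketch assumes a platform where ino is meaningful.

```ruby
require 'find'

# Sum regular-file sizes under +dir+, counting each underlying file once.
def dir_size_once(dir)
  size = 0
  seen = {}
  Find.find(dir) do |path|
    begin
      stat = File.lstat(path) # lstat describes the link itself, not its target
    rescue SystemCallError
      next                    # entry vanished or is unreadable
    end
    next unless stat.file?    # skips directories and symbolic links
    key = [stat.dev, stat.ino]
    next if seen[key]         # hard link to a file already counted
    seen[key] = true
    size += stat.size
  end
  size
end
```

Keying on the device number as well as the inode is a deliberate choice here, since inode numbers are only unique within one file system.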
> just thought i'd point out that every solution posted thus far fails in a
> variety of ways when links are considered
If you have links, you also have du, which IMO is the Right Tool For
This Job:
total = `du --max-depth=0|cut -f 1`.to_i
-dB
--
David Brady
ruby...@shinybit.com
C++ Guru. Ruby nuby. Apply salt as needed.
sorry for stating the obvious, but: this isn't very portable.
cheers
Simon
> David Brady wrote:
>
>> If you have links, you also have du, which IMO is the Right Tool For
>> This Job:
>>
>> total = `du --max-depth=0|cut -f 1`.to_i
>
>
> sorry for stating the obvious, but: this isn't very portable.
Yup! :-)
<soapbox>
Portability isn't a good idea here.
(wait for scandalized gasps to quiet down)
The need here was to clean up the code. The solutions so far have
bulkily danced around the fact that Find::find doesn't seem to work
satisfactorily.
If the Standard Library is defective, the proper, *portable* solution
should be to patch the .c files that are defective. Until then, if
we're going to work around the Standard Library, we should be as quick
and deadly as possible.
I recognize that it may be seen as cheating or possibly even
inappropriate to suggest "don't use Ruby for this" on a Ruby mailing
list, but I feel strongly that Ruby's very ability to take advantage of
external tools is a great strength of the language. A superior wheel
already exists; reinventing it poorly does not seem to me to leverage
Ruby's power very well.
There seems to me to be a "right" way to do this, and it is to have the
Standard Library work as desired. Until then, how much effort should be
spent standardizing on a portable kludge?
</soapbox>
And of course, if the OP is on a different platform, the if statement
that started my earlier post will become important. The else clause
probably reads "then the issues with Find::find and links don't affect you.
Use the Standard Library."
Just my $0.02. Note sig.
This solution fails under windoze by always returning 0, apparently because
File.stat(path).ino always returns 0.
If you're stuck with windoze, use the previously posted
def Dir.size(name)
  Dir[File.join(name, "**/*")].select{|f| File.file?(f)}.inject(0){|sum,f|
    sum + File.size(f)
  }
end
> Ara.T.Howard wrote:
>
>> just thought i'd point out that every solution posted thus far fails in a
>> variety of ways when links are considered
>
> If you have links, you also have du, which IMO is the Right Tool For This
> Job:
>
> total = `du --max-depth=0|cut -f 1`.to_i
windows has links - and systems that have du may not have one that supports
max-depth. and du reports on file system blocks - not directory sums. this is
typically close, but can diverge greatly depending on file system setup and the
number of directories - since du will report usage for directories too.
harp:~ > mkdir foobar
harp:~ > du foobar
4 foobar
fyi.
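The gap between byte sums and block usage can be seen from Ruby itself. A small sketch, assuming a Unix-like platform where File::Stat#blocks is available (it returns nil elsewhere) and the conventional 512-byte unit for st_blocks. The helper name apparent_and_allocated is made up for illustration.

```ruby
require 'tmpdir'

# Return [apparent_bytes, allocated_bytes] for a path.
# allocated_bytes is nil where the platform does not report blocks.
def apparent_and_allocated(path)
  stat = File.stat(path)
  allocated = stat.blocks && stat.blocks * 512 # assumes 512-byte block units
  [stat.size, allocated]
end

Dir.mktmpdir do |dir|
  file = File.join(dir, "one_byte")
  File.write(file, "x")
  p apparent_and_allocated(file) # apparent size is 1; allocation is often larger
  p apparent_and_allocated(dir)  # the directory itself may occupy space too
end
```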
> This solution fails under windoze by always returning 0, apparently because
> File.stat(path).ino always returns 0.
>
> If you're stuck with windoze, use the previously posted
>
> def Dir.size(name)
> Dir[File.join(name, "**/*")].select{|f|
> File.file?(f)}.inject(0){|sum,f|
> sum + File.size(f)
> }
> end
works for me :
Ara@JEN ~
$ cat a.rb
require 'alib'
def dir_size dir
  size = 0
  totalled = {}
  ALib::Util::find2(dir, 'follow' => true) do |path, stat|
    begin
      next if totalled[stat.ino]
      next unless stat.file?
      size += stat.size
    ensure
      totalled[stat.ino] = true
    end
  end
  size
end
p dir_size('.')
Ara@JEN ~
$ ruby a.rb
29432845
Ara@JEN ~
$ du -sb .
29432903 .
Ara@JEN ~
$ ruby -r rbconfig -r yaml -e'y Config::CONFIG' |egrep -i win
target: i686-pc-cygwin
ac_ct_WINDRES: windres
WINDRES: windres
archdir: /usr/lib/ruby/1.8/i386-cygwin
sitearch: i386-cygwin
arch: i386-cygwin
host_os: cygwin
build: i686-pc-cygwin
host: i686-pc-cygwin
build_os: cygwin
target_os: cygwin
sitearchdir: /usr/lib/ruby/site_ruby/1.8/i386-cygwin
so it seems like your ruby may be broken - how'd you install it?
Mine:
build: i686-pc-mswin32
build_os: mswin32
host: i686-pc-mswin32
host_os: mswin32
target: i386-pc-mswin32
target_os: mswin32
Yours:
build: i686-pc-cygwin
build_os: cygwin
host: i686-pc-cygwin
host_os: cygwin
target: i686-pc-cygwin
target_os: cygwin
Try it without cygwin. On my system, stat.ino is always 0.
If anyone else is running plain windoze without cygwin, see if
File.stat(path).ino is always 0.
hmmm. you are right. this seems to be a bug in ruby - amazing that no-one
has seen it before though? it looks like this might be in rb_uint2big but i'm
kinda guessing since i can't compile on windows myself and therefore can't
look at config.h - except in cygwin, which works. anyhow - the numbers spat
out in cygwin are huge - so i'm guessing the bug is here. perhaps someone out
there with a windows compiler tool-chain could examine?
so, just to re-state the bug: File::stat(anypath).ino is always zero under the
one click installer, but not under cygwin using the default or compiling by
hand.
<snip>
> just thought i'd point out that every solution posted thus far fails in a
> variety of ways when links are considered - in the best cases linked files are
> counted twice or linked dirs are not checked, in the worst case infinite loops
> occur.
For a good summary of the problems with calculating the size of a
directory (on Windows) see
http://blogs.msdn.com/oldnewthing/archive/2004/12/28/336219.aspx.
Regards,
Dan
<snip>
> so, just to re-state the bug: File::stat(anypath).ino is always zero under the
> one click installer, but not under cygwin using the default or compiling by
> hand.
Generally speaking, File.stat on Windows is not reliable. Too many of
the Stat members are either meaningless or wrong. Revamping it is on
my TODO list for the win32-file package.
Regards,
Dan
thanks daniel, some searching led me to suspect this. seems like a severe
limitation since 'ino' is the only way to determine if a file is unique -
bummer. reading over the source leads me to think that the issue is simply a
casting bug - but that the information should be there for ruby (in the
inode). does this sound correct?
indeed.
we've got a script here (dirsum) that does essentially all the things outlined
- especially checking compressed files - for monitoring data volumes. the
problem is much trickier than one would assume.
By all means, please share. I'd be happy to add a Dir.size method to
win32-dir. :)
Regards,
Dan
Upon further review, this is not a bug. From the MSDN documentation on
the st_ino struct member:
"The inode, and therefore st_ino, has no meaning in the FAT, HPFS, or
NTFS file systems."
See
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt__stat.2c_._wstat.2c_._stati64.2c_._wstati64.asp
for more details.
Regards,
Dan
> Upon further review, this is not a bug. From the MSDN documentation on the
> st_ino struct member:
>
> "The inode, and therefore st_ino, has no meaning in the FAT, HPFS, or NTFS
> file systems."
>
> See
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt__stat.2c_._wstat.2c_._stati64.2c_._wstati64.asp
> for more details.
thanks for that! any idea how to tell if a file is unique on that file system
then? eg, given a filename and its stat - how can you show that it is not
some other file? nothing but File::expand_path(pathname)?
At Fri, 23 Sep 2005 07:01:39 +0900,
Ara.T.Howard wrote in [ruby-talk:157179]:
> thanks for that! any idea how to tell if a file is unique on that file system
> then? eg, given a filename and it's stat - how can you show that it is not
> some other file? nothing but File::expand_path(pathname)?
eban has suggested File.identical? for that purpose.
--
Nobu Nakada
> eban has suggested File.identical? for that purpose.
that sounds great - but how would it be implemented on windows if there is no
unique field in the inode?
At Fri, 23 Sep 2005 08:49:15 +0900,
Ara.T.Howard wrote in [ruby-talk:157186]:
> > eban has suggested File.identical? for that purpose.
>
> that sounds great - but how would it be implemented on windows if there is no
> unique field in the inode?
Of course, comparing expanded paths ;)
--
Nobu Nakada
lol!
that's fine - but does that work with hard/soft links? i know next to nothing
about windows but remember something about it having something like hard links
and:
harp:~ > touch a
harp:~ > ln a b
harp:~ > ruby -e'p File::expand_path("b")'
"/home/ahoward/b"
so that wouldn't work on unix - maybe it does on windows?
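To make the point concrete on a Unix-like system: two hard links have different expanded paths but the same device and inode numbers. File.identical?, which current Ruby versions provide, compares the files themselves rather than their names. A small sketch; the helper name same_file? is hypothetical.

```ruby
require 'tmpdir'

# True when both paths refer to the same underlying file,
# judged by device and inode number.
def same_file?(a, b)
  sa = File.stat(a)
  sb = File.stat(b)
  sa.dev == sb.dev && sa.ino == sb.ino
end

Dir.mktmpdir do |dir|
  a = File.join(dir, "a")
  b = File.join(dir, "b")
  File.write(a, "")
  File.link(a, b)                              # like `ln a b`
  p File.expand_path(a) == File.expand_path(b) # false: the names differ
  p same_file?(a, b)                           # true on Unix-like systems
  p File.identical?(a, b)                      # true
end
```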
1. There is, at least on NTFS, a unique file identifier that is
somehow available. Don't ask me how right now, but I should be able to
find out in a few days (work-related stuff).
2. Files cannot be hardlinked on any Windows filesystem. Directories
can be hardlinked on NTFS5 systems.
-austin
--
Austin Ziegler * halos...@gmail.com
* Alternate: aus...@halostatue.ca
The file's unique ID is assigned by the system and is stored in the
nFileIndexHigh and nFileIndexLow fields of BY_HANDLE_FILE_INFORMATION
(API call is GetFileInformationByHandle())
(source: MSDN)
>
> 2. Files cannot be hardlinked on any Windows filesystem. Directories
> can be hardlinked on NTFS5 systems.
>
Erm.. they can - use CreateHardLink() - but there is no shell support for them
(so that users won't delete real files by accident I guess).
Directories ~cannot~ be hard-linked (but apparently you can create
'junction points' - a kind of soft link - though I've never used them myself).
Regards,
Sean
At Fri, 23 Sep 2005 11:16:48 +0900,
Sean O'Halpin wrote in [ruby-talk:157221]:
> > 1. There is, at least on NTFS, a unique file identifier that is
> > somehow available. Don't ask me how right now, but I should be able to
> > find out in a few days (work-related stuff).
>
> The file's unique ID is assigned by the system and is stored in the
> nFileIndexHigh and nFileIndexLow fields of BY_HANDLE_FILE_INFORMATION
> (API call
> is GetFileInformationByHandle())
> (source: MSDN)
Thank you for the info. I'd forgotten it. It will be used where
it is available.
> > 2. Files cannot be hardlinked on any Windows filesystem. Directories
> > can be hardlinked on NTFS5 systems.
> >
> Erm.. they can - use CreateHardLink() - but there is no shell support for them
> (so that users won't delete real files by accident I guess).
The mswin32 and mingw32 builds of Ruby support it.
> Directories ~cannot~ be hard-linked (but apparently you can create
> 'junction points' - a
> kind of soft link - though I've never used them myself).
Right. It wouldn't be identical to the original.
--
Nobu Nakada