
catting files


Mark Probert

Feb 28, 2005, 3:17:03 PM

Hi ..

There is approximately an order of magnitude difference in the performance of
these two snippets. Is there a faster way of doing the pure Ruby version?


### use the system call 'cat'
def sys_cat
  of = "plato.txt.cat"
  clean(of)
  l = `cat Plato/*.txt > #{of}`
end


### open each file and copy it
def rby_cat
  of = "plato.txt.rby"
  clean(of)
  off = File.new(of, "w+")
  Dir["Plato/*.txt"].each do |f|
    text = IO.readlines(f)
    off.puts text
  end
  off.close
end

              user     system      total        real
sys_cat   0.000000   0.015625   0.117188 (  0.167577)
rby_cat   0.937500   0.085938   1.023438 (  1.064247)


Thanks,

--
-mark. (probertm at acm dot org)


Florian Gross

Feb 28, 2005, 3:22:49 PM
Mark Probert wrote:

> There is approximately an order of magnitude difference in the performance of
> these two snippets. Is there a faster way of doing the pure Ruby version?
>

> ### open each file and copy it
> def rby_cat
> of = "plato.txt.rby"
> clean(of)
> off = File.new(of, "w+")
> Dir["Plato/*.txt"].each do |f|
> text = IO.readlines(f)
> off.puts text
> end
> off.close
> end

def ruby_cat()
  of = "plato.txt.ruby"
  clean of
  File.open(of, "w") do |off|
    Dir.glob("Plato/*.txt") do |f|
      off << File.read(f)
    end
  end
end

You might get better performance by reading the files in 4096 byte
blocks or something similar.

Robert Klemme

Feb 28, 2005, 3:29:28 PM

"Florian Gross" <fl...@ccan.de> wrote in message
news:38hcsvF...@individual.net...

... and binary possibly helps, too.

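def stream_copy(src, dst)
  # Copy in 4096-byte blocks; read returns nil at end of file.
  while (b = src.read(4096))
    dst.write b
  end
end

def ruby_file_cat(src_path, dst_path)
  # Binary mode avoids any newline translation.
  File.open(src_path, "rb") do |i|
    File.open(dst_path, "wb") { |o| stream_copy(i, o) }
  end
end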

Or similar.
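
The block-copy idea above handles a single input file. A minimal sketch of how it might be extended to concatenate every file matching a glob could look like the following. The names copy_in_blocks and block_cat are made up for illustration, and the temporary files stand in for the Plato/*.txt data, which is not part of this thread.

```ruby
require "tmpdir"

# Hypothetical helper: copy src to dst in fixed-size chunks.
# IO#read(length) returns nil at end of file, which ends the loop.
def copy_in_blocks(src, dst, block_size = 4096)
  while (chunk = src.read(block_size))
    dst.write(chunk)
  end
end

# Hypothetical helper: concatenate all files matching pattern into out_path.
# Both sides are opened in binary mode so no newline translation happens.
def block_cat(pattern, out_path)
  File.open(out_path, "wb") do |out|
    Dir.glob(pattern).sort.each do |path|
      File.open(path, "rb") { |inp| copy_in_blocks(inp, out) }
    end
  end
end

# Small demonstration on throwaway files.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "a.txt"), "first\n")
  File.write(File.join(dir, "b.txt"), "second\n")
  out = File.join(dir, "all.out") # does not match *.txt, so it is never its own input
  block_cat(File.join(dir, "*.txt"), out)
  print File.read(out)
end
```

On Ruby 1.9 and later, IO.copy_stream(inp, out) could replace the hand-written loop. Whether either version beats reading each file whole would need measuring on the real data.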

robert

Alexander Kellett

Feb 28, 2005, 3:37:21 PM
IO.read?
gets(nil)?
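
Both suggestions read a whole file into one string instead of an array of lines. A small sketch of the two, with hypothetical helper names and a throwaway sample file, might be:

```ruby
require "tmpdir"

# Hypothetical helper: IO.read opens the file, reads it all, and closes it.
def slurp_with_read(path)
  IO.read(path)
end

# Hypothetical helper: a nil separator tells gets to read up to end of file.
def slurp_with_gets(path)
  File.open(path, "r") { |f| f.gets(nil) }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.txt")
  File.write(path, "line one\nline two\n")
  # Both calls should yield the same string for a non-empty file.
  puts slurp_with_read(path) == slurp_with_gets(path)
end
```

One difference to keep in mind: on an empty file gets(nil) returns nil, while IO.read returns an empty string.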

Javier Valencia

Feb 28, 2005, 4:01:03 PM
Mark Probert wrote:

Try mine:

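def rby_cat
  File.open("file.rby", "w+") do |file|
    Dir["*.txt"].each do |f|
      # write calls to_s on its argument; Array#to_s only
      # concatenates the elements on Ruby 1.8, so join explicitly.
      file.write(IO.readlines(f).join)
    end
  end
end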


Austin Ziegler

Feb 28, 2005, 4:02:27 PM
On Tue, 1 Mar 2005 05:17:03 +0900, Mark Probert <prob...@acm.org>
wrote:

> There is approximately an order of magnitude difference in the
> performance of these two snippets. Is there a faster way of doing
> the pure Ruby version?

> ### use the system call 'cat'
> def sys_cat
> of = "plato.txt.cat"
> clean(of)
> l = `cat Plato/*.txt > #{of}`
> end

def faster_cat
  of = "plato.txt.fct"
  clean(of)
  File.open(of, "wb+") do |outf|
    Dir["Plato/*.txt"].each do |inf|
      outf.puts IO::read(inf)
    end
  end
end

As I don't have the "Plato/*.txt" files, I can't test it, but because
it reads each file as a single string instead of splitting it into
lines and iterating over them, it should be faster.

-austin
--
Austin Ziegler * halos...@gmail.com
* Alternate: aus...@halostatue.ca


Mark Probert

Feb 28, 2005, 4:20:50 PM
Hi ..

On Monday 28 February 2005 13:02, Austin Ziegler wrote:
>     of = "plato.txt.fct"
>     clean(of)
>     File.open(of, "wb+") do |outf|
>       Dir["Plato/*.txt"].each do |inf|
>         outf.puts IO::read(inf)
>       end
>     end

Perfect!
                user     system      total        real
sys_cat     0.000000   0.007812   0.101562 (  0.164781)
rb_cat      0.953125   0.070312   1.023438 (  1.065408)
faster_cat  0.046875   0.078125   0.125000 (  0.166651)


Thanks, Austin. I knew there was a better way :-)
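
The thread never shows the timing harness. A rough sketch of how a comparison like this could be reproduced with the standard Benchmark module is below. The names lines_cat and whole_cat and the generated input files are assumptions for illustration, not the code or data behind the numbers above.

```ruby
require "benchmark"
require "tmpdir"

# Hypothetical line-by-line variant, in the style of rby_cat.
def lines_cat(pattern, out_path)
  File.open(out_path, "w") do |out|
    Dir.glob(pattern).sort.each { |path| out.puts IO.readlines(path) }
  end
end

# Hypothetical whole-file variant, in the style of faster_cat.
def whole_cat(pattern, out_path)
  File.open(out_path, "wb") do |out|
    Dir.glob(pattern).sort.each { |path| out.write File.read(path) }
  end
end

Dir.mktmpdir do |dir|
  # Stand-in data; the real Plato/*.txt files are not available here.
  5.times { |i| File.write(File.join(dir, "f#{i}.txt"), "some text\n" * 20_000) }
  pattern = File.join(dir, "*.txt")

  # The argument to bm sets the width of the label column.
  Benchmark.bm(10) do |bm|
    bm.report("lines_cat") { lines_cat(pattern, File.join(dir, "lines.out")) }
    bm.report("whole_cat") { whole_cat(pattern, File.join(dir, "whole.out")) }
  end
end
```

Timings will vary with the Ruby version, the file sizes, and the disk cache, so any single run is only a rough guide.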
