I'm trying to use IO.popen to send data to and read data from an
external process. I've created a sample case in the shell like this:
$ { echo hello ; sleep 2 ; echo world; } | cat
hello
world
I've written the same thing in Ruby like so, and it works:
$ cat foo.rb
#!/usr/bin/env ruby
if $0 == __FILE__
  cat = IO.popen("cat", "w+")
  cat.puts("hello")
  puts(cat.gets)
  sleep 2
  cat.puts("world")
  puts(cat.gets)
end
$ ./foo.rb
hello
world
However, if I change the cat command to a sed command, the Ruby
version no longer works. The command-line equivalent does work, but
the Ruby version waits forever and has to be interrupted:
$ { echo hello ; sleep 2 ; echo world; } | sed -ne p
hello
world
$ cat foo.rb
#!/usr/bin/env ruby
if $0 == __FILE__
  cat = IO.popen("sed -ne p", "w+")
  cat.puts("hello")
  puts(cat.gets)
  sleep 2
  cat.puts("world")
  puts(cat.gets)
end
$ ./foo.rb
./foo.rb:6:in `gets': Interrupt
from ./foo.rb:6
Why does the Ruby version work in the first case but wait forever in the second?
Regards,
- Robert
That's unfortunate. What if I don't want to close the pipe? That is, I would
like to keep the pipe open so that I can send some data, read some
data and work on it, send some more data, read some more data and work
on it, and so on, much as if the process were a service such as a
database. I am trying to code the equivalent of call and response. My
examples using cat and sed are just stand-ins for the real program.
> ...but calling IO#sync=true on the object didn't help.
Tried that, also, and noticed it didn't help.
> Anyway, the following works. I'm not sure there's any simple way to
> process the stream in real-time though. I also converted it to more
> idiomatic Ruby, which is much shorter.
And that does work, sort of. Both "hello" and "world" do appear, but
only after the 2-second pause. I also tried 'cat -n' at the shell and
it also pauses before the output:
$ { echo hello ; sleep 2 ; echo world; } | cat -n
Makes me wonder if buffering the output is a feature of the shell or
program or system and if that feature can be temporarily overridden.
It would be nice to do something like this:
open_pipe
set_of_data.each do |foo|
  send_foo_to_pipe
  flush_pipe
  read_foo_output_from_pipe
  process_foo_output
end
close_pipe
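The loop above could be sketched in Ruby roughly as follows. This is only a sketch under assumptions: plain cat stands in for the real program (it is the one command in this thread that answers line by line), the method name round_trip is made up for illustration, and "processing" is just upcasing the reply.

```ruby
# Call-and-response over one long-lived pipe.
# Assumption: the child writes each reply line as soon as it has read the
# request; plain "cat" behaves that way, many other programs do not.
def round_trip(items, command = "cat")
  results = []
  IO.popen(command, "w+") do |pipe|      # open_pipe
    items.each do |item|
      pipe.puts(item)                    # send one request
      pipe.flush                         # make sure it leaves Ruby's buffer
      reply = pipe.gets                  # block until one reply line arrives
      results << reply.chomp.upcase      # stand-in for real processing
    end
  end                                    # the block form closes the pipe
  results
end

p round_trip(%w[hello world])
```

If the child holds its output back until its own buffer fills or its input is closed, the gets call here will block in the same way as the sed example.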
Of course, it's entirely possible that IO.popen is not the "right" way
to tackle this and I have not discovered the Ruby way, yet.
Again, any pointers in the right direction are greatly appreciated.
Regards,
- Robert
> I also tried 'cat -n' at
> the shell and it also pauses before the output:
>
> $ { echo hello ; sleep 2 ; echo world; } | cat -n
>
> Makes me wonder if buffering the output is a feature of the shell or
> program or system and if that feature can be temporarily overridden.
I suspect that it may be the cat/sed program that's doing the buffering,
since it works with a plain cat.
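To illustrate that suspicion, here is a minimal sketch in which the child process itself decides when its output is flushed. The child is a hypothetical one-line Ruby script, not sed; it sets STDOUT.sync = true so that each reply is written to the pipe immediately. The method names are made up for the example. Some programs offer a comparable switch of their own (GNU sed, for instance, has -u / --unbuffered), but whether your program does is something to check in its documentation.

```ruby
require "rbconfig"

# Hypothetical child: echoes every input line in upper case.
# STDOUT.sync = true is the important part: without it the child would
# hold its replies in a buffer while its stdout is a pipe.
def echo_child_command
  script = 'STDOUT.sync = true; while (l = STDIN.gets); puts l.upcase; end'
  [RbConfig.ruby, "-e", script]
end

# Send one line and wait for exactly one line back.
def ask(pipe, text)
  pipe.puts(text)
  pipe.flush
  pipe.gets.chomp
end

def converse(lines)
  IO.popen(echo_child_command, "w+") do |pipe|
    lines.map { |line| ask(pipe, line) }
  end
end

p converse(%w[hello world])
```

Removing the STDOUT.sync = true line from the child should reproduce the hang seen with sed, which is why it is not included as a runnable variant here.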
Yes, it appears that the external program is controlling the buffering
and not Ruby. When I tried the same process with the program I really
wanted to use, IO.popen worked pretty much the way I wanted it to.
The pattern was this:
foo = IO.popen("external_program", "w+")
while data = gets
  prepare data
  foo.puts(data)
  while not end of record
    newdata << foo.gets
  end
  process newdata
end
foo.close
It turns out that the program I used emits a marker to signal the end
of a chunk of data. So the program knows when I am finished sending data
and it can start crunching away. And I know when I can stop reading
data from the pipe and begin processing it. This saves the time of
repeatedly having to open and close the pipe.
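For anyone finding this later, the pattern can be sketched as runnable Ruby. Everything specific here is an assumption for illustration: the child is a hypothetical one-line Ruby script rather than the real program, the end-of-record marker is a line containing only "END", and the method names are invented.

```ruby
require "rbconfig"

# Hypothetical stand-in for the real program: for each input line it prints
# one word per line, then a line containing only "END", flushing as it goes.
def record_child_command
  script = 'STDOUT.sync = true; ' \
           'while (l = STDIN.gets); puts l.split; puts "END"; end'
  [RbConfig.ruby, "-e", script]
end

# Read lines until the end-of-record marker; return them without the marker.
def read_record(pipe, marker = "END")
  record = []
  while (line = pipe.gets)
    line = line.chomp
    break if line == marker
    record << line
  end
  record
end

# Keep one pipe open and exchange one record per input line.
def run_records(inputs)
  IO.popen(record_child_command, "w+") do |pipe|
    inputs.map do |data|
      pipe.puts(data)
      pipe.flush
      read_record(pipe)
    end
  end
end

p run_records(["hello world", "one two three"])
```

Reading line by line with gets matters: a call that reads until end of file would not return while the pipe stays open.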
Thanks for the help.
Regards,
- Robert