query database and sort via shell pipe then read within ruby

1 view

Skip to first unread message

Robert Citek

unread,

Dec 19, 2009, 5:10:51 AM12/19/09

to stl...@googlegroups.com

I would like to sort a list of random integers that I have in a
sqlite3 database. Normally, I would do something like this in ruby:

$ time -p ruby -e '
require "sqlite3"
db = SQLite3::Database.new("sample.db")
select_sql = <<-eof
select samp2
from sample
order by samp2
;
eof
db.execute(select_sql) do |row|
puts row[0] # surrogate for real processing here
end
' | wc -l
1000000
real 175.21
user 106.20
sys 13.47

But for some reason the sort takes a very long time in sqlite3.
Piping the data from sqlite3 into the sort command is much faster
(~20x):

$ time -p { grep -v ^# <<eof
select samp2
from sample
;
eof
} | sqlite3 sample.db |
LC_ALL=C sort -k1,1n |
ruby -e '
STDIN.each do |row|
puts row # surrogate for real processing here
end
' | wc -l
1000000
real 7.55
user 7.20
sys 0.56

My question is, what would be a "clean" way of sending a query string
to sqlite3, piping the output through sort, and then reading it into
ruby, all within ruby? Is io.popen the way to go?

BTW, below is a script that creates the sqlite3 sample.db with 1 MM records.

Regards,
- Robert

---- script ---

#!/bin/bash
echo "=== load data"
time -p echo {1..1000000}$'\t'${RANDOM} |
tr ' ' '\n' |
sqlite3 -init <(echo '
CREATE TABLE sample ( samp1 INTEGER, samp2 INTEGER)
;
.mod tabs
.imp "/dev/stdin" "sample"
select count(*) from sample
;
') sample.db .quit

Robert Citek

unread,

Dec 22, 2009, 11:39:17 AM12/22/09

to stl...@googlegroups.com

On Sat, Dec 19, 2009 at 5:10 AM, Robert Citek <robert...@gmail.com> wrote:
> My question is, what would be a "clean" way of sending a query string
> to sqlite3, piping the output through sort, and then reading it into
> ruby, all within ruby? Is io.popen the way to go?

Here's my stab at it using IO.popen. It works and works fast. Any
suggestions on how to make it more ruby-ish?

Regards,
- Robert

-----

$ time -p ruby -e '
require "sqlite3"

dbfile = "sample.db"
db = SQLite3::Database.new(dbfile)
sql = <<-eof.gsub(/^\s+/,"")
.mod tabs

select samp2
from sample
;
eof

command = <<-eof.gsub(/^\s+/,"")
echo "#{sql}" |
sqlite3 #{dbfile} |
LC_ALL=C sort -k1,1n
eof
IO.popen(command, "r") do |pipe|
pipe.each do |line|
puts line
end
end
' | wc -l