Hi Mike,
All kinds of things could be going wrong here. You really should be
grabbing these in batches from the database, rather than all at once.
Use the limit option to grab a thousand at a time or something. If
you're using an ACID database, do the whole thing in a transaction to
ensure records inserting mid batch don't mess things up.
Here is an extract from a rake task I wrote to do batched conversion of
Wordpress comments. I'm using some Rails models, but you can see what
is going on:
I'm sure there is a better way to do this tbh, but it works ok.
Really though, if it's not a complex query you're running and it's a
very large dataset, then consider using mysqldump - it'll be faster and
use less ram than Ruby, and I'm sure it supports csv output.
John.
--
http://www.brightbox.co.uk - UK Ruby on Rails hosting
http://johnleach.co.uk
csv = FasterCSV.open( target_file, "w" )
res = ActiveRecord::Base.connection.execute( your_sql )
# Header
csv << res.fetch_fields.map { |f| f.name }
# Results
res.each do |row_data|
csv << row_data
end
csv.close
This is probably specific to the MySQL adapter, although I'm sure
there will be similar methods to fetch_fields in the other adapters.
Regards,
Carl.
--
Carl Drinkwater
Director,
29degrees Ltd
Bespoke Web Application Development
t: 0161 953 6669
m: 07515 351 589
f: 07092 217 531
w: http://29degrees.co.uk
-----------------------------------------------
Office: 23 New Mount Street, Manchester, M4 4DE
Registered Address: St. George's House, 215/219
Chester Road, Manchester, M14 4JE.
Company Number 05952987 | VAT Number 897293655
-----------------------------------------------
Nice - that's a very tidy solution. I'm assuming the each method on res
fetches the rows one at a time from the MySQL server, rather than
grabbing them all at once.
> Nice - that's a very tidy solution. I'm assuming the each method on
> res
> fetches the rows one at a time from the MySQL server, rather than
> grabbing them all at once.
Yeah, exactly.
Regards,
Carl.
What's bad with mine is that MySQL does the query on the whole table on
every loop and then just sends you the batch you asked for. With a big
table, this is really bad.
Anyone know a good way of having ActiveRecord models do this kind of
execute and fetch on each, rather than get the whole lot at the start?
It's clearly preferable sometimes.
John.
> Anyone know a good way of having ActiveRecord models do this kind of
> execute and fetch on each, rather than get the whole lot at the start?
> It's clearly preferable sometimes.
ActiveRecord doesn't support it, so you have to write your own methods
in your models in cases where you need it :
class Foo < ActiveRecord::Base
def many_rows( &block )
ActiveRecord::Base.connection.execute( sql ).each do |row|
yield row
end
end
end
Foo.new.many_rows do |r|
puts r.inspect
end
Note that ActiveRecord::Base.connection.execute returns a
MySQL::Result object, and MySQL::Result#each yields just an array -
You don't get the column names. That said, there might be a
#each_as_hash or something - I've not checked.
Using the method above, you're side-stepping ActiveRecord completely.
Sometimes that's necessary though, and you shouldn't feel dirty by
doing so :)
Regards,
Carl.