Best Way To Convert Mongo Cursor To Ruby Array?


Jim Mulholland

May 29, 2009, 12:45:17 PM
to mongodb-user
What is the best way to convert a Mongo Cursor into a Ruby Array?

I've been doing .to_a, but this is not very performant.

For example, I have a collection with just over 1000 records and 50
fields. To return all of these records into a Mongo Cursor is
instant. However, to convert them to a Ruby array using .to_a takes >
10 seconds which seems rather long for that little data.

Thoughts?

Michael Dirolf

May 29, 2009, 12:53:53 PM
to mongod...@googlegroups.com
to_a basically just iterates over the cursor and pushes the resulting
documents into an array. so it shouldn't be much slower than doing
that manually (although the work happens up front with to_a).

one thing extra that it does is save a copy of the array in the
cursor, so that further calls to to_a or attempts to iterate the
cursor can use that internal array rather than having to re-run the
query...
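Michael's description of to_a's caching behavior can be sketched as a toy cursor class. This is a simplified illustration of the idea, not the actual mongo-ruby-driver implementation; the class and its internals are made up for the example:

```ruby
# Toy sketch of a cursor whose to_a caches the materialized array,
# so repeat calls (or later iteration) skip re-running the query.
# Illustrative only -- not the real driver code.
class SimpleCursor
  def initialize(docs)
    @docs = docs     # stands in for documents streamed from the server
    @cached = nil
  end

  def each(&block)
    # Iterate the cached array if we have one; otherwise "fetch" fresh.
    (@cached || @docs).each(&block)
  end

  def to_a
    # First call does the work; later calls return the same array object.
    @cached ||= @docs.map { |doc| doc }
  end
end

cursor = SimpleCursor.new([{ "_id" => 1 }, { "_id" => 2 }])
arr = cursor.to_a
arr.equal?(cursor.to_a)   # => true: second call reuses the cached array
```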

Michael Dirolf

May 29, 2009, 1:01:50 PM
to mongod...@googlegroups.com
Another thing that will affect performance is the optional C extension
- do you have this installed? Instructions are in the README:

http://github.com/mongodb/mongo-ruby-driver/blob/d679a17478081fecfd84223ec5d9fc095d73f19e/README.rdoc


Jim Mulholland

May 29, 2009, 1:29:09 PM
to mongodb-user
Hi Michael,

Thanks for the tip on the C extension. We were missing that from the
server.

However, that only cut our time roughly in half: instead of 12
seconds, the array conversion now takes about 6 seconds.

I thought it might just be a Ruby issue, since it is a pretty big
array. However, if I loop through the Ruby array and save it as a new
Ruby array it takes about 1/10th of a second, so it is definitely
something to do with the cursor.

Below is a quick function I created to demonstrate what I am talking
about. Here are the results of running this on our server:

>> mongo_array_timer
FROM MONGO CURSOR
Array size: 1083; Duration to convert to array: 6.169699
FROM RUBY ARRAY
Array size: 1083; Duration to convert to new array: 0.123068

The Mongo to array conversion is < 1 second if we select an individual
column, but it is > 15 seconds if I do not specify a select clause
since Person includes some embedded objects.

Anything else we can do to improve performance?

- Jim

def mongo_array_timer
  p = Person.find(:all, :select => Person.field_names)
  p2 = []
  set_time = Time.now
  p.each { |pp| p2 << pp }
  puts "FROM MONGO CURSOR"
  puts "Array size: #{p2.size}; Duration: #{Time.now - set_time}"

  set_time = Time.now
  p3 = []
  p2.each { |pp| p3 << pp }
  puts "FROM RUBY ARRAY"
  puts "Array size: #{p3.size}; Duration: #{Time.now - set_time}"
end






John Nunemaker

May 30, 2009, 12:30:53 AM
to mongod...@googlegroups.com
The mongo cursor is definitely going to take longer than a plain
array, since every so often as it iterates it has to go back to the
server for another batch of records. Have you tried setting a high
limit and calling to_a on it?

I didn't check the ruby driver's implementation, but I'm wondering if
it would be faster to set the limit to 500 or 1000 records all at
once, rather than 50 or 100 or whatever the default database cursor
amount is.

I haven't investigated, just thought I would throw that thought out
there.
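John's batching point can be made concrete with a little arithmetic: each batch the cursor pulls down is one server round trip, so a small batch size means many round trips for 1000+ documents. The numbers below are illustrative (the real default batch behavior depends on the driver and server version), and round_trips is just a toy model, not a driver function:

```ruby
# Toy model: number of simulated server round trips a cursor needs
# to stream total_docs documents at a given batch size.
def round_trips(total_docs, batch_size)
  (total_docs.to_f / batch_size).ceil
end

round_trips(1083, 100)    # => 11 fetches with small batches
round_trips(1083, 1083)   # => 1 fetch if everything comes back at once
```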

Out of curiosity, why do you need to fetch 1000 records at once?

Jim Mulholland

May 30, 2009, 10:21:46 AM
to mongodb-user
Thanks for clearing that up, John.

We had a page that was displaying every avatar of every user in a
group. This particular group had > 1000 people which is why we
noticed the delay.

We got the query down to < 0.5 seconds by selecting only 5 of the 53
Person fields. We may also go back and limit the page to 100 people
which would bring the query down to ~.03 seconds.
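For reference, the effect of selecting only a few fields is the server handing back each document sliced down to those keys, so the large embedded objects never cross the wire or get decoded. The sketch below illustrates that slicing on plain hashes; the field names and sample document are made-up examples, not Jim's actual schema:

```ruby
# Illustration of server-side field selection: each returned document
# contains only the requested keys. Shown here on a plain hash;
# FIELDS and the sample doc are hypothetical.
FIELDS = %w[name avatar_url]

doc = {
  "name"       => "Jim",
  "avatar_url" => "/avatars/jim.png",
  "bio"        => "...",                   # dozens of other fields omitted
  "embedded"   => { "lots" => "of data" }  # embedded objects are the costly part
}

projected = doc.select { |k, _| FIELDS.include?(k) }
# projected keeps only "name" and "avatar_url"
```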

Thanks again!