Sorting two Cursors using mongo

500 views
Skip to first unread message

Abhishek kumar

unread,
Sep 13, 2011, 3:09:58 AM9/13/11
to mongodb-user
var cur1=db.messages.find({fromId:"b...@d.com",toId:"s...@d.com"}).limit(2)
var cur2=db.messages.find({fromId:"a...@d.com",toId:"p...@d.com"}).limit(2)

the collection('messages') contains a timeStamp, I need to sort 'cur1'
and 'cur2' on the timeStamp

Asoka Sampath Edussooriya

unread,
Sep 13, 2011, 3:28:32 AM9/13/11
to mongod...@googlegroups.com
Hi,

this can be achieved as follow.

var cur1=db.messages.find({fromId:"b...@d.com",toId:"s...@d.com"}).sort({"dateColumn":true}) or 

var cur1=db.messages.find({fromId:"b...@d.com",toId:"s...@d.com"}).sort({"dateColumn":1})

Thanks!

Asoka


--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.




--
Asoka Sampath Edussooriya

karl

unread,
Sep 13, 2011, 3:30:00 AM9/13/11
to mongod...@googlegroups.com
doesn't cur1.sort({timestamp:1})  and cur2.sort({timestamp:1}) work?  (or -1 if you want descending)

Asoka Sampath Edussooriya

unread,
Sep 13, 2011, 4:08:16 AM9/13/11
to mongod...@googlegroups.com
Hi Karl,

Yeah. this way too works !

Asoka

On Tue, Sep 13, 2011 at 1:00 PM, karl <karls...@gmail.com> wrote:
doesn't cur1.sort({timestamp:1})  and cur2.sort({timestamp:1}) work?  (or -1 if you want descending)

--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/PlqlBUZt_8gJ.

To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.



--
Asoka Sampath Edussooriya

Abhishek kumar

unread,
Sep 13, 2011, 5:13:15 AM9/13/11
to mongodb-user

I want to sort both cur1 and cur2, together, based on dateColumn..
so, isn't a way to do this in mongo

On Sep 13, 12:28 pm, Asoka Sampath Edussooriya <eas...@gmail.com>
wrote:
> Hi,
>
> this can be achieved as follow.
>
> var cur1=db.messages.find({fromId:"b...@d.com",toId:"s...@d.com"}).sort({"dateColumn":true})
> or
>
> var cur1=db.messages.find({fromId:"b...@d.com",toId:"s...@d.com
> "}).sort({"dateColumn":1})
>
> Thanks!
>
> Asoka
>
> On Tue, Sep 13, 2011 at 12:39 PM, Abhishek kumar
> <abhishekiit...@gmail.com>wrote:

karl

unread,
Sep 13, 2011, 5:31:22 AM9/13/11
to mongod...@googlegroups.com
ah, I see, you want to merge the result and have that sorted.  Your best best, given that you want to limit each query separate, is to sort them in your application...it's simple and clean.


Asoka Sampath Edussooriya

unread,
Sep 13, 2011, 5:49:19 AM9/13/11
to mongod...@googlegroups.com
Hi Abhishek,

Did u try the way that Karl noted? I think you are asking something similar to combining both queries together right ? Please follow the below links to get an Idea.


Thanks!

Asoka

Sam Millman

unread,
Sep 13, 2011, 6:09:52 AM9/13/11
to mongod...@googlegroups.com
As Asoka states, the best method to combine the two cursors is to query both statements in one find() statement.

Mongo can merge two separate cursors, however, this IS NOT advised.

You can pull both cursors down into a merged collection (temporary) from which you can read from. This is really really really slow and totally negates any speed gain from using MongoDB. This is since it would require to write to disk.

Sam Millman

unread,
Sep 13, 2011, 6:30:59 AM9/13/11
to mongod...@googlegroups.com
Also as Karl states it can be done client side however you will lose the fast sorting and indexing capabilities of MongoDB since the cursor will no longer be a cursor but an array in your specific language.

This can make it near on impossible to sort some result sets at an easy pace (such as large result sets) and most definitely would not be scalable.

It is by far best to combine the two queries into one single query if you require them to interact together and be sorted as a single list.

karl

unread,
Sep 13, 2011, 8:04:26 AM9/13/11
to mongod...@googlegroups.com
I'm pretty sure if you look at his query, you'll note that they *can't* be merged into a single find.

Also, he's pulling down 4 records, there's no way Mongo's sorting is going to be faster than any client-side code.


Asoka Sampath Edussooriya

unread,
Sep 13, 2011, 8:18:37 AM9/13/11
to mongod...@googlegroups.com
Hi Karl,

But I think this will be possible if he can use "$all" condition for the query. If I am not correct please rectify my comment.

Thanks!

Asoka

On Tue, Sep 13, 2011 at 5:34 PM, karl <karls...@gmail.com> wrote:
I'm pretty sure if you look at his query, you'll note that they *can't* be merged into a single find.

Also, he's pulling down 4 records, there's no way Mongo's sorting is going to be faster than any client-side code.


--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/7nBAZSGvhNEJ.

To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.



--
Asoka Sampath Edussooriya

Sam Millman

unread,
Sep 13, 2011, 8:28:22 AM9/13/11
to mongod...@googlegroups.com
var cur1=db.messages.find({fromId:"b...@d.com",toId:"s...@d.com"}).limit(2)
var cur2=db.messages.find({fromId:"a...@d.com",toId:"p...@d.com"}).
limit(2)

I believe he is finding where b...@d.com told s...@d.com or where a...@d.com told p...@d.com.

find({$or: [{"fromId": "b...@d.com", "told": "s...@d.com"}, {"fromId": "a...@d.com", "told": "p...@d.com"}]}).sort("dateColumn": 1).limit(2)

Sam Millman

unread,
Sep 13, 2011, 8:31:16 AM9/13/11
to mongod...@googlegroups.com
I suppose now looking at it again he might want two set of he told she and two sets of she told he.

I suppose if it were to stay at a low number of records it would be ok.

karl

unread,
Sep 13, 2011, 8:43:28 AM9/13/11
to mongod...@googlegroups.com
No, don't think so. Each query needs to be separately sorted and limited, and then the result needs to be sorted again.

Consider the following documents:

{name: 'james', type: 'unicorn', vampireKills: 17}
{name: 'jordan', type: 'donkey', vampireKills: 2}
{name: 'junior', type: 'donkey', vampireKills: 1}
{name: 'jasmine', type: 'unicorn', vampireKills: 13}
{name: 'jacob', type: 'unicorn', vampireKills: 15}

if you want the top 2 unicorns and the top 2 donkeys, and then you want to sort those,  it has to be a 2 step proces.

db.weird.find({type: 'unicorns'}).sort({vampireKills:-1}).limit(2) //first get the unicorns
db.weird.find({type: 'donkey'}).sort({vampireKills:-1}).limit(2)  //next the donkeys
//merge and sort the two results

This'll return
james //unicorn
jacob //unicorn
jordan //donkey
junior //donkey


If you merge it into a single query, you'll pull down james, jacob, jasmin (a 3rd unicorn, which is the problem!!) and jordan.

So, again, the limit and sort needs to be applied to each query individually, else you get a different result.


Sam Millman

unread,
Sep 13, 2011, 8:58:20 AM9/13/11
to mongod...@googlegroups.com
Well so long as he keeps the record number down it's ok to do complex sorting client side.

--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/k9ugIov-PKAJ.

Asoka Sampath Edussooriya

unread,
Sep 13, 2011, 9:05:02 AM9/13/11
to mongod...@googlegroups.com
Hi all,

Yeah Karl is right. the problem happens when the sorting comes to the scene. I tried according to the Karl's collection. I got the following results for the following query.

Query -
  db.Unicorn.find({$or:[{type:"unicorn"},{type:"donkey"}]}).sort({vampireKills:-1});
Results - 

{ "_id" : ObjectId("4e6f51a0cb0665778029de86"), "name" : "james", "type" : "unicorn", "vampireKills" : 17 }
{ "_id" : ObjectId("4e6f51e3cb0665778029de8a"), "name" : "jacob", "type" : "unicorn", "vampireKills" : 15 }
{ "_id" : ObjectId("4e6f51d5cb0665778029de89"), "name" : "jasmine", "type" : "unicorn", "vampireKills" : 13 }
{ "_id" : ObjectId("4e6f51b6cb0665778029de87"), "name" : "jordan", "type" : "donkey", "vampireKills" : 2 }
{ "_id" : ObjectId("4e6f51c7cb0665778029de88"), "name" : "junior", "type" : "donkey", "vampireKills" : 1 }

when it sorts the result omits the following record.
 
{ "_id" : ObjectId("4e6f51d5cb0665778029de89"), "name" : "jasmine", "type" : "unicorn", "vampireKills" : 13 }

Thanks!

Asoka
--
Asoka Sampath Edussooriya

Asoka Sampath Edussooriya

unread,
Sep 13, 2011, 9:08:33 AM9/13/11
to mongod...@googlegroups.com
Actually not the sort. it omits when it limits the results.

Thanks !

Asoka 
--
Asoka Sampath Edussooriya

Abhishek kumar

unread,
Sep 13, 2011, 1:31:57 PM9/13/11
to mongodb-user
@Sam :
I totally got your point, but there is one issue which I was trying to
resolve....

var
cur1=db.messages.find({fromId:"b...@d.com",toId:"s...@d.com"}).limit(2)
var
cur2=db.messages.find({fromId:"s...@d.com",toId:"b...@d.com"}).limit(2)

I was having indexes on (fromID : 1,toId : 1,sentOn : -1) such that
cur1 returns me the last latest 2 messages of (fromId and toId) using
the index, without using the sort, so total no of scanned elements was
just 2 (and it was ensured that these are latest messages from
["b...@d.com" to "s...@d.com"])
same with the case of cur2

So, in both case I don't require to call sort() function, which is a
costly affair, now I need to do the sort,

In your case:
find({$or: [{"fromId": "b...@d.com", "told": "s...@d.com"}, {"fromId":
"a...@d.com",
"told": "p...@d.com"}]}).sort("dateColumn": 1).limit(2)

it will take a lot more time, because query parsing plan is not as
powerful in mongo as in mysql, is scans the entries, do sorting and
then put limit()


I think, now after getting the cur1 and cur2, I should do the sort in
my application only...
any new ideas


On Sep 13, 3:30 pm, Sam Millman <sam.mill...@gmail.com> wrote:
> Also as Karl states it can be done client side however you will lose the
> fast sorting and indexing capabilities of MongoDB since the cursor will no
> longer be a cursor but an array in your specific language.
>
> This can make it near on impossible to sort some result sets at an easy pace
> (such as large result sets) and most definitely would not be scalable.
>
> It is by far best to combine the two queries into one single query if you
> require them to interact together and be sorted as a single list.
>
> On 13 September 2011 11:09, Sam Millman <sam.mill...@gmail.com> wrote:
>
>
>
>
>
>
>
> > As Asoka states, the best method to combine the two cursors is to query
> > both statements in one find() statement.
>
> > Mongo can merge two separate cursors, however, this IS NOT advised.
>
> > You can pull both cursors down into a merged collection (temporary) from
> > which you can read from. This is really really really slow and totally
> > negates any speed gain from using MongoDB. This is since it would require to
> > write to disk.
>
> > On 13 September 2011 10:49, Asoka Sampath Edussooriya <eas...@gmail.com>wrote:
>
> >> Hi Abhishek,
>
> >> Did u try the way that Karl noted? I think you are asking something
> >> similar to combining both queries together right ? Please follow the below
> >> links to get an Idea.
>
> >>http://www.mongodb.org/display/DOCS/OR+operations+in+query+expressions
> >>http://www.mongodb.org/display/DOCS/Advanced+Queries
>
> >> Thanks!
>
> >> Asoka
>
> >> On Tue, Sep 13, 2011 at 2:43 PM, Abhishek kumar <abhishekiit...@gmail.com

Sam Millman

unread,
Sep 13, 2011, 1:45:52 PM9/13/11
to mongod...@googlegroups.com
"So, in both case I don't require to call sort() function, which is a
costly affair, now I need to do the sort,"

I just guessed your sort. If you want the last two then your index will work fine without the sort.


"it will take a lot more time, because query parsing plan is not as
powerful in mongo as in mysql, is scans the entries, do sorting and
then put limit()"

Your saying calling two cursors with limit() is quicker than calling one cursor with $or?

What you gotta remember is that MongoDB's querying system is a lot slimmer than SQLs which requires tonnes fo stuff that MongoDB just does not implement.

Because of that it doesn't make sense


"without using the sort, so total no of scanned elements was
just 2 (and it was ensured that these are latest messages from
["b...@d.com" to "s...@d.com"])
same with the case of cur"

But I do get what you are trying to do, which is to two get the latest two messages from two specific people.

As such the only method left for you is client side sorting.

Abhishek kumar

unread,
Sep 13, 2011, 3:17:11 PM9/13/11
to mongodb-user
My application is, two person are chatting, I want to fetch the past
chat, in less number of operations, in terms of time
So, I tried to use index on (fromId : 1,toId : 1,sentOn : -1)

let the chat be:
b...@d.com : '1' (to s...@d.com) at time1
s...@d.com : '2' (to d...@d.com) at time2
b...@d.com : '3' (to s...@d.com) at time3
s...@d.com : '4' (to d...@d.com) at time4

time1<time2<time3<time4

and there are other chats of b...@d.com and s...@d.com with other person in
the same table, so I decided to go for index (fromId : 1,toId :
1,sentOn : -1) so that I can make benefit of the index to optimize the
time
so, I just want to fetch the last 2 message i.e.

b...@d.com : '3' (to s...@d.com) at time3
s...@d.com : '4' (to d...@d.com) at time4

so what can be more efficient way of doing...
-------------------------------------------------------

Your saying calling two cursors with limit() is quicker than calling
one
cursor with $or?
As per I checked in my DB, when I did $or as told by you and called
explain() function which gave me following result
{
"cursor" : "BasicCursor",
"nscanned" : 7594,
"nscannedObjects" : 7594,
"n" : 100,
"scanAndOrder" : true,
"millis" : 32,
"indexBounds" : {

}
}

but, in my case using two cursors, nscanned Objects are just (100+100)
then have to do sorting to get the latest 100

Kindly, explain if I am wrong anywhere.......

On Sep 13, 10:45 pm, Sam Millman <sam.mill...@gmail.com> wrote:
> "So, in both case I don't require to call sort() function, which is a
> costly affair, now I need to do the sort,"
>
> I just guessed your sort. If you want the last two then your index will work
> fine without the sort.
>
> "it will take a lot more time, because query parsing plan is not as
> powerful in mongo as in mysql, is scans the entries, do sorting and
> then put limit()"
>
> Your saying calling two cursors with limit() is quicker than calling one
> cursor with $or?
>
> What you gotta remember is that MongoDB's querying system is a lot slimmer
> than SQLs which requires tonnes fo stuff that MongoDB just does not
> implement.
>
> Because of that it doesn't make sense
>
> "without using the sort, so total no of scanned elements was
> just 2 (and it was ensured that these are latest messages from
> ["b...@d.com" to "s...@d.com"])
> same with the case of cur"
>
> But I do get what you are trying to do, which is to two get the latest two
> messages from two specific people.
>
> As such the only method left for you is client side sorting.
>

Sam Millman

unread,
Sep 13, 2011, 3:39:36 PM9/13/11
to mongod...@googlegroups.com
Ah you have an index there.

Though $or should still use an index (I think).

If you were to create a compound index on formId and toldId would that lower the amount of scanned?

Abhishek kumar

unread,
Sep 13, 2011, 10:40:50 PM9/13/11
to mongodb-user
The index that I made, in that I dont need to sort based on timeStamp,
because I am just asking 100 elements (limit(100)) so its using the
index and fetching me the top 100 in decreasing order, which is quite
expected.
But in case if I have separate [index on fromId and index on toId] or
[index on (fromId,date) and index on (toId,date)], the number of
scanned entries are quite more, so the time taken is much more


On Sep 14, 12:39 am, Sam Millman <sam.mill...@gmail.com> wrote:
> Ah you have an index there.
>
> Though $or should still use an index (I think).
>
> If you were to create a compound index on formId and toldId would that lower
> the amount of scanned?
>
Reply all
Reply to author
Forward
0 new messages