Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
How to resolve the $or sort not using index problem
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  17 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Hanson Lu  
View profile  
 More options Sep 7 2012, 4:43 am
From: Hanson Lu <hans...@gmail.com>
Date: Fri, 7 Sep 2012 01:43:01 -0700 (PDT)
Local: Fri, Sep 7 2012 4:43 am
Subject: How to resolve the $or sort not using index problem

My app has a collection like this
{
  sender:'1000',
  receiver:'9999',
  type:'text',
  content:' this is content' ,
  sentDate:ISODate("2011-10-12T14:54:02.069Z)

}

The collection records number > 50M.

There is a requirement to query records *between *sender 'X' and receiver
'Y' in short time.

First, i create index, {sender:1, receiver:1, sentDate:-1}, use query

  db.msgs.find({$or:[{sender:'1000',receiver:'9999'}, { sender:'9999',
receiver:'1000'}]}).sort(sentDate:-1},

but this query will not use the above index, it do scan full table.  The
problem is known as
https://jira.mongodb.org/browse/SERVER-1205

Next, I change the structure of the collection like this
{
  sender:'1000',
  receiver:'9999',
  type:'text',
  both:['1000','9999']
  content:' this is content' ,
  sentDate:ISODate("2011-10-12T14:54:02.069Z)

}

add a new field named 'both' to store 'sender' and 'receiver' , and use
query

db.msgs.find({both:{$all:['1000','9999']}).sort(sendDate:-1},.

The query do hit index, but if  sender '1000'  has large number of
records(>100000),

the query become very slow, i saw lots page fault from mongostat.

It seems MongoDB will load all documents for indexed for {both:'1000'},
 and compare whether the 'both' contents is exactly['1000','9999'].
is it right?

I have no idea how to resolve the problem. Can you give me some suggestion.

Regards.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hanson Lu  
View profile  
 More options Sep 7 2012, 4:54 am
From: Hanson Lu <hans...@gmail.com>
Date: Fri, 7 Sep 2012 01:54:56 -0700 (PDT)
Local: Fri, Sep 7 2012 4:54 am
Subject: Re: How to resolve the $or sort not using index problem

I forgot to mention that i have created the index {both:1,sentDate:-1}  for
the query db.msgs.find({both:{$all:['1000','9999']}).sort(sendDate:-1},.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Kay  
View profile  
 More options Sep 7 2012, 5:46 pm
From: Kay <kay....@10gen.com>
Date: Fri, 7 Sep 2012 14:46:40 -0700 (PDT)
Local: Fri, Sep 7 2012 5:46 pm
Subject: Re: How to resolve the $or sort not using index problem

Hi Hanson --

We have a ticket to implement index intersection which I believe will
provide a solution in the future.
https://jira.mongodb.org/browse/SERVER-3071

One possible suggestion -- instead of the *both* field being an array, if
you make it an embedded document
so that you would have

{
sender:'1000',
receiver:'9999',
type:'text',
*both: { s: '1000', r: '9999'},*
content:' this is content' ,
sentDate:ISODate("2011-10-12T14:54:02.069Z)

}

and

{
sender:'9999',
receiver:'1000',
type:'text',
*both: { s: '9999', r: '1000'},*
content:' this is content' ,
sentDate:ISODate("2011-10-12T15:54:02.069Z)

}

then, you would create the index

*db.msgs.ensureIndex({both: 1, sendDate:-1})*

and your query would use the *$in *operator rather than the *$all*

*db.msgs.find( { both: { $in: [ { s:'1000', r:'9999' }, {s:'9999',
r:'1000'} ] } } ).sort({sendDate:-1})*

--

And just a summary of what was happening with the your earlier queries (it
already seems like you have a good grasp on the situation):

1)  Regarding your initial query with *$or* and the compound-key index on *{sender:1,
receiver:1, sentDate:-1}*

   - *$or* does not use a compound-key index*.  *Instead,* $or** r*uns
   multiple index plans in parallel.
   -
   https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/ncFY...
   * <https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/ncFY...>
     *presents a good discussion of indexes with *$o**r *

2)  Regarding your second query with *$and* and the array structure

   - creating an index on an array field creates an index on each element
   of the array (http://www.mongodb.org/display/DOCS/Multikeys)
   - an *$all* looks up just the first element in the index (which is the
   first element in the* both* array)

Hope this helps.

Kay


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hanson Lu  
View profile  
 More options Sep 8 2012, 2:18 am
From: Hanson Lu <hans...@gmail.com>
Date: Fri, 7 Sep 2012 23:18:12 -0700 (PDT)
Local: Sat, Sep 8 2012 2:18 am
Subject: Re: How to resolve the $or sort not using index problem

Many Thanks, Kay. It is a good suggestion.

My app also has another requirement that query records of someone(sender is
'X' or receiver is 'X')

so when  both is  array, i can query like

 db.msgs.find({both:'1000}').sort({sendDate:-1}

if both is embedded document,  it seems that it can not support the query.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Sam Millman  
View profile  
 More options Sep 8 2012, 7:46 am
From: Sam Millman <sam.mill...@gmail.com>
Date: Sat, 8 Sep 2012 12:46:44 +0100
Local: Sat, Sep 8 2012 7:46 am
Subject: Re: [mongodb-user] Re: How to resolve the $or sort not using index problem

It sounds like you would just modify the query above to just do an $or on
the r and s field rather than a compound $or on groups of the r and s field
within the subdocument.

On 8 September 2012 07:18, Hanson Lu <hans...@gmail.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Sam Millman  
View profile  
 More options Sep 8 2012, 7:51 am
From: Sam Millman <sam.mill...@gmail.com>
Date: Sat, 8 Sep 2012 12:51:25 +0100
Local: Sat, Sep 8 2012 7:51 am
Subject: Re: [mongodb-user] Re: How to resolve the $or sort not using index problem

Sorry I mean you could replace that with two $ins, I think that will use
index.

On 8 September 2012 12:46, Sam Millman <sam.mill...@gmail.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Sam Millman  
View profile  
 More options Sep 8 2012, 10:05 am
From: Sam Millman <sam.mill...@gmail.com>
Date: Sat, 8 Sep 2012 15:04:58 +0100
Local: Sat, Sep 8 2012 10:04 am
Subject: Re: [mongodb-user] Re: How to resolve the $or sort not using index problem

I just realised that won't work either. I did have a plan but after more
thought I realised it wouldnt do the trick.

Probably your best bet, only if the query is slow, is to store two forms of
it. One in that form that Kay talked about and another in the form you had
originally.

Though I doubt your query will be slow so if you put a index on the query
itself the sort would probably be instantaneous unless your hoping to pull
lots of records.

On 8 September 2012 12:51, Sam Millman <sam.mill...@gmail.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Kay  
View profile  
 More options Sep 8 2012, 11:09 am
From: Kay <kay....@10gen.com>
Date: Sat, 8 Sep 2012 08:09:43 -0700 (PDT)
Local: Sat, Sep 8 2012 11:09 am
Subject: Re: How to resolve the $or sort not using index problem

Oops --
just realized I typed "sen*d*Date" instead of "sen*t*Date" in both the
index creation and find statements.

The correct statements would be:

*db.msgs.ensureIndex({both: 1, sentDate:-1})*
*db.msgs.find( { both: { $in: [ { s:'1000', r:'9999' }, {s:'9999',
r:'1000'} ] } } ).sort({sentDate:-1})*
*
*


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Kay  
View profile  
 More options Sep 8 2012, 1:25 pm
From: Kay <kay....@10gen.com>
Date: Sat, 8 Sep 2012 10:25:50 -0700 (PDT)
Local: Sat, Sep 8 2012 1:25 pm
Subject: Re: How to resolve the $or sort not using index problem

Hi Hanson --

It does seem like it's hard to get away from the $or and sort.

I agree with Sam's solution in that within your document, you would also
need your original solution of an array structure.  

I'll keep tinkering with it and if I run across something, I'll let you
know.

Best regards,

Kay


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hanson Lu  
View profile  
 More options Sep 9 2012, 5:19 am
From: Hanson Lu <hans...@gmail.com>
Date: Sun, 9 Sep 2012 02:19:47 -0700 (PDT)
Local: Sun, Sep 9 2012 5:19 am
Subject: Re: How to resolve the $or sort not using index problem

It seems that is a way to resolve my issue, though index size will be very
large.

Thanks,  Sammaye  and Kay.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hanson Lu  
View profile  
 More options Sep 10 2012, 3:21 am
From: Hanson Lu <hans...@gmail.com>
Date: Mon, 10 Sep 2012 00:21:32 -0700 (PDT)
Local: Mon, Sep 10 2012 3:21 am
Subject: Re: How to resolve the $or sort not using index problem

Hi Kay
  I tested the query with  the index you suggest. I found a problem that it
will scan all documents of sender 'X' and receiver 'Y'.
  db.msgs.find( { both: { $in: [ { s:'1000', r:'9999' }, {s:'9999',
r:'1000'} ] } } ).limit(5).sort({sentDate:-1}).explain()
  {
        "cursor" : "BtreeCursor both_1_type_1_sentDate_-1 multi",
       * "nscanned" : 28,*
      *  "nscannedObjects" : 26*,
    *    "n" : 5,*
        "scanAndOrder" : true,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
 }

I think if the result set is little, the solution is acceptable.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Kay  
View profile  
 More options Sep 10 2012, 11:30 pm
From: Kay <kay....@10gen.com>
Date: Mon, 10 Sep 2012 20:30:35 -0700 (PDT)
Local: Mon, Sep 10 2012 11:30 pm
Subject: Re: How to resolve the $or sort not using index problem

Hi Hanson --
Thanks so much for the update.  I was actually quite curious to see how the
query would work for you.

If I do come across something that might work better, I'll definitely keep
you posted.

Regards,

Kay


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Asya Kamsky  
View profile  
 More options Sep 11 2012, 12:08 am
From: Asya Kamsky <a...@10gen.com>
Date: Mon, 10 Sep 2012 21:08:32 -0700 (PDT)
Local: Tues, Sep 11 2012 12:08 am
Subject: Re: How to resolve the $or sort not using index problem

You might check out this comment (if you are using 2.2, which I think you
must be):

https://jira.mongodb.org/browse/SERVER-5063?focusedCommentId=138239&p...

This explains why the nscanned may be higher than you would be expecting.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hanson Lu  
View profile  
 More options Sep 11 2012, 1:05 am
From: Hanson Lu <hans...@gmail.com>
Date: Mon, 10 Sep 2012 22:05:20 -0700 (PDT)
Local: Tues, Sep 11 2012 1:05 am
Subject: Re: How to resolve the $or sort not using index problem

I'am using version 2.0.2

The problem is that  I think *nscannedObjects  should be equal to 5 that i
limited in the query.*
*
*


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Asya Kamsky  
View profile  
 More options Sep 11 2012, 2:23 am
From: Asya Kamsky <a...@10gen.com>
Date: Mon, 10 Sep 2012 23:23:42 -0700 (PDT)
Local: Tues, Sep 11 2012 2:23 am
Subject: Re: How to resolve the $or sort not using index problem

The fix for SERVER-5063 would improve things by using the index to sort but
even then I can't see how nscanned can be less than 10.
You have two values which are being read from the index - limiting each of
them to 5 is doable but not 5 (you should see 5 if there was not an "$in"
or "$or" expression.

Asya


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hanson Lu  
View profile  
 More options Sep 11 2012, 3:00 am
From: Hanson Lu <hans...@gmail.com>
Date: Tue, 11 Sep 2012 00:00:20 -0700 (PDT)
Local: Tues, Sep 11 2012 3:00 am
Subject: Re: How to resolve the $or sort not using index problem

Hi Asya

 I verified with 2.2.0 just now. The problem is fixed , and nscanobject is
equal to the limit count(5).  
 {
        "isMultiKey" : false,
        *"n" : 5,*
      *  "nscannedObjects" : 5,*
        "nscanned" : 6,
        "nscannedObjectsAllPlans" : 10,
        "nscannedAllPlans" : 11,
        "scanAndOrder" : true,
        "indexOnly" : false,
 }

Thanks!
Regards


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hanson Lu  
View profile  
 More options Sep 11 2012, 3:31 am
From: Hanson Lu <hans...@gmail.com>
Date: Tue, 11 Sep 2012 00:31:00 -0700 (PDT)
Subject: Re: How to resolve the $or sort not using index problem

Update..
The query int above i typing miss a field  'type' . The correct query is
  db.msgs.find( { both: { $in: [ { s:'1000', r:'9999' }, {s:'9999',
r:'1000'} ] } , *type:'text'*} ).limit(5).sort({sentDate:-1})

if use the query without 'type' in version 2.2, it will scan all matched
document in  query with 'limit'.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »