My app has a collection whose documents look like this:
{
  sender: '1000',
  receiver: '9999',
  type: 'text',
  content: ' this is content',
  sentDate: ISODate("2011-10-12T14:54:02.069Z")
}
The collection holds more than 50M records.
I need to query the records between sender 'X' and receiver 'Y' quickly.
First, I created the index {sender:1, receiver:1, sentDate:-1} and queried against it.
Next, I changed the structure of the collection to this:
{
  sender: '1000',
  receiver: '9999',
  type: 'text',
  both: ['1000', '9999'],
  content: ' this is content',
  sentDate: ISODate("2011-10-12T14:54:02.069Z")
}
That is, I added a new field named 'both' that stores 'sender' and 'receiver', and queried on it.
The query does hit the index, but if sender '1000' has a large number of records (>100000), the query becomes very slow, and I see lots of page faults in mongostat.
It seems MongoDB loads every document indexed under {both:'1000'} and checks whether its 'both' field contains exactly ['1000','9999']. Is that right?
I have no idea how to resolve the problem. Can you give me some suggestions?
2) Regarding your second query with $and and the array structure:
- Creating an index on an array field creates an index entry for each element of the array (http://www.mongodb.org/display/DOCS/Multikeys).
- An $all query uses the index to look up just its first element (here '1000', which is also the first element in the both array); the remaining elements are then checked against each document found.
Hope this helps.
Kay
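The multikey behaviour described above can be illustrated with a toy model in plain JavaScript. This is only a sketch under stated assumptions: `buildMultikeyIndex` and `findAll` are hypothetical helpers, not MongoDB APIs, and the five documents are made up. It shows why a busy sender makes the $all query expensive: every document indexed under the first value is fetched, and the second value is only checked afterwards.

```javascript
// Toy model of a multikey index on the array field `both`.
// Hypothetical data and helpers; nothing here calls MongoDB.
const docs = [
  { _id: 1, both: ["1000", "9999"] },
  { _id: 2, both: ["1000", "2222"] },
  { _id: 3, both: ["1000", "3333"] },
  { _id: 4, both: ["9999", "1000"] },
  { _id: 5, both: ["4444", "9999"] },
];

// One index entry per distinct array element, pointing at the document id.
function buildMultikeyIndex(collection) {
  const index = new Map();
  for (const doc of collection) {
    for (const value of new Set(doc.both)) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

// Emulates the behaviour described in the thread: only the first $all value
// is looked up in the index; the others are checked on each fetched document.
function findAll(collection, index, values) {
  const [first, ...rest] = values;
  const byId = new Map(collection.map((d) => [d._id, d]));
  const matches = [];
  let fetched = 0;
  for (const id of index.get(first) || []) {
    const doc = byId.get(id);
    fetched += 1; // one document load per index entry for `first`
    if (rest.every((v) => doc.both.includes(v))) matches.push(doc._id);
  }
  return { matches, fetched };
}

const index = buildMultikeyIndex(docs);
const result = findAll(docs, index, ["1000", "9999"]);
console.log(result); // { matches: [ 1, 4 ], fetched: 4 }
```

In this model four documents are loaded to return two. With more than 100000 records for sender '1000', the same pattern would touch all of them, which is consistent with the page faults reported above.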
On Friday, September 7, 2012 4:54:56 AM UTC-4, Hanson Lu wrote:
> I forgot to mention that I have created the index {both:1, sentDate:-1} for the query db.msgs.find({both:{$all:['1000','9999']}}).sort({sentDate:-1}).
It sounds like you could just modify the query above to do an $or on the r and s
fields, rather than a compound $or on groups of the r and s fields within the
subdocument.
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongodb-user@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user+unsubscribe@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb
> On 8 September 2012 07:18, Hanson Lu <hans...@gmail.com> wrote:
>> Many thanks, Kay. It is a good suggestion.
>> My app also has another requirement: querying the records of one person
>> (sender is 'X' or receiver is 'X').
>> When both is an array, I can query like this:
>> db.msgs.find({both:'1000'}).sort({sentDate:-1})
>> If both is an embedded document, it seems it cannot support this query.
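The two requirements map onto two different query shapes, sketched below as plain JavaScript functions that only build query documents. The function names are hypothetical, and the embedded `{ s, r }` form is taken from the queries that appear later in this thread. Nothing here contacts a server; the returned objects are what would be passed to `db.msgs.find(...)` in the shell.

```javascript
// Hypothetical helpers that build query documents only.

// Array form, both: ['1000', '9999'].
// Matches any message where `user` is either sender or receiver.
function involvingUserArrayForm(user) {
  return { both: user };
}

// Embedded form, both: { s: '1000', r: '9999' }.
// Matches the conversation between a and b in either direction.
// Embedded-document equality depends on field order, so s must come before r.
function betweenUsersEmbeddedForm(a, b) {
  return { both: { $in: [{ s: a, r: b }, { s: b, r: a }] } };
}

const anyFor1000 = involvingUserArrayForm("1000");
const conversation = betweenUsersEmbeddedForm("1000", "9999");
console.log(JSON.stringify(anyFor1000));
console.log(JSON.stringify(conversation));

// Assumed shell usage:
//   db.msgs.find(anyFor1000).sort({ sentDate: -1 })
//   db.msgs.find(conversation).sort({ sentDate: -1 }).limit(5)
```

The sketch makes the tension visible: the array form answers "anything involving X" with a single equality, while the embedded form answers "between X and Y" with exact matches, and neither field shape serves both queries equally well.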
I just realised that won't work either. I did have a plan, but after more
thought I realised it wouldn't do the trick.
Probably your best bet, and only if the query is slow, is to store two forms of
it: one in the form that Kay talked about and another in the form you had
originally.
Though I doubt your query will be slow: if you put an index on the query
itself, the sort would probably be near-instantaneous unless you're hoping to
pull lots of records.
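A minimal sketch of the "store two forms" idea, assuming the array form keeps the name `both` and the embedded form goes into a second field. The field name `pair` and the function `buildMessage` are invented for illustration and do not come from the thread.

```javascript
// Hypothetical document builder that stores both forms side by side.
// `pair` is an invented field name; the thread does not name it.
function buildMessage(sender, receiver, content, sentDate) {
  return {
    sender,
    receiver,
    type: "text",
    content,
    sentDate,
    both: [sender, receiver],         // array form: "anything involving X"
    pair: { s: sender, r: receiver }, // embedded form: "between X and Y"
  };
}

const message = buildMessage("1000", "9999", "this is content", new Date("2011-10-12T14:54:02.069Z"));
console.log(message.both, message.pair);

// Assumed indexes, one per query shape:
//   { both: 1, sentDate: -1 }
//   { pair: 1, type: 1, sentDate: -1 }
```

The trade-off is extra storage and a second index to maintain on a 50M-document collection, which is why the suggestion above is conditional on the query actually being slow.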
On 8 September 2012 12:51, Sam Millman <sam.mill...@gmail.com> wrote:
> Sorry, I mean you could replace that with two $ins; I think that will use
> the index.
Hi Kay,
I tested the query with the index you suggested. I found a problem: it scans all documents exchanged between sender 'X' and receiver 'Y'.
db.msgs.find( { both: { $in: [ { s:'1000', r:'9999' }, { s:'9999', r:'1000' } ] } } ).limit(5).sort({sentDate:-1}).explain()
{
  "cursor" : "BtreeCursor both_1_type_1_sentDate_-1 multi",
  "nscanned" : 28,
  "nscannedObjects" : 26,
  "n" : 5,
  "scanAndOrder" : true,
  "millis" : 0,
  "nYields" : 0,
  "nChunkSkips" : 0,
  "isMultiKey" : false,
  "indexOnly" : false,
}
I think the solution is acceptable if the result set is small.
On Saturday, September 8, 2012 11:09:43 PM UTC+8, Kay wrote:
> Oops -- just realized I typed "sendDate" instead of "sentDate" in both the index creation and find statements.
On Tuesday, September 11, 2012 12:08:32 PM UTC+8, Asya Kamsky wrote:
> You might check out this comment (if you are using 2.2, which I think you must be):
The fix for SERVER-5063 would improve things by using the index to sort, but even then I can't see how nscanned can be less than 10. You have two values being read from the index: limiting each of them to 5 is doable, but limiting the total to 5 is not (you should see 5 if there were no "$in" or "$or" expression).
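The counting argument above can be sketched in plain JavaScript. This is a toy model, not how the server is implemented: the two arrays stand in for the index ranges of the two $in values, the numbers are made-up timestamps (larger means newer), and both function names are hypothetical. The ranges hold 28 entries in total only to echo the explain output earlier in the thread.

```javascript
// Two index ranges, one per $in value, each already ordered newest first.
const rangeA = [98, 95, 90, 84, 80, 71, 66, 60, 52, 41, 33, 20, 11, 5]; // { s:'1000', r:'9999' }
const rangeB = [99, 97, 93, 88, 79, 75, 64, 58, 47, 39, 28, 17, 9, 2];  // { s:'9999', r:'1000' }

// Strategy 1: read every matching entry, then sort in memory.
function scanAllThenSort(ranges, limit) {
  const all = ranges.flat();
  const top = [...all].sort((x, y) => y - x).slice(0, limit);
  return { top, scanned: all.length };
}

// Strategy 2: read only the first `limit` entries of each range, then merge.
// No entry beyond position `limit` in a range can reach the overall top `limit`.
function limitEachRange(ranges, limit) {
  const heads = ranges.map((r) => r.slice(0, limit));
  const scanned = heads.reduce((sum, h) => sum + h.length, 0);
  const top = heads.flat().sort((x, y) => y - x).slice(0, limit);
  return { top, scanned };
}

const naive = scanAllThenSort([rangeA, rangeB], 5);
const perRange = limitEachRange([rangeA, rangeB], 5);
console.log(naive);    // { top: [ 99, 98, 97, 95, 93 ], scanned: 28 }
console.log(perRange); // { top: [ 99, 98, 97, 95, 93 ], scanned: 10 }
```

Both strategies return the same five newest entries, but the second reads 10 index entries instead of 28, which is the "limiting each of them to 5" case described above.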
On Tuesday, September 11, 2012 2:05:20 AM UTC-3, Hanson Lu wrote:
> I am using version 2.0.2.
> The problem is that I think nscannedObjects should be equal to 5, the limit I set in the query.
Hi Asya,
I verified with 2.2.0 just now. The problem is fixed, and nscannedObjects is equal to the limit count (5).
{
  "isMultiKey" : false,
  "n" : 5,
  "nscannedObjects" : 5,
  "nscanned" : 6,
  "nscannedObjectsAllPlans" : 10,
  "nscannedAllPlans" : 11,
  "scanAndOrder" : true,
  "indexOnly" : false,
}
Update: the query I typed above is missing the 'type' field. The correct query is
db.msgs.find( { both: { $in: [ { s:'1000', r:'9999' }, { s:'9999', r:'1000' } ] }, type:'text' } ).limit(5).sort({sentDate:-1})
If the query is run without 'type' in version 2.2, it scans all matching documents even with 'limit'.
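A minimal sketch of how the filter above could be built for any pair of users. The helper name `conversationFilter` is hypothetical, and it assumes `both` is stored as an embedded document with the fields `s` (sender) and `r` (receiver) in that order, since an exact match on an embedded document is sensitive to field order.

```javascript
// Hypothetical helper: builds the filter for messages exchanged between two
// users, assuming `both` is stored as the embedded document { s: sender, r: receiver }.
function conversationFilter(userA, userB, type) {
  return {
    // Both directions of the conversation, each as an exact embedded-document match.
    both: { $in: [{ s: userA, r: userB }, { s: userB, r: userA }] },
    type: type,
  };
}

const filter = conversationFilter('1000', '9999', 'text');
console.log(JSON.stringify(filter));

// In the mongo shell this would be used roughly as:
//   db.msgs.find(filter).sort({ sentDate: -1 }).limit(5)
```

Keeping 'type' in the filter follows the observation above that, on 2.2, the query without it scanned all matching documents despite the limit.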
On Tuesday, September 11, 2012 3:00:20 PM UTC+8, Hanson Lu wrote:
> Hi Asya
> I verified with 2.2.0 just now. The problem is fixed, and nscannedObjects is equal to the limit count (5).
> {
>     "isMultiKey" : false,
>     "n" : 5,
>     "nscannedObjects" : 5,
>     "nscanned" : 6,
>     "nscannedObjectsAllPlans" : 10,
>     "nscannedAllPlans" : 11,
>     "scanAndOrder" : true,
>     "indexOnly" : false,
> }
> Thanks!
> Regards
>>>>>>> One possible suggestion -- instead of the *both* field being an array, if you make it an embedded document so that you would have