To be clear, that was for a collection with 200 million records. I
checked disk utilization with iostat during the query, and it sat at
a constant 98%, whereas running the individual queries caused very
little utilization. This is what gave me the impression that the
index wasn't being used.
I've tried using $in and it's much more manageable, taking exactly the
amount of time it should. Thanks!
However, I've compared the individual results to $in as you suggested,
and the output is different.
Individual:
{ "_id" : "1743", "value" : { "total_data" : 58016271 } }
{ "_id" : "23", "value" : { "total_data" : 103653535 } }
Using $in:
{ "_id" : "1743", "value" : { "total_data" : 58016271 } }
{ "_id" : "23", "value" : { "total_data" : 103653535 } }
The output of the map reduce call indicates that exactly the same
number of records was read as input for both types of queries, yet the
totals are incorrect. What could cause this? Obviously my reduce has
a bug, but it's so simple I have no idea what could be causing this.
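
A minimal sketch of how the reduce could be checked outside MongoDB, in
plain JavaScript. The sample values are made up and are not from the
collection. It compares a single-pass reduce with a two-batch re-reduce,
since MongoDB may call reduce more than once for the same key. The last
line shows a separate thing worth ruling out, on the assumption (not
confirmed anywhere in this thread) that sent or received might be stored
as strings:

```javascript
// Plain JavaScript, runnable in Node or the mongo shell; no database needed.
// The reduce function is copied from the original post.
var reduce = function (key, values) {
    var result = { total_data: 0 };
    values.forEach(function (value) {
        result.total_data += value.total_data;
    });
    return result;
};

// Made-up sample values standing in for what map would emit for one key.
var emitted = [
    { total_data: 10 },
    { total_data: 20 },
    { total_data: 30 },
    { total_data: 40 }
];

// Reduce everything in a single pass.
var onePass = reduce("1743", emitted);

// Reduce in two batches, then re-reduce the partial results, as
// MongoDB may do when a key has many values.
var partialA = reduce("1743", emitted.slice(0, 2));
var partialB = reduce("1743", emitted.slice(2));
var rereduced = reduce("1743", [partialA, partialB]);

// Both paths should give the same total if reduce is safe to re-run.
var reduceIsConsistent = onePass.total_data === rereduced.total_data;

// Assumption for illustration only: if a field were a string, the +
// in map would concatenate instead of add.
var stringSum = "5" + 7;
```

If reduceIsConsistent comes out true, the reduce itself is probably
fine, and the types of the values emitted by map would be the next
thing to look at.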
On Sep 10, 1:03 am, Kyle Banker <kyleban...@gmail.com> wrote:
> I don't believe that a difference of 45 minutes could possibly be
> explained by whether $or is using the index or not.
>
> For the record, the $or does use an index, but you should actually
> express this differently:
>
> query: { time: {'$gte': start, '$lte': end}, username: { $in: ['1743',
> '23'] } }
>
> I also suggest that you run both the $or query and the individual
> queries separately to ensure that they return the same total number
> of documents.
>
> On Sep 8, 9:09 pm, Nathan Hoad <nat...@getoffmalawn.com> wrote:
>
> > I have map and reduce functions defined as follows:
>
> > map = function() {
> >     emit(this.username, { total_data: this.sent + this.received });
> > }
>
> > reduce = function (key, values) {
> >     var result = { total_data: 0 };
> >     values.forEach(function (value) {
> >         result.total_data += value.total_data;
> >     });
> >     return result;
> > }
>
> > And I'm running the following query:
>
> > db.collection.mapReduce(m, r, {out:'myoutput',query: { time: {'$gte':
> > start, '$lte': end}, '$or': [ { username: '1743'}, {username:
> > '23'}] }, sort: {username: -1} });
>
> > If I run the query without the $or, and do two separate queries for