Hello,
I've a collection I use for storing aggregated historical datas coming
from the online system application.
The aggregation for every document is done on three steps of "unique
keys signature", according to the granularity of the reports I need to
produce. The documents are not finally aggregated due to the needed
output, because it can vary on to many variables. In order to produce
the output I make on-fly mapreduce choosing which kind of granularity
access in the query part.
I choose this solution after a while of performance tests, preferring
direct discarding of documents by querying and then applying the
mapreduce to the remaining. (If there's is a smarter/better way to do
that, please, let me know :))
The problem is in querying part.
That's a sample shell that simulates what I ecounter:
[shell]
use test
db.testcollection.ensureIndex({date: -1, country_code: 1, user_id: 1},
{unique: 1, background: 1});
db.testcollection.insert({ date: new Date("27/08/2010"), tot_visit:
100});
db.testcollection.insert({ date: new Date("27/08/2010"), country_code:
"IT", tot_visit: 77});
db.testcollection.insert({ date: new Date("27/08/2010"), country_code:
"ES", tot_visit: 23});
db.testcollection.insert({ date: new Date("27/08/2010"), country_code:
"ES", user_id: "
and...@spacca.org", tot_visit: 11});
db.testcollection.insert({ date: new Date("27/08/2010"), country_code:
"ES", user_id: "
andrea...@gmail.com", tot_visit: 5});
db.testcollection.insert({ date: new Date("27/08/2010"), country_code:
"ES", user_id: "
andrea...@progloedizioni.com", tot_visit: 7});
db.testcollection.find({date: new Date("27/08/2010")}).count(); // 6
[OK]
db.testcollection.find({date: new Date("27/08/2010"), country_code:
{$exists: true}}).count(); // 5 [OK]
db.testcollection.find({date: new Date("27/08/2010"), country_code:
{$exists: false}}).count(); // 1 [OK]
db.testcollection.find({date: new Date("27/08/2010"), country_code:
{$type: 10}}).count(); // 1 [OK]
db.testcollection.find({date: new Date("27/08/2010"), country_code:
{$exists: true}, user_id: {$exists: true}}).count(); // 3 [OK]
db.testcollection.find({date: new Date("27/08/2010"), country_code:
{$exists: true}, user_id: {$exists: false}}).count(); // 2 [OK]
db.testcollection.find({date: new Date("27/08/2010"), country_code:
{$exists: true}, user_id: {$type: 10}}).count(); // 0 [??]
[/shell]
ps: the test was made on a three shard enviroment