I'm currently trying out the aggregation framework on a collection that sometimes is updated via map reduce so documents only have the _id and value documents with various subdocuments. But I'm running into some things that seem quite odd and I'm at a loss to figure out what I'm supposed to be trying or doing differently. I apologize for all of the text in the email I'm copying. I've tried several different aggregation queries and often not getting results similar to what is described in
http://stackoverflow.com/questions/12878722/mongodb-nested-field-in-groups-id but I haven't found any documentation that addresses this. Most recently I have tried the query at the bottom of the email, { $match : { "value.U" : 158, "_id.t" :"I"} , $group : {_id : {Z: "$_id.Z", U : "$value.U" } , cou
nt : {$sum: "$value.I"} }} . Assuming that I understand the $match pipeline correctly it serves a similar function this find query.
mongos> db.hourAggregate.find({"value.U" : 158, "_id.t" : "I"})
{ "_id" : { "K" : 837, "C" : 123837, "t" : "I", "Z" : 137, "T" : ISODate("2011-09-13T17:00:00Z") }, "value" : { "I" : NumberLong(1), "U" : 158, "u" : 84 } }
{ "_id" : { "K" : 836, "C" : 123836, "t" : "I", "Z" : 138, "T" : ISODate("2011-09-13T17:00:00Z") }, "value" : { "I" : NumberLong(1), "U" : 158, "u" : 84 } }
{ "_id" : { "K" : 860, "C" : 123860, "t" : "I", "Z" : 138, "T" : ISODate("2011-09-13T17:00:00Z") }, "value" : { "I" : NumberLong(1), "U" : 158, "u" : 84 } }
{ "_id" : { "K" : 855, "C" : 123855, "t" : "I", "Z" : 138, "T" : ISODate("2011-09-13T17:00:00Z") }, "value" : { "I" : NumberLong(1), "U" : 158, "u" : 84 } }
{ "_id" : { "K" : 839, "C" : 123839, "t" : "I", "Z" : 139, "T" : ISODate("2011-09-13T17:00:00Z") }, "value" : { "I" : NumberLong(1), "U" : 158, "u" : 84 } }
{ "_id" : { "K" : 836, "C" : 123836, "t" : "I", "Z" : 139, "T" : ISODate("2011-09-13T17:00:00Z") }, "value" : { "I" : NumberLong(1), "U" : 158, "u" : 84 } }
(there are 63664 documents that match this query in the data I'm using)
In essence I simply want to get the sum of value.I for all documents with value.U being 158 grouped by the value of Z. _id.t is a type of document produced by the map reduce that outputs to this collection so all the documents with _id.t: "I" are of the form above, though they may have additional fields in value.
As can be seen in the results below, it appears that not only is the match restriction on "value.U" ignored, but that the _id.Z field simply doesn't appear in the output. The expected output has about 3 or so documents similar to {_id: {U:158, Z:138}, count : NumberLong{345numbers)}} as there are around 3 different values for Z (and this is reflected by the map reduce query I'm trying to reimplement in the aggregation framework). My attempt to use the suggested workaround for that stack overflow question was
mongos> db.hourAggregate.aggregate( { $match : { "value.U" : 158, "_id.t" :"I"} , $project : { Z:"$_id.Z", U:"$value.U", I:"$value.I" }, $group : {_id : {Z: "$Z", U : "$U" } , count : {$sum: "$I"} }})
{ "result" : [ ], "ok" : 1 } provided almost instantaneously.
Any guidance would be greatly appreciated, I'm looking forward to being able to run multiple of these queries (varying the number for value.U and adding date restrictions to the match) in parallel on multiple cores with the aggregation framework as opposed to the one core limitation for map reduce.
mongos> db.hourAggregate.aggregate( { $match : { "value.U" : 158, "_id.t" :"I"} , $group : {_id : {Z: "$_id.Z", U : "$value.U" } , cou
nt : {$sum: "$value.I"} }})
{
"result" : [
{
"_id" : {
"U" : 122
},
"count" : 0
},
{
"_id" : {
"U" : 127
},
"count" : 0
},
{
"_id" : {
"U" : 192
},
"count" : 0
},
{
"_id" : {
"U" : 150
},
"count" : 0
},
{
"_id" : {
"U" : 158
},
"count" : NumberLong(3230956)
},
{
"_id" : {
"U" : 125
},
"count" : 0
},
{
"_id" : {
"U" : 91
},
"count" : 0
},
{
"_id" : {
"U" : 116
},
"count" : 0
},
{
"_id" : {
"U" : 203
},
"count" : 0
},
{
"_id" : {
"U" : 111
},
"count" : 0
},
{
"_id" : {
"U" : 210
},
"count" : 0
},
{
"_id" : {
"U" : 213
},
"count" : NumberLong(97058)
},
{
"_id" : {
"U" : 221
},
"count" : NumberLong(20680)
},
{
"_id" : {
"U" : 139
},
"count" : 0
},
{
"_id" : {
"U" : 156
},
"count" : 0
},
{
"_id" : {
"U" : 120
},
"count" : NumberLong(115950)
},
{
"_id" : {
"U" : 142
},
"count" : 0
},
and so forth for quite a few screens including the entry for U:158 with a "count" : NumberLong(<numbers>)