Best way to show dates out of $dayOfYear in Aggregation?

1,662 views
Skip to first unread message

PaulKE

unread,
Jun 12, 2013, 1:25:35 PM6/12/13
to mongod...@googlegroups.com
Hello, 

here is a scenario:

I got number of 'active' users daily basis for a week or month as an example. 

And, the problem is.. I can group them on daily basis using '$dayOfYear' on aggregation framework. 

But, I am not sure how to present the data back on the chart I am working on. 

So, here is a command:

db.metric.aggregate( {$match:{"time":{$gte:ISODate("2013-06-01T00:00:00Z")}}}, 
                          {$group:{_id: {date:{$dayOfYear:"$time"},field:"$user"}}}, 
                          {$group:{_id:{date:'$_id.date'},count:{$sum:1}}} 

This will return data set from June 1, 2013. 

        "result" : [
                {
                        "_id" : {
                                "date" : 152
                        },
                        "count" : 2
                },
                {
                        "_id" : {
                                "date" : 153
                        },
                        "count" : 5
                },
                {
                        "_id" : {
                                "date" : 154
                        },
                        "count" : 5
                },
                {
                        "_id" : {
                                "date" : 155
                        },
                        "count" : 10
                }
        ],
        "ok" : 1


As you see, it returns 'dayOfYear' as it says. But, is there way I can 'ALSO' return the string format of date?

I tried 'project', but, it did not work. 

What would be the best way? so that I can display dates & count on my chart? 

Thanks!

PaulKE

unread,
Jun 12, 2013, 2:45:12 PM6/12/13
to mongod...@googlegroups.com
I think I found a solution using 'dayOfMonth' with year, month sets:

   db.metric.aggregate( {$match:{"time":{$gte:ISODate("2013-06-01T00:00:00Z")}}}, 
                         {$group:{_id: {date:{$dayOfMonth:"$time"}, year:{$year:"$time"}, month:{$month:"$time"}, field:"$user"}}}, 
                         {$group:{_id:{year:"$_id.year", month:"$_id.month", date:'$_id.date'},  count:{$sum:1}}}, 
                         {$sort:{"_id.year":1,"_id.month":1,"_id.date":1}} )


Above seems to give me unique set of 'date' resolution with following prints:

.....
                {
                        "_id" : {
                                "year" : 2013,
                                "month" : 6,
                                "date" : 4,
                                "hr" : 14
                        },
                        "count" : 1
                },
                {
                        "_id" : {
                                "year" : 2013,
                                "month" : 6,
                                "date" : 4,
                                "hr" : 15
                        },
                        "count" : 1
                },
                {
                        "_id" : {
                                "year" : 2013,
                                "month" : 7,
                                "date" : 1,
                                "hr" : 16
                        },
                        "count" : 1
                }
        ],
        "ok" : 1

So,I can format in the client using the given field info above. 

Now, I have this ISODate field as 'time'. If I store separate fields for year, month, date, like:

time.full = ISODate()
time.year = given_year
time.month = given_month
time.date = given_date_of_month

And, If I run 'aggregation' on time.year, time.month, time.date integer fields rather than using $year, $month, $dayOfMonth operators, would that be faster? 

Thanks. 

Asya Kamsky

unread,
Jun 12, 2013, 10:53:36 PM6/12/13
to mongod...@googlegroups.com
This is definitely a valid way to do this.  I describe another way to do it (preserving the ISODate() type of the field you're aggregating on here:

PaulKE

unread,
Jun 13, 2013, 2:17:33 PM6/13/13
to mongod...@googlegroups.com
Interesting... Thanks, Asya. 

Paul

PaulKE

unread,
Jun 14, 2013, 4:45:35 PM6/14/13
to mongod...@googlegroups.com
Hello, 

how can I get time string or information with 'group' info? 

for my earlier result: 

                {
                        "_id" : {
                                "year" : 2013,
                                "month" : 6,
                                "date" : 4,
                                "hr" : 14
                        },
                        "count" : 1
                },

I like to see:

                {
                        "_id" : {
                                "year" : 2013,
                                "month" : 6,
                                "date" : 4,
                        },
                        "time" : (time stamp),
                        "count" : 1
                },

from this:
   db.metric.aggregate( {$match:{"time":{$gte:ISODate("2013-06-01T00:00:00Z")}}}, 
                         {$group:{_id: {date:{$dayOfMonth:"$time"}, year:{$year:"$time"}, month:{$month:"$time"}, field:"$user"}}}, 
                         {$group:{_id:{year:"$_id.year", month:"$_id.month", date:'$_id.date'},  count:{$sum:1}}}, 
                         {$sort:{"_id.year":1,"_id.month":1,"_id.date":1}} )

this way, I can convert to local hours to present data in the local time in the client. 

I like to group based on UTC, but, want to present data in LOCAL time. 

I tried few different variations, but, so far, it did not work. 

Evan Zamir

unread,
Jun 15, 2013, 1:36:51 PM6/15/13
to mongod...@googlegroups.com
I'm actually having quite a bit of trouble inserting ISODate types using the native mongodb driver for Node.js. Any idea how to do that? So far, I've only been able to create strings using new ISODate().


On Wednesday, June 12, 2013 10:25:35 AM UTC-7, PaulKE wrote:

Asya Kamsky

unread,
Jun 16, 2013, 6:22:56 PM6/16/13
to mongod...@googlegroups.com
Since you can't do aggregation on more than one "_id" (key space) the thing to do would be to aggregate on the ISODate in UTC and then do the $project step to convert to local time zone *after* the $group step.

PaulKE

unread,
Jun 17, 2013, 8:47:27 PM6/17/13
to mongod...@googlegroups.com
Hi, Asya, 

Thanks for reply. 

1) First, how can I do that? can you provide example of doing 'project' to convert to local on my earlier group example? 

2) Even though it works, it will work on 'hour' level. But, for day, month, grouping is already done on day. In UTC, it may be 'two days', but, locally, it may be just 'one' day. If you just group by day or month (data on 31st and next day cross over), how can you convert to local time without knowing hour unless you always group by 'hour' then add...? I think doing in hour is expensive. 

3) Just today, I started getting 'my doc is over the max doc size (16 MB) error' on hourly aggregation. Daily works fine. But, I think this is strange. Since actual result set would be 5 days worth of hourly aggregations. Which is like 5 days times 24 hours and 'count' result for each hour. So, how can this be over 16 MB? I am sharded environment. So, the error comes from 'mongos'. Currently, i have about 1.6 million record sets for past 5 days. 

4) I tried adding separate 'ltime' field that contains like "ltime.year, ltime.month, ltime.day, ltime.hr, ltime.min". But, when I checked the db.stats(), averageObjSize grew 100%!. Before adding new field and updating all record sets, average size was 727 bytes, and now, it's 1557 bytes from db.stats(). So, now physical file size has increased significantly, and free memory reduced significantly. I think soon the system will run out of memory. Can anyone suggest me what's going on? 

Thanks. 
Reply all
Reply to author
Forward
0 new messages