time series aggregation


Michael Di Domenico

Dec 10, 2015, 5:25:47 PM
to mongodb-user
I'm attempting to aggregate time series data, but can't seem to get the aggregation pipeline correct. Hopefully someone can help.

document example

{
  metadata: {
    topic: "memory_used",
    group: "group1",
    ts_day: ISODate("2013-10-10T00:00:00.000Z")
  },
  values: {
    0: { 0: 999999, 5: 999999, …, 55: 1000000 },
    1: { 0: 999999, 5: 999999, …, 55: 1000000 },
    2: { 0: 999999, 5: 999999, …, 55: 1000000 },
    …
  }
}

The keys under values are the hour, then the 5-minute interval within that hour.

I can aggregate the groups at a single 5-minute interval using:

db.collection.aggregate([
    { $match: {
        "metadata.topic": "memory_used",
        "metadata.ts_day": ISODate("2013-10-10T00:00:00.000Z")
        }
    },
    { $group: {
        _id: { groupname: "$metadata.group" },
        mem: { $sum: "$values.1.5" }
        }
    }
])

But how do I aggregate each hour of each group?

Rhys Campbell

Dec 11, 2015, 5:06:12 AM
to mongodb-user
Perhaps something using $hour?


Do you want a running total of each group? Perhaps an example document of what you want to output will be useful.

Rhys

John De Goes

Dec 11, 2015, 6:06:06 PM
to mongodb-user

As with all classes of problems like this (data-in-schema), you need the $unravel operator which does not currently exist.


Until then you can use map/reduce or hard code the keys in values and its subdocuments.

Regards,

John

Michael Di Domenico

Dec 13, 2015, 4:33:20 PM
to mongod...@googlegroups.com
My goal is to have an hourly average for each group. For example:

{ _id: 'group1', hour: 0, avg: 1234 }
{ _id: 'group1', hour: 1, avg: 1234 }
{ _id: 'group1', hour: 2, avg: 1234 }
{ _id: 'group2', hour: 0, avg: 1234 }
{ _id: 'group2', hour: 1, avg: 1234 }
{ _id: 'group2', hour: 2, avg: 1234 }

I'm not opposed to changing the data around to fit a better model.
I'll admit I'm new to Mongo and not positive I'm doing it right...

Michael Di Domenico

Dec 13, 2015, 4:39:06 PM
to mongod...@googlegroups.com
On Fri, Dec 11, 2015 at 6:06 PM, John De Goes <jo...@degoes.net> wrote:
>
> As with all classes of problems like this (data-in-schema), you need the
> $unravel operator which does not currently exist.
>
> Please vote for it here: https://jira.mongodb.org/browse/SERVER-11392

Yes, that does sound like what I need. I registered and voted for it.

> Until then you can use map/reduce or hard code the keys in values and its
> subdocuments.

I guess I'll have to brush up on my JavaScript then. The data model I
used came from the time-series Mongo example that I (and everyone else)
seem to reference on the web, but there isn't much clearly written
material on how to extract useful information from it.

Wan Bachtiar

Dec 14, 2015, 3:00:17 AM
to mongodb-user

> I’m not opposed to changing the data around to fit a better model.
> I’ll admit i’m new to mongo and not positive i’m doing it right…

Hi Michael,

Data in MongoDB has a flexible schema, and this means your data model can be designed to benefit how the applications will use the data.

Depending on your application's use case, there are a number of possible data schemas you can try. For example, you could change each hour to contain an array of the 5-minute values.

values:{
     0: [9999, 9999, 9999, 9999], 
     1: [9999, 9999, 9999, 9999], 
     2: [9999, 9999, 9999, 9999]
     ...
};

/* Fetch aggregation of hour 0 */
db.collection.aggregate([
    {$unwind:"$values.0"}, 
    {$group:
        {_id: "$metadata.group", 
         hour0: {$sum: "$values.0" }
        }
    }
]);

See Data Modelling Introduction for more examples.


If you would like to stay with your current schema, you can try this aggregation script:

/* Match the document that you wanted */
var match = {
    "metadata.topic": "memory_used",
    "metadata.ts_day": ISODate("2013-10-10T00:00:00.000Z")
};

/* Assuming that group is unique per day */
var project = {"_id":"$metadata.group"};
for (var hour=0; hour<24; hour++){
   /* Add values with incremental of 5 minutes */
    var minutes = [];
    for (var min=0; min<60; min=min+5){
        minutes.push("$values."+hour+"."+min);
    }
    project["hour"+hour] = {"$add": minutes};
};

printjson(
    db.collection.aggregate([
        {$match: match},
        {$project: project}
    ]).toArray()
);

The example above utilises $add to add all the values for the hour.
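
If an hourly average is wanted rather than an hourly sum, one option is to divide each sum by the number of slots. The following is only a sketch under the assumptions of this thread's schema (12 five-minute slots per hour, keys 0 to 55); the helper name hourlyAverageProjection is hypothetical, and it is plain JavaScript so that it can be checked outside the mongo shell.

```javascript
// Hypothetical helper: builds a $project spec with one average per hour.
// Assumes every hour holds 12 slots keyed 0, 5, ..., 55.
function hourlyAverageProjection() {
  const project = { _id: "$metadata.group" };
  for (let hour = 0; hour < 24; hour++) {
    const slots = [];
    for (let min = 0; min < 60; min += 5) {
      slots.push("$values." + hour + "." + min);
    }
    // Sum of the five-minute slots divided by the slot count.
    project["hour" + hour] = { $divide: [{ $add: slots }, slots.length] };
  }
  return project;
}

// Show the generated expression for a single hour.
const project = hourlyAverageProjection();
console.log(JSON.stringify(project.hour1, null, 2));
```

In the shell this would be passed as {$project: hourlyAverageProjection()} after the $match stage. As far as I know, $add yields null when any operand is missing, so an hour with gaps would come back as null rather than as a partial average.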


Kind regards,

Wan.

John De Goes

Dec 14, 2015, 4:50:15 PM
to mongodb-user

> I guess i'll have to brush up on my javascript then.  The data model i
> used was from the time-series mongo example i (and everyone else)
> seems to reference on the web.  but there isn't much, clearly written,
> around on how to exact useful information from it.

It won't exactly help with performance, but soon Quasar will at least write the map/reduce for you.

select avg(values{*}{_}) as avg, values{*:} as hour, metadata.group as group
  from collection
  group by metadata.group, values{*:}

This can be read as: average all the values inside the minute buckets inside values, and group them by the label of the hour bucket and the group name.

An efficient (non-map/reduce) implementation will require the $unravel operator or equivalent.
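
For readers on newer servers: to my knowledge, MongoDB 3.4.4 and later include $objectToArray, which can serve as the "or equivalent" here by turning the keys of values into data. The sketch below rests on that assumption and is not something proposed in this thread. The function name is hypothetical, and a plain Date stands in for the shell's ISODate so the snippet runs as ordinary JavaScript.

```javascript
// Sketch only: assumes a server that supports $objectToArray.
// Builds a pipeline that emits one document per (group, hour) with its average.
function unravelHourlyAveragePipeline(topic, day) {
  return [
    { $match: { "metadata.topic": topic, "metadata.ts_day": day } },
    // Turn { "0": {...}, "1": {...} } into [{ k: "0", v: {...} }, ...].
    { $project: { group: "$metadata.group", hours: { $objectToArray: "$values" } } },
    { $unwind: "$hours" },
    // Do the same for the five-minute slots inside each hour.
    { $project: { group: 1, hour: "$hours.k", slots: { $objectToArray: "$hours.v" } } },
    { $unwind: "$slots" },
    // Average every slot value per group and hour label.
    { $group: { _id: { group: "$group", hour: "$hour" }, avg: { $avg: "$slots.v" } } }
  ];
}

const pipeline = unravelHourlyAveragePipeline(
  "memory_used",
  new Date("2013-10-10T00:00:00.000Z")
);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array would be passed to db.collection.aggregate(). Note that the hour comes back as a string key ("0", "1", ...) because it originates from a field name.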

Regards,

John

Asya Kamsky

Dec 14, 2015, 6:47:57 PM
to mongodb-user
While Wan is correct about how you can restructure the schema, since you know exactly how many hours are in a day and how many 5-minute periods are in each hour, you can programmatically generate the aggregation pipeline to handle this. You certainly don't need map/reduce here (and it would be _significantly_ slower).
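
A minimal sketch of what such generated code might look like on the original schema, assuming 24 hours of 12 five-minute slots each. The function name groupAveragesPipeline is hypothetical, and Date stands in for the shell's ISODate so the snippet runs as plain JavaScript. It produces one document per group with an average for each hour plus an average for the whole day.

```javascript
// Hypothetical generator: builds $match + $group for per-group averages.
// Assumes slots keyed 0, 5, ..., 55 under each hour key 0..23.
function groupAveragesPipeline(topic, day) {
  const group = { _id: "$metadata.group" };
  const allSlots = [];
  for (let hour = 0; hour < 24; hour++) {
    const slots = [];
    for (let min = 0; min < 60; min += 5) {
      slots.push("$values." + hour + "." + min);
    }
    allSlots.push(...slots);
    // Per-document hourly mean, averaged across the matched documents.
    group["hour" + hour] = {
      $avg: { $divide: [{ $add: slots }, slots.length] }
    };
  }
  // Mean over all 288 slots of the day.
  group.day = { $avg: { $divide: [{ $add: allSlots }, allSlots.length] } };
  return [
    { $match: { "metadata.topic": topic, "metadata.ts_day": day } },
    { $group: group }
  ];
}

const pipeline = groupAveragesPipeline(
  "memory_used",
  new Date("2013-10-10T00:00:00.000Z")
);
console.log(Object.keys(pipeline[1].$group).length);
```

The returned array would be handed to db.collection.aggregate(). Dropping metadata.ts_day from the $match would average across several days instead of one.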

Asya

--
Asya Kamsky
Lead Product Manager
MongoDB

Kowshik Nandagudi

Dec 15, 2016, 1:41:35 PM
to mongodb-user, mdidom...@gmail.com
Hi Guys,

I have a similar issue.

Could you please share the query on the original schema to calculate the hourly average?

I also need to calculate daily averages.

Thanks in advance

Kowshik