Using mapreduce to average metrics over time.


Chris Matta

Oct 21, 2012, 7:44:27 PM
to mongod...@googlegroups.com
I'd like to output documents with the metric averaged over five-minute intervals. My data is recorded once per minute, and mapreduce seems like the right tool for this.

Here's what the data schema looks like: 

{
"_id" : ObjectId("508475a3ff4d6fee7c86a7a6"),
"array_serial" : "12345",
"port_name" : "CL1-A",
"datetime" : ISODate("2012-10-20T22:22:27.469Z"),
"metric" : 465.35
}

My question: how do you output records from mapreduce to a collection in exactly the same format as the input? Is that possible, or does the output always have to be in key/value format?

Ideally I'd like a collection of documents in the above schema, but with datetime rounded to 5-minute intervals.

Here's some code I was playing with, but never got it to work:


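var map = function () {
    // Five minutes in milliseconds.
    var coeff = 1000 * 60 * 5;
    // Round the timestamp to the nearest five-minute boundary.
    var roundtime = new Date(Math.round(this.datetime.getTime() / coeff) * coeff);
    var key = {
        array_serial: this.array_serial,
        port_name: this.port_name,
        datetime: roundtime
    };
    // Emit the same shape that reduce returns, so reduce can be
    // re-applied to its own output.
    emit(key, { metric: this.metric, count: 1 });
};

var reduce = function (key, values) {
    // values is an array of the objects emitted for this key.
    var result = { metric: 0, count: 0 };
    values.forEach(function (v) {
        result.metric += v.metric;
        result.count += v.count;
    });
    return result;
};

var finalize = function (key, value) {
    // Turn the running sum into an average and return a document
    // in the original schema; it is stored under the "value" field.
    return {
        array_serial: key.array_serial,
        port_name: key.port_name,
        datetime: key.datetime,
        metric: value.metric / value.count
    };
};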

Chris Matta

Oct 22, 2012, 9:25:52 AM
to mongod...@googlegroups.com
It seems like this might not be possible: https://jira.mongodb.org/browse/SERVER-2517

Is that true?

Jenna deBoisblanc

Oct 24, 2012, 10:22:29 PM
to mongod...@googlegroups.com
Hello Chris,

To round the time to five-minute intervals, you can use the following map-reduce functions:

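map = function () {
   // Round this document's timestamp to a five-minute boundary.
   var date = this.datetime;
   var min = date.getMinutes() % 5;   // minutes past the previous boundary (0-4)
   var test = Math.round(min/5);      // 0 for 0-2 minutes past, 1 for 3-4
   if(test == 0){
      // Round down to the previous boundary.
      date.setMinutes(date.getMinutes() - min);
   }
   else {
      // Round up to the next boundary.
      date.setMinutes(date.getMinutes() - min + 5);
   }
   date.setSeconds(0);
   date.setMilliseconds(0);
   // Keyed on _id, so each document is emitted on its own and nothing is combined.
   emit(this._id, {array_serial: this.array_serial, port_name: this.port_name, min: min, datetime: date, metric: this.metric});
}

// Every key is unique here, so reduce is never called with more than one value.
reduce = function() {}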
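
One thing to watch with the minute-based rounding above: it looks only at the minutes field, so a timestamp such as 22:22:59 goes down to 22:20 even though 22:25 is closer. If rounding to the nearest boundary on the full timestamp is what's wanted, a minimal sketch could look like this. Note that `roundToFiveMinutes` is a hypothetical helper name for illustration, not a MongoDB function:

```javascript
// Hypothetical helper (not part of MongoDB): round a Date to the nearest
// five-minute boundary using epoch milliseconds, without mutating the input.
function roundToFiveMinutes(date) {
    var coeff = 1000 * 60 * 5; // five minutes in milliseconds
    return new Date(Math.round(date.getTime() / coeff) * coeff);
}

// Timestamp taken from the sample document in the question.
var sample = new Date("2012-10-20T22:22:27.469Z");
var rounded = roundToFiveMinutes(sample);
console.log(rounded.toISOString()); // prints "2012-10-20T22:20:00.000Z"
```

To use something like this inside map, the helper would need to be defined within the map function body or handed to the job through the `scope` option of mapReduce, since map does not see variables from the shell session.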

You are correct that the structure of the final documents will not be exactly the same as the original document structure: map-reduce output documents always take the form { _id: <key>, value: <value> }. Please feel free to track or upvote the SERVER-2517 issue in JIRA.
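
In the meantime, one workaround is to reshape the output in a second pass. The following is only a sketch: it assumes the map-reduce job was keyed on a compound `_id` of `array_serial`, `port_name` and `datetime`, and that the averaged number ends up in `value.metric`. The `flatten` helper and the collection names in the comment are hypothetical.

```javascript
// Hypothetical helper: convert one map-reduce output document of the form
//   { _id: { array_serial, port_name, datetime }, value: { metric, ... } }
// into a flat document matching the original schema (minus the ObjectId,
// which the server generates on insert).
function flatten(mrDoc) {
    return {
        array_serial: mrDoc._id.array_serial,
        port_name: mrDoc._id.port_name,
        datetime: mrDoc._id.datetime,
        metric: mrDoc.value.metric
    };
}

// Example with a hand-written output document (illustrative values only).
var example = {
    _id: {
        array_serial: "12345",
        port_name: "CL1-A",
        datetime: new Date("2012-10-20T22:20:00Z")
    },
    value: { metric: 465.35, count: 1 }
};
var flat = flatten(example);

// In the mongo shell, the same helper could be applied to every output
// document (collection names are assumptions):
//   db.metrics_5min.find().forEach(function (d) {
//       db.metrics_5min_flat.insert(flatten(d));
//   });
```

This costs an extra pass over the output collection, but it leaves the map-reduce job itself unchanged.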
