Group command while using Regex

Merl

Sep 8, 2011, 6:31:01 PM

to mongodb-user

I have a collection which logs bulk inserts into my database. The
record looks like the following:

{ "_id" : ObjectId(.......),
"dt" : "201109082300", // date and time, yyyymmddhhmi
"rows" : 354
}

I am trying to do a grouping that shows the rows loaded for each hour
of the day, but I cannot figure out how to set it up. I put together a
regex, ^\d{8}(\d{2}), to capture the hour from each document, but I am
not sure how to apply it. Should the regex be the key, part of the
cond, or both?

Robert Stam

Sep 9, 2011, 10:37:13 AM

to mongodb-user

Perhaps neither. The approach I would suggest is to use a key function
to extract the part of the "dt" value you want to group on. Here's a
sample mongo shell session showing how you could do it:

> db.test.find()
{ "_id" : ObjectId("4e6a1d90a16b32d6fb183f31"), "dt" : "201109082300",
"rows" : 354 }
{ "_id" : ObjectId("4e6a1da2a16b32d6fb183f32"), "dt" : "201109082200",
"rows" : 100 }
{ "_id" : ObjectId("4e6a1dada16b32d6fb183f33"), "dt" : "201109072200",
"rows" : 200 }
{ "_id" : ObjectId("4e6a1db5a16b32d6fb183f34"), "dt" : "201109072300",
"rows" : 300 }
>
> keyf
function (doc) {
return {hour:doc.dt.substring(8, 10)};
}
>
> reduce
function (doc, prev) {
prev.rows += doc.rows;
}
>
> db.test.group({
... $keyf: keyf,
... initial : { rows : 0 },
... $reduce : reduce
... })
[
{
"hour" : "23",
"rows" : 654
},
{
"hour" : "22",
"rows" : 300
}
]
>
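
To make the mechanics of the session above easier to follow, here is a minimal in-memory sketch of what grouping by a key function does. It runs in plain JavaScript without a server. The helper name groupLocal and the bucketing by JSON.stringify are assumptions made for illustration; this is not how the server implements the group command.

```javascript
// Sample documents copied from the shell session (the _id field is omitted).
const logDocs = [
  { dt: "201109082300", rows: 354 },
  { dt: "201109082200", rows: 100 },
  { dt: "201109072200", rows: 200 },
  { dt: "201109072300", rows: 300 },
];

// Same key function and reducer as in the session.
function keyf(doc) {
  return { hour: doc.dt.substring(8, 10) };
}

function reduce(doc, prev) {
  prev.rows += doc.rows;
}

// Hypothetical helper: one bucket per distinct key, seeded with a copy of
// `initial`, then updated in place by `reduce` for every matching document.
function groupLocal(docs, keyFn, initial, reduceFn) {
  const buckets = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    const id = JSON.stringify(key);
    if (!buckets.has(id)) {
      buckets.set(id, Object.assign({}, key, initial));
    }
    reduceFn(doc, buckets.get(id));
  }
  return Array.from(buckets.values());
}

const hourly = groupLocal(logDocs, keyf, { rows: 0 }, reduce);
console.log(hourly);
// [ { hour: '23', rows: 654 }, { hour: '22', rows: 300 } ]
```

The key function only decides which bucket a document lands in; all of the arithmetic happens in the reducer, which is why no regex is needed.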

If you wanted to group only data for a range of "dt" values then you
would add a condition to the group command. I probably still wouldn't
use regular expressions. For example, all the documents for 2011-09
could be matched using this query:

> db.test.find({ dt : { $gte : "20110901000000", $lt : "2011100100000000" } })
{ "_id" : ObjectId("4e6a1d90a16b32d6fb183f31"), "dt" : "201109082300",
"rows" : 354 }
{ "_id" : ObjectId("4e6a1da2a16b32d6fb183f32"), "dt" : "201109082200",
"rows" : 100 }
{ "_id" : ObjectId("4e6a1dada16b32d6fb183f33"), "dt" : "201109072200",
"rows" : 200 }
{ "_id" : ObjectId("4e6a1db5a16b32d6fb183f34"), "dt" : "201109072300",
"rows" : 300 }

Robert Stam

Sep 9, 2011, 10:40:50 AM

to mongodb-user

p.s. Just realized I added too many zeros at the end of the values I'm
comparing to... the strings should end after the "00" for minutes.

> db.test.find({ dt : { $gte : "201109010000", $lt : "201110010000" } })

{ "_id" : ObjectId("4e6a1d90a16b32d6fb183f31"), "dt" : "201109082300",
"rows" : 354 }
{ "_id" : ObjectId("4e6a1da2a16b32d6fb183f32"), "dt" : "201109082200",
"rows" : 100 }
{ "_id" : ObjectId("4e6a1dada16b32d6fb183f33"), "dt" : "201109072200",
"rows" : 200 }
{ "_id" : ObjectId("4e6a1db5a16b32d6fb183f34"), "dt" : "201109072300",
"rows" : 300 }
>
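
A small sketch of why the length of the bounds matters. It uses plain JavaScript string comparison and assumes the "dt" values are fixed-width strings of ASCII digits, for which this comparison should order values the same way the range query does. The helper name inRange and the extra sample values are made up for illustration.

```javascript
// Hypothetical helper: half-open range check on fixed-width timestamp strings.
function inRange(dt, lower, upper) {
  return dt >= lower && dt < upper;
}

const lower = "201109010000"; // 2011-09-01 00:00
const upper = "201110010000"; // 2011-10-01 00:00

const samples = [
  "201108312359", // last minute of August: excluded
  "201109010000", // first minute of September: included
  "201109082300", // from the thread: included
  "201110010000", // first minute of October: excluded
];

const matched = samples.filter(function (dt) {
  return inRange(dt, lower, upper);
});
console.log(matched); // [ '201109010000', '201109082300' ]

// With the over-long bounds, a 12-character value that is a prefix of the
// lower bound sorts before it, so the first minute of the month is missed.
const firstMinuteMatches = inRange(
  "201109010000",
  "20110901000000",
  "2011100100000000"
);
console.log(firstMinuteMatches); // false
```

The sample documents in the thread happen to match under both sets of bounds, so the difference only shows up at the edges of the range.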

Merl Corry

Sep 12, 2011, 3:54:07 PM

to mongod...@googlegroups.com

That worked! One last thing: in the Java driver, how do I pass the keyf function to the group() command? Everything I have seen is strictly for keys, not a function. Thanks


Antoine Girbal

Sep 13, 2011, 6:05:51 PM

to mongodb-user

Hi Merl,
the Java driver does not let you specify the key function.
You could do it yourself using code like:

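// Written as if inside DBCollection: "this", _db and getOptions() come from there.
GroupCommand cmd = new GroupCommand(this, key, cond, initial,
reduce, finalize);
DBObject cmdObj = cmd.toDBObject();
// The arguments are nested under "group", so add the key function there,
// as "$keyf" (the name used in the shell session above).
DBObject args = (DBObject) cmdObj.get("group");
args.put("$keyf", function); // function is a Code object
CommandResult res = _db.command( cmdObj, getOptions() );
res.throwOnError();
return (DBObject)res.get( "retval" );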

But it is probably better to just use map/reduce.
The code will be pretty much the same, but the driver handles it
entirely, and performance is usually better with M/R.
AG
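
For reference, a sketch of what the map/reduce version of the hourly grouping might look like. The two functions follow the usual map/reduce shape (map runs with `this` bound to each document and calls emit; reduce combines the values for one key). The names mapHour, sumRows, and runLocally are hypothetical, and runLocally is only a tiny local stand-in so the functions can be tried without a server.

```javascript
// Map: emit (hour, rows) for each document.
function mapHour() {
  emit(this.dt.substring(8, 10), this.rows);
}

// Reduce: sum the row counts emitted for one hour.
function sumRows(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical local harness, not part of MongoDB: collects emitted values
// per key, then applies the reducer once per key.
function runLocally(docs, map, reduce) {
  var emitted = {};
  globalThis.emit = function (key, value) {
    (emitted[key] = emitted[key] || []).push(value);
  };
  docs.forEach(function (doc) {
    map.call(doc);
  });
  delete globalThis.emit;

  var out = {};
  Object.keys(emitted).forEach(function (key) {
    out[key] = reduce(key, emitted[key]);
  });
  return out;
}

var rowsByHour = runLocally(
  [
    { dt: "201109082300", rows: 354 },
    { dt: "201109082200", rows: 100 },
    { dt: "201109072200", rows: 200 },
    { dt: "201109072300", rows: 300 },
  ],
  mapHour,
  sumRows
);
console.log(rowsByHour["22"], rowsByHour["23"]); // 300 654

// In the mongo shell the same two functions would be passed to something like
// db.test.mapReduce(mapHour, sumRows, { out: { inline: 1 } }),
// assuming a server version that supports inline output.
```

From Java, the two functions would be passed to the driver's map/reduce call as JavaScript source strings instead of being attached to a hand-built group command.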
