Map/Reduce value attribute


Chris Eppstein

Feb 9, 2011, 12:19:53 PM
to mongodb-user
The result of map/reduce leaves me with documents that have the result
in a value property. I don't really understand why this is necessary
and I'd like to disable this behavior so that the permanent collection
I'm building from m/r has proper indexes and normal query access.

Gates

Feb 9, 2011, 2:06:14 PM
to mongodb-user
> I don't really understand why this is necessary

The standard output of a Map/Reduce is in fact "reducible". It's
actually kind of a nice feature. I think this is best illustrated by an
example.

Let's assume that you are running an on-line widget sales site. You
want to roll up widget sales by state, by day. The output would look
something like this:

{ _id : { day : "2011-02-09", state : "NY" }, value : { num : 5, revenue : 25 } }
{ _id : { day : "2011-02-09", state : "CA" }, value : { num : 7, revenue : 40 } }
{ _id : { day : "2011-02-08", state : "NY" }, value : { num : 3, revenue : 10 } }

Some benefits of this structure:
- It clarifies how the data is organized. If you look at the "key",
that value is effectively the "group by" columns in an SQL query. The
key clearly indicates which data is static, while the values
indicate the data that was calculated.
- It makes merges easier to program and easier to understand (merges
are new in 1.7.4, 1.8.0)
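
To make the "reducible" point concrete, here is a minimal sketch in plain JavaScript (no MongoDB needed). The sample sales records, the map and reduce functions, and the runMapReduce driver are all made up for illustration; only the { _id, value } output shape comes from the example above. Because reduce returns the same shape that map emits, an earlier output can be fed back through reduce together with new data, which is roughly what a merge relies on.

```javascript
// Hypothetical raw sales records; the field names are assumptions.
const sales = [
  { day: "2011-02-09", state: "NY", price: 10 },
  { day: "2011-02-09", state: "NY", price: 15 },
  { day: "2011-02-08", state: "NY", price: 10 },
];

// map: produce one [key, value] pair per record (stands in for emit()).
function map(sale) {
  return [{ day: sale.day, state: sale.state }, { num: 1, revenue: sale.price }];
}

// reduce: returns the same shape that map emits, so its output can be reduced again.
function reduce(key, values) {
  const total = { num: 0, revenue: 0 };
  for (const v of values) {
    total.num += v.num;
    total.revenue += v.revenue;
  }
  return total;
}

// Toy driver (not MongoDB's implementation): group pairs by key, reduce each group.
function runMapReduce(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    const k = JSON.stringify(key);
    if (!groups.has(k)) groups.set(k, { key, values: [] });
    groups.get(k).values.push(value);
  }
  return [...groups.values()].map(g => ({ _id: g.key, value: reduce(g.key, g.values) }));
}

const firstRun = runMapReduce(sales.map(map));

// Later, a new NY sale arrives. Merge by reducing the old output with the new pair.
const newPairs = [map({ day: "2011-02-09", state: "NY", price: 5 })];
const merged = runMapReduce(firstRun.map(d => [d._id, d.value]).concat(newPairs));
```

The second run never looks at the original sales records again: the previous { _id, value } documents are enough, because value is already in the form reduce expects.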

> ... so that the permanent collection I'm building from m/r has proper indexes and normal query access

From an indexing perspective, you're going to automatically get an
index on the _id field. Given that you rolled up "grouped by" day and
state, you're probably going to want to query by day and state. So you
probably already have the basic index that you want.

If you want additional indexes, you can also index into that object.
So you can add an index on _id.day.

> ... normal query access

I think that we need some clarity on "normal query access". The output
will be accessible by doing something like "value.num" and
"value.revenue".

I agree that this is not "normal", but it should be easy to work with.

If this does not clarify what's happening, would you be able to
provide a clear example of what you would like to do?
It may be possible to write a JIRA task, or there may be another way
to achieve what you're looking for.

- Gates

Jared Rosoff

Feb 9, 2011, 3:36:32 PM
to mongodb-user

There is another practical reason why the output of M/R is stored in
the "value" attribute.

It is possible for the reduce phase of M/R to return an integer, or a
string, or any other non-object BSON value. Without having a "value"
key, there would be no place to store the result as every value in the
document must have a key.

-j

Gaetan Voyer-Perrault

Feb 9, 2011, 3:49:01 PM
to mongod...@googlegroups.com
Thanks Jared, that's also definitely a good point. Map Reduce can produce results that look like this:

{ _id : 1, value : 5 }
{ _id : 2, value : 29 }

For results like these, you clearly need something like this _id / value layout.
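
As a rough illustration of Jared's point, here is a small plain-JavaScript sketch. The reduce function and the toOutputDoc helper are hypothetical names invented for this example and are not part of MongoDB; they only mimic the wrapping that produces documents like the two above.

```javascript
// A reduce that returns a bare number rather than an object.
function reduce(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Hypothetical helper: every value in a document needs a key, so a bare
// number cannot be stored on its own. Wrapping it under "value" gives it one.
function toOutputDoc(key, reduced) {
  return { _id: key, value: reduced };
}

const docs = [
  toOutputDoc(1, reduce(1, [2, 3])),
  toOutputDoc(2, reduce(2, [10, 19])),
];
```

The same wrapper works unchanged when reduce returns an object, which is why one output shape can cover both cases.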



Keith Branton

Feb 9, 2011, 4:34:23 PM
to mongod...@googlegroups.com
Hi Chris,

So it sounds like you're getting results like

{_id : "cat", value : {count : 3}}

from m/r when you really want a result like

{_id : "cat", count : 3}

instead, i.e. a normal-looking collection.

It would be great if the optional finalize function could be used to achieve this. For example, if your finalize function returned a document that already contains _id, that could suppress the default wrapping of the value.

finalize doesn't currently do this but that may be an easy enhancement for the 10gen folks.

Keith.

Chris Eppstein

Feb 9, 2011, 4:45:11 PM
to mongodb-user
Thanks for the replies. I understand why it works the way it does now.
And while I understand that I _can_ build queries and indexes that
take the structure of map/reduce results into account, I don't think
it's particularly pleasant to work with when the result is already a
well-formed document. So I'd like a map/reduce option that requires the
reduction to return a hash and that outputs the value returned by
reduce with _id set to the reduction key.
This allows the results of map/reduce jobs to look and feel like 1st
class documents despite their map/reduce origins. Given the direction
that map/reduce is going with respect to writing to permanent
collections, I think such an option makes a lot of sense.

Thanks,
Chris

Gates

Feb 9, 2011, 8:29:49 PM
to mongodb-user
@Chris: is the following a good summary of what you want?

Originals:
------------
{ _id : 1, value : 2 }
{ _id : 1, value : { num : 3, revenue : 10 } }
{ _id : { day : "2011-02-08", state : "NY" }, value : 2 }
{ _id : { day : "2011-02-08", state : "NY" }, value : { num : 3, revenue : 10 } }

New versions:
------------------
{ _id : 1, value : 2 }
{ _id : 1, num : 3, revenue : 10 }
{ day : "2011-02-08", state : "NY", value : 2 }
{ day : "2011-02-08", state : "NY", num : 3, revenue : 10 }

This is definitely a reasonable idea, but there are obviously some
reasons that this is not / has not been done.
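
In the meantime, the reshaping summarized above could be approximated as a client-side post-processing step. The following is only a sketch in plain JavaScript; flatten is a hypothetical helper written for this example, not an existing map/reduce option, and it treats any non-array object as something to spread into the document.

```javascript
// Hypothetical post-processing step: turn { _id, value } into a flat document.
function flatten(doc) {
  const isPlainObject = v => v !== null && typeof v === "object" && !Array.isArray(v);
  const flat = {};
  // Compound keys become top-level fields; scalar keys stay under _id.
  if (isPlainObject(doc._id)) Object.assign(flat, doc._id);
  else flat._id = doc._id;
  // Object values are spread into the document; scalar values keep the "value" field.
  if (isPlainObject(doc.value)) Object.assign(flat, doc.value);
  else flat.value = doc.value;
  return flat;
}

const examples = [
  { _id: 1, value: 2 },
  { _id: 1, value: { num: 3, revenue: 10 } },
  { _id: { day: "2011-02-08", state: "NY" }, value: 2 },
  { _id: { day: "2011-02-08", state: "NY" }, value: { num: 3, revenue: 10 } },
].map(flatten);
```

Note that in this sketch a field name shared by the key and the value would be silently overwritten by the value's field, so it only suits cases where the two sets of names are known not to collide.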

I would suggest filing a JIRA request for this behavior.
http://jira.mongodb.org/

- Gates

Chris Eppstein

Feb 9, 2011, 10:46:12 PM
to mongodb-user
Yep, that's what I want! Thanks. I have filed a JIRA request:
http://jira.mongodb.org/browse/SERVER-2517

Chris



Micah

Feb 15, 2011, 3:38:31 PM
to mongodb-user
You can use finalize() to save the value to an alternate collection.
It's not the cleanest way to do things, but it works. For example:

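function finalize(key, value) {
    // Use the reduce key as _id (or leave _id unset and Mongo will assign one).
    value['_id'] = key;
    // Side effect: write the flattened document to another collection.
    // Assumes value is an object and that db is reachable from finalize.
    db.my_other_collection.save(value);

    return null;
}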

Not super clean, but it works.