Hi,
You have an interesting problem to solve. Jan makes a good point:
using a query to filter out non-matching peers will significantly
improve MR performance and reduce its complexity. If possible, I would
recommend using the aggregation framework (available as of MongoDB
version 2.2) for this task. MR uses JavaScript, and since MongoDB uses
SpiderMonkey, a single-threaded JavaScript engine, MR is a slow,
blocking operation. As such, I would not recommend running multiple MR
commands back-to-back for each user. It is also possible to specify a
query in your aggregation command, using the $match operator:
Aggregation framework:
http://docs.mongodb.org/manual/applications/aggregation/
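To illustrate filtering inside an aggregation command, here is a minimal sketch. The collection name "users", the peer range, and the computed fields (peerCount, avgWeight) are assumptions for illustration only, since I have not seen your documents yet; the age and weight field names follow the binning example in this thread.

```javascript
// Hypothetical peer range for a single user: ages 20-24, weight 100-109.
var pipeline = [
  // $match plays the role of the query: only matching peers reach later stages.
  { $match: { age: { $gte: 20, $lt: 25 }, weight: { $gte: 100, $lt: 110 } } },
  // Collapse the matching peers into one summary document.
  { $group: { _id: null, peerCount: { $sum: 1 }, avgWeight: { $avg: "$weight" } } }
];

// In the mongo shell, assuming a collection named "users":
// db.users.aggregate(pipeline)
```

Putting $match first keeps the later stages from processing documents that are not peers.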
It may also be possible to do the calculation for all users with a
single aggregation command. If you are still searching for a solution,
could you provide a sample document to help us tailor our response to
your use case?
"Binning" or "bucketing" the users based on age could simplify the
problem, and it might allow you to accomplish the aggregation with
a single command. An example of "binning":
{name: "Bob", age: 21, weight: 100}
{name: "Jenna", age: 23, weight: 108}
{name: "Gene", age: 24, weight: 120}
{name: "Susan", age: 22, weight: 127}
{name: "Tom", age: 25, weight: 101}
{name: "Ellen", age: 26, weight: 102}
Bin age by 5 and weight by 10:
Bob and Jenna are peers, defined as users in the ages 20 - 24, weight
100 - 109.
Gene and Susan are peers, defined as users in the ages 20 - 24, weight
120 - 129.
Tom and Ellen are peers, defined as users in the ages 25 - 29, weight
100 - 109.
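The binning above can be sketched in plain JavaScript, followed by what an equivalent aggregation pipeline might look like. The helper names (binFloor, groupPeers), the ageBin/weightBin field names, and the "users" collection are hypothetical, and the sketch assumes non-negative whole-number ages and weights.

```javascript
var users = [
  { name: "Bob", age: 21, weight: 100 },
  { name: "Jenna", age: 23, weight: 108 },
  { name: "Gene", age: 24, weight: 120 },
  { name: "Susan", age: 22, weight: 127 },
  { name: "Tom", age: 25, weight: 101 },
  { name: "Ellen", age: 26, weight: 102 }
];

// Lower bound of the bin that value falls into, e.g. binFloor(23, 5) is 20.
function binFloor(value, width) {
  return value - (value % width);
}

// Group names by "<ageBin>:<weightBin>"; users sharing a key are peers.
function groupPeers(docs, ageWidth, weightWidth) {
  var bins = {};
  docs.forEach(function (doc) {
    var key = binFloor(doc.age, ageWidth) + ":" + binFloor(doc.weight, weightWidth);
    (bins[key] = bins[key] || []).push(doc.name);
  });
  return bins;
}

var peers = groupPeers(users, 5, 10);
// peers["20:100"] is ["Bob", "Jenna"], peers["20:120"] is ["Gene", "Susan"],
// peers["25:100"] is ["Tom", "Ellen"]

// The same idea as an aggregation pipeline: compute the bins, then group on them.
var binPipeline = [
  { $project: {
      name: 1,
      ageBin: { $subtract: ["$age", { $mod: ["$age", 5] }] },
      weightBin: { $subtract: ["$weight", { $mod: ["$weight", 10] }] }
  } },
  { $group: {
      _id: { ageBin: "$ageBin", weightBin: "$weightBin" },
      peers: { $push: "$name" }
  } }
];

// In the mongo shell, assuming a collection named "users":
// db.users.aggregate(binPipeline)
```

Because every user lands in exactly one bin, a single pass covers all users instead of one command per user. The trade-off is that users just across a bin boundary (ages 24 and 25, for example) are not treated as peers.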
Hope this helps! Let us know if you need any additional help.