MapReduce function isn't quite working as it should


Andrew Carter

Apr 25, 2015, 8:25:54 AM
to mongod...@googlegroups.com
I'm running the following on my two collections, named "players" and "pharmacies".

The issue I'm having is that when I run the mapReduce functions, the first one outputs its rows to the collection and the second one outputs its rows to the collection as well.

Some of the data does get merged, and those merged rows are all I want.

How can I modify this to ensure that ONLY data that has a "match" of _id and pharmacy (which are both ObjectIds like ObjectId("1287943s76a87flkl2")) will remain in the collection, and data that doesn't have a "match" isn't put in there?

//Merge data so it outputs a complete row of data from the reduce function, no rows should appear if they don't have all the data

player_map = function() {
    emit(this.pharmacy, {"firstname" : this.first_name, "lastname" : this.last_name, "email" : this.email});
}

pharma_map = function() {
    emit(this._id, {"name" : this.name, "country" : this.country});
}

r = function(key, values) {
    var result = {
        "firstname" : "",
        "lastname" : "",
        "email" : "",
        "name" : "",
        "country" : ""
    };

    values.forEach(function(value) {
        if(value.firstname !== null) {result.firstname = value.firstname;}
        if(value.lastname !== null) {result.lastname = value.lastname;}
        if(value.email !== null) {result.email = value.email;}
        if(value.name !== null) {result.name = value.name;}
        if(value.country !== null) {result.country = value.country;}
    });

    return result;
}

res = db.players.mapReduce(player_map, r, {out: {reduce : 'joined'}});
res = db.pharmacies.mapReduce(pharma_map, r, {out: {reduce : 'joined'}});

Asya Kamsky

Apr 25, 2015, 8:51:48 PM
to mongodb-user

The issue is that if there is only one document emitted from map for a particular key, reduce is never even called and you don't even have an option not to return the value if it's not complete.
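
To illustrate the point, here is a small plain-JavaScript simulation. It is not MongoDB itself: simulateMapReduce, mergeFields and the sample data are made up for illustration, and it only models the rule described above, namely that reduce runs only for keys with more than one emitted value.

```javascript
// Toy model of how emitted values are grouped; not the server's implementation.
function simulateMapReduce(emitted, reduceFn) {
    var groups = {};
    emitted.forEach(function (pair) {
        (groups[pair.key] = groups[pair.key] || []).push(pair.value);
    });

    var reduceCalls = 0;
    var output = {};
    Object.keys(groups).forEach(function (key) {
        var values = groups[key];
        if (values.length === 1) {
            // A single value is stored exactly as emitted; reduce never sees it.
            output[key] = values[0];
        } else {
            reduceCalls++;
            output[key] = reduceFn(key, values);
        }
    });
    return {output: output, reduceCalls: reduceCalls};
}

// Hypothetical reduce that copies every field it finds into one object.
function mergeFields(key, values) {
    var result = {};
    values.forEach(function (value) {
        Object.keys(value).forEach(function (field) {
            result[field] = value[field];
        });
    });
    return result;
}

// Hypothetical emitted pairs: "ph1" has a player and a pharmacy, "ph2" only a player.
var emitted = [
    {key: "ph1", value: {firstname: "Ann"}},
    {key: "ph1", value: {name: "Central Pharmacy"}},
    {key: "ph2", value: {firstname: "Bob"}}
];

var sim = simulateMapReduce(emitted, mergeFields);
// sim.output.ph2 is the unmatched player row, passed through without any reduce call.
```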

But I think that if you add a finalizer function which only returns if all the expected fields are there, maybe that would do the trick for you.
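
A rough sketch of what such a finalizer might look like, assuming the five field names from the reduce function in the original post. finalizeJoined is a hypothetical name, and treating an empty string as "missing" is an assumption based on the defaults used in that reduce function.

```javascript
// Hypothetical finalizer: keep a row only when every expected field is filled in.
function finalizeJoined(key, reducedValue) {
    // Declared inside the function because it runs on the server,
    // where variables from the shell session are not visible by default.
    var required = ["firstname", "lastname", "email", "name", "country"];

    if (reducedValue === null || typeof reducedValue !== "object") {
        return null;
    }

    var complete = required.every(function (field) {
        var v = reducedValue[field];
        return v !== undefined && v !== null && v !== "";
    });

    // Incomplete rows are marked with null rather than returned as-is.
    return complete ? reducedValue : null;
}
```

It would be passed in the same options object as "out", for example {out: {reduce : 'joined'}, finalize: finalizeJoined}, and probably only on the second mapReduce call, since on the first call every row is still missing the pharmacy fields. As far as I know a finalizer cannot suppress a document, so incomplete rows would most likely end up in the collection with a null value and still need to be removed afterwards, as would any rows the second call never touches.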

Asya




--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/c1c248a2-79d1-4bd6-8390-8f40cdbb6c2c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
MongoDB World is back! June 1-2 in NYC. Use code ASYA for 25% off!

Andrew Carter

Apr 25, 2015, 10:58:30 PM
to mongod...@googlegroups.com
Thank you, Asya.

I'm not too experienced with Mongo, hence why I'm posting on this group.

How would that finalizer function fit in with my above code? I.e., if I was to add finalize: finalizefunction to my mapReduce call, what am I supposed to be doing in that finalizefunction to get the joined data?

Asya Kamsky

Apr 26, 2015, 2:54:01 AM
to mongodb-user

It's added to the options argument that includes the "out" info; the docs show some examples. Basically, it is called last for each key, after all the reducing is done.

Asya

Andrew Carter

Apr 26, 2015, 5:58:41 AM
to mongod...@googlegroups.com
Thanks Asya.

That example shows "out" as a merge, although I would need the merged data to be created in a new collection.

Would this not work? http://blog.knoldus.com/2014/03/12/easiest-way-to-implement-joins-in-mongodb-2-4/ it's exactly what I'm trying to achieve.

Otherwise, could you please assist with my code above to show how I can join on the ObjectId and get the joined data into a new collection, only for the "players" that don't have an empty "pharmacy" ObjectId?

Thanks again for your help!

Asya Kamsky

Apr 26, 2015, 8:43:17 AM
to mongodb-user

The example you point to is doing exactly what you are already doing.  Wouldn't his code have the same issue you are trying to avoid?

If he had a department with no employees it would still be output in the end.

How frequently do you have to do this "merge" and how are you using the output collection?   Maybe mapReduce isn't the only/best way to get there?

Asya

Andrew Carter

Apr 26, 2015, 8:46:16 AM
to mongod...@googlegroups.com
That's what I thought.. hmmmm.

I only have to do this merge once. I'd just like to see the correlating data (players matching up with the pharmacies name+country instead of just the pharmacy objectID). So this matched up data should be either in an outputted CSV or new collection - no real difference for me.

Any other ideas?

Asya Kamsky

Apr 26, 2015, 9:55:33 AM
to mongod...@googlegroups.com
Andy,

I just realized that your original code (and the blog post you linked to) was doing it wrong - you can't do it with map reduce, since if two players had the same pharmacy id you would only get the first player's info with pharmacy details in your map/reduce. In other words, if you emit unique pharmacy keys (department keys in the blog post), then everything will be reduced to a single row/document per pharmacy (department).
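
To make the collapse concrete, here is a plain-JavaScript sketch with made-up data: two players that share one pharmacy id go through a reduce written in the same spirit as the one in the original post. reduceJoined and the sample values are hypothetical, the field check is loosened to != null so that missing fields are skipped, and which player survives depends on the order in which the values arrive.

```javascript
// Reduce in the spirit of the original post; `!= null` also skips undefined fields.
function reduceJoined(key, values) {
    var result = {firstname: "", lastname: "", email: "", name: "", country: ""};
    values.forEach(function (value) {
        Object.keys(result).forEach(function (field) {
            if (value[field] != null) { result[field] = value[field]; }
        });
    });
    return result;
}

// Hypothetical values emitted under a single pharmacy id.
var valuesForPharmacy = [
    {firstname: "Ann", lastname: "Lee", email: "ann@example.com"},
    {firstname: "Bob", lastname: "Ray", email: "bob@example.com"},
    {name: "Central Pharmacy", country: "NZ"}
];

var row = reduceJoined("ph1", valuesForPharmacy);
// `row` holds one player's fields plus the pharmacy fields;
// the other player's details have been overwritten.
```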

Your simpler option is to just do it in the application or in the mongo shell. Here's how in the shell:

db.players.find().forEach(function(pl) {
     pharm=db.pharmacies.find({_id:pl.pharmacy});
     pl.pharma_name=pharm.name;
     pl.pharma_country=pharm.country;
     db.merged.insert(pl)
});

This iterates over the players collection and, for each player, looks up the pharmacy name and country by id, adds them to the player document and saves it to a new collection. The new collection will be identical to the players collection but with two extra fields for each player. My sample code assumes that every player's pharmacy field maps to an existing entry in the pharmacies collection - if you think that might not be the case (a dangling or dead-end foreign key, in other words), then you would want to add some error checking to the code.

Asya



Andrew Carter

Apr 26, 2015, 8:43:15 PM
to mongod...@googlegroups.com
Thanks Asya.

All players definitely have a pharmacy so no issue there.

It's inserting the two new fields but data is empty. 

I checked on the existing collections by doing a 

findOne({_id: ObjectId(...


It finds a match. But for some reason your example isn't inserting the found data (name and country) :/

Andrew Carter

Apr 26, 2015, 9:36:10 PM
to mongod...@googlegroups.com
Those two are coming back as undefined in Mongo shell.

Andrew Carter

Apr 26, 2015, 11:33:52 PM
to mongod...@googlegroups.com
All sorted, thanks Asya!

Asya Kamsky

Apr 27, 2015, 1:40:14 AM
to mongodb-user
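Sorry, I had a mistake in mine. find() returns a cursor rather than a document, so pharm.name and pharm.country come back undefined.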

Change

pharm=db.pharmacies.find({_id:pl.pharmacy});
to
pharm=db.pharmacies.findOne({_id:pl.pharmacy});
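
Putting the corrected findOne call together with the error checking mentioned earlier for dangling pharmacy ids, a minimal sketch might look like the following. mergePlayers and the skipped list are illustrative names, not anything from the thread, and the database handle is passed in as a parameter so the function can be tried outside the shell as well.

```javascript
// Sketch: copy each player into `merged` with the pharmacy name and country added.
function mergePlayers(db) {
    var skipped = [];
    db.players.find().forEach(function (pl) {
        // findOne returns a single document, or null when nothing matches.
        var pharm = db.pharmacies.findOne({_id: pl.pharmacy});
        if (pharm === null) {
            // Dangling pharmacy reference: record the player and do not insert.
            skipped.push(pl._id);
            return;
        }
        pl.pharma_name = pharm.name;
        pl.pharma_country = pharm.country;
        db.merged.insert(pl);
    });
    return skipped;
}
```

In the mongo shell this would be called as mergePlayers(db); the returned array lists the _id of every player whose pharmacy could not be found.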





