Chris
Nov 18, 2010, 8:36:00 PM
to mongodb-user
(Apologies for the length of this post.)
I've come across what seems like a rather pernicious bug in MongoDB's
mapreduce functionality. In essence, you can run a mapreduce command
without a usable map or reduce function, and no error is thrown. An
illustration:
db.testing.save({"foo":"Hello World"});
db.testing.save({"foo":"Goodbye, Cruel World"});
db.runCommand(
{ mapreduce : "testing",
map : function(){emit(this.foo, {count: 1})},
reduce : function(key, values){
return {count: values[0].count};
},
out: "test_mr"
});
This isn't doing anything interesting (the reduce function is
particularly dumb), but it should do *something*. Here's the output
you should see:
> db.test_mr.find()
{ "_id" : "Goodbye, Cruel World", "value" : { "count" : 1 } }
{ "_id" : "Hello World", "value" : { "count" : 1 } }
All well and good. Now, if you were to replace your map function
definition with a String, it still works:
db.runCommand(
{ mapreduce : "testing",
map : "function(){emit(this.foo, {count: 1})}",
reduce : function(key, values){
return {count: values[0].count};
},
out: "test_mr"
});
This is good, because I actually have my map and reduce functions
specified in separate files (for ease of maintenance) and convert them
to strings to feed into MongoDB. (I actually interact with MongoDB
through Clojure's congomongo library, which is a wrapper for the Java
MongoDB driver, but I've found this behavior in the JavaScript shell,
which is what my examples will use.)
If you've got that function specified in a file that looks like this:
function(){
emit(this.foo, {count: 1})
}
you'll end up with a String that looks like this (note the newlines):
"function(){\n emit(this.foo, {count: 1})\n}"
Stick that into your mapreduce command, and everything still works:
db.runCommand(
{ mapreduce : "testing",
map : "function(){\n emit(this.foo, {count: 1})\n}",
reduce : function(key, values){
return {count: values[0].count};
},
out: "test_mr"
});
It even works with comments in your file:
/**
* This is my awesome map function.
*/
function(){
// Emit some stuff
emit(this.foo, {count: 1})
}
Convert to a string, then run:
db.runCommand(
{ mapreduce : "testing",
map : "/**\n * This is my awesome map function.\n */\nfunction(){\n // Emit some stuff\n emit(this.foo, {count: 1})\n}",
reduce : function(key, values){
return {count: values[0].count};
},
out: "test_mr"
});
Still works... but only if your leading comment(s) before the
beginning of the function are block comments and not line comments!
For example:
// This will fail
function(){
// Emit some stuff
emit(this.foo, {count: 1})
}
Run as a string:
db.runCommand(
{ mapreduce : "testing",
map : "// This will fail\nfunction(){\n // Emit some stuff\n emit(this.foo, {count: 1})\n}",
reduce : function(key, values){
return {count: values[0].count};
},
out: "test_mr"
});
Here's the status object returned from that call:
{
"result" : "test_mr",
"timeMillis" : 2,
"counts" : {
"input" : 2,
"emit" : 0,
"output" : 0
},
"ok" : 1
}
Note that 0 records are emitted. To take this to its logical extreme,
the following two invocations run, but emit no records:
db.runCommand(
{ mapreduce : "testing",
map : "//",
reduce : function(key, values){
return {count: values[0].count};
},
out: "test_mr"
});
db.runCommand(
{ mapreduce : "testing",
map : "",
reduce : function(key, values){
return {count: values[0].count};
},
out: "test_mr"
});
The same thing happens if you also replace your reduce function with
an empty string:
db.runCommand(
{ mapreduce : "testing",
map : "",
reduce : "",
out: "test_mr"
});
No errors of any kind are thrown at all, despite the fact that there
are no functions to run!
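Since the command itself reports "ok" : 1, the only visible symptom is
in the counts. As a rough sketch of how one might catch this on the
client side, the helper below inspects a status object shaped like the
one shown above. The name checkMapReduceStatus is hypothetical and not
part of MongoDB, and the check is only a heuristic: a legitimate map
function can also emit nothing for a given input.

```javascript
// Hypothetical helper (not part of MongoDB): flags a mapreduce status
// object that consumed input documents but emitted nothing.
function checkMapReduceStatus(status) {
  if (!status || status.ok !== 1) {
    return "command failed";
  }
  var counts = status.counts || {};
  if (counts.input > 0 && counts.emit === 0) {
    return "suspicious: " + counts.input + " input documents but 0 emits";
  }
  return "ok";
}
```

Passing the result of db.runCommand(...) to this function would at
least turn the silent case into something you can log or fail on.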
The behavior of the comments is a bit odd, as internal line comments
are handled just fine, as can be seen in the above examples. If the
whole string starts with a line comment, however, internal newlines
are not recognized, and the entire string is apparently seen as a
comment. The fact that MongoDB will happily "run" your mapreduce
query without having either a map or a reduce function, and without
giving any error of any kind is pretty insidious, particularly since
it is quick to throw an error when you have incorrect syntax in your
map or reduce functions. I lost a day to tracking this thing down.
This surely must be a bug, right?
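In the meantime, a possible workaround is to clean up the source string
before handing it to MongoDB. The following is a minimal sketch, under
the assumption that the only trouble comes from comments and whitespace
that precede the function keyword. Both stripLeadingComments and
prepareFunctionSource are hypothetical names made up for illustration;
they are not part of MongoDB or any driver.

```javascript
// Hypothetical helper: removes whitespace and comments that precede the
// function source, so the string sent to the server starts with "function".
function stripLeadingComments(src) {
  var s = String(src);
  for (;;) {
    var before = s;
    s = s.replace(/^\s+/, "");               // leading whitespace
    s = s.replace(/^\/\/[^\n]*(\n|$)/, "");  // one leading line comment
    s = s.replace(/^\/\*[\s\S]*?\*\//, "");  // one leading block comment
    if (s === before) {
      return s;                              // nothing left to strip
    }
  }
}

// Hypothetical check: refuse sources that do not begin with a function,
// which covers the empty-string and comment-only cases.
function prepareFunctionSource(src) {
  var cleaned = stripLeadingComments(src);
  if (!/^function\b/.test(cleaned)) {
    throw new Error("map/reduce source does not define a function");
  }
  return cleaned;
}
```

With this, the "// This will fail" example would be sent as a string
that begins with function(){, while "" and "//" would raise an error on
the client instead of silently producing an empty result. Comments
inside the function body are left untouched.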