Hi all,
While working on my project (basically a proxy for mongodb, written in Java), I observed some very strange behavior in the implementation of mongodb.
My observation: mongodb is sensitive to the order of fields in a BSON document! This must not be the case. mongodb must use the element name to find an element and must not retrieve an element based on its position.
Scenario:
Let's suppose we want to run db.players.find() in the football database.
The following wire protocol message is generated:
[71, 0, 0, 0, 38, 0, 0, 0, 0, 0, 0, 0, -38, 7, 0, 0, 102, 111, 111, 116, 98, 97, 108, 108, 0, 102, 105, 110, 100, 0,
36, 0, 0, 0, 2, 102, 105, 110, 100, 0, 8, 0, 0, 0, 112, 108, 97, 121, 101, 114, 115, 0, 3, 102, 105, 108, 116, 101, 114, 0, 5, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0]
The 36 bytes beginning with 36, 0, 0, 0 form a BSON document; it is the value of the metadata field.
The whole message is
Header
Message length is 71
Request ID is 38
Response To is 0
opCode is OP_COMMAND
Body
database is football
commandName is find
metadata is {"find":"players","filter":{}}
commandArgs is {}
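As a cross-check of the decoding above, here is a minimal C++ sketch that reads the fixed part of such a message. The struct and function names are made up for illustration. The layout (four little-endian int32 header fields followed by two NUL-terminated strings) is what the bytes in this post suggest, not something taken from the server code. Note that Java prints bytes as signed values, so -38 is the same byte as 0xDA.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical view of the leading fields of an OP_COMMAND message.
struct OpCommandPrefix {
    std::int32_t messageLength;
    std::int32_t requestId;
    std::int32_t responseTo;
    std::int32_t opCode;
    std::string database;
    std::string commandName;
    std::size_t metadataOffset;  // index of the first byte of the metadata document
};

// Read a little-endian int32 at pos and advance pos by four bytes.
std::int32_t readInt32LE(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
    if (pos + 4 > buf.size()) throw std::runtime_error("truncated int32");
    std::uint32_t v = static_cast<std::uint32_t>(buf[pos]) |
                      (static_cast<std::uint32_t>(buf[pos + 1]) << 8) |
                      (static_cast<std::uint32_t>(buf[pos + 2]) << 16) |
                      (static_cast<std::uint32_t>(buf[pos + 3]) << 24);
    pos += 4;
    return static_cast<std::int32_t>(v);
}

// Read a NUL-terminated string at pos and advance pos past the terminator.
std::string readCString(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
    std::string s;
    while (pos < buf.size() && buf[pos] != 0) {
        s.push_back(static_cast<char>(buf[pos++]));
    }
    if (pos >= buf.size()) throw std::runtime_error("unterminated string");
    ++pos;  // skip the NUL byte
    return s;
}

// Decode the header, database name and command name.
OpCommandPrefix decodeOpCommandPrefix(const std::vector<std::uint8_t>& buf) {
    std::size_t pos = 0;
    OpCommandPrefix p;
    p.messageLength = readInt32LE(buf, pos);
    p.requestId = readInt32LE(buf, pos);
    p.responseTo = readInt32LE(buf, pos);
    p.opCode = readInt32LE(buf, pos);
    p.database = readCString(buf, pos);
    p.commandName = readCString(buf, pos);
    p.metadataOffset = pos;
    return p;
}
```

Feeding it the first 30 bytes of the message above gives a message length of 71, opCode 2010 (OP_COMMAND), database football and commandName find, with the metadata document starting at offset 30.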
My app generates the following (I commented out the processing part):
[71, 0, 0, 0, 15, 0, 0, 0, 0, 0, 0, 0, -38, 7, 0, 0, 102, 111, 111, 116, 98, 97, 108, 108, 0, 102, 105, 110, 100, 0,
36, 0, 0, 0, 3, 102, 105, 108, 116, 101, 114, 0, 5, 0, 0, 0, 0, 2, 102, 105, 110, 100, 0, 8, 0, 0, 0, 112, 108, 97, 121, 101, 114, 115, 0, 0, 5, 0, 0, 0, 0]
The whole message is
Header
Message length is 71
Request ID is 15
Response To is 0
opCode is OP_COMMAND
Body
database is football
commandName is find
metadata is {"filter":{},"find":"players"}
commandArgs is {}
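The only difference between the two messages, apart from the request ID, is the order of the elements inside the metadata document. A small sketch that lists the top-level field names of a raw BSON document in their stored order makes this visible. The helper names are hypothetical, and the sketch only understands a handful of BSON element types (enough for the documents in this post); anything else is rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Little-endian int32 at position p, returned as an unsigned size.
std::size_t int32At(const std::vector<std::uint8_t>& doc, std::size_t p) {
    if (p + 4 > doc.size()) throw std::runtime_error("truncated int32");
    return static_cast<std::size_t>(doc[p]) |
           (static_cast<std::size_t>(doc[p + 1]) << 8) |
           (static_cast<std::size_t>(doc[p + 2]) << 16) |
           (static_cast<std::size_t>(doc[p + 3]) << 24);
}

// Number of bytes occupied by the value of an element of the given type.
std::size_t bsonValueSize(const std::vector<std::uint8_t>& doc,
                          std::uint8_t type, std::size_t pos) {
    switch (type) {
        case 0x01: return 8;                      // double
        case 0x02: return 4 + int32At(doc, pos);  // string: length prefix + bytes
        case 0x03:                                // embedded document
        case 0x04: return int32At(doc, pos);      // array (size includes itself)
        case 0x08: return 1;                      // boolean
        case 0x0A: return 0;                      // null
        case 0x10: return 4;                      // int32
        case 0x12: return 8;                      // int64
        default: throw std::runtime_error("BSON type not handled in this sketch");
    }
}

// Top-level field names in the order in which they are stored.
std::vector<std::string> topLevelFieldNames(const std::vector<std::uint8_t>& doc) {
    std::vector<std::string> names;
    std::size_t pos = 4;  // skip the document's own length prefix
    while (pos < doc.size() && doc[pos] != 0) {
        std::uint8_t type = doc[pos++];
        std::string name;
        while (pos < doc.size() && doc[pos] != 0) {
            name.push_back(static_cast<char>(doc[pos++]));
        }
        if (pos >= doc.size()) throw std::runtime_error("unterminated field name");
        ++pos;  // skip the NUL after the name
        pos += bsonValueSize(doc, type, pos);
        names.push_back(name);
    }
    return names;
}
```

For the 36-byte metadata document of the first message this yields find, filter; for the one my app produces it yields filter, find.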
mongodb then fails with the following message:
Header
Message length is 112
Request ID is 104
Response To is 15
opCode is OP_COMMANDREPLY
Body
metadata is {"code":73,"errmsg":"Invalid collection name: football","ok":0.000000,"waitedMS":}
commandArgs is {}
Instead of resolving the namespace football.players, mongodb ends up with just football. Why? Because mongodb builds the namespace by appending the string value of the first element in the metadata field to the database name. In the example above, the type of the first element is an embedded document, not a string, so only the database name is used and the command fails with the error shown. Instead of reading the first element, the code must look up the find element by name.
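The difference between the two lookup strategies can be shown with a toy model. This is not MongoDB code: ToyElement, ToyDocument and both functions are invented for illustration, and a command document is reduced to an ordered list of fields whose values are either strings or "something else".

```cpp
#include <string>
#include <vector>

// Toy stand-in for one element of an ordered command document.
struct ToyElement {
    std::string name;
    bool isString;      // false for any non-string value, e.g. an embedded document
    std::string value;  // only meaningful when isString is true
};
using ToyDocument = std::vector<ToyElement>;

// Position-based strategy: only the first element is considered.
std::string nsFromFirstElement(const std::string& db, const ToyDocument& cmd) {
    if (cmd.empty() || !cmd.front().isString) {
        return db;  // no collection name found, only the database name is left
    }
    return db + "." + cmd.front().value;
}

// Name-based strategy: search for the element named after the command.
std::string nsFromNamedElement(const std::string& db, const ToyDocument& cmd,
                               const std::string& commandName) {
    for (const ToyElement& el : cmd) {
        if (el.name == commandName && el.isString) {
            return db + "." + el.value;
        }
    }
    return db;
}
```

With the fields in the order filter, find, the position-based function returns football, while the name-based one returns football.players regardless of the order.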
Details
mongo/src/mongo/db/commands/find_cmd.cpp
bool run(OperationContext* txn, const std::string& dbname, BSONObj& cmdObj, int options, std::string& errmsg, BSONObjBuilder& result)
{
    const NamespaceString nss(parseNs(dbname, cmdObj));
    if (!nss.isValid() || nss.isCommand() || nss.isSpecialCommand()) {
        return appendCommandStatus(result,
                                   {ErrorCodes::InvalidNamespace,
                                    str::stream() << "Invalid collection name: " << nss.ns()});
    }
    ...
}
mongo/src/mongo/db/commands.cpp
string Command::parseNs(const string& dbname, const BSONObj& cmdObj) const {
    BSONElement first = cmdObj.firstElement();
    if (first.type() != mongo::String)
        return dbname;
    return str::stream() << dbname << '.' << cmdObj.firstElement().valuestr();
}
string Command::parseNsFullyQualified(const string& dbname, const BSONObj& cmdObj) {
    BSONElement first = cmdObj.firstElement(); //this line is the problem
    uassert(ErrorCodes::BadValue,
            str::stream() << "collection name has invalid type " << typeName(first.type()),
            first.canonicalType() == canonicalizeBSONType(mongo::String));
    const NamespaceString nss(first.valueStringData());
    uassert(ErrorCodes::InvalidNamespace,
            str::stream() << "Invalid namespace specified '" << nss.ns() << "'",
            nss.isValid());
    return nss.ns();
}
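As long as the server reads the first element, one possible workaround on the proxy side is to move the element named after the command to the front of the metadata document before forwarding it. The following is only a sketch under that assumption: it is written in C++ to match the snippets above (my proxy would need the equivalent in Java), all names are hypothetical, and it handles only a few BSON element types.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Little-endian int32 at position p, returned as an unsigned size.
std::size_t leInt32(const Bytes& b, std::size_t p) {
    if (p + 4 > b.size()) throw std::runtime_error("truncated int32");
    return static_cast<std::size_t>(b[p]) |
           (static_cast<std::size_t>(b[p + 1]) << 8) |
           (static_cast<std::size_t>(b[p + 2]) << 16) |
           (static_cast<std::size_t>(b[p + 3]) << 24);
}

// Number of bytes occupied by an element value of the given type.
std::size_t valueSize(const Bytes& doc, std::uint8_t type, std::size_t pos) {
    switch (type) {
        case 0x01: return 8;                      // double
        case 0x02: return 4 + leInt32(doc, pos);  // string
        case 0x03:                                // embedded document
        case 0x04: return leInt32(doc, pos);      // array
        case 0x08: return 1;                      // boolean
        case 0x0A: return 0;                      // null
        case 0x10: return 4;                      // int32
        case 0x12: return 8;                      // int64
        default: throw std::runtime_error("BSON type not handled in this sketch");
    }
}

// Return a copy of doc in which the element called commandName comes first.
// The relative order of all other elements is preserved.
Bytes moveElementFirst(const Bytes& doc, const std::string& commandName) {
    if (doc.size() < 5) throw std::runtime_error("document too short");
    Bytes front, rest;
    std::size_t pos = 4;  // skip the length prefix
    while (pos < doc.size() && doc[pos] != 0) {
        const std::size_t start = pos;
        const std::uint8_t type = doc[pos++];
        std::string name;
        while (pos < doc.size() && doc[pos] != 0) {
            name.push_back(static_cast<char>(doc[pos++]));
        }
        if (pos >= doc.size()) throw std::runtime_error("unterminated field name");
        ++pos;  // skip the NUL after the name
        pos += valueSize(doc, type, pos);
        if (pos > doc.size()) throw std::runtime_error("element overruns document");
        Bytes& target = (name == commandName) ? front : rest;
        target.insert(target.end(),
                      doc.begin() + static_cast<std::ptrdiff_t>(start),
                      doc.begin() + static_cast<std::ptrdiff_t>(pos));
    }
    Bytes out(doc.begin(), doc.begin() + 4);  // total length does not change
    out.insert(out.end(), front.begin(), front.end());
    out.insert(out.end(), rest.begin(), rest.end());
    out.push_back(0);  // document terminator
    return out;
}
```

Applied to the metadata document my app generates, with find as the command name, this produces exactly the 36 bytes of the metadata document in the first message.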
I think one should fix this. What do you think?
Regards,
Lida