MongoDB 2.0.2 MapReduce Assertion with Authenticated Shards


George Macon

Feb 9, 2012, 3:00:17 PM
to mongod...@googlegroups.com
When v2.0.0 was released, we tried to upgrade because "Although the
major version number has changed, MongoDB 2.0 is a standard, incremental
production release and works as a drop-in replacement for MongoDB 1.8."
After upgrading, our map-reduce scripts started failing, so we backed
out the change and are currently running 1.8.4 in production.

Recently, I built a testing box because we thought the problem might
manifest only under heavy load.

Here's the setup I'm using:

* Ubuntu 10.04 LTS (Lucid Lynx)
* MongoDB 2.0.2 from the 10gen apt repository.
* Two replica sets consisting of two replicas and an arbiter. These two
sets are shards in the database.
* --keyFile authentication is enabled. (This is the reason we want to upgrade to 2.0.)
* For testing, the shard server has been configured with a small chunk size.

I created 10,000,000 dummy documents in the sharded collection and
attempted to run a Map-Reduce. It eventually failed with this error:

pymongo.errors.OperationFailure: command SON([('mapreduce', u'phrases'),
('map', "function () { var a = this.phrase.split(' '); for (var i = 0; i
< a.length; ++i) { emit(a[i], 1); } }"), ('reduce', 'function (k, v) { t
= 0; for(var i = 0; i < v.length; ++i) { t += v[i]; } return t; }'),
('out', 'word_counts')]) failed: mongod mr failed: { assertion:
"assertion db/commands/../../util/net/../../db/../bson/bson-inl.h:184",
errmsg: "db assertion failure", ok: 0.0 }

Here are selected lines from mongos.log. I can send the full logs
off-list to anyone who is interested.

Tue Feb 7 09:23:11 [conn2] authenticate: { authenticate: 1, nonce:
"e4b51fb3d5abd571", user: "mars", key: "201f1b95aba070bf9a323b9bb2b86614" }
Tue Feb 7 09:23:11 [conn2] initializing shard connection to cid:27018
Tue Feb 7 09:23:11 [conn2] initializing shard connection to cid:27021
Tue Feb 7 09:23:11 [conn2] setShardVersion shard1 cid:27018
test.phrases { setShardVersion: "test.phrases", configdb:
"cid:27023,cid:27024,cid:27025", version: Timestamp 10000|17, serverID:
ObjectId('4f31332942c5c1373aa9c862'), shard: "shard1", shardHost:
"shard1/cid:27017,cid:27018" } 0x15c9c60
Tue Feb 7 09:23:11 [conn2] setShardVersion failed!
Tue Feb 7 09:23:11 [conn2] setShardVersion shard1 cid:27018
test.phrases { setShardVersion: "test.phrases", configdb:
"cid:27023,cid:27024,cid:27025", version: Timestamp 10000|17, serverID:
ObjectId('4f31332942c5c1373aa9c862'), authoritative: true, shard:
"shard1", shardHost: "shard1/cid:27017,cid:27018" } 0x15c9c60
Tue Feb 7 09:23:11 [conn2] setShardVersion success: { oldVersion:
Timestamp 0|0, ok: 1.0 }
Tue Feb 7 09:23:11 [conn2] setShardVersion shard2 cid:27021
test.phrases { setShardVersion: "test.phrases", configdb:
"cid:27023,cid:27024,cid:27025", version: Timestamp 10000|21, serverID:
ObjectId('4f31332942c5c1373aa9c862'), shard: "shard2", shardHost:
"shard2/cid:27020,cid:27021" } 0x15c9fa0
Tue Feb 7 09:23:11 [conn2] setShardVersion failed!
Tue Feb 7 09:23:11 [conn2] setShardVersion shard2 cid:27021
test.phrases { setShardVersion: "test.phrases", configdb:
"cid:27023,cid:27024,cid:27025", version: Timestamp 10000|21, serverID:
ObjectId('4f31332942c5c1373aa9c862'), authoritative: true, shard:
"shard2", shardHost: "shard2/cid:27020,cid:27021" } 0x15c9fa0
Tue Feb 7 09:23:11 [conn2] setShardVersion success: { oldVersion:
Timestamp 0|0, ok: 1.0 }
Tue Feb 7 10:15:35 [conn2] ERROR: sharded m/r failed on shard:
shard1/cid:27017,cid:27018 error: { assertion: "assertion
db/commands/../../util/net/../../db/../bson/bson-inl.h:184", errmsg: "db
assertion failure", ok: 0.0 }
Tue Feb 7 10:21:01 [conn2] ERROR: sharded m/r failed on shard:
shard2/cid:27020,cid:27021 error: { assertion: "assertion
db/commands/../../util/net/../../db/../bson/bson-inl.h:184", errmsg: "db
assertion failure", ok: 0.0 }
Tue Feb 7 10:21:01 [conn2] end connection 127.0.0.1:46513

bson/bson-inl.h:184 is `assert( isABSONObj() );`

Any idea what's causing this?
--
George Macon
Georgia Tech Research Institute
Cyber Technology and Information Security Laboratory
404-407-8185


Eliot Horowitz

Feb 12, 2012, 1:24:36 AM
to mongod...@googlegroups.com
Can you send the map/reduce code from this?

George Macon

Feb 13, 2012, 8:48:03 AM
to mongod...@googlegroups.com
Sure. I've attached the map reduce test script and the fake data
generator (both Python scripts).

On 02/12/2012 01:24 AM, Eliot Horowitz wrote:
> Can you send the map/reduce code from this?

load_phrases.py
map_reduce_test.py

Antoine Girbal

Feb 13, 2012, 10:53:24 AM
to mongodb-user
This looks like the error fixed in SERVER-4114, which should be in
2.0.2.
Did you properly set the --keyFile option on every single machine (mongos,
all mongods, configs)?

George Macon

Feb 14, 2012, 9:27:03 AM
to mongod...@googlegroups.com

On 02/13/2012 10:53 AM, Antoine Girbal wrote:
> this looks like the error fixed in SERVER-4114 which should be in
> 2.0.2.
I had found SERVER-4490, which is a dupe of SERVER-4114, but my symptoms
are not exactly the same. I'm not getting the "could not initialize
cursor across all shards" error, which I remember seeing with 2.0.0.
When I run the map-reduce, it runs for a while and then exits with the
assertion.

> Did you properly set --keyFile option on every single machine (mongos,
> all mongods, configs)?
>

Yes. (For testing, I'm running this all on one machine.)

/etc/mongodb$ grep keyFile *.conf ../mongos.conf
cs-1.conf:keyFile = /etc/mongodb/key
cs-2.conf:keyFile = /etc/mongodb/key
cs-3.conf:keyFile = /etc/mongodb/key
shard1-arbiter.conf:keyFile = /etc/mongodb/key
shard1-replica1.conf:keyFile = /etc/mongodb/key
shard1-replica2.conf:keyFile = /etc/mongodb/key
shard2-arbiter.conf:keyFile = /etc/mongodb/key
shard2-replica1.conf:keyFile = /etc/mongodb/key
shard2-replica2.conf:keyFile = /etc/mongodb/key
../mongos.conf:keyFile=/etc/mongodb/key
/etc/mongodb$
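
The same check can be scripted. This is a minimal sketch, assuming the old `option = value` config format shown in the grep output. The helper names and the dict-of-file-contents input are hypothetical, chosen so the logic can be exercised without touching the filesystem.

```python
def keyfile_settings(conf_texts):
    """Map each config name to its keyFile value, or None if the option is absent.

    conf_texts: dict of {config name: file contents} (hypothetical input shape).
    """
    settings = {}
    for name, text in conf_texts.items():
        value = None
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments and whitespace
            if "=" not in line:
                continue
            option, _, raw = line.partition("=")
            if option.strip() == "keyFile":
                value = raw.strip()
        settings[name] = value
    return settings


def all_share_keyfile(settings):
    """True only if every config sets keyFile and all values are identical."""
    values = set(settings.values())
    return len(values) == 1 and None not in values


if __name__ == "__main__":
    confs = {
        "cs-1.conf": "keyFile = /etc/mongodb/key\n",
        "mongos.conf": "keyFile=/etc/mongodb/key\n",
    }
    print(all_share_keyfile(keyfile_settings(confs)))
```

This only compares the configured paths. It says nothing about the key file's contents or permissions, which would matter if the processes ran on separate machines.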

Antoine Girbal

Feb 24, 2012, 8:15:54 PM
to mongodb-user
Also, just to double check, does the error go away if you do not use
keyFile authentication?

George Macon

Feb 27, 2012, 9:23:25 AM
to mongod...@googlegroups.com
It does. With auth disabled, the MapReduce completes successfully.

Antoine Girbal

Feb 28, 2012, 2:40:36 AM
to mongodb-user
Uh, looks like Google discarded my earlier message.
I have tried to reproduce with about the same setup:
- stock 2.0.2
- 2-shard system with single servers as shards
- 1 config server
- keyFile used for all servers

This works fine, with no sign of the error.
I'll have to try with replica sets instead of single servers.
Could you give the output of db.printShardingStatus()?
Thanks.

George Macon

Feb 28, 2012, 12:07:28 PM
to mongod...@googlegroups.com
For the record:

mongos> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard1", "host" : "shard1/cid:27017,cid:27018" }
{ "_id" : "shard2", "host" : "shard2/cid:27020,cid:27021" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : true, "primary" : "shard2" }
test.phrases chunks:
shard1 16
shard2 18
too many chunks to print, use verbose if you want to force print
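
The per-shard chunk totals above can also be derived from the chunk metadata directly. This is a minimal sketch, assuming chunk documents that carry `ns` and `shard` fields, as in the `config.chunks` collection of this MongoDB generation. The helper name is hypothetical.

```python
from collections import Counter


def chunks_per_shard(chunk_docs, namespace):
    """Count chunks per shard for one namespace.

    chunk_docs: iterable of dicts with "ns" and "shard" keys (assumed layout).
    """
    return Counter(doc["shard"] for doc in chunk_docs if doc.get("ns") == namespace)


if __name__ == "__main__":
    # Hand-written sample documents, not real cluster metadata.
    sample = [
        {"ns": "test.phrases", "shard": "shard1"},
        {"ns": "test.phrases", "shard": "shard2"},
        {"ns": "test.phrases", "shard": "shard2"},
        {"ns": "test.other", "shard": "shard1"},
    ]
    print(dict(chunks_per_shard(sample, "test.phrases")))
    # Against a live cluster (assumption: an authenticated pymongo client
    # connected to mongos), the input could be client.config.chunks.find().
```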

Antoine Girbal

Mar 19, 2012, 6:06:17 PM
to mongodb-user
Unfortunately, I still cannot reproduce this with 2.0.2 and the keyFile
option turned on.
I set up a different cluster with 2 shards using replica sets.

mongos> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "foo", "host" : "foo/agmac.local:27017,agmac.local:27019,agmac.local:27018" }
{ "_id" : "foo2", "host" : "foo2/agmac.local:27020,agmac.local:27021" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : true, "primary" : "foo" }
test.sharded chunks:
foo 1
foo2 1
{ "a" : { "$MinKey" : true } } -->> { "a" : 250 } on : foo { "t" : 2000, "i" : 1 }
{ "a" : 250 } -->> { "a" : { "$MaxKey" : true } } on : foo2 { "t" : 2000, "i" : 0 }

Map reduce is running successfully and uses documents from both
shards:

mongos> db.runCommand({
    "mapreduce" : "sharded",
    "map" : "function() { emit(this.a, {count: 1}); }",
    "reduce" : "function(key, values) { total = 0; for (var i = 0; i < values.length; ++i) { total += values[i].count; }; return {count: total}; }",
    "verbose" : true,
    "out" : { "replace" : "mrout" }
})
{
    "result" : "mrout",
    "shardCounts" : {
        "foo/agmac.local:27017,agmac.local:27019,agmac.local:27018" : {
            "input" : 2512,
            "emit" : 2512,
            "reduce" : 712,
            "output" : 250
        },
        "foo2/agmac.local:27020,agmac.local:27021" : {
            "input" : 2418,
            "emit" : 2418,
            "reduce" : 694,
            "output" : 250
        }
    },
    "counts" : {
        "emit" : NumberLong(4930),
        "input" : NumberLong(4930),
        "output" : NumberLong(500),
        "reduce" : NumberLong(1406)
    },
    "ok" : 1,
    "timeMillis" : 189,
    "timing" : {
        "shards" : 169,
        "final" : 20
    }
}
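
As a sanity check on that output, the per-shard counts should add up to the top-level totals. Here is a minimal sketch using the numbers above, with `NumberLong` values written as plain integers. The `totals_match` helper is hypothetical.

```python
def totals_match(result):
    """Return True if summed per-shard counts equal the top-level counts."""
    summed = {}
    for shard_counts in result["shardCounts"].values():
        for name, value in shard_counts.items():
            summed[name] = summed.get(name, 0) + value
    return all(summed.get(name) == int(value)
               for name, value in result["counts"].items())


# Figures copied from the verbose mapreduce output above.
result = {
    "shardCounts": {
        "foo/agmac.local:27017,agmac.local:27019,agmac.local:27018": {
            "input": 2512, "emit": 2512, "reduce": 712, "output": 250,
        },
        "foo2/agmac.local:27020,agmac.local:27021": {
            "input": 2418, "emit": 2418, "reduce": 694, "output": 250,
        },
    },
    "counts": {"emit": 4930, "input": 4930, "output": 500, "reduce": 1406},
}

if __name__ == "__main__":
    print(totals_match(result))
```

The output totals add up here because the single split point at `a = 250` keeps each key on exactly one shard.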



Antoine Girbal

Mar 19, 2012, 6:09:44 PM
to mongodb-user
Is this always reproducible?
Could you try restarting all components and specifying --keyFile on the
command line instead of in the config file?
If you create a new small sharded collection and put a few thousand
documents in it, do you still get the same MR error against it?
Anything that narrows down the issue would help. Thanks.
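
A small test collection like that could be filled with a generator along these lines. This is only a sketch: it is not the attached load_phrases.py, whose contents are not shown in this thread, and the vocabulary, function name, and defaults are all hypothetical.

```python
import random

# Hypothetical vocabulary; any list of space-free words would do.
WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]


def make_phrases(count, words_per_phrase=5, seed=0):
    """Yield `count` dummy documents shaped like {"phrase": "w1 w2 ..."}."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    for _ in range(count):
        words = [rng.choice(WORDS) for _ in range(words_per_phrase)]
        yield {"phrase": " ".join(words)}


if __name__ == "__main__":
    docs = list(make_phrases(5000))
    print(len(docs), docs[0])
    # To load these into a cluster, pass `docs` to the driver's insert call
    # on a new sharded collection (the collection name is up to you).
```

Starting from a few thousand documents and growing the count until the assertion appears would show whether data volume is a factor.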