Hi!
We are evaluating different mongodb setups on amazon ec2 with ubuntu
10.0.4 AMIs. We have tested setting up 3 config servers on 3 micro
instances and a router on a large instance. We first tried to shard on
one large instance and got ~9000 upserts/second, which didn't change
much during the test. When adding a shard on another large instance we
only got ~7000 upserts/second. When checking mongostat we can see that
upserts vary very much (between 0 to 35000 update/s). we also get a
weird StaleConfigException at the same time. Reads seems also to get
slower with two shards. This seems very weird and we can't find the
error, do you have any ideas?
mongostat:
insert/s query/s update/s delete/s getmore/s command/s mapped
vsize res faults/s locked % idx miss % conn time
0 0 2975 0 0 1 0
438 20 0 0 0 13 10:34:10
0 0 4292 0 0 1 0
438 20 0 0 0 13 10:34:11
0 0 4349 0 0 1 0
438 20 0 0 0 13 10:34:12
0 0 22941 0 0 1 0
438 20 0 0 0 13 10:34:13
0 0 12207 0 0 1 0
438 20 0 0 0 13 10:34:14
0 0 10701 0 0 1 0
438 20 0 0 0 13 10:34:15
0 0 4544 0 0 1 0
438 21 0 0 0 13 10:34:16
0 0 4453 0 0 1 0
438 21 0 0 0 13 10:34:17
0 0 3957 0 0 1 0
438 21 0 0 0 13 10:34:18
0 0 5227 0 0 1 0
438 21 0 0 0 13 10:34:19
0 0 4192 0 0 1 0
438 21 0 0 0 13 10:34:20
0 0 25435 0 0 1 0
438 21 0 0 0 13 10:34:21
0 0 4121 0 0 1 0
438 21 0 0 0 13 10:34:22
0 0 2416 0 0 1 0
438 21 0 0 0 13 10:34:23
0 0 2166 0 0 1 0
438 21 0 0 0 13 10:34:24
0 0 2195 0 0 1 0
438 21 0 0 0 13 10:34:25
0 0 2044 0 0 1 0
438 21 0 0 0 13 10:34:26
0 0 12227 0 0 1 0
438 21 0 0 0 13 10:34:27
0 0 3643 0 0 1 0
438 21 0 0 0 13 10:34:28
0 0 25697 0 0 1 0
438 21 0 0 0 13 10:34:29
insert/s query/s update/s delete/s getmore/s command/s mapped
vsize res faults/s locked % idx miss % conn time
0 0 35755 0 0 1 0
438 21 0 0 0 13 10:34:30
0 0 7316 0 0 1 0
438 21 0 0 0 13 10:34:31
0 0 9863 0 0 1 0
438 21 0 0 0 13 10:34:32
0 0 10730 0 0 1 0
438 21 0 0 0 13 10:34:33
0 0 10401 0 0 1 0
438 21 0 0 0 13 10:34:34
0 0 8030 0 0 1 0
438 21 0 0 0 13 10:34:35
0 0 8410 0 0 1 0
438 21 0 0 0 13 10:34:36
....
mongos log:
Wed Jan 5 10:29:22 [conn3] autosplitting ud.visits size: 58893640
shard: ns:ud.visits at: shard0001:
10.228.94.255:27017 lastmod: 3|3
min: { uuid: "7cbe5f73-b804-4d76-834b-26ea8dfc2060" } max: { uuid:
"fffff375-ed72-4482-a20e-dd4edbf7ef97" } on: { uuid:
"bdaef5f0-8766-41b5-992c-05155942a798" }(splitThreshold 52428800)
Wed Jan 5 10:29:22 [conn3] config change: { _id:
"ip-10-235-79-102-2011-01-05T10:29:22-4", server: "ip-10-235-79-102",
time: new Date(1294223362667), what: "split", ns: "ud.visits",
details: { before: { min: { uuid: "7cbe5f73-
b804-4d76-834b-26ea8dfc2060" }, max: { uuid: "fffff375-ed72-4482-a20e-
dd4edbf7ef97" } }, left: { min: { uuid: "7cbe5f73-
b804-4d76-834b-26ea8dfc2060" }, max: { uuid:
"bdaef5f0-8766-41b5-992c-05155942a798" } }, right: { min: { uuid:
"bdaef5f0-8766-41b5-992c-05155942a798" }, max: { uuid: "fffff375-
ed72-4482-a20e-dd4edbf7ef97" } } } }
Wed Jan 5 10:29:22 [conn5] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid: "a004fb80-9a35-43e5-
afbb-44d4af5c9982" }
Wed Jan 5 10:29:22 [conn11] update failed b/c of
StaleConfigException, retrying left:4 ns: ud.visits query: { uuid:
"7401f7a0-7a7c-448d-b5ce-e556ddd087a2" }
Wed Jan 5 10:29:22 [conn8] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid: "6df95be6-df9e-4aa8-b621-
c7aaca98fd59" }
Wed Jan 5 10:29:22 [WriteBackListener] update failed b/c of
StaleConfigException, retrying left:4 ns: ud.visits query: { uuid:
"cd9298db-fe7b-4001-a3b6-c208752ca448" }
Wed Jan 5 10:29:22 [conn12] update failed b/c of
StaleConfigException, retrying left:4 ns: ud.visits query: { uuid:
"d6357772-3872-4872-9c7e-bba466a3a770" }
Wed Jan 5 10:29:22 [conn10] update failed b/c of
StaleConfigException, retrying left:4 ns: ud.visits query: { uuid:
"e4145573-9f70-4be2-b048-04f41569be5a" }
Wed Jan 5 10:29:22 [conn4] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid:
"3e6137f3-208c-4812-8dd6-65114ecfedff" }
Wed Jan 5 10:29:22 [conn6] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid:
"3bf80a2e-2535-4c3f-908d-6b2138c6111f" }
Wed Jan 5 10:29:22 [conn7] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid: "c1748bde-efc4-4b1f-8149-
e36a2a550c46" }
Wed Jan 5 10:29:22 [conn9] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid: "de0af13f-e141-465d-93cd-
c923321813a5" }
Wed Jan 5 10:29:23 [conn4] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid:
"fffffac7-276a-4d6c-82f4-372ea4aa3ae8" }
Wed Jan 5 10:29:28 [conn13] creating WriteBackListener for:
10.235.73.46:27019,
10.235.46.84:27019,
10.234.167.245:27019
Wed Jan 5 10:29:31 [conn5] update failed b/c of StaleConfigException,
retrying left:4 ns: ud.visits query: { uuid: "000018a8-fa63-48c5-
ad46-7fffec6a33fc" }
Wed Jan 5 10:29:32 [conn10] update failed b/c of
StaleConfigException, retrying left:4 ns: ud.visits query: { uuid:
"000018a8-fa63-48c5-ad46-7fffec6a33fc" }
Wed Jan 5 10:30:27 connection accepted from
127.0.0.1:59287 #14
Wed Jan 5 10:30:53 [conn12] autosplitting ud.visits size: 54813920
shard: ns:ud.visits at: shard0001:
10.228.94.255:27017 lastmod: 3|6
min: { uuid: "7cbe5f73-b804-4d76-834b-26ea8dfc2060" } max: { uuid:
"bdaef5f0-8766-41b5-992c-05155942a798" } on: { uuid:
"9cd3aaa4-5ee4-4a41-b0df-d06f3333269c" }(splitThreshold 52428800)
...
Sharding status:
> db.printShardingStatus();
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "
10.235.57.133:27017" }
{ "_id" : "shard0001", "host" : "
10.228.94.255:27017" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" :
"config" }
{ "_id" : "ud", "partitioned" : true, "primary" :
"shard0001" }
ud.visits chunks:
{ "uuid" : { $minKey : 1 } } -->> { "uuid" :
"00003c42-dee2-47d8-8d5e-423a7ad949b7" } on : shard0000 { "t" : 2000,
"i" : 0 }
{ "uuid" : "00003c42-
dee2-47d8-8d5e-423a7ad949b7" } -->> { "uuid" : "1d6b4869-8bc7-4f6f-
b308-fe48766e9ace" } on : shard0001 { "t" : 3000, "i" : 14 }
{ "uuid" : "1d6b4869-8bc7-4f6f-b308-
fe48766e9ace" } -->> { "uuid" : "3c227fab-
f792-4800-999b-5de0c3630b7f" } on : shard0001 { "t" : 3000, "i" : 15 }
{ "uuid" : "3c227fab-
f792-4800-999b-5de0c3630b7f" } -->> { "uuid" :
"5c5d6d3b-7bb1-450d-86e0-604a06df5a0a" } on : shard0001 { "t" : 3000,
"i" : 10 }
{ "uuid" :
"5c5d6d3b-7bb1-450d-86e0-604a06df5a0a" } -->> { "uuid" : "7cbe5f73-
b804-4d76-834b-26ea8dfc2060" } on : shard0001 { "t" : 3000, "i" : 11 }
{ "uuid" : "7cbe5f73-
b804-4d76-834b-26ea8dfc2060" } -->> { "uuid" : "9cd3aaa4-5ee4-4a41-
b0df-d06f3333269c" } on : shard0001 { "t" : 3000, "i" : 8 }
{ "uuid" : "9cd3aaa4-5ee4-4a41-b0df-
d06f3333269c" } -->> { "uuid" :
"bdaef5f0-8766-41b5-992c-05155942a798" } on : shard0001 { "t" : 3000,
"i" : 9 }
{ "uuid" :
"bdaef5f0-8766-41b5-992c-05155942a798" } -->> { "uuid" :
"de557951-997a-4f30-a970-3581697bc4bf" } on : shard0001 { "t" : 3000,
"i" : 12 }
{ "uuid" : "de557951-997a-4f30-
a970-3581697bc4bf" } -->> { "uuid" : "fffff375-ed72-4482-a20e-
dd4edbf7ef97" } on : shard0001 { "t" : 3000, "i" : 13 }
{ "uuid" : "fffff375-ed72-4482-a20e-
dd4edbf7ef97" } -->> { "uuid" : { $maxKey : 1 } } on : shard0000
{ "t" : 3000, "i" : 0 }
//Babak