We had a look at currentOp() and nothing obvious stood out (see the
extract below). We ran into issues during mongodump, so we tried a
repair, which produced something more meaningful:
[...]
Tue Jan 17 00:09:46 [initandlisten] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: Main.cachedItems top: { opid: 221, active: true, waitingForLock: false, secs_running: 0, op: "getmore", ns: "Main.cachedItems", query: {}, client: "0.0.0.0:0", desc: "initandlisten", numYields: 0 }
Tue Jan 17 00:09:46 [initandlisten] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: Main.cachedItems top: { opid: 221, active: true, waitingForLock: false, secs_running: 0, op: "getmore", ns: "Main.cachedItems", query: {}, client: "0.0.0.0:0", desc: "initandlisten", numYields: 0 }
Tue Jan 17 00:09:46 [initandlisten] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: Main.cachedItems top: { opid: 221, active: true, waitingForLock: false, secs_running: 0, op: "getmore", ns: "Main.cachedItems", query: {}, client: "0.0.0.0:0", desc: "initandlisten", numYields: 0 }
Tue Jan 17 00:09:46 [initandlisten] build index Main.cachedItems { _id: 1 }
        1000000/1051544 95%
Tue Jan 17 00:10:17 [initandlisten] external sort used : 2 files in 31 secs
Tue Jan 17 00:10:27 [initandlisten] done building bottom layer, going to commit
Tue Jan 17 00:10:27 [initandlisten] build index done 1051544 records 41.668 secs
Tue Jan 17 00:10:27 [initandlisten] build index Main.cachedItems { k: 1.0 }
Tue Jan 17 00:10:36 [initandlisten] external sort used : 2 files in 8 secs
Tue Jan 17 00:10:38 [initandlisten] build index done 1051544 records 10.78 secs
Tue Jan 17 00:10:38 [initandlisten] build index Main.cachedItems { e: 1.0 }
Tue Jan 17 00:10:41 [initandlisten] build index done 1051544 records 3.031 secs
Tue Jan 17 00:10:45 [initandlisten] removeJournalFiles
Tue Jan 17 00:10:46 [initandlisten] finished checking dbs
Tue Jan 17 00:10:46 dbexit:
I don't know much about recursive locks, but this smells like something
that could linger and be reissued, which would explain why the number of
writers never drops below a certain level. Do you know how we can get
more information? We have the mongod logs, mongostat dumps, and
application logs as well.
Extract of currentOp():
{
  "inprog" : [
    {
      "opid" : 11198852,
      "active" : true,
      "lockType" : "write",
      "waitingForLock" : false,
      "secs_running" : 0,
      "op" : "update",
      "ns" : "Main.cachedItems",
      "query" : { "k" : "tile0_3_3-2431917" },
      "client" : "79.125.41.97:54646",
      "desc" : "conn",
      "connectionId" : 5602,
      "numYields" : 1
    },
    {
      "opid" : 11198392,
      "active" : false,
      "lockType" : "write",
      "waitingForLock" : true,
      "op" : "remove",
      "ns" : "",
      "query" : { "k" : "lck_visits29570744" },
      "client" : "79.125.41.97:54643",
      "desc" : "conn",
      "connectionId" : 5600,
      "numYields" : 0
    },
    {
      "opid" : 11198484,
      "active" : false,
      "lockType" : "write",
      "waitingForLock" : true,
      "op" : "remove",
      "ns" : "",
      "query" : { "k" : "lck_visits16822501" },
      "client" : "79.125.83.28:58052",
      "desc" : "conn",
      "connectionId" : 5597,
      "numYields" : 0
    },
    {
      "opid" : 11198507,
      "active" : false,
      "lockType" : "write",
      "waitingForLock" : true,
      "op" : "update",
      "ns" : "",
      "query" : { "k" : "data16377358" },
      "client" : "79.125.41.97:54716",
      "desc" : "conn",
      "connectionId" : 5623,
      "numYields" : 0
    },
    {
      "opid" : 11198825,
      "active" : false,
      "lockType" : "write",
      "waitingForLock" : true,
      "op" : "update",
      "ns" : "",
      "query" : { "k" : "tile0_4_4-12012251" },
      "client" : "79.125.41.97:54599",
      "desc" : "conn",
      "connectionId" : 5578,
      "numYields" : 0
    },
    {
      "opid" : 11198801,
      "active" : false,
      "lockType" : "write",
      "waitingForLock" : true,
      "op" : "update",
      "ns" : "",
      "query" : { "k" : "data33690616" },
      "client" : "176.34.221.185:61670",
      "desc" : "conn",
      "connectionId" : 5594,
      "numYields" : 0
    },
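In case it helps triage, here is a rough sketch of how we pull out the writers stuck waiting for the lock from a currentOp() snapshot. The hard-coded array below just mimics a few of the entries above so the snippet runs standalone; against a live server you would use db.currentOp().inprog directly in the mongo shell:

```javascript
// Minimal sketch: triage a currentOp() snapshot for blocked writers.
// The sample array mimics entries from the extract above; on a live
// mongod you would replace it with db.currentOp().inprog.
const inprog = [
  { opid: 11198852, active: true,  lockType: "write", waitingForLock: false,
    op: "update", ns: "Main.cachedItems" },
  { opid: 11198392, active: false, lockType: "write", waitingForLock: true,
    op: "remove", ns: "" },
  { opid: 11198507, active: false, lockType: "write", waitingForLock: true,
    op: "update", ns: "" },
];

// Keep only the operations queued behind the lock, then count per op type.
const waiters = inprog.filter(o => o.waitingForLock);
const byOp = {};
for (const o of waiters) {
  byOp[o.op] = (byOp[o.op] || 0) + 1;
}

console.log("waiting for lock:", waiters.length, byOp);
```

Running that periodically and logging the counts is how we've been trying to see whether the queue of waiters grows over time.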
Pierre