initial oplog application is *ultra* slow (slower than the write rate...)


Philippe DAVID

Feb 2, 2012, 11:19:57 AM
to mongod...@googlegroups.com
Hi,
I've just added a new member to a replica set. The new member is stuck
at the "initial oplog application" step:

Thu Feb 2 15:39:59 [rsSync] replSet initial oplog application from
xxxx:27017 starting at Feb 2 13:42:56:2 to Feb 2 15:39:31:1

RECOVERING> db.getReplicationInfo()
{
"logSizeMB" : 10000,
"usedMB" : 0.38,
"timeDiff" : 26,
"timeDiffHours" : 0.01,
"tFirst" : "Thu Feb 02 2012 13:42:56 GMT+0100 (CET)",
"tLast" : "Thu Feb 02 2012 13:43:22 GMT+0100 (CET)",
"now" : "Thu Feb 02 2012 17:09:06 GMT+0100 (CET)"
}

RECOVERING> db.currentOp()
{
"inprog" : [
{
"opid" : 17701,
"active" : true,
"lockType" : "write",
"waitingForLock" : false,
"secs_running" : 202,
"op" : "none",
"ns" : "thumbnails.thumb",
"query" : {

},
"client" : "",
"desc" : "rsSync",
"threadId" : "0x7fedb3ecc700",
"numYields" : 346
}
]
}


In 1.5 hours, it has only replayed 30 seconds of oplog...
When I look at the mongod process in atop, it says that mongod is
spending all its time reading from disk...

My setup: mongod 2.0.2 with replica set + keyfile.
It may be relevant to mention that this member was added to the
replica set earlier, but since I had trouble during the indexing step
(see my previous mail), I stopped the process, removed the host from
the replica set using rs.remove, trashed the data dir, started it
again, and re-added the host with hidden: true.

--
Philippe
Allgoob SA

Philippe DAVID

Feb 2, 2012, 11:44:28 AM
to mongod...@googlegroups.com
It may be relevant to note that the collection I see in db.currentOp()
is always thumbnails.thumb, which is capped.

--
Philippe
Allgoob SA

Scott Hernandez

Feb 2, 2012, 2:01:46 PM
to mongod...@googlegroups.com
What does mongostat show during this time on the new replica?
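For reference, mongostat can be pointed at the recovering member like this (the host name is a placeholder; with a keyfile setup, auth credentials will likely be needed too):

```shell
# Sample stats every second from the recovering member (placeholder host).
# Add -u/-p if authentication is enabled on the set.
mongostat --host newmember.example.com:27017 1
```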

> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>

Philippe DAVID

Feb 3, 2012, 3:12:01 AM
to mongod...@googlegroups.com
I have a small mongostat snapshot, in case it helps:
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time
*0 *0 *0 *0 0 8|0 0 84.4g 170g 4.85g 431 105 0 0|0 0|1 809b 2k 6 setag01 UNK 15:56:48
*0 *0 *0 *0 0 1|0 0 84.4g 170g 4.86g 545 61.8 0 0|0 0|1 62b 1k 6 setag01 UNK 15:56:49
*0 *0 *0 *0 0 4|0 0 84.4g 170g 4.87g 754 126 0 0|0 0|1 467b 2k 6 setag01 UNK 15:56:50

It finally managed to sync, but it took 5.5 hours after the indexing finished:
13:42:39 [rsStart] trying to contact A
13:42:39 [rsStart] trying to contact B
13:42:39 [rsStart] trying to contact C
13:42:39 [rsStart] replSet got config version 20 from a
remote, saving locally
13:42:39 [rsStart] replSet info saving a newer config
version to local.system.replset
13:42:40 [rsStart] replSet saveConfigLocally done
13:42:40 [rsStart] replSet STARTUP2
13:42:40 [rsMgr] replSet total number of votes is even -
add arbiter or give one member an extra vote
13:42:40 [rsSync] replSet initial sync pending
13:42:40 [rsSync] replSet initial sync need a member to be
primary or secondary to do our initial sync
13:42:41 [rsHealthPoll] replSet member A is up
13:42:41 [rsHealthPoll] replSet member A is now in state SECONDARY
13:42:41 [rsHealthPoll] replSet member B is up
13:42:41 [rsHealthPoll] replSet member B is now in state PRIMARY
13:42:41 [rsHealthPoll] replSet member C is up
13:42:41 [rsHealthPoll] replSet member C is now in state SECONDARY
13:42:56 [rsSync] replSet initial sync pending
13:42:56 [rsSync] replSet syncing to: B
13:42:56 [rsSync] replSet initial sync drop all databases
13:42:56 [rsSync] replSet initial sync clone all databases
13:42:56 [rsSync] replSet initial sync cloning db: admin
13:42:57 [rsSync] replSet initial sync cloning db: wkes_prod
13:44:03 [rsSync] replSet initial sync cloning db: thumbnails
13:58:25 [rsSync] replSet initial sync cloning db: wkuk_prod
13:58:26 [rsSync] replSet initial sync cloning db: wkfr_prod
15:39:30 [rsSync] replSet initial sync cloning db: test
15:39:31 [rsSync] replSet initial sync query minValid
15:39:59 [rsSync] replSet initial oplog application from B starting at Feb 2 13:42:56:2 to Feb 2 15:39:31:1

17:07:08 [rsHealthPoll] replSet info C is down (or slow to
respond): DBClientBase::findN: transport error: C query: {
replSetHeartbeat: "setag01", v: 20, pv: 1, checkEmpty: false, from:
"ME
17:07:08 [rsHealthPoll] replSet member C is now in state DOWN
17:07:30 [rsHealthPoll] replSet info B is down (or slow to
respond): DBClientBase::findN: transport error: B query: {
replSetHeartbeat: "setag01", v: 20, pv: 1, checkEmpty: false, from:
"ME
17:07:30 [rsHealthPoll] replSet member B is now in state DOWN
17:07:32 [rsHealthPoll] replSet info A is down (or slow to
respond): DBClientBase::findN: transport error: A query: {
replSetHeartbeat: "setag01", v: 20, pv: 1, checkEmpty: false, from:
"ME
17:07:32 [rsHealthPoll] replSet member A is now in state DOWN
17:07:34 [rsHealthPoll] replSet member A is up
17:07:34 [rsHealthPoll] replSet member A is now in state SECONDARY
17:07:35 [rsHealthPoll] replSet member B is up
17:07:35 [rsHealthPoll] replSet member B is now in state PRIMARY
17:07:36 [rsHealthPoll] replSet member C is up
17:07:36 [rsHealthPoll] replSet member C is now in state SECONDARY
17:30:21 [rsSync] replSet initial sync finishing up
17:30:22 [rsSync] replSet set minValid=4f2aa023:1
17:30:22 [rsSync] replSet initial sync done
17:30:23 [rsSync] replSet syncing to: B
19:43:35 [rsSync] replSet still syncing, not yet to
minValid optime4f2aa023:1
19:43:36 ..
21:03:50 ..
21:04:58 .
21:05:03 [rsSync] replSet SECONDARY

In the log I found that it started to create an index for the capped
collection at 21:00:08, which did not take much time since this
collection is 10GB. So I wonder what happened between 15:39:59 and
21:00:08.
I can send you the full log if you like (there are plenty of "connection
accepted", "authenticate" and "end connection" lines that I could strip out, though...).

--
Philippe
Allgoob SA

Igor Serebryany

Mar 10, 2012, 6:46:25 AM
to mongodb-user
I'm experiencing this same issue right now. Started replaying the log
40 minutes ago:
Sat Mar 10 05:04:42 [rsSync] replSet initial oplog application from
mongo3.singlehop.net:27017 starting at Mar 10 02:40:13:57 to Mar 10
05:04:20:3a

and it hasn't progressed at all:

root@mongo4:/var/lib# mongo
MongoDB shell version: 2.0.3
connecting to: test
RECOVERING> db.getReplicationInfo()
{
"logSizeMB" : 33258.15,
"usedMB" : 0.02,
"timeDiff" : 1,
"timeDiffHours" : 0,
"tFirst" : "Sat Mar 10 2012 02:40:13 GMT-0600 (CST)",
"tLast" : "Sat Mar 10 2012 02:40:14 GMT-0600 (CST)",
"now" : "Sat Mar 10 2012 05:33:36 GMT-0600 (CST)"
}

like the OP, mongostat shows some 3-7 local commands per second, a
single writer client in the queue, and a lot of page faults and time
spent write-locked.
*0 *0 *0 *0 0 3|0 0 113g 227g 11.3g 435 125 0 0|0 0|1 186b 2k 37 singlehop UNK 05:45:15
*0 *0 *0 *0 0 5|0 0 113g 227g 11.3g 474 63.3 0 0|0 0|1 468b 2k 37 singlehop UNK 05:45:16
*0 *0 *0 *0 0 7|0 0 113g 227g 11.3g 399 125 0 0|0 0|1 434b 3k 37 singlehop UNK 05:45:17
*0 *0 *0 *0 0 6|0 0 113g 227g 11.3g 367 82.1 0 0|0 0|1 530b 2k 37 singlehop UNK 05:45:18

like the OP, my db.currentOp() shows that mongo is working on a capped
collection.

RECOVERING> db.currentOp()
{
"inprog" : [
{
"opid" : 36824,
"active" : true,
"lockType" : "write",
"waitingForLock" : false,
"secs_running" : 704,
"op" : "none",
"ns" : "manage.api_requests",
"query" : {

},
"client" : "",
"desc" : "rsSync",
"threadId" : "0x7f2d2fffc700",
"numYields" : 1178
}
]
}


On Feb 2, 8:19 am, Philippe DAVID <philippe.da...@allgoob.com> wrote:
> Hi,
> I've just added a new member to a replica set. The new member is stuck
> at the "initial oplog application" step:

Igor Serebryany

Mar 11, 2012, 6:21:16 AM
to mongod...@googlegroups.com

here we are a day later, and nothing has changed. replication has played back less than a minute of oplog:

RECOVERING> db.printReplicationInfo()
configured oplog size:   33258.15MB
log length start to end: 52secs (0.01hrs)
oplog first event time:  Sat Mar 10 2012 02:40:13 GMT-0600 (CST)
oplog last event time:   Sat Mar 10 2012 02:41:05 GMT-0600 (CST)
now:                     Sun Mar 11 2012 05:17:08 GMT-0500 (CDT)

still lots of page faults, and the db is spending close to 100% of the time write-locked. i'm worried that i'm going to run out of oplog and replication will halt. i tried restarting the new, recovering slave, but it did not help; nothing changed.

i checked the capped collection, which seems to be the one being worked on based on currentOp(), and it's missing its _id index, which exists on the primary. maybe that's part of the problem? the OP seemed to indicate that his index on the capped collection was eventually rebuilt, but he had a small collection. in my case the collection is 40GB...
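For reference, the index comparison described here can be done with getIndexes() in the mongo shell, using the namespace from the currentOp() output above:

```javascript
// Run on both the primary and the recovering member, then compare.
// A missing "_id_" entry on the secondary would confirm the suspicion.
db.getSiblingDB("manage").api_requests.getIndexes()
```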

i've tried the operation of adding a secondary to a cluster maybe five times now, and it has yet to go smoothly. what am i doing wrong here? should i start over?

--igor

Eliot Horowitz

Mar 12, 2012, 1:22:18 AM
to mongod...@googlegroups.com
What's in the log currently?
Can you increase verbosity a lot? (can you use setParameter)
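For reference, the verbosity can be raised at runtime with setParameter, assuming a 2.0-era server that supports the logLevel parameter:

```javascript
// Against the admin database: raise the log level (0-5)...
db.adminCommand({ setParameter: 1, logLevel: 5 })
// ...and drop it back to the default once enough has been captured.
db.adminCommand({ setParameter: 1, logLevel: 0 })
```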


Igor Serebryany

Mar 12, 2012, 3:32:27 AM
to mongod...@googlegroups.com


On Sunday, March 11, 2012 9:22:18 PM UTC-8, Eliot Horowitz wrote:
What's in the log currently?
Can you increase verbosity a lot?  (can you use setParameter)

great suggestion! i tried that, and i didn't find a lot going on, strangely. there are a lot of these:
Mon Mar 12 02:13:51 [conn6413] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 reslen:295 0ms
Mon Mar 12 02:13:51 [conn6714] run command admin.$cmd { ismaster: 1 }
Mon Mar 12 02:13:51 [conn6714] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 reslen:295 0ms
Mon Mar 12 02:13:52 [conn6564] run command admin.$cmd { ismaster: 1 }
Mon Mar 12 02:13:52 [conn6564] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 reslen:295 0ms
Mon Mar 12 02:13:52 [conn6504] run command admin.$cmd { ismaster: 1 }

there are a lot of these:
Mon Mar 12 02:13:52 [conn6918] run command admin.$cmd { replSetHeartbeat: "xxx", v: 21, pv: 1, checkEmpty: false, from: "mongo3.xxx.net:27017" }
Mon Mar 12 02:13:52 [conn6918] command admin.$cmd command: { replSetHeartbeat: "xxx", v: 21, pv: 1, checkEmpty: false, from: "mongo3.xxx.net:27017" } ntoreturn:1 reslen:125 0ms
Mon Mar 12 02:13:52 [conn6920] run command admin.$cmd { replSetHeartbeat: "xxx", v: 21, pv: 1, checkEmpty: false, from: "mongo2.xxx.net:27017" }
Mon Mar 12 02:13:52 [conn6920] command admin.$cmd command: { replSetHeartbeat: "xxx", v: 21, pv: 1, checkEmpty: false, from: "mongo2.xxx.net:27017" } ntoreturn:1 reslen:125 0ms

and there are occasional ones like this:
Mon Mar 12 02:13:41 [PeriodicTask::Runner] task: WriteBackManager::cleaner took: 0ms
Mon Mar 12 02:13:41 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms
Mon Mar 12 02:13:41 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 0ms

and these:
Mon Mar 12 02:13:39 [DataFileSync] flushing mmap took 85ms  for 50 files

and these
Mon Mar 12 02:13:31 BackgroundJob starting: ConnectBG

and these:
Mon Mar 12 02:12:56 [conn6913] Socket recv() conn closed? xxxip:59944
Mon Mar 12 02:12:56 [conn6913] SocketException: remote: xxxip:59944 error: 9001 socket exception [0] server [xxxip:59944]
Mon Mar 12 02:12:56 [conn6913] end connection xxxip:59944

i don't see anything in the log that's actually taking any time, except occasionally i see something crazy like this:
Sun Mar 11 21:05:37 [conn5303] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 reslen:295 1271310319ms

the server is pretty busy, too; servicing all those page faults, i guess. progress remains elusive:


RECOVERING> db.printReplicationInfo()
configured oplog size:   33258.15MB
log length start to end: 103secs (0.03hrs)

oplog first event time:  Sat Mar 10 2012 02:40:13 GMT-0600 (CST)
oplog last event time:   Sat Mar 10 2012 02:41:56 GMT-0600 (CST)
now:                     Mon Mar 12 2012 02:30:48 GMT-0500 (CDT)

--igor

Eliot Horowitz

Mar 12, 2012, 8:19:27 AM
to mongod...@googlegroups.com
it looks like it can't pull data at all.

are there other secondaries?

can you send rs.status()


Igor Serebryany

Mar 12, 2012, 9:04:49 AM
to mongod...@googlegroups.com
yeah, there's a primary and a secondary which are both working fine. also... it's pulling SOME data. replication has advanced a few minutes. i'm really curious why the recovering node is so busy; where are all those page faults coming from?

PRIMARY> rs.status()
{
    "set" : "xxx",
    "date" : ISODate("2012-03-12T12:59:35Z"),
    "myState" : 1,
    "syncingTo" : "mongo3.xxx.net:27017",
    "members" : [
        {
            "_id" : 4,
            "name" : "mongo2.xxx.net:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "optime" : {
                "t" : 1331557175000,
                "i" : 5
            },
            "optimeDate" : ISODate("2012-03-12T12:59:35Z"),
            "self" : true
        },
        {
            "_id" : 5,
            "name" : "mongo3.xxx.net:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 215321,
            "optime" : {
                "t" : 1331557175000,
                "i" : 5
            },
            "optimeDate" : ISODate("2012-03-12T12:59:35Z"),
            "lastHeartbeat" : ISODate("2012-03-12T12:59:35Z"),
            "pingMs" : 0
        },
        {
            "_id" : 6,
            "name" : "mongo4.xxx.net:27017",
            "health" : 1,
            "state" : 3,
            "stateStr" : "RECOVERING",
            "uptime" : 97913,
            "optime" : {
                "t" : 1331368929000,
                "i" : 9
            },
            "optimeDate" : ISODate("2012-03-10T08:42:09Z"),
            "lastHeartbeat" : ISODate("2012-03-12T12:59:33Z"),
            "pingMs" : 10
        }
    ],
    "ok" : 1

}

RECOVERING> db.printReplicationInfo()
configured oplog size:   33258.15MB
log length start to end: 107secs (0.03hrs)

oplog first event time:  Sat Mar 10 2012 02:40:13 GMT-0600 (CST)
oplog last event time:   Sat Mar 10 2012 02:42:00 GMT-0600 (CST)
now:                     Mon Mar 12 2012 04:33:44 GMT-0500 (CDT)

RECOVERING> db.printReplicationInfo()
configured oplog size:   33258.15MB
log length start to end: 116secs (0.03hrs)

oplog first event time:  Sat Mar 10 2012 02:40:13 GMT-0600 (CST)
oplog last event time:   Sat Mar 10 2012 02:42:09 GMT-0600 (CST)
now:                     Mon Mar 12 2012 08:03:24 GMT-0500 (CDT)

Igor Serebryany

Mar 12, 2012, 8:14:12 PM
to mongod...@googlegroups.com

so, after another day of the box not getting anywhere, i decided to take matters into my own hands. i was thinking that the replication slowdown came from the fact that my capped collection didn't have an index on _id on the new secondary -- this index existed on both of the other machines in the cluster. so, i stopped the new secondary and took it out of the replica set by removing its replSet option. then, i ran an ensureIndex on the capped collection.

after starting things back up again, looks like it's flyin'. so the bug appears to be that this index wasn't built when the initial replication was done. is this a general problem with capped collections? the OP reported that his index was built EVENTUALLY...
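For reference, the workaround described above amounts to something like this in the mongo shell, using the namespace from the earlier currentOp() output (2.0-era ensureIndex syntax):

```javascript
// With the member restarted WITHOUT its replSet option,
// hand-build the missing _id index on the capped collection.
db.getSiblingDB("manage").api_requests.ensureIndex({ _id: 1 })
```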


RECOVERING> db.printReplicationInfo()
configured oplog size:   33258.15MB
log length start to end: 22452secs (6.24hrs)

oplog first event time:  Sat Mar 10 2012 02:40:13 GMT-0600 (CST)
oplog last event time:   Sat Mar 10 2012 08:54:25 GMT-0600 (CST)
now:                     Mon Mar 12 2012 19:09:51 GMT-0500 (CDT)

--igor