Thu Feb 2 15:39:59 [rsSync] replSet initial oplog application from
xxxx:27017 starting at Feb 2 13:42:56:2 to Feb 2 15:39:31:1
RECOVERING> db.getReplicationInfo()
{
"logSizeMB" : 10000,
"usedMB" : 0.38,
"timeDiff" : 26,
"timeDiffHours" : 0.01,
"tFirst" : "Thu Feb 02 2012 13:42:56 GMT+0100 (CET)",
"tLast" : "Thu Feb 02 2012 13:43:22 GMT+0100 (CET)",
"now" : "Thu Feb 02 2012 17:09:06 GMT+0100 (CET)"
}
RECOVERING> db.currentOp()
{
"inprog" : [
{
"opid" : 17701,
"active" : true,
"lockType" : "write",
"waitingForLock" : false,
"secs_running" : 202,
"op" : "none",
"ns" : "thumbnails.thumb",
"query" : {
},
"client" : "",
"desc" : "rsSync",
"threadId" : "0x7fedb3ecc700",
"numYields" : 346
}
]
}
In 1.5 hours, it only replayed 30s of log....
When I look at the mongod process in atop, it says that mongod is
spending its time reading on disk....
My setup: mongod 2.0.2 with replica set + keyfile.
It may be relevant to mention that this member was added in the
replica set earlier but since I had trouble during the indexing step
(see my previous mail), I stopped the process, remove the host from
the replica set using rs.remove, trashed the data dir, and started
again, and added the host with hidden: true.
--
Philippe
Allgoob SA
--
Philippe
Allgoob SA
> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>
It finally succeeded to sync, but it took 5.5 hours after the indexing finished:
13:42:39 [rsStart] trying to contact A
13:42:39 [rsStart] trying to contact B
13:42:39 [rsStart] trying to contact C
13:42:39 [rsStart] replSet got config version 20 from a
remote, saving locally
13:42:39 [rsStart] replSet info saving a newer config
version to local.system.replset
13:42:40 [rsStart] replSet saveConfigLocally done
13:42:40 [rsStart] replSet STARTUP2
13:42:40 [rsMgr] replSet total number of votes is even -
add arbiter or give one member an extra vote
13:42:40 [rsSync] replSet initial sync pending
13:42:40 [rsSync] replSet initial sync need a member to be
primary or secondary to do our initial sync
13:42:41 [rsHealthPoll] replSet member A is up
13:42:41 [rsHealthPoll] replSet member A is now in state SECONDARY
13:42:41 [rsHealthPoll] replSet member B is up
13:42:41 [rsHealthPoll] replSet member B is now in state PRIMARY
13:42:41 [rsHealthPoll] replSet member C is up
13:42:41 [rsHealthPoll] replSet member C is now in state SECONDARY
13:42:56 [rsSync] replSet initial sync pending
13:42:56 [rsSync] replSet syncing to: B
13:42:56 [rsSync] replSet initial sync drop all databases
13:42:56 [rsSync] replSet initial sync clone all databases
13:42:56 [rsSync] replSet initial sync cloning db: admin
13:42:57 [rsSync] replSet initial sync cloning db: wkes_prod
13:44:03 [rsSync] replSet initial sync cloning db: thumbnails
13:58:25 [rsSync] replSet initial sync cloning db: wkuk_prod
13:58:26 [rsSync] replSet initial sync cloning db: wkfr_prod
15:39:30 [rsSync] replSet initial sync cloning db: test
15:39:31 [rsSync] replSet initial sync query minValid
15:39:59 [rsSync] replSet initial oplog application from B
starting at Feb 2 13:42:56:2 to Feb 2 15:39:31:1
17:07:08 [rsHealthPoll] replSet info C is down (or slow to
respond): DBClientBase::findN: transport error: C query: {
replSetHeartbeat: "setag01", v: 20, pv: 1, checkEmpty: false, from:
"ME
17:07:08 [rsHealthPoll] replSet member C is now in state DOWN
17:07:30 [rsHealthPoll] replSet info B is down (or slow to
respond): DBClientBase::findN: transport error: B query: {
replSetHeartbeat: "setag01", v: 20, pv: 1, checkEmpty: false, from:
"ME
17:07:30 [rsHealthPoll] replSet member B is now in state DOWN
17:07:32 [rsHealthPoll] replSet info A is down (or slow to
respond): DBClientBase::findN: transport error: A query: {
replSetHeartbeat: "setag01", v: 20, pv: 1, checkEmpty: false, from:
"ME
17:07:32 [rsHealthPoll] replSet member A is now in state DOWN
17:07:34 [rsHealthPoll] replSet member A is up
17:07:34 [rsHealthPoll] replSet member A is now in state SECONDARY
17:07:35 [rsHealthPoll] replSet member B is up
17:07:35 [rsHealthPoll] replSet member B is now in state PRIMARY
17:07:36 [rsHealthPoll] replSet member C is up
17:07:36 [rsHealthPoll] replSet member C is now in state SECONDARY
17:30:21 [rsSync] replSet initial sync finishing up
17:30:22 [rsSync] replSet set minValid=4f2aa023:1
17:30:22 [rsSync] replSet initial sync done
17:30:23 [rsSync] replSet syncing to: B
19:43:35 [rsSync] replSet still syncing, not yet to
minValid optime4f2aa023:1
19:43:36 ..
21:03:50 ..
21:04:58 .
21:05:03 [rsSync] replSet SECONDARY
In the log I found that it started to create an index for the capped
collection at 21:00:08, which did not take much time since this
collection is 10GB. So I wonder what happened between 15:39:59 and
21:00:08.
I can send you a full log if you like. (there are plenty of connection
accepted, authenticate, end connection that I could remove though...)
--
Philippe
Allgoob SA
> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/zVcR-mOu5ZYJ.
What's in the log currently?
Can you increase verbosity a lot? (can you use setParameter)
are there other secondaries?
can you send rs.status()
> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/VDPqmx5-DekJ.