Server never able to finish the initial sync to join a replica set with a rather large dataset


Matthieu Rigal

Jul 9, 2015, 1:41:41 PM
to mongod...@googlegroups.com
Hi MongoDB Team,

I would like to report some strange behaviour that I was able to reproduce consistently on MongoDB 3.0.4 with MMAPv1. One server never manages to join a replica set for which it has to initial-sync almost 1 TB of data. The oplog size is 50 GB and covers more or less 40 hours of writes. Here is what happens consistently, what I believe I understand, and what in my eyes is wrong:
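
(For reference, the ~40-hour window can be measured on the sync source with a small pymongo sketch like the one below; the host name is a placeholder.)

from pymongo import MongoClient

# Connect to the sync source (placeholder host) and read the local oplog.
client = MongoClient("mongodb://HOST2:27017")
oplog = client.local["oplog.rs"]

# Oldest and newest entries, in natural (insertion) order.
first = oplog.find().sort("$natural", 1).limit(1).next()
last = oplog.find().sort("$natural", -1).limit(1).next()

# 'ts' is a BSON Timestamp; .time is seconds since the epoch.
window_hours = (last["ts"].time - first["ts"].time) / 3600.0
print("oplog window: %.1f hours" % window_hours)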

i.e. H6 = start time + 6 hours

H0 : initial sync starts
H6 : data sync finished, starts building indexes
 | In the meantime, many socket errors...
H31 : build index done. oplog sync 1 of 3
 | In the meanwhile many "replication missing obj"
H36 : oplog sync 2 of 3
H38 : initial sync building indexes (WTF?! Without any other message, oplog sync 3 of 3 never comes... but maybe at that point the node has already fallen off the end of the oplog)
 | This second index build (the same indexes had already been built successfully) never finishes; it is much slower than the previous one, the disk is swapping heavily, and mongod keeps piling up connections (several thousand in the logfile after one day) as well as queued reads (visible in serverStatus)
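
(While this second index build hangs, the connection pile-up and queued reads can be watched with a small serverStatus polling sketch like this one; the host name is a placeholder and the fields are those reported by MongoDB 3.0.)

import time
from pymongo import MongoClient

# Poll the syncing member (placeholder host) once a minute.
client = MongoClient("mongodb://HOSTSEC:27017")

while True:
    status = client.admin.command("serverStatus")
    conns = status["connections"]["current"]
    queued_r = status["globalLock"]["currentQueue"]["readers"]
    queued_w = status["globalLock"]["currentQueue"]["writers"]
    print("connections=%d queued_readers=%d queued_writers=%d"
          % (conns, queued_r, queued_w))
    time.sleep(60)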

Then I also tried restarting the mongod server in the middle; this is also quite funny:
H0 : restart. Build index not finished on one collection
H0 : restart build index
H13: finished build index on collection
H13: transition to STARTUP2
H13: initial sync pending
H13: initial sync drop all databases (LOL, if mongod was planning to do that anyway, rebuilding the previous indexes was completely pointless!)

In the first case, if the problem is that the oplog has become too old, there definitely need to be proper log messages explaining what is happening (all the more since the rest of the logging is VERY verbose). But maybe there was some other issue here? I'm looking forward to hearing from you.

In the second case, I don't know whether it ever had a chance to succeed; probably not. But even then, mongod should first work out what state it is in and, if it plans to drop everything, avoid rebuilding the indexes first.

As I said, after several iterations I realized the same thing happens again and again, and decided to wait until the oplog window grows (by adding a new shard) or until we migrate to WiredTiger. Otherwise there is no chance for now; the sync is condemned to fail...

Here are some raw logs. You can see from the timestamps that I selected them from the following sequence: I had the top scenario, restarted mongod and got the bottom scenario; the excerpt starts at the end of the finished index build, and then the top scenario happens again.

[H13 bottom scenario]
2015-07-06T23:53:46.342+0200 I -        [initandlisten]   Index: (2/3) BTree Bottom Up Progress: 277723400/279451059 99%
2015-07-06T23:53:55.493+0200 I INDEX    [initandlisten]          done building bottom layer, going to commit
2015-07-06T23:53:55.493+0200 I INDEX    [initandlisten] build index done.  scanned 279451059 total records. 46909 secs
2015-07-06T23:54:02.063+0200 I NETWORK  [initandlisten] waiting for connections on port 27017
2015-07-06T23:54:02.080+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.18:34915 #1 (1 connection now open)
2015-07-06T23:54:02.369+0200 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs1", version: 8, members: [ { _id: 0, host: "HOST1:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "HOST2:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 3, host: "HOSTSEC:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.1, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatTimeoutSecs: 10, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2015-07-06T23:54:02.370+0200 I REPL     [ReplicationExecutor] This node is HOSTSEC:27017 in the config
2015-07-06T23:54:02.370+0200 I REPL     [ReplicationExecutor] transition to STARTUP2
2015-07-06T23:54:02.578+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.16:54724 #2 (2 connections now open)
2015-07-06T23:54:02.626+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.65:33608 #3 (3 connections now open)
2015-07-06T23:54:02.825+0200 I REPL     [ReplicationExecutor] Starting replication applier threads
2015-07-06T23:54:02.826+0200 I COMMAND  [conn1] command admin.$cmd command: isMaster { isMaster: 1 } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:356 locks:{} 495ms
2015-07-06T23:54:02.826+0200 I COMMAND  [conn2] command admin.$cmd command: isMaster { isMaster: 1 } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:356 locks:{} 247ms
2015-07-06T23:54:02.828+0200 I COMMAND  [conn3] command admin.$cmd command: isMaster { isMaster: 1 } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:356 locks:{} 200ms
2015-07-06T23:54:02.829+0200 I REPL     [ReplicationExecutor] Member HOST1:27017 is now in state SECONDARY
2015-07-06T23:54:02.829+0200 I REPL     [ReplicationExecutor] Member HOST2:27017 is now in state PRIMARY
2015-07-06T23:54:02.919+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.24:35911 #4 (4 connections now open)
2015-07-06T23:54:03.079+0200 I REPL     [rsSync] initial sync pending
2015-07-06T23:54:03.174+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.12:60560 #5 (5 connections now open)
2015-07-06T23:54:03.465+0200 I REPL     [rsSync] no valid sync sources found in current replset to do an initial sync
2015-07-06T23:54:03.933+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.23:36835 #6 (6 connections now open)
2015-07-06T23:54:04.285+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.17:45091 #7 (7 connections now open)
2015-07-06T23:54:04.465+0200 I REPL     [rsSync] initial sync pending
2015-07-06T23:54:04.466+0200 I REPL     [rsSync] no valid sync sources found in current replset to do an initial sync
2015-07-06T23:54:05.233+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:53087 #8 (8 connections now open)
2015-07-06T23:54:07.197+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.20:44611 #9 (9 connections now open)
2015-07-06T23:54:09.624+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.25:53462 #10 (10 connections now open)
2015-07-06T23:54:10.355+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.27:51038 #11 (11 connections now open)
2015-07-06T23:54:10.355+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.19:51669 #12 (12 connections now open)
2015-07-06T23:54:10.572+0200 I NETWORK  [initandlisten] connection accepted from internal.ip.11:48601 #13 (13 connections now open)
2015-07-06T23:54:11.447+0200 I REPL     [rsSync] initial sync pending
2015-07-06T23:54:11.447+0200 I COMMAND  [conn8] command db.$cmd command: dbStats { dbStats: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:315 locks:{ Global: { acquireCount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { R: 1 } }, Metadata: { acquireCount: { R: 1 } } } 6196ms
2015-07-06T23:54:11.448+0200 I REPL     [ReplicationExecutor] syncing from: HOST2:27017
2015-07-06T23:54:11.457+0200 I REPL     [rsSync] initial sync drop all databases
[LOL]
2015-07-06T23:54:11.458+0200 I STORAGE  [rsSync] dropAllDatabasesExceptLocal 3
2015-07-06T23:54:11.578+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:53089 #14 (14 connections now open)
2015-07-06T23:54:11.825+0200 I NETWORK  [conn8] end connection 127.0.0.1:53087 (13 connections now open)
2015-07-06T23:54:12.102+0200 I JOURNAL  [rsSync] journalCleanup...
2015-07-06T23:54:12.102+0200 I JOURNAL  [rsSync] removeJournalFiles
2015-07-06T23:54:13.318+0200 I JOURNAL  [rsSync] journalCleanup...
2015-07-06T23:54:13.318+0200 I JOURNAL  [rsSync] removeJournalFiles
2015-07-06T23:54:23.641+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:53090 #15 (14 connections now open)

[...] (restarts from scratch, top scenario)

2015-07-08T11:53:15.574+0200 I INDEX    [rsSync] build index done.  scanned 11070 total records. 0 secs
2015-07-08T11:53:15.595+0200 I REPL     [rsSync] initial sync data copy, starting syncup
2015-07-08T11:53:15.963+0200 I REPL     [rsSync] oplog sync 1 of 3
2015-07-08T11:53:16.613+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51367 #23853 (13 connections now open)
2015-07-08T11:53:16.617+0200 I NETWORK  [conn23853] end connection 127.0.0.1:51367 (12 connections now open)
2015-07-08T11:53:16.756+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51368 #23854 (13 connections now open)
2015-07-08T11:53:16.786+0200 I NETWORK  [conn23854] end connection 127.0.0.1:51368 (12 connections now open)
2015-07-08T11:53:16.796+0200 I REPL     [ReplicationExecutor] syncing from: HOST2:27017
2015-07-08T11:53:16.799+0200 I REPL     [SyncSourceFeedback] replset setting syncSourceFeedback to HOST2:27017
[...]
2015-07-08T11:53:22.565+0200 I REPL     [repl writer worker 13] replication failed to apply update: {JSON}
2015-07-08T11:53:22.575+0200 I REPL     [repl writer worker 13] replication info adding missing object
2015-07-08T11:53:22.582+0200 I REPL     [repl writer worker 13] replication missing object not found on source. presumably deleted later in oplog
2015-07-08T11:53:22.583+0200 I REPL     [repl writer worker 13] replication o2: { _id: ObjectId('UUID') }
2015-07-08T11:53:22.583+0200 I REPL     [repl writer worker 13] replication o firstfield: $set
[...]
2015-07-08T16:13:30.136+0200 I COMMAND  [repl writer worker 14] CMD: drop db.tmp.mrs.reports_1436348079_6814
2015-07-08T16:13:30.261+0200 I COMMAND  [repl writer worker 14] CMD: drop db.tmp.mrs.reports_1436348079_6815
[...]
2015-07-08T18:01:54.124+0200 I COMMAND  [conn26678] command db.$cmd command: collStats { collstats: "coll1" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:420 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 10266409 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { R: 1 } } } 2858ms
2015-07-08T18:01:54.127+0200 I NETWORK  [conn26678] end connection 127.0.0.1:58499 (12 connections now open)
[WTF!?]
2015-07-08T18:01:54.287+0200 I REPL     [rsSync] initial sync building indexes
2015-07-08T18:01:54.297+0200 I REPL     [rsSync] initial sync cloning indexes for : admin
2015-07-08T18:01:54.377+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:58500 #26680 (13 connections now open)
2015-07-08T18:01:54.379+0200 I NETWORK  [conn26680] end connection 127.0.0.1:58500 (12 connections now open)
2015-07-08T18:01:54.446+0200 I REPL     [rsSync] initial sync cloning indexes for : db
2015-07-08T18:01:54.462+0200 I STORAGE  [rsSync] copying indexes for: { name: "coll1", options: {} }
2015-07-08T18:01:54.616+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:58503 #26681 (13 connections now open)
2015-07-08T18:01:54.672+0200 I INDEX    [rsSync] build index on: db.coll1 properties: { v: 1, key: { KEY }, name: "#_1_uuid_1", ns: "db.coll1" }
2015-07-08T18:01:54.672+0200 I INDEX    [rsSync]         building index using bulk method


Thanks for some feedback,
Cheers,
Matthieu Rigal (MongoEngine co-owner)


Matthieu Rigal

Jul 13, 2015, 12:34:59 PM
to mongod...@googlegroups.com
I would have expected some answer here, or at least some explanation of the process, which I tried to document properly. The first case at least looks like a real bug to me; should I open a ticket directly? I wanted to discuss it here first...

Matthieu

Matthieu Rigal

Sep 7, 2015, 11:26:57 AM
to mongodb-user
Hi,

Having in the meantime added another replica set (shard), and thus now dealing with a smaller collection, I expected to have more success with my latest attempts. Unfortunately it went wrong one more time. I don't know why these initial sync problems aren't taken more seriously.

This time one of the problems seems to be that the freshly built index is dropped because a duplicate is found (but how is that possible? The other members do not accept duplicates either; this is MongoDB 3.0.6 on MMAPv1). Since this is a rather serious error, I would expect more decent error logging, with details about which documents are duplicated, instead of the current useless message (see the sketch below).
After this drop, the build process seems to shoot itself in the foot, since I get a lot of socket exceptions.
Then it tries to sync the oplog and dies again in the middle of (or before the end of) step 2 of 3 of the oplog sync.
Then the process retries building all the indexes, starting an endless loop...
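
(To get the detail that the "index build dropped: 1 dups" message omits, the duplicated key values can be located with an aggregation sketch like the one below, run against a healthy member; "db", "coll1" and "uuid" are placeholders for the affected namespace and the field covered by the unique index.)

from pymongo import MongoClient

client = MongoClient("mongodb://HOST2:27017")  # healthy member, placeholder host
coll = client["db"]["coll1"]

# Group by the indexed field and keep only values that occur more than once.
pipeline = [
    {"$group": {"_id": "$uuid", "count": {"$sum": 1}}},
    {"$match": {"count": {"$gt": 1}}},
    {"$limit": 10},
]
for dup in coll.aggregate(pipeline, allowDiskUse=True):
    print(dup)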

The oplog should normally be more than enough, since it covers 60 to 80 hours while the sync process needs about 2 days, but would a bigger oplog help? Is it at all possible to make it larger on one server than the usual 50 GB?
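
(If I understand correctly, the oplog size is a per-mongod startup setting (--oplogSize / replication.oplogSizeMB), so in principle one member could be given a larger oplog than the others, though changing it on an existing member requires the documented oplog resize procedure. Below is a rough pymongo sketch to estimate how big the oplog would need to be to cover a ~48-hour sync, based on the current window; the host name is a placeholder.)

from pymongo import MongoClient

client = MongoClient("mongodb://HOST2:27017")  # placeholder sync source
local = client.local

# Configured maximum size of the capped oplog collection, in bytes.
stats = local.command("collStats", "oplog.rs")
size_gb = stats["maxSize"] / (1024.0 ** 3)

# Time span currently held in the oplog.
oplog = local["oplog.rs"]
first = oplog.find().sort("$natural", 1).limit(1).next()
last = oplog.find().sort("$natural", -1).limit(1).next()
window_hours = (last["ts"].time - first["ts"].time) / 3600.0

print("oplog max size: %.1f GB, current window: %.1f hours" % (size_gb, window_hours))
print("size needed for 48 h at the current write rate: %.1f GB"
      % (size_gb * 48.0 / window_hours))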

Here are some more logs in case they help.

Best,
Matthieu

2015-08-19T09:50:24.285+0200 I INDEX    [rsSync]         done building bottom layer, going to commit
2015-08-19T09:50:24.295+0200 I INDEX    [rsSync] build index done.  scanned 2237230 total records. 8387 secs
2015-08-19T09:50:24.314+0200 I STORAGE  [rsSync] index build dropped: 1 dups
2015-08-19T09:50:24.643+0200 I COMMAND  [conn26299] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:182 locks:{ Global: { acquireCount: { r: 4 } }, MMAPV1Journal: { acquireCount: { r: 3 } }, Database: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 6745926131628 } } } 2597284ms
[...]
2015-08-19T09:50:24.673+0200 I COMMAND  [conn25846] command admin.$cmd command: listDatabases { listDatabases: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:182 locks:{ Global: { acquireCount: { r: 4 } }, MMAPV1Journal: { acquireCount: { r: 3 } }, Database: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 38665023674208 } } } 6219668ms
2015-08-19T09:50:24.677+0200 I NETWORK  [conn26611] end connection 127.0.0.1:54452 (343 connections now open)
2015-08-19T09:50:24.837+0200 I NETWORK  [conn26478] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:54298]
2015-08-19T09:50:24.837+0200 I NETWORK  [conn25894] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:53623]
2015-08-19T09:50:24.837+0200 I NETWORK  [conn26338] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:54136]
[...]
2015-08-19T09:50:24.842+0200 I NETWORK  [conn26206] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:53982]
2015-08-19T09:50:24.842+0200 I NETWORK  [conn26158] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:53928]
2015-08-19T09:50:24.843+0200 I COMMAND  [conn26484] command db.$cmd command: collStats { collstats: "coll2" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:419 locks:{ Global: { acquireCount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 1254578946053 } }, Collection: { acquireCount: { R: 1 } } } 1119871ms
[...]
2015-08-19T09:50:26.302+0200 I NETWORK  [conn25892] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:53621]
2015-08-19T09:50:26.348+0200 I INDEX    [rsSync] build index on: db.sync properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "db.sync" }
2015-08-19T09:50:26.348+0200 I INDEX    [rsSync]         building index using bulk method
2015-08-19T09:50:26.374+0200 I INDEX    [rsSync] build index done.  scanned 3981 total records. 0 secs
2015-08-19T09:50:26.379+0200 I REPL     [rsSync] initial sync data copy, starting syncup
2015-08-19T09:50:26.664+0200 I REPL     [rsSync] oplog sync 1 of 3
2015-08-19T09:50:27.569+0200 I REPL     [ReplicationExecutor] syncing from: PRIMARY_SRV_NAME:27017
2015-08-19T09:50:27.572+0200 I REPL     [SyncSourceFeedback] replset setting syncSourceFeedback to PRIMARY_SRV_NAME:27017
2015-08-19T09:50:27.661+0200 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54472 #26628 (17 connections now open)
[..]
2015-08-19T09:50:31.343+0200 I REPL     [repl writer worker 12] replication failed to apply update: { "json_obj" }
2015-08-19T09:50:31.344+0200 I REPL     [repl writer worker 12] adding missing object
2015-08-19T09:50:31.346+0200 I REPL     [repl writer worker 12] missing object not found on source. presumably deleted later in oplog
2015-08-19T09:50:31.346+0200 I REPL     [repl writer worker 12] o2: { _id: ObjectId('55d1b3bbd3d60e563f245ea6') }
2015-08-19T09:50:31.346+0200 I REPL     [repl writer worker 12] o firstfield: $set
2015-08-19T09:50:31.346+0200 I REPL     [repl writer worker 12] replication failed to apply update: { "json_obj" }
2015-08-19T13:04:42.480+0200 I REPL     [rsSync] oplog sync 2 of 3
[..]
2015-08-19T13:32:41.373+0200 I NETWORK  [conn28297] end connection IP.IP.IP.24:53272 (15 connections now open)
2015-08-19T13:32:41.373+0200 I NETWORK  [initandlisten] connection accepted from IP.IP.IP.24:53275 #28299 (16 connections now open)
2015-08-19T13:32:42.635+0200 I REPL     [rsSync] initial sync building indexes
2015-08-19T13:32:42.635+0200 I REPL     [rsSync] initial sync cloning indexes for : admin
2015-08-19T13:32:42.764+0200 I REPL     [rsSync] initial sync cloning indexes for : db
2015-08-19T13:32:42.801+0200 I STORAGE  [rsSync] copying indexes for: { name: "coll1", options: {} }

Rhys Campbell

Sep 11, 2015, 3:41:18 AM
to mongodb-user
Hi Matthieu,

With that amount of data, I think MongoDB would recommend you perform the initial seeding by copying data files. Are you able to remove one secondary from the cluster to copy its data files over?
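
(If you go that route, something like the sketch below could confirm afterwards that the seeded member is catching up and has not fallen out of the primary's oplog window; host names are placeholders.)

import time
from pymongo import MongoClient

primary = MongoClient("mongodb://HOST2:27017")

# Oldest entry still in the primary's oplog: the seeded member's optime must
# stay newer than this for normal replication to be able to catch it up.
oldest_ts = primary.local["oplog.rs"].find().sort("$natural", 1).limit(1).next()["ts"]

caught_up = False
while not caught_up:
    status = primary.admin.command("replSetGetStatus")
    for m in status["members"]:
        if m["name"] == "HOSTSEC:27017":  # the freshly seeded member
            print(m["stateStr"], m.get("optimeDate"), "oplog floor:", oldest_ts)
            caught_up = (m["stateStr"] == "SECONDARY")
    if not caught_up:
        time.sleep(30)
print("member has caught up")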

Rhys