MongoDB Initial Sync failing repeatedly (Mongo 3.2, Ubuntu 16.04)

Abhishek Vaid

May 26, 2016, 6:58:41 AM
to mongodb-dev
  • I have a single-member replica set (Server1) whose mongod holds a very large database: 2.5 million documents (each document is very large) and 4 indexes.
  • I then added a second machine (Server2) to this replica set. The mongod on Server2 takes about 5 hours to fetch all the documents in this database.
  • After Server2 has fetched all the documents, it starts building the secondary indexes, which takes around 3 hours.

    • Immediately after the secondary index builds complete, it tries to talk to the primary and finds that the socket has timed out.
    • On receiving the timeout error, Server2 simply drops all databases and starts the initial sync again.

  • The snippet from logs is below:


    2016-05-25T11:50:36.053+0000 I -        [rsSync]   Index Build: 2211700/2215091 99%
    2016-05-25T11:50:39.221+0000 I -        [rsSync]   Index Build: 2212000/2215091 99%
    2016-05-25T11:50:43.300+0000 I -        [rsSync]   Index Build: 2212300/2215091 99%
    2016-05-25T11:50:46.103+0000 I -        [rsSync]   Index Build: 2212500/2215091 99%
    2016-05-25T11:50:49.068+0000 I -        [rsSync]   Index Build: 2212800/2215091 99%
    2016-05-25T11:50:52.218+0000 I -        [rsSync]   Index Build: 2213600/2215091 99%
    2016-05-25T11:50:55.439+0000 I -        [rsSync]   Index Build: 2214500/2215091 99%
    2016-05-25T11:50:58.738+0000 I -        [rsSync]   Index Build: 2214700/2215091 99%
    2016-05-25T11:51:13.223+0000 I -        [rsSync]   Index: (2/3) BTree Bottom Up Progress: 536600/2215091 24%
    2016-05-25T11:51:23.285+0000 I -        [rsSync]   Index: (2/3) BTree Bottom Up Progress: 1984500/2215091 89%
    2016-05-25T11:51:24.317+0000 I INDEX    [rsSync]   done building bottom layer, going to commit
    2016-05-25T11:51:24.508+0000 I INDEX    [rsSync] build index done.  scanned 2215091 total records. 10491 secs
    2016-05-25T11:51:25.082+0000 I NETWORK  [rsSync] Socket say send() errno:110 Connection timed out xx.xx.xx.xx:27017
    2016-05-25T11:51:25.106+0000 E REPL     [rsSync] 9001 socket exception [SEND_ERROR] server [xx.xx.xx.xx:27017] 
    2016-05-25T11:51:25.106+0000 E REPL     [rsSync] initial sync attempt failed, 9 attempts remaining
    2016-05-25T11:51:30.106+0000 I REPL     [rsSync] initial sync pending
    2016-05-25T11:51:30.433+0000 I REPL     [ReplicationExecutor] syncing from: xx.xx.xx.xx:27017
    2016-05-25T11:51:30.563+0000 I REPL     [rsSync] initial sync drop all databases
    2016-05-25T11:51:30.564+0000 I STORAGE  [rsSync] dropAllDatabasesExceptLocal 42
    2016-05-25T11:51:31.925+0000 I JOURNAL  [rsSync] journalCleanup...
    2016-05-25T11:51:31.925+0000 I JOURNAL  [rsSync] removeJournalFiles
    2016-05-25T11:51:32.331+0000 I JOURNAL  [rsSync] journalCleanup...
    2016-05-25T11:51:32.332+0000 I JOURNAL  [rsSync] removeJournalFiles
    2016-05-25T11:51:32.489+0000 I JOURNAL  [rsSync] journalCleanup...
    2016-05-25T11:51:32.489+0000 I JOURNAL  [rsSync] removeJournalFiles


  • It has been very frustrating trying to sync this replica set: it keeps restarting the initial sync over and over. Any help is highly appreciated.
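
For anyone trying to confirm they are hitting the same pattern, here is a minimal sketch that scans a mongod log for a "build index done" line followed within a few seconds by an errno 110 socket timeout. It assumes the 3.2 log line format shown in the snippet above; the function name, the regular expressions, and the 5-second window are illustrative choices, not anything MongoDB provides.

```python
import re
from datetime import datetime, timedelta

# Timestamp layout used by the log lines above, e.g. 2016-05-25T11:51:24.508+0000
LOG_TS = "%Y-%m-%dT%H:%M:%S.%f%z"

# Patterns copied from the messages in the log excerpt (assumed stable for 3.2).
BUILD_DONE = re.compile(
    r"build index done\.\s+scanned (\d+) total records\. (\d+) secs"
)
SOCKET_TIMEOUT = re.compile(r"errno:110 Connection timed out")


def parse_ts(line):
    """Parse the leading timestamp of a mongod log line."""
    return datetime.strptime(line.split()[0], LOG_TS)


def find_timeouts_after_build(lines, max_gap=timedelta(seconds=5)):
    """Return one record per index build that is followed, within max_gap,
    by an errno 110 socket timeout."""
    hits = []
    last_build = None  # (timestamp, records scanned, build seconds)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        m = BUILD_DONE.search(line)
        if m:
            last_build = (parse_ts(line), int(m.group(1)), int(m.group(2)))
            continue
        if SOCKET_TIMEOUT.search(line) and last_build is not None:
            gap = parse_ts(line) - last_build[0]
            if timedelta(0) <= gap <= max_gap:
                hits.append({
                    "records": last_build[1],
                    "build_secs": last_build[2],
                    "gap_secs": gap.total_seconds(),
                })
            last_build = None
    return hits


# Two lines taken verbatim from the log excerpt above.
sample = [
    "2016-05-25T11:51:24.508+0000 I INDEX    [rsSync] build index done.  scanned 2215091 total records. 10491 secs",
    "2016-05-25T11:51:25.082+0000 I NETWORK  [rsSync] Socket say send() errno:110 Connection timed out xx.xx.xx.xx:27017",
]
hits = find_timeouts_after_build(sample)
for hit in hits:
    print(hit)
```

A hit with a large `build_secs` and a sub-second `gap_secs` suggests that the first send to the primary after a long index build is the one that fails, which matches the sequence described above.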

Asya Kamsky

May 30, 2016, 11:27:40 PM
to mongo...@googlegroups.com
This list is for developers *of* MongoDB - please ask your question on
MongoDB-user Google Group.

Asya

--
Asya Kamsky
Lead Product Manager
MongoDB

Simon Foster

Aug 3, 2016, 9:21:52 AM
to mongodb-dev
I realise this is in the wrong group, but:

1) I have an identical problem. A new replica set mongod on the same server syncs fine, but on a remote server it fails like this.
2) This is Google's #1 search result for this error message
3) I searched for all posts by this user on mongodb-user and cannot find this posted there
4) Others are likely to have the problem and arrive at the same place, so it would be good to have an answer here other than "wrong way, go back"

Abhishek, did you resolve this problem and if so, how?  Anyone else?

To try and get on topic: why does Mongo loop indefinitely on rsSync failure? If the sync has never succeeded, a back-off process would make much more sense. We've blown through a month of bandwidth allocation because we thought it was just taking a long time to sync (the Cloud Manager setup does say this could "take days").
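
To make the suggestion concrete, here is a rough sketch of the kind of back-off I mean. This is not how mongod behaves and none of these names exist in MongoDB; `backoff_delays`, `run_with_backoff`, the doubling factor, and the one-hour cap are all hypothetical. The 5-second starting delay and the 10 attempts are taken from the log above (a 5-second pause before "initial sync pending", and "9 attempts remaining" after the first failure).

```python
import time


def backoff_delays(attempts, base=5.0, factor=2.0, cap=3600.0):
    """Yield one wait time (seconds) per attempt, growing geometrically
    from `base` and never exceeding `cap`."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def run_with_backoff(attempt, max_attempts=10, sleep=time.sleep, **kwargs):
    """Call `attempt()` until it returns True or attempts run out.

    Returns the 1-based number of the successful attempt, or None if
    every attempt failed. `sleep` is injectable so the loop can be
    exercised without actually waiting.
    """
    for n, delay in enumerate(backoff_delays(max_attempts, **kwargs), start=1):
        if attempt():
            return n
        if n < max_attempts:
            sleep(delay)  # wait longer after each consecutive failure
    return None


# Example: a stand-in "sync" that fails twice and then succeeds.
outcomes = iter([False, False, True])
waited = []
result = run_with_backoff(lambda: next(outcomes), sleep=waited.append)
print(result, waited)
```

With ten attempts and these defaults the waits grow from 5 seconds to about 43 minutes, so a sync that can never succeed would stop re-downloading the full data set back to back.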

Thanks,

- Simon