replSet initial sync drop all databases

487 views
Skip to first unread message

郭冬冬

unread,
Jan 22, 2014, 2:47:17 AM1/22/14
to mongod...@googlegroups.com
why does mongodb secondary drop all database when start up ?

Greg Donald

unread,
Jan 30, 2014, 9:51:15 PM1/30/14
to mongod...@googlegroups.com
On Wednesday, January 22, 2014 1:47:17 AM UTC-6, 郭冬冬 wrote:
why does mongodb secondary drop all database when start up ?


I just tested this and I'm not seeing the behavior you describe.

I setup a small replica set using

mongod --fork --logpath a.log --smallfiles --oplogSize 50 --port 27001 --dbpath data/z1 --replSet z
mongod --fork --logpath b.log --smallfiles --oplogSize 50 --port 27002 --dbpath data/z2 --replSet z
mongod --fork --logpath c.log --smallfiles --oplogSize 50 --port 27003 --dbpath data/z3 --replSet z

> rs.initiate(                                                                                    
    { _id:'z',                                                                                              
      members:[                                                                                             
        { _id:1, host:'localhost:27001' },                                                                   
        { _id:2, host:'localhost:27002' },                                                                   
        { _id:3, host:'localhost:27003' }                                                                    
      ]                                                                                             
    }
);

Then I killed all three processes:

kill -9 25542 25496 25483

Next I brought one of them back up

mongod --fork --logpath c.log --smallfiles --oplogSize 50 --port 27003 --dbpath data/z3

and inserted a doc

> db.foo.insert({a:1})

Then I killed that process

kill -9 25885

Then I brought the replica set back online using

mongod --fork --logpath a.log --smallfiles --oplogSize 50 --port 27001 --dbpath data/z1 --replSet z
mongod --fork --logpath b.log --smallfiles --oplogSize 50 --port 27002 --dbpath data/z2 --replSet z
mongod --fork --logpath c.log --smallfiles --oplogSize 50 --port 27003 --dbpath data/z3 --replSet z

and my data is still there:

> mongo --port 27003
MongoDB shell version: 2.4.6
connecting to: 127.0.0.1:27003/test
z:SECONDARY> db.setSlaveOk()
z:SECONDARY> db.foo.find()
{ "_id" : ObjectId("52eb09f9df67edd1c6d33e71"), "a" : 1 }
z:SECONDARY>


In addition, if I insert data into the primary it seems to be perfectly happy to live right there alongside my existing data on the secondary:

> mongo --port 27001
MongoDB shell version: 2.4.6
connecting to: 127.0.0.1:27001/test
z:PRIMARY> db.foo.insert({b:2})
z:PRIMARY> db.foo.find()
{ "_id" : ObjectId("52eb0cefdf3e83d897ae68ec"), "b" : 2 }
z:PRIMARY>

> mongo --port 27003
MongoDB shell version: 2.4.6
connecting to: 127.0.0.1:27003/test
z:SECONDARY> db.setSlaveOk()
z:SECONDARY> db.foo.find()
{ "_id" : ObjectId("52eb09f9df67edd1c6d33e71"), "a" : 1 }
{ "_id" : ObjectId("52eb0cefdf3e83d897ae68ec"), "b" : 2 }



Asya Kamsky

unread,
Jan 31, 2014, 4:07:39 AM1/31/14
to mongodb-user
By starting up, I think you mean "when it first joins the replica set"
(not when you restart it after it's been a member).

The reason is that secondary must have identical copy of the data from
the primary. The way it gets there is by dropping its own DBs and
"copying" the DBs from the primary (or another node in the replica
set).

Asya


On Tue, Jan 21, 2014 at 11:47 PM, 郭冬冬 <guodd.si...@gmail.com> wrote:
> why does mongodb secondary drop all database when start up ?
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb
>
> ---
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to mongodb-user...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Shi Shei

unread,
Mar 20, 2014, 1:19:59 PM3/20/14
to mongod...@googlegroups.com
Asya, the initial sync never terminates because all databases are dropped after a certain time, sometimes at 70% sometimes at 80%.
Here are the logs where you can see that it cloned a database and then dropped them all to begin anew its initial sync:

Thu Mar 20 12:41:51.775 [rsSync] clone offerStore_shopverwaltung_dev.offer 5990143
Thu Mar 20 12:41:51.807 [conn82] auth: couldn't find user mms@admin, admin.system.users
Thu Mar 20 12:41:51.807 [conn83] auth: couldn't find user mms@admin, admin.system.users
Thu Mar 20 12:41:51.818 [conn80] end connection 172.16.64.36:59671 (18 connections now open)
Thu Mar 20 12:41:51.818 [conn79] end connection 172.16.64.36:59670 (17 connections now open)
Thu Mar 20 12:41:53.630 [conn86] end connection 172.17.64.59:56518 (16 connections now open)
Thu Mar 20 12:41:53.630 [initandlisten] connection accepted from 172.17.64.59:56520 #87 (18 connections now open)
Thu Mar 20 12:41:53.752 [rsHealthPoll] replset info b32:27018 thinks that we are down
Thu Mar 20 12:41:54.347 [conn87]  authenticate db: local { authenticate: 1, nonce: "85ce9c285d8821c0", user: "__system", key: "596f5de6b350d845da505267ddb67f5c" }
Thu Mar 20 12:41:54.481 [rsSync] Socket flush send() errno:9 Bad file descriptor 172.17.64.59:27018
Thu Mar 20 12:41:54.481 [rsSync]   caught exception (socket exception [SEND_ERROR] for 172.17.64.59:27018) in destructor (~PiggyBackData)
Thu Mar 20 12:41:54.481 [rsSync] replSet initial sync exception: 16465 recv failed while exhausting cursor 9 attempts remaining
Thu Mar 20 12:42:00.351 [conn87] replSet RECOVERING
Thu Mar 20 12:42:00.351 [conn87] replSet info voting yea for b32:27018 (0)
Thu Mar 20 12:42:01.759 [conn77] end connection 127.0.0.1:42507 (16 connections now open)
Thu Mar 20 12:42:01.759 [initandlisten] connection accepted from 127.0.0.1:42515 #88 (17 connections now open)
Thu Mar 20 12:42:02.349 [rsHealthPoll] replSet member b32:27018 is now in state PRIMARY
Thu Mar 20 12:42:18.858 [initandlisten] connection accepted from 172.16.64.36:33932 #89 (18 connections now open)
Thu Mar 20 12:42:18.858 [initandlisten] connection accepted from 172.16.64.36:33933 #90 (19 connections now open)
Thu Mar 20 12:42:18.880 [conn89]  authenticate db: admin { authenticate: 1, nonce: "7cd22915b5deb491", user: "mms", key: "35759fc233397c775562f730ac13d424" }
Thu Mar 20 12:42:18.880 [conn89] auth: couldn't find user mms@admin, admin.system.users
Thu Mar 20 12:42:18.880 [conn90]  authenticate db: admin { authenticate: 1, nonce: "3173617992cad84a", user: "mms", key: "3c57caff985161dff987ef5d78e25a0f" }
Thu Mar 20 12:42:18.881 [conn90] auth: couldn't find user mms@admin, admin.system.users
Thu Mar 20 12:42:18.891 [conn82] end connection 172.16.64.36:60738 (18 connections now open)
Thu Mar 20 12:42:18.892 [conn83] end connection 172.16.64.36:60739 (17 connections now open)
Thu Mar 20 12:42:20.355 [conn87] end connection 172.17.64.59:56520 (16 connections now open)
Thu Mar 20 12:42:20.356 [initandlisten] connection accepted from 172.17.64.59:56529 #91 (17 connections now open)
Thu Mar 20 12:42:20.356 [conn91]  authenticate db: local { authenticate: 1, nonce: "3bb64e864d2ffcc8", user: "__system", key: "ee20eb67dab81bfe1f9ca9a7a7b68e1e" }
Thu Mar 20 12:42:24.481 [rsSync] replSet initial sync pending
Thu Mar 20 12:42:24.481 [rsSync] replSet syncing to: b32:27018
Thu Mar 20 12:42:24.484 [rsSync] replSet initial sync drop all databases
Thu Mar 20 12:42:24.484 [rsSync] dropAllDatabasesExceptLocal 7
Thu Mar 20 12:42:24.521 [rsSync] removeJournalFiles
Thu Mar 20 12:42:24.565 [rsSync] removeJournalFiles
Thu Mar 20 12:42:27.108 [rsSync] removeJournalFiles
Thu Mar 20 12:42:27.573 [rsSync] removeJournalFiles
Thu Mar 20 12:42:27.793 [rsSync] removeJournalFiles
Thu Mar 20 12:42:27.858 [rsSync] removeJournalFiles
Thu Mar 20 12:42:27.986 [rsSync] replSet initial sync clone all databases
Thu Mar 20 12:42:27.990 [rsSync] replSet initial sync cloning db: offerStore_offermanager_dev1

Asya Kamsky

unread,
Mar 22, 2014, 9:37:53 PM3/22/14
to mongodb-user
MongoDB drops all databases at the beginning of initial sync (since it'll sync all data, it needs to remove all "previous" data).

In the logs you show the initial sync failed the previous time - it looks like it lost connection to the server it was syncing from or otherwise took long enough that the cursor expired (either timed out or if this was a capped collection possibly rolled over).

Syncing by seeding the secondary with a copy of the DB directory will usually avoid this issue.

Asya



For more options, visit https://groups.google.com/d/optout.

Shi Shei

unread,
Mar 24, 2014, 6:55:11 AM3/24/14
to mongod...@googlegroups.com
Thank you Asya, your help is much appreciated.
We restarted mongod with vvvvvv to have more verbose logs. This time, the initial sync succeeded. Not because of the verbosity of the logs but because the oplog changed in the meantime. We don't use capped collections. A network failure is implausible. We rather think that mongo got stuck somewhere in the oplog before.
Syncing by seeding is a good hint indeed. Thanks!
Reply all
Reply to author
Forward
0 new messages