Replica Set Resync Issue - Trying to fix a "RECOVERING" member


pnair

Sep 24, 2018, 8:09:48 PM
to mongodb-user
We are using MongoDB 3.6 and have a sharded cluster whose shards are replica sets.

Two of the replica sets in our cluster have gone into a RECOVERING state. There is one member that is still acting as the primary.
We tried to recover one of the members in the RECOVERING state by removing the data from the /data/db directory and then restarting mongod, using the approach here.

The member comes up, completes the initial sync, and goes into the SECONDARY state. There are no errors, and rs.status() shows that the secondary is syncing from the primary member.
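For reference, the state and sync source can be read from rs.status() along these lines (this is just a sketch; the member host names are illustrative):

// Print each member, its state, and (for secondaries) the member it is syncing from.
rs.status().members.forEach(function (m) {
    print(m.name + "  " + m.stateStr + (m.syncingTo ? "  <- syncing from " + m.syncingTo : ""));
});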

However, we are noticing that the data does not really appear to be getting resynced. For example, Db5 is 16 GB on the primary but shows as 0.128 GB on the secondary.

Has anyone seen this behavior when fixing a replica set member in the RECOVERING state? Any suggestions?
Here is some of the data we are seeing:

mongo-cluster1-shard1:PRIMARY> show dbs
Db1    0.460GB
Db2    0.077GB
Db3    0.032GB
Db4    3.837GB
Db5    16.745GB
Db6    0.001GB
admin  0.000GB
local  1.374GB

mongo-cluster1-shard1:SECONDARY> show dbs
Db1    0.077GB
Db2    0.058GB
Db3    0.006GB
Db4    3.835GB
Db5    0.128GB
Db6    0.001GB
admin  0.000GB
local  0.052GB


Kevin Adistambha

Sep 26, 2018, 9:26:16 PM
to mongodb-user

Hi,

The numbers next to the database names are the database sizes on disk, and do not reflect the actual content of the databases themselves. Across the nodes of a replica set these numbers are expected to vary, especially on a newly synced member. This is because MongoDB’s replication is not binary replication, so the contents can be stored differently on each node.

A better comparison metric would be the output of db.stats(). Of particular note is dataSize, which reflects the actual data size rather than the storage size. See dbStats for more details on the output of db.stats().
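For example, running something along these lines on both the primary and the newly synced secondary (using Db5 from your listing) would give a like-for-like comparison; this is just a sketch:

// dataSize is the logical (uncompressed) size of the documents and should be
// roughly equal across members; storageSize and indexSize can legitimately differ.
var s = db.getSiblingDB("Db5").stats();
print("dataSize: " + s.dataSize + "  storageSize: " + s.storageSize + "  indexSize: " + s.indexSize);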

It’s important to note that once a node reaches the SECONDARY state after an initial sync, it has finished applying the primary’s oplog up to that point and is ready to take over as primary in the event of a failover. Other states (e.g. RECOVERING, STARTUP, etc.) do not carry this guarantee. Hence, once it is a SECONDARY, you can be reasonably certain that it has all of the data on the current primary.
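As an additional sanity check, the shell helper below (named as such in the MongoDB 3.6 shell) prints how far each secondary's last applied oplog entry is behind the primary's:

// Prints the replication lag of each secondary relative to the primary.
rs.printSlaveReplicationInfo();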

Best regards,
Kevin

pnair

Oct 1, 2018, 1:12:46 PM
to mongodb-user

Hi Kevin,

Thanks for your reply. I ran db.stats() on one of the databases on both the primary and secondary nodes.
Here are the stats. As you said, the dataSize values match while the storageSize values vary. Can you explain a little more why there is so much difference in the storage size?
I also see that the index size is different between the primary and the secondary.

Thanks again..

mongo-cluster1-shard1:PRIMARY> db.stats()
{
        "db" : "Db5",
        "collections" : 46,
        "views" : 0,
        "objects" : 78063,
        "avgObjSize" : 6761.67507013566,
        "dataSize" : 527836641,
        "storageSize" : 17776951296,
        "numExtents" : 0,
        "indexes" : 162,
        "indexSize" : 202719232,
        "ok" : 1
}


mongo-cluster1-shard1:SECONDARY> db.stats()
{
        "db" : "Db5",
        "collections" : 46,
        "views" : 0,
        "objects" : 78066,
        "avgObjSize" : 6761.701816411754,
        "dataSize" : 527859014,
        "storageSize" : 131010560,
        "numExtents" : 0,
        "indexes" : 162,
        "indexSize" : 5918720,
        "ok" : 1
}
Kevin Adistambha

Oct 1, 2018, 10:38:58 PM
to mongodb-user

Hi,

Can you explain a little more why there is so much difference in the storage size?

The dbStats page I linked previously should have a good explanation of the output metrics. To paraphrase:

dbStats.dataSize

The total size of the uncompressed data held in this database. The dataSize decreases when you remove documents.

also:

dbStats.storageSize

The total amount of space allocated to collections in this database for document storage. The storageSize does not decrease as you remove or shrink documents. This value may be smaller than dataSize for databases using the WiredTiger storage engine with compression enabled.

In combination, what the two stats are showing is that dataSize is the actual size of your data (uncompressed), and the storageSize is the actual size on disk. If your workload involves a lot of updates and deletes, eventually there could be fragmentation in how the files are stored on disk.

This is deliberate: WiredTiger will eventually reuse the “blank” spots within the existing file when you insert new documents, so that it does not have to constantly resize the disk files, which is an expensive operation. Imagine if WiredTiger were forced to resize and rearrange the disk files every time you deleted a single document, especially one stored physically in the middle of the data file; performance would not be very good. The same applies to indexes, which is why the newly synced member has a smaller index size.
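If you want to see how much of the allocated space WiredTiger considers reusable for a particular collection, the collection stats expose it; a sketch, with the collection name below being just a placeholder:

// "file bytes available for reuse" is space inside the existing on-disk file that
// WiredTiger will fill with new writes before growing the file.
var cs = db.getSiblingDB("Db5").getCollection("myCollection").stats();
print("storageSize: " + cs.storageSize);
print("reusable:    " + cs.wiredTiger["block-manager"]["file bytes available for reuse"]);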

If you really need to recover space, then you can run compact. However, please note that this operation requires downtime on the node involved and is highly disruptive to run. There is also no guarantee that space will be recovered. The most reliable way to reclaim space is to perform an initial sync on the node (which is what you did in your case).
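For completeness, compact operates on one collection at a time rather than a whole database; a minimal sketch, with the collection name being a placeholder:

// On 3.6 with WiredTiger, compact blocks operations on the database it runs
// against, so only run it on a node that has been taken out of rotation.
db.getSiblingDB("Db5").runCommand({ compact: "myCollection" })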

Best regards,
Kevin
