MongoDB WT replication network usage

MarcF

Oct 18, 2016, 5:07:00 AM

to mongodb-user

I have a mongo replica set where all reads/writes must go via the primary. The majority of the writes to the primary are small UPDATES to relatively large documents (~500kB). Each document may have between 5 and 10 updates one after the other.

Recently I noticed something odd, in that the network IN to the secondaries is MUCH higher than the network IN to the primary. Given that the secondaries should only be handling replication and nothing else I'm at a bit of a loss to explain this effect.

Here is the network IN over the last hour to the PRIMARY:

And here is the network IN over the last hour for the two SECONDARIES:

The network IN on the secondaries averages about 200 MB per minute, which is more than three times the 60 MB per minute on the primary.

What could explain this?

MarcF

Oct 21, 2016, 6:21:41 AM

to mongodb-user

Bump

Kevin Adistambha

Oct 24, 2016, 3:01:22 AM

to mongodb-user

Hi,

> I have a mongo replica set where all reads/writes must go via the primary. The majority of the writes to the primary are small UPDATES to relatively large documents (~500kB). Each document may have between 5 and 10 updates one after the other.
>
> Recently I noticed something odd, in that the network IN to the secondaries is MUCH higher than the network IN to the primary. Given that the secondaries should only be handling replication and nothing else I’m at a bit of a loss to explain this effect.

MongoDB replicates by recording every change to the database in the oplog, and it relies on oplog entries being idempotent (i.e. applying an operation once or multiple times results in the same outcome). Because of this, the operation the Primary receives and the operation the Secondaries apply are not necessarily the same. This may account for the difference in network utilization between the Primary and Secondaries that you observed.

As an example, if you have an increment operation on your primary (e.g. db.test.update({_id: 1}, {$inc: {x: 1}})), this operation would be recorded in the oplog as db.test.update({_id: 1}, {$set: {x: <some value>}}). Another example is the $addToSet operation, which is recorded in the oplog as a $set of the whole array on that field.
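
To make the idempotency point concrete, here is a minimal sketch in plain JavaScript (no MongoDB involved). The helpers applyInc and applySet are hypothetical stand-ins for applying an update to a document, not MongoDB internals; they only illustrate why a replayed increment would drift while a replayed set would not.

```javascript
// Hypothetical stand-ins for applying an update to an in-memory document.
// These are illustrations only, not how MongoDB applies oplog entries.
function applyInc(doc, field, amount) {
  return Object.assign({}, doc, { [field]: (doc[field] || 0) + amount });
}

function applySet(doc, field, value) {
  return Object.assign({}, doc, { [field]: value });
}

const original = { _id: 1, x: 5 };

// Replaying an increment twice gives a different result than applying it once.
const incOnce = applyInc(original, "x", 1);
const incTwice = applyInc(incOnce, "x", 1);

// Replaying the equivalent set twice gives the same result as applying it once.
const setOnce = applySet(original, "x", 6);
const setTwice = applySet(setOnce, "x", 6);

console.log(incOnce.x, incTwice.x); // 6 7
console.log(setOnce.x, setTwice.x); // 6 6
```

This is why an increment cannot be written to the oplog as-is: a secondary that replays part of the oplog would end up with a different value than the primary.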

Since you have relatively large documents, depending on the update operation, it may be the case that MongoDB is forced to send the whole document to the Secondaries to enforce the idempotency property of the oplog.
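
As a rough illustration of the size difference, the sketch below compares a small update command with an entry that carries a whole large array. The document shape, the tags field, and the entry layout are all assumptions made up for this example, and JSON length is only a stand-in for BSON size, so treat the numbers as indicative at best.

```javascript
// Rough size comparison using JSON length as a stand-in for BSON size.
// The document shape and the replicated entry layout are assumptions.
function byteLength(value) {
  return Buffer.byteLength(JSON.stringify(value), "utf8");
}

// A hypothetical array field that makes up most of a ~500 kB document.
const tags = [];
for (let i = 0; i < 25000; i++) {
  tags.push("tag-" + String(i).padStart(14, "0"));
}

// What the client sends to the primary: a small update command.
const clientUpdate = {
  q: { _id: 1 },
  u: { $addToSet: { tags: "tag-new" } },
};

// What might be replicated if the whole array is recorded as a set.
const replicatedEntry = {
  o2: { _id: 1 },
  o: { $set: { tags: tags.concat(["tag-new"]) } },
};

const sent = byteLength(clientUpdate);
const replicated = byteLength(replicatedEntry);
console.log("client update bytes:", sent);
console.log("replicated entry bytes:", replicated);
console.log("amplification factor:", Math.round(replicated / sent));
```

Under these assumptions, each small update to the primary turns into a much larger payload for every secondary, which would be consistent with the pattern in your graphs.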

Please note that the upcoming MongoDB 3.4 has a wire protocol compression feature (https://jira.mongodb.org/browse/SERVER-3018) that may reduce the effect of this behaviour.

Best regards,
Kevin

MarcF

Oct 24, 2016, 6:56:28 AM

to mongodb-user

Many kind thanks for the logical explanation for the network differences. I'll try out the compression features once available.