Hello.
I'm running a 3-node replica set on version 2.6.6 (agent 3.3.0.183). When the solution was deployed I enabled an MMS alert to notify me when the Replication Oplog Window drops below X value. The alert fired, and I started investigating why.
The first thing I checked was the profiling level, by opening the shell on each database and executing:
db.getProfilingStatus()
All databases reported 0 (disabled):
{
"was" : 0,
"slowms" : 100
}
Looking at MMS, I can see that the Oplog GB/Hour figure is increasing at a fast rate.
This is not natural growth caused by an increase in clients.
I met with the development team, and they are not aware of any intentional software change that would increase storage transactions, although we have not ruled out a bug somewhere.
April Date   Oplog per hour   Increase
1            20 MB
2            30 MB            +10
3            30 MB
4            30 MB
5            30 MB
6            30 MB
7            30 MB
8            30 MB
9            30 MB
10           30 MB
11           30 MB
12           40 MB            +10
13           40 MB
14           50 MB            +10
15           50 MB
16           50 MB
17           50 MB
18           50 MB
19           70 MB            +20
20           120 MB           +50
21           180 MB           +50
22           230 MB           +50
23           310 MB           +80
24           650 MB           +340
The result is that the oplog window has dropped from 2000+ hours to 70 hours since 22nd Feb, with the largest drop over the last 5 days, as you can see from the figures above.
When I query or mongoexport the oplog.rs collection, there are currently no clear patterns in the results.
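For reference, the relationship between the hourly oplog rate and the window can be sketched as a simple division. This is only a rough model: it assumes a fixed-size oplog and a steady write rate. The oplog size is not stated anywhere above, so the 45,500 MB figure below is an assumption inferred from 650 MB/hour and a 70 hour window, and the function name is just illustrative.

```javascript
// Rough model: window (hours) = oplog size / write rate per hour.
// Assumes a fixed-size oplog and a steady write rate.
function oplogWindowHours(oplogSizeMB, writeRateMBPerHour) {
  return oplogSizeMB / writeRateMBPerHour;
}

// ASSUMPTION: the oplog size is inferred from the latest figures
// (650 MB/hour with a 70 hour window), not read from the server.
var impliedOplogSizeMB = 650 * 70;

var windowAtCurrentRate = oplogWindowHours(impliedOplogSizeMB, 650);
var windowAtOriginalRate = oplogWindowHours(impliedOplogSizeMB, 20);
```

At the original 20 MB/hour the same oplog would cover well over 2000 hours, which matches what the alert history shows, so the shrinking window appears to be explained by the write rate alone rather than by a change in oplog size.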
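One way to look for a pattern is to group the oplog entries by namespace (ns) and operation type (op) and see which combination accounts for most of the volume. The following is a minimal sketch, not a definitive tool. The function name summarizeOplog and the sample namespaces (app.sessions, app.orders) are hypothetical, and the byte count is only an approximation based on the JSON length of each entry.

```javascript
// Sketch: total up oplog entries per namespace and operation type.
// `entries` is an array of oplog documents; each has at least
// `ns` (namespace) and `op` ("i" insert, "u" update, "d" delete, ...).
function summarizeOplog(entries) {
  var totals = {};
  entries.forEach(function (e) {
    var key = e.ns + " " + e.op;
    if (!totals[key]) {
      totals[key] = { ns: e.ns, op: e.op, count: 0, approxBytes: 0 };
    }
    totals[key].count += 1;
    // Approximation only: JSON length, not the real BSON size.
    totals[key].approxBytes += JSON.stringify(e).length;
  });
  // Largest contributors first.
  return Object.keys(totals)
    .map(function (k) { return totals[k]; })
    .sort(function (a, b) { return b.approxBytes - a.approxBytes; });
}

// Hypothetical sample data, standing in for real oplog.rs documents.
var sample = [
  { op: "u", ns: "app.sessions", o: { $set: { lastSeen: 1 } } },
  { op: "u", ns: "app.sessions", o: { $set: { lastSeen: 2 } } },
  { op: "i", ns: "app.orders", o: { _id: 1 } }
];
var summary = summarizeOplog(sample);
```

In the mongo shell the input could come from something like db.getSiblingDB("local").oplog.rs.find().sort({$natural: -1}).limit(100000).toArray(), and Object.bsonsize(e) would likely give a more accurate size than the JSON length. Comparing a summary from a recent day against one from an earlier day might show which collection or operation type is behind the growth.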
I am looking for suggestions to help debug this issue please.
Thanks for any help
Scott