First problem: "connection refused because too many open connections: 20000"
At the moment, mongod has a hard-wired limit on the number of open connections.
File handles are used for two purposes: the database files, which are
memory-mapped files (mentioned above), and the connections between
client drivers (your mongos instances in this case) and the servers. Some
drivers aren't as good as others at connection handling at this time, and may
use a server connection for every concurrent client connection, even from the
same process. So, if you're running Apache in a multi-threaded configuration, a
single child process may be holding open many connections to the server.
Check the web page above and check on your database size relative to the
required number of open file handles, as well as the number of connections
generated by your web traffic. Also check to make sure you are using the
appropriate options for your driver to use connection pooling, as this should
minimize the number of connections it uses.
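As a rough way to reason about this, here is a back-of-the-envelope sketch of the file-handle accounting. All of the numbers are illustrative assumptions; substitute your own database size, file size, and client counts:

```python
# Rough estimate of the file handles a mongod needs. Every number here
# is an illustrative assumption -- measure your own deployment.

def required_file_handles(data_files, client_connections, overhead=32):
    """Data files are memory-mapped (one handle each); each client
    connection also consumes a handle; 'overhead' is a guess covering
    sockets, logs, and other descriptors the process holds."""
    return data_files + client_connections + overhead

# e.g. a database split across 30 data files, plus 20 Apache children
# each holding 50 pooled connections:
handles = required_file_handles(data_files=30, client_connections=20 * 50)
print(handles)  # 1062
```

If that total approaches your connection or descriptor limit, either raise the limit or tighten the driver's pooling.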
Second problem: "once a secondary gets into the recovering state, it never recovers"
Your oplog may be too small. The absolute size (you state it is 4GB) isn't
what matters; the important thing is how much activity it can hold for the
duration required for that activity to be replicated. The oplog is a capped
collection. Capped collections are fixed in size and used in a circular
fashion: once filled, writing wraps around to the beginning again.
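For reference, creating an ordinary capped collection in the mongo shell looks like this (the collection name and size are just illustrative; size is in bytes):

```
> db.createCollection("mylog", {capped: true, size: 4 * 1024 * 1024})
```

The oplog is created for you as a capped collection of the configured size when replication starts up.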
Replication works by opening a tailable cursor against the oplog. If that
cursor is interrupted for some reason, the replica begins to fall behind. That
is fine as long as, when the replica does reconnect to the primary, it is able
to pick up replication where it left off. However, if the oplog has wrapped in
the meantime, the replica cannot continue replication, and it goes into the
recovering state.
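To make the failure mode concrete, here is a small Python sketch (not MongoDB code) that models the oplog as a ring buffer identified by an operation counter; once the writer laps the reader's saved position, the reader can no longer resume:

```python
# Toy model of a capped oplog. Not MongoDB code -- just an illustration
# of why a secondary that falls too far behind cannot resume tailing.

class Oplog:
    def __init__(self, capacity):
        self.capacity = capacity   # how many operations the oplog retains
        self.next_op = 0           # id of the next operation to be written

    def append(self, n=1):
        self.next_op += n

    def oldest_available(self):
        # Once the buffer wraps, the earliest retained op slides forward.
        return max(0, self.next_op - self.capacity)

    def can_resume_from(self, position):
        # A secondary can resume only if its saved position is still
        # present in the buffer.
        return position >= self.oldest_available()

oplog = Oplog(capacity=100)
oplog.append(50)
saved = oplog.next_op          # secondary disconnects here, at op 50
oplog.append(120)              # 120 more ops arrive; the buffer wraps
                               # past the saved position
print(oplog.can_resume_from(saved))  # False -> full resync required
```

The real oplog stores timestamped operations rather than a bare counter, but the resume condition is the same idea.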
If that should happen, the replica will attempt to dump its current data
and copy a fresh, up-to-date set from the primary. When it begins this
process, it takes note of the primary's current oplog position. When the
copy is complete, the secondary again tries to begin replication from the
oplog position noted before the copy began. If there is a lot of mutation
activity on the primary, it is possible that the oplog will have wrapped
during the copy, and we're back to the same problem: the secondary
cannot replicate from the primary. This loop can repeat indefinitely when
the mutation rate on the primary is high enough that the oplog wraps often.
There's no set formula I can give you for oplog size. You need to measure
your mutation rate, see how fast your application consumes oplog space,
and then size the oplog appropriately, taking into account the window of
"repair time" that you want to allow yourself. In other words: how long are
you willing to take to repair your system and get replication working again?
The oplog must be big enough to accommodate that window at your rate of
oplog use. It sounds like yours may be too small, and you've gotten stuck
in the recovery loop described above.
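The arithmetic itself is simple once you have the measurements. The rate and window below are assumptions for illustration; measure your own rate by watching how quickly your oplog's time span shrinks under load:

```python
# Sketch of the oplog sizing arithmetic. The inputs are assumptions --
# measure your own mutation rate before sizing anything.

def min_oplog_gb(oplog_gb_per_hour, repair_window_hours, safety_factor=2.0):
    """Size the oplog to cover the longest outage-plus-resync window
    you want to survive, with some headroom for bursts."""
    return oplog_gb_per_hour * repair_window_hours * safety_factor

# e.g. burning 1.5 GB of oplog per hour, and wanting to tolerate an
# 8-hour repair window:
print(min_oplog_gb(1.5, 8))  # 24.0
```

By that measure, a 4GB oplog only buys you a little over an hour at that (hypothetical) rate.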
Next, you say you rsync'ed your primary's files to your secondary, and that
you then started getting data errors in your mongod logs. That's not too
surprising: it is not safe to copy the database files unless they have been
flushed and locked as described for backup procedures here:
The errors are most likely the result of your copying the files without
locking and flushing (syncing) them first. A repair might work, but it might
not. The safest strategy, since they are secondaries, would be to stop
activity on the primary, and use proper backups of it to create new
secondaries.
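For reference, the flush-and-lock step in the mongo shell looks something like this (newer shells also provide db.fsyncLock()/db.fsyncUnlock() helpers; check the docs for your version):

```
> use admin
> db.runCommand({fsync: 1, lock: 1})   // flush to disk and block writes
  ... copy the database files ...
> db.fsyncUnlock()                     // release the lock
```

Only files copied while the lock is held are safe to use as a backup.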
In order to resize your oplog, you need to drop it and recreate it. You can
use the commands from the page on Capped Collections above, and also these
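As a sketch only (verify the exact procedure against the docs for your version before running anything), recreating the oplog at a new size looks roughly like this, run on the node while it is out of the replica set:

```
// Sketch -- collection name and size (in bytes) are illustrative.
> use local
> db.oplog.rs.drop()
> db.createCollection("oplog.rs", {capped: true, size: 8 * 1024 * 1024 * 1024})
```

After restarting with replication enabled, the node will resync and begin using the larger oplog.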
For future reference, use the --logappend option for your log files, as
described here:
This avoids overwriting (and losing) log file contents in the course of
database restarts.
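For example (the log path here is just an illustration):

```
mongod --logpath /var/log/mongodb/mongod.log --logappend
```

With --logappend, each restart appends to the existing file instead of truncating it.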
Here are some additional questions for you:
(*) Why are you running --repair on your primaries, when the data errors are
only being reported on your secondaries?
(*) Where are your mongos processes located?
Chris