First problem: "connection refused because too many open connections: 20000"
At the moment, mongod has a hard-wired limit on the number of open connections.
File handles are used for two purposes: the database files, which are
memory-mapped files (mentioned above), and the connections between
client drivers (your mongos instances in this case) and the servers. Some
drivers aren't as good as others at connection handling at this time, and may
use a server connection for every concurrent client connection, even from the
same process. So, if you're running Apache in a multi-threaded configuration, a
single child process may be holding open many connections to the server.
Check the web page above and check on your database size relative to the
required number of open file handles, as well as the number of connections
generated by your web traffic. Also check to make sure you are using the
appropriate options for your driver to use connection pooling, as this should
minimize the number of connections it uses.
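As a rough way to reason about this, here is a back-of-the-envelope sketch of the file-handle accounting. All of the numbers are illustrative assumptions; substitute your own database size, file size, and client counts:

```python
# Rough estimate of the file handles a mongod needs. Every number here
# is an illustrative assumption -- measure your own deployment.

def required_file_handles(data_files, client_connections, overhead=32):
    """Data files are memory-mapped (one handle each); each client
    connection also consumes a handle; 'overhead' is a guess covering
    sockets, logs, and other descriptors the process holds."""
    return data_files + client_connections + overhead

# e.g. a database split across 30 data files, plus 20 Apache children
# each holding 50 pooled connections:
handles = required_file_handles(data_files=30, client_connections=20 * 50)
print(handles)  # 1062
```

If that total approaches your connection or descriptor limit, either raise the limit or tighten the driver's pooling.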
Second problem: "once a secondary gets into the recovering state, it never recovers"
Your oplog may be too small. The absolute size (you state it is 4GB) isn't
what matters; the important thing is how much activity it can hold for the
duration required for that activity to be replicated. The oplog is a capped
collection. Capped collections are fixed in size and used in a circular
fashion: once filled, writing wraps around to the beginning again.
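For reference, creating an ordinary capped collection in the mongo shell looks like this (the collection name and size are just illustrative; size is in bytes):

```
> db.createCollection("mylog", {capped: true, size: 4 * 1024 * 1024})
```

The oplog is created for you as a capped collection of the configured size when replication starts up.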
Replication works by opening a tailable cursor against the oplog. If that
cursor is interrupted for some reason, the replica begins to fall behind. That
is fine as long as, when the replica does reconnect to the primary, it is able
to pick up replication where it left off. However, if the oplog has wrapped in
the meantime, the replica cannot continue replication, and it goes into the
recovering state.
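To make the failure mode concrete, here is a small Python sketch (not MongoDB code) that models the oplog as a ring buffer identified by an operation counter; once the writer laps the reader's saved position, the reader can no longer resume:

```python
# Toy model of a capped oplog. Not MongoDB code -- just an illustration
# of why a secondary that falls too far behind cannot resume tailing.

class Oplog:
    def __init__(self, capacity):
        self.capacity = capacity   # how many operations the oplog retains
        self.next_op = 0           # id of the next operation to be written

    def append(self, n=1):
        self.next_op += n

    def oldest_available(self):
        # Once the buffer wraps, the earliest retained op slides forward.
        return max(0, self.next_op - self.capacity)

    def can_resume_from(self, position):
        # A secondary can resume only if its saved position is still
        # present in the buffer.
        return position >= self.oldest_available()

oplog = Oplog(capacity=100)
oplog.append(50)
saved = oplog.next_op          # secondary disconnects here, at op 50
oplog.append(120)              # 120 more ops arrive; the buffer wraps
                               # past the saved position
print(oplog.can_resume_from(saved))  # False -> full resync required
```

The real oplog stores timestamped operations rather than a bare counter, but the resume condition is the same idea.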
If that should happen, the replica will attempt to dump its current data
and copy a fresh, up-to-date set from the primary. When it begins this
process, it takes note of the primary's current oplog position. When the
copy is complete, the secondary again tries to begin replication from the
oplog position noted before the copy began. If there is a lot of mutation
activity on the primary, it is possible that the oplog will have wrapped
during the copy, and we're back to the same problem: the secondary
cannot replicate from the primary. This loop can repeat indefinitely when
the mutation rate on the primary is high enough that the oplog wraps often.
There's no set formula I can give you for oplog size. You need to measure
your mutation rate, see how fast your application consumes oplog space,
and then size the oplog appropriately, taking into account the window of
"repair time" that you want to allow yourself. In other words: how long are
you willing to take to repair your system and get replication working again?
The oplog must be big enough to accommodate that window at your rate of
oplog use. It sounds like yours may be too small, and you've gotten stuck
in the recovery loop described above.
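The arithmetic itself is simple once you have the measurements. The rate and window below are assumptions for illustration; measure your own rate by watching how quickly your oplog's time span shrinks under load:

```python
# Sketch of the oplog sizing arithmetic. The inputs are assumptions --
# measure your own mutation rate before sizing anything.

def min_oplog_gb(oplog_gb_per_hour, repair_window_hours, safety_factor=2.0):
    """Size the oplog to cover the longest outage-plus-resync window
    you want to survive, with some headroom for bursts."""
    return oplog_gb_per_hour * repair_window_hours * safety_factor

# e.g. burning 1.5 GB of oplog per hour, and wanting to tolerate an
# 8-hour repair window:
print(min_oplog_gb(1.5, 8))  # 24.0
```

By that measure, a 4GB oplog only buys you a little over an hour at that (hypothetical) rate.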
Next, you say you rsync'ed your primary's files to your secondary, and that
you then started getting data errors in your mongod logs. That's not too
surprising: it is not safe to copy the database files unless they have been
flushed and locked as described for backup procedures here:
The errors are most likely the result of your copying the files without
locking and flushing (syncing) them first. A repair might work, but it might
not. The safest strategy, since they are secondaries, would be to stop
activity on the primary, and use proper backups of it to create new
secondaries.
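For reference, the flush-and-lock step in the mongo shell looks something like this (newer shells also provide db.fsyncLock()/db.fsyncUnlock() helpers; check the docs for your version):

```
> use admin
> db.runCommand({fsync: 1, lock: 1})   // flush to disk and block writes
  ... copy the database files ...
> db.fsyncUnlock()                     // release the lock
```

Only files copied while the lock is held are safe to use as a backup.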
In order to resize your oplog, you need to drop it and recreate it. You can
use the commands from the page on Capped Collections above, and also these
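As a sketch only (verify the exact procedure against the docs for your version before running anything), recreating the oplog at a new size looks roughly like this, run on the node while it is out of the replica set:

```
// Sketch -- collection name and size (in bytes) are illustrative.
> use local
> db.oplog.rs.drop()
> db.createCollection("oplog.rs", {capped: true, size: 8 * 1024 * 1024 * 1024})
```

After restarting with replication enabled, the node will resync and begin using the larger oplog.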
For future reference, use the --logappend option for your log files, as
described here:
This avoids overwriting (and losing) log file contents in the course of
database restarts.
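For example (the log path here is just an illustration):

```
mongod --logpath /var/log/mongodb/mongod.log --logappend
```

With --logappend, each restart appends to the existing file instead of truncating it.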
Here are some additional questions for you:
(*) Why are you running --repair on your primaries, when the data errors are
only being reported on your secondaries?
(*) Where are your mongos processes located?
Chris