mongodb 1.8.1 didn't release the connections


Fontana

Aug 1, 2011, 10:30:14 AM8/1/11
to mongodb-user
Hi,
I have a MongoDB database with 2 capped collections.
The load is about 30 writes and 30 reads per second.
We recently upgraded our system to use MongoDB 1.8.2 and 10gen C#
client 1.1
Now, after the system has been running for some time (it can be 12 hours
or 5 minutes) we get errors from the MongoDB server:

"A connection attempt failed because the connected party did not
properly respond after a period of time, or established connection
failed because connected host has failed to respond"

We ran the serverStatus() command and saw that the number of available
connections keeps getting lower until it reaches 0 (20000 current and 0
available).
We also looked at the MongoDB log and saw that at some point the query
times became longer and the connections that were opened didn't close.
However, when I look at the machine's connections (using the netstat
command) I see only 5-7 connections.
The same system worked with the same load and configuration for 10
months without any problem with version 1.6.

BTW, I noticed there's also an issue about something similar but I'm
not sure it's the same problem:
https://jira.mongodb.org/browse/SERVER-3146

Any ideas what it could be?

Thanks,
Idan.

Alvin Richards

Aug 1, 2011, 10:40:04 AM8/1/11
to mongodb-user
Can you paste the output of

db.adminCommand("connPoolStats")


-Alvin

Fontana

Aug 1, 2011, 10:59:13 AM8/1/11
to mongodb-user
We had to solve this problem ASAP, so we downgraded the server back to
1.6 two hours ago (the client is still 10gen 1.1).
For now it works fine, but I can't run the command you asked for.

Robert Stam

Aug 1, 2011, 11:01:01 AM8/1/11
to mongodb-user
The C# driver connection pool has a default maximum size of 100
connections.

How many client processes are involved? Can you show the code that is
connecting to MongoDB? Are replica sets or sharding in the picture?

Is this related to your other question about using Save on a capped
collection when you have already assigned a value to the _id (so Save
calls Update instead of Insert, and Update on a capped collection is
slow because there is no index on the _id)? If so, does the problem go
away if you call Insert instead of Save?
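Robert's point about Update vs Insert can be sketched with a toy Python model (my own illustration, with made-up names and sizes, not MongoDB internals): without an index on _id, locating the document to update means scanning the collection, while an index makes it a single lookup.

```python
# Toy model of the cost difference (not MongoDB internals): the names
# and sizes here are made up for illustration.

def scan_lookup(docs, key):
    """Unindexed lookup: examine documents until the _id matches."""
    comparisons = 0
    for doc in docs:
        comparisons += 1
        if doc["_id"] == key:
            return doc, comparisons
    return None, comparisons

def build_index(docs):
    """One-time index on _id: maps _id -> document."""
    return {doc["_id"]: doc for doc in docs}

docs = [{"_id": str(i), "v": i} for i in range(10000)]

# Without an index, updating the last document costs one comparison
# per document in the collection.
_, cost = scan_lookup(docs, "9999")
print(cost)  # 10000

# With an index on _id, the same document is one hash probe away.
index = build_index(docs)
print(index["9999"]["v"])  # 9999
```

An Insert, by contrast, just appends to the capped collection, which is why switching from Save to Insert sidesteps the scan entirely.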

Fontana

Aug 1, 2011, 11:44:25 AM8/1/11
to mongodb-user
I'll try to describe the system.

1) There is one process (a WinService) which runs with threads and
inserts about 30 items per second into the first collection.
We created a manual pool (a simple array of mongo client objects
initialized in advance) and each thread gets a different object. (We
did it that way because the previous client we used, 'samus', was not
multithreaded, which caused a bottleneck when all threads used the
same object. We didn't know whether the 10gen client supports
multithreading, so it still works this way.) The maximum size of this
array is 50, so no more than 50 objects will be created. We don't know
the exact number of threads being used because we use the .NET 4
parallel extensions for this.

2) There are 1250 processes which read from the first collection, do
some work, and insert into the second collection. The average load is
10 reads/writes per second. Each of these processes creates a mongo
client object when it starts working and disposes of it (Disconnect)
when the work is done.

3) There are 270 processes which read from the second collection, do
some work, and report to a relational DB at the end. The average load
is 10 reads/writes per second. Each of these processes creates a mongo
client object when it starts working and disposes of it (Disconnect)
when the work is done.

Here is a snippet from our wrapper which creates the client object:

MongoConnectionStringBuilder mcsb =
    new MongoConnectionStringBuilder("server=Machine1:28010");
mcsb.SafeMode = SafeMode.True;
mcsb.SlaveOk = true;
MongoServer server = MongoServer.Create(mcsb);
MongoDatabase db = server["MyDB"];

We don't call the Connect command explicitly because the driver knows
to call it implicitly, so we just call the Insert command (in this
example we insert the object 'value'):

byte[] objectBytes = SerializeObject(value);
BsonDocument doc = new BsonDocument();
doc["_id"] = key; // a long number converted to a string
doc["v"] = new BsonBinaryData(objectBytes);
doc["dt"] = DateTime.Now;
db[CollectionName].Insert(doc);

We used to have replication, but after the problem started we removed
it (we thought it might have caused the problem, but we were wrong).
We don't use sharding.

This question is related to the previous question regarding Save and
Insert.
We had very slow times after the upgrade, and we thought the slowness
caused the error (because an update is much heavier than an insert).
After changing to Insert we got much better times, but the errors
still remain.

Wow, that was a long one :)
thanks for all your help,
--Idan.

Robert Stam

Aug 1, 2011, 12:01:20 PM8/1/11
to mongodb-user
1) What is the datatype of the mongo client objects? How are they
created? Since the default connection pool size is 100, if you really
created 50 different connection pools you could potentially be opening
5000 connections.

2) Are the 1250 processes multithreaded? It is not recommended that
you call Disconnect because that interferes with connection pooling
(all connections are closed and must be reopened again later).

3) Are these 270 processes multithreaded? Again, it's better to not
call Disconnect.

MongoServer.Create always returns the same instance of MongoServer
when passed the same settings, so even if you are calling
MongoServer.Create 50 times it is probably returning the same instance
of MongoServer every time.
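The "same settings, same instance" behavior described above can be sketched in Python (a toy of my own, not the actual driver code; names and the connection string are illustrative):

```python
# Toy sketch (not the actual driver) of "same settings -> same instance":
# repeated create() calls with identical settings share one cached object.

_instances = {}

class MongoServerSketch:
    def __init__(self, connection_string):
        self.connection_string = connection_string

def create(connection_string):
    """Return the cached instance for previously seen settings."""
    if connection_string not in _instances:
        _instances[connection_string] = MongoServerSketch(connection_string)
    return _instances[connection_string]

a = create("server=Machine1:28010")
b = create("server=Machine1:28010")
print(a is b)  # True: even 50 such calls would share one instance
```

So the manual array of 50 client objects most likely collapses into a single shared server object and a single connection pool.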

Disconnect closes sockets on a background thread, so calling
Disconnect frequently may cause connections to build up rapidly if new
ones are being opened before the old ones are closed.
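The buildup Robert describes can be illustrated with a small Python simulation (my own toy model with made-up numbers, not driver behavior): if closed sockets linger for a few "cycles" before the background closer releases them, several generations of pools stay open at once.

```python
# Toy model: Disconnect closes sockets asynchronously, so if new
# connections open before old ones finish closing, the total climbs.
# All numbers here are illustrative, not measured driver behavior.

def simulate(cycles, pool_size, close_lag):
    """Each cycle a fresh pool of sockets opens and the old one is
    queued for closing; queued sockets close close_lag cycles later."""
    pending = []       # (cycle_when_it_actually_closes, socket_count)
    open_count = 0
    peak = 0
    for t in range(cycles):
        # the background closer finally releases old sockets
        still_pending = []
        for close_at, n in pending:
            if close_at <= t:
                open_count -= n
            else:
                still_pending.append((close_at, n))
        pending = still_pending
        # a new pool opens immediately ...
        open_count += pool_size
        peak = max(peak, open_count)
        # ... and Disconnect queues it to close later
        pending.append((t + close_lag, pool_size))
    return peak

# Closing promptly keeps at most one pool's worth of sockets open;
# a longer close lag lets several generations coexist.
print(simulate(100, 100, 1), simulate(100, 100, 3))  # 100 300
```

With many processes each disconnecting and reconnecting per work item, this kind of overlap is one plausible way to approach the server's 20000-connection ceiling.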

Fontana

Aug 1, 2011, 2:32:07 PM8/1/11
to mongodb-user
1) I'm creating them using the snippet I wrote. According to what you
wrote, I'm actually using the same object across all threads.
2) The 1250 processes are not multithreaded. They are actually started
and then killed by a watchdog (which then starts a new one in place of
the one that just died).
However, each of them calls Disconnect before it dies.
3) The 270 processes are not multithreaded. Each of them creates a
client (again, with the snippet I wrote). They don't get killed, but
they also call Disconnect after each piece of work they do.

So what do you suggest? Should we not call Disconnect?
But won't that still leave the connections open? What should we do?

Thanks for the help,
--Idan.

Robert Stam

Aug 1, 2011, 2:37:51 PM8/1/11
to mongodb-user
My concern with calling Disconnect was whether it was being called
repeatedly within the same process. Sounds like you are not doing that
in the first process, but that you are doing that in the second
process. Verify that the first process is not calling Disconnect and
change the second process so that it doesn't.

Calling Disconnect just before a process exits is OK but not strictly
necessary since the operating system will close all open sockets
automatically when the process exits.

You almost never need to call Disconnect.

Fontana

Aug 2, 2011, 2:41:51 AM8/2/11
to mongodb-user
Well, it happened again at night.
We got the connection error messages again and the whole system just
stopped working.
We restarted the mongo server and all the processes, but it will
probably happen again in a few hours.
It seems the MongoDB server wasn't the issue.
Now all that's left is taking care of the client and not calling
Disconnect, as you advised.
I hope that works; otherwise we'll have to roll back the client as
well (and hope that solves it...)

Thanks again,
--Idan.

Andrew Kempe

Aug 2, 2011, 6:46:01 AM8/2/11
to mongod...@googlegroups.com
This sounds like the same boost mutex problem I encountered.

Robert, talk to Greg Studer/Dwight as they have more info on this and a
possible solution.

Andrew


Greg Studer

Aug 2, 2011, 6:05:39 PM8/2/11
to mongod...@googlegroups.com
Definitely let us know if not calling disconnect helps with your
problem. The issue Andrew is referring to might be the cause, but it
would only apply if you were running mongod on a win64 server - is
that the case?

Fontana

Aug 3, 2011, 2:10:22 AM8/3/11
to mongodb-user
Yes, that is the case.
For now we rolled back to the previous client until we change our
wrapper to not call Disconnect.
I'll let you know if it helped.
We're still experiencing problems with establishing connections and
slow read/write times.
The load is about 30-40 reads/writes per second, and the items are 15K
on average. Is that too much load? We only have 2 capped collections
in our database (200MB and 800MB).
Could the problem we're experiencing have to do with the fact that we
don't run with replication anymore? (We used to, but then we thought
it might be causing the problem and removed it.)

Thanks,
--Idan.

Greg Studer

Aug 3, 2011, 10:51:52 AM8/3/11
to mongod...@googlegroups.com
> Is it too much load?
Doesn't sound like it, but it's hard to know without the specifics of
your setup. A place to start is running mongostat and iostat, to
figure out whether operations are queueing or there's a lot of disk
I/O going on. I'd also check the mongod log for long-running queries
that take hundreds of ms.

> Can the problem we're experiencing has to do with the fact that we
> don't run with replication anymore? (we used to, but than we thought
> it might cause the problem and we removed it)

Don't think so; replication has some small effects in terms of writing
to the oplog, but nothing like the issues you're seeing.

Fontana

Aug 4, 2011, 3:03:51 AM8/4/11
to mongodb-user
OK, it looks like things are finally back to normal.
Save and load times are good, and we almost don't see any locks.
The problem was the lack of an index.
We know it's recommended not to have indexes on capped collections,
but having one seems to work for our scenario.
We actually thought we had an index, because when we created the
capped collection we specified "autoIndexId:true", but we later
learned it had no effect on the collection's index (maybe because this
option doesn't work if the collection is capped).
So we added the index using the "ensureIndex" command
( db.MyCollectionk.ensureIndex({_id:1}) ), but we called
"repairDatabase()" right after it, which seems to eliminate the index
(anybody know why?).
Finally, after figuring out that we were not really running with an
index, we called "ensureIndex" without calling "repairDatabase()", and
things started working fine.
However, we still use the old client (samus) and the previous server
version (1.6), because we rolled back the system, so there's still
work to be done there.

Thanks for all the help
--Idan.

Greg Studer

Aug 4, 2011, 12:55:21 PM8/4/11
to mongod...@googlegroups.com
> So we added it using the "ensureIndex" command
> ( db.MyCollectionk.ensureIndex({_id:1}) ), but we called
> "repairDatabase()" right after it which seems to eliminate it

Looks like that may be a bug; I opened a JIRA for it. The behavior
will probably be the same in 1.8, just FYI.
