I'll try to describe the system.
1) There is one process (WinService) which runs multiple threads and
inserts about 30 items per second into the first collection.
We created a manual pool (a simple array of mongo client objects
initialized in advance), and each thread gets a different object. We
did it that way because the previous client we used, 'samus', was not
thread-safe, which caused a bottleneck when all threads shared the
same object; we didn't know whether the 10gen client supports
multithreading, so it still works this way. The maximum size of this
array is 50, so no more than 50 objects will be created. We don't know
the exact number of threads being used because we use the .NET 4
parallel extensions for this.
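As a rough illustration, the manual pool described above could look something like the sketch below. The name ManualClientPool is hypothetical, and plain `object` stands in for the pre-created mongo client instances our wrapper actually holds:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical sketch of the manual pool: at most 50 client objects are
// created up front, and each worker thread takes a different one.
// Plain "object" is a placeholder for the real mongo client instances.
class ManualClientPool
{
    private const int MaxSize = 50;
    private readonly ConcurrentBag<object> _clients = new ConcurrentBag<object>();

    public ManualClientPool()
    {
        for (int i = 0; i < MaxSize; i++)
            _clients.Add(new object()); // pre-created client placeholder
    }

    // Waits (by yielding) until one of the 50 clients is free,
    // so no thread ever creates a 51st client.
    public object Take()
    {
        object client;
        while (!_clients.TryTake(out client))
            Thread.Yield();
        return client;
    }

    public void Return(object client)
    {
        _clients.Add(client);
    }
}
```

Each thread launched by the parallel extensions would Take() a client, do its inserts, and Return() it when done, which is how the 50-object cap is enforced.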
2) There are 1250 processes which read from the first collection, do
some work, and insert into the second collection. The average load is
10 reads/writes per second. Each of these processes creates a mongo
client object when it starts working and disposes of it (disconnects)
when the work is done.
3) There are 270 processes which read from the second collection, do
some work, and report to a relational DB at the end. The average load
is 10 reads/writes per second. Each of these processes creates a mongo
client object when it starts working and disposes of it (disconnects)
when the work is done.
Here is a snippet from our wrapper that creates the client object:

MongoConnectionStringBuilder mcsb =
    new MongoConnectionStringBuilder("server=Machine1:28010");
mcsb.SafeMode = SafeMode.True;
mcsb.SlaveOk = true;
MongoServer server = MongoServer.Create(mcsb);
MongoDatabase db = server["MyDB"];
We don't call the Connect command explicitly because the driver calls
it implicitly. So we just call the insert command (in this example we
insert the object 'value'):
byte[] objectBytes = SerializeObject(value);
BsonDocument doc = new BsonDocument();
doc["_id"] = key; // a long number converted to string
doc["v"] = new BsonBinaryData(objectBytes);
doc["dt"] = DateTime.Now;
db[CollectionName].Insert(doc); // uses the db instance from above
We used to have replication, but after the problem started we removed
it (we thought it might have caused the problem, but we were wrong).
We don't use sharding.
This question is related to my previous question regarding Save and
Insert. We had very slow times after the upgrade and thought the
slowness caused the error (because an update is much heavier than an
insert). After changing to Insert we got much better times, but the
errors still remain.
Wow, that was a long one :)
thanks for all your help,
--Idan.