Hello everyone,
While using a master/slave cluster to serve my web application, I get the error "can't create new thread, closing connection". I have dug around a lot but haven't found a solution. If anyone has run into the same issue, could you please shed some light on it?
Here are the details of my issue:
I'm using C# driver v1.1.0.4184 to connect to a slave instance running MongoDB 2.0.7 (the master instance runs 2.0.2), with the options maxPoolSize=300;slaveOk=true.
When the web server goes online, I can see from the log that the number of connections increases very rapidly. When it reaches the max user processes limit (ulimit -u = 1024), I begin to see the error:
Fri Sep 28 06:37:21 [initandlisten] connection accepted from xx.xx.xx.xx:64034 #1073 (1014 connections now open)
Fri Sep 28 06:37:21 [initandlisten] pthread_create failed: errno:11 Resource temporarily unavailable
Fri Sep 28 06:37:21 [initandlisten] can't create new thread, closing connection
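For anyone who wants to follow the connection count over time, here is a minimal sketch that pulls the numbers out of "connection accepted" lines like the one above. The regular expression is based only on the excerpt shown here, and the function names are made up for illustration; other mongod versions may format this line differently.

```python
import re

# Pattern modelled on the "connection accepted" line in the excerpt above.
# Assumption: other lines of the same kind follow this exact layout.
ACCEPTED = re.compile(
    r"\[initandlisten\] connection accepted from (?P<peer>\S+) "
    r"#(?P<conn_id>\d+) \((?P<open>\d+) connections? now open\)"
)


def open_connection_counts(lines):
    """Yield (connection id, open connection count) per accepted connection."""
    for line in lines:
        match = ACCEPTED.search(line)
        if match:
            yield int(match.group("conn_id")), int(match.group("open"))


def peak_open_connections(lines):
    """Return the highest open-connection count seen, or 0 if none matched."""
    return max((count for _, count in open_connection_counts(lines)), default=0)


sample = [
    "Fri Sep 28 06:37:21 [initandlisten] connection accepted from "
    "xx.xx.xx.xx:64034 #1073 (1014 connections now open)",
    "Fri Sep 28 06:37:21 [initandlisten] pthread_create failed: "
    "errno:11 Resource temporarily unavailable",
]
print(peak_open_connections(sample))
```

Feeding it the whole log file (for example via open(path) as the lines argument) would show how close the open count gets to the process limit before the pthread_create failures start.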
After some internet searching, I decided to increase "ulimit -u" to 20480. After that I noticed another message in the log. I can't find the exact log line now, but it said something like MongoDB is releasing unused connections. When I see this message, the connection count decreases very quickly. However, new connections are created even more quickly, so after a longer period I end up with the same error again.
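The pthread_create failure in the log suggests that mongod starts a thread for each accepted connection, so the thread count and the "ulimit -u" value are worth comparing directly. Below is a rough sketch of that check, assuming a Linux host where /proc/<pid>/status has a "Threads:" line. The helper names are hypothetical, and since the limit applies per user, the thread count of one process is only a lower bound on what counts against it.

```python
import os
import resource


def nproc_limits():
    """Return the (soft, hard) RLIMIT_NPROC pair; `ulimit -u` shows the soft one."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return None


def headroom(soft_limit, threads_in_use):
    """Threads still available under the soft limit, or None if unlimited."""
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - threads_in_use


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print("soft limit:", soft, "hard limit:", hard)
    # Inspects this script's own process; pass mongod's pid instead to check the server.
    status_path = f"/proc/{os.getpid()}/status"
    if os.path.exists(status_path):
        with open(status_path) as status_file:
            in_use = thread_count(status_file.read())
        print("threads in use:", in_use, "headroom:", headroom(soft, in_use))
```

With the numbers from this post, a limit of 1024 and roughly 1014 connection threads plus mongod's own background threads leaves almost no headroom, which matches the point where the errors begin.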
Then I tried stopping master/slave replication and making both servers masters. This time everything works fine, and the connection count always stays below 250.
Here's what I think:
Judging from the log, I think the C# driver doesn't know it has already created enough connections in the connection pool and instead keeps creating new ones. The old ones are then never used again, and after a while MongoDB decides the old connections are no longer in use and releases them. That would explain the second message.
What I haven't figured out is why this issue only happens with the slave instance. Does the C# driver use MongoDB to store its current connection pool size? If so, it could never write that count to a slave instance, and I guess that's why it keeps creating new connections: it can't tell how many have already been created.
Thank you for reading my long post. I really hope someone can help me figure out a solution.