Excellent! Thanks for the detailed report and the patch. I'll see if I can
verify it here and get it applied asap. Might move the mutex to the
threads file, but the idea is sound.
Since you're using the clock event to re-enable connections, this means
that hitting maxconns will lock up memcached for up to one full second in
the worst case?
Perhaps a condition variable could be used to wake up the main thread
faster?
On a similar note (this wouldn't block the patch at all), I've been
wanting to change the default behavior in 1.6 to never close the accept()
routine, but instead instantly drop excess connections. It makes more
sense for what memcached is, and I'd like to see an option in 1.4.x for
enabling such behavior to test with.
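Roughly what I'm imagining, as a sketch only (curr_conns, max_conns, and
dispatch_conn are made-up names for illustration, not what's actually in
the tree):

    #include <sys/socket.h>
    #include <unistd.h>

    extern int curr_conns, max_conns;   /* hypothetical counters */
    extern void dispatch_conn(int fd);  /* hypothetical worker handoff */

    /* Keep the listening socket active at all times; connections past
     * the limit are accepted and immediately dropped rather than left
     * to pile up in the backlog while accept() is disabled. */
    static void accept_handler(int listen_fd) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0)
            return;
        if (curr_conns >= max_conns) {
            close(fd);              /* instant drop */
            return;
        }
        curr_conns++;
        dispatch_conn(fd);
    }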
That avoids the issue in another way entirely, but I think we'll need both
fixes :P
-Dormando
Regardless of whether a pthread condition variable is used, the core idea
holds.
It's truly awesome that a test script is provided. Great work, Mala, and
thanks to the Twitterverse for investigating the problem.
-Toru
> But if the main thread waits for the condition variable, who drives the
> main event loop?
>
> Preparing a pipe fd to notify about maxconns, and watching it from the
> event loop, should be the way.
I should've been more purposefully vague; "let's find the easy way to make
it wake up faster". Using the pipe fd sounds like the right idea there.
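Sketching out that self-pipe idea with libevent (the setup/notify function
names here are invented, and I'm assuming accept_new_conns() as the
re-enable hook):

    #include <event.h>      /* libevent 1.x */
    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    extern void accept_new_conns(bool do_accept);  /* listener toggle */

    static int wakeup_fds[2];           /* [0] read end, [1] write end */
    static struct event wakeup_event;

    /* Main thread: fires from inside the event loop the moment a byte
     * arrives, so re-enabling accepts no longer waits for the 1-second
     * clock tick. */
    static void wakeup_handler(int fd, short which, void *arg) {
        char buf[1];
        if (read(fd, buf, 1) == 1)
            accept_new_conns(true);
    }

    /* Run once at startup, on the main thread. */
    void setup_maxconns_wakeup(void) {
        if (pipe(wakeup_fds) != 0) {
            perror("pipe");
            return;
        }
        event_set(&wakeup_event, wakeup_fds[0], EV_READ | EV_PERSIST,
                  wakeup_handler, NULL);
        event_add(&wakeup_event, NULL);
    }

    /* Called from a worker thread when a connection slot frees up. */
    void notify_maxconns_free(void) {
        if (write(wakeup_fds[1], "", 1) != 1)
            perror("write");
    }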
> +1. I've never seen an application call listen(fd, 0) to drop new
> connections. I thought that once the OS kernel reaches the max backlog,
> it starts dropping new packets, so there's no need to handle that case
> in the application, is there?
(repeating from IRC for posterity): I like shoving an "ERROR: Too many
connections" down the pipe before closing it. Gives client authors the
chance to throw useful errors for the developer. Otherwise it's likely to
manifest as failed gets/sets with no errors.
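i.e. roughly this (a sketch; drop_with_error is a made-up name, and the
exact error string is just the wording from this thread):

    #include <unistd.h>

    /* Hypothetical drop path: send the client a parseable reason
     * before hanging up, instead of a bare connection reset. */
    static void drop_with_error(int fd) {
        static const char msg[] = "ERROR: Too many connections\r\n";
        (void)write(fd, msg, sizeof(msg) - 1);  /* best effort */
        close(fd);
    }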
-Dormando