memcached race condition problem


morimoto

Aug 25, 2010, 1:26:38 AM
To: memcached
To the memcached community,

I have found a memcached race condition bug; here are the details and a
patch that fixes the problem. We would love to hear what the community
thinks of this patch.

Under normal circumstances, each worker thread accesses its own
event_base, and the same goes for the main thread. We have found a
specific situation in which a worker thread accesses the main thread's
event_base (main_base), producing unexpected results.


[Example Scenario]
1. Throw a lot of clients (well over the connection limit) at memcached.
2. memcached's open file descriptors reach the configured maximum.
3. The main thread calls accept_new_conns(false) to stop polling sfd.
4. The main thread's event_base_loop stops accepting incoming requests.
5. The main thread stops accessing main_base at this point.
6. A client disconnects.
7. A worker thread calls accept_new_conns(true) to resume polling sfd.
8. accept_new_conns takes a mutex meant to protect main_base from the race.
9. The worker thread starts looping over listen_conn.
10. The worker thread calls update_event() on the first conn.
11. After that first update_event(), the main thread starts polling sfd
    and resumes accessing main_base. <- PROBLEM
12. The worker thread continues, calling update_event() on the second conn.

At this point, the worker thread and the main thread both access and
modify main_base.

Because the event count is corrupted, event_count can end up at zero
while an actual event is still waiting. The result? memcached falls out
of event_base_loop() and the daemon quietly shuts down.
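
To make the scenario concrete, here is a rough sketch of the pattern
(simplified, not a verbatim copy of memcached's source; the conn struct
and update_event() below are cut down to the essentials):

#include <event.h>
#include <pthread.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Cut-down stand-ins for memcached's structures. */
struct conn { int sfd; struct event event; struct conn *next; };

static struct event_base *main_base;   /* owned and dispatched by the main thread */
static struct conn *listen_conn;       /* listening sockets, one per interface */
static pthread_mutex_t listen_lock = PTHREAD_MUTEX_INITIALIZER;

static void event_handler(const int fd, const short which, void *arg) {
    (void)fd; (void)which; (void)arg;  /* real dispatch logic omitted in this sketch */
}

/* Re-register c->event on main_base with new flags. */
static void update_event(struct conn *c, const int new_flags) {
    event_del(&c->event);
    event_set(&c->event, c->sfd, new_flags, event_handler, c);
    event_base_set(main_base, &c->event);
    event_add(&c->event, NULL);
}

/* Called from a worker thread when a connection slot frees up (step 7). */
void accept_new_conns(const bool do_accept) {
    pthread_mutex_lock(&listen_lock);                      /* step 8 */
    for (struct conn *c = listen_conn; c; c = c->next) {   /* step 9 */
        if (do_accept) {
            update_event(c, EV_READ | EV_PERSIST);         /* steps 10 and 12 */
            listen(c->sfd, 1024);
        } else {
            update_event(c, 0);
            listen(c->sfd, 0);
        }
    }
    pthread_mutex_unlock(&listen_lock);
}

/* Step 11 happens here: the first event_add() gives the main thread's
 * event_base_loop() something to dispatch again, so it can wake up and
 * start touching main_base while the worker is still in the loop handling
 * the second listening conn. The mutex only serializes callers of
 * accept_new_conns(); it does not keep the main loop itself off main_base. */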


[Quick Fix]
Set memcached to listen on a single interface only.

Example memcached settings:
> memcached -U 0 -u nobody -p 11222 -t 4 -m 16000 -C -c 1000 -l 192.168.0.1 -v


[Reproducing]
Use the attack script (thanks to mala): http://gist.github.com/522741

With the -l interface restriction: we have seen over 70 hours of
stability - yes, you will see "Too many open connections.", but that's
not the issue here.

Without the -l interface restriction: memcached quits under the attack
script.


Please give us some feedback on the attached patch. It should fix the
race condition we have experienced.

Lastly, we would like to thank the numerous contributors on Twitter who
helped us nail down this problem. http://togetter.com/li/41702

Shigeki

morimoto

Aug 25, 2010, 1:42:58 AM
To: memcached
I forgot to attach the patch.
Here is the URL for the patch:
http://kzk9.net/blog/memcached-fix-accept-new-conns-race.patch.txt

morimoto

Aug 25, 2010, 1:45:22 AM
To: memcached
This patch was written by Kazuki Ohta (@kaz_mover).

dormando

Aug 25, 2010, 2:57:09 AM
To: memcached
>
> [Quick Fix]
> Set memcached to listen on a single interface only.
>
> Example memcached settings:
> > memcached -U 0 -u nobody -p 11222 -t 4 -m 16000 -C -c 1000 -l 192.168.0.1 -v
>
>
> [Reproducing]
> Use the attack script (thanks to mala): http://gist.github.com/522741
>
> With the -l interface restriction: we have seen over 70 hours of
> stability - yes, you will see "Too many open connections.", but that's
> not the issue here.
>
> Without the -l interface restriction: memcached quits under the attack
> script.
>
>
> Please give us some feedback on the attached patch. It should fix the
> race condition we have experienced.
>
> Lastly, we would like to thank the numerous contributors on Twitter who
> helped us nail down this problem. http://togetter.com/li/41702
>
> Shigeki

Excellent! Thanks for the detailed report and the patch. I'll see if I can
verify it here and get it applied asap. Might move the mutex to the
threads file, but the idea is sound.

Since you're using the clock event to re-enable connections, this means
that hitting maxconns will lock up memcached for a worst case minimum of 1
full second?
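
If I'm reading the patch right, the shape is roughly this (a sketch only,
assuming the existing once-per-second clock_handler on main_base; the flag
and helper names are mine, not from the patch):

#include <event.h>
#include <stdbool.h>
#include <sys/time.h>

extern struct event_base *main_base;
extern void accept_new_conns(const bool do_accept);

static struct event clockevent;                 /* assumed armed once at startup */
static volatile bool allow_new_conns = false;   /* set by workers when a slot frees */

/* Runs on the main thread, once per second, via main_base. */
static void clock_handler(const int fd, const short which, void *arg) {
    struct timeval t = { .tv_sec = 1, .tv_usec = 0 };
    (void)fd; (void)which; (void)arg;

    if (allow_new_conns) {
        allow_new_conns = false;
        accept_new_conns(true);   /* main_base is only touched from its own thread */
    }

    evtimer_set(&clockevent, clock_handler, NULL);
    event_base_set(main_base, &clockevent);
    evtimer_add(&clockevent, &t);
}

/* Worker thread: instead of touching main_base directly, just flip the flag
 * and let the next clock tick pick it up -- hence the up-to-one-second wait. */
void mark_conn_slot_free(void) {
    allow_new_conns = true;
}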

Perhaps a conditional could be used to wake up the main thread faster?

On a similar note (but wouldn't block this patch at all), I've been
wanting to change the default behavior in 1.6 to never close the accept()
routine, but instead instant-drop excessive connections. It makes more
sense for what memcached is, and I'd like to see an option in 1.4.x for
enabling such behavior to test with.

That avoids the issue in another way entirely, but I think we'll need both
fixes :P

-Dormando

Toru Maesaka

Aug 25, 2010, 3:03:32 AM
To: memc...@googlegroups.com
+1

Whether or not a pthread condition variable is used, the idea holds either way.

It's truly awesome that a test script is provided. Great work, Mala! And thanks to the Twitterverse for investigating the problem.

-Toru

Kazuki Ohta

Aug 25, 2010, 11:34:13 AM
To: memcached
Hi, dormando.

I'm Kazuki, the original author of the above patch.

> Since you're using the clock event to re-enable connections, this means
> that hitting maxconns will lock up memcached for a worst case minimum of 1
> full second?

Yes, I also think this could be a problem in some situations, but it
does avoid the potential race condition described above.

> Perhaps a conditional could be used to wake up the main thread faster?

But if the main thread waits on the condition variable, who drives the
main event loop?

Preparing a pipe fd for the max-conns notification and watching it from
the event loop should be the way to go.
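
Roughly this shape (a sketch with made-up names, just to show the idea;
error handling omitted):

#include <event.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

extern struct event_base *main_base;
extern void accept_new_conns(const bool do_accept);

static int maxconn_pipe[2];           /* [0] read end (main thread), [1] write end (workers) */
static struct event maxconn_event;

/* Worker thread: a connection closed and a slot is free again. */
void notify_conn_slot_free(void) {
    char c = 'c';
    (void)write(maxconn_pipe[1], &c, 1);   /* wakes the main event loop immediately */
}

/* Main thread: runs inside event_base_loop(main_base, ...). */
static void maxconn_handler(const int fd, const short which, void *arg) {
    char buf[64];
    (void)which; (void)arg;
    (void)read(fd, buf, sizeof(buf));      /* drain whatever notifications piled up */
    accept_new_conns(true);                /* re-register the listeners from the owning thread */
}

void setup_maxconn_pipe(void) {
    pipe(maxconn_pipe);                    /* error handling omitted in this sketch */
    fcntl(maxconn_pipe[0], F_SETFL, O_NONBLOCK);
    event_set(&maxconn_event, maxconn_pipe[0], EV_READ | EV_PERSIST,
              maxconn_handler, NULL);
    event_base_set(main_base, &maxconn_event);
    event_add(&maxconn_event, NULL);
}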

> On a similar note (but wouldn't block this patch at all), I've been
> wanting to change the default behavior in 1.6 to never close the accept()
> routine, but instead instant-drop excessive connections. It makes more
> sense for what memcached is, and I'd like to see an option in 1.4.x for
> enabling such behavior to test with.

+1. I've never seen an application call listen(fd, 0) to drop new
connections. I thought that once the OS kernel's backlog is full it
starts dropping new packets, so there should be no need to handle that
case in the application, don't you think?

-Kazuki

dormando

Aug 25, 2010, 11:56:29 AM
To: memcached
Hey,

> But if the main thread waits on the condition variable, who drives the
> main event loop?
>
> Preparing a pipe fd for the max-conns notification and watching it from
> the event loop should be the way to go.

I should've been more purposefully vague; "let's find the easy way to make
it wake up faster". Using the pipe fd sounds like the right idea there.

> +1. I've never seen an application call listen(fd, 0) to drop new
> connections. I thought that once the OS kernel's backlog is full it
> starts dropping new packets, so there should be no need to handle that
> case in the application, don't you think?

(repeating from IRC for posterity): I like shoving an "ERROR: Too many
connections" down the pipe before closing it. Gives client authors the
chance to throw useful errors for the developer. Otherwise it's likely to
manifest as failed gets/sets with no errors.
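
Something like this (sketch only; the function name and exact error string
are illustrative, not settled):

#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Instead of suspending accept() at maxconns, take the connection,
 * tell the client why it is being refused, and close it right away. */
static void reject_excess_connection(const int listen_fd) {
    int sfd = accept(listen_fd, NULL, NULL);
    if (sfd < 0)
        return;

    const char msg[] = "ERROR Too many connections\r\n";
    (void)write(sfd, msg, sizeof(msg) - 1);   /* best effort; we're closing anyway */
    close(sfd);
}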

-Dormando
