Can epoll_wait on too many fds cause high latency?

James Hui

May 23, 2015, 3:13:58 PM
to redi...@googlegroups.com
It's a funny situation. I'm using an nginx + php + redis architecture, and I used webbench to test the website's performance, like this: webbench -t60 -c1000 http://10.3.128.3/index.php
But the page is very slow during the test. When I ran redis-cli --latency, I saw an average latency of 60-100, which is unbelievable.

load average:0.36, php-fpm children:50, http-qps:1663, redis-qps:3300-3400, redis-latency: 8-25

I think the load average is very healthy, but the http-qps is not high, so I changed php-fpm children to 200.

load average:0.18, php-fpm children:200, http-qps:1576, redis-qps: 3300-3400, redis-latency: 60-100
Changing php-fpm children to 200 did not improve throughput, and the redis-latency is too big!
So I tried reducing the number of php-fpm children...

load average:8.01, php-fpm children:34, http-qps:4548, redis-qps: 10000-11000, redis-latency: 0-4
What happened? When I changed php-fpm children to 34, the redis-qps went up, and so did the http-qps.
I think the Redis latency is the problem, so I tried again.

load average:3.94, php-fpm children:35, http-qps:3262, redis-qps: 4000-4500, redis-latency: 0-13
It's unbelievable: adding just 1 php-fpm child gives a different result. I ran strace on Redis and found that epoll_wait appears constantly.

So, is there some bug in how Redis uses epoll?

Josiah Carlson

May 24, 2015, 2:50:13 AM
to redi...@googlegroups.com
I don't know whether php-fpm is able to re-use connections (AFAIK it doesn't, but I don't use php, so someone else will need to correct me if I'm wrong), but your benchmark is broken. You are trying to run Redis on the same machine that nginx is running, which also runs your php-fpm processes. Every CPU cycle spent running your nginx or php processes will steal CPU from Redis. But, each php process will also let you hide the "network" round-trip latency (localhost and/or a unix domain socket are both faster than a remote network connection, but it is still greater than 0). When you start up 200 php-fpm processes, you are spending too much CPU on php. When you start up 34 php-fpm processes, you get better performance because php isn't stealing all of your CPU. You can try different process counts, or you can try setting the CPU affinity of all of your processes to try to give Redis its own core (so php doesn't fight with Redis for CPU).
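The CPU affinity suggestion could be sketched roughly as follows. This is a minimal sketch, assuming Linux and Python 3; `plan_affinity` and `pin` are hypothetical helper names, and reserving the lowest-numbered core for Redis is an arbitrary choice, not something stated in the thread.

```python
import os


def plan_affinity(total_cores, redis_cores=1):
    """Split core ids into a set reserved for Redis and a set for everything else.

    Which cores go to Redis is an assumption of this sketch: the lowest ids.
    """
    if not 0 < redis_cores < total_cores:
        raise ValueError("need at least one core on each side")
    all_cores = list(range(total_cores))
    return set(all_cores[:redis_cores]), set(all_cores[redis_cores:])


def pin(pids, cores):
    """Pin each pid to the given cores.

    Linux only; changing the affinity of another user's process
    normally requires elevated privileges.
    """
    for pid in pids:
        os.sched_setaffinity(pid, cores)
```

On the 8-core machine described in this thread, `plan_affinity(8)` would reserve core 0 for Redis and leave cores 1-7 for nginx and php-fpm; `pin([redis_pid], redis_set)` and `pin(php_fpm_pids, other_set)` would then apply it, where the pid variables are placeholders you would have to fill in yourself. The `taskset -cp` command does the same job from a shell.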

 - Josiah


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.

James Hui

May 24, 2015, 4:47:36 AM
to redi...@googlegroups.com
How do you explain the change when going from 34 to 35 php-fpm children?
Also, I used the pconnect method over a unix domain socket!
In the benchmark, with 34 php-fpm children the CPU can reach 100% (8 CPU cores, load average 8), but with 35 children the load average is just 4, and with 50 or 200 children it is just 0.1-0.5. So I think there may be a bug in how the system dispatches connections, or in how Redis dispatches connections?

On Sunday, May 24, 2015 at 2:50:13 PM UTC+8, Josiah Carlson wrote:

Josiah Carlson

May 25, 2015, 1:37:01 AM
to redi...@googlegroups.com
Looking at your benchmark numbers, php-fpm isn't cycling connections (the connections column in "redis-cli --stat" output). So the issue isn't Redis or the system cycling network connections.

In terms of handling requests over large numbers of connections, I am not seeing 2-3x changes in Redis command throughput, or any substantial issues, when I benchmark on my end (X processes, each with its own connection to Redis, each making Y requests per connection):

 clients    req/client    ping/second
--------    ----------    -----------
       8        131072         239942
      12         87381         266010
      16         65536         280508
      24         43690         276815
      32         32768         255910
      48         21845         250162
      64         16384         235284
     128          8192         223986
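A benchmark of this shape (X processes, each with its own connection, each making Y requests) might look something like the sketch below. It is not the script used for the table above, which the thread does not show. The total of 2^20 requests is inferred from the req/client column (for example 8 x 131072), and `ping_worker` and `run` are hypothetical names. It talks to Redis with inline `PING` commands over a unix domain socket, using only the standard library.

```python
import socket
import time
from multiprocessing import Pool

# Assumed total: every req/client value in the table equals 2**20 // clients.
TOTAL_REQUESTS = 1 << 20


def requests_per_client(clients, total=TOTAL_REQUESTS):
    """Number of requests each client process sends (integer division)."""
    return total // clients


def ping_worker(args):
    """Open one connection and send `count` PINGs, one at a time."""
    sock_path, count = args
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        with s.makefile("rb") as reader:
            for _ in range(count):
                s.sendall(b"PING\r\n")  # inline command form
                if reader.readline() != b"+PONG\r\n":
                    raise RuntimeError("unexpected reply from server")
    return count


def run(sock_path, clients):
    """Return pings/second across `clients` processes (includes process startup)."""
    per_client = requests_per_client(clients)
    start = time.perf_counter()
    with Pool(clients) as pool:
        done = sum(pool.map(ping_worker, [(sock_path, per_client)] * clients))
    return done / (time.perf_counter() - start)
```

Calling something like `run("/tmp/redis_6379.sock", 16)` from under an `if __name__ == "__main__":` guard would give one row of such a table. A synchronous Python client will most likely report lower absolute numbers than those above, so only the trend across client counts is comparable.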

Looking at latency numbers from redis-benchmark, even at 256 connections, I'm not seeing any issues either:

josiah@josiah-linux ~/source:$ redis-benchmark -s /tmp/redis_6379.sock -n 1000000 -c 256 PING
====== PING ======
  1000000 requests completed in 2.96 seconds
  256 parallel clients
  3 bytes payload
  keep alive: 1

99.22% <= 1 milliseconds
99.98% <= 2 milliseconds
99.99% <= 3 milliseconds
100.00% <= 3 milliseconds
337837.84 requests per second

But when I start up a number of CPU-using processes to compete with Redis, my requests/second and latency get worse (as is expected):

josiah@josiah-linux ~/source:$ redis-benchmark -s /tmp/redis_6379.sock -n 1000000 -c 256 PING
====== PING ======
  1000000 requests completed in 4.69 seconds
  256 parallel clients
  3 bytes payload
  keep alive: 1

94.35% <= 1 milliseconds
96.11% <= 2 milliseconds
96.99% <= 3 milliseconds
98.18% <= 4 milliseconds
99.41% <= 5 milliseconds
99.88% <= 6 milliseconds
99.94% <= 7 milliseconds
100.00% <= 7 milliseconds
213356.09 requests per second
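To put the two runs side by side, here is a small sketch that parses the text output quoted above and compares throughput and the share of requests answered within 1 ms. `parse_benchmark` is a hypothetical helper written for this output format only; the numbers are copied from the two runs in this post.

```python
import re


def parse_benchmark(text):
    """Extract requests/second and the cumulative latency distribution.

    Returns (rps, {milliseconds: percent}). When a bucket is printed twice,
    the first line is kept.
    """
    rps = float(re.search(r"([\d.]+) requests per second", text).group(1))
    dist = {}
    for share, ms in re.findall(r"([\d.]+)% <= (\d+) milliseconds", text):
        dist.setdefault(int(ms), float(share))
    return rps, dist


# Output of the run without competing processes (copied from above).
IDLE = """
99.22% <= 1 milliseconds
99.98% <= 2 milliseconds
99.99% <= 3 milliseconds
100.00% <= 3 milliseconds
337837.84 requests per second
"""

# Output of the run with CPU-using processes competing with Redis.
BUSY = """
94.35% <= 1 milliseconds
96.11% <= 2 milliseconds
96.99% <= 3 milliseconds
98.18% <= 4 milliseconds
99.41% <= 5 milliseconds
99.88% <= 6 milliseconds
99.94% <= 7 milliseconds
100.00% <= 7 milliseconds
213356.09 requests per second
"""

idle_rps, idle_dist = parse_benchmark(IDLE)
busy_rps, busy_dist = parse_benchmark(BUSY)

# Fraction of throughput lost under CPU contention.
throughput_drop = 1 - busy_rps / idle_rps

print(f"throughput drop: {throughput_drop:.1%}")
print(f"within 1 ms: {idle_dist[1]}% idle vs {busy_dist[1]}% busy")
```

With these figures the throughput drop works out to roughly 37%, and the share of requests answered within 1 ms falls from 99.22% to 94.35%.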

As I mentioned in my first reply, you are running into issues with CPU (or general resource) contention from your php processes. You hit a sweet spot around 30-40 php-fpm processes that maximize throughput because you are accidentally balancing your system resources between Redis, Nginx, and php. Too many php processes, and you starve Redis for CPU, which delays Redis from responding to requests, hence high latency. Too few php processes, and resources are left idle, but you have basically 0 latency. This is what someone would expect in theory, and is exactly what you are experiencing in practice.

As Greg Andrews mentioned in another thread, one of the major causes of Redis performance problems is running Redis on the same machine as *other* processes that compete for resources. That's what you are experiencing. Want to watch your problem basically disappear? Run the exact same setup on a machine with 2x as many CPU cores, but the same memory and everything else. You will be able to run more php processes for higher throughput and better latency.

 - Josiah
