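This problem has puzzled me for two weeks and I still can't find the cause, so I'm asking here. I'll keep it short.
The upstream server is a Tomcat (this detail isn't important).
I first benchmarked Tomcat directly with wrk to get a baseline: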
./wrk -c 500 -t 4 -d 1m --latency http://<tomcat>:8080/docs/config/filter.html
Running 1m test @ http://172.18.10.210:8080/docs/config/filter.html
4 threads and 500 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 64.64ms 58.20ms 914.81ms 84.37%
Req/Sec 2.25k 429.65 3.55k 71.32%
Latency Distribution
50% 49.41ms
75% 95.27ms
90% 136.19ms
99% 269.55ms
535553 requests in 1.00m, 46.54GB read
Requests/sec: 8910.95
Transfer/sec: 792.88MB
Then I ran the same benchmark against nginx (reverse proxying to Tomcat):
./wrk -c 500 -t 4 -d 1m --latency http://<nginx>:8080/docs/config/filter.html
Running 1m test @ http://172.18.10.211:8080/docs/config/filter.html
4 threads and 500 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 123.56ms 134.94ms 1.85s 88.98%
Req/Sec 1.27k 240.43 1.92k 73.32%
Latency Distribution
50% 104.98ms
75% 167.39ms
90% 270.04ms
99% 625.88ms
302464 requests in 1.00m, 26.29GB read
Requests/sec: 5035.75
Transfer/sec: 448.14MB
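In terms of throughput alone, the loss is as high as (8910 - 5035) / 8910 ≈ 43.49%, and average latency nearly doubled as well (64.64 ms vs. 123.56 ms). I really can't make sense of it.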
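For anyone who wants to check the arithmetic, here is a minimal Python sketch using the Requests/sec and average latency figures copied from the two wrk runs quoted above. The helper name `relative_loss` is just something I made up for illustration.

```python
# Figures copied from the two wrk runs quoted above.
tomcat_rps = 8910.95    # Requests/sec, wrk -> Tomcat directly
nginx_rps = 5035.75     # Requests/sec, wrk -> nginx -> Tomcat

tomcat_avg_ms = 64.64   # average latency, direct
nginx_avg_ms = 123.56   # average latency, through nginx


def relative_loss(baseline: float, measured: float) -> float:
    """Fraction of the baseline throughput lost in the measured run (hypothetical helper)."""
    return (baseline - measured) / baseline


throughput_loss = relative_loss(tomcat_rps, nginx_rps)
latency_ratio = nginx_avg_ms / tomcat_avg_ms

print(f"throughput loss: {throughput_loss:.2%}")
print(f"average latency ratio: {latency_ratio:.2f}x")
```

This only compares one run of each setup, so it says nothing about run-to-run variance.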
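Attached are the nginx configuration file and the systemtap on-CPU and off-CPU flame graphs.
No need to suspect Tomcat's garbage collection; I've monitored it and everything looks normal.
I hope someone more experienced can clear up my confusion. Thanks in advance.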