Go schedule latency issue

303 views
Skip to first unread message

sotte...@gmail.com

unread,
Jan 18, 2018, 11:23:43 AM1/18/18
to golang-nuts
I developed an RPC Service with high requirements for real-time performance. There was an unexpected delay at a small probability, and I do not know how this happened. I suspect this is go schedule related, of course, this may also be caused by a code bug. 

Anyone can solve this problem, it will be a great help to me!

Following is my test process. trace.out in the attachment.


GO-Version:OS X go1.9.2  go1.10beta2   compile package ::go1.10beta2.darwin-amd64.tar.gz
Net : localhost To localhost
QPS : 11K

RT statistics 

TestTime

Test Count

0-1ms

1-2ms

2-5ms

5-10ms

10-20ms

20+ms

2018-01-16

58409

99.97432%

0.022256844%

0.00342413%

0

0

0

51707

99.99226%

77358964%

0

0

0

0

54650

99.93413 %


0.027447393%

0.021957913%

0.0036596523%

0.0054894784%

0.0073193046%



By analyzing trace.out, in addition to the GC, I found two places that lead to very long blocks
  • Net IO operator 
  • Channel operator 





































ReadLoop Code :

for {
		head := c.dataBuf[0:WafBaseHeaderLen]

		_, err := io.ReadFull(c.TcpConn, head)
		if err != nil {
			log.Println("WafConn Err :", err.Error())
			return nil, err
		}

		dataLen, err := c.GetLength(head)
	        
//omit some exception handler
           
		_, err = io.ReadFull(c.TcpConn, c.dataBuf[WafBaseHeaderLen: WafBaseHeaderLen+dataLen])
		if err != nil {
			log.Println("WafConn Err :", err.Error())
			return nil, err
		}

		msg = &conncodec.WafMessage{
		}
		err = msg.Decode(c.dataBuf[:WafBaseHeaderLen + dataLen])
		return msg, err
	}


 channel init 
waf_resp := make(chan WafProxyResp, 1)
waiting channel read event code: 
	select {
	case result := <-waf_resp:
	
		//do something
	case <-timer.C:  //10 ms timeout
		this.request_cache.DisposeRequest(request_id)
		monitor.QPSStatAddWafRecvTimeout(1)
		
		return GenWafResp(WAF_SUCCESS, "receive time out")
	}

Send Channel Code :
	select {
		case resp_chan <- resp:
			//只使用一次,写入完成后关闭;
			close(resp_chan)
		default:
			log.Print("Default fail !!!!! request id : ", request_id)
		}

trace.schedule.latency.out

matthe...@gmail.com

unread,
Jan 18, 2018, 12:25:43 PM1/18/18
to golang-nuts
Is your test request source running on the same computer? Have you disabled OS power management features?

Matt

sotte...@gmail.com

unread,
Jan 19, 2018, 2:14:25 AM1/19/18
to golang-nuts
a. request client and server running on the same computer. 
b. Unfortunately, I have not find the method how to disable OS power management. I re-running this case on linux server,  result shows that there is almost no time out. I suspect the P's block is affected by the external operating environment, and the M was not scheduled in time.

在 2018年1月19日星期五 UTC+8上午1:25:43,matthe...@gmail.com写道:

matthe...@gmail.com

unread,
Jan 19, 2018, 9:24:31 AM1/19/18
to golang-nuts
It looks like MacOS had CPU PM options removed from pmset in the last few years, so I’m not sure it’s possible to disable in recent versions. On Ubuntu I was able to get cpufrequtils to work a few months ago: https://askubuntu.com/questions/523640/how-i-can-disable-cpu-frequency-scaling-and-set-the-system-to-performance


Matt

sotte...@gmail.com

unread,
Jan 21, 2018, 9:58:34 PM1/21/18
to golang-nuts
I tested two scenarios for the priority value(19 and -19). The priority setting does affect the waf-client rt : 


➜  waf-client git:(master) ✗ nice -n -19 ./waf-client
nice: setpriority: Permission denied
2018/01/22 10:37:47 Add Client :  0.0.0.0:8000
2018/01/22 10:37:47  ----------------------- trace begin!!!
2018/01/22 10:37:47 Total :  2956
2018/01/22 10:37:47 0-1ms  :  4.634641 %
2018/01/22 10:37:47 1-2ms  :  0 %
2018/01/22 10:37:47 2-5ms  :  0 %
2018/01/22 10:37:47 5-10ms :  0 %
2018/01/22 10:37:47 10-20ms:  0 %
2018/01/22 10:37:47 20+ms  :  0 %
2018/01/22 10:37:52 Total :  51903
2018/01/22 10:37:52 0-1ms  :  99.98266 %
2018/01/22 10:37:52 1-2ms  :  0.015413367 %
2018/01/22 10:37:52 2-5ms  :  0.0019266709 %
2018/01/22 10:37:52 5-10ms :  0 %
2018/01/22 10:37:52 10-20ms:  0 %
2018/01/22 10:37:52 20+ms  :  0 %
2018/01/22 10:37:55  ----------------------- trace stop!!!
2018/01/22 10:37:57 Total :  57744
2018/01/22 10:37:57 0-1ms  :  99.99827 %
2018/01/22 10:37:57 1-2ms  :  0.0017317816 %
2018/01/22 10:37:57 2-5ms  :  0 %
2018/01/22 10:37:57 5-10ms :  0 %
2018/01/22 10:37:57 10-20ms:  0 %
2018/01/22 10:37:57 20+ms  :  0 %
2018/01/22 10:38:02 Total :  63337
2018/01/22 10:38:02 0-1ms  :  99.99526 %
2018/01/22 10:38:02 1-2ms  :  0.004736568 %
2018/01/22 10:38:02 2-5ms  :  0 %
2018/01/22 10:38:02 5-10ms :  0 %
2018/01/22 10:38:02 10-20ms:  0 %
2018/01/22 10:38:02 20+ms  :  0 %
2018/01/22 10:38:07 Total :  63352
2018/01/22 10:38:07 0-1ms  :  99.99526 %
2018/01/22 10:38:07 1-2ms  :  0.0047354465 %
2018/01/22 10:38:07 2-5ms  :  0 %
2018/01/22 10:38:07 5-10ms :  0 %
2018/01/22 10:38:07 10-20ms:  0 %
2018/01/22 10:38:07 20+ms  :  0 %
2018/01/22 10:38:12 Total :  62601
2018/01/22 10:38:12 0-1ms  :  99.984024 %
2018/01/22 10:38:12 1-2ms  :  0.004792256 %
2018/01/22 10:38:12 2-5ms  :  0.007987093 %
2018/01/22 10:38:12 5-10ms :  0 %
2018/01/22 10:38:12 10-20ms:  0.0031948371 %
2018/01/22 10:38:12 20+ms  :  0 %
^C
➜  waf-client git:(master) ✗ nice -n 19 ./waf-client
2018/01/22 10:38:20 Add Client :  0.0.0.0:8000
2018/01/22 10:38:20  ----------------------- trace begin!!!
2018/01/22 10:38:20 Total :  36113
2018/01/22 10:38:20 0-1ms  :  5.0812726 %
2018/01/22 10:38:20 1-2ms  :  0.0027690858 %
2018/01/22 10:38:20 2-5ms  :  0 %
2018/01/22 10:38:20 5-10ms :  0 %
2018/01/22 10:38:20 10-20ms:  0 %
2018/01/22 10:38:20 20+ms  :  0 %
2018/01/22 10:38:25 Total :  54688
2018/01/22 10:38:25 0-1ms  :  99.97074 %
2018/01/22 10:38:25 1-2ms  :  0.025599767 %
2018/01/22 10:38:25 2-5ms  :  0 %
2018/01/22 10:38:25 5-10ms :  0.0018285547 %
2018/01/22 10:38:25 10-20ms:  0.0018285547 %
2018/01/22 10:38:25 20+ms  :  0 %
2018/01/22 10:38:28  ----------------------- trace stop!!!
2018/01/22 10:38:28 CheckRequest time :  2502
2018/01/22 10:38:30 Total :  58819
2018/01/22 10:38:30 0-1ms  :  99.983 %
2018/01/22 10:38:30 1-2ms  :  0.015301178 %
2018/01/22 10:38:30 2-5ms  :  0.0017001309 %
2018/01/22 10:38:30 5-10ms :  0 %
2018/01/22 10:38:30 10-20ms:  0 %
2018/01/22 10:38:30 20+ms  :  0 %
2018/01/22 10:38:32 CheckRequest time :  2608
2018/01/22 10:38:32 CheckRequest time :  2372
2018/01/22 10:38:35 Total :  65278
2018/01/22 10:38:35 0-1ms  :  99.9663 %
2018/01/22 10:38:35 1-2ms  :  0.030638194 %
2018/01/22 10:38:35 2-5ms  :  0.0030638194 %
2018/01/22 10:38:35 5-10ms :  0 %
2018/01/22 10:38:35 10-20ms:  0 %
2018/01/22 10:38:35 20+ms  :  0 %
2018/01/22 10:38:40 Total :  61372
2018/01/22 10:38:40 0-1ms  :  99.96578 %
2018/01/22 10:38:40 1-2ms  :  0.016294075 %
2018/01/22 10:38:40 2-5ms  :  0.008147038 %
2018/01/22 10:38:40 5-10ms :  0.0048882226 %
2018/01/22 10:38:40 10-20ms:  0.0048882226 %
2018/01/22 10:38:40 20+ms  :  0 %


在 2018年1月19日星期五 UTC+8下午10:24:31,matthe...@gmail.com写道:

matthe...@gmail.com

unread,
Jan 22, 2018, 10:42:49 AM1/22/18
to golang-nuts
The higher priority -19 we're interested in has the Permission denied error message, the command needs to be run with elevated OS privileges. Is waf-client the program that has the real-time requirement?

If you separate the calling process out to a separate computer do you still see the delay?

Matt

Jesper Louis Andersen

unread,
Jan 22, 2018, 12:30:33 PM1/22/18
to matthe...@gmail.com, golang-nuts
Beware of running the test generator on the same machine as the SUT (system under test).

There are several things here:

* On localhost, the network stack bypasses several kernel parts. This may lead to wrong test data.
* The test generator might steal resources from the SUT, skewing the numbers
* Desktop machines are sensitive to external noise. A music player can be enough to mess with the data set.
* If you are not handling CPU powersaving or other such features, then you test may be wrong.
* If your target system is Linux, I'd recommend you don't use a MacOS benchmark as the final verdict.


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

sotte...@gmail.com

unread,
Jan 23, 2018, 2:10:54 AM1/23/18
to golang-nuts
thanks. 

I've test in a separate linux server(Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz * 48)


2018/01/23 14:41:14 CheckRequest time :  3393
2018/01/23 14:41:14  ----------------------- trace begin!!!
2018/01/23 14:41:14 CheckRequest time :  3021
2018/01/23 14:41:19 Total :  107214
2018/01/23 14:41:19 0-1ms  :  99.99813 %
2018/01/23 14:41:19 1-2ms  :  0 %
2018/01/23 14:41:19 2-5ms  :  0.001865428 %
2018/01/23 14:41:19 5-10ms :  0 %
2018/01/23 14:41:19 10-20ms:  0 %
2018/01/23 14:41:19 20+ms  :  0 %
2018/01/23 14:41:19  ----------------------- trace stop!!!
2018/01/23 14:41:24 Total :  138002
2018/01/23 14:41:24 0-1ms  :  100 %
2018/01/23 14:41:24 1-2ms  :  0 %
2018/01/23 14:41:24 2-5ms  :  0 %
2018/01/23 14:41:24 5-10ms :  0 %
2018/01/23 14:41:24 10-20ms:  0 %
2018/01/23 14:41:24 20+ms  :  0 %
2018/01/23 14:41:29 Total :  132976
2018/01/23 14:41:29 0-1ms  :  100 %
2018/01/23 14:41:29 1-2ms  :  0 %
2018/01/23 14:41:29 2-5ms  :  0 %
2018/01/23 14:41:29 5-10ms :  0 %
2018/01/23 14:41:29 10-20ms:  0 %
2018/01/23 14:41:29 20+ms  :  0 %
2018/01/23 14:41:34 Total :  140598
2018/01/23 14:41:34 0-1ms  :  100 %
2018/01/23 14:41:34 1-2ms  :  0 %
2018/01/23 14:41:34 2-5ms  :  0 %
2018/01/23 14:41:34 5-10ms :  0 %
2018/01/23 14:41:34 10-20ms:  0 %
2018/01/23 14:41:34 20+ms  :  0 %


在 2018年1月23日星期二 UTC+8上午1:30:33,Jesper Louis Andersen写道:

sotte...@gmail.com

unread,
Jan 23, 2018, 2:16:29 AM1/23/18
to golang-nuts
Is waf-client the program that has the real-time requirement?
-> yes. I've done a lot of optimization in terms of performance and GC. This is the last remain latency issue I fount, and I've test in a separate linux env.

在 2018年1月22日星期一 UTC+8下午11:42:49,matthe...@gmail.com写道:
Reply all
Reply to author
Forward
0 new messages