Timers and implementation of a timeout system on requests

Nathanael Curin

May 9, 2019, 11:02:56 AM

to golang-nuts

Hi everyone,

Searching Go's documentation and this group didn't really help me find what I'm looking for so, here goes.

I'd like to implement a timeout system on every request to my HTTP server that would work like this:

1. Receive an http.Request, perform initial checks on validity
2. Start a timeout time.Timer of N milliseconds
3. Send the Request + a context.Context to a goroutine, answering through a response channel when its job is done
4. Wait in a Select for either channel (timer.C or responseChannel)

If the Select takes the responseChannel branch, I can stop my timer and write my HTTP response. Otherwise, my timer expired: I have to answer my HTTP client, cancel my Context, and simply discard whatever is sent to the responseChannel afterwards, in the event that this actually happens.

A few questions about this implementation:

1. (Technical stuff) How exactly are Timers and Tickers implemented in the runtime / in the OS? Is there a hard limit? A soft limit? Is it CPU-bound? Core-bound?
2. If I received, let's say, 5000 queries per second, and every query has a 100 ms timeout (so, 500 potential simultaneous timers; in practice, probably a bit more), would every timer really be perfectly stable? How can you make sure of this, and debug and monitor timer expirations?
3. Last but not least, assuming that Go's scheduler responds perfectly at the timer's expiration, how can I make sure that the code after the Select/Case runs without interruption? Can the goroutine get "unscheduled" after the timer's expiration, but before writing the HTTP response, for some reason?

Thanks for the insight. Don't hesitate to ask for clarification if necessary.

Burak Serdar

May 9, 2019, 11:23:27 AM

to Nathanael Curin, golang-nuts

On Thu, May 9, 2019 at 9:03 AM Nathanael Curin <n.c...@capitaldata.fr> wrote:
>
> Hi everyone,
>
> Searching Go's documentation and this group didn't really help me find what I'm looking for so, here goes.
>
> I'd like to implement a timeout system on every request to my HTTP server that would work like this :
>
> Receive an http.Request, perform initial checks on validity
> Start a timeout time.Timer of N milliseconds
> Send the Request + a context.Context to a goroutine, answering through a response channel when its job is done
> Wait in a Select for either channel (timer.C or responseChannel)

You can use context.WithTimeout() for this. You can do:

ctx, cancel := context.WithTimeout(request.Context(), 100*time.Millisecond)
defer cancel() // release the timer once the handler returns
request = request.WithContext(ctx)

and send the request to your goroutine. The context will be canceled
after the timeout. During processing, you should check if the context
is still alive and return if it timed out.

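As far as I know, timers do not each get their own goroutine: the
runtime keeps pending timers in a heap, and context.WithTimeout uses
time.AfterFunc, which only starts a goroutine when the timer fires. I
don't know how accurate those timers would be, though. You could
record and log the difference between the time you start processing
and the time the timeout fires, and see how well it scales.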

When the context times out, the select waiting on the cancel channel
will wake up, and then you can execute any cleanups necessary. A
timeout will not "unschedule" a goroutine, it'll simply close a
channel.

> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/cdf6b347-bfb9-44ce-b4d6-9d06602b1738%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Nathanael Curin

May 10, 2019, 8:05:37 AM

to golang-nuts

Good point on the implementation side of things, it's cleaner. I'm still really curious about the limits and implementation details. There has to be some kind of limit where things start to become erratic. If anyone wants to chime in :)

Robert Engels

May 10, 2019, 8:59:17 AM

to Nathanael Curin, golang-nuts

I don’t think your requirements are completely specified. For example, you say the timeout is 100 ms, but nothing is ever exact: what is the tolerance in the delay before it is cancelled? Are the calls in the handler even cancelable? What type of hardware (64+ cores)?

I think this is why you are hearing crickets.

Robert Engels

May 10, 2019, 9:04:26 AM

to Nathanael Curin, golang-nuts

That being said, 5000 requests per second is pretty low on any reasonable hardware. You can review github/robaho/go-trader - it does 30k requests per sec on desktop machines.

Robert Engels

May 10, 2019, 9:13:45 AM

to Nathanael Curin, golang-nuts

One other point: at 5k req/s sustained, each request must on average complete in 200 µs (if using a single core); otherwise you will not make your deadlines and you will run out of memory. So 100 ms is way outside the threshold needed. Clearly, by upping the parallelism you can increase the time interval, but you get the general idea.

To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/76191CAD-4945-4E81-833D-2D9F99C4D6CD%40ix.netcom.com.

Nathanael Curin

May 10, 2019, 9:14:31 AM

to golang-nuts

The server has 12 cores (2 cpu*6 cores). The tolerance would be about +-10ms, I'd probably check how much time I have on the fly on every request, and remove 10-15ms just "for safety" if anything hangs for a bit too long.

I just ran a few tests with a colleague using N simultaneous context.WithTimeout calls at 100 ms. We found that, on an idle 8-core machine with 100k simultaneous goroutines, the measured times were:

MIN 100010µs - MAX 138099µs - AVG 103387.58519µs

I could graph results but I feel like this is enough for a conclusion - timers are precise enough for my needs. 100k timers is just completely overkill for our situation, and testing with 10k instead gives us 100ms for all 3 metrics (min/max/avg).

There might be an issue with the scheduler though, seeing the "MAX" value above. We found that there can be a random delay between the context's timeout and the moment a goroutine actually observes it, even though there's no code in between. To illustrate:

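// c (parent context) and timeChan (chan int64) come from the surrounding test code.
start := time.Now()
timeoutCtx, cancel := context.WithTimeout(c, 100*time.Millisecond)
go func(ctx context.Context, cancel context.CancelFunc, start time.Time) {
   defer cancel() // release the context's timer once the measurement is sent
   <-ctx.Done()   // a select with a single case is just a plain receive
   timeChan <- time.Since(start).Nanoseconds() / 1000 // elapsed time in µs
}(timeoutCtx, cancel, start)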

This code, wrapped in an N-iteration for loop, can produce 130+ ms responses through the channel. Calling runtime.Gosched() right after starting the goroutine, in the for loop, seems to stabilize the results a bit more, but it's just too janky for me.

Nathanael Curin

May 10, 2019, 9:18:01 AM

to golang-nuts

Oh we are definitely not using a single core, haha. Every request is its own set of routines, being careful to not overspawn them though.
