Help debugging atomic.AddInt64 usage

James Hartig

Aug 2, 2018, 11:12:28 AM
to golang-nuts
We're seeing a weird case where an atomic counter kept increasing and never decreased, despite calls to decrement it. Hopefully someone on the list can point out something I'm doing wrong with the atomic package.

The function with the issue is:

func UnhealthyOverConcurrent(prefix string, limit int64) Middleware {
    var concur int64
    lhealth.Register(lflag.Prefixed("lhttp-concurrent", prefix), func() error {
        c := atomic.LoadInt64(&concur)
        if c <= limit {
            return nil
        }
        return fmt.Errorf("requests at %d and limit is %d", c, limit)
    })
    return func(h http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            atomic.AddInt64(&concur, 1)
            h.ServeHTTP(w, r)
            atomic.AddInt64(&concur, -1)
        })
    }
}

We started to observe that the error was being returned from the register callback because c was over 500 (the limit value in this case), but there were definitely not 500 concurrent connections; there were actually only around 50. The lhealth.Register callback is called roughly every second, and until the service was restarted the value stayed consistently over 500. This happened to multiple instances of the service across multiple servers. The weird thing is that another spot that also tracks concurrent requests doesn't use atomic, and it didn't have this issue.

That spot roughly does:
func Metrics() Middleware {
    var l sync.Mutex
    var val float64

    incr := func(by float64) {
        l.Lock()
        defer l.Unlock()
        val += by
    }

    return func(h http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            incr(1)
            h.ServeHTTP(w, r)
            incr(-1)
        })
    }
}

This second instance reported the correct number of concurrent requests while the other function was off by a factor of 10.

Is there anything I'm doing wrong with the atomic package? Is it better to just switch to a locking version like the second case? Since an HTTP handler can panic and throw the counts off, we're switching the decrements to be deferred, but we don't see any panics in the logs during this time period.

keith....@gmail.com

Aug 2, 2018, 8:04:48 PM
to golang-nuts
This is definitely strange.  You're using the atomics correctly.
Are you on a 32-bit platform by any chance?  If so, try replacing 64-bit atomics with 32-bit atomics.  The 64-bit atomics should work on 32-bit platforms, but it wouldn't hurt to test that theory.

If you can extract a standalone test case, please file an issue at https://github.com/golang/go/issues
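A minimal standalone sketch of that kind of test, assuming the counter is the only shared state (the goroutine and iteration counts are arbitrary): every increment is paired with a decrement on both an int64 and an int32 counter, so both should read 0 at the end.

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    var c64 int64
    var c32 int32
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < 100000; j++ {
                atomic.AddInt64(&c64, 1)
                atomic.AddInt32(&c32, 1)
                atomic.AddInt64(&c64, -1)
                atomic.AddInt32(&c32, -1)
            }
        }()
    }
    wg.Wait()
    // Every +1 was paired with a -1, so both counters should be 0.
    fmt.Println("int64 counter:", atomic.LoadInt64(&c64))
    fmt.Println("int32 counter:", atomic.LoadInt32(&c32))
}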

nanmu42

Aug 2, 2018, 8:59:42 PM
to golang-nuts
Give lower concurrency a try, like 10 requests per second.

If the counter still only goes up and never comes down, there may be a real issue.

Handling an HTTP request takes time; it's possible that, because the mutex version is slower, that path simply sees less concurrent load.

Bas van Beek

Aug 3, 2018, 8:45:01 AM
to golang-nuts
That should work and probably is working. I suspect h.ServeHTTP is not returning as quickly as you expect it to.

To test whether there is an issue with decrementing the atomic value, you could split concur into two counters, one for entry and one for exit. The difference between the two would give you an approximation of concur.
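A minimal sketch of that split-counter idea, assuming the same Middleware type as in the original post; the lhealth/lflag wiring is left out in favor of a plain check function, and the names here are made up:

func UnhealthyOverConcurrentSplit(limit int64) (Middleware, func() error) {
    var entered, exited int64
    check := func() error {
        in := atomic.LoadInt64(&entered)
        out := atomic.LoadInt64(&exited)
        if in-out <= limit {
            return nil
        }
        return fmt.Errorf("entered %d, exited %d (in flight %d), limit is %d", in, out, in-out, limit)
    }
    mw := func(h http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            atomic.AddInt64(&entered, 1)
            h.ServeHTTP(w, r)
            atomic.AddInt64(&exited, 1)
        })
    }
    return mw, check
}

If entered keeps climbing much faster than exited, the exit path (or a handler that never returns) is the place to look.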

er...@pauley.me

Aug 3, 2018, 12:57:29 PM
to golang-nuts
Is your handler panicking? If so, there are other issues to address as well, but you can test/quick-fix this by deferring your decrement so it runs regardless:


atomic.AddInt64(&concur, 1)
defer atomic.AddInt64(&concur, -1)
h.ServeHTTP(w, r)