Strange Go runtime behavior related to xml.Unmarshal

164 views
Skip to first unread message

Nikos Anastopoulos

unread,
Apr 27, 2017, 8:50:26 AM4/27/17
to golang-nuts
Hi list, 

I have a Go app that, among others, continuously parses XML files to extract dynamic info. This seems to cause a strange behavior in the Go runtime, according to which the underlying Go threads keep increasing over time. 

I verified this by commenting out several parts of my app code, coming up to a minimal working version which essentially calls xml.Unmarshal repetitively. You can have a look at it here: https://gist.github.com/anastop/e533628e5624c81b8822fda512aefaab

When executing this minimal version on my platform (Ubuntu 16.04.2, Xeon E5-2699 / 88 CPUs, go1.8), I observe that it starts with 4 Linux threads in total and within 4 minutes it reaches e.g. 28 (and goes up), which is unusual w.r.t. what I observe with other Go apps. The problem becomes even worse when this code becomes part of my larger app, where I have observed Go thread counts reaching up to several hundreds (e.g. 300-400).

Just to mention that I do not see any other strange behavior like increase in memory usage. 

As I want this app to introduce a minimal level of noise to other co-running apps, I am wondering whether this is a known/expected behavior, or I should resort to other workarounds to meet my requirements. 

Thanks in advance, 
Nikos

Ian Lance Taylor

unread,
Apr 28, 2017, 5:39:13 PM4/28/17
to Nikos Anastopoulos, golang-nuts
I tried your test program, but I can't recreate this. How are you
measuring the number of threads used by the program?

Ian

Nikos Anastopoulos

unread,
Apr 29, 2017, 12:33:00 AM4/29/17
to golang-nuts
Either through the /proc/PID/tasks entries, or using GODEBUG=schedtrace=1000.I get consistent results with both ways

Nikos Anastopoulos

unread,
Apr 29, 2017, 6:09:08 AM4/29/17
to golang-nuts
Τη Σάββατο, 29 Απριλίου 2017 - 7:33:00 π.μ. UTC+3, ο χρήστης Nikos Anastopoulos έγραψε:
Either through the /proc/PID/tasks entries, or using GODEBUG=schedtrace=1000.I get consistent results with both ways

 With some experimentation more I did, I tend to believe this behavior has somehow to do with the number of CPUs of the platform; or to be more precise, to the number of CPUs made available to the Go app through the CPU affinity mask. When reducing the available CPUs from 88 to 44, 22, 11, and so on, the total number of threads after a 4-min run decreases as well, as you can see from the detailed results below (look at the "threads" field). So it is quite possible that if you try to reproduce the test on a low-count CPU environment you won't see anything weird. 

Here are my results on my 88-CPU platform: 

export GODEBUG=schedtrace=1000

1. [default, 88-cpus]: ./bench 
//after 4 min
SCHED 240224ms: gomaxprocs=88 idleprocs=88 threads=29 spinningthreads=0 idlethreads=27 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

2. [44-cpus]: $ taskset -ca 0-43 ./bench
//after 4 min
SCHED 240143ms: gomaxprocs=44 idleprocs=44 threads=26 spinningthreads=0 idlethreads=24 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

3. [22-cpus]: $ taskset -ca 0-21 ./bench
//after 4 min
SCHED 240248ms: gomaxprocs=22 idleprocs=22 threads=23 spinningthreads=0 idlethreads=21 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

4. [11-cpus]: $ taskset -ca 0-10 ./bench
//after 4 min
SCHED 240189ms: gomaxprocs=11 idleprocs=11 threads=12 spinningthreads=0 idlethreads=10 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0]
239 81393

5. [8-cpus]: $ taskset -ca 0-7 ./bench 
//after 4 min
SCHED 240201ms: gomaxprocs=8 idleprocs=8 threads=9 spinningthreads=0 idlethreads=7 runqueue=0 [0 0 0 0 0 0 0 0]

6. [4-cpus]: $ taskset -ca 0-3 ./bench
//after 4 min
SCHED 240184ms: gomaxprocs=4 idleprocs=4 threads=5 spinningthreads=0 idlethreads=3 runqueue=0 [0 0 0 0]

7. [2-cpus]: $ taskset -ca 0,1 ./bench 
//after 4 min
SCHED 240171ms: gomaxprocs=2 idleprocs=2 threads=4 spinningthreads=0 idlethreads=2 runqueue=0 [0 0]

8. [1-cpu]: $ taskset -ca 0 ./bench 
//after 4 min
SCHED 239437ms: gomaxprocs=1 idleprocs=1 threads=4 spinningthreads=0 idlethreads=2 runqueue=0 [0]


I also checked if GOMAXPROCS is directly related, but I don't think so. For example, on a 2-CPU VM, I can perfectly set GOMAXPROCS=88 and run the app, but after the 4-min period the threads have reached a total count of 6: 
$ GOMAXPROCS=88 GODEBUG=schedtrace=1000 ./bench 
SCHED 240298ms: gomaxprocs=88 idleprocs=88 threads=6 spinningthreads=0 idlethreads=4 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
239 81393

Ian Lance Taylor

unread,
Apr 29, 2017, 7:36:15 PM4/29/17
to Nikos Anastopoulos, golang-nuts
On Sat, Apr 29, 2017 at 3:09 AM, Nikos Anastopoulos <n.an...@gmail.com> wrote:
> Τη Σάββατο, 29 Απριλίου 2017 - 7:33:00 π.μ. UTC+3, ο χρήστης Nikos
> Anastopoulos έγραψε:
>>
>> Either through the /proc/PID/tasks entries, or using
>> GODEBUG=schedtrace=1000.I get consistent results with both ways
>
>
> With some experimentation more I did, I tend to believe this behavior has
> somehow to do with the number of CPUs of the platform; or to be more
> precise, to the number of CPUs made available to the Go app through the CPU
> affinity mask. When reducing the available CPUs from 88 to 44, 22, 11, and
> so on, the total number of threads after a 4-min run decreases as well, as
> you can see from the detailed results below (look at the "threads" field).
> So it is quite possible that if you try to reproduce the test on a low-count
> CPU environment you won't see anything weird.

All else being equal, which it never is, the number of threads used by
Go will tend to level out at about GOMAXPROCS plus a bit. If you have
88 available CPUs, GOMAXPROCS will default to 88, and over time you
should expect to see your program using around 92 threads.

If you see the number of threads growing significantly beyond
GOMAXPROCS, without an obvious explanation like blocking C threads,
then there may be a problem.

Ian

Nikos Anastopoulos

unread,
May 2, 2017, 5:43:20 AM5/2/17
to golang-nuts, n.an...@gmail.com
Thanks for your feedback.  As a last question, what is the application aspect (or other factors) that determines how quickly new threads are being employed over time? For example, in my scenario above, if I choose to do the parsing using plain regexp rather than xml.Unmarshal, the threads will reach a count of ~8 within the same period (4 mins), vs 28 when using xml.Unmarshal. In some other Go apps, threads will remain constant (i.e., 4) forever. Where is this difference attributed?

Ian Lance Taylor

unread,
May 2, 2017, 12:31:14 PM5/2/17
to Nikos Anastopoulos, golang-nuts
On Tue, May 2, 2017 at 2:43 AM, Nikos Anastopoulos <n.an...@gmail.com> wrote:
>
> Thanks for your feedback. As a last question, what is the application
> aspect (or other factors) that determines how quickly new threads are being
> employed over time? For example, in my scenario above, if I choose to do the
> parsing using plain regexp rather than xml.Unmarshal, the threads will reach
> a count of ~8 within the same period (4 mins), vs 28 when using
> xml.Unmarshal. In some other Go apps, threads will remain constant (i.e., 4)
> forever. Where is this difference attributed?

I don't know that there is any one thing. I would expect to be based
primarily on how many different goroutines are started and how often
those goroutines block in a system call. In general the Go runtime
does not try to limit threads to less than GOMAXPROCS.

Ian
Reply all
Reply to author
Forward
0 new messages