Threads are blocked in the Computer.threadPoolForRemoting

40 views
Skip to first unread message

Victor Martinez

unread,
Feb 10, 2020, 8:16:42 AM2/10/20
to Jenkins Developers
I'm testing the google-compute-engine-plugin and it seems there are a bunch of blocked threads related to the below hudson.remoting methods:

  • hudson.remoting.Request.call(Request.java:154)
  • hudson.remoting.Channel.send(Channel.java:716)
See some of the stacktraces:

Computer.threadPoolForRemoting [#414]
priority:5 - threadId:0x00007f848800c000 - nativeId:0x1c73 - nativeId (decimal):7283 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at hudson.remoting.Request.call(Request.java:154)
- waiting to lock <0x00000007b6b39490> (a hudson.remoting.Channel)
at hudson.remoting.Channel.call(Channel.java:954)
at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:624)
at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:418)
at com.google.jenkins.plugins.computeengine.ComputeEngineComputerLauncher.launch(ComputeEngineComputerLauncher.java:318)
at com.google.jenkins.plugins.computeengine.ComputeEngineComputerLauncher.launch(ComputeEngineComputerLauncher.java:224)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:290)
at hudson.slaves.SlaveComputer$$Lambda$378/348978491.call(Unknown Source)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


Computer.threadPoolForRemoting [#347] for obs11-linux-r6rpli id=25
priority:5 - threadId:0x00007f849427b000 - nativeId:0x1a93 - nativeId (decimal):6803 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at hudson.remoting.Channel.send(Channel.java:716)
- waiting to lock <0x00000006d82e4570> (a hudson.remoting.Channel)
at hudson.remoting.Request$2.run(Request.java:382)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:18)
at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:21)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


Environment:

I created a few dumps and uploaded them to https://fastthread.io/:

Questions:
  • Have you seen that particular behaviour?
  • Any clues how I can debug it further?
  • Is that an expected behaviour in the remoting itself? Or a kind of bug?

Thanks


Oleg Nenashev

unread,
Feb 10, 2020, 11:24:20 AM2/10/20
to Jenkins Developers
From what I see many channels are blocked by classloading, e.g. this one:
Classloading might be a long operation, especially when massive classloading saturates your master's network.
Just check whether you send big JARs over the network, e.g. AWS SDK (~100Mb) or similar XXXL ones

Computer.threadPoolForRemoting [#484] for obs11-linux-pzz7g1 id=23
Stack Trace is:
java.lang.Thread.State: RUNNABLE
at java
.net.SocketOutputStream.socketWrite0(Native Method)
at java
.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java
.net.SocketOutputStream.write(SocketOutputStream.java:155)
at com
.trilead.ssh2.crypto.cipher.CipherOutputStream.internal_write(CipherOutputStream.java:52)
at com
.trilead.ssh2.crypto.cipher.CipherOutputStream.writeBlock(CipherOutputStream.java:101)
at com
.trilead.ssh2.crypto.cipher.CipherOutputStream.write(CipherOutputStream.java:118)
at com
.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:179)
at com
.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:107)
at com
.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:690)
- locked <0x00000006d864a0e8> (a java.lang.Object)
at com
.trilead.ssh2.channel.ChannelManager.sendData(ChannelManager.java:429)
- locked <0x00000006d9603858> (a java.lang.Object)
at com
.trilead.ssh2.channel.ChannelOutputStream.write(ChannelOutputStream.java:63)
at hudson
.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:90)
at hudson
.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:85)
at hudson
.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:54)
at java
.io.OutputStream.write(OutputStream.java:75)
at hudson
.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)
at hudson
.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:46)
at hudson
.remoting.Channel.send(Channel.java:721)
- locked <0x00000006d9602dd8> (a hudson.remoting.Channel)
at hudson
.remoting.ProxyOutputStream.write(ProxyOutputStream.java:143)
- locked <0x00000007b7efa6d0> (a hudson.remoting.ProxyOutputStream)
at hudson
.remoting.RemoteOutputStream.write(RemoteOutputStream.java:108)
at hudson
.remoting.Util.copy(Util.java:56)
at hudson
.remoting.JarLoaderImpl.writeJarTo(JarLoaderImpl.java:56)
at sun
.reflect.GeneratedMethodAccessor513.invoke(Unknown Source)
at sun
.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java
.lang.reflect.Method.invoke(Method.java:498)
at hudson
.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:929)
at hudson
.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:903)
at hudson
.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:855)
at hudson
.remoting.UserRequest.perform(UserRequest.java:211)
at hudson
.remoting.UserRequest.perform(UserRequest.java:54)
at hudson
.remoting.Request$2.run(Request.java:369)

Victor Martinez

unread,
Feb 10, 2020, 2:50:06 PM2/10/20
to Jenkins Developers
Thanks Oleg for the analysis, not sure what's causing that particular classloading, I'm actually running in my local jenkins instance, so no other things are happening in the background but the pipeline that triggers 100 parallel stages.

It seems the aws-sdk-plugin was installed in my local jenkins instance, so , the list of jar files which are bigger than 5mb in the docker container are the below ones:

root@5075844d2393:/var/jenkins_home# find . -name *.jar | xargs ls -ltrh | cut -d' ' -f5- | grep "[0-9]M" | sort -n

5.3M Jun 27  2019 ./plugins/dtkit-api/WEB-INF/lib/Saxon-HE-9.9.1-2.jar

5.4M Jan  6 10:05 ./plugins/aws-java-sdk/WEB-INF/lib/aws-java-sdk-ec2-1.11.700.jar

6.8M Feb 10 09:05 ./war/WEB-INF/lib/groovy-all-2.4.12.jar

6.8M May 30  2018 ./plugins/ghprb/WEB-INF/lib/groovy-all-2.4.11.jar

6.8M Oct 22  2018 ./plugins/ws-cleanup/WEB-INF/lib/groovy-all-2.4.12.jar

 11M Feb 10 09:05 ./war/WEB-INF/lib/jenkins-core-2.209.jar

 13M Jun 27  2019 ./plugins/dtkit-api/WEB-INF/lib/icu4j-63.1.jar

 35M Jan  6 10:06 ./plugins/aws-java-sdk/WEB-INF/lib/aws-java-sdk-models-1.11.700.jar


The only file I know that explicitly is copied when a cloud worker is created is the remoting.jar, I don't see any other explicit references in the google compute engine plugin. So, any ideas how I can go deeper a bit more?

Are there any specific configuration (feature flags or something similar) that could help me to tweak the remoting to reduce the classloading saturation?

Thanks so much
Reply all
Reply to author
Forward
0 new messages