Issue with master-agent communication over ssh


jiga...@gmail.com

Dec 9, 2020, 11:53:19 AM
to Jenkins Users
Hello Jenkins community, 

I have set up a Jenkins agent over SSH, and it frequently goes offline with a "Resource temporarily unavailable" error. I had to switch my Jenkins agents to communicate over JNLP instead. Any advice on how to fix this issue?

Jenkins v2.249.1

[12/04/20 15:23:00] [SSH] Checking java version of java
[12/04/20 15:23:01] [SSH] java -version returned 1.8.0_202.
[12/04/20 15:23:01] [SSH] Starting sftp client.
[12/04/20 15:23:03] [SSH] Remote file system root $JENKINS_SSH_DATA does not exist. Will try to create it...
[12/04/20 15:23:03] [SSH] Copying latest remoting.jar...
[12/04/20 15:23:03] [SSH] Copied 1,521,553 bytes.
Expanded the channel window size to 4MB
[12/04/20 15:23:03] [SSH] Starting agent process: cd "$JENKINS_SSH_DATA" && java  -jar remoting.jar -workDir $JENKINS_SSH_DATA -jar-cache $JENKINS_SSH_DATA/remoting/jarCache
Dec 04, 2020 3:23:29 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using $JENKINS_SSH_DATA/remoting as a remoting work directory
Dec 04, 2020 3:23:29 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to $JENKINS_SSH_DATA/remoting
<===[JENKINS REMOTING CAPACITY]===>channel started
Remoting version: 4.5
This is a Unix agent
Evacuated stdout
Agent successfully connected and online
The Agent is connected, disconnect it before to try to connect it again.
Dec 04, 2020 3:33:58 PM org.eclipse.jgit.util.FS discoverGitSystemConfig
WARNING: Exception caught during execution of command '[git, config, --system, --edit]' in '$GIT_PATH/bin', return code '128', error message 'fatal: Invalid path '$GIT_PATH/etc': No such file or directory
'
Dec 04, 2020 3:33:58 PM org.eclipse.jgit.util.FS$FileStoreAttributes saveToConfig
WARNING: locking FileBasedConfig[$JENKINS_PATH/.config/jgit/config] failed after 5 retries
Dec 04, 2020 3:33:59 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
WARNING: Attempt to (de-)serialize anonymous class com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTriggerBuildChooser$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
Dec 04, 2020 3:34:24 PM hudson.remoting.Request$2 run
WARNING: Failed to send back a reply to the request hudson.remoting.Request$2@493c5a4e
java.io.IOException: Resource temporarily unavailable
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:313)
    at hudson.remoting.StandardOutputStream.write(StandardOutputStream.java:83)
    at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:89)
    at hudson.remoting.ChunkedOutputStream.sendBreak(ChunkedOutputStream.java:62)
    at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:46)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:46)
    at hudson.remoting.Channel.send(Channel.java:766)
    at hudson.remoting.Request$2.run(Request.java:388)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:73)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
 
Dec 04, 2020 5:25:30 PM hudson.remoting.Request$2 run
WARNING: Failed to send back a reply to the request hudson.remoting.Request$2@2511e2d4
java.io.IOException: Resource temporarily unavailable
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:326)
    at hudson.remoting.StandardOutputStream.write(StandardOutputStream.java:88)
    at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:90)
    at hudson.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:85)
    at hudson.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:54)
    at java.io.OutputStream.write(OutputStream.java:75)
    at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:46)
    at hudson.remoting.Channel.send(Channel.java:766)
    at hudson.remoting.Request$2.run(Request.java:388)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:73)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
 
ERROR: Connection terminated
java.io.StreamCorruptedException: invalid stream header: 00025B42
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:866)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
    at hudson.remoting.Command.readFrom(Command.java:142)
    at hudson.remoting.Command.readFrom(Command.java:128)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Dec 04, 2020 5:32:30 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel channel.
java.util.concurrent.TimeoutException: Ping started at 1607120910589 hasn't completed by 1607121150590
    at hudson.remoting.PingThread.ping(PingThread.java:134)
    at hudson.remoting.PingThread.run(PingThread.java:90)

Thanks,
Jigar R

Jeremy Mordkoff

Dec 10, 2020, 11:09:05 AM
to Jenkins Users
This has been discussed before. Did you search this group's archives? You might be better off re-starting a conversation in one of those threads. 

My own experience is that I too saw this issue for many months. It cleared up on its own. In that interim I did upgrade jenkins more than once and I did switch from dualstack ipv4/ipv6 to ipv4 only for all jenkins work as I was getting errors deploying containers when ipv6 was enabled. I have no idea if either of these actually mattered, and I'm sure other things changed that I am assuming are completely unrelated but may not be. 

Ivan Fernandez Calvo

Dec 11, 2020, 12:11:39 PM
to Jenkins Users
The most common cause is a disconnection because there is no traffic between the Jenkins instance and the agent. To address that, either tune the TCP stack of your OS (see https://support.cloudbees.com/hc/en-us/articles/115001369667-dedicated-SSH-agents-formerly-slaves-get-disconnected) or enable the keepalive option in the SSH protocol. The latter can be configured by setting ClientAliveInterval or TCPKeepAlive on the SSH server (/etc/ssh/sshd_config), or by setting the ServerAliveInterval or TCPKeepAlive options for the user connection (/etc/ssh/ssh_config or ~/.ssh/config).

https://www.freebsd.org/cgi/man.cgi?sshd_config(5)
https://www.freebsd.org/cgi/man.cgi?ssh_config(5)

Also, check that you follow the best practices for configuring your SSH agents, and enable verbose SSH log output in your service (see https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/doc/TROUBLESHOOTING.md).
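
As a rough illustration of the keepalive options named above, here is a minimal sketch that scans the text of an SSH configuration file and reports which of those options are explicitly set. The helper name `keepalive_settings` and the sample values (60, yes) are assumptions made for illustration, not recommendations from this thread, and the sketch ignores Host/Match blocks.

```python
# Hypothetical helper: report keepalive-related options found in SSH config text.
KEEPALIVE_OPTIONS = ("ClientAliveInterval", "ServerAliveInterval", "TCPKeepAlive")


def keepalive_settings(config_text):
    """Return {option: value} for the keepalive options set in config_text."""
    found = {}
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Keywords and values may be separated by whitespace or by "=".
        parts = line.replace("=", " ", 1).split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts
        for option in KEEPALIVE_OPTIONS:
            # Keywords are case-insensitive; the first value seen is kept.
            if key.lower() == option.lower() and option not in found:
                found[option] = value.strip()
    return found


# Sample content; the values are illustrative assumptions only.
sample = """
# /etc/ssh/sshd_config (excerpt)
ClientAliveInterval 60
#TCPKeepAlive yes
TCPKeepAlive yes
"""
settings = keepalive_settings(sample)
print(settings)
```

An option missing from the result is simply not set in that text, in which case the SSH defaults apply.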


jiga...@gmail.com

Dec 30, 2020, 5:40:38 PM
to Jenkins Users
Interestingly, whenever the Jenkins agent is executing a PMD/FindBugs task, it runs into "Resource temporarily unavailable".
I have
Please let me know if there is anything else I should consider.

Ivan Fernandez Calvo

Jan 1, 2021, 8:23:17 AM
to Jenkins Users
If the agent is running something, there is traffic, so it is not related to the keepalive settings. I wonder whether a remoting process is still running on the agent after the issue happens.
Did you check for hs_err_pid error files in the root fs of the agent? http://www.oracle.com/technetwork/java/javase/felog-138657.html#gbwcy
How much memory did you set for your remoting process?
How much memory does your Maven build need?
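
A minimal sketch of the hs_err_pid check, assuming the JVM wrote its fatal error log under its default name (hs_err_pid<pid>.log) somewhere below a directory you choose, for example the agent's remote file system root. The function name `find_jvm_crash_logs` and the path in the usage comment are hypothetical.

```python
from pathlib import Path


def find_jvm_crash_logs(root):
    """Return the sorted paths of hs_err_pid*.log files found under root."""
    return sorted(Path(root).rglob("hs_err_pid*.log"))


# Usage (the path is a placeholder for your agent's directory):
# for path in find_jvm_crash_logs("/path/to/agent/root"):
#     print(path)
```

An empty result only means no crash log was found under that directory; a JVM started with a custom error-file location would write it elsewhere.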

Jeff Thompson

Jan 4, 2021, 12:37:52 PM
to jenkins...@googlegroups.com

When I saw something like this in the past, it was because the process was running out of resources. Specifically when running SpotBugs, I got out-of-memory errors. I had to modify the pom to allocate more memory.

I recommend ensuring your build runs normally on the agent without the additional complexities of the Jenkins environment. This might show where additional resources are needed. If that all passes, then continue on to the additional troubleshooting steps involving the Jenkins controller and agent.

Jeff Thompson

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/030aced0-ba6e-4012-a60a-58208d350544n%40googlegroups.com.

Jigar R

Feb 1, 2021, 5:52:19 PM
to Jenkins Users
On Monday, January 4, 2021 at 12:37:52 PM UTC-5 jtho...@cloudbees.com wrote:
>When I saw something like this in the past, it was because the process was running out of resources. Specifically when running SpotBugs, I got out-of-memory errors. I had to modify the pom to allocate more memory.

I have 2 different kinds of Jenkins agents.
1. Java Web Start
2. SSH
If memory were the issue, wouldn't it fail in both cases?

I do see "Agent went offline during build Connection was broken: java.io.StreamCorruptedException: invalid stream header:".

Ivan Fernandez Calvo

Feb 2, 2021, 12:20:19 PM
to Jenkins Users
>I have 2 different kinds of Jenkins agents.
>1. Java Web Start
>2. SSH
>If memory were the issue, wouldn't it fail in both cases?

Not necessarily. To start with, the two agent types establish the connection in different ways. Also, JNLP agents may not update the remoting jar file (it depends on your configuration), so you could be running different versions of remoting. I agree with Jeff that this looks like an OOM issue; review my comments at https://groups.google.com/g/jenkinsci-users/c/nD3s06hSUXE/m/BQKk5GSYBwAJ. My recommendation is to fix the memory for the remoting process at 1024M (-Xmx1024m -Xms1024m) and see whether the issue disappears or changes. If it disappears, you would then have to adjust the remoting process memory to the right value between 256M and 1024M. Using 512M is usually safe and not too much (but that depends on your agents' memory, and we do not know how much they have).

Jigar R

Feb 2, 2021, 4:52:08 PM
to Jenkins Users
I created the SSH agent with -Xmx1024m -Xms1024m.
I do see a bunch of warnings about "Ping failed. Terminating the channel channel".
I got the following error:
"
  WARNING: Failed to send back a reply to the request hudson.remoting.Request$....
  java.io.IOException: Resource temporarily unavailable
          at java.io.FileOutputStream.writeBytes(Native Method)
          .....

kuisathaverat

Feb 2, 2021, 5:27:07 PM
to jenkins...@googlegroups.com
Weird. Could you share a screen capture of what you configured? Please also share the whole exception; those lines alone mean nothing. Knowing the version of Jenkins and the version of the SSH Build Agents plugin you use would help, as would the memory you have in your agents and whether they are bare metal or cloud. Overall, if you want help, please provide more context.


Jigar R

Feb 4, 2021, 8:58:12 AM
to Jenkins Users
Hello Ivan,

I have attached the logs to this email:
  • jenkins.log - Jenkins build output
  • jenkins-agent.log - output from the Jenkins SSH agent
The Jenkins SSH agent was created with the following settings:
  • Launch method: Launch agents via SSH
  • JavaPath: $JAVA_HOME
  • JVM options: -Xmx2048m -Xms2048m
  • Use TCP_NODELAY flag on the SSH connection: enabled
Environment information:

  • Jenkins v2.249.1
  • RH6
  • SSH agent plugin 1.20
  • SSH slaves plugin 1.30.4

jenkins-agent.log
jenkins.log

kuisathaverat

Feb 4, 2021, 9:29:17 AM
to jenkins...@googlegroups.com
I see some serialization failures, and these break the channel. The plugin that causes the exception seems to be https://github.com/jenkinsci/tasks-plugin, and I think the `[Deprecated] Scan workspace for open tasks` step is what matters. This plugin has been integrated into https://github.com/jenkinsci/warnings-ng-plugin and https://github.com/jenkinsci/analysis-model

ERROR: Step ‘[Deprecated] Scan workspace for open tasks’ aborted due to exception:
java.io.StreamCorruptedException: invalid type code: 6D
...
    at hudson.plugins.tasks.TasksPublisher.perform(TasksPublisher.java:182)
    at hudson.plugins.analysis.core.HealthAwarePublisher.perform(HealthAwarePublisher.java:69)
    at hudson.plugins.analysis.core.HealthAwareRecorder.perform(HealthAwareRecorder.java:298)
    at jenkins.tasks.SimpleBuildStep.perform(SimpleBuildStep.java:112)
    at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:78)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
    at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
    at hudson.model.Build$BuildExecution.post2(Build.java:186)
    at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:635)
    at hudson.model.Run.execute(Run.java:1919)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:428)
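
For context on the `invalid stream header: 00025B42` error in the first message: a Java object serialization stream starts with the magic number 0xACED followed by the version 0x0005, and `ObjectInputStream` rejects a stream that starts with anything else. The sketch below only illustrates that check; the helper name is hypothetical, and it says nothing about why unexpected bytes ended up on the channel.

```python
# STREAM_MAGIC (0xACED) followed by STREAM_VERSION (0x0005).
JAVA_STREAM_HEADER = bytes.fromhex("ACED0005")


def is_java_serialization_header(first_bytes):
    """Return True if the first four bytes are a valid Java serialization header."""
    return bytes(first_bytes[:4]) == JAVA_STREAM_HEADER


reported = bytes.fromhex("00025B42")  # the value reported in this thread
print(is_java_serialization_header(reported))            # False
print(is_java_serialization_header(JAVA_STREAM_HEADER))  # True
```

In other words, the reader on the controller side was expecting the start of a serialized command but received other bytes, which fits a stream that was interrupted or interleaved with other output.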

Jeff Thompson

Feb 4, 2021, 11:25:17 AM
to jenkins...@googlegroups.com

I agree with Ivan. There are differences, sometimes subtle, between how different agents behave, for various reasons. The basic operations of the plugins and the protocol should be the same. There can be differences in resource usage, platforms, etc. Sometimes plugins will behave differently on different ones.

Try out some of Ivan's suggestions or other troubleshooting like that, and figure out how to isolate the problem.

Jeff Thompson

Jigar R

Feb 4, 2021, 12:20:21 PM
to jenkins...@googlegroups.com
Thanks for this information. I will move to the Warnings NG plugin and see whether this still breaks.

Jigar R

Feb 4, 2021, 5:26:13 PM
to Jenkins Users
I updated the Jenkins job to use Warnings NG instead of the deprecated plugins. After lots of trial and error, I found that the Jenkins SSH agent throws an EOFException while running the JaCoCo plugin v3.0.7 (https://plugins.jenkins.io/jacoco/). Logs attached.
jenkins2.log

Jigar R

Feb 9, 2021, 3:23:35 PM
to Jenkins Users
Any recommendations on how I should go about this new error?

kuisathaverat

Feb 10, 2021, 9:30:28 AM
to jenkins...@googlegroups.com
You attached the Jenkins build log and the agent log. There should also be an exception in the Jenkins log. Is it the same one you posted before, `invalid type code: 6D`?

Jigar R

Feb 12, 2021, 11:32:37 AM
to Jenkins Users
This is all I see in the jenkins.err log:

2021-02-11 21:44:00.121+0000 [id=67650] INFO    h.r.SynchronousCommandTransport$ReaderThread#run: I/O error in channel ssh_agent
java.io.EOFException
    at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:3106)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1956)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1567)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at hudson.remoting.Command.readFromObjectStream(Command.java:155)
    at hudson.remoting.Command.readFrom(Command.java:142)
    at hudson.remoting.Command.readFrom(Command.java:128)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
2021-02-11 21:44:00.122+0000 [id=67569] WARNING h.m.AbstractBuild$AbstractBuildExecution#reportError: Step ‘Record compiler warnings and static analysis results’ aborted due to exception:
java.io.IOException: No workspace found for JOB_NAME #116
    at io.jenkins.plugins.analysis.core.steps.IssuesRecorder.perform(IssuesRecorder.java:577)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
    at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
    at hudson.model.Build$BuildExecution.post2(Build.java:186)
    at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:635)
    at hudson.model.Run.execute(Run.java:1919)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:428)
2021-02-11 21:44:00.283+0000 [id=67569] WARNING hudson.ivy.IvyBuildTrigger#recomputeModuleDescriptor: Cannot read ivy file backup...removing ModuleDescriptor
2021-02-11 21:44:00.287+0000 [id=67569] WARNING hudson.ivy.IvyBuildTrigger#getModuleDescriptor: Node is offline for JOB_NAME, using project to get Module Descriptor

Iván Fernández Calvo

Feb 12, 2021, 1:09:25 PM
to jenkins...@googlegroups.com
This looks like an abrupt disconnect, which points to the resources you give to the remoting process and the resources you have on the agent. Resource management on a JVM is tricky: if you do not pass -Xmx and -Xms settings, the JVM sizes the heap itself as a fraction of the machine's memory (by default the maximum heap is typically a quarter of physical memory). So I wonder how much memory you have on those agents and whether you set those limits. In my experience, less than 4GB for Surefire tasks tends to be tight.

Regards,
Ivan Fernandez Calvo



Jigar Rathod

Feb 12, 2021, 1:39:33 PM
to jenkins...@googlegroups.com


I am using -Xmx2048m -Xms2048m. I will increase it to 4096 and see what happens. 

Thanks, 
Jigar R

kuisathaverat

Feb 13, 2021, 8:35:07 AM
to jenkins...@googlegroups.com
The memory for the remoting process is not the issue; it is the free memory you leave for the builds and the system. I don't know how much memory your agents have, but let's say they have 4GB. If you give 2GB to the remoting process, you only have 2GB left for your builds and the system tasks. So if you run a Maven build that generates reports, it will probably need more than 4GB but only have 2GB. The Maven process will die, and because it grabs all the memory it will probably take down another system process, so the agent will drop the SSH connection.
The remoting process usually does not need more than 512MB (-Xmx512m -Xms512m). I usually keep 1-2GB for the system (SSH, Docker, and other services), so if your Maven build needs 4GB, the agent will need at least 6-8GB. But all of this is a guess, because I don't know how much memory your build needs (Maven or whatever) or which services you have running on your agents, and so on.
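
The budgeting above can be written out as simple arithmetic. This is a minimal sketch using only the figures from this message (512MB for remoting, 1-2GB for the system, a 4GB build); the function names are hypothetical, and the real needs of a given build are unknown.

```python
def required_agent_memory_gb(build_gb, remoting_gb=0.5, system_gb=(1.0, 2.0)):
    """Return the (low, high) total memory an agent needs, in GB."""
    low = build_gb + remoting_gb + system_gb[0]
    high = build_gb + remoting_gb + system_gb[1]
    return low, high


def memory_left_gb(total_gb, remoting_gb):
    """Return what remains for builds and system tasks after the remoting heap."""
    return total_gb - remoting_gb


# A 4GB build with a 512MB remoting process and 1-2GB kept for the system.
low, high = required_agent_memory_gb(build_gb=4)
print(f"{low:.1f}-{high:.1f} GB")  # 5.5-6.5 GB

# A 4GB agent whose remoting process is given 2GB.
print(memory_left_gb(total_gb=4, remoting_gb=2))  # 2
```

The plain sum comes to 5.5-6.5GB, so the "at least 6-8GB" figure above includes some headroom on top of it.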

Jigar R

Mar 1, 2021, 4:03:24 PM
to Jenkins Users
I have now configured my Jenkins agent with over 16GB of memory. I use Apache Ivy and Apache Ant to build the project. I still get "Resource temporarily unavailable".

On the other hand, it works like a charm with the Java Web Start agent that I have. I haven't configured any JVM options for it, and it just works smoothly. This is nerve-racking.

Jigar R

Mar 1, 2021, 5:05:15 PM
to Jenkins Users
The Java Web Start agent uses agent.jar, whereas the SSH agent uses remoting.jar. Is there a difference between them?

kuisathaverat

Mar 1, 2021, 6:19:57 PM
to jenkins...@googlegroups.com
It is the same file with different names. SSH agents update it automatically every time you change the Jenkins core version; for a JNLP agent, it depends on whether you have configured the auto-update or you update the jar manually.
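
One way to check whether two agents are actually running the same copy of the jar is to compare checksums. A minimal sketch, assuming you have copied both files to one machine; the function names and the paths in the usage comment are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """Return True if both files have identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)


# Usage (paths are placeholders):
# print(same_file_contents("/path/to/agent.jar", "/path/to/remoting.jar"))
```

Different digests would suggest the two agents run different remoting versions, which is the situation described above for a JNLP agent that was not updated.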
