Job Always Times Out After 5 Minutes


Russ Robinson

Feb 2, 2021, 10:52:29 AM
to rundeck-discuss
Team,

  I'm using Rundeck Community 3.3.7, and our job (which performs a backup) times out after 5 minutes. Besides testing with a value of '0', we have tried setting the following (instead of '0'):
/etc/rundeck/framework.properties:
framework.ssh-connection-timeout = 86400000
framework.ssh-command-timeout = 86400000
Project settings:
project.ssh-command-timeout=86400000
project.ssh-connect-timeout=86400000
service.FileCopier.default.provider=jsch-scp
service.NodeExecutor.default.provider=jsch-ssh

The job definition also has Timeout value set to 2d.
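
For scale, here is a small sketch comparing the configured values with the observed cutoff. It assumes the SSH timeout properties are read as milliseconds and that "2d" means two days; the variable names are only illustrative.

```python
from datetime import timedelta

# Assumption: the ssh-*-timeout properties are interpreted as milliseconds.
ssh_timeout = timedelta(milliseconds=86_400_000)
job_timeout = timedelta(days=2)          # the job's "2d" Timeout value
observed_cutoff = timedelta(minutes=5)   # when the job actually dies

print(ssh_timeout)  # 1 day, 0:00:00
print(ssh_timeout > observed_cutoff and job_timeout > observed_cutoff)  # True
```

Under those assumptions every configured limit is far above 5 minutes, which suggests the cutoff comes from somewhere other than these settings.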

Any suggestions on how to fix this issue?

Russ Robinson

Feb 2, 2021, 10:58:46 AM
to rundeck-discuss
Also, since our Rundeck server runs on Linux, we tried the same commands with the Linux ssh command, as the user ID that runs the server. It timed out in the same way, and we had to add the following to the Linux rundeck user's $HOME/.ssh/config file:

Host *
  ServerAliveCountMax 3
  ServerAliveInterval 10
  TCPKeepAlive yes
  StrictHostKeyChecking no
  ConnectTimeout 240

Is it possible to add these kinds of settings to Rundeck's JSch (jsch-ssh) settings?
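
As a rough sketch of what the keepalive settings above do: according to the OpenSSH ssh_config manual, the client sends a probe after ServerAliveInterval seconds without data from the server and disconnects once ServerAliveCountMax probes go unanswered. The function name below is hypothetical.

```python
def seconds_until_disconnect(server_alive_interval: int,
                             server_alive_count_max: int) -> int:
    """Approximate time an unresponsive server is tolerated before ssh gives up."""
    return server_alive_interval * server_alive_count_max

# Values from the $HOME/.ssh/config shown above.
print(seconds_until_disconnect(10, 3))  # 30
```

The probes also generate traffic every 10 seconds on an otherwise silent session, which is a plausible reason the settings helped if an idle timeout on the network path was cutting the connection.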

rac...@rundeck.com

Feb 2, 2021, 11:38:18 AM
to rundeck-discuss
Hi Russ,

Take a look at this; it seems useful in your case.

Regards.

Xavier Humbert

Feb 2, 2021, 11:56:39 AM
to rundeck...@googlegroups.com, Russ Robinson

Hi Russ,

I've been hit (well, my users have, but it is my job to keep them happy) by the very same problem. Besides setting ServerAlive in the ssh options (which has no effect as far as Rundeck is concerned), I had to change the executors to OpenSSH:

service.FileCopier.default.provider=ssh-copier
service.NodeExecutor.default.provider=ssh-exec

HTH

Xavier


Russ Robinson

Feb 2, 2021, 3:33:30 PM
to rundeck-discuss
Thanks. I'm trying to switch over to OpenSSH. Unfortunately, my job just fails with a return code of 5 and no other details. Here is my setup:
  • Within my project, I have the following settings:
    • project.plugin.FileCopier.ssh-copier.authentication=password
    • project.plugin.FileCopier.ssh-copier.ssh_password_option=option.sshPassword
    • project.plugin.NodeExecutor.ssh-exec.authentication=password
    • project.plugin.NodeExecutor.ssh-exec.ssh_password_option=option.sshPassword
    • service.FileCopier.default.provider=ssh-copier
    • service.NodeExecutor.default.provider=ssh-exec
  • Within my node definition in the resource JSON file, I have:
      {
        "nodename": "mytest.server.com",
        "type": "Node",
        "hostname": "mytest.server.com",
        "osFamily": "unix",
        "sudo-command-enabled": "true",
        "sudo-password-option": "option.sshPassword",
        "username": "${option.username}",
        "tags": [
          ""
        ]
      },

Within my test job, the "sshPassword" option is still set to "Secure Remote Authentication". Running in debug mode, I just get:

[workflow] beginExecuteNodeStep(myrundeck.server.com): NodeDispatch: ScriptFileItem{label='Date Sleep Test', script=[109 chars]}
[ssh-copier] executing: [/bin/bash, /var/lib/rundeck/libext/cache/openssh-node-execution-2.0.1/ssh-copy.sh, ${node.username}, ${node.hostname}]
[ssh-copier]: result code: 5
[workflow] finishExecuteNodeStep(myrundeck.server.com): NodeDispatch: NonZeroResultCode: [ssh-copier]: external script failed with exit code: 5

Any suggestions on what else to look at?
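
One possible reading of the exit code, as a sketch: if the plugin's ssh-copy.sh wraps sshpass for password authentication (an assumption; the log above does not show it), the failure codes documented in the sshpass man page would apply. The `explain` helper is hypothetical.

```python
# Failure codes as documented in the sshpass(1) man page.
SSHPASS_EXIT_CODES = {
    1: "Invalid command line argument",
    2: "Conflicting arguments given",
    3: "General runtime error",
    4: "Unrecognized response from ssh (parse error)",
    5: "Invalid/incorrect password",
    6: "Host public key is unknown",
}

def explain(code: int) -> str:
    """Return the sshpass meaning of an exit code, if it has one."""
    return SSHPASS_EXIT_CODES.get(code, "not an sshpass-specific code")

print(explain(5))  # Invalid/incorrect password
```

If that assumption holds, code 5 would point at the password (or the user name it is paired with) not reaching the script as expected, rather than at a network or timeout problem.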

rac...@rundeck.com

Feb 2, 2021, 3:52:23 PM
to rundeck-discuss
Hi Russ,

It looks like a credentials issue. Anyway, if you have some time, give this a try to avoid the timeout issue.

Regards!

Russ Robinson

Feb 2, 2021, 4:06:06 PM
to rundeck-discuss
How do script-exec and script-copy obtain the password used for their ssh commands? In our scenario, each job prompts the user for their userid and password. We do not have generic ssh userids or keys, and nothing (userid or password) is hard-coded in the node definitions.

rac...@rundeck.com

Feb 2, 2021, 4:39:00 PM
to rundeck-discuss

Hi Russ,

Following this, you can set job-level authentication this way:

In the node definition:

<?xml version="1.0" encoding="UTF-8"?>
<project>
  <node name="node00"
        description="Node 00"
        tags="mytag"
        hostname="192.168.33.20"
        osArch="amd64"
        osFamily="unix"
        osName="Linux"
        osVersion="3.10.0-1062.4.1.el7.x86_64"
        username="${option.myuser}"
        ssh-authentication="password"
        ssh-password-option="option.sshPassword1"/>
</project>

Which works with the following job definition (like your scenario):

<joblist>
  <job>
    <context>
      <options preserveOrder='true'>
        <option name='sshPassword1' secure='true' />
        <option name='myuser' value='vagrant' />
      </options>
    </context>
    <defaultTab>nodes</defaultTab>
    <description></description>
    <dispatch>
      <excludePrecedence>true</excludePrecedence>
      <keepgoing>false</keepgoing>
      <rankOrder>ascending</rankOrder>
      <successOnEmptyNodeFilter>false</successOnEmptyNodeFilter>
      <threadcount>1</threadcount>
    </dispatch>
    <executionEnabled>true</executionEnabled>
    <id>b188c66c-c057-4bb7-98bf-7c84632bc144</id>
    <loglevel>INFO</loglevel>
    <name>Whoami</name>
    <nodeFilterEditable>false</nodeFilterEditable>
    <nodefilters>
      <filter>name: node00</filter>
    </nodefilters>
    <nodesSelectedByDefault>true</nodesSelectedByDefault>
    <plugins />
    <scheduleEnabled>true</scheduleEnabled>
    <sequence keepgoing='false' strategy='node-first'>
      <command>
        <exec>whoami</exec>
      </command>
    </sequence>
    <uuid>b188c66c-c057-4bb7-98bf-7c84632bc144</uuid>
  </job>
</joblist>

Hope it helps!
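
Since the node definition earlier in the thread lives in a resource JSON file, here is a sketch of how the same two SSH attributes might look there. It assumes the JSON resource format accepts the same attribute names as the XML one. The host name and option names are taken from the earlier post. Whether the OpenSSH executors honor these attributes is not confirmed in this thread.

```python
import json

# Node entry mirroring the XML attributes above, in JSON resource form.
node = {
    "nodename": "mytest.server.com",
    "hostname": "mytest.server.com",
    "osFamily": "unix",
    "username": "${option.username}",          # filled in from a job option
    "ssh-authentication": "password",
    "ssh-password-option": "option.sshPassword",  # must match a secure job option
}

print(json.dumps([node], indent=2))
```

The job would then need a secure option named sshPassword and a plain option named username, matching the references in the node entry.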
