Rundeck 4.17.1 SSH execution issue


Gorazd Žagar

Nov 6, 2023, 10:33:02 AM
to rundeck-discuss
Hi, I've been using Rundeck for over 8 years now and have recently upgraded from 3.2.2 to 4.17.1. I started observing an interesting problem after the upgrade which occurs only on some nodes.

When the following is used:
  • a sudo invocation string of sudo -E, alone or combined with -S
  • Default Node Executor is set to SSH
  • Pass RD_* Variables is enabled
The script won't execute on some nodes, although all nodes have the same OpenSSH version and the same overall configuration (server and sshd_config).

sudo is set up correctly in the Rundeck configuration; mind you, we have been using Rundeck for a long time without problems.

Projects were migrated using archive export/import.

What I observed is that Rundeck most likely does not send the sudo password: strace shows the sudo process blocked on a read from stdin:

# strace -p 4030723
strace: Process 4030723 attached
read(0,

4030260 ?        Ss     0:00  \_ sshd: rundeck [priv]
4030360 ?        S      0:00      \_ sshd: rundeck@pts/1
4030723 pts/1    Ss+    0:00          \_ sudo -S -E /var/tmp/390-101884-CTNJ001-dispatch-script.tmp.sh


The environment variables are properly sent (verified through /proc/4030723/environ).
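For anyone who wants to repeat that check, here is a minimal Python sketch that lists the RD_* variables of a running process. It relies only on the Linux /proc/<pid>/environ format (NUL-separated KEY=value entries); the function names are made up for illustration, and reading another user's environ file normally requires root.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=value entries found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skip the empty trailing entry and malformed items
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def rd_variables(pid: int) -> Dict[str, str]:
    """Return only the RD_* variables of the given process (hypothetical helper)."""
    with open(f"/proc/{pid}/environ", "rb") as fh:
        env = parse_environ(fh.read())
    return {k: v for k, v in env.items() if k.startswith("RD_")}
```

Calling rd_variables(4030723) as root on the node would show what the sudo process from the listing above actually received.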

If Pass RD_* Variables is disabled, the script is executed and the job succeeds.

I was wondering whether anybody else has encountered the same problem, and whether any of you have ideas on how to debug it.

Cheers, G.






Gorazd Žagar

Nov 6, 2023, 10:41:31 AM
to rundeck-discuss
I can confirm that the problem is Rundeck not sending the sudo password when prompted. This only happens on some servers, and I'm trying to figure out why.

rac...@rundeck.com

Nov 6, 2023, 10:53:39 AM
to rundeck-discuss
Hi,

Do you see any clues in the service.log after dispatching commands or scripts to the affected nodes?

rac...@rundeck.com

Nov 6, 2023, 10:58:06 AM
to rundeck-discuss
Also, is the sudo configuration the same on all nodes (model source)?

Gorazd Žagar

Nov 6, 2023, 11:32:55 AM
to rundeck-discuss
No errors logged in the service.log.

No errors logged on the host either (auth.log, syslog).

My sudo configuration is in the framework.properties:

framework.sudo-command-enabled=true
framework.sudo-password-storage-path=keys/rundeck_sudo_passwd
framework.sudo-fail-on-prompt-timeout=false

But I will read the docs to determine if there's something more I can do around it.

All nodes have the same configuration.

I believe the latest version of Rundeck has an issue with this procedure. When the sending of RD_* variables is disabled, sudo -E/-S works well and Rundeck sends the password, whereas when RD_* variables are sent, the prompt for some reason keeps waiting for a password that is never sent (observed with strace).

rac...@rundeck.com

Nov 6, 2023, 1:17:24 PM
to rundeck-discuss
Hi!

I tried to replicate the issue with CentOS 9 as the remote SSH node, without success. Which OS (and versions) are affected in your case?

Regards.

Gorazd Žagar

Nov 6, 2023, 1:31:37 PM
to rundeck-discuss
Ubuntu 22.04.3 LTS

SSH version 8.9p1-3ubuntu0.4

Here's the strace from a server where sudo works (Rundeck responds by providing the password, masked here with # on the last line; rundeck is the username):

[pid 1969607] write(2, "[sudo] password for rundeck: ", 29) = 29
[pid 1969606] <... ppoll resumed>)      = 1 ([{fd=10, revents=POLLIN}], left {tv_sec=59999, tv_nsec=686174456})
[pid 1969606] rt_sigprocmask(SIG_UNBLOCK, [CHLD],  <unfinished ...>
[pid 1969607] read(0,  <unfinished ...>
[pid 1969606] <... rt_sigprocmask resumed>[CHLD], 8) = 0
[pid 1969606] read(10, "[sudo] password for rundeck: ", 32768) = 29
[pid 1969606] getpid()                  = 1969606
[pid 1969606] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 1969606] ppoll([{fd=4, events=POLLIN}, {fd=4, events=POLLOUT}, {fd=10, events=POLLIN}], 3, {tv_sec=60000, tv_nsec=0}, [], 8) = 1 ([{fd=4, revents=POLLOUT}], left {tv_sec=59999, tv_nsec=999996560})
[pid 1969606] rt_sigprocmask(SIG_UNBLOCK, [CHLD], [CHLD], 8) = 0
[pid 1969606] write(4, "j6\206\314-\345AQ1O\33\204(\3\354\f\33{\322\234\362\353\230E\234\222\357\315&\236\273S\353\5\34S\274\v\17\324i\337\356:\316%\324\335'\324\271z>\270\372>{<\4\335\212\16\333\2329,f\314\331\261J\21\362\206i\347\314i\278", 80) = 80
[pid 1969606] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 1969606] ppoll([{fd=4, events=POLLIN}, {fd=4, events=0}, {fd=10, events=POLLIN}], 3, {tv_sec=60000, tv_nsec=0}, [], 8) = 1 ([{fd=4, revents=POLLIN}], left {tv_sec=59999, tv_nsec=662740088})
[pid 1969606] rt_sigprocmask(SIG_UNBLOCK, [CHLD], [CHLD], 8) = 0
[pid 1969606] read(4, "\244\326\22\344\203$fAx\236\354'\200:\341\0Q\232o\374t6\200\265\214\213\265}!#\375\315\241\rm\217\230\315\545\241\217\345q\366\304\335\34\233l*\230\tL{\37\334Z\25\245I\231\260\272\304\24\310S\327\260\340\3103-\251\23\217|\333\353;\252\366+/)A\255\254\304\315(\360+\221\35d", 262144) = 96
[pid 1969606] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 1969606] ppoll([{fd=4, events=POLLIN}, {fd=4, events=0}, {fd=10, events=POLLIN}, {fd=7, events=POLLOUT}], 4, {tv_sec=60000, tv_nsec=0}, [], 8) = 1 ([{fd=7, revents=POLLOUT}], left {tv_sec=59999, tv_nsec=999995370})
[pid 1969606] rt_sigprocmask(SIG_UNBLOCK, [CHLD], [CHLD], 8) = 0
[pid 1969606] write(7, "#################\n", 23) = 23


Here's the problematic server, where the prompt stays open, the password is never supplied, and thus the script never executes:

[pid 4159385] write(2, "[sudo] password for rundeck: ", 29) = 29
[pid 4159042] <... ppoll resumed>)      = 1 ([{fd=10, revents=POLLIN}], left {tv_sec=59999, tv_nsec=710411812})
[pid 4159385] read(0,  <unfinished ...>
[pid 4159042] rt_sigprocmask(SIG_UNBLOCK, [CHLD], [CHLD], 8) = 0
[pid 4159042] read(10, "[sudo] password for rundeck: ", 32768) = 29
[pid 4159042] getpid()                  = 4159042
[pid 4159042] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 4159042] ppoll([{fd=4, events=POLLIN}, {fd=4, events=POLLOUT}, {fd=10, events=POLLIN}], 3, {tv_sec=60000, tv_nsec=0}, [], 8) = 1 ([{fd=4, revents=POLLOUT}], left {tv_sec=59999, tv_nsec=999996084})
[pid 4159042] rt_sigprocmask(SIG_UNBLOCK, [CHLD], [CHLD], 8) = 0
[pid 4159042] write(4, "\303q\211I\272(\264\356\367\355\242\355\373\26\3k\2541\305\246\367\211d\243\211\n\2\7\256!\311\5A\373\300\321\226|:\224\256\311Y|=\263\327\327\316\457\5X\367\4\202\225\r\35\272\31\341\247\323x\365\356lL\216?\2B\276\360\24P\266L!\231", 80) = 80
[pid 4159042] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 4159042] ppoll([{fd=4, events=POLLIN}, {fd=4, events=0}, {fd=10, events=POLLIN}], 3, {tv_sec=60000, tv_nsec=0}, [], 8

(it stops here, no further processing of SSH)
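The difference between the two traces can be checked mechanically. In the working trace, sshd completes a read on fd 4 after forwarding the prompt and then writes the password to the pty; in the failing trace, the final ppoll on fd 4 never returns. The sketch below looks only for that completed read. It assumes fd 4 is the network socket, as it appears to be in both traces, and the function name is made up for illustration.

```python
import re

PROMPT_MARKER = "[sudo] password for"


def reply_arrived_after_prompt(trace: str, socket_fd: int = 4) -> bool:
    """Return True if a read on socket_fd completed after the sudo prompt appeared.

    socket_fd=4 is an assumption taken from the traces above; adjust it for
    other captures.
    """
    lines = trace.splitlines()
    prompt_at = next(
        (i for i, line in enumerate(lines) if PROMPT_MARKER in line), None
    )
    if prompt_at is None:
        return False  # no sudo prompt in this trace at all
    # Matches completed calls such as: read(4, "...", 262144) = 96
    completed_read = re.compile(rf"\bread\({socket_fd}, .*\) = (\d+)\s*$")
    for line in lines[prompt_at + 1:]:
        match = completed_read.search(line)
        if match and int(match.group(1)) > 0:
            return True
    return False
```

Fed the first trace it returns True, and fed the second it returns False, which is consistent with the password never arriving from the Rundeck side rather than being dropped on the node.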

I still have the old version 3.2.2 running and there are no issues there: same project, same script, same node configuration.

The problem appears to be a combination of server configuration and Rundeck, but I am wondering whether the SSH plugin has been modified in any way since the 3.2.2 release (I haven't checked the release notes).

I will run a network dump now to see whether Rundeck sends the password and SSH simply does not process it, or whether it never sends a password at all.

G.

Gorazd Žagar

Nov 7, 2023, 6:06:07 AM
to rundeck-discuss
OK, I have managed to resolve this issue. The problem was with a few nodes that have higher latency. Since the upgrade, the SSH plugin seems to be slower, or Rundeck is processing jobs more slowly in general. As a result, on these nodes it took longer for the sudo prompt to be detected and processed. It appears that the timeout for the sudo prompt, which was never reached in the previous version, is now being reached on certain nodes. I solved this by increasing framework.sudo-prompt-max-timeout. I do not recall that running a job in debug mode actually printed a warning that the timeout waiting for sudo had been reached. It would be extremely useful to have this kind of debugging available.
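To illustrate the failure mode, here is a small Python simulation of a prompt timeout. It is not Rundeck's implementation; the function and parameter names are invented, and the delays are in seconds and scaled down so it runs quickly. The idea is that a reader waits a bounded time for the sudo prompt; if the prompt arrives later than that, the reader has already given up and nobody answers it.

```python
import queue
import threading


def password_would_be_sent(prompt_delay_s: float, max_timeout_s: float) -> bool:
    """Simulate waiting for a sudo prompt with a bounded timeout.

    prompt_delay_s stands in for node latency plus processing time, and
    max_timeout_s for a setting such as sudo-prompt-max-timeout. Both names
    are illustrative, not Rundeck API.
    """
    output = queue.Queue()
    # Simulated remote side: the prompt shows up after some delay.
    timer = threading.Timer(
        prompt_delay_s, output.put, args=("[sudo] password for rundeck: ",)
    )
    timer.daemon = True
    timer.start()
    try:
        line = output.get(timeout=max_timeout_s)
    except queue.Empty:
        # Timed out: the reader stops looking for the prompt, so when the
        # prompt finally appears it is left waiting on stdin indefinitely.
        return False
    finally:
        timer.cancel()
    return "password for" in line
```

With a short delay, for example password_would_be_sent(0.01, 0.5), the prompt is seen in time; with password_would_be_sent(0.5, 0.05) it is not. Presumably, with framework.sudo-fail-on-prompt-timeout=false as in the configuration earlier in the thread, reaching the timeout does not fail the step, which would explain why the job hung without any error being reported.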