job hangs on long commands

tu...@improve.ro

Apr 10, 2013, 12:05:22 PM
to rundeck...@googlegroups.com
Hi guys,

I created a simple job that runs remotely on 3 EC2 nodes in Amazon.

The job has only 2 steps:
Step 1/ sleeps for one hour
Step 2/ writes a message to the output

The problem is that the job hangs on the first step and never finishes. The issue is reproducible only when the Rundeck runner stays too long on a single command, in my case the sleep command. If I reduce the sleep to 20 minutes, it works fine.

Below you can find the execution log; the job definition is attached.
I really need a workaround for this and would appreciate any help with this scenario.

Thanks a lot
Tudor


^^^07:41:19|CONFIG|admin|||sj1slm435||[workflow] Begin execution: rundeck-workflow-step-first context: null^^^
^^^07:41:19|CONFIG|admin|||sj1slm435||NodeSet: NodeSet{excludes={name=i-c01866f2, dominant=true, }, includes={tags=mongodb-10gen, dominant=false, }}^^^
^^^07:41:19|CONFIG|admin|||sj1slm435||Workflow: WorkflowImpl{commands=[com.dtolabs.rundeck.execution.ExecutionItemFactory$1@2902a840, com.dtolabs.rundeck.execution.ExecutionItemFactory$1@6d7416c8], threadcount=1, keepgoing=false, strategy=step-first}^^^
^^^07:41:19|CONFIG|admin|||sj1slm435||data context: {job={id=e8250421-e2d8-429a-9039-a02717dccc87, project=oakPerformance, loglevel=VERBOSE, username=admin, name=testBlockingIssue, group=null, execid=10}, option={MongosNumber=3, MONGOS_MAIN_PLATFORM=ec2-50-112-80-21.us-west-2.compute.amazonaws.com, TestName=OakTest#testPyramidStructure, OakType=mongomk}}^^^
^^^07:41:19|CONFIG|||1-NodeDispatch-script|||[workflow] Begin step: 1,NodeDispatch^^^
^^^07:41:19|CONFIG|||1-NodeDispatch-script|||1: Workflow step executing: com.dtolabs.rundeck.execution.ExecutionItemFactory$1@2902a840^^^
^^^07:41:19|CONFIG|||1-NodeDispatch-script|||preparing for parallel execution...(keepgoing? false, threads: 20)^^^
^^^07:41:19|CONFIG|||1-NodeDispatch-script|||Create task for node: i-047c7d36^^^
^^^07:41:19|CONFIG|||1-NodeDispatch-script|||Create task for node: i-067c7d34^^^
^^^07:41:19|CONFIG|||1-NodeDispatch-script|||Create task for node: i-0a7c7d38^^^
^^^07:41:19|CONFIG|||1-NodeDispatch-script|||parallel dispatch to nodes: [i-047c7d36, i-0a7c7d38, i-067c7d34]^^^
^^^07:41:19|CONFIG|jslave||1-NodeDispatch-script|i-067c7d34||[workflow] beginExecuteNodeStep(i-067c7d34): NodeDispatch: com.dtolabs.rundeck.execution.ExecutionItemFactory$1@2902a840^^^
^^^07:41:19|CONFIG|jslave||1-NodeDispatch-script|i-047c7d36||[workflow] beginExecuteNodeStep(i-047c7d36): NodeDispatch: com.dtolabs.rundeck.execution.ExecutionItemFactory$1@2902a840^^^
^^^07:41:19|CONFIG|jslave||1-NodeDispatch-script|i-0a7c7d38||[workflow] beginExecuteNodeStep(i-0a7c7d38): NodeDispatch: com.dtolabs.rundeck.execution.ExecutionItemFactory$1@2902a840^^^
^^^07:41:20|CONFIG|jslave||1-NodeDispatch-script|i-047c7d36||Using ssh keyfile: /home/jslave/.ssh/id_rsa^^^
^^^07:41:20|CONFIG|jslave||1-NodeDispatch-script|i-047c7d36||Starting SSH Connection: jsl...@ec2-54-245-51-45.us-west-2.compute.amazonaws.com (i-047c7d36)^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Set timeout to 0^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Connecting to ec2-54-245-51-45.us-west-2.compute.amazonaws.com:22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Connecting to ec2-54-245-51-45.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Disconnecting from ec2-54-244-80-87.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Caught an exception, leaving main loop due to Socket closed^^^
^^^07:41:20|CONFIG|jslave||1-NodeDispatch-script|i-0a7c7d38||Using ssh keyfile: /home/jslave/.ssh/id_rsa^^^
^^^07:41:20|CONFIG|jslave||1-NodeDispatch-script|i-0a7c7d38||Starting SSH Connection: jsl...@ec2-54-244-80-87.us-west-2.compute.amazonaws.com (i-0a7c7d38)^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Set timeout to 0^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Connecting to ec2-54-244-80-87.us-west-2.compute.amazonaws.com:22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Connecting to ec2-54-244-80-87.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Disconnecting from ec2-50-112-80-21.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Connection established^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Caught an exception, leaving main loop due to Socket closed^^^
^^^07:41:20|CONFIG|jslave||1-NodeDispatch-script|i-067c7d34||Using ssh keyfile: /home/jslave/.ssh/id_rsa^^^
^^^07:41:20|CONFIG|jslave||1-NodeDispatch-script|i-067c7d34||Starting SSH Connection: jsl...@ec2-50-112-80-21.us-west-2.compute.amazonaws.com (i-067c7d34)^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Set timeout to 0^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Connecting to ec2-50-112-80-21.us-west-2.compute.amazonaws.com:22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Connecting to ec2-50-112-80-21.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Connection established^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Connection established^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Remote version string: SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu7^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Local version string: SSH-2.0-JSCH-0.1.45^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes256-ctr is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes192-ctr is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes256-cbc is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes192-cbc is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||arcfour256 is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||CheckKexes: diffie-hellman-group14-sha1^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||diffie-hellman-group14-sha1 is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_KEXINIT sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Remote version string: SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu7^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Local version string: SSH-2.0-JSCH-0.1.45^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_KEXINIT received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||kex: server->client aes128-ctr hmac-md5 none^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||kex: client->server aes128-ctr hmac-md5 none^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes256-ctr is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes192-ctr is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes256-cbc is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes192-cbc is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||arcfour256 is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||CheckKexes: diffie-hellman-group14-sha1^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||diffie-hellman-group14-sha1 is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_KEXINIT sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Remote version string: SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu7^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Local version string: SSH-2.0-JSCH-0.1.45^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_KEXDH_INIT sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||expecting SSH_MSG_KEXDH_REPLY^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes256-ctr is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes192-ctr is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes256-cbc is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes192-cbc is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||arcfour256 is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||CheckKexes: diffie-hellman-group14-sha1^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||diffie-hellman-group14-sha1 is not available.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_KEXINIT sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_KEXINIT received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||kex: server->client aes128-ctr hmac-md5 none^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||kex: client->server aes128-ctr hmac-md5 none^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_KEXDH_INIT sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||expecting SSH_MSG_KEXDH_REPLY^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_KEXINIT received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||kex: server->client aes128-ctr hmac-md5 none^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||kex: client->server aes128-ctr hmac-md5 none^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_KEXDH_INIT sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||expecting SSH_MSG_KEXDH_REPLY^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||ssh_rsa_verify: signature true^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Permanently added 'ec2-54-244-80-87.us-west-2.compute.amazonaws.com' (RSA) to the list of known hosts.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_NEWKEYS sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_NEWKEYS received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_SERVICE_REQUEST sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||ssh_rsa_verify: signature true^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Permanently added 'ec2-50-112-80-21.us-west-2.compute.amazonaws.com' (RSA) to the list of known hosts.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_NEWKEYS sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_NEWKEYS received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_SERVICE_REQUEST sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_SERVICE_ACCEPT received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_SERVICE_ACCEPT received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Authentications that can continue: publickey,keyboard-interactive,password^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Next authentication method: publickey^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Authentications that can continue: publickey,keyboard-interactive,password^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Next authentication method: publickey^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Authentication succeeded (publickey).^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Authentication succeeded (publickey).^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||ssh_rsa_verify: signature true^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Permanently added 'ec2-54-245-51-45.us-west-2.compute.amazonaws.com' (RSA) to the list of known hosts.^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_NEWKEYS sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_NEWKEYS received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_SERVICE_REQUEST sent^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_SERVICE_ACCEPT received^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Authentications that can continue: publickey,keyboard-interactive,password^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Next authentication method: publickey^^^
^^^07:41:20|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Authentication succeeded (publickey).^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-0a7c7d38||Adding reference: ant.PropertyHelper^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-0a7c7d38||Setting project property: sshexec.output -> ^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Disconnecting from ec2-54-244-80-87.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Caught an exception, leaving main loop due to Socket closed^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-0a7c7d38||Using ssh keyfile: /home/jslave/.ssh/id_rsa^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-0a7c7d38||Starting SSH Connection: jsl...@ec2-54-244-80-87.us-west-2.compute.amazonaws.com (i-0a7c7d38)^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Set timeout to 0^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Connecting to ec2-54-244-80-87.us-west-2.compute.amazonaws.com:22^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Connecting to ec2-54-244-80-87.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-067c7d34||Adding reference: ant.PropertyHelper^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-067c7d34||Setting project property: sshexec.output -> ^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Disconnecting from ec2-50-112-80-21.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-067c7d34||Using ssh keyfile: /home/jslave/.ssh/id_rsa^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Caught an exception, leaving main loop due to Socket closed^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-067c7d34||Starting SSH Connection: jsl...@ec2-50-112-80-21.us-west-2.compute.amazonaws.com (i-067c7d34)^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Set timeout to 0^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Connecting to ec2-50-112-80-21.us-west-2.compute.amazonaws.com:22^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Connecting to ec2-50-112-80-21.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Connection established^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Connection established^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Remote version string: SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu7^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Local version string: SSH-2.0-JSCH-0.1.45^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes256-ctr is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes192-ctr is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes256-cbc is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||aes192-cbc is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||arcfour256 is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||CheckKexes: diffie-hellman-group14-sha1^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||diffie-hellman-group14-sha1 is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_KEXINIT sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Remote version string: SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu7^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Local version string: SSH-2.0-JSCH-0.1.45^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes256-ctr is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes192-ctr is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes256-cbc is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||aes192-cbc is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||arcfour256 is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||CheckKexes: diffie-hellman-group14-sha1^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||diffie-hellman-group14-sha1 is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_KEXINIT sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_KEXINIT received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||kex: server->client aes128-ctr hmac-md5 none^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||kex: client->server aes128-ctr hmac-md5 none^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_KEXDH_INIT sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||expecting SSH_MSG_KEXDH_REPLY^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_KEXINIT received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||kex: server->client aes128-ctr hmac-md5 none^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||kex: client->server aes128-ctr hmac-md5 none^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_KEXDH_INIT sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||expecting SSH_MSG_KEXDH_REPLY^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||ssh_rsa_verify: signature true^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Permanently added 'ec2-54-244-80-87.us-west-2.compute.amazonaws.com' (RSA) to the list of known hosts.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_NEWKEYS sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_NEWKEYS received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_SERVICE_REQUEST sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||ssh_rsa_verify: signature true^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Permanently added 'ec2-50-112-80-21.us-west-2.compute.amazonaws.com' (RSA) to the list of known hosts.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_NEWKEYS sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_NEWKEYS received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_SERVICE_REQUEST sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||SSH_MSG_SERVICE_ACCEPT received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||SSH_MSG_SERVICE_ACCEPT received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Authentications that can continue: publickey,keyboard-interactive,password^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Next authentication method: publickey^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Authentications that can continue: publickey,keyboard-interactive,password^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Next authentication method: publickey^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-047c7d36||Adding reference: ant.PropertyHelper^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-047c7d36||Setting project property: sshexec.output -> ^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Disconnecting from ec2-54-245-51-45.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-047c7d36||Using ssh keyfile: /home/jslave/.ssh/id_rsa^^^
^^^07:41:21|CONFIG|jslave||1-NodeDispatch-script|i-047c7d36||Starting SSH Connection: jsl...@ec2-54-245-51-45.us-west-2.compute.amazonaws.com (i-047c7d36)^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Set timeout to 0^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Connecting to ec2-54-245-51-45.us-west-2.compute.amazonaws.com:22^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Connecting to ec2-54-245-51-45.us-west-2.compute.amazonaws.com port 22^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Caught an exception, leaving main loop due to Socket closed^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Connection established^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-0a7c7d38||Authentication succeeded (publickey).^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-067c7d34||Authentication succeeded (publickey).^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Remote version string: SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu7^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Local version string: SSH-2.0-JSCH-0.1.45^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes256-ctr is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes192-ctr is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes256-cbc is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||aes192-cbc is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||arcfour256 is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||CheckKexes: diffie-hellman-group14-sha1^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||diffie-hellman-group14-sha1 is not available.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_KEXINIT sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_KEXINIT received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||kex: server->client aes128-ctr hmac-md5 none^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||kex: client->server aes128-ctr hmac-md5 none^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_KEXDH_INIT sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||expecting SSH_MSG_KEXDH_REPLY^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||ssh_rsa_verify: signature true^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Permanently added 'ec2-54-245-51-45.us-west-2.compute.amazonaws.com' (RSA) to the list of known hosts.^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_NEWKEYS sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_NEWKEYS received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_SERVICE_REQUEST sent^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||SSH_MSG_SERVICE_ACCEPT received^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Authentications that can continue: publickey,keyboard-interactive,password^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Next authentication method: publickey^^^
^^^07:41:21|WARNING|jslave||1-NodeDispatch-script|i-047c7d36||Authentication succeeded (publickey).^^^
^^^07:41:22|INFO|jslave||1-NodeDispatch-script|i-0a7c7d38||***Start first step***^^^
^^^07:41:22|INFO|jslave||1-NodeDispatch-script|i-067c7d34||***Start first step***^^^
^^^07:41:22|INFO|jslave||1-NodeDispatch-script|i-047c7d36||***Start first step***^^^


testBlockingIssue.xml

Charles Scott

Apr 10, 2013, 12:33:31 PM
to rundeck...@googlegroups.com

Interesting that you have narrowed this down to the amount of time.

I have sporadic problems with dispatching to nodes where commands hang even though the job should have completed successfully (the job gets to the end and Rundeck seems not to detect the successful remote exit code of zero). I have never reliably reproduced the problem; it happens often enough to be annoying but rarely enough not to be critical. I've submitted thread dumps in the past, but we haven't come up with anything tangible. I'm not sure whether this is related to your issue, but I will make a note of it.

I'm on an older 1.4.1 instance and am hoping for a new 1.5.x release, as I have had better results testing development 1.5.1 SNAPSHOT builds which have had issues with remote/sudo dispatching.

Greg Schueler

Apr 10, 2013, 12:38:49 PM
to rundeck...@googlegroups.com
Hi Tudor,

Perhaps there's a reason you need a persistent hour-long ssh session before doing something, in which case I don't know of a workaround, and this sounds like some kind of bug or timeout problem. What version of Rundeck are you using? Another possibility is an issue in recording the execution result to the database, which would make it appear to "hang". There have been issues with this in earlier versions.

However, if you don't need the ssh session to be connected for so long and you are using Rundeck 1.5, you could try adding another step that uses the Localexec plugin (or implement a simple script-based workflow step plugin that can perform a sleep: http://rundeck.org/docs/developer/workflow-step-plugin-development.html#script-based-step-plugins). Then this "sleep" would happen on the Rundeck server before moving to the next step, which could be another command that runs over ssh.

If using a local sleep still has the same problem, that would be a good indication that there is a bug finalizing the execution state to the DB.



tu...@improve.ro

Apr 10, 2013, 1:29:23 PM
to rundeck...@googlegroups.com

Hi Greg,

Thanks for looking into this.

On Wednesday, April 10, 2013 7:38:49 PM UTC+3, Greg Schueler wrote:
Hi Tudor,

Perhaps there's a reason you need a persistent hour-long ssh session before doing something, in which case  I don't know of a workaround, and this sounds like some kind of bug or timeout problem.  
Unfortunately :), I do need very long ssh sessions: I need them to launch some long Maven builds on the remote nodes. The sleep job was just a test to see whether I could reproduce the problem with other commands.
 
What version of Rundeck are you using?
 
I use 1.5, but I will run a test tomorrow with the latest snapshot from Jenkins to see if I'm lucky.

Another possibility is an issue in recording the execution result to the database, which would make it appear to "hang". There have been issues with this in earlier versions.

However, if you don't need the ssh session to be connected for so long and you are using Rundeck 1.5, you could try to add another step that uses the Localexec plugin, (or implement a simple script-based workflow step plugin: http://rundeck.org/docs/developer/workflow-step-plugin-development.html#script-based-step-plugins that can perform a sleep).  Then this "sleep" would happen on the Rundeck server before moving to the next step, which could be another command that happens over ssh. 
 
In my case, it's mandatory to have the Maven build running on the rundeck node (in Amazon), so local execution wouldn't help.

Do you think that changing the SSH configuration on the server (rundeck node) to add a keep-alive option would help?
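
For reference, a minimal sketch of what such a keep-alive option could look like, assuming the remote nodes run OpenSSH sshd (the log above reports OpenSSH_5.3p1). ClientAliveInterval and ClientAliveCountMax are standard sshd_config directives; the values 60 and 5 and the helper name keepalive_snippet are only illustrative assumptions. Nothing in this thread confirms that these settings resolve the hang, and client-side OpenSSH options may not apply here, because the log shows JSch as the SSH client.

```shell
# Hypothetical helper: prints candidate sshd keep-alive directives so they
# can be reviewed before being added to the server configuration.
keepalive_snippet() {
  cat <<'EOF'
# Send a probe to the client through the encrypted channel every 60 seconds.
ClientAliveInterval 60
# Close the session only after 5 consecutive unanswered probes.
ClientAliveCountMax 5
EOF
}

# Review the output, then add it to sshd_config on each target node and
# reload sshd (file location and reload command vary by distribution).
keepalive_snippet
```

The snippet only prints the directives; applying them is left as a manual step so that nothing on the node changes without review.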

Another workaround I was thinking of would be to launch the long process in the background (by ending the command in the script with &) so that the Rundeck runner advances to the next step, and then have that step check whether the long process is still running (a busy-waiting mechanism or something like that). I know it's ugly, but do you think it will work?

Thanks, 
Tudor

tu...@improve.ro

Apr 11, 2013, 6:21:04 AM
to rundeck...@googlegroups.com
Hi guys,

I came up with a workaround for running long commands remotely, and it seems to work.
Instead of running:

longCommand

you can do it like this:

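# Start the long command in the background with stdin closed and all
# output discarded (bash syntax).
nohup longCommand 0<&- &>/dev/null &
PID=$!
echo "$PID"
# Print a status line every 30 seconds until the background process exits.
while true
do
   sleep 30
   if ps -p "$PID" > /dev/null
   then
      echo "$PID is still running"
   else
      echo "Command was executed"
      break
   fi
done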
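
The same idea can be wrapped in a reusable function that also reports whether the long command succeeded. This is only a sketch under stated assumptions: run_with_heartbeat is a hypothetical name, not a Rundeck feature, and it relies on the command being a child of the same shell so that wait can still collect its exit status after the polling loop ends.

```shell
# run_with_heartbeat INTERVAL COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the background, prints a status line
# every INTERVAL seconds so the session is never silent, and finally returns
# COMMAND's exit status.
run_with_heartbeat() {
  local interval=$1
  shift
  # Detach stdin and discard all output, as in the workaround above.
  nohup "$@" </dev/null >/dev/null 2>&1 &
  local pid=$!
  echo "started PID $pid"
  # kill -0 sends no signal; it only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null
  do
    echo "$pid is still running"
    sleep "$interval"
  done
  # The process was started by this shell, so wait returns its exit status.
  wait "$pid"
}
```

A call such as run_with_heartbeat 30 longCommand keeps the 30-second polling of the original script, and the step's exit status then reflects the result of longCommand instead of always being zero.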


Cheers,
Tudor

