Async start a process and check until its listen port condition is successful

Mohtashim S

Jun 20, 2022, 12:03:34 PM

to Ansible Project

I trigger multiple Tomcat startup scripts and then need to check, as quickly as possible, that every process is listening on its specific port across multiple hosts.

For this test case, I wrote three stand-in scripts instead of the Tomcat scripts. They run on a single host and listen on ports 4443, 4445, and 4447 respectively, as shown below.

cat /tmp/startapp1.sh

while test 1 # infinite loop
sleep 10
do
nc -l localhost 4443 > /tmp/app1.log
done

cat /tmp/startapp2.sh

while test 1 # infinite loop
sleep 30
do
nc -l localhost 4445 > /tmp/app2.log
done

cat /tmp/startapp3.sh

while test 1 # infinite loop
sleep 20
do
nc -l localhost 4447 > /tmp/app3.log
done
Below is my code to trigger the scripts and check whether telnet succeeds:

cat main.yml

- include_tasks: "internal.yml"
  loop:
    - /tmp/startapp1.sh 4443
    - /tmp/startapp2.sh 4445
    - /tmp/startapp3.sh 4447

cat internal.yml

- shell: "{{ item.split()[0] }}"
  async: 600
  poll: 0

- name: DEBUG CHECK TELNET
  shell: "telnet {{ item.split()[1] }}"
  delegate_to: localhost
  register: telnetcheck
  until: telnetcheck.rc == 0
  async: 600
  poll: 0
  delay: 6
  retries: 10

- name: Result of TELNET
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _jobs
  until: _jobs.finished
  delay: 6
  retries: 10
  with_items: "{{ telnetcheck.results }}"

Expectation: all three scripts should start, and their telnet checks should pass, in about 30 seconds.

The basic check needed here is the telnet task's `until: telnetcheck.rc == 0`. Because the task runs with async, however, the registered result of the shell task has no rc entry, so I get the error below:

"msg": "The conditional check 'telnetcheck.rc == 0' failed. The error was: error while evaluating conditional (telnetcheck.rc == 0): 'dict object' has no attribute 'rc'"

In the above code, where and how can I check that telnet succeeded (i.e. telnetcheck.rc == 0) and make sure the expectation is met?
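
One way to see where rc ends up, as a minimal sketch rather than a tested answer: a task started with `async` and `poll: 0` registers only job bookkeeping such as `ansible_job_id`, and the module's own return values, including rc, appear in the result of `async_status` once the job has finished. The sketch below assumes bash and the coreutils `timeout` command on the controller, swaps the telnet call for a bash /dev/tcp connect loop, and uses the hypothetical variable names `portcheck` and `job`.

```yaml
# Sketch of internal.yml; runs once per item from the loop in main.yml.
- name: Start a retrying port check in the background
  # Assumption: bash and coreutils `timeout` exist on the controller.
  # The /dev/tcp connect loop stands in for the telnet call.
  shell: >-
    timeout 60 bash -c
    'until (echo > /dev/tcp/localhost/{{ item.split()[1] }}) 2>/dev/null;
    do sleep 1; done'
  delegate_to: localhost
  async: 600
  poll: 0
  register: portcheck   # holds ansible_job_id, but no rc yet

- name: Poll the job; rc appears once it has finished
  async_status:
    jid: "{{ portcheck.ansible_job_id }}"
  delegate_to: localhost
  register: job
  until: job.finished
  delay: 6
  retries: 10

- name: Show the return code of the finished check
  debug:
    msg: "port {{ item.split()[1] }} check rc={{ job.rc }}"
```

Because the included file still waits on each item before main.yml moves on to the next one, this only shows where rc can be read; on its own it is unlikely to meet the 30-second expectation.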

Dick Visser

Jun 20, 2022, 1:42:07 PM

to ansible...@googlegroups.com

Why use shell + telnet when the wait_for module already provides
this functionality?
This does what (I think!) you want:

1. run ncat (not nc) on 3 ports on the target system:

dnmvisser@villa:~$ for i in 4443 4445 4447; do ncat -l -k -p $i & done
[1] 253485
[2] 253486
[3] 253487

Consider this playbook "waitfor.yml":

---
- hosts: all
  vars:
    ports:
      - 4443
      - 4445
      - 4447
  tasks:
    - name: Starting multiple wait_for tasks
      wait_for:
        port: "{{ item }}"
      loop: "{{ ports }}"
      register: foo
      async: 600
      poll: 0
      changed_when: false
    - name: Collecting status of wait_for tasks
      async_status:
        jid: "{{ item.ansible_job_id }}"
      register: jobs
      until: jobs.finished
      delay: 1
      retries: 600
      loop: "{{ foo.results }}"

Run that against your target system:

ansible-playbook -i villa, waitfor.yml

The playbook should start and then get into a loop where it waits for
all 3 ports to listen.

So, while this runs, open the required sockets. I use ncat instead of nc:

dnmvisser@villa:~$ for i in 4443 4445 4447; do ncat -l -k -p $i & done
[1] 253485
[2] 253486
[3] 253487

Within 1 second the playbook will finish as all ports are now
listening on the target.

Dick


Mohtashim S

Jun 20, 2022, 3:05:54 PM

to Ansible Project

@Dick, under `- name: Starting multiple wait_for tasks` I do not have `loop: "{{ ports }}"`, because my loop is in the outer file, main.yml.

Thus, I get the following error:

TASK [Collecting status of wait_for tasks] *********************************************************************************************************************************************************************
task path: /root/newinternal.yml:17
fatal: [localhost]: FAILED! => {
"msg": "'dict object' has no attribute 'results'"
}

Can you please suggest?

Dick Visser

Jun 20, 2022, 3:12:29 PM

to ansible...@googlegroups.com

On Mon, 20 Jun 2022 at 21:06, Mohtashim S <mohta...@gmail.com> wrote:
>
> @Dick under ` - name: Starting multiple wait_for tasks` -> i do not have a loop ` loop: "{{ ports }}"` as my loop is in `main.yml` outer yml
>
> Thus, i get the following error:
>
> TASK [Collecting status of wait_for tasks] *********************************************************************************************************************************************************************
> task path: /root/newinternal.yml:17
> fatal: [localhost]: FAILED! => {
> "msg": "'dict object' has no attribute 'results'"
> }
>
> Can you please suggest?

I can't do everything for you.
Look at the tasks/logic of the (fully working) playbook that I
provided and adapt it to your situation.

Mohtashim S

Jun 21, 2022, 3:56:12 AM

to Ansible Project

@Dick, your solution takes 64 seconds to start the scripts and confirm a successful telnet when each of the three scripts has a 30-second sleep.

With the approach below, it takes only about half the time, i.e. 35 seconds, to complete everything.

cat main.yml

---

- name: Starting services
  gather_facts: false
  hosts: localhost
  tasks:

    - include_tasks: "newinternal.yml"
      loop:
        - ~/startapp1.sh 4443
        - ~/startapp2.sh 4445
        - ~/startapp3.sh 4447

    - name: Pause for 32 seconds to check telnet
      pause:
        seconds: 32

    - include_tasks: "waitnewinternal.yml"
      loop:
        - ~/startapp1.sh 4443
        - ~/startapp2.sh 4445
        - ~/startapp3.sh 4447

cat newinternal.yml

- shell: "{{ item.split()[0] }}"
  async: 600
  poll: 0

cat waitnewinternal.yml

- name: Starting multiple wait_for tasks
  wait_for:
    host: localhost
    port: "{{ item.split()[1] }}"
    timeout: 43
  register: foo
  async: 600
  poll: 0
  changed_when: false

- name: Collecting status of wait_for tasks
  async_status:
    jid: "{{ foo.ansible_job_id }}"
  register: jobs
  until: jobs.finished
  delay: 1
  retries: 600

Since the solution you provided is async and should have no bottlenecks, I have two questions:

1. Can you please explain why it takes twice as long?

2. Can your solution be optimized so that it also takes less than 40 seconds, as one would expect? My solution is not ideal, as the 32-second pause is not definitive and will vary from environment to environment.
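
A possible explanation, not confirmed anywhere in this thread: include_tasks with a loop runs the included file to completion for one item before moving on to the next, so an async_status wait inside the included file delays the start of every later item. A minimal sketch that avoids this keeps each phase as a looped task in a single play, so that every script and every wait_for is launched before anything is awaited. The `apps` variable, the play and task names, and the timeout and retry values are assumptions for illustration; the script paths and ports are the ones used above.

```yaml
---
- name: Start services and wait for their ports in parallel
  hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical structure; paths and ports are taken from the thread.
    apps:
      - { script: "~/startapp1.sh", port: 4443 }
      - { script: "~/startapp2.sh", port: 4445 }
      - { script: "~/startapp3.sh", port: 4447 }
  tasks:
    - name: Start every script in the background
      shell: "{{ item.script }}"
      async: 600
      poll: 0
      loop: "{{ apps }}"

    - name: Start one background wait_for per port
      wait_for:
        host: localhost
        port: "{{ item.port }}"
        timeout: 120   # assumed upper bound on startup time
      async: 600
      poll: 0
      loop: "{{ apps }}"
      register: waits
      changed_when: false

    - name: Collect the wait_for results
      async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ waits.results }}"
      register: jobs
      until: jobs.finished
      delay: 1
      retries: 130
```

With this layout the play should finish shortly after the slowest port opens, rather than after the sum of the individual waits, and no fixed pause is needed. The timing has not been measured here.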
