Unable to reach RabbitMQ from remote Airflow worker


Abhi Basu

Sep 16, 2018, 11:15:53 PM
to rabbitmq-users
Seeing messages like this in the log on remote worker:
[2018-09-16 17:36:08,350: ERROR/MainProcess] consumer: Cannot connect to amqp://myuser:**@<IP_ADDR>:5672/myvhost: timed out.
Trying again in 32.00 seconds...

However, the following works on the airflow server (where rabbitmq is running):
broker_url = amqp://myuser:********@localhost:5672/myvhost

I made sure both nodes allow each other to connect (no firewall issues), and telnet to the RabbitMQ port also works fine.

RabbitMQ was installed using these instructions and seems to be functioning fine: https://www.vultr.com/docs/how-to-install-rabbitmq-on-centos-7

Thanks.

Michael Klishin

Sep 17, 2018, 3:57:55 AM
to rabbitm...@googlegroups.com
See server logs [1] and the connectivity troubleshooting guide [2].

That's as much as can be suggested with the amount of information provided.
The "timed out" part of the message clearly suggests it must be a TCP connectivity problem.
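A "timed out" error can be reproduced outside of Celery with a plain TCP connection attempt from the worker host. The following is a minimal sketch, not an official RabbitMQ tool; the function name `can_connect` and the 5-second timeout are assumptions chosen for illustration, and the host to test is whatever appears in the worker's broker_url.

```python
import socket


def can_connect(host, port=5672, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False
```

Running `can_connect("<broker host>", 5672)` on the worker, with the exact host name from its configuration, checks the same network path the Celery consumer uses. A result of False points at routing, security groups, or name resolution rather than at RabbitMQ itself.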


--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-users+unsubscribe@googlegroups.com.
To post to this group, send email to rabbitmq-users@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Abhi Basu

Sep 18, 2018, 11:09:23 AM
to rabbitm...@googlegroups.com
Can someone please help me with this? Since I am using AWS EC2 VMs, there are both external and internal IPs to account for.

Worker /etc/hosts:
127.0.0.1       localhost localhost.localdomain
INT_IP_WORKER   ip-INT-HOSTNAME.ec2.internal

Worker airflow.cfg:
broker_url = amqp://myuser:mypassword@ec2-EXT-HOSTNAME-SERVER.compute-1.amazonaws.com :5672/myvhost

Worker log has this in the end:
  warnings.warn(W_PIDBOX_IN_USE.format(node=self))
[2018-09-18 14:54:43,507: WARNING/MainProcess] celery@ ip-INT-HOSTNAME.ec2.internal ready.

There are no errors in the scheduler logs.
MySQL and RabbitMQ are running on the server.
The Airflow scheduler and webserver are running on the server.
The Airflow worker is running on the worker.


Server /etc/hosts:
127.0.0.1       localhost localhost.localdomain
INT_IP_SERVER    ip-INT-HOSTNAME-SERVER.ec2.internal







--
Abhi Basu

Michael Klishin

Sep 18, 2018, 11:20:33 AM
to rabbitm...@googlegroups.com
Please review and post the server logs, and follow the guide recommended above [1]. Guessing
is the least efficient way of solving problems. Let's collect some data to make more informed decisions.



Abhi Basu

Sep 18, 2018, 11:35:57 AM
to rabbitm...@googlegroups.com
1. rabbitmqctl status:
Status of node 'rabbit@ip-172-31-25-86' ...
[{pid,1849},
.....
{listeners,[{clustering,25672,"::"},{amqp,5672,"::"}]},

2. Server logs show something like this:
=INFO REPORT==== 18-Sep-2018::14:45:46 ===
Server startup complete; 6 plugins started.
 * rabbitmq_management
 * rabbitmq_web_dispatch
 * webmachine
 * mochiweb
 * rabbitmq_management_agent
 * amqp_client

=INFO REPORT==== 18-Sep-2018::14:47:48 ===
Plugins changed; enabled [], disabled []

=INFO REPORT==== 18-Sep-2018::14:48:08 ===
accepting AMQP connection <0.1516.0> (AF-SERVER-IP:53536 -> AF-SERVER-IP:5672)

=INFO REPORT==== 18-Sep-2018::14:48:09 ===
closing AMQP connection <0.1516.0> (AF-SERVER-IP:53536 -> AF-SERVER-IP:5672)

=INFO REPORT==== 18-Sep-2018::14:52:21 ===
accepting AMQP connection <0.3577.0> (18.235.34.169:40936 -> AF-SERVER-IP:5672)

=INFO REPORT==== 18-Sep-2018::14:52:21 ===
accepting AMQP connection <0.3582.0> (18.235.34.169:40938 -> AF-SERVER-IP:5672)

=INFO REPORT==== 18-Sep-2018::14:54:42 ===
accepting AMQP connection <0.4757.0> (18.235.34.169:40940 -> AF-SERVER-IP:5672)

=INFO REPORT==== 18-Sep-2018::14:54:42 ===
accepting AMQP connection <0.4762.0> (18.235.34.169:40942 -> AF-SERVER-IP:5672)

=INFO REPORT==== 18-Sep-2018::14:55:29 ===
accepting AMQP connection <0.5187.0> (AF-SERVER-IP:53538 -> AF-SERVER-IP:5672)

=INFO REPORT==== 18-Sep-2018::14:55:30 ===
closing AMQP connection <0.5187.0> (AF-SERVER-IP:53538 -> AF-SERVER-IP:5672)

3. Telnet from worker to server on port 5672 works


Abhi Basu

Michael Klishin

Sep 18, 2018, 4:08:10 PM
to rabbitm...@googlegroups.com
According to the log there were successful connections to this node [1].
Given that the node listens on the standard port on all interfaces and telnet is also able to connect,
I'd check the client end for misconfiguration.
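One client-side check that needs no broker access is a sanity pass over the broker_url string itself. The sketch below is only an illustration: the helper name `broker_url_problems` and the list of accepted schemes are assumptions, not part of Airflow or Celery. The broker_url quoted earlier in the thread shows a space before `:5672`; that may just be a redaction artifact, but stray whitespace is the kind of problem this check would flag.

```python
from typing import List
from urllib.parse import urlsplit

# Assumed set of schemes for an AMQP broker URL; adjust to the transport in use.
EXPECTED_SCHEMES = ("amqp", "amqps", "pyamqp")


def broker_url_problems(url: str) -> List[str]:
    """Return a list of human-readable problems found in a broker URL string."""
    if any(ch.isspace() for ch in url):
        # Stop early: parsing a URL that contains whitespace is unreliable.
        return ["URL contains whitespace"]

    problems = []
    parts = urlsplit(url)
    if parts.scheme not in EXPECTED_SCHEMES:
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        problems.append("missing host")
    try:
        parts.port  # accessing .port raises ValueError for a non-numeric port
    except ValueError:
        problems.append("port is not a valid number")
    return problems
```

An empty list means only that the string is well formed; it does not prove that the host, credentials, or vhost are correct.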

Please ask on an Airflow mailing list, forum, or Slack channel. I am running out of ideas about what else
can be checked on the RabbitMQ end.

Abhi Basu

Sep 20, 2018, 11:59:34 AM
to rabbitm...@googlegroups.com
Is there any documentation available on how a remote worker needs to be configured?

My remote worker cannot seem to get to the CeleryExecutor on the master, even though the config file points to the public IP of the master and the /etc/hosts file has a mapping from the master's public IP to the master's private hostname.
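When /etc/hosts entries and public or private EC2 names are both in play, it can help to see which addresses a host name actually resolves to on the worker. The following is a small sketch using the standard library; the function name `resolved_addresses` is hypothetical, and the result depends on the resolver configuration of the machine it runs on (which typically includes /etc/hosts).

```python
import socket


def resolved_addresses(host, port=5672):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to locally."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling this on the worker with the host name from broker_url shows whether the name maps to the address that was intended, for example the master's public IP versus its private one.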

Thanks,

Abhi
Abhi Basu