RabbitMQ frozen at startup


César

May 16, 2016, 9:50:54 AM
to rabbitmq-users
Hi all,

we are seeing an intermittent behaviour in RabbitMQ (so far only seen twice) and would like to see if anyone can think of an explanation for it.

RabbitMQ is started during a Puppet catalog run, but the start command hangs forever. A manual restart of RabbitMQ seems to fix the problem, but we would like to know if there is any way to avoid it.
The only unusual message that we could see appeared in rabbitmq-sasl.log:

=SUPERVISOR REPORT==== 12-May-2016::05:05:42 ===
     Supervisor: {local,ssl_sup}
     Context:    shutdown_error
     Reason:     killed
     Offender:   [{pid,<0.283.0>},
                  {name,tls_connection},
                  {mfargs,{tls_connection_sup,start_link,[]}},
                  {restart_type,permanent},
                  {shutdown,4000},
                  {child_type,supervisor}]

Other than that, RabbitMQ was working normally. The process tree looked like this:

root      4534  0.0  0.0 150532 64008 ?        Ss   May12   0:00 /usr/bin/ruby /usr/bin/puppet agent --server=ms1dot90
root      3953  0.0  0.1 371680 266312 ?       S    May12   0:20  \_ puppet agent: applying configuration
root      6575  0.0  0.0  11564  1676 ?        Ss   May12   0:00      \_ /bin/sh /sbin/service rabbitmq-server start
root      6581  0.0  0.0 108432  1804 ?        S    May12   0:00          \_ /bin/sh /etc/init.d/rabbitmq-server start
root      6744  0.0  0.0 108432  1184 ?        S    May12   0:00              \_ /bin/sh /etc/init.d/rabbitmq-server start
root      6748  0.0  0.0 108164  1384 ?        S    May12   0:00              |   \_ /bin/bash -c ulimit -S -c unlimited >/dev/null 2>&1 ; /usr/sbin/rabbitmq-server
root      6752  0.0  0.0 108164  1460 ?        S    May12   0:00              |       \_ /bin/sh /usr/sbin/rabbitmq-server
root      6770  0.0  0.0 140884  1552 ?        S    May12   0:00              |           \_ su rabbitmq -s /bin/sh -c /usr/lib64/rabbitmq/bin/rabbitmq-server
rabbitmq  6772  7.6  0.0 6977928 180528 ?      Ssl  May12 156:08              |               \_ /usr/lib64/erlang/erts-6.4/bin/beam.smp -W w -A 64 -P 1048576 -K true -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib64/rabbitmq/lib/rabbitmq_server-3.5.4/sbin/../ebin -noshell -noinput -s rabbit boot -sname rabbit@ms1dot90
rabbitmq  7146  0.0  0.0  10792   536 ?        Ss   May12   0:00              |                   \_ inet_gethost 4
rabbitmq  7147  0.0  0.0  12896   640 ?        S    May12   0:00              |                       \_ inet_gethost 4
root      6745  0.0  0.0 108164  1460 ?        S    May12   0:00              \_ /bin/sh /usr/sbin/rabbitmqctl wait /var/run/rabbitmq/pid
root      6769  0.0  0.0 140884  1556 ?        S    May12   0:00                  \_ su rabbitmq -s /bin/sh -c /usr/lib64/rabbitmq/bin/rabbitmqctl  'wait' '/var/run/rabbitmq/pid'
rabbitmq  6771  0.1  0.0 2900308 130568 ?      Ssl  May12   3:19                      \_ /usr/lib64/erlang/erts-6.4/bin/beam.smp -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib64/rabbitmq/lib/rabbitmq_server-3.5.4/sbin/../ebin -noshell -noinput -hidden -boot start_clean -sasl errlog_type error -mnesia dir "/var/lib/rab

After restarting the service we got the following:

[root@ms1dot90 ~]# service rabbitmq-server restart
Restarting rabbitmq-server: FAILED - check /var/log/rabbitmq/shutdown_log, _err
SUCCESS
rabbitmq-server.

[root@ms1dot90 ~]# cat /var/log/rabbitmq/shutdown_err
Error: {could_not_read_pid,{error,enoent}}

Any ideas?

Thanks!
Cesar.
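
One way to keep a hung `service rabbitmq-server start` from blocking the whole catalog run is to bound the command with a timeout. The following is a minimal sketch in Python, not something taken from the RabbitMQ or Puppet tooling: the helper name `run_with_timeout` and the timeout value are hypothetical, and it assumes a POSIX system.

```python
import os
import signal
import subprocess

def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it ran longer than timeout_s seconds."""
    # Hypothetical helper. start_new_session=True puts the command in its own
    # session and process group, so the group id equals proc.pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Kill the whole group so wrapper shells spawned by the command go too.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the direct child
        return None
```

For example, `run_with_timeout(["service", "rabbitmq-server", "start"], 120)` would return `None` instead of hanging; the 120 seconds is an arbitrary choice. Descendants that start their own session are not reached by the group kill, so this only unblocks the caller and does not explain the hang itself.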

Michael Klishin

May 16, 2016, 9:53:53 AM
to rabbitm...@googlegroups.com, César
The SASL message says there was a timeout when stopping the TLS connection supervisor (tls_connection_sup): it did not exit within its 4000 ms shutdown window and was killed.
The message in shutdown_err means RabbitMQ couldn't read a PID file.

Nothing specific comes to mind.

What version do you use?

--
MK

Staff Software Engineer, Pivotal/RabbitMQ
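
The `enoent` in `{could_not_read_pid,{error,enoent}}` is the POSIX "no such file or directory" error, so the PID file did not exist when it was read. A minimal sketch of a check that tells a missing PID file apart from a stale one is shown below; the function name `pid_file_status` and its return labels are hypothetical, and it assumes a POSIX system and a PID file containing a single integer.

```python
import os

def pid_file_status(path):
    """Classify a PID file as 'missing', 'unreadable', 'stale', or 'running'."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return "missing"        # the enoent case
    except (OSError, ValueError):
        return "unreadable"     # cannot open, or contents are not an integer
    if pid <= 0:
        return "unreadable"     # not a valid single-process id
    try:
        os.kill(pid, 0)         # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return "stale"          # file exists, but no such process
    except PermissionError:
        return "running"        # process exists but belongs to another user
    return "running"
```

Calling `pid_file_status("/var/run/rabbitmq/pid")` on the affected host (the path comes from the `rabbitmqctl wait` command in the process tree) would report `missing` for the situation in shutdown_err.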


César

May 16, 2016, 9:59:15 AM
to rabbitmq-users, xcu...@gmail.com
Hi Michael,

we use version 3.5.4.
Sorry I didn't provide more details; I wasn't sure what else might be of use!

Thanks,
Cesar.

Michael Klishin

May 16, 2016, 10:01:49 AM
to rabbitm...@googlegroups.com, César
I recall a few fixes that could prevent nodes from shutting down cleanly.

Take a look at http://www.rabbitmq.com/changelog.html and consider moving
to at least 3.5.7. 

César

May 16, 2016, 10:05:29 AM
to rabbitmq-users, xcu...@gmail.com
Ah, that's a piece of very good news then :)

Thanks for your assistance!

Samuel Thorpe

Jun 19, 2017, 3:59:15 PM
to rabbitmq-users, xcu...@gmail.com
Hello,

I realize this thread is now over a year old, but I am experiencing a similar problem on version 3.6.9. The error is very sporadic, and so far we can't reproduce it consistently. Here is a link to a different post I made in this group in case it helps anyone else browsing for solutions to this problem. So far the issue is still unresolved.


Cheers,
-Sam Thorpe

Luke Bakken

Dec 8, 2017, 7:24:35 PM
to rabbitmq-users
Hi Sam -

Have you tried upgrading Erlang to 20.1 to resolve this? Upgrading RabbitMQ may also help.