Web Stomp plugin crashing on bad_escape


Jeremy Steinberg

Jun 11, 2020, 11:01:17 AM
to rabbitmq-users
Currently running rabbitmq 3.7.24.

Attempting to subscribe to STOMP via the websocket with bad input seems to crash the websocket process.

2020-06-10 18:13:46.271 [error] <0.7959.31> Supervisor {<0.7959.31>,amqp_channel_sup_sup} had child channel_sup started with amqp_channel_sup:start_link(direct, <0.7958.31>, <<"10.176.0.45:39024 -> 10.176.148.79:15674">>) at undefined exit with reason shutdown in context shutdown_error
2020-06-10 18:14:29.414 [error] <0.11248.19> Supervisor {<0.11248.19>,amqp_channel_sup_sup} had child channel_sup started with amqp_channel_sup:start_link(direct, <0.11247.19>, <<"10.176.0.45:20384 -> 10.176.148.79:15674">>) at undefined exit with reason shutdown in context shutdown_error
2020-06-10 18:18:23.859 [error] <0.13816.34> CRASH REPORT Process <0.13816.34> with 0 neighbours crashed with reason: no case clause matching {error,{bad_escape,"\\\n"}} in rabbit_web_stomp_handler:handle_data/2 line 236
2020-06-10 18:18:23.860 [error] <0.13824.34> Supervisor {<0.13824.34>,rabbit_web_stomp_connection_sup} had child cowboy_clear started with cowboy_clear:start_link(web_stomp, #Port<0.6424>, ranch_tcp, #{env => #{dispatch => [{'_',[],[{[<<"ws">>],[],rabbit_web_stomp_handler,[{type,text},{ws_opts,...}]}]}],...},...}) at <0.13816.34> exit with reason no case clause matching {error,{bad_escape,"\\\n"}} in rabbit_web_stomp_handler:handle_data/2 line 236 in context child_terminated
2020-06-10 18:18:23.860 [error] <0.13824.34> Supervisor {<0.13824.34>,rabbit_web_stomp_connection_sup} had child cowboy_clear started with cowboy_clear:start_link(web_stomp, #Port<0.6424>, ranch_tcp, #{env => #{dispatch => [{'_',[],[{[<<"ws">>],[],rabbit_web_stomp_handler,[{type,text},{ws_opts,...}]}]}],...},...}) at <0.13816.34> exit with reason reached_max_restart_intensity in context shutdown
2020-06-10 18:19:35.140 [error] <0.4125.35> CRASH REPORT Process <0.4125.35> with 0 neighbours crashed with reason: no case clause matching {error,{bad_escape,"\\\n"}} in rabbit_web_stomp_handler:handle_data/2 line 236
2020-06-10 18:19:35.140 [error] <0.4123.35> Supervisor {<0.4123.35>,rabbit_web_stomp_connection_sup} had child cowboy_clear started with cowboy_clear:start_link(web_stomp, #Port<0.6523>, ranch_tcp, #{env => #{dispatch => [{'_',[],[{[<<"ws">>],[],rabbit_web_stomp_handler,[{type,text},{ws_opts,...}]}]}],...},...}) at <0.4125.35> exit with reason no case clause matching {error,{bad_escape,"\\\n"}} in rabbit_web_stomp_handler:handle_data/2 line 236 in context child_terminated
2020-06-10 18:19:35.141 [error] <0.4123.35> Supervisor {<0.4123.35>,rabbit_web_stomp_connection_sup} had child cowboy_clear started with cowboy_clear:start_link(web_stomp, #Port<0.6523>, ranch_tcp, #{env => #{dispatch => [{'_',[],[{[<<"ws">>],[],rabbit_web_stomp_handler,[{type,text},{ws_opts,...}]}]}],...},...}) at <0.4125.35> exit with reason reached_max_restart_intensity in context shutdown


In our situation we do not have complete control over which topics are being subscribed to. It is a public-facing WebSocket server, so I do not think it is unreasonable to expect the server to handle bad input more gracefully.


Further, it appears that when the Web STOMP process is restarted, it either leaks the old STOMP subscribers or there is a race condition when creating them. When it starts back up and new connections are added, the logs are littered with errors like the following:

2020-06-10 18:23:27.655 [error] <0.13153.37> STOMP error frame sent:
Message: "Duplicated subscription identifier"
Detail: "A subscription identified by 'T_sub-5' already exists."

So there is a lot of churn on the clients until they find an available subscription ID. Occasionally, our nodes become unresponsive and we need to restart them.


Michael Klishin

Jun 11, 2020, 11:29:21 AM
to rabbitmq-users
And what is the "faulty input" that we can use to reproduce?

I'm not sure what you mean by "process" here. Consumers do not and cannot survive a restart. However, multiple clients can race to subscribe with the same ID. On top of that, node failure and subsequent consumer loss are not detected instantly in a distributed system [1].

Applications must be ready to deal with errors for duplicate subscriber IDs.
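
A minimal client-side sketch of what dealing with that error could look like, in Python. It assumes the ERROR frame carries the text from the log above ("Duplicated subscription identifier") in its `message` header; `try_subscribe` is a hypothetical callback standing in for whatever transport the client uses, and the helper names are illustrative only.

```python
import uuid
from typing import Callable, Optional

# Text taken from the error in the log above; assumed to arrive in the
# ERROR frame's "message" header.
DUPLICATE_MESSAGE = "Duplicated subscription identifier"


def new_subscription_id(prefix: str = "sub") -> str:
    """Return an ID that is very unlikely to collide with another client's."""
    return f"{prefix}-{uuid.uuid4().hex}"


def build_subscribe_frame(destination: str, sub_id: str) -> str:
    """Build a SUBSCRIBE frame; destination is assumed to be already validated."""
    return f"SUBSCRIBE\nid:{sub_id}\ndestination:{destination}\n\n\x00"


def is_duplicate_subscription_error(headers: dict) -> bool:
    return headers.get("message", "") == DUPLICATE_MESSAGE


def subscribe_with_retry(
    try_subscribe: Callable[[str], Optional[dict]],
    destination: str,
    max_attempts: int = 3,
) -> str:
    """Retry with a fresh ID when the broker reports a duplicate.

    try_subscribe (hypothetical) sends the frame and returns the ERROR
    frame headers, or None if the subscription was accepted.
    """
    for _ in range(max_attempts):
        sub_id = new_subscription_id()
        error = try_subscribe(build_subscribe_frame(destination, sub_id))
        if error is None:
            return sub_id
        if not is_duplicate_subscription_error(error):
            raise RuntimeError(f"subscribe failed: {error}")
    raise RuntimeError("could not obtain a unique subscription ID")
```

Random IDs make a collision unlikely in the first place; the retry loop only covers the case where the broker still reports a duplicate.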


Jeremy Steinberg

Jun 11, 2020, 12:45:54 PM
to rabbitmq-users
When we switched from our old STOMP infrastructure to rabbitmq-web-stomp, we originally had an issue where certain characters in topic names were considered invalid, for example unescaped URL characters like `/`.

If you try to subscribe to something like:

SUBSCRIBE
id:sub-1
destination:/topic/metrics.DDD/D

RabbitMQ STOMP did not like it. This caused crashes similar to what we are seeing now. I am unable to replicate this one currently, but the behavior we saw then matches what we are seeing now, with the nodes eventually becoming unresponsive.
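
For illustration, one way a client could avoid sending a raw `/` inside the topic name is to percent-encode the name before building the destination. This is only a sketch: it assumes the broker's STOMP adapter accepts `%2F` for a slash in the name part of a destination, which this thread does not confirm, and `topic_destination` is a hypothetical helper.

```python
from urllib.parse import quote


def topic_destination(name: str) -> str:
    """Build a /topic/ destination with the name percent-encoded.

    safe="" makes quote() encode "/" as well (it is left alone by default).
    """
    return "/topic/" + quote(name, safe="")


print(topic_destination("metrics.DDD/D"))
```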

> no case clause matching {error,{bad_escape,"\\\n"}} in rabbit_web_stomp_handler:handle_data/2 line 236

I was able to replicate it just now by attempting to subscribe with:

/topic/metrics.DDD\\\n

SUBSCRIBE
id:sub-1
destination:/topic/metrics.DDD\\\n

CRASH REPORT Process <0.28991.246> with 0 neighbours crashed with reason: no case clause matching {error,{bad_escape,"\\\n"}} in rabbit_web_stomp_handler:handle_data/2 line 236
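
To show why this input is rejected, here is a minimal Python sketch of header-value unescaping under the STOMP 1.2 rules, where only `\\`, `\n`, `\r` and `\c` are defined escapes and anything else is a fatal protocol error. My reading of the `bad_escape` term is that the value ends in an unpaired backslash directly before the end of the header line, but that is an assumption. The names below are hypothetical and this is not the plugin's code.

```python
# Escape sequences defined by STOMP 1.2 for header names and values.
VALID_ESCAPES = {"\\": "\\", "n": "\n", "r": "\r", "c": ":"}


class BadEscape(ValueError):
    """Raised for an undefined or incomplete escape sequence."""


def unescape_header_value(raw: str) -> str:
    """Decode one header value, rejecting anything STOMP 1.2 does not define."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        # A backslash must be followed by one of the defined escape characters.
        nxt = raw[i + 1:i + 2]  # empty string if the backslash is the last char
        if nxt not in VALID_ESCAPES:
            raise BadEscape(f"bad escape {raw[i:i + 2]!r} at offset {i}")
        out.append(VALID_ESCAPES[nxt])
        i += 2
    return "".join(out)
```

A gateway in front of a public-facing endpoint could run a check like this on each header value and answer with an ERROR frame instead of passing the frame on.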



Luke Bakken

Jun 11, 2020, 5:21:35 PM
to rabbitmq-users
Hi Jeremy,

I have reproduced this issue and have a fix - https://github.com/rabbitmq/rabbitmq-web-stomp/issues/121

I can't say if this will be back-ported into the 3.7.x release series for RabbitMQ. I believe we are only doing critical security fixes at this point - https://www.rabbitmq.com/versions.html

Thanks,
Luke

Jeremy Steinberg

Jun 11, 2020, 5:39:02 PM
to rabbitmq-users
Thank you so much.  Upgrading to 3.8.x is fine.

Jeremy