Potential race condition with 'x-consistent-hash' exchanges

Ildar Gafurov

Mar 23, 2021, 5:37:59 AM
to rabbitmq-users
Hello. I've found an error that looks like a race condition. I don't know how to reproduce it, but I'm reporting it anyway. I run RabbitMQ 3.8.9 (Erlang 23.1.5) as a cluster of five Docker containers. I have several 'x-consistent-hash' exchanges, and from time to time some of them begin to drop messages. When that happens, messages like the following appear in the logs:

2021-03-22 18:30:51.778 [warning] <0.25225.261> Bucket 7 not found

I've done some research and executed the command:
  rabbitmqctl eval 'rabbit_exchange_type_consistent_hash:ring_state(<<"/">>, <<"Exch1ConsistentHash">>).'

(Exch1ConsistentHash is a durable 'x-consistent-hash' exchange that has a single consumer with routing key 10)

It shows me this:

{ok,{chx_hash_ring,{resource,<<"/">>,exchange,<<"Exch1ConsistentHash">>},
                   #{10 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     11 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     12 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     13 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     14 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     15 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     16 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     17 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     18 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>},
                     19 =>
                         {resource,<<"/">>,queue,
                                   <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>}},
                   20}}

I don't know Erlang, so please tell me whether I'm wrong:
  1. The "SelectedBucket" value should always be between 0 and the size of the map (which in my case is 10).
  2. The last integer value in chx_hash_ring is the next number that the next queue will use (in my case 20):
    ...
    <<"aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed">>}},
    20}}

If this is true, then the selected bucket (0 to 9) can never match a key in the map (10 to 19), so there is a race condition somewhere.
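
To make the two points above concrete, here is a minimal Python sketch of the lookup as this post describes it. It assumes the router picks a bucket in the range 0 to (map size - 1). The `select_bucket` helper is hypothetical and uses CRC32 as a stand-in hash; it is not the plugin's actual function. The map is copied from the `ring_state` output.

```python
import zlib

QUEUE = "aioamqp.gen-9183a64b-85fd-49e7-b648-f759e9ec2eed"

# Ring state copied from the ring_state output: buckets 10..19.
# (The trailing counter in that output, 20, is not needed for the lookup.)
bucket_map = {bucket: QUEUE for bucket in range(10, 20)}


def select_bucket(routing_key: bytes, bucket_count: int) -> int:
    """Stand-in hash (assumption): any function returning 0..bucket_count-1."""
    return zlib.crc32(routing_key) % bucket_count


def route(routing_key: bytes) -> list:
    """Return the target queues, or [] when the selected bucket is missing."""
    selected = select_bucket(routing_key, len(bucket_map))
    queue = bucket_map.get(selected)
    return [queue] if queue is not None else []


# Buckets the router can select (0..9) versus buckets that exist (10..19).
selectable = set(range(len(bucket_map)))
reachable = selectable & set(bucket_map)

# Publish 1000 distinct routing keys and count how many find no queue.
dropped = sum(1 for i in range(1000) if not route(str(i).encode()))

print("reachable buckets:", sorted(reachable))
print("dropped messages:", dropped, "of 1000")
```

Under these assumptions no selectable bucket exists in the map, so every message is dropped, which would be consistent with the "Bucket 7 not found" warning.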

Thank you

M K

Apr 24, 2021, 11:09:41 AM
to rabbitmq-users

The ring is updated when bindings change. If this happens on a separate connection or channel right before or in the middle of a routing operation on another, it can affect publishing. This is generally true for all exchange types, but this one is stateful.
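
As a toy illustration of that kind of interleaving (not the plugin's code; the ring contents, queue names, and `pick_bucket` helper are all hypothetical), the Python sketch below replays a binding change that lands between a router choosing a bucket and looking it up:

```python
# Toy model only: a ring with three buckets spread over two queues.
ring = {0: "q1", 1: "q1", 2: "q2"}


def pick_bucket(key_hash: int, bucket_count: int) -> int:
    """Hypothetical bucket selection: map a hash into 0..bucket_count-1."""
    return key_hash % bucket_count


# Step 1 (publishing channel): the bucket is chosen from the current ring size.
count_seen_by_router = len(ring)                  # 3
selected = pick_bucket(5, count_seen_by_router)   # 5 % 3 == 2

# Step 2 (another channel): the binding for q2 is removed, shrinking the ring.
ring = {bucket: queue for bucket, queue in ring.items() if queue != "q2"}

# Step 3 (publishing channel): the lookup now runs against the updated ring.
target = ring.get(selected)

if target is None:
    print(f"Bucket {selected} not found")
```

The steps are run sequentially here only to make the interleaving deterministic; in a real broker they would come from concurrent channels.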

There were no major changes to this plugin since rabbitmq/rabbitmq-consistent-hash-exchange#37 except for one thing related to changes in RabbitMQ core:

https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/issues/1496, which 3.8.9 does not include.

Please upgrade to 3.8.14, or to 3.8.15, which is expected to come out next week.
