Dear RabbitMQ community,
We have an update on Erlang/OTP 20 and RabbitMQ compatibility.
First, let's clarify when OTP 20 support will be available: RabbitMQ versions before 3.6.11 will not work correctly with OTP-20.
What's the risk of upgrading from 19.x to 20 on an unsupported version? When upgrading from OTP-19.x (or earlier) to OTP-20, all persistent data will be permanently lost!
What about newly provisioned nodes or clusters? Although it's possible to run RabbitMQ on OTP-20 from scratch, there will be crashes in queue mirroring and the management API.
What are the breaking changes?
The key breaking change for us is the new external (binary) term format, which was introduced in R16 but is now used by default. This term format encodes atoms as Unicode with a different type tag. Unfortunately there is no way to generate a value in the old format from OTP itself, so migrations are not easy to perform.
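To make the tag difference concrete, here is a minimal sketch in Python (used only to model the bytes outside the Erlang VM). The tag numbers come from the external term format documentation. Which tag a particular OTP build emits for a given atom, and the use of MD5 as the hash, are assumptions made for illustration.

```python
import hashlib
import struct

VERSION = 131              # leading version byte of the external term format
ATOM_EXT = 100             # Latin-1 atom, 16-bit length
ATOM_UTF8_EXT = 118        # UTF-8 atom, 16-bit length
SMALL_ATOM_UTF8_EXT = 119  # UTF-8 atom, 8-bit length


def encode_atom_latin1(name: str) -> bytes:
    """Old layout: ATOM_EXT, 16-bit big-endian length, Latin-1 bytes."""
    raw = name.encode("latin-1")
    return bytes([ATOM_EXT]) + struct.pack(">H", len(raw)) + raw


def encode_atom_utf8(name: str) -> bytes:
    """UTF-8 layout: the small tag for up to 255 bytes, the long tag otherwise."""
    raw = name.encode("utf-8")
    if len(raw) <= 255:
        return bytes([SMALL_ATOM_UTF8_EXT, len(raw)]) + raw
    return bytes([ATOM_UTF8_EXT]) + struct.pack(">H", len(raw)) + raw


# The same atom, encoded both ways.
old = bytes([VERSION]) + encode_atom_latin1("queue")
new = bytes([VERSION]) + encode_atom_utf8("queue")

# Any hash taken over the encoded bytes changes with the encoding
# (MD5 is an assumption here, chosen only to show the effect).
old_hash = hashlib.md5(old).hexdigest()
new_hash = hashlib.md5(new).hexdigest()
```

The atom itself is unchanged, but the bytes differ in both the tag and the width of the length field, so anything derived from those bytes (a hash, a fixed-layout pattern match) no longer lines up.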
The format change will affect PID decoding and encoding (as in rabbit_misc:decompose_pid/1), which will break mirroring, node renaming and parts of the management API. The issue was addressed in https://github.com/rabbitmq/rabbitmq-common/commit/4aa1bf4c1450fa8ba60b381c8b295e03fbd17c44
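The following Python sketch shows why fixed-layout PID parsing is sensitive to the atom tag. It is not the rabbit_misc implementation: the function names and the order of the returned tuple are hypothetical. It assumes the PID_EXT layout from the external term format documentation (node atom, 32-bit ID, 32-bit serial, 8-bit creation).

```python
import struct

VERSION = 131
PID_EXT = 103
ATOM_EXT = 100             # Latin-1, 16-bit length
SMALL_ATOM_EXT = 115       # Latin-1, 8-bit length
ATOM_UTF8_EXT = 118        # UTF-8, 16-bit length
SMALL_ATOM_UTF8_EXT = 119  # UTF-8, 8-bit length


def read_atom(data: bytes, offset: int):
    """Decode an atom at `offset`; return (name, offset just past the atom)."""
    tag = data[offset]
    if tag in (ATOM_EXT, ATOM_UTF8_EXT):
        (length,) = struct.unpack_from(">H", data, offset + 1)
        start = offset + 3
    elif tag in (SMALL_ATOM_EXT, SMALL_ATOM_UTF8_EXT):
        length = data[offset + 1]
        start = offset + 2
    else:
        raise ValueError(f"unexpected atom tag {tag}")
    encoding = "latin-1" if tag in (ATOM_EXT, SMALL_ATOM_EXT) else "utf-8"
    return data[start:start + length].decode(encoding), start + length


def decompose_pid_bytes(data: bytes):
    """Split an encoded PID into (node, creation, id, serial).

    Hypothetical helper: a parser that only accepts ATOM_EXT for the
    node name would reject the UTF-8 variants handled here.
    """
    if len(data) < 2 or data[0] != VERSION or data[1] != PID_EXT:
        raise ValueError("not a PID_EXT term")
    node, offset = read_atom(data, 2)
    pid_id, serial, creation = struct.unpack_from(">IIB", data, offset)
    return node, creation, pid_id, serial
```

A parser that hard-codes tag 100 and a 16-bit length for the node name fails as soon as the node atom arrives with a different tag, which is the kind of breakage described above.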
We also use term_to_binary to generate hash IDs for various entities in the system. While some entities are transient, like management binding properties and federation link IDs, some of them are persistent and should not change during an upgrade. The umbrella issue for that is https://github.com/rabbitmq/rabbitmq-server/issues/1243
It has been decided not to use term_to_binary to generate hashes in future versions (3.7 and above). Management and federation hash strategies were updated in https://github.com/rabbitmq/rabbitmq-management-agent/pull/47, https://github.com/rabbitmq/rabbitmq-management/pull/415 and https://github.com/rabbitmq/rabbitmq-federation/pull/58
Hashing that involves term_to_binary/1 is used when calculating queue index directory names and queue names in the STOMP plugin. Both should be addressed in 3.6.11. Since we don't have upgrade steps for patch releases and cannot rename directories or queues, we decided to generate the old term_to_binary format for the specific types we use in our own code. The generation functions are in the
In the case of STOMP queue name generation, a tuple of strings or binaries was used, so the function is called string_and_binary_tuple_2_to_binary and accepts a 2-element tuple of strings or binaries. The issue is https://github.com/rabbitmq/rabbitmq-stomp/issues/115. This is addressed in https://github.com/rabbitmq/rabbitmq-server/pull/1262 and https://github.com/rabbitmq/rabbitmq-stomp/pull/116
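As a rough illustration of what hand-generating the encoding for a 2-element tuple involves, here is a Python sketch. The function names are hypothetical and this is not the RabbitMQ code. It models an Erlang string as a Python str and a binary as bytes, and assumes strings are short (at most 65535 characters) with every character in the 0..255 range, so they fit STRING_EXT.

```python
import struct

VERSION = 131
SMALL_TUPLE_EXT = 104  # tuple with an 8-bit arity
NIL_EXT = 106          # empty list, which is how "" is encoded
STRING_EXT = 107       # list of bytes, 16-bit length
BINARY_EXT = 109       # binary, 32-bit length


def encode_element(value) -> bytes:
    """Encode one tuple element: bytes as a binary, str as an Erlang string."""
    if isinstance(value, bytes):
        return bytes([BINARY_EXT]) + struct.pack(">I", len(value)) + value
    if isinstance(value, str):
        if value == "":
            return bytes([NIL_EXT])
        raw = value.encode("latin-1")  # raises for code points above 255
        if len(raw) > 0xFFFF:
            raise ValueError("string too long for STRING_EXT")
        return bytes([STRING_EXT]) + struct.pack(">H", len(raw)) + raw
    raise TypeError("expected str or bytes")


def string_and_binary_pair_to_bytes(pair) -> bytes:
    """Encode a 2-element tuple of strings and/or binaries (hypothetical name)."""
    first, second = pair
    return (bytes([VERSION, SMALL_TUPLE_EXT, 2])
            + encode_element(first)
            + encode_element(second))
```

Because every byte is written out explicitly, the result does not depend on what the runtime's own encoder does, which is the point of a compatibility function like this.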
The queue index directory change is the most serious one, since RabbitMQ will delete all "unknown" queue index directories on boot. When queue index directory names change, all existing directories become "unknown", so all persistent data will be lost. The GitHub issue for this specific problem is https://github.com/rabbitmq/rabbitmq-server/issues/1243.
In 3.6 we use the compat function queue_name_to_binary to generate the same directory name format as before OTP 20.
This is addressed in https://github.com/rabbitmq/rabbitmq-server/pull/1246.
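Here is a sketch of the idea behind such a compat function, modelled in Python. Several things are assumptions for illustration rather than confirmed details: that a queue name is a 4-element tuple of the form {resource, VHost, queue, Name} with binary vhost and name, that the directory name is the MD5 of the encoded term, and that the digest is rendered in base 36. The Python helper names are hypothetical.

```python
import hashlib
import struct

VERSION = 131
SMALL_TUPLE_EXT = 104
ATOM_EXT = 100    # old Latin-1 atom tag, written out explicitly
BINARY_EXT = 109

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def atom(name: bytes) -> bytes:
    return bytes([ATOM_EXT]) + struct.pack(">H", len(name)) + name


def binary(data: bytes) -> bytes:
    return bytes([BINARY_EXT]) + struct.pack(">I", len(data)) + data


def queue_name_to_binary(vhost: bytes, name: bytes) -> bytes:
    """Hand-built old-format encoding of the assumed queue name tuple."""
    return (bytes([VERSION, SMALL_TUPLE_EXT, 4])
            + atom(b"resource") + binary(vhost)
            + atom(b"queue") + binary(name))


def to_base36(number: int) -> str:
    if number == 0:
        return "0"
    out = []
    while number:
        number, rem = divmod(number, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def queue_dir_name(vhost: bytes, name: bytes) -> str:
    """Directory name derived from the encoded queue name (assumed scheme)."""
    digest = hashlib.md5(queue_name_to_binary(vhost, name)).digest()
    return to_base36(int.from_bytes(digest, "big"))
```

Since the atom tags are fixed in the code rather than left to the runtime, the same queue name maps to the same directory name on any Erlang version, so existing directories are still recognised on boot.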
In 3.7 we decided to generate queue index names with a different algorithm, so we won't be affected by further changes to term_to_binary in future versions. The directories should be renamed during a migration step, which requires generating the old name first using the compat function.
This is addressed in https://github.com/rabbitmq/rabbitmq-server/pull/1250.
To summarise, we are trying to make 3.6.11 and 3.7.0 (by the time it reaches RC stage) support OTP 20, and we are introducing a CI pipeline that will test Erlang upgrades to 20. Any updates worth mentioning will be posted in this thread.
Cheers.