Important: RabbitMQ compatibility with Erlang/OTP 20


Michael Klishin

Jun 1, 2017, 12:03:35 PM
to rabbitm...@googlegroups.com
Dear RabbitMQ community,

As some of you know, Erlang/OTP 20 is going to be released quite soon.
Some Erlang community members are very excited about it as it introduces
certain internal improvements that are well overdue.

Unfortunately, those changes include breaking changes that affect the way
Erlang data structures are serialised to binary data. This is critically important
to RabbitMQ. Our team is looking into what kind of changes may be necessary
to support OTP 20 and so far it looks like upgrades of installations
that use Erlang 19 to 20 won't be possible in the short term or ever.

Therefore we have to warn you: Erlang/OTP 20 is not supported by RabbitMQ
at this point. We will update this thread as the story unfolds.

Note that certain Linux distributions have switched their default Erlang version
to OTP 20 RCs in the last few weeks, making package managers
install a pre-release version over a GA one. This is a very unfortunate
and dangerous change. Make sure your Erlang package is pinned
to 19.3 (or an earlier supported version) so that you don't get
unexpected upgrades to OTP 20.
--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Carl Hörberg

Jun 2, 2017, 5:43:05 AM
to rabbitmq-users
Ok, but are new installations with Erlang 20 safe/possible?

Carl Hörberg

Jun 2, 2017, 5:51:19 AM
to rabbitm...@googlegroups.com
Which internal improvements in Erlang 20 are you looking forward to? 

--
You received this message because you are subscribed to a topic in the Google Groups "rabbitmq-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/rabbitmq-users/_imbAavBYjY/unsubscribe.
To unsubscribe from this group and all its topics, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michael Klishin

Jun 2, 2017, 7:10:43 AM
to rabbitm...@googlegroups.com
New installations will work, I believe I saw someone already running 3.6.9 on OTP 20.

I am personally looking forward to efficiency improvements and dirty schedulers.
Unicode is no longer an afterthought in 20, which is great (and is what led to the breaking term_to_binary changes).


Michael Klishin

Jun 2, 2017, 8:50:46 AM
to rabbitm...@googlegroups.com
So far it looks like a combination of two changes in OTP 20 means RabbitMQ 3.6.x will never
be able to support it.

Which in turn means a ton of installations will stick to 19 forever, long after it's out of any kind of support.

This is a real shame.



Peter Silva

Jun 2, 2017, 1:49:22 PM
to rabbitmq-users
confused...

say you have a two node cluster (A and B)

1) break cluster (routing all traffic to A)
2) on B: 'download broker definitions' (is it json or erlang?)  --> b.conf
3) shutdown rabbitmq, purging wherever some state is stored.
4) upgrade erlang,
5) start up B (empty) and restore the broker definitions.
6) switch traffic from A to B (carefully.)
7) rinse, lather, repeat steps 2 through 5 with A.
8) reform cluster.

should that sort of method, in principle, work?

Michael Klishin

Jun 2, 2017, 1:55:00 PM
to rabbitm...@googlegroups.com
If you can spin up a new cluster on OTP 20 and switch all apps to it
gradually (first export/import definition, then publishers, then consumers), it is
going to work. The practice is known as blue/green deployments.

We were referring to "in place" upgrades, which are still more common.

Of course, if you are willing to wipe out or consume all data before upgrading
then most upgrade issues don't affect you ;)
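
For the export/import step, here is a minimal Python sketch that copies definitions from the old cluster to the new one over the management HTTP API. It assumes the management plugin is enabled on both sides and reachable on its default port 15672; the host names, credentials and function names are placeholders, not part of any RabbitMQ tooling. Definitions are exchanged as JSON and do not include message data.

```python
import base64
import json
import urllib.request


def _auth_header(user: str, password: str) -> dict:
    """Build an HTTP Basic auth header for the management API."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def export_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Request that downloads broker definitions (users, vhosts, queues, ...)."""
    return urllib.request.Request(
        f"{base_url}/api/definitions",
        headers=_auth_header(user, password),
        method="GET",
    )


def import_request(base_url: str, user: str, password: str,
                   definitions: dict) -> urllib.request.Request:
    """Request that uploads previously exported definitions to another node."""
    headers = _auth_header(user, password)
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{base_url}/api/definitions",
        data=json.dumps(definitions).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def copy_definitions(old_url: str, new_url: str, user: str, password: str) -> int:
    """Export from the old ("blue") cluster and import into the new ("green") one."""
    with urllib.request.urlopen(export_request(old_url, user, password)) as resp:
        definitions = json.load(resp)
    with urllib.request.urlopen(
        import_request(new_url, user, password, definitions)
    ) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical hosts; replace with real ones before running.
    copy_definitions("http://blue.example:15672", "http://green.example:15672",
                     "guest", "guest")
```

Once the definitions are in place on the new cluster, publishers and then consumers can be moved over as described above.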


Peter Silva

Jun 2, 2017, 2:38:05 PM
to rabbitmq-users
If you're worried about your data, then I would tend to do blue/green upgrades all the time. I mean, what happens if an in-place upgrade fails? There is no clean way to back out. I would think the in-place upgrade people are more concerned about ease of maintenance than availability.

For them, you could do something like have a *stop&dump* command (like the broker export/import, but for the actual data) that would just dump all the data in a version independent format (json?) and reload it after the upgrade.  So you would get a service interruption, but no information loss.

fwiw, I implemented that sort of thing at the application level on a per-queue basis, to avoid filling memory when consumers have difficulty keeping up, but nothing system-wide.

qq8...@126.com

Jun 4, 2017, 9:28:32 AM
to rabbitmq-users
Is OTP 19 compatible with RabbitMQ 3.5.7?


Michael Klishin

Jun 4, 2017, 4:42:19 PM
to rabbitm...@googlegroups.com
Please keep this thread on topic: OTP 20.

First release to support 19.0 was 3.6.4:

dfed...@pivotal.io

Jun 15, 2017, 12:02:40 PM
to rabbitmq-users

Dear RabbitMQ community,

We have an update on Erlang/OTP 20 and RabbitMQ compatibility.

First, let's clarify when OTP 20 support will be available: RabbitMQ versions before 3.6.11 will not work correctly with OTP-20.

What's the risk of upgrading from 19.x to 20 on an unsupported version? When upgrading from OTP-19.x (or earlier) to OTP-20, all persistent data will be permanently lost!

What about newly provisioned nodes or clusters? Although it's possible to run RabbitMQ with OTP-20 from scratch, there will be crashes in queue mirroring and management API.

What are the Breaking Changes?

The key breaking change for us is the new external (binary) term format, which was introduced in R16 but is now used by default. This term format encodes atoms as Unicode with a different type tag. Unfortunately, there is no way to generate a value in the old format in OTP itself, so it's not easy to perform migrations.
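
To make the difference concrete, here is a minimal Python sketch of how a single atom is laid out in the two encodings. The tag values (131 for the version byte, 100 for ATOM_EXT, 118 and 119 for ATOM_UTF8_EXT and SMALL_ATOM_UTF8_EXT) are taken from the Erlang external term format documentation; the function names are made up for illustration and this is not how OTP or RabbitMQ implement it.

```python
import struct

VERSION = 131               # leading version byte of the external term format
ATOM_EXT = 100              # old: Latin-1 atom, 2-byte length
ATOM_UTF8_EXT = 118         # new: UTF-8 atom, 2-byte length
SMALL_ATOM_UTF8_EXT = 119   # new: UTF-8 atom, 1-byte length


def encode_atom_old(name: str) -> bytes:
    """Encode an atom in the old Latin-1 layout."""
    raw = name.encode("latin-1")
    return bytes([VERSION, ATOM_EXT]) + struct.pack(">H", len(raw)) + raw


def encode_atom_new(name: str) -> bytes:
    """Encode an atom in the new UTF-8 layout."""
    raw = name.encode("utf-8")
    if len(raw) <= 255:
        return bytes([VERSION, SMALL_ATOM_UTF8_EXT, len(raw)]) + raw
    return bytes([VERSION, ATOM_UTF8_EXT]) + struct.pack(">H", len(raw)) + raw


if __name__ == "__main__":
    print(list(encode_atom_old("foo")))  # [131, 100, 0, 3, 102, 111, 111]
    print(list(encode_atom_new("foo")))  # [131, 119, 3, 102, 111, 111]
```

The same atom yields different bytes, so anything that compares, pattern-matches or hashes the serialised form sees a different value.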

The format change affects PID decoding and encoding (as in rabbit_misc:decompose_pid/1), which breaks mirroring, node renaming and parts of the management API. The issue was addressed in https://github.com/rabbitmq/rabbitmq-common/commit/4aa1bf4c1450fa8ba60b381c8b295e03fbd17c44
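
The following Python sketch shows why code that takes a serialised PID apart can break. It models a parser that hard-codes the old atom tag for the node name, which is an assumption about the general shape of such code and not a copy of rabbit_misc:decompose_pid/1. It also assumes the PID tag itself (103, PID_EXT) is unchanged and that only the embedded node-name atom switches to the new one-byte-length UTF-8 tag (119). All function names are hypothetical.

```python
import struct


def old_pid_bytes(node: str, pid_id: int, serial: int, creation: int) -> bytes:
    """Serialised PID with the node name as an old-style Latin-1 atom (tag 100)."""
    raw = node.encode("latin-1")
    return (bytes([131, 103, 100]) + struct.pack(">H", len(raw)) + raw
            + struct.pack(">IIB", pid_id, serial, creation))


def new_pid_bytes(node: str, pid_id: int, serial: int, creation: int) -> bytes:
    """Same PID with the node name as a small UTF-8 atom (tag 119, 1-byte length)."""
    raw = node.encode("utf-8")
    return (bytes([131, 103, 119, len(raw)]) + raw
            + struct.pack(">IIB", pid_id, serial, creation))


def decompose_pid_strict(data: bytes):
    """Split a serialised PID into its parts, accepting only the old atom tag."""
    if data[:3] != bytes([131, 103, 100]):
        raise ValueError("unexpected PID layout")
    (node_len,) = struct.unpack(">H", data[3:5])
    node = data[5:5 + node_len].decode("latin-1")
    pid_id, serial, creation = struct.unpack(">IIB", data[5 + node_len:])
    return node, pid_id, serial, creation


if __name__ == "__main__":
    print(decompose_pid_strict(old_pid_bytes("rabbit@host", 100, 0, 1)))
    try:
        decompose_pid_strict(new_pid_bytes("rabbit@host", 100, 0, 1))
    except ValueError as err:
        print("new layout rejected:", err)
```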

We also use term_to_binary to generate hash IDs for various entities in the system. While some entities are transient, like management bindings properties and federation links IDs, some of them are persistent and should not change during upgrade. An umbrella issue for that is https://github.com/rabbitmq/rabbitmq-server/issues/1243

It's been decided not to use term_to_binary to generate hashes in future versions (3.7 and above). Management and federation hash strategies were updated in https://github.com/rabbitmq/rabbitmq-management-agent/pull/47, https://github.com/rabbitmq/rabbitmq-management/pull/415 and https://github.com/rabbitmq/rabbitmq-federation/pull/58

Hashing that involves term_to_binary/1 is used when calculating queue index directory names and queue names in the STOMP plugin. They should be addressed in 3.6.11. Since we don't have upgrade steps for patch releases and cannot rename directories and queue names, we decided to generate the old term_to_binary format for specific types we use in our own code. The generation functions are in the term_to_binary_compat module.

In the case of STOMP queue name generation, a tuple of strings or binaries was used, so the function is called string_and_binary_tuple_2_to_binary and accepts a 2-element tuple of strings or binaries. The issue is https://github.com/rabbitmq/rabbitmq-stomp/issues/115. This is addressed in https://github.com/rabbitmq/rabbitmq-server/pull/1262 and https://github.com/rabbitmq/rabbitmq-stomp/pull/116

The queue index directory change is the most serious one, since RabbitMQ will delete all "unknown" queue index directories on boot. When queue index names change, all the directories become "unknown", so all the persistent data will be lost. The GitHub issue for this specific problem is https://github.com/rabbitmq/rabbitmq-server/issues/1243.
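
A small Python sketch of the failure mode, under simplifying assumptions: the queue name is stood in for by the atom foo, the directory name is a plain MD5 hex digest of the serialised bytes, and the helper names are hypothetical. The real naming scheme and term are different; only the mechanism (hash of serialised bytes, then a comparison against what is on disk) is the point.

```python
import hashlib

# The atom `foo` in the old and new external term encodings.
OLD_BYTES = bytes([131, 100, 0, 3]) + b"foo"
NEW_BYTES = bytes([131, 119, 3]) + b"foo"


def dir_name(serialised: bytes) -> str:
    """Derive a directory name from a hash of the serialised term (illustrative)."""
    return hashlib.md5(serialised).hexdigest()


def unknown_dirs(on_disk: set, expected: set) -> set:
    """Directories present on disk that the node no longer recognises."""
    return on_disk - expected


if __name__ == "__main__":
    on_disk = {dir_name(OLD_BYTES)}    # written before the Erlang upgrade
    expected = {dir_name(NEW_BYTES)}   # recomputed after the upgrade
    # Every pre-upgrade directory is now "unknown" and would be deleted on boot.
    print(unknown_dirs(on_disk, expected) == on_disk)
```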

In 3.6 we use the compat function queue_name_to_binary to generate the directory name in the same format as before OTP 20.

This is addressed in https://github.com/rabbitmq/rabbitmq-server/pull/1246.

In 3.7 we decided to generate the queue index name with a different algorithm, so we won't be affected by further changes to term_to_binary in future versions. The directories should be renamed during a migration step. This will require generating the old name first using the compat function.

This is addressed in https://github.com/rabbitmq/rabbitmq-server/pull/1250.

To summarise, we are trying to make 3.6.11 and 3.7.0 (by the time it reaches RC stage) support OTP 20 and introducing a CI pipeline that will test Erlang upgrades to 20. Any updates that are worth mentioning will be posted in this thread.

Cheers.   

Leo Liu

Jun 16, 2017, 5:06:11 AM
to rabbitm...@googlegroups.com
FYI.

term_to_binary was made backwards-compatible on 13 June. See commit
https://github.com/erlang/otp/commit/f1c69ee583dcd1f525562cf6adc382b8464b1578:

Revert "erts: Do not generate atoms on old latin1 external format"

The incompatible version is now behind a flag. See
https://github.com/erlang/otp/commit/bdc0f3504fb13f777a7cc826caa5fd10dc6fc291:
Introduce minor vsn 2 in term_to_binary/2

Hope this is useful.

Cheers,
Leo

dfed...@pivotal.io

Jun 16, 2017, 5:13:30 AM
to rabbitmq-users, sdl...@gmail.com
Hi Leo.

Yes. Thanks for that information.

We are currently testing 3.6.10 with the OTP master branch to see if there are any incompatibilities left.

We plan to pass an explicit version to `term_to_binary` in 3.6.11 and still plan to remove unnecessary usage of it in 3.7.

Stay tuned for more info.

dfed...@pivotal.io

Jun 16, 2017, 12:03:58 PM
to rabbitmq-users

The OTP team has recently reverted the change to term_to_binary that caused the incompatibility. See this commit for more info: https://github.com/erlang/otp/commit/48e67f5dd1d20b9a1f78c5a97cc7ed4afb489ba5

The term_to_binary/2 function now has another minor_version option 2, which will generate the new format, while version 1, generating the old one, is the default version for term_to_binary/1.


So RabbitMQ 3.6.10 should now support OTP-20 (after release). We are still testing versions 3.6.4 to 3.6.9 for compatibility.

OTP-20.0-rc3 is not out yet, so to test it yourself, you have to build it from the master branch on GitHub.


In 3.6.11 we will change term_to_binary/1 to term_to_binary/2 with minor_version = 1 so it will not be affected by changing the default value for minor_version in future OTP releases.

You can see the change in this PR https://github.com/rabbitmq/rabbitmq-server/pull/1268

There is no change in plans for 3.7: we will try to avoid term_to_binary wherever we can. See previous posts in this thread.



Michael Klishin

Jun 19, 2017, 5:16:07 PM
to rabbitm...@googlegroups.com
There's another update on the state of OTP 20.

Erlang Solutions, whose package repository is quite popular, decided to
provide OTP 20-rc2 by default from their apt repository.

This is a very unfortunate decision for RabbitMQ. Unfortunately, ESL have no intention
to change this as OTP 20 will go GA (without any more RCs, it seems) very soon
and therefore OTP 20 GA will be provided, which is somewhat justified.

Since the earliest version of RabbitMQ that can support OTP 20 will be 3.6.11,
and we did not plan on releasing this week, let alone have had much time to test
our fixes extensively, we *highly* recommend that RabbitMQ users who provision
Erlang via Erlang Solutions use apt pinning [1][2] to avoid surprise upgrades to OTP 20.

We will be preparing 3.6.11 RC for a release on an expedited schedule now, with a new milestone
coming tomorrow.

We will also update our docs to mention apt pinning and emphasize that OTP 20 is to be avoided
by RabbitMQ users for the next few weeks.
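
As a rough illustration of what apt pinning looks like, the Python sketch below renders an apt preferences stanza that could be saved under /etc/apt/preferences.d/. The Package, Pin and Pin-Priority fields are standard apt preferences syntax; the package pattern and version string passed in are placeholders, so check the actual package names and versions on your system (for example with apt-cache policy) before using anything like this. The function name is made up for this example.

```python
def render_apt_pin(package_glob: str, version_glob: str, priority: int = 1000) -> str:
    """Render one apt preferences stanza that pins matching packages to a version."""
    return (
        f"Package: {package_glob}\n"
        f"Pin: version {version_glob}\n"
        f"Pin-Priority: {priority}\n"
    )


if __name__ == "__main__":
    # Placeholder pattern and version: pin Erlang packages to a 19.3 build.
    print(render_apt_pin("erlang*", "1:19.3*"), end="")
```

A priority of 1000 or higher tells apt to prefer the pinned version even over a newer candidate, which is what prevents a surprise upgrade.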


Michael Klishin

Jun 21, 2017, 11:15:16 AM
to rabbitmq-users
RabbitMQ 3.6.11 Milestone 2 introduces some initial changes for OTP 20 compatibility:

Michael Klishin

Jul 5, 2017, 12:10:40 PM
to rabbitm...@googlegroups.com
Team RabbitMQ has an update on the state of OTP 20 support.

OTP 20 GA is mostly backwards compatible compared to 20-rc1: I could not reproduce upgrade data loss
with any RabbitMQ version from 3.6.4 (first version to support Erlang 19) through 3.6.10 when upgrading from 19.x
to 20.0 GA.

Our point of view on 20 is still this: if you have existing installations to upgrade, wait for RabbitMQ 3.6.11
and upgrade to OTP 20 after that, or even a few months later. New installations can use 20 from the start
if they choose to.

We'd like to thank the OTP team for listening to our rc1 feedback and implementing a (mostly) compatible
version of term_to_binary/1 for the final release.


On Wed, Jun 21, 2017 at 7:52 PM, Gabriele Santomaggio <g.sant...@gmail.com> wrote:
Hi,

We removed the 20-rc* packages in our Erlang-Solutions repositories.

The Erlang/OTP 20 packages are now available at https://www.erlang-solutions.com/resources/download.html


-
Gabriele


Michael Klishin

Aug 16, 2017, 9:58:24 AM
to rabbitmq-users
Team RabbitMQ plan to ship RabbitMQ 3.6.11 very soon and it supports Erlang/OTP 20 (including Erlang upgrades).

We've been using milestone releases on OTP 20 in our long running environments for a few weeks now and
haven't identified any remaining incompatibilities.

We also have a bunch of RabbitMQ and Erlang upgrade tests that cover a couple of dozen permutations,
including upgrades to OTP 20.

vton...@paypal.com

Aug 25, 2017, 6:50:13 PM
to rabbitmq-users
If 3.6.11 supports OTP 20, is the statement below from 'https://www.rabbitmq.com/install-debian.html' now void and no longer true?
20.x: NOT SUPPORTED and will lead to DATA LOSS when upgraded to from an earlier Erlang/OTP release, avoid unless this document is updated to suggest otherwise.

Could someone please update this page with latest info for new users.

Michael Klishin

Aug 26, 2017, 1:42:21 AM
to rabbitm...@googlegroups.com
Done.
