scalable MQTT broker


ajay aggarwal

May 16, 2017, 8:44:41 PM
to Mainflux
Hi Drasko,

We exchanged some emails and you suggested to move this discussion here so others can benefit too. 

I am looking for an MQTT broker that can scale horizontally. I should also mention that we use a microservices-based architecture and deploy our services in containers using Mesos/Marathon. Currently we use RabbitMQ (with its MQTT plugin) as our MQTT broker, but I can see at least two big issues that will prevent us from scaling it linearly.
  1. First, RabbitMQ seems to replicate client state to all nodes, which will not scale well as the number of clients increases (to, say, tens of millions).
  2. Second, on the backend we need a wildcard subscription, which creates a firehose: all messages must flow through the node to which the backend service is connected for its wildcard subscription.
I was wondering what your take is on the above issues. Are there other brokers that solve them more elegantly?

If we were to implement the MQTT broker from scratch, we would use a horizontally scalable database like Cassandra to solve problem #1 and Kafka to solve problem #2. But before we decide to implement our own broker, I want to make sure I am not missing an existing solution that already solves the above issues.

Would appreciate your input.

Thanks.

Ajay

Drasko DRASKOVIC

May 17, 2017, 6:50:18 PM
to ajay aggarwal, André Graf, Mainflux
Hi Ajay,

On Wed, May 17, 2017 at 2:44 AM, ajay aggarwal <ajay...@gmail.com> wrote:
> Hi Drasko,
>
> We exchanged some emails and you suggested to move this discussion here so
> others can benefit too.
>
> I am looking for a MQTT broker that can scale horizontally. I should also
> mention that we use micro-services based architecture and deploy our
> services in containers using mesos/marathon. Currently we are using rabbitMQ
> (with its MQTT plugin) for MQTT broker. But there are at least 2 big issues
> that I can see with it that will prevent us to scale it linearly.
>
> First it seems that rabbitMQ replicates client state to all nodes, which
> will not scale well as the number of clients increase (to say 10's of
> millions)

I am not sure that full replication is necessary, but all nodes in
the cluster must share a consistent in-memory state. Erlang gets this
naturally, since distribution is built into the language, probably
via Mnesia (Erlang's distributed database). In other languages you
have to "simulate" this behavior, via Redis for example.

So I would be very surprised if RabbitMQ did not have some option to
use distributed RAM between the nodes in the cluster. But I do not
know RabbitMQ well...

> Second on the backend we need to use wildcard subscription which creates a
> firehose situation because all messages need to flow through the node where
> the backend service is connected to for its wildcard subscription.

This is a bigger problem :).

>
> I was wondering whats you take on above issues. Are there other brokers
> which solve these issues more elegantly?

Well, to understand the problem, start with my post that you
referred to:
https://groups.google.com/forum/#!topic/rabbitmq-users/KVMNkAsW-ac.
To dig deeper, look at my posts here:
https://github.com/erlio/vernemq/issues/197 and especially here:
https://dev.eclipse.org/mhonarc/lists/mosquitto-dev/msg01273.html

As you can see, what happens is the following: on the device-facing
cloud interface (let's call this the south bridge) you can probably
do load balancing (the people in the 2lemetry video
https://www.youtube.com/watch?v=VoTclkxSago use DNS load balancing
here, but I think TCP balancing should work just fine:
https://www.nginx.com/resources/admin-guide/tcp-load-balancing/).
However, on the northbound interface of your broker (or cluster) you
SUB with the application. Even if there are 100 nodes (brokers) in
the cluster (i.e. they replicate state among themselves and
communicate to keep it consistent), your app can connect to only one
node. Which one exactly does not matter (the load balancer chooses
one round-robin, or with some more clever algorithm), but you are
connected to only one node of the cluster. If your app does SUB on
`#`, then all the other nodes have to route their messages to the
node to which the app is connected.

If you have nodes A, B and C, and your app SUBs on B with `#`, then
whatever a client connected to A or C publishes must go through B,
because there is a subscriber on B that demands all messages
published on any MQTT topic. As you can see, this defeats the purpose
of scaling.
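
To make the firehose concrete, here is a minimal sketch in plain
JavaScript. It is a toy model, not any real broker's routing code: the
function names (topicMatches, simulate) and the sample topics are made
up for illustration, and the matcher ignores special cases such as
`$SYS` topics. It counts how many publishes have to cross the cluster
to reach each node when the only subscription is `#` on B.

```javascript
// Minimal MQTT topic-filter match: '+' matches exactly one level,
// '#' matches the remaining levels (including none).
function topicMatches(filter, topic) {
  const f = filter.split('/');
  const t = topic.split('/');
  for (let i = 0; i < f.length; i++) {
    if (f[i] === '#') return true;     // multi-level wildcard
    if (i >= t.length) return false;   // filter is longer than the topic
    if (f[i] !== '+' && f[i] !== t[i]) return false;
  }
  return f.length === t.length;
}

// Toy cluster: a publish arriving on one node is forwarded to every
// OTHER node that holds a matching subscription.
// subscriptions: { nodeName: [filter, ...] }, publishes: [{ node, topic }]
function simulate(subscriptions, publishes) {
  const forwardedTo = {};
  for (const node of Object.keys(subscriptions)) forwardedTo[node] = 0;
  for (const { node: origin, topic } of publishes) {
    for (const [node, filters] of Object.entries(subscriptions)) {
      if (node === origin) continue; // local delivery, no inter-node hop
      if (filters.some((filter) => topicMatches(filter, topic))) {
        forwardedTo[node] += 1;
      }
    }
  }
  return forwardedTo;
}

const subs = { A: [], B: ['#'], C: [] };
const pubs = [
  { node: 'A', topic: 'sensors/1/temp' },
  { node: 'A', topic: 'sensors/2/temp' },
  { node: 'C', topic: 'sensors/3/humidity' },
  { node: 'B', topic: 'sensors/4/temp' },
];
const result = simulate(subs, pubs);
console.log(result); // { A: 0, B: 3, C: 0 }
```

Every publish that did not originate on B ends up being forwarded to
B, so adding more nodes does not reduce the load on the node that
holds the `#` subscription.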

Let's say now that you decide to connect 3 apps (workers) that just
take data from the 3 nodes in your cluster and write it to a database.
In the ideal case, worker 1 would take all messages from node A,
worker 2 all the messages from node B and worker 3 all the messages
from node C. They would do this in parallel and write them into the
DB. But as you can see, this is not possible, because you cannot SUB
to all the messages from A and ONLY A if A is clustered with B and C.

The only thing that is left is to get your hands dirty, dive into
the code of a broker and change it slightly. For this purpose most
brokers provide hooks, which is what André from the VerneMQ team
mentioned when replying to my question here:
https://github.com/erlio/vernemq/issues/197

Now, in EMQ you must program these hook handlers in Erlang (as a
plugin), while in VerneMQ you can do it in Erlang and in Lua, via
vmq-diversity: https://vernemq.com/blog/2016/04/29/vmq-diversity-the-vernemq-plugin-builder-toolkit.html.

Mainflux for now (and this will probably change) uses a Node.js-based
broker for several reasons, one of which is that JavaScript is
something most people understand and can change/adapt. It was not an
easy choice given all the constraints, but Aedes
(https://github.com/mcollina/aedes) seems to be efficient
(http://www.nearform.com/nodecrunch/performance-reaching-ludicrous-speed/)
and scales (shared state) via a Dynamo-like consistent hash ring
(https://github.com/mcollina/aedes/issues/76).
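
For readers unfamiliar with the idea, a generic consistent hash ring
can be sketched as below. This is not how Aedes implements it; the
class name HashRing, the FNV-1a hash, the virtual-node count and the
client ids are all assumptions chosen for illustration. The point it
shows is that when a node leaves, only the keys that node owned are
reassigned.

```javascript
// FNV-1a 32-bit hash; any stable hash would do for this sketch.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

class HashRing {
  constructor(nodes, vnodes = 64) {
    this.vnodes = vnodes; // virtual points per node, to smooth the spread
    this.points = [];     // kept sorted by hash: [{ hash, node }]
    for (const node of nodes) this.add(node);
  }

  add(node) {
    for (let i = 0; i < this.vnodes; i++) {
      this.points.push({ hash: fnv1a(`${node}#${i}`), node });
    }
    this.points.sort((a, b) => a.hash - b.hash);
  }

  remove(node) {
    this.points = this.points.filter((p) => p.node !== node);
  }

  // Owner = first point clockwise from the key's hash, wrapping around.
  lookup(key) {
    if (this.points.length === 0) return undefined;
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) { // binary search for the first point with hash >= h
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].hash < h) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].node;
  }
}

const ring = new HashRing(['A', 'B', 'C']);
const clients = Array.from({ length: 1000 }, (_, i) => `client-${i}`);
const before = clients.map((c) => ring.lookup(c));
ring.remove('C');
const after = clients.map((c) => ring.lookup(c));

const ownedByC = before.filter((n) => n === 'C').length;
const moved = clients.filter((_, i) => before[i] !== after[i]).length;
console.log({ ownedByC, moved }); // the two counts are equal
```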

The Mainflux MQTT broker is slightly modified: the "authorizePublish"
hook is used (although I think "on_publish" would be a better
candidate here) to publish data out of that instance (and that
instance only) of the MQTT broker towards an external application:
https://github.com/mainflux/mainflux-mqtt/blob/master/mainflux-mqtt.js#L111.
In our case that external application is the NATS broker, but in your
case it will be a worker that takes this data and writes it to the DB.
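
The per-instance hook idea can be sketched as follows. This is not the
Mainflux code and it does not use the Aedes library: ToyBrokerNode and
forwardLocalPublishes are hypothetical stand-ins, and only the hook
shape (client, packet, callback) is modelled on Aedes'
authorizePublish. The sink function stands in for whatever sits behind
each node (NATS in our case, a DB-writing worker in yours).

```javascript
// Stand-in for ONE broker instance in a cluster (hypothetical class).
class ToyBrokerNode {
  constructor(name) {
    this.name = name;
    // Default hook: allow every publish.
    this.authorizePublish = (client, packet, callback) => callback(null);
  }

  // Called when a client connected to THIS node publishes a packet.
  publish(client, packet) {
    let accepted = false;
    this.authorizePublish(client, packet, (err) => { accepted = !err; });
    return accepted;
  }
}

// Install a hook that copies each locally published packet to a sink.
// Because the hook runs on the node that received the publish, the sink
// sees that node's traffic and nothing from the rest of the cluster.
function forwardLocalPublishes(node, sink) {
  node.authorizePublish = (client, packet, callback) => {
    sink({
      node: node.name,
      clientId: client.id,
      topic: packet.topic,
      payload: packet.payload,
    });
    callback(null); // still authorize the publish
  };
}

const nodes = ['A', 'B', 'C'].map((name) => new ToyBrokerNode(name));
const received = { A: [], B: [], C: [] }; // one worker inbox per node
for (const node of nodes) {
  forwardLocalPublishes(node, (msg) => received[node.name].push(msg));
}

nodes[0].publish({ id: 'dev-1' }, { topic: 'sensors/1/temp', payload: '21.5' });
nodes[0].publish({ id: 'dev-2' }, { topic: 'sensors/2/temp', payload: '19.0' });
nodes[2].publish({ id: 'dev-3' }, { topic: 'sensors/3/humidity', payload: '40' });

const counts = Object.fromEntries(
  Object.entries(received).map(([name, msgs]) => [name, msgs.length])
);
console.log(counts); // { A: 2, B: 0, C: 1 }
```

Each worker gets only what was published on its own node, which is
the "A and ONLY A" behavior that a plain SUB on a clustered broker
cannot give you.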

>
> If we were to implement the MQTT broker from scratch we would use a
> horizontally scalable database like Cassandra to solve problem #1 and use
> Kafka to solve #2. But before we decide to implement our own broker, I
> wanted to make sure I am not missing out on some existing solution that is
> already out there and solves above issues.
>
> Would appreciate your input.

Yep, no magic wand here. You must implement a small plugin ;).

I hope this helps.

BR,
Drasko

ajay aggarwal

May 27, 2017, 11:50:11 AM
to Mainflux
Thanks Drasko. Yes, hooks like on_publish in EMQ and VerneMQ make it really easy to accomplish what I am looking to do. The only caveat: I need to learn Erlang.

But thanks for sharing this very useful insight into these issues. And good luck with Mainflux.

Drasko DRASKOVIC

May 28, 2017, 9:40:56 PM
to ajay aggarwal, Mainflux
On Sat, May 27, 2017 at 5:50 PM, ajay aggarwal <ajay...@gmail.com> wrote:
> Thanks Drasko. Yes the hooks like on_publish in EMQ and VerneMQ make it real easy to accomplish what I am looking to do. Only caveat.. need to learn Erlang.

Yes, but Erlang really shines for these use-cases (scalability, fault
tolerance and HA, ...).

On the other hand, it might be easier for you to start with Aedes.
We are currently using it because it is easier to deploy and modify,
and because of the greater popularity of JavaScript in the community
(a lower barrier to contributions and collaborative work). Aedes
should also scale very well.

> But thanks for sharing this very useful insight into these issues. And good luck with Mainflux.

Np, thanks.

BR,
Drasko