MQTT broker scalability


soumik

Jan 30, 2012, 12:36:33 PM
to MQ Telemetry Transport
Hi,
First of all, I would like to say that I'm new to MQTT and this
community. I'm currently exploring MQTT for a commercial M2M platform
prototype. I have tried out some simple examples for MQTT pub-sub
communication.
Having done that, I'm now wondering what the general strategy is for
scaling an MQTT broker/server. I've seen some mention of an MQTT
broker acting as a "bridge" to other MQTT brokers, but I didn't fully
understand it, and from what I did understand it didn't look too
elegant either. Let me explain what I'm trying to achieve, to put
that last statement in context.
So, I'm thinking of deploying a cluster of MQTT brokers in EC2 behind
an ELB (Elastic Load Balancer). The idea is for MQTT clients to be able
to publish to MQTT brokers through the ELB Elastic IP (let's say
round-robin load-balanced), so that as the number of MQTT publisher
clients increases we can set the ELB to scale the MQTT brokers
accordingly.

With "bridging", it seems to me that we have to set the "in/out"
topics in the MQTT config file of every MQTT broker added to the
cluster. Configuration aside (taking care not to create loops between
brokers, and making each broker IP-aware of the next/previous broker),
"bridging" causes the messages on a topic to be duplicated and sent
to the next broker. This seems too "stateful" for a clean, scalable
approach.

So my question is: is there a better way to achieve scalability for
the brokers? Or am I wrong about the MQTT "bridge" approach?

Also, I was wondering how MQTT brokers store messages. It seems to me
that the current "statefulness" of an MQTT broker might be because of
the way it stores and accesses the message data.

Thanks,
Soumik

Nicholas O'Leary

Feb 3, 2012, 11:54:17 AM
to mq...@googlegroups.com
Hi Soumik,

apologies for the delay in responding. I'm sure you can appreciate
this is quite a large topic to cover.

How you design your scalability depends on what messaging patterns you
want to be able to achieve.

I don't think what follows is specific to MQTT - this would be much
the same for any messaging system that wants to scale.

Broadly, I think there are two classes of messaging pattern that need
to be considered.

First, there is a client-to-centre pattern where you have a large
number of clients that want to send messages to a back-end system and
vice versa. In this pattern, the clients don't typically need to send
messages to each other. For example, a smart metering system would
have the meters on the edge reporting back to the central
reporting/billing system, but two meters would never talk to each
other.
In this pattern you can imagine a hierarchy of brokers; the clients
connect to one of a set of front-line brokers much as you describe.
These brokers then bridge the message topics "up" to a centralised
broker (or brokers for redundancy...).
If the centre wishes to send a message out to the clients, then you
have to consider how the centre knows where to send the message;
essentially which topic should it publish to in order for the message
to be bridged to the correct front-line broker that the client is
connected (and subscribed) to. One means of doing this is for the
clients to publish a "birth" message when they connect that allows the
centre to record where the client has connected. You can also then use
the Will message to notify the centre the client is no longer
connected so cannot receive a message.
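
To make the birth/Will idea concrete, here is a minimal sketch of the bookkeeping the centre might do. It is not tied to any particular broker or client library. The topic layout (status/<broker_id>/<client_id> for birth and Will messages, to/<broker_id>/<client_id> for outbound messages), the "online"/"offline" payloads and the ClientRegistry class are all assumptions made up for illustration; in particular it assumes each front-line bridge is configured to add its own broker identifier to the topic as it bridges messages up.

```python
from typing import Dict, Optional


class ClientRegistry:
    """Hypothetical centre-side record of which front-line broker each client is on."""

    def __init__(self) -> None:
        self._location: Dict[str, str] = {}  # client_id -> broker_id

    def on_status(self, topic: str, payload: str) -> None:
        """Handle a birth ("online") or Will ("offline") message bridged up to the centre."""
        parts = topic.split("/")
        if len(parts) != 3 or parts[0] != "status":
            return  # not a status topic in this assumed layout
        _, broker_id, client_id = parts
        if payload == "online":
            # Birth message: remember where the client connected.
            self._location[client_id] = broker_id
        elif payload == "offline" and self._location.get(client_id) == broker_id:
            # Will message: only forget the client if it has not already
            # reconnected through a different front-line broker.
            del self._location[client_id]

    def route_topic(self, client_id: str) -> Optional[str]:
        """Topic to publish on so the message is bridged to the right broker, or None if offline."""
        broker_id = self._location.get(client_id)
        if broker_id is None:
            return None
        return f"to/{broker_id}/{client_id}"


registry = ClientRegistry()
registry.on_status("status/broker-a/meter-17", "online")
print(registry.route_topic("meter-17"))
```

The check on the "offline" branch is a design choice rather than anything MQTT mandates: a Will from the old broker can arrive after the client has already announced itself on a new one, and the sketch ignores it in that case.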

The second pattern is a client-to-client pattern, where you have a
large number of clients that want to send messages to each other.
Within this pattern, there is also the choice over whether you want
strict point-to-point style messaging where a client wants to send a
message to another specific client, or if you want a client to be able
to broadcast a message to multiple clients.
Fully bridging between all front-line brokers could get quite
inefficient - particularly if you are trying to achieve point-to-point
messaging. You don't want to send a message destined for a specific
client to every front-line broker just in case. So using the
hierarchical approach described above, you can keep track of where
messages need to be routed to in order to get them to their
destination.
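
As a rough illustration of why full bridging gets inefficient for point-to-point traffic, the sketch below counts broker-to-broker transfers for a single message under a deliberately simple model. The function names are hypothetical, and the model assumes the sender and recipient are on different front-line brokers and that the centre already knows where the recipient is connected.

```python
def transfers_full_mesh(num_brokers: int) -> int:
    """Transfers when the sending broker bridges the message to every
    other front-line broker "just in case"."""
    return max(num_brokers - 1, 0)


def transfers_via_centre() -> int:
    """Transfers when the message is bridged up to the centre and then
    routed down to the one broker the recipient is connected to."""
    return 2


# Columns: front-line brokers, full-mesh transfers, via-centre transfers.
for n in (2, 3, 10, 50):
    print(n, transfers_full_mesh(n), transfers_via_centre())
```

With 10 front-line brokers that is 9 transfers against 2. With three or fewer brokers the full mesh is no worse by this count, so the hierarchy only pays off as the front line grows.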


You also have to consider how to handle disconnected clients. In MQTT,
if you connect with CleanSession=false, then the broker is required to
persist the client's subscriptions and in-flight message state. That
is fine if you can guarantee the client will reconnect to the same
broker next time, but if it fails over to another broker then you have
state on the first broker that will cause problems: the subscriptions
continue to queue up messages, and any in-flight messages are left in
limbo.
The recommendation is to use CleanSession=true if you want the clients
to be able to fail-over safely - but your application must handle the
consequences of this.
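
The fail-over problem can be seen in a toy model. The ToyBroker class below is entirely hypothetical and heavily simplified (exact-match topics only, no QoS handshakes, delivery to connected clients not modelled); it only tracks what a broker would keep for a client once that client has gone away.

```python
from typing import Dict, List, Set


class ToyBroker:
    """Simplified stand-in for a broker's per-client session state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subs: Dict[str, Set[str]] = {}     # client_id -> subscribed topics
        self._queued: Dict[str, List[str]] = {}  # client_id -> messages awaiting delivery
        self._online: Dict[str, bool] = {}       # client_id -> CleanSession flag in use

    def connect(self, client_id: str, clean_session: bool) -> None:
        if clean_session:
            self._drop_session(client_id)  # start from a blank session
        self._online[client_id] = clean_session

    def disconnect(self, client_id: str) -> None:
        clean = self._online.pop(client_id, True)
        if clean:
            self._drop_session(client_id)  # nothing is kept after the connection ends

    def subscribe(self, client_id: str, topic: str) -> None:
        self._subs.setdefault(client_id, set()).add(topic)

    def publish(self, topic: str, message: str) -> None:
        # Only offline clients with a persisted subscription get messages queued.
        for client_id, topics in self._subs.items():
            if topic in topics and client_id not in self._online:
                self._queued.setdefault(client_id, []).append(message)

    def pending(self, client_id: str) -> List[str]:
        return list(self._queued.get(client_id, []))

    def _drop_session(self, client_id: str) -> None:
        self._subs.pop(client_id, None)
        self._queued.pop(client_id, None)


def fail_over(clean_session: bool) -> List[str]:
    """Client drops off broker a and reconnects to broker b; returns what is stranded on a."""
    a, b = ToyBroker("a"), ToyBroker("b")
    a.connect("meter-17", clean_session)
    a.subscribe("meter-17", "cmd/meter-17")
    a.disconnect("meter-17")
    b.connect("meter-17", clean_session)
    b.subscribe("meter-17", "cmd/meter-17")
    a.publish("cmd/meter-17", "reboot")
    return a.pending("meter-17")


print(fail_over(clean_session=False))
print(fail_over(clean_session=True))
```

In this model the CleanSession=false run leaves the message queued on broker a for a client that is now on broker b, while the CleanSession=true run leaves nothing behind. The price of the latter is that anything published while the client is disconnected is simply not held for it, which is the consequence the application has to handle.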

Of course, everything I've described here is just one way of
approaching it and I've skipped over a lot of the finer detail. But I
hope this helps.


> Also, I was wondering how MQTT brokers store messages. It seems to me
> that the current "statefulness" of an MQTT broker might be because of
> the way it stores and accesses the message data.

The MQTT spec doesn't say anything about how a broker implementation
should store the messages - that is down to the specific
implementations.

Regards,
Nick

