MQTT vs XML over TCP for lossy cellular connections

Fred Basset

Apr 29, 2014, 5:53:12 PM

to mq...@googlegroups.com

Hi All,

In my current project we have messaging between a central server and a fairly large number of remote nodes in the field. Each of the remote nodes usually connects to the public internet via a cell modem; various technologies are used, from GPRS to 3G. We use XMPP for the messaging, running over HTTPS. Message payloads are in XML and can be quite large (10s of kB). Whilst this works fairly well, I do see situations where our outgoing message queue fills up and we lose data. My theory is that we are seeing too much packet loss over the cell modems and this is causing TCP to really slow down as it waits for ACKs, retransmits, etc. The throughput on some of the modems is quite low: I've seen about 2 kB/second on average, and it is quite variable; uploads sometimes burst a bit higher and then stall at zero throughput for periods of time.

I had heard of MQTT and was wondering if this could be a better protocol for our application? If we sent binary data over MQTT the bandwidth requirements would be a lot lower and hopefully MQTT could deal with the high packet loss better. Perhaps MQTT over UDP with a customized reliability layer on top might work better?

Can anyone advise?

Thank you,

Fred

Stefano Costa

Apr 30, 2014, 3:15:22 AM

to mq...@googlegroups.com

On 04/29/2014 11:53 PM, Fred Basset wrote:
> Hi All,
>
> In my current project we have messaging between a central server and a
> fairly large number of remote nodes in the field. Each of the remote
> nodes usually connects to the public internet via a cell modem;
> various technologies are used from GPRS to 3G. We use XMPP for the
> messaging running over HTTPS. Message payloads are in XML and can be
> quite large (10s of kB). Whilst this works fairly well, I do see
> situations where our outgoing message queue fills up and we lose data.
> My theory is that we are seeing too much packet loss over the cell
> modems and this is causing TCP to really slow down as it waits for
> ACKs, re-transmits etc. The throughput on some of the modems is
> quite low, I've seen on average 2kB / second and quite variable; I've
> seen uploads burst sometimes a bit higher then be stalled at zero
> throughput for periods of time.

Latency is usually a problem with cell modems, but it depends a lot on
location and operator of course. We've seen great differences in the way
a pure TCP socket behaves among operators (latency, drops, signalling
from layers below...) but 2 kB/second is very poor! Are you sure you're
taking the overhead into account? Are you calculating the throughput on
the payload only?

>
> I had heard of MQTT and was wondering if this could be a better
> protocol for our application? If we sent binary data over MQTT the
> bandwidth requirements would be a lot lower and hopefully MQTT could
> deal with the high packet loss better. Perhaps MQTT over UDP with a
> customized reliability layer on top might work better?

MQTT works on top of TCP (I'm not sure whether UDP is a feasible option). If
you manage to shrink the payload by sending "binary" (or at least less
structured) data instead of a nice XML syntax, you will certainly gain in
terms of needed bandwidth, but you will experience similar problems if drops
or latency occur. The great benefit is dealing with a protocol and tools
that let you tune the delivery strategy (QoS), with plenty of support and
examples. We do use MQTT on GPRS (Cinterion modules with Java) with success.

--
Stefano Costa, Managing Director R&D

M +39 335 6565749
Skype stefanocosta.bluewind
Twitter @stefanobluewind
http://www.bluewind.it

Dave Locke

Apr 30, 2014, 5:00:24 AM

to mq...@googlegroups.com

Even though MQTT runs over TCP, it may help. The MQTT protocol is designed to run over constrained networks. For instance:

1) The protocol overhead is very small: 2 bytes + a topic (subject) + the payload are all that is required to deliver a message. So combining a low-overhead protocol with a compressed payload will help on low-bandwidth lines. A good practice on constrained networks is to ensure the payload is kept as small as possible.

2) MQTT will deliver a message at the requested quality of service (QoS). The QoS levels are fire and forget, at least once, and exactly once. The message is delivered at that quality of service even if the connection is dropped in the middle of delivering the message: when the connection is re-established, the protocol ensures any in-flight messages are delivered at the requested QoS.
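
As a rough illustration of point 1, the sketch below estimates the on-the-wire size of a single PUBLISH packet. It assumes the MQTT 3.1.1 packet layout as I understand it: a 1-byte fixed header, 1 to 4 bytes of "remaining length", a 2-byte topic length prefix, and a 2-byte packet identifier for QoS 1 and 2 only. The function names, topic and payload sizes are made up for the example, and TCP/IP and TLS overhead is not counted.

```python
def remaining_length_bytes(n: int) -> int:
    """Number of bytes needed to encode an MQTT 'remaining length' of n."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range for MQTT")
    count = 1
    while n > 127:
        n //= 128
        count += 1
    return count


def publish_packet_size(topic: str, payload_len: int, qos: int = 0) -> int:
    """Estimated size in bytes of one PUBLISH packet (MQTT 3.1.1 layout assumed)."""
    topic_bytes = len(topic.encode("utf-8"))
    # Variable header: 2-byte topic length + topic + packet id (QoS 1/2 only).
    variable_header = 2 + topic_bytes + (2 if qos > 0 else 0)
    remaining = variable_header + payload_len
    # Fixed header: 1 control byte + the encoded remaining length.
    return 1 + remaining_length_bytes(remaining) + remaining


def publish_overhead(topic: str, payload_len: int, qos: int = 0) -> int:
    """Bytes added by MQTT on top of the payload itself."""
    return publish_packet_size(topic, payload_len, qos) - payload_len


if __name__ == "__main__":
    topic = "site/42/readings"  # hypothetical topic name
    for size in (100, 10_000):
        print(size, publish_overhead(topic, size, qos=1))
```

With a 16-character topic the overhead stays at a couple of dozen bytes whether the payload is 100 bytes or 10 kB, so on a slow link the payload size and the topic length are what matter.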

All the best
Dave

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

Stefano Costa

Apr 30, 2014, 5:43:01 AM

to mq...@googlegroups.com

On 04/30/2014 11:00 AM, Dave Locke wrote:

> 1) The protocol overhead is very small to deliver a message 2 bytes + a topic (subject) + the payload are required to deliver a message. So combining a low overhead protocol with a payload that is compressed will help on lines that are low bandwidth. A good practise on constrained networks is to ensure the payload is kept as small as possible.

Fully agree on this, a great benefit of course. It would be nonsense to try to send "structured" data with MQTT in most situations.

> 2) MQTT will deliver a message to the requested quality of service. QOS includes fire and forget, at least once and exactly once. The message is delivered to the quality of service even if the connection is dropped in the middle of delivering the message. (when the connection is re-established the protocol ensures any inflight messages are delivered to the requested QOS).

Yes, this is also true, but it can sometimes lead to higher traffic costs, which must be taken into account and monitored.

Fred: I'd be interested in learning more about the application, if any detail can be disclosed here. Number of nodes, cellular operator being used, traffic per month... an interesting use case for MQTT.

S

Karl Palsson

Apr 30, 2014, 7:26:12 AM

to mq...@googlegroups.com

On Wed, Apr 30, 2014 at 11:43:01AM +0200, Stefano Costa wrote:
> Fully agree on this, great benefit of course. Would be a non sense
> to try to send "structured" data with MQTT on most situations.

Stop it! Please! _IF_ you're on a constrained network, yes, but please don't imply that it's
nonsense for _most_ situations to use "structured" data. I completely agree that on constrained
networks plain text, XML and JSON are not optimal, but there are plenty of people using
MQTT on much less constrained networks, because the protocol is simple, the brokers are easy and
lightweight and the client tools are nice to use. Please don't perpetuate this idea that MQTT is
only for ugly networks and that everyone has the same pattern of pub/sub ratios and client/broker
ratios.

Sincerely,
Karl Palsson

Peter Hunsberger

Apr 30, 2014, 11:40:57 AM

to mq...@googlegroups.com

Stefano, that's nonsense ;-) one cannot let technology determine the business requirements! If XML makes sense then MQTT is an excellent way to transport it for many use cases. If there are technology restrictions on bandwidth and / or data sizes solutions such as EXI (http://www.w3.org/TR/2014/REC-exi-20140211/) might be applicable.

Peter Hunsberger

--

Fred Basset

Apr 30, 2014, 2:38:15 PM

to mq...@googlegroups.com, stefan...@bluewind.it

On Wednesday, April 30, 2014 2:43:01 AM UTC-7, Stefano Costa Bluewind wrote:

> On 04/30/2014 11:00 AM, Dave Locke wrote:
>
> > 1) The protocol overhead is very small to deliver a message 2 bytes + a topic (subject) + the payload are required to deliver a message. So combining a low overhead protocol with a payload that is compressed will help on lines that are low bandwidth. A good practise on constrained networks is to ensure the payload is kept as small as possible.
>
> Fully agree on this, great benefit of course. Would be a non sense to try to send "structured" data with MQTT on most situations.
>
> > 2) MQTT will deliver a message to the requested quality of service. QOS includes fire and forget, at least once and exactly once. The message is delivered to the quality of service even if the connection is dropped in the middle of delivering the message. (when the connection is re-established the protocol ensures any inflight messages are delivered to the requested QOS).
>
> Yes this is also true but sometimes could lead to higher traffic costs, must be taken into account and monitored.
>
> Fred: I'd be interested at learning more about the application, if any detail can be disclosed here. Number of nodes, cellular operator being used, traffic per month... interesting use mode for MQTT.

Stefano & others,

Thanks for the response. I have two systems I'm involved with. One system is a monitoring system that has about 1000 nodes deployed worldwide. Various modem technologies are used, from GPRS to 3G. I don't think we have any 4G modems yet. In the US I believe we have accounts with all the major providers. On these systems an XMPP message containing XML fields representing the values read is sent up to the server every minute. The message size is quite variable. On a small system we could be reading as few as 10 - 50 values; on larger systems it could be 1000s of values. On these systems we see occasional problems of the outgoing message queue filling up (i.e. the modem not being able to transmit the values fast enough). No one has done a detailed analysis of what's actually occurring on those systems yet.

On another project we are possibly deploying 10s of 1000s of smaller remote nodes that also do monitoring. Those systems do not use XMPP but currently use SOAP over HTTPS. On those systems we currently have a CDMA modem module and use Verizon as the provider. I tested the upload speeds by using SCP to copy a large binary file of random data to a publicly available server. This method could in itself be completely inaccurate, but I had to run something to gauge the speed. I also measured download speed by doing a wget of a large binary random file and the speeds were similar. On the download I saw the speed vary greatly, from 0 throughput for extended periods (e.g. 30 seconds) to bursts of up to about 8 KBytes/s. The average there was also about 2 KBytes/second. This could just be the TCP windowing in effect, so I'm not reading too much into it. Another use case of these systems is doing software updates. These will be quite large (a few MBytes), but fairly infrequent (maybe twice a year). On these systems we will want to optimize communications to use as little bandwidth as possible, as we can then switch to a cheaper plan.

I'd also be interested in hearing what type of data format people are using in their payloads? Google protocol buffers looks pretty interesting but I've never used it.

Fred.

Stefano Costa

Apr 30, 2014, 2:49:48 PM

to mq...@googlegroups.com

> Stop it! please! _IF_ you're on a constrained network yes, but please don't imply that it's
> nonsense for _most_ situations to use "structured" data. I completely agree that on constrained
> networks used plain text and xml and json are not optimal, but there are plenty of people using
> mqtt on much less constrained networks, because the protocol is simple, the brokers are easy and
> lightweight and the client tools are nice to use. Please don't perpetuate this idea that MQTT is
> only for ugly networks and that everyone has the same pattern of pub/sub ratios and client/broker
> ratios.
>

Karl, Peter,
where did I write that MQTT is "only" for ugly networks etc?

Citing Dave:

" A good practise on constrained networks is to ensure the payload is kept as small as possible."

I was making reference to this very common situation. It's not ugly, nor does it limit any business; it's a world of less reliable and fairly expensive communication channels where MQTT is among the best choices: GPRS, satellite, proprietary radio etc. Why should we "stop" discussing this?

Thanks for understanding.
S

Darren Clark

Apr 30, 2014, 6:40:22 PM

to mq...@googlegroups.com

Fred,

FWIW we've had great success with JSON. Protobuf would also be a good choice for a more constrained network (we're primarily on 4G) if you're careful with schemas. As I'm sure most people on this list know, maintaining compatibility with 1000s of devices that may or may not have applied a recent update can be a challenge. Also, I'm not sure of the state of Protobuf and JavaScript at the moment, so to display data easily in a browser you might need an intermediary.

-Darren

Fred Basset

Apr 30, 2014, 6:49:20 PM

to mq...@googlegroups.com

Hi Darren,

Glad that JSON is working well for you. Are you using any type of compression? Also are you using TLS? 4G is what I want but it's a hard sell when the CDMA hardware is cheaper and we are doing large volumes.

Also have you ever run any tests to determine packet loss or other error rates?

Darren Clark

Apr 30, 2014, 8:05:36 PM

to mq...@googlegroups.com

Fred,

Funny you ask about compression, that's actually what I'm working on today. ;) The plan is to optionally gzip the payload and then unzip it at the server via a plug-in (we're using HiveMQ). So currently no compression, and yes, we're using TLS. The performance we get from the remote devices varies wildly (some are outside, some are in supermarkets, some in malls, etc.), and anecdotally MQTT is considerably more reliable than HTTP polling and the like. I do not have actual measurements. Apparently we're running a test next weekend in a "nightmare" location for cellular, and CDMA at that; I may be able to get someone to put some instrumentation on that one.

-Darren

Fred Basset

Apr 30, 2014, 8:55:09 PM

to mq...@googlegroups.com

Darren,

Thanks for the response. How are you packetizing the JSON data into the small MQTT packets? It would be great if you could tell me the speed you are getting at your CDMA site. All I get is 2kbyte/second on my system (upload using SCP).

Stefano Costa

May 1, 2014, 9:36:21 AM

to Fred Basset, mq...@googlegroups.com

Fred, this is a real use case and I find it interesting to discuss.

I'm sure someone else has experience with MQTT and reliability of
communication in different situations and has something to tell. MQTT is
a wonderful piece of engineering, and knowing how it behaves when
deployed is useful knowledge.

This is not directly linked to cellular modems and poor connection:
mobile phones, Wifi links, and also fixed lines in the real world are
far from stable 24/7. So the software architect must take this into
account.

Please find my personal comments below.

On 30/04/2014 20:38, Fred Basset wrote:
> Stefano & others,
>
> Thanks for the response. I have two systems I'm involved with. One
> system is a monitoring system that has about 1000 nodes deployed
> world-wide. Various modem technologies are used from GPRS to 3G. I
> don't think we have any 4G modems yet. In the US I believe we have
> accounts with all the major providers. On these systems an XMPP message
> containing XML fields representing the values read is sent up to the
> server every minute. The message size is quite variable. On a small
> system we could be reading as few as 10 - 50 values, on larger systems
> it could be 1000s of values. On these systems we see occasional
> problems of the outgoing message queue filling up (i.e. modem not being
> able to transmit the values fast enough). No-one has done a detailed
> analysis of what's actually occurring on those systems yet.

My knowledge of cellular M2M data plan prices (Europe): 0.15 to
0.50 Euro per MB per month (1-1000 SIM cards on one contract). Is this
reasonable? It depends on the countries covered and the operator. With this
rate and this amount of data to be exchanged I would keep the payload as small
as possible, and try to pack values into a few topics, while using MQTT.
This goes against compatibility, simplicity in describing the
structures, etc., but who pays at the end of the month?
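
To make the "who pays" question concrete, here is a back-of-the-envelope sketch of monthly traffic and cost for one node that sends one message per minute. It assumes the quoted rates are per megabyte of traffic, takes a megabyte as 10^6 bytes and a month as 30 days, and ignores protocol, TLS and keep-alive overhead as well as any per-session rounding an operator may apply. The message sizes are illustrative guesses, not measurements.

```python
def monthly_megabytes(bytes_per_message: int, interval_seconds: int, days: int = 30) -> float:
    """Payload traffic per month in MB (1 MB = 10**6 bytes, an assumption)."""
    messages = days * 24 * 3600 // interval_seconds
    return messages * bytes_per_message / 1_000_000


def monthly_cost_eur(bytes_per_message: int, interval_seconds: int,
                     eur_per_mb: float, days: int = 30) -> float:
    """Monthly cost in Euro at a flat per-MB rate (hypothetical tariff model)."""
    return monthly_megabytes(bytes_per_message, interval_seconds, days) * eur_per_mb


if __name__ == "__main__":
    # Assumed sizes: a 10 kB XML message versus a 500-byte packed binary one.
    for label, size in (("XML, 10 kB", 10_000), ("binary, 500 B", 500)):
        mb = monthly_megabytes(size, 60)
        low = monthly_cost_eur(size, 60, 0.15)
        high = monthly_cost_eur(size, 60, 0.50)
        print(f"{label}: {mb:.1f} MB/month, {low:.2f} to {high:.2f} EUR")
```

Under these assumptions a 10 kB message every minute adds up to 432 MB per month per node, which is why shrinking the payload dominates every other optimisation here.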

>
> On another project we are possibly deploying 10s of 1000s of smaller
> remote nodes that also do monitoring. Those systems do not use XMPP but
> currently use SOAP over HTTPS. On those systems we currently have a
> CDMA modem module and use Verizon as the provider. I tested the upload
> speeds by using SCP to copy a large binary file of random data to a
> publicly available server. This method could in itself be completely
> inaccurate, but I had to run something to gauge the speed. I also
> measured download speed by doing a wget of a large binary random file
> and the speeds were similar. On the download I saw the speed vary
> greatly, 0 throughput for extended periods (e.g. 30 seconds), to
> bursting up to about 8 KBytes/s. Average there was also about
> 2KBytes/second. This could just be the TCP windowing in effect so I'm

I wouldn't be sure that windowing is the reason for slowing down and
stalling for 30 s. It depends a lot on operator, area etc., but are you sure
you don't have any other internal bottleneck? Wget and SCP are not bad
techniques for measuring the overall speed. When applicable (Linux etc.)
iperf is a great tool that can give much more detail; we usually try to
build and use this: http://iperf.fr/

> not reading too much into it. Another use case of these systems is
> doing software updates. These will be quite large (a few MBytes), but
> be fairly infrequent (maybe twice a year). On these systems we will
> want to optimize communications to use as little bandwidth as possible,
> as we can then switch to a cheaper plan.

Software upgrades are something that MQTT can manage very well, due to
the structure of topics, reliability of communication and so on. There is no
problem in dealing with payloads that are actually the binary itself, up
to several MB. The only problem is that if anything fails at the end of
a transfer, the entire package must be sent again. I've never done this, but
I suspect that splitting big binaries into small pieces could be a good idea.

>
> I'd also be interested in hearing what type of data format people are
> using in their payloads? Google protocol buffers looks pretty
> interesting but I've never used it.

My personal experience: plain text (one single value per topic), packed
compressed binary values (proprietary structures). Depends on the
bandwidth price. Google protocol buffers seem very similar to this.

S.

Frank Pagliughi

May 1, 2014, 10:25:36 AM

to mq...@googlegroups.com

I've always thought that the stated goal of MQTT to operate over
unreliable connections was somewhat at odds with its reliance solely on
using TCP as the underlying transport. It's actually a great starting
point, since it does account for a large number of use cases, but
hopefully MQTT will evolve to allow for different transports. The thing
it has going for it is that it has the concept of a virtual "connection"
that is separate from the underlying TCP/socket connection. It can lose
and re-establish the TCP connection between the client and server
without losing the virtual connection. This opens the door for a lot of
different transports.

I've done a lot of remote data loggers that use satellite modems. These
bill by connection time, not bandwidth, so call times need to be kept to
just a few seconds. But the latencies are horrific - half a second or
more. So you don't want to use an underlying protocol that waits for a
reply before performing the next step. You need something that bursts a
lot of packets the moment the connection is established, while the other
side is sending over all the ACKs and NAKs from the previous connection.

At the macroscopic level, this is exactly what MQTT does with messages,
but for some connection types, a different underlying transport would be
appropriate. This can be done with a protocol built on top of UDP.
Whether it would be made a part of MQTT or a separate standard and/or
library is another discussion. It would be valuable for MQTT, but useful
outside of it.

For the satellite loggers, we actually have a network service on both
sides that tricks the apps into thinking that they're talking TCP to the
other side, but it's actually bursting UDP to send the data.

Frank

Fred Basset

May 1, 2014, 11:31:32 PM

to mq...@googlegroups.com

Good discussion, very informative. How is MQTT actually more efficient than TCP if you are using the QOS where sent packets are acknowledged and only received once? In my application we don't have any fire and forget type information.

Darren Clark

May 1, 2014, 11:49:25 PM

to mq...@googlegroups.com

Regarding MQTT vs. raw TCP, if you want the same pub/sub semantics you would have to code that yourself. You would have to deal with fragmentation and whatnot. While I'm sure I could write all that, I'd rather not...

-D

Fred Basset

May 2, 2014, 2:53:48 PM

to mq...@googlegroups.com

OK, makes sense. If I have for example 10KB of JSON data I want to publish, how do I determine the optimal size for the MQTT packets? E.g. should I just send it as one single 10KB MQTT packet, or break it up? My use case would be cellular modem connections with high packet loss.

Karl P

May 2, 2014, 3:25:14 PM

to mq...@googlegroups.com

On 05/01/2014 12:55 AM, Fred Basset wrote:
> Darren,
>
> Thanks for the response. How are you packetizing the JSON data into the small
> MQTT packets? It would be great if you could tell me the speed you are getting
> at your CDMA site. All I get is 2kbyte/second on my system (upload using SCP).

What do you mean by "small mqtt packets" ? You write the entire json blob into
your mqtt message, tcp takes care of fragmenting if the link layer requires it.

I mean, you _can_ fragment yourself, but you'd have to have a pretty good reason
to do so.

Cheers,
Karl P

Stefano Costa

May 3, 2014, 3:07:17 AM

to mq...@googlegroups.com

On 05/02/2014 08:53 PM, Fred Basset wrote:
> OK, makes sense. If I have for example 10KB of JSON data I want to
> publish, how do I determine the optimal size for the MQTT packets?
> E.g. should I just send it as one single 10KB MQTT packet, or break
> it up? My use case would be cellular modem connections with high
> packet loss.

You mentioned "size of MQTT packet" here and in a previous email
message. MQTT has a very large upper limit on the payload size
(268435455 bytes), provided that the implementation you're using does not
impose a lower limit of its own (which would be outside the standard,
typically due to hardware or platform restrictions).

Packet loss is a problem in any case and will be managed by TCP and MQTT
with retransmissions and so on depending on QoS being used.
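
For reference, the 268435455 figure comes from the way MQTT encodes the "remaining length" field of the fixed header: up to four bytes, each carrying 7 bits of length plus a continuation bit, giving 128^4 - 1. Strictly speaking that bound covers the variable header (topic and so on) plus the payload, so the usable payload is slightly smaller. A minimal sketch of the encoding, following my understanding of the MQTT 3.1.1 scheme (the function names are my own):

```python
MAX_REMAINING_LENGTH = 128 ** 4 - 1  # 268435455


def encode_remaining_length(n: int) -> bytes:
    """Encode n as 1 to 4 bytes, 7 bits per byte, low-order group first."""
    if not 0 <= n <= MAX_REMAINING_LENGTH:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def decode_remaining_length(data: bytes) -> int:
    """Decode the length field found at the start of data."""
    value = 0
    multiplier = 1
    for index, byte in enumerate(data):
        if index >= 4:
            raise ValueError("malformed remaining length (more than 4 bytes)")
        value += (byte & 0x7F) * multiplier
        if byte & 0x80 == 0:
            return value
        multiplier *= 128
    raise ValueError("truncated remaining length")


if __name__ == "__main__":
    for n in (0, 127, 128, 10_000, MAX_REMAINING_LENGTH):
        print(n, encode_remaining_length(n).hex())
```

A 10 KB JSON message therefore costs only two length bytes, which is one reason there is no protocol-level pressure to split it.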

Andy Stanford-Clark

May 8, 2014, 12:39:49 PM

to mq...@googlegroups.com

Sorry to be late to the party, but I thought I'd just add a few thoughts about TCP, MQTT, QoS and packetisation ...

As long as the connection stays up, TCP/IP offers once and once only, in-order delivery of messages. This has often puzzled people who are looking at messaging middleware products and protocols, which appear to offer the same thing.
But the difference is that MQTT and similar application-level protocols that sit on top of TCP/IP offer application-to-application reliability of data delivery, whereas TCP/IP only offers its quality of service up to the top of the TCP/IP stack. What the application then does with the data is not the concern of TCP, and that's where messages often get lost, and hence where the QoS of MQTT (and others) comes into play.

The next thing is the "as long as the connection stays up" at the start of that paragraph. If the connection drops, and with cellular, satellite, wifi, etc. networks it often does, particularly "out in the wild", then TCP/IP doesn't help you with delivering your messages. Anything in flight when the connection drops is lost. MQTT with its QoS 1 and 2 ("at least once" and "exactly once") delivery gives you the application-to-application reliability of delivery, even if the connection breaks.

The retry timer is important here, though. If you're on a network with low bandwidth and/or high latency, having the retry timer pop while you're waiting for the acknowledgement of a QoS 1 or 2 message that is working perfectly well, but just taking a long time, only makes things worse: you've now added a second copy of the message into the network, which chews precious bandwidth and, if things are getting backed up, can easily compound the problem.
So backing off the retry timer is something to think about for low bandwidth and/or high latency networks.
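
One way to picture "backing off the retry timer" is sketched below: pick a first timeout that comfortably exceeds the time the message needs just to cross the link, then grow it on each retry up to a cap. This is only an illustration under assumed numbers; whether and how a retry interval can be configured depends on the client library, and the function names and the safety factor here are hypothetical.

```python
def initial_retry_timeout(payload_bytes: int, bytes_per_second: float,
                          round_trip_seconds: float, safety: float = 2.0) -> float:
    """First timeout: time to push the payload plus one round trip, with margin."""
    transfer_seconds = payload_bytes / bytes_per_second
    return safety * (transfer_seconds + round_trip_seconds)


def retry_delays(base_seconds: float, factor: float = 2.0,
                 cap_seconds: float = 600.0, attempts: int = 5) -> list:
    """Exponentially growing retry delays, capped at cap_seconds."""
    delays = []
    delay = base_seconds
    for _ in range(attempts):
        delays.append(min(delay, cap_seconds))
        delay *= factor
    return delays


if __name__ == "__main__":
    # Assumed link: 10 kB message, 2 kB/s throughput, 1 s round trip.
    first = initial_retry_timeout(10_000, 2_000, 1.0)
    print(first, retry_delays(first, attempts=4))
```

With these assumed figures the first retry would not fire before 12 seconds, whereas a fixed timeout of a few seconds would inject a duplicate while the original is still in transit.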

The final thought is about packetisation. If your network is quite reliable, and you don't mind if some messages get lost, then using QoS 0 ("at most once" delivery) is most likely the right approach.
However, if you need to ensure delivery, then QoS 1 or 2 is called for. In this case, if you have a large message payload and the connection drops, MQTT will re-connect later and re-send the message (in order to achieve the assured delivery). It is quite possible that a very large message might never get delivered in this scenario, as the connection might never stay up long enough to deliver the entire message (particularly large messages over slow networks).
MQTT doesn't have a "restart from byte x" capability (like, ISTR, some variant of FTP has (memory rusty here)), so a suggested approach in this situation is to break the large message up into a set of smaller messages, send each of those messages reliably to the other end (including some kind of "message 1 of 5" type indicator, perhaps in the topic name, to avoid polluting the payloads), and then have the client at the receiving end re-assemble the message pieces before passing the entire message on to the "application" when it's all been received successfully.
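
A minimal sketch of that split-and-reassemble idea follows. It is independent of any MQTT client library: it only produces (topic, chunk) pairs to publish and consumes them on the receiving side. The topic layout "updates/<id>/<index>/<total>" and all names are hypothetical, and the message id must not contain a "/". Duplicate chunks, which QoS 1 may produce, simply overwrite the earlier copy.

```python
def split_message(message_id: str, payload: bytes, chunk_size: int) -> list:
    """Return (topic, chunk) pairs; the topic carries the 'n of m' indicator."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    total = len(chunks)
    return [(f"updates/{message_id}/{index}/{total}", chunk)
            for index, chunk in enumerate(chunks, start=1)]


class Reassembler:
    """Collects chunks per message id and returns the payload once complete."""

    def __init__(self):
        self._parts = {}

    def add(self, topic: str, chunk: bytes):
        _prefix, message_id, index, total = topic.split("/")
        parts = self._parts.setdefault(message_id, {})
        parts[int(index)] = chunk  # a re-delivered chunk replaces the old copy
        if len(parts) < int(total):
            return None  # still waiting for more pieces
        del self._parts[message_id]
        return b"".join(parts[i] for i in range(1, int(total) + 1))


if __name__ == "__main__":
    payload = bytes(range(256)) * 40  # 10240 bytes of sample data
    pieces = split_message("fw1", payload, 1024)
    receiver = Reassembler()
    result = None
    for topic, chunk in reversed(pieces):  # arrival order does not matter
        result = receiver.add(topic, chunk)
    print(len(pieces), result == payload)
```

In a real deployment you would probably also send a checksum of the whole payload so the receiver can verify the reassembled result before acting on it.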

We've used this in the past for trickling firmware updates across a network over a period of time, and then later, the receiving device sent back a message saying "OK, I've got it all, and the checksums are good", and we say "OK - apply it then.. NOW", so we can control the time at which the upgrade takes place, even if it's taken minutes/hours/days to get the new image out there.

There is probably scope for a "packetising client" library that sits between the MQTT protocol library and the application, if someone feels like a little challenge :)

And finally on the structured vs binary data - to reiterate what's been said - the value to the business of easy-to-parse, somewhat structured data almost always far outweighs the additional network overheads and cost of sending it. So it's best to send something like JSON (or XML if you like all the angle brackets!) in preference to some proprietary binary format.
Compressing the payload (even in a cheap and cheerful way if the client device isn't up to full-blown LZW zipping) can make a huge difference to both cost of transmission and success of message delivery in the face of unreliable connections.
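
As a small illustration of the compression point, the sketch below packs a JSON document with Python's standard zlib module before it would be handed to an MQTT client, and unpacks it on the other side. Note that zlib implements DEFLATE rather than LZW; it is used here only because it ships with Python. The sensor names and values are invented, and actual savings depend entirely on the data.

```python
import json
import zlib


def pack(readings: dict) -> bytes:
    """Serialise to compact JSON, then compress with zlib (DEFLATE)."""
    text = json.dumps(readings, separators=(",", ":"))
    return zlib.compress(text.encode("utf-8"), level=9)


def unpack(blob: bytes) -> dict:
    """Reverse of pack(): decompress, then parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical payload: 1000 named readings, as on one of the larger systems.
    readings = {f"sensor_{i:04d}": round(20 + i * 0.01, 2) for i in range(1000)}
    raw = json.dumps(readings, separators=(",", ":")).encode("utf-8")
    packed = pack(readings)
    print(len(raw), len(packed))
    assert unpack(packed) == readings
```

The receiver has to know the payload is compressed, for example by convention on the topic name, since MQTT itself treats the payload as opaque bytes.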

Andy S-C
