MQTT QoS vs TCP

pozz

May 19, 2023, 5:42:14 AM

I know TCP is able to guarantee the delivery of messages from sender to
receiver, without corruption (thanks to checksums), in order and without
duplication (thanks to sequence numbers).

So I'm confused when I read something about MQTT QoS. For example, QoS 1
(at least once) uses ack (PUBACK) and retransmissions (with packet ids)
to guarantee that the message is delivered at least one time.

I'm wondering how this is possible over a TCP connection that uses the
same technique of ack and sequence numbers.

From what I know about TCP, if some data isn't acknowledged by the
receiver, the sender automatically resends the unacknowledged segments.
This is done by the TCP/IP stack or kernel, without the application
knowing anything about it.

It seems to me it's impossible that a MQTT client needs to resend a MQTT
message because it wasn't received by the broker. If this happens, TCP
should signal the error to the application that should close and try to
reopen the connection.

Ed Prochak

May 19, 2023, 11:14:35 AM

TCP is only one of the lower levels of the protocol stack.
Data can sometimes be lost in the higher levels.

Secondly, there is the issue of resend timeouts. If TCP fails to deliver
the message past the MQTT retry time limit, then MQTT will resend the
message.

HTH,
Ed

Ivan Shmakov

May 19, 2023, 12:52:20 PM

>>>>> On 2023-05-19, pozz <pozz...@gmail.com> wrote:

> It seems to me it's impossible that a MQTT client needs to resend
> a MQTT message because it wasn't received by the broker. If this
> happens, TCP should signal the error to the application that
> should close and try to reopen the connection.

After which an MQTT client will need to retransmit its
message, no? The difference between QoS 0 and QoS 1 boils
down to whether the sender of the message actually
bothers to do that.

(I /think/ QoS 1 also allows for reliable delivery along the
entire client-to-server-to-another-client path, but I'm not
sure about that.)

TCP will also signal an error when the message has
successfully reached its destination, but the respective
acknowledgement has not; as such, an application level
protocol running over TCP generally needs the means to weed
out the duplicates that are bound to happen in this case.
Which is what MQTT QoS 2 does.
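
To make the lost-acknowledgement case concrete, here is a minimal Python sketch. It is not taken from the MQTT specification or from any real client library: the class names Sender and Receiver, the dedupe flag, and the tuple layout are all made up for illustration. It also simplifies QoS 2 to "remember the packet ids already seen"; the real protocol uses a longer handshake so the receiver can eventually forget those ids.

```python
class Sender:
    """Keeps every message until it is acknowledged (hypothetical helper)."""

    def __init__(self):
        self.next_id = 1
        self.unacked = {}  # packet id -> payload

    def publish(self, payload):
        pid = self.next_id
        self.next_id += 1
        self.unacked[pid] = payload
        return (pid, payload, False)  # (packet id, payload, DUP flag)

    def on_ack(self, pid):
        self.unacked.pop(pid, None)

    def on_reconnect(self):
        # Resend everything still unacknowledged, marked as duplicates.
        return [(pid, p, True) for pid, p in sorted(self.unacked.items())]


class Receiver:
    """Delivers messages to the application; optionally drops repeats."""

    def __init__(self, dedupe):
        self.dedupe = dedupe
        self.seen = set()
        self.delivered = []

    def on_publish(self, packet):
        pid, payload, _dup = packet
        if self.dedupe and pid in self.seen:
            return pid  # acknowledge again, but do not deliver twice
        self.seen.add(pid)
        self.delivered.append(payload)
        return pid


def run(dedupe):
    sender, receiver = Sender(), Receiver(dedupe)
    packet = sender.publish("temp=21")
    # The message arrives, but the connection drops before the ack
    # gets back, so the sender never calls on_ack() for it.
    receiver.on_publish(packet)
    # After reconnecting, the sender replays what it still holds.
    for resent in sender.on_reconnect():
        sender.on_ack(receiver.on_publish(resent))
    return receiver.delivered, sender.unacked


qos1_delivered, qos1_unacked = run(dedupe=False)
qos2_delivered, qos2_unacked = run(dedupe=True)
print(qos1_delivered)  # ['temp=21', 'temp=21']
print(qos2_delivered)  # ['temp=21']
```

Without deduplication the application sees the message twice ("at least once"); with it, once ("exactly once"). In both runs the sender ends up with nothing left to resend.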

--
FSF associate member #7257 http://am-1.org/~ivan/

pozz

May 22, 2023, 3:11:17 AM

On 19/05/2023 17:14, Ed Prochak wrote:
> TCP is only one of the lower levels of the protocol stack.
> Data can sometimes be lost in the higher levels.

In this case, there's only one higher level, which is the MQTT application.
How could an application running on a machine lose something? Network
links aren't reliable, but applications running on a processor are reliable.
Are you thinking of an application crash, or an entire machine crash that
needs a reboot? In this case, after the reboot, the MQTT application
usually doesn't know anything about the previous connection, timeout and
lost messages... unless it saved something in non-volatile memory.

> Secondly, there is the issue of resend timeouts. If TCP fails to deliver
> the message past the MQTT retry time limit, then MQTT will resend the
> message.

What happens in this case? Suppose one TCP segment carrying a single MQTT
message (just for simplicity), sent by a client to the server (the
broker), is lost. After a TCP timeout, the network stack resends the
segment on its own until an ACK is received. Even if the MQTT
application resends the MQTT message *before* the TCP timeout, it will not
be sent by the TCP layer until the previous segment is acked.
Or maybe, more exactly, on the receiving machine, the TCP layer will not
pass the resent message to the application (the MQTT broker) before the
lost TCP segment is received as well. When the lost TCP segment is
received, the broker will receive two MQTT messages: the "original" and
the resent one. I think it's impossible for the broker to receive the
second transmission without receiving the first.

So it seems to me the retransmission made at the MQTT level is
completely useless... but I think I didn't get the real point here.
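
The in-order argument above can be sketched in a few lines of Python. This is a toy model, not real TCP: the class InOrderReceiver and its method names are invented for illustration, sequence numbers simply count bytes from zero, and partial overlaps between segments are ignored.

```python
class InOrderReceiver:
    """Toy model of TCP's receive side: bytes reach the application
    strictly in sequence order (hypothetical helper, not a real stack)."""

    def __init__(self):
        self.expected = 0      # next byte offset the application may see
        self.pending = {}      # seq -> data held back, waiting for a gap
        self.delivered = []    # what the application (the broker) has read

    def on_segment(self, seq, data):
        if seq < self.expected or seq in self.pending:
            return  # duplicate segment, silently dropped
        self.pending[seq] = data
        # Hand over every segment that is now contiguous.
        while self.expected in self.pending:
            chunk = self.pending.pop(self.expected)
            self.delivered.append(chunk)
            self.expected += len(chunk)


rx = InOrderReceiver()
original = b"PUBLISH#1"
resend = b"PUBLISH#1(dup)"

# The segment carrying the original message is lost; the segment
# carrying the application-level resend arrives first.
rx.on_segment(len(original), resend)
held_back = list(rx.delivered)

# TCP retransmits the lost segment; now both are released, in order.
rx.on_segment(0, original)
print(held_back)       # []
print(rx.delivered)    # [b'PUBLISH#1', b'PUBLISH#1(dup)']
```

In this model the broker never sees the resent message before the original, which is why a resend on the same connection cannot get ahead of a stalled segment.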

David Brown

May 22, 2023, 4:09:14 AM

I haven't used MQTT much, but generally if an application gets a timeout
and wants to retry, it will close the TCP/IP connection and open a new
one. (Or rather, open a new one while the old one is closing - closing
a failing TCP/IP connection can be slow.)

I would actually have thought that UDP was a more natural choice for
MQTT, rather than TCP - although older versions of MQTT did not have QoS
and were therefore reliant on TCP's acknowledges and retries.

(I always think it's a shame that SCTP never caught on - among its many
benefits, you don't have this head-of-line blocking issue.)

pozz

May 22, 2023, 5:08:37 AM

I'm quite sure that the MQTT retransmission mechanism is *not* based on a
new TCP connection. In MQTT, the TCP connection is persistent. It can
stay open for days without exchanging any real data. In this case, the
keepalive facility is there to detect a broken link.
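
As a rough illustration of how a client might use the keepalive interval on an otherwise idle connection, here is a small Python sketch. The class KeepAlive and its method names are hypothetical. The rule of sending a PINGREQ once the keepalive interval has passed without sending anything follows the MQTT specification; giving up after 1.5 times the interval mirrors the tolerance the specification gives the broker, and applying the same factor on the client side is an assumption made here.

```python
class KeepAlive:
    """Decides what an idle client should do next (hypothetical helper)."""

    def __init__(self, keep_alive_s):
        self.keep_alive_s = keep_alive_s
        self.last_sent = 0.0       # time of the last packet we sent
        self.last_received = 0.0   # time of the last packet we received

    def on_sent(self, now):
        self.last_sent = now

    def on_received(self, now):
        self.last_received = now

    def action(self, now):
        # Assumption: treat 1.5x the interval of silence as a dead link.
        if now - self.last_received > 1.5 * self.keep_alive_s:
            return "reconnect"
        # Nothing sent for a full interval: probe the link.
        if now - self.last_sent >= self.keep_alive_s:
            return "send_pingreq"
        return "idle"


# Healthy link: the broker answers the ping.
healthy = KeepAlive(keep_alive_s=60)
first = healthy.action(now=30)     # "idle"
second = healthy.action(now=60)    # "send_pingreq"
healthy.on_sent(now=60)
healthy.on_received(now=61)        # PINGRESP arrives
third = healthy.action(now=100)    # "idle"

# Broken link: the ping is sent but nothing ever comes back.
broken = KeepAlive(keep_alive_s=60)
broken.on_sent(now=60)
verdict = broken.action(now=100)   # "reconnect"
```

The times are plain numbers passed in by the caller, so the logic can be exercised without real sockets or clocks.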

David Brown

May 22, 2023, 6:56:19 AM

If the TCP/IP connection is working correctly, messages will be
transmitted correctly to the broker. If a QoS message fails to be
transmitted - the MQTT client or server does not receive an acknowledgement
in time - then there are two possible issues. One is that the
server/broker application is in trouble. The other is that there is an
issue with the network. In most cases, I would suspect the network
first. TCP/IP already has acknowledgements and timeouts, so if it is a
temporary problem then it is likely to be handled there. By the time it
reaches the attention of the application protocol's QoS handling, you
are definitely at the point where a new TCP/IP connection is the right
way to go - perhaps targeting a different IP address or via a different
route.

The MQTT application already has to handle dropping and making new
TCP/IP connections - even if the norm is for the connection to last for
weeks at a time or more. So creating a new TCP/IP link has a lot to
gain, and very little to lose, and it is the standard way to handle such
issues.

pozz

May 23, 2023, 2:53:50 AM

Yes, this is the only solution for me too. Anyway, I don't know if this
behaviour (closing and reopening TCP connection) is described in the
MQTT specifications.

> The MQTT application already has to handle dropping and making new
> TCP/IP connections - even if the norm is for the connection to last for
> weeks at a time or more. So creating a new TCP/IP link has a lot to
> gain, and very little to lose, and it is the standard way to handle such
> issues.

Here[1] is the MQTT client implementation of lwIP, a popular TCP/IP stack
for embedded systems.
When the timeout for the ACK expires, this client only calls an
application callback with ERR_TIMEOUT. Maybe the decision to close the
TCP connection and open a new one is left to the application.
I don't know if other MQTT clients have a built-in mechanism that
automatically tries to solve the issue of lost ACKs by reopening the TCP
connection.

>>> I would actually have thought that UDP was a more natural choice for
>>> MQTT, rather than TCP - although older versions of MQTT did not have
>>> QoS and were therefore reliant on TCP's acknowledges and retries.
>>>
>>> (I always think its a shame that SCTP never caught on - among its
>>> many benefits, you don't have this head-of-line blocking issue.)
>>
>

[1] https://github.com/lwip-tcpip/lwip/blob/master/src/apps/mqtt/mqtt.c

David Brown

May 23, 2023, 3:55:34 AM

I haven't read the MQTT specifications - I don't even know what
documentation exists for the protocol. But implementation details like
this are not always covered in such documents, as it is really at a
level below the protocol itself. (The specifications for HTTP, for
example, don't say how many simultaneous connections a browser should
have to a web server, or when it should give up and retry.) So don't be
surprised if this is /not/ in the specs - that does not mean a client
cannot or should not make new TCP/IP connections.

>
>> The MQTT application already has to handle dropping and making new
>> TCP/IP connections - even if the norm is for the connection to last
>> for weeks at a time or more. So creating a new TCP/IP link has a lot
>> to gain, and very little to lose, and it is the standard way to handle
>> such issues.
>
> Here[1] the MQTT client implementation of lwip, a popular TCP/IP stack
> for embedded systems.

This is a bit muddled. I am familiar with LWIP, but I don't know
whether you are talking about an MQTT client that you wrote yourself, or
which comes as part of newer LWIP, or which someone else contributed as
a sample.

> When the timeout for the ACK is expired, this client only calls an
> application callback with ERR_TIMEOUT. Maybe the decision to close and
> reopen a new TCP connection is passed to the application.

Yes, that would be the normal behaviour.

> I don't know if other MQTT clients implement an embedded mechanism that
> automatically tries to solve the issue of lost ACKs by reopening a TCP
> connection.
>

I don't know either. I can only tell you that if you are failing to
communicate on a TCP/IP connection, then making a new one (possibly
after a delay) is the normal way to handle things if you want automatic
recovery.
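
A minimal Python sketch of "make a new connection, possibly after a delay" could look like the following. Nothing here comes from lwIP or any MQTT library: the function names backoff_delays and reconnect, the connect and sleep callables, and the delay values are all assumptions chosen for illustration.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield waiting times that grow after each failure, up to a cap."""
    delay = base
    while True:
        yield delay
        delay = min(delay * factor, cap)


def reconnect(connect, sleep, max_attempts=5):
    """Call connect() until it succeeds, waiting longer after each failure.

    connect: callable returning True on success (hypothetical; a real
             client would open a fresh TCP connection and send CONNECT).
    sleep:   callable taking a delay in seconds (e.g. time.sleep).
    Returns the attempt number that succeeded, or None if all failed.
    """
    for attempt, delay in zip(range(1, max_attempts + 1), backoff_delays()):
        if connect():
            return attempt
        sleep(delay)
    return None


# Simulated run: the first two attempts fail, the third succeeds.
outcomes = iter([False, False, True])
waits = []
attempts = reconnect(lambda: next(outcomes), waits.append)
print(attempts)  # 3
print(waits)     # [1.0, 2.0]
```

Passing sleep in as a parameter keeps the sketch testable; in an application callback that has just been handed a timeout error, the same loop would be driven with time.sleep or a timer.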

pozz

May 23, 2023, 5:02:17 PM

The link points to the official MQTT client implementation of the lwIP
project.