Why is the keep-alive needed?


Fred Basset

Jun 17, 2014, 12:18:38 AM
to mq...@googlegroups.com
I'm developing some sample apps using Python and Mosquitto.  With this software you need to specify a keep-alive time.  Why is this actually needed and what data is transferred in the keep-alive?

In my application using wireless modems we pay per byte, so we need to minimize the data transferred.

Thanks,
Fred

ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)

Jun 17, 2014, 12:46:12 AM
to mq...@googlegroups.com
I think it is needed so that you do not have to rely solely on the TCP timeout.

Excerpt -
TCP provides a reliable transport layer. One of the ways it provides reliability is for each end to acknowledge the data it receives from the other end. But data segments and acknowledgments can get lost. TCP handles this by setting a timeout when it sends data, and if the data isn't acknowledged when the timeout expires, it retransmits the data. A critical element of any implementation is the timeout and retransmission strategy.

More - http://www.pcvr.nl/tcpip/tcp_time.htm

Only the fixed header (2 bytes) is exchanged in each PINGREQ and PINGRESP. http://public.dhe.ibm.com/software/dw/webservices/ws-mqtt/mqtt-v3r1.html#pingreq
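
To put numbers on the per-byte cost, here is a rough sketch of the worst-case keep-alive traffic on an otherwise idle connection. The PINGREQ and PINGRESP byte values follow the MQTT 3.1 fixed-header layout (message type in the high nibble, remaining length 0). The function name and the per_packet_overhead parameter are hypothetical, and whatever TCP/IP overhead you plug in (for example 40 bytes for minimal IPv4 + TCP headers, not counting separate ACK segments) is an assumption that depends on your link.

```python
# MQTT 3.1 fixed headers: message type in the high nibble, remaining length 0.
PINGREQ = bytes([0xC0, 0x00])   # message type 12
PINGRESP = bytes([0xD0, 0x00])  # message type 13


def keepalive_bytes_per_day(keepalive_s, per_packet_overhead=0):
    """Worst-case keep-alive bytes per day on a connection with no other traffic.

    keepalive_s: keep-alive interval in seconds (0 means keep-alive is off).
    per_packet_overhead: assumed lower-layer bytes per packet (hypothetical
    parameter; set it to whatever your modem link actually bills for).
    """
    if keepalive_s <= 0:
        return 0
    pings_per_day = 86400 // keepalive_s
    bytes_per_ping = len(PINGREQ) + len(PINGRESP) + 2 * per_packet_overhead
    return pings_per_day * bytes_per_ping


if __name__ == "__main__":
    for interval in (60, 600, 3600):
        print(interval, keepalive_bytes_per_day(interval),
              keepalive_bytes_per_day(interval, per_packet_overhead=40))
```

This is an upper bound: a client only needs to send PINGREQ when it has sent nothing else during the keep-alive interval, so a link that publishes regularly pays less.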


--
To learn more about MQTT please visit http://mqtt.org
---
You received this message because you are subscribed to the Google Groups "MQTT" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mqtt+uns...@googlegroups.com.
To post to this group, send email to mq...@googlegroups.com.
Visit this group at http://groups.google.com/group/mqtt.
For more options, visit https://groups.google.com/d/optout.

Jian Zhen

Jun 17, 2014, 4:46:44 AM
to mq...@googlegroups.com
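The Keep Alive timer is the maximum time the client may stay silent before the server disconnects it: if the server receives no message from the client within one and a half times the keep-alive period, it closes the connection.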


Search for keep alive timer...

Setting the keep alive timer to 0 effectively tells the server not to disconnect the client. However, it's still up to the server to decide if it will disconnect due to inactivity.
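
As a small illustration of what the client actually sends: in MQTT 3.1 the keep-alive value travels in the CONNECT variable header as a 16-bit big-endian number of seconds, and the server allows one and a half times that period before disconnecting. The sketch below only models those two facts; the function names are hypothetical and it is not taken from any particular client library.

```python
import struct


def encode_keepalive(seconds):
    """Encode the keep-alive field of a CONNECT packet (16-bit, big-endian)."""
    if not 0 <= seconds <= 0xFFFF:
        raise ValueError("keep-alive must fit in 16 bits (0..65535 seconds)")
    return struct.pack("!H", seconds)


def server_deadline(seconds):
    """Seconds of client silence after which the server may disconnect.

    Returns None when keep-alive is 0, i.e. the client asked for no timeout.
    """
    if seconds == 0:
        return None
    return 1.5 * seconds


if __name__ == "__main__":
    print(encode_keepalive(60).hex(), server_deadline(60))
```

Because the field is 16 bits wide, the longest keep-alive you can request is 65535 seconds, a little over 18 hours.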

Andy Stanford-Clark

Jun 17, 2014, 9:03:17 AM
to mq...@googlegroups.com
Further to the letter of the specification, the purpose of keepalive is so that the *application* level (client app, and broker) can know that  the underlying connection is still open end-to-end.

Although TCP/IP in theory notifies you when a socket breaks, in practice, particularly on things like mobile and satellite links, which  often "fake" TCP over the air and put headers back on at each end, it's quite possible for a TCP session to "black hole", i.e. it appears to be open still, but in fact is just dumping anything you write to it onto the floor.

So the keepalive confirms that you really are still talking to a broker (and from the broker end, that a client really is still connected), particularly when you're sitting on a long-connected connection, either subscribed to an infrequently publishing topic, or publishing at qos0 (i.e. no acknowledgement) to a broker.

The ping-resp ("pong"!) coming back from the broker in response to a client-initiated "ping-req" should be used (by the MQTT library) to either tell the application if the connection has gone away, or to trigger a reconnect.
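
The client-side behaviour described above could be sketched roughly as follows. This is a simplified model, not how any specific MQTT library is implemented: the class name, the method names, and the response_timeout_s parameter are all hypothetical, and the caller is assumed to do the actual socket I/O and to pass in the current time.

```python
class KeepAliveMonitor:
    """Decides when to send PINGREQ and when to give up on the connection."""

    def __init__(self, keepalive_s, response_timeout_s):
        self.keepalive_s = keepalive_s
        self.response_timeout_s = response_timeout_s
        self.last_sent = 0.0       # time the last packet of any kind was sent
        self.ping_sent_at = None   # set while a PINGREQ is unanswered

    def on_packet_sent(self, now):
        # Any outgoing packet resets the idle timer, so no ping is needed yet.
        self.last_sent = now

    def on_pingresp(self, now):
        # The broker answered: the path is alive end to end.
        self.ping_sent_at = None

    def poll(self, now):
        """Return 'ping' (send PINGREQ now), 'dead' (reconnect), or 'idle'."""
        if self.ping_sent_at is not None:
            if now - self.ping_sent_at >= self.response_timeout_s:
                return "dead"
            return "idle"
        if now - self.last_sent >= self.keepalive_s:
            self.ping_sent_at = now
            self.last_sent = now
            return "ping"
        return "idle"


if __name__ == "__main__":
    monitor = KeepAliveMonitor(keepalive_s=60, response_timeout_s=10)
    print(monitor.poll(60))   # idle for a full interval: time to ping
    print(monitor.poll(71))   # no PINGRESP within 10 s: treat as dead
```

On "dead" the application would close the socket and either report the failure or reconnect, which is exactly the case a black-holed TCP session never signals on its own.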

Andy

Marco Wachs

Feb 22, 2019, 7:03:23 AM
to MQTT
I'm currently writing my bachelor thesis about MQTT and am looking for detailed references on how and why a "black-holing" TCP session can occur even with the keepalive timer.
In which cases does the connection still appear to be open while the keepalive packet is silently dropped? I haven't found anything about it on Google. I hope someone can help me.
Thanks in advance :)!

Greets

Marco

Francis Brosnan Blázquez

Feb 22, 2019, 7:46:49 AM
to mq...@googlegroups.com, Marco Wachs
Hello Marco.

Even though TCP, on which MQTT runs, is a session-oriented transport
protocol, the session is just semantics: a mental construct translated into
software where two ends (hopefully) sustain a shared view of a
particular session.

As you know, you can have two machines with a running ssh session
that survives perfectly well even if you lose all connectivity,
as long as both ends:

   1) Keep the session in the same state before the connection is lost
       and after it recovers.

   2) Send no data during the outage (in the case of SSH, no terminal typing;
       for MQTT, no message at all), so that neither end notices the loss.

However, even if both TCP ends comply with all of this, it will
not work if routers in the middle do not forward the connection RST or
lose their NAT translations, or your provider's CGNAT fills up, or because
of the fancy things some networks (such as mobile ones) do with TCP sessions.

That's why you need a keepAlive mechanism even with TCP.

Not only to be sure the connection is still alive and working, but also
to ensure you quickly detect it is gone (by forcing point 2).

KeepAlive also helps to mitigate NAT problems especially with IPv4
networks.

So, in essence, even if you run TCP, if you want a truly connected
session, it must be tested (in both directions) regularly with some sort
of PING/PONG.

Best Regards.

toast-uz

Feb 22, 2019, 7:51:44 AM
to MQTT
You can find some information by googling "TCP half open".

e.g.
https://en.wikipedia.org/wiki/TCP_half-open

Arlen Nipper

Feb 22, 2019, 8:20:43 AM
to mq...@googlegroups.com, Marco Wachs
Hello Francis,

Perfect explanation!

-Arlen Nipper

Marco Wachs

Feb 22, 2019, 9:12:16 AM
to MQTT
Hello @toast-uz,

Yes, I know, but where is the explanation of how a half-open TCP connection can occur even when using the TCP keepalive timer? Why do I need two keepalive timers, one for the transport layer and one for the application layer? Did I miss something in that article?

Marco Wachs

Feb 22, 2019, 9:49:41 AM
to MQTT
Hello Francis,

Thanks for the fast and detailed reply.
But I still don't get why a TCP connection could stay half-open "forever" even if the keepalive timer is activated.
Suppose the connection between both parties (let's call them A and B) is lost. The keepalive timer expires and a keepalive packet is sent from A to B and from B to A.
After a few attempts without any reply, both sides close their connection.
I don't really see why I need the keepalive timer twice, once for the transport layer and once for the application layer. Isn't one enough to tell the upper or lower layer to disconnect and close the connection?
Maybe I just don't get it from your explanation.

Best regards

Marco

Francis Brosnan Blázquez

Feb 22, 2019, 12:03:19 PM
to mq...@googlegroups.com, Marco Wachs
Hello Marco.

Although TCP is a very simple protocol to use, it is very complex
underneath and it gets very tricky if you start changing default values
[1] [2], due to many implications you cannot easily control.

If you start moving TCP options away from their defaults, you also
have to consider how this relates to other configuration in your
system AND, this is key, how all this will be handled by routers and
devices in the middle (which are better prepared for the usual case), not
only on your premises but also in hypothetical customer hardware you
don't know yet.
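
For reference, this is roughly what "changing TCP options" means in practice. It is a minimal sketch, assuming a Linux-style socket API: SO_KEEPALIVE is portable, but TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are platform-specific, so the code only sets them where Python exposes them. The function names are hypothetical. The default values used in the example (7200 s idle, 75 s between probes, 9 probes) are the usual Linux defaults and may differ on your system.

```python
import socket


def enable_tcp_keepalive(sock, idle_s, interval_s, probes):
    """Turn on TCP keepalive and, where supported, tune its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The three tuning options below are not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def worst_case_detection_s(idle_s, interval_s, probes):
    """Seconds until TCP keepalive alone declares a silent peer dead."""
    return idle_s + interval_s * probes


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        enable_tcp_keepalive(sock, idle_s=60, interval_s=10, probes=3)
    finally:
        sock.close()
    # With the usual Linux defaults a dead peer goes unnoticed for over 2 hours.
    print(worst_case_detection_s(7200, 75, 9))
```

Note that even tuned values only tell you what your own TCP stack believes; they say nothing about whether a NAT or proxy in the middle is answering on the peer's behalf.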

The problem lies in the fact that people using TCP expect it
to be really resistant to network failures, so it does a lot of work for
you for free (recovering, retransmitting, translating, mitigating, etc.)
without the application noticing.

People don't care how; they want their data delivered, in order, without any
application-level complications.

That's why network operators, middle routers or even the smallest DSL
NAT router do fancy things with TCP connections and their associated
state: by doing so they increase the chances of recovering the connection
without the TCP peers noticing, so they can keep delivering traffic,
in order and with as little resource consumption as possible (CGNAT isn't
free).

So if you try to solve keep-alive problems using just TCP's own
support, you will find it is hard: it requires very deep
knowledge of TCP and how it relates to your hardware and scenario, and
you will also have to be prepared to face very strange and bizarre
problems (because there is a lot in the middle).

I would never recommend it, but you can possibly afford it in a very
custom solution where all hardware and software are under your control.

This is not the case for MQTT, where heterogeneous scenarios will be
the norm.

So your question, "Why do we need another KeepAlive mechanism when we
have TCP's?", rests on three false premises:

1) You have full control to change whatever you want at TCP level
without any consequences.

2) TCP works in an ideal case where you are connected point-to-point to
the other end, so there is no possibility of traffic
loss, network connection problems, traffic congestion, attacks [3], etc.

3) The large amount of overbooked, heterogeneous, full-of-quirks hardware
sitting between the TCP peers and doing best effort does not exist.

So protocol designers have come to a simple conclusion:

1) It is far easier to have a PINGREQ and PINGRESP at application
level (solves the problem).

2) We have full control of it at application level, with low
coupling to the-TCP-ecosystem (resolves the implications).

3) It takes almost no effort to reason about how it works (low-cost
solution).

4) It checks not only TCP but also the application level (more
comprehensive check).

5) AND it lets you use the-TCP-ecosystem in the regular way (total bonus).

In essence, the question is not why you need a KeepAlive mechanism at
MQTT level; it is that you will end up building one anyway after attempting to
fight the problem at TCP level (bearing in mind that TCP is not just a
protocol but a really big and old ecosystem).

Best Regards.

[1] http://codearcana.com/posts/2015/08/28/tcp-keepalive-is-a-lie.html
[2] https://stackoverflow.com/questions/24133668/tcp-keepalive-not-working
[3] Not mentioned, but this is something you will have to evaluate too.

Andy Stanford-Clark

Feb 23, 2019, 4:19:54 AM
to 'Simon Walters' via MQTT
great answer!

Marco Wachs

Feb 23, 2019, 12:57:26 PM
to MQTT
Hello Francis,

I hadn't expected that there are so many hardware dependencies around the keepalive timer.

Thanks a lot for the detailed answer, you helped me a lot!
Have a nice evening

Best regards
Marco

puneet kumar

Feb 23, 2019, 2:42:22 PM
to mq...@googlegroups.com
Sorry, other half of my message didn’t go through.

Keep-alive is needed to keep the connection alive; however, there are other options where you don't really need keep-alive. QUIC provides 0-RTT capability, which means it can resume the connection even if it was closed or dead. My paper in an Elsevier journal demonstrates that. You might want to look into that option too.

Paper title: Implementation and Analysis of QUIC for MQTT.

Sent from my iPhone

puneet kumar

Feb 23, 2019, 2:46:53 PM
to mq...@googlegroups.com
One more thing: QUIC also resolves the NAT rebinding issue mentioned below.

Sent from my iPhone