Sizing for online and offline messages


Luke Dudney

Oct 25, 2020, 11:24:48 PM
to vernem...@googlegroups.com
Hi group

Just wondering if there is some guidance available on sizing of broker disk and memory, and the corresponding max_online_messages and max_offline_messages settings.

In the past we have set max_online_messages = -1 (infinite), which led to out-of-memory conditions on the server; the broker process was then reaped by the OOM killer and all online messages were lost. Even setting it to several million led to the same issue on a 32GB system. Setting it lower can lead to lost messages for clients that cannot keep up or that have been offline for some time.

Some thoughts / assumptions

  1. offline queue is stored compressed in LevelDB in the msg_store, bounded by max_offline_messages, constrained by disk**
  2. online queue is stored uncompressed in RAM, constrained by memory size
  3. max_online_messages is a per-client setting; there is no capability to limit broker-wide online messages
Is there some guidance or equation to use to calculate the best values for these settings, using inputs such as:
  • average message size
  • message compressibility
  • message rate
  • number of clients
** I had assumed that the maximum effective size of the offline queue is constrained by disk size and max_offline_messages, however in the case that a client has been offline for some time, when they come back online, all of their offline messages are immediately loaded into memory and the broker will drop all messages that exceed max_online_messages. This is not expected behaviour as I had assumed the maximum size of the offline queue should be constrained only by the size of the disk.
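As a rough starting point, the inputs listed above can be combined into worst-case bounds. This is only a back-of-envelope sketch, not a formula from the VerneMQ documentation. It takes assumptions 1 to 3 at face value (online queues uncompressed in RAM, offline queues compressed on disk, limits applied per client). The function names, the `per_message_overhead_bytes` parameter and the `compression_ratio` parameter are hypothetical, and the real per-message overhead in the broker is not known here.

```python
def worst_case_online_bytes(clients, max_online_messages, avg_message_bytes,
                            per_message_overhead_bytes=0):
    """Upper bound on RAM used if every client's online queue is full."""
    per_message = avg_message_bytes + per_message_overhead_bytes
    return clients * max_online_messages * per_message


def worst_case_offline_bytes(clients, max_offline_messages, avg_message_bytes,
                             compression_ratio=1.0):
    """Upper bound on disk used if every client's offline queue is full.

    compression_ratio is compressed size / raw size (assumed, must be measured).
    """
    return clients * max_offline_messages * avg_message_bytes * compression_ratio


def max_online_messages_for_budget(ram_budget_bytes, clients, avg_message_bytes,
                                   per_message_overhead_bytes=0):
    """Largest per-client limit that keeps the worst case inside a RAM budget."""
    per_message = avg_message_bytes + per_message_overhead_bytes
    return ram_budget_bytes // (clients * per_message)


def seconds_until_full(max_online_messages, publish_rate_per_s, consume_rate_per_s):
    """How long a slow consumer can lag before its online queue starts dropping."""
    backlog_rate = publish_rate_per_s - consume_rate_per_s
    if backlog_rate <= 0:
        return float("inf")  # the consumer keeps up, the queue never fills
    return max_online_messages / backlog_rate
```

For example, with 10,000 clients and 1 KiB messages, a 16 GiB budget for online queues gives `max_online_messages_for_budget(16 * 1024**3, 10_000, 1024)`, i.e. 1677 messages per client, before any allowance for per-message overhead or for the rest of the broker.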

Cheers
Luke


André Fatton

Oct 26, 2020, 6:30:41 AM
to vernemq-users
Hi Luke,

Thanks for your questions and observations!
I'm jumping onto your last remark, on the loading of messages from disk. If a queue gets initialized from disk, and changes state from offline queue to online queue, the behaviour you describe is obviously highly counterintuitive. That said, `max_online_messages` is to protect RAM, `max_offline_messages` to protect disk (or disk and RAM, I should say, as there are always RAM based indexes). So if the first is configured to a lower value than the second, we'll immediately see message drops. The idea was to give you both those config options independent of each other.
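To make the drop arithmetic concrete, here is a simplified model of the behaviour described in this thread. It assumes the whole offline backlog is loaded at once and that the consumer drains nothing while it loads. The function name is hypothetical, and this is not taken from the broker's source.

```python
def dropped_on_reconnect(offline_backlog, max_online_messages):
    """Messages lost when a full offline backlog is moved into the online queue."""
    if max_online_messages < 0:
        return 0  # -1 means no limit, so nothing is dropped (RAM is unprotected)
    return max(0, offline_backlog - max_online_messages)
```

With `max_offline_messages` at 5000 and `max_online_messages` at 1000, a client returning to a full offline queue would lose 4000 messages under this model.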
I guess we should do some adaptive loading and delivery of messages loaded from disk. Adapted to the online consumer speed, that is. Possibly have an iterator on LevelDB that reads in batches that are always a little smaller than the `max_online_messages` window.
I was under the impression we already do some batching but I'll have to check in the source code.
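The batching idea could look roughly like the sketch below. It is illustrative only: it is not how VerneMQ is implemented, `stored_messages` stands in for an iterator over the message store, and `reserve` (how far below the `max_online_messages` window each batch stays) is a made-up parameter.

```python
from itertools import islice


def batches(stored_messages, max_online_messages, reserve=1):
    """Yield stored messages in batches slightly smaller than the online window."""
    batch_size = max(1, max_online_messages - reserve)
    it = iter(stored_messages)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # store exhausted
        # The caller would deliver this batch and wait for the consumer
        # to drain it before asking for the next one.
        yield batch
```

The next batch is only read once the caller asks for it, so the online queue never holds more than one batch of replayed messages, provided the caller waits for delivery between batches.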

-André

Luke Dudney

Oct 26, 2020, 7:09:24 PM
to André Fatton, vernemq-users
Thanks Andre, I suppose these are two separate conversations.

Any general guidance on calculating the max_online_messages and max_offline_messages parameters, based on broker size? (And also the reverse: sizing brokers based on required publishing sizes, rates, etc.?)


Luke Dudney

Oct 29, 2020, 1:21:11 AM
to André Fatton, vernemq-users
I created an issue in GitHub for the dropped messages issue: https://github.com/vernemq/vernemq/issues/1663 as this seems like a problem with the software rather than a topic for general discussion.

Still keen to discuss with the group any ideas on sizing and configuration of these options.

cheers
Luke

