|Re: [mqtt] Scaling MQTT Clients/Subscribers||Dominik Obermaier||5/28/13 12:33 PM|
That is an interesting scenario, where a single client has to deal with such a high number of messages per second. Most client libraries I am aware of are not able to handle that message rate properly.
The key questions here are: How are your topics structured? And does your client subscribe with wildcards?
We had similar scenarios where we had to persist a comparably large number of messages per second to a NoSQL database, or where the message payloads had to be analyzed on the fly. Unfortunately, we did not achieve this goal with any MQTT client library we tried. I wrote a blog post about how it worked for us (in our concrete use case we did not use a SQL database, because it is hard for most SQL databases to scale to 50-70k msg/sec): http://www.hivemq.com/mqtt-sql-database/
When you have a well-structured topic hierarchy, it is possible to let different subscriber clients subscribe to different subtopics to distribute the load. When all messages are addressed to the same topic, I do not see a simple solution at the moment, because I do not see how load balancing could be applied to MQTT clients.
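To make the subtopic idea concrete, here is a minimal sketch in Python. The topic names (`sensors/eu/...`), the subscriber names and the helper functions are all made up for illustration, and the matcher is a simplified version of MQTT topic filter matching (it ignores the special handling of topics starting with `$`). It only shows which subscriber would receive which topic, not a real client.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT topic filter matching ('+' = one level, '#' = the rest)."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below it.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical partitioning: each subscriber client owns one branch of the hierarchy.
SUBSCRIBER_FILTERS = {
    "subscriber-a": ["sensors/eu/#"],
    "subscriber-b": ["sensors/us/#"],
    "subscriber-c": ["sensors/asia/#"],
}


def receivers(topic: str) -> list:
    """Return the subscribers whose filters match the given topic."""
    return [
        name
        for name, filters in SUBSCRIBER_FILTERS.items()
        if any(topic_matches(f, topic) for f in filters)
    ]


if __name__ == "__main__":
    for t in ("sensors/eu/device1/temp", "sensors/us/device7/temp"):
        print(t, "->", receivers(t))
```

As long as the filters do not overlap, every message lands on exactly one subscriber. With a single topic there is nothing to split on, which is the case I have no simple answer for.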
I know this does not help you much at the moment, but we are considering open-sourcing an MQTT library soon that is able to handle that volume of messages. I have heard that MQTT.js can handle many messages, but I have not tried it.
I'm also keen to see how other folks solved this kind of problem :)
Andrew Ralston wrote:
|Re: [mqtt] Scaling MQTT Clients/Subscribers||Andy Piper||5/28/13 1:48 PM|
Sounds exciting! Which language? Would it be something worth considering adding to Eclipse Paho...? :-)
|Re: [mqtt] Scaling MQTT Clients/Subscribers||Matteo||5/29/13 3:56 AM|
I can confirm MQTT.js can handle 18,000+ msg/sec using Mosquitto as a broker.
My own Mosca (based on MQTT.js) is ticking at 13,000 msg/sec on the same setup.
That more or less saturates my 2011 MBA, which is running the "bomber", the broker and the subscriber all at once.
I think better numbers can be achieved on a more powerful machine.
2013/5/28 Dominik Obermaier <dominik....@googlemail.com>
|Re: [mqtt] Scaling MQTT Clients/Subscribers||Dave Locke||5/29/13 5:00 AM|
The number of messages per second that a single MQTT client can handle varies considerably based on the client implementation; from memory, the Paho C client can handle >10K msgs/sec. At some point a single client / subscriber will run out of steam. A few options to enable a system to scale include:
- Make the topic space more fine-grained, i.e. rather than using a single topic / subscription, extend the topic space so that there are multiple topics and the load can be spread across multiple clients / subscriptions.
- Some messaging servers allow messages that arrive on a topic to be directed to a queue. The benefit of the queue is that multiple applications can process messages from it concurrently, but each message is only processed once.
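The queue option can be sketched roughly as follows. This is an in-process illustration only, using Python's standard `queue` and `threading` modules rather than any particular messaging server; the function name `run_workers` and the worker names are hypothetical. It shows the property that matters here: several consumers pull from one queue concurrently, and each message is handled by exactly one of them.

```python
import queue
import threading


def run_workers(messages, worker_count=3):
    """Fan messages out to competing consumers; return (worker, message) pairs."""
    work = queue.Queue()
    processed = []
    lock = threading.Lock()

    def worker(name):
        while True:
            msg = work.get()
            if msg is None:
                # Sentinel: no more messages for this worker.
                break
            with lock:
                processed.append((name, msg))

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for m in messages:
        work.put(m)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return processed


if __name__ == "__main__":
    result = run_workers(range(10))
    print(sorted(msg for _, msg in result))
```

Which worker gets which message depends on scheduling, so only the set of processed messages is deterministic, not their assignment.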
All the best
From: Andrew Ralston <aral...@gmail.com>
Date: 28/05/2013 20:04
Subject: [mqtt] Scaling MQTT Clients/Subscribers
Sent by: mq...@googlegroups.com
|Re: [mqtt] Scaling MQTT Clients/Subscribers||Dominik Obermaier||5/30/13 1:57 PM|
The language of the library is Java. The library is targeted at higher-level clients which must handle massive message throughput. It is also pretty decent for stress testing brokers. It's probably worth having a chat about adding it to Paho when we're about to release it as open source (sometime in Q3, I think).
Andy Piper wrote:
|Re: [mqtt] Scaling MQTT Clients/Subscribers||Andrew Ralston||10/31/13 10:26 AM|
I think you can also contextualize this problem in terms of highly available clients. Since we operate exclusively at QoS 1, our ability to consume messages actually needs to exceed our publishers' theoretical maximum publish rate (to allow us to consume the messages that were published while we were offline).
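A quick back-of-the-envelope sketch of why the consume rate has to exceed the publish rate. The function name and the numbers are purely illustrative assumptions, not measurements from our system: while the client is offline a backlog builds up at the publish rate, and once it is back it only drains at the difference between the two rates.

```python
import math


def catch_up_seconds(publish_rate, consume_rate, offline_seconds):
    """Seconds needed to drain the backlog built up while offline.

    Rates are in messages per second. Returns math.inf when the consumer
    is not faster than the publishers, because the backlog never shrinks.
    """
    backlog = publish_rate * offline_seconds
    if backlog == 0:
        return 0.0
    if consume_rate <= publish_rate:
        return math.inf
    return backlog / (consume_rate - publish_rate)


if __name__ == "__main__":
    # Hypothetical: 10,000 msg/sec published, client offline for 60 seconds,
    # client able to consume 12,000 msg/sec once it reconnects.
    print(catch_up_seconds(10_000, 12_000, 60))
```

With those made-up numbers the backlog is 600,000 messages and the surplus capacity is 2,000 msg/sec, so it takes 300 seconds to catch up after a one-minute outage.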
To me, the idea of partitioning topics into finer-grained units is a practical but likely short-term solution. Eventually the problem will reassert itself, whether through hardware limits, growth in publish rates, or less fine-grained topic structures (think of domains where fine-grained topics don't make sense, or legacy application migrations where you can't be as flexible). Pushing all messages to a queue for downstream applications also seems to cover up the problem somewhat, since the broker already performs the queueing when QoS 1 messages are used.
What would be nice to see is a change to the spec/brokers that allows N client instances to join a client cluster, where each message is delivered to one of the clients in the cluster. If you could establish common parameters such as QoS level, Will messages etc. for all clients, then each client could implement its own keepalive/retry/timeout parameters (specific to its needs). This would allow you to create a distributed client network with nodes dispersed across data centers, which would make your application much more resilient. It would also allow you to make your clients a part of your consuming application, which simplifies the architecture significantly. The broker would be able to distribute each message to the client with the fewest in-flight messages at that moment.
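To illustrate the distribution rule I have in mind, here is a rough sketch. Nothing like this exists in the spec as far as this thread is concerned; the class `ClientCluster` and its methods are hypothetical names, and the code only models the broker-side bookkeeping (no networking, no QoS handling).

```python
class ClientCluster:
    """Hypothetical broker-side view of a cluster of cooperating clients."""

    def __init__(self, client_ids):
        # Number of delivered-but-unacknowledged messages per client.
        self.in_flight = {client_id: 0 for client_id in client_ids}

    def dispatch(self, message):
        """Pick the client with the fewest in-flight messages for this message."""
        # Ties go to the client that joined first (dict insertion order).
        target = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[target] += 1
        return target

    def ack(self, client_id):
        """Record an acknowledgement (e.g. a PUBACK at QoS 1) from a client."""
        if self.in_flight[client_id] > 0:
            self.in_flight[client_id] -= 1


if __name__ == "__main__":
    cluster = ClientCluster(["dc1-client", "dc2-client", "dc3-client"])
    for n in range(3):
        print(n, "->", cluster.dispatch(f"msg-{n}"))
    cluster.ack("dc2-client")
    print(3, "->", cluster.dispatch("msg-3"))
```

A slow or disconnected client stops acknowledging, so its in-flight count stays high and new messages naturally flow to the healthier clients.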
Dave, I sent this in an email to you directly too - let me know what you guys think.