I think I have a good HA and load-balancing strategy, and I would like someone to look over my shoulder before I go down a rabbit hole...
Currently I have a number of IoT devices that connect to a Mosquitto broker via paho.mqtt.client. The devices publish 99% of the time but also subscribe, so any unit can be commanded to do something by the 'engine'.
The broker is running on a server that also houses:
- An 'engine': a client that subscribes to all remote devices (via a wildcard) and writes the received data to a local MongoDB database. This engine can also send a command out to a remote unit on its private topic.
- A web server that interacts with the DB.
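To make the engine's two roles concrete, here is a rough sketch of one possible topic layout. The `devices/<id>/data` and `devices/<id>/cmd` names and both helper functions are assumptions for illustration only, not the topics I actually use.

```python
from typing import Optional

# Hypothetical layout: each device publishes on devices/<id>/data and
# listens for commands on its private topic devices/<id>/cmd.
DATA_FILTER = "devices/+/data"  # single-level wildcard the engine would subscribe to


def command_topic(device_id: str) -> str:
    """Private topic the engine publishes on to command one device."""
    return f"devices/{device_id}/cmd"


def device_id_from_data_topic(topic: str) -> Optional[str]:
    """Extract the device id from a data topic, or None if it does not fit the layout."""
    parts = topic.split("/")
    if len(parts) == 3 and parts[0] == "devices" and parts[2] == "data":
        return parts[1]
    return None
```

With a layout like this the engine can tell which device a message came from by the topic alone, and a command for one unit never reaches the others.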
I already have the IoT device code set up to connect and reconnect to the next broker in a list, which it gets from a DNS SRV record.
This works well: if a broker is no longer reachable, the client keeps trying the other brokers in the list until it can connect to one.
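For reference, the failover loop is roughly like the sketch below. This is a simplified stand-in rather than my actual device code: the `(priority, weight, host, port)` tuple shape, the function names, and the injected `try_connect` callable are all assumptions, and the ordering ignores SRV weights.

```python
from typing import Callable, Iterable, List, Optional, Tuple

SrvRecord = Tuple[int, int, str, int]  # (priority, weight, host, port), assumed shape
Broker = Tuple[str, int]               # (host, port)


def order_brokers(records: Iterable[SrvRecord]) -> List[Broker]:
    """Order brokers by SRV priority (lowest first). Weights are ignored here."""
    return [(host, port) for _prio, _weight, host, port in
            sorted(records, key=lambda r: r[0])]


def connect_to_first_available(
    brokers: Iterable[Broker],
    try_connect: Callable[[str, int], bool],
) -> Optional[Broker]:
    """Return the first broker for which try_connect succeeds, else None."""
    for host, port in brokers:
        try:
            if try_connect(host, port):
                return (host, port)
        except OSError:
            # Unreachable or refused: move on to the next broker in the list.
            continue
    return None
```

On the device, `try_connect` would wrap the paho client's `connect(host, port)` call; passing it in as a callable just keeps the loop testable without a network.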
My server strategy is to have two more copies of the present server box (three total) with the following modifications:
- MongoDB is to be configured as a replica set, where only one of the three copies is 'primary' (master) and the other two are 'secondary'.
- Each broker will be configured to bridge with the other two, but to forward only messages from its local engine (not those from remote IoT devices).
What I expect to work:
- An IoT device can connect to any of the three server boxes' brokers, since all three will be up all the time.
- If an IoT device is connected to a secondary server, the Mongo client in that server's engine will post the data to the primary (with replication to the others).
- Web users can connect to any of the three boxes and the site will work, because the local web server will connect to the DB replica set.
- If a web user sends a command back to an IoT device, it will go through the local engine and be forwarded via broker bridging to the other brokers, so the IoT device will get it no matter which broker it is connected to.
- If the primary box dies, any connected IoT devices will connect to another server box and Mongo will elect another box to be the primary.
- If any box dies, web connections to the other two will still work fine.
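The Mongo expectations above assume the engine and the web server on every box connect to the replica set as a whole rather than to their local mongod only. A minimal sketch of building such a connection string follows; the host names, the set name `rs0`, and the helper function are placeholders, not my real configuration.

```python
from typing import Sequence


def replica_set_uri(hosts: Sequence[str], replica_set: str, port: int = 27017) -> str:
    """Build a MongoDB URI that lists every replica set member as a seed."""
    seeds = ",".join(f"{host}:{port}" for host in hosts)
    return f"mongodb://{seeds}/?replicaSet={replica_set}"


# Placeholder host names for the three server boxes.
uri = replica_set_uri(
    ["box1.example.com", "box2.example.com", "box3.example.com"], "rs0"
)
# With PyMongo installed, the engine would then create its client roughly as:
#   client = pymongo.MongoClient(uri)
```

Given a URI like this, the driver discovers which member is currently primary and sends writes there, which is what the "post the data to the primary" and "elect another box" points rely on.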
I am not sure what happens to web clients who are trying to connect to mysite.com, which has three A records, when one of them becomes unreachable. Will they fail over to the next one, or will some just get a 'site not found' because they picked the wrong A record? Is it browser dependent, OS dependent, or just chance?
Does this strategy make sense, or am I missing something?
Thanks
Bill