Hey there,
* From what I have read and from my own modest experience, I think that instead of zip + json + upload to the consul kv store + custom watches etc., you would be better served by consul-template (https://github.com/hashicorp/consul-template) or consul-templaterb (https://github.com/criteo/consul-templaterb/). Even though I have quite a lot of experience with consul-template, I recommend consul-templaterb, based on many conversations with its author and on knowing the problems it solves that consul-template does not.
* By sourcing consul kv values directly into your json template files, as opposed to download zip / extract / put in place and so on, you get a much simpler process, and the templates can be managed in git etc.
* As far as I understand, you need an *atomic* app configuration update, and you fear race conditions that would have bad effects if this atomicity is missed. I don't know what your app does or what the reason for this strict requirement is, but I would definitely recommend reviewing it. Such strictness is rarely really needed.
* With the consul-template and consul-templaterb solutions outlined above, you can render the desired state of the configuration files and execute post-render hooks within a more than reasonable time frame (100 msec to 5 seconds), but I have not seen a case where they support atomic synchronization between renders and post-render hooks.
* Following the same logic, you should leave the config update part to the mentioned solutions and implement the atomic requirement in the post-render hooks. So the former gets the data out of the kv store and updates the config files, while the latter is in charge of checking that all machines have the latest on-file configuration state, and only if this is met is a reload/restart triggered.
* The basic process looks like this, but it is far from atomic-proof:
- an external tool updates the kv store with the new config
- consul-template(rb) spots the change in the kv store and triggers a render
- the render completes and the consul-template(rb) post-render hook is triggered
- the post-render hook signals that machine X is ready, for example with consul kv put /consul-template/etc/file.conf/ready/${HOSTNAME}
- the post-render hook knows how many machines should be ready before triggering the reload, and starts looping on consul kv get -recurse /consul-template/etc/file.conf/ready/ | wc -l
- if the result is 10 (the expected number of machines in this example), the reload is triggered
- however! the above solution looks ugly and it is far from atomic-proof due to multiple issues
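To make the polling step of that hook more concrete, here is a minimal Python sketch of it, not a definitive implementation. The function names, the timeout and the poll interval are my own assumptions; the only Consul call used is `consul kv get -recurse`, as in the steps above. It shares the weaknesses I mentioned: hosts can still reload at slightly different moments, and stale ready keys from a previous rollout would have to be cleaned up separately.

```python
import subprocess
import time
from typing import Callable


def count_ready_via_consul_cli(prefix: str) -> int:
    """Count keys under `prefix` by shelling out to the consul CLI.

    Assumption: `consul kv get -recurse` prints one line per key, which is
    the same thing the `| wc -l` pipeline relies on.
    """
    result = subprocess.run(
        ["consul", "kv", "get", "-recurse", prefix],
        capture_output=True,
        text=True,
    )
    return sum(1 for line in result.stdout.splitlines() if line.strip())


def wait_for_all_ready(
    count_ready: Callable[[], int],
    expected: int,
    timeout_s: float = 60.0,
    poll_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until at least `expected` machines are ready or the timeout hits.

    Returns True when the reload may be triggered and False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if count_ready() >= expected:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A hook could call it like `wait_for_all_ready(lambda: count_ready_via_consul_cli("consul-template/etc/file.conf/ready/"), expected=10)` and only reload when it returns True. The counter is passed in as a function so the loop can be tried out without a running Consul agent.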
* Even if you skip consul-template(rb) and stick with your own watch, your watch might take the same steps as the post-render hook.
* My recommendation, if you really, really need this atomicity, would instead be:
- an external tool updates the kv store with the new config
- 10 new VMs/containers are started with the new configuration
- the external tool monitors the health status of the whole new cluster
- once all 10 machines are ready, the external tool updates the load balancer to point to the new instance group/cluster/whatever you call it
- this way you achieve an instant switch to the new application with the latest configuration
- this is similar to the scenario where you have a linux symlink and 2 folders and you simply point the symlink to the correct folder ... atomically
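To illustrate the symlink analogy, here is a small Python sketch, assuming a POSIX filesystem. The function name and the temporary link naming are my own inventions. It creates a new symlink next to the old one and renames it over the old one, so readers see either the old target or the new one and never a missing link.

```python
import os


def switch_symlink(link_path: str, new_target: str) -> None:
    """Repoint `link_path` to `new_target` in one atomic rename.

    Assumption: POSIX rename semantics, where renaming over an existing
    symlink replaces the link itself in a single step.
    """
    link_dir = os.path.dirname(os.path.abspath(link_path))
    link_name = os.path.basename(link_path)
    # Temporary link in the same directory, so the rename stays on one filesystem.
    tmp_link = os.path.join(link_dir, f".{link_name}.tmp.{os.getpid()}")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # leftover from an earlier interrupted run
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)  # the atomic switch
```

Updating the load balancer to point at the new instance group plays the same role as the final rename here: everything is prepared on the side first, and the switch itself is a single step.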
* However, my advice would be to review this atomic requirement and, if you think that it is not really needed, simply stick with consul-template(rb) and let it render its files and trigger the reload. In my experience it will get the job done within a more than reasonable time frame.
* By the way, you might want to look into the Consul community tools available at https://www.consul.io/downloads_tools.html. Some of them might be things you have never heard of, but they might already solve problems that you have and are trying to solve on your own.
* Also, I am curious about this atomicity requirement. Might I kindly ask you to share a few more details? Why do you need such a synchronized update process?
I hope this helps.