I don't like this for two reasons:
- I'm using other methods for configuration management; using "config save" is bad for me
So don't do the config save part.
- Watching for changes on Consul and running "slaveof" _and_ updating the config file feels fragile to me. It is another possible solution.
Watching Consul for changes would be no more fragile than using Consul to manage the DNS entries in the first place. Indeed, it would be less fragile than setting up additional load balancers, local caches, or especially modifying Redis code to alter lookups and/or reload its config. It is less fragile, by virtue of fewer moving parts, than having a configuration management tool update it. It is the least amount of effort needed to get to a solution, and it is the correct solution for the scenario you describe.
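For concreteness, here is a minimal sketch of that watcher, assuming the python-consul and redis-py client libraries, a hypothetical KV key "service/redis/master" holding a "host:port" value, and a local replica on 6379:

```python
import consul
import redis

c = consul.Consul(host="127.0.0.1")
r = redis.Redis(host="127.0.0.1", port=6379)

index = None
while True:
    # Blocking query: returns when the key changes (or the wait times out).
    index, data = c.kv.get("service/redis/master", index=index, wait="30s")
    if data is None:
        continue
    host, _, port = data["Value"].decode().partition(":")
    # Point this replica at the new master; no config file edit, no restart.
    r.slaveof(host, int(port))
```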
The key difficulty here is managing Redis via config file changes with a tool not made for it. Given that Redis can already be managed via its API plus config save, I always recommend against going the other route the moment you introduce master/slave replication and use anything to dynamically fail over or alter slaves based on the availability of their master.
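To make the api-and-save route concrete: each change is a CONFIG SET, and persisting the running config back to the file Redis itself manages is a CONFIG REWRITE. A redis-py sketch, with purely illustrative values:

```python
import redis

r = redis.Redis(host="127.0.0.1", port=6379)

# Change the running configuration in place; takes effect immediately,
# no restart and no reload required.
r.config_set("maxmemory", "2gb")
r.config_set("maxmemory-policy", "allkeys-lru")

# Persist the running config back to the config file Redis was started with.
r.config_rewrite()
```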
Historically we in the *NIX world have used the file-and-reload method, and it shows in how we develop and what we use to manage systems. Redis offers us a more robust method, but being from a different paradigm it is a bit alien and can, as you say, feel fragile. Yet reloading a config file can be quite fragile too. We generally don't think about it that way because "it is the way we do things"; it is in our comfort zone.
If you put the same question to someone who comes from, for example, the world of Cisco routers, where the config is managed via an API and the router itself then saves any changes, what you are planning would sound fragile to them.
I was building SaltStack modules for managing Redis when the relative complexity of file-first really hit me. I wound up having the module talk to Redis to tell it what to do and having Redis manage its own config file instead. My Redis config management life became much easier at that point. Not to mention that changes apply orders of magnitude faster, and that speed likely matters. If your slaves are under such load that restarting the daemon is a problem, I'd suspect waiting around for config management, then a reload, then a SYNC, is likely to be problematic as well.
Again, I feel like a good solution for this problem must involve a code change. I wanted to start a discussion about options #1 and #2.
Having Redis support SIGHUP for config reload would probably help a whole lot of other use cases, such as changing "save", "maxmemory", and so on.
While I'm not fundamentally opposed to a reload, both of those can be changed at runtime. There are very few settings which cannot, and of those, some should not or cannot be changed by a mere reload anyway. Where to store your PID file comes to mind. So does activerehashing, though Salvatore said he would change that, as it should be configurable via the API.
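To illustrate that point, both settings named above are runtime-tunable on a live instance today (redis-py shown, example values only):

```python
import redis

r = redis.Redis(host="127.0.0.1", port=6379)

# Both of the settings mentioned above can be changed while Redis is running.
r.config_set("save", "900 1 300 10")  # adjust the RDB snapshot schedule
r.config_set("maxmemory", "1gb")      # adjust the memory ceiling

# Confirm what the server is actually running with.
print(r.config_get("save"), r.config_get("maxmemory"))
```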
When configuring Redis using a configuration management tool, using "config save" is not a viable option, and implementing both config file rewrite and "CONFIG SET" feels like (a) duplication of work and (b) fragile.
CONFIG SET and CONFIG REWRITE are rock solid on their own, and in daily use on thousands of instances. There is nothing fragile about them. Trying to manage the config file from two different tools is asking for trouble and partially reimplements what Redis already does. This is why I always recommend you don't manage Redis "file first" but rather the same way you manage, for example, a router: make changes to the running config and, when certain of them, save them.
The discussion is actually about two different things. Redis being able to reload its config is one item. Managing slaves during a master failover is a different discussion with different requirements.
Ultimately, in the situation you described, the proper place for updating a slave's master when the master changes IPs is whatever tool you are using, homegrown or not. That tool is by definition the source of authority. When the master changes, you want only that change to be made. By talking directly to the slave and changing only that setting, you get exactly that. By running it through a "third party" such as Puppet, SaltStack, etc. you introduce the possibility of other changes corrupting, or even breaking, the failover process.
All that said, if you really want a simple route to a solution which keeps file-first config, you could use inotify to watch the config file, pull out the settings, and do a CONFIG SET for each of them to get the same effect, without waiting on config reload support or worrying about a reload causing other issues. That way you retain file-first semantics.
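As a rough sketch of that route, assuming the watchdog library (which rides on inotify on Linux) plus redis-py; the path and the naive line parser are placeholders, since a real redis.conf has multi-word and quoted directives that need more care:

```python
import redis
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

CONF = "/etc/redis/redis.conf"
r = redis.Redis(host="127.0.0.1", port=6379)

def apply_config(path):
    # Read "directive value" lines and push each one over the API.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            directive, _, value = line.partition(" ")
            try:
                r.config_set(directive, value)
            except redis.ResponseError:
                # Ignore directives that cannot be set at runtime.
                pass

class ConfHandler(FileSystemEventHandler):
    def on_modified(self, event):
        if event.src_path == CONF:
            apply_config(CONF)

observer = Observer()
observer.schedule(ConfHandler(), path="/etc/redis")
observer.start()
observer.join()
```

This keeps the config file as the source of truth while still applying changes through CONFIG SET rather than a restart.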