Bit of a broad question here, but I run 3 separate Jenkins environments across different AWS regions, because I prefer segregation. As my Jenkins servers are tightly locked down, I generally don't need to update continuously, but I like to do so at least every 1-2 months.
When it comes to Jenkins updates and Jenkins plugin updates, and making sure masters don't break, I have relied only on snapshots: I take a snapshot before updating the OS, the Jenkins version, and all plugins. If anything breaks, I roll back.
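To make a rollback less all-or-nothing, the snapshot could be paired with a record of plugin versions taken before and after the update. Below is a minimal sketch, assuming the JSON has the shape returned by Jenkins' `/pluginManager/api/json?depth=1` endpoint (a `plugins` list whose entries carry `shortName` and `version`). The function names and the sample version numbers are hypothetical, and fetching the JSON from each master is left out.

```python
import json


def plugin_versions(plugin_manager_json):
    """Map plugin shortName -> version from a pluginManager JSON string.

    Assumes the shape returned by /pluginManager/api/json?depth=1.
    """
    data = json.loads(plugin_manager_json)
    return {p["shortName"]: p["version"] for p in data.get("plugins", [])}


def diff_versions(before, after):
    """Return {name: (old_version, new_version)} for every plugin that changed.

    A plugin missing on one side is reported with None for that side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


if __name__ == "__main__":
    # Hypothetical sample data standing in for two saved API responses.
    saved_before = '{"plugins": [{"shortName": "ssh-slaves", "version": "1.20"}]}'
    saved_after = '{"plugins": [{"shortName": "ssh-slaves", "version": "1.21"}]}'
    for name, (old, new) in diff_versions(
        plugin_versions(saved_before), plugin_versions(saved_after)
    ).items():
        print(f"{name}: {old} -> {new}")
```

With a list like this saved next to each snapshot, a failed update at least narrows down which plugins changed, rather than leaving a full server revert as the only option.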
In 2+ years I had not had to roll back until recently. My staging and production Jenkins environments worked fine with the latest updates, but QA failed. This environment has recently gained a lot more features and complexity, with additional plugins/services for automated testing, and I guess one or more of them must have broken with the updates.
Oddly, I would normally assume Jenkins itself wouldn't cause things to break; it's more likely to be plugins, which warn of non-backwards-compatible changes and the like. However, in this situation, the Java SSH process from the master to the slave ceased to work after the updates. I had to specify the actual OS ssh process instead, but then builds had all kinds of errors.
But I digress. After many attempts at downgrading plugins and changing the config, I eventually had to revert to the saved server state for my Jenkins QA master. It would seem the method I've been using, whilst it does cover screwups and allows me to roll back, is no longer sufficient for catching problems caused by updates before they are applied. Obviously having jobs fail to deploy costs valuable time and money, so I need to improve.
I'm wondering if anyone has any recommendations for a workflow for testing Jenkins plugin updates on master servers?
I have the slightly annoying issue that I'm using the "SCM Sync Configuration Plugin" to version config changes to GitHub. Any snapshot I use to spin up a new server for testing immediately causes problems with this, for instance. It's a minor nuisance, but this is my point: I'm trying to get a nice automated workflow for testing whether any updates are going to break things.