Sorry for the delayed response...I do have a few thoughts. :-)
What "rollback" means depends on how you build and deploy, and on the availability your service needs to provide to customers.
Rollback could mean any of, or a combination of, the following (there could be other options based on what you build and how you deploy):
- Create VM snapshots and "rollback" to previous state (revert snapshot) on failed deployment
- Use Kubernetes features (namespaces, replica sets, rollout/rollback, etc.) to manage Docker image versions, do a rolling update, and roll back on failure. (This is actually very slick and something we are starting to explore.)
- Rerun last successful GoCD pipeline instance (only the deployment stages) to redeploy last successful production build
- Checkout and rebuild from source and retest/redeploy
- Using infrastructure as code:
  - Dynamically spin up a new virtual environment with the new build (parallel to the current deployment) using something like salt-cloud, terraform, etc.
  - Validate the new deployment on each node
  - At the load balancer, drain, remove and replace each old node in the pool with a new node
  - If successful, delete the old VMs
  - (Rollback) On failure, add the old nodes back into the pool and delete the failed new nodes
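To make the last option concrete, here is a minimal sketch of the swap-and-rollback decision in Python. Everything in it is hypothetical: `swap_nodes`, the node names, and the `is_healthy` callback stand in for whatever your load balancer and validation tooling actually provide, and it assumes one new node per old node.

```python
from typing import Callable, List, Tuple


def swap_nodes(pool: List[str], new_nodes: List[str],
               is_healthy: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Return (resulting pool, nodes that are safe to delete).

    Hypothetical helper: assumes len(new_nodes) == len(pool).
    """
    old_nodes = list(pool)
    current = list(pool)
    for old, new in zip(old_nodes, new_nodes):
        if not is_healthy(new):
            # Rollback: restore the original pool and discard every new node.
            return old_nodes, list(new_nodes)
        # Drain and remove the old node, then put the new node in its place.
        current[current.index(old)] = new
    # Success: the old VMs are out of the pool and can be deleted.
    return current, old_nodes


pool, to_delete = swap_nodes(["old-1", "old-2"], ["new-1", "new-2"],
                             lambda node: node != "new-2")
print(pool, to_delete)
```

In this example run the second new node fails validation, so the pool goes back to the old nodes and both new nodes are marked for deletion. The old VMs are only deleted once every new node has passed.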
The end goal (from my perspective) is to make the deployment virtually transparent and minimize interruptions to those using the service. Doing a full rebuild/redeploy from source code is not conducive to rapidly handling a failed deployment.
Focusing on option #3 above (rerunning the last successful pipeline instance): GoCD has the awesome feature of rerunning a past pipeline, or even a single stage of one, with all settings, artifacts, scripts, etc. just as they were the last time you deployed. This assumes you designed your pipeline and process so that everything needed to deploy is archived as a material or artifact. A "rollback" then is simply rerunning the old pipeline using the existing artifacts. There is NO need to go back to the beginning and start from source.
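As a rough illustration of picking which instance to rerun, here is a small Python sketch. The record layout (`counter`, `result`) and the helper name `last_good_run` are my own assumptions for the example, not the shape of anything GoCD returns. You would fill the list from your own pipeline history.

```python
from typing import Dict, List, Optional


def last_good_run(history: List[Dict], failed_counter: int) -> Optional[Dict]:
    """Return the newest passed run older than the failed one, or None."""
    candidates = [run for run in history
                  if run["result"] == "Passed" and run["counter"] < failed_counter]
    return max(candidates, key=lambda run: run["counter"], default=None)


# Hypothetical deployment history; run 42 is the deployment that just failed.
history = [
    {"counter": 40, "result": "Passed"},
    {"counter": 41, "result": "Passed"},
    {"counter": 42, "result": "Failed"},
]
target = last_good_run(history, failed_counter=42)
print(target)
```

The run it selects is the one whose deployment stages you would rerun, using the artifacts already archived for that instance.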
While this isn't always the best or most efficient way, it is (in my not so humble opinion) much better than trying to start from source again. That gets tricky if you have dependencies that are released and/or versioned independently, and/or significant automated QA that must also run (and SHOULD be run if you do a new build).
I'm full of opinions on this :) and am happy to clarify or provide alternate views as needed.
-Jeff