I'm happy to announce that my master's thesis got published earlier this month.
In a nutshell, we apply chaos engineering to run multiple versions of a container based application in parallel. In the deployment manifest, we specify a version range for the container image to be deployed rather than a specific version. At runtime whenever a container starts it will pick a random version in this range. This allows us to test the hypothesis that multiple versions within this version range are compatible with each other.
More details in the abstract attached below, and the full text can be found here:
Feel free to reach out. I especially enjoyed writing the future work session, as quite a few exciting extensions of this work emerged, so I'm happy to talk about that.
Stay safe and all the best,
It is common practice to use versioning policies to create unique identifiers for applications running in production. Many times the schema used for the versioning identifier also includes information which allows one to infer the impact of the changes bundled in this new version. One well known and frequently used versioning policy is semantic versioning (SemVer), which, if properly adhered to, can be used to determine which different versions can be used interchangeably. However, many systems depending on the semantic versions of their applications do not explore the version compatibility in their applications. Therefore never obtaining confidence about whether the application adheres to the version compatibility promises made by adopting semantic versioning.
In this thesis, we present a novel approach to applying chaos engineering practices perturbing containerized application deployments. These perturbations create multi-version execution environments, where multiple different versions of the application are running concurrently. To achieve these multi-version deployments we created the container-registryproxy (CRP), which applies proxy practices to perturb the container distribution protocol. The employed perturbation model targets the versioning of SemVer container images. When a container image is distributed, the perturbation engine serves any image compatible according to SemVer, picked at random. Consequently, deployments with multiple containers are likely to end up running multiple versions.
The CRP is evaluated against distributed deployments of two popular databases: PostgreSQL and Redis. We find that the PostgreSQL image for redundant PostgreSQL clusters distributed by Bitnami bundles additional tooling. Incompatibilities between these bundled tools were not reflected in the version identifier according to SemVer, causing multi-version deployments to encounter crashes and take longer before being able to handle requests. None of our multi-version deployments of Redis caused containers to crash nor did it take longer to be operational than version wise homogeneous deployments.