Hi Daniel,
Sorry for the late response. Holiday and debugging ;)
We're using kube2iam and the pods can all contact the RDS and S3 buckets.
In our replication config, instead of using names per geographic location as in the examples, we use "logical" names like "default" and "peer", which we switch depending on whether the deployment is running in the primary location or the secondary one. Our current assumption is that this "switch" of the location values in DISTRIBUTED_STORAGE_CONFIG is messing things up.
DISTRIBUTED_STORAGE_CONFIG:
  default:
    - S3Storage
    - {s3_bucket: ${default_bucket}, storage_path: /datastorage/registry}
  peer:
    - S3Storage
    - {s3_bucket: ${peer_bucket}, storage_path: /datastorage/registry}
DISTRIBUTED_STORAGE_DEFAULT_LOCATIONS: [default, peer]
DISTRIBUTED_STORAGE_PREFERENCE: [default, peer]
You can see the mako (templating) variables.
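To make the "switch" concrete, here is roughly what the two rendered configs look like after templating (a sketch; the bucket names are placeholders, not our real ones):

```yaml
# Primary location: "default" points at the primary bucket.
DISTRIBUTED_STORAGE_CONFIG:
  default:
    - S3Storage
    - {s3_bucket: primary-bucket, storage_path: /datastorage/registry}
  peer:
    - S3Storage
    - {s3_bucket: secondary-bucket, storage_path: /datastorage/registry}

# Secondary location: same logical names, but the buckets are swapped,
# so "default" now points at the secondary bucket and "peer" at the primary.
# DISTRIBUTED_STORAGE_CONFIG:
#   default:
#     - S3Storage
#     - {s3_bucket: secondary-bucket, storage_path: /datastorage/registry}
#   peer:
#     - S3Storage
#     - {s3_bucket: primary-bucket, storage_path: /datastorage/registry}
```

So the name "default" refers to a different physical bucket depending on where the deployment runs.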
In the logs, it checks whether the file exists at a path in "default", and it indeed does not exist in that S3 bucket. Since it can't find the file, it stops the replication. This is visible when manually calling the backfillreplication script.
Questions:
- Does this variable switcheroo look like a plausible reason why we don't see any replication?
- Are the names in DISTRIBUTED_STORAGE_CONFIG (in our case "default" and "peer") stored in the database in some way, for example as lookup keys of some kind? We'd like to bring back clarity by being explicit about primary and secondary in the config, but we're afraid that if we rename them, the database might no longer be able to find any existing layers etc.
Thanks,
Frank