We have been working on standing up Vault on approximately 700 MySQL servers (prod and non-prod). These servers are a mix of Percona, Amazon RDS MySQL, and Amazon Aurora. With Percona and RDS, things are going very well, but not so much with Aurora.

One issue we were having has to do with clusters. Aurora has a concept of one writer and many readers. Only one node, the writer, can write to the database at a time, and what makes this even more difficult is that Amazon can and does move the writer role around the cluster.

The problem this presents has to do with the way connections are stored in Vault. Originally, we were storing each node in Vault. When we requested a credential, Vault would connect to the node, create the user, and return it with the corresponding password. With Aurora, you cannot create a user on a reader. It would return:
{
"errors": [
"Error 1290: The MySQL server is running with the --read-only option so it cannot execute this statement"
]
}
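For anyone trying to reproduce this, here is a minimal sketch of how one might check whether a given endpoint currently resolves to a reader before registering it with Vault. It assumes a Python DB-API cursor from whatever MySQL driver you use; `is_read_only` is a hypothetical helper name, not part of Vault or any driver. As far as we know, Aurora readers report `innodb_read_only = 1`, while stock MySQL replicas typically use `read_only`, so the sketch checks both.

```python
def is_read_only(cursor):
    """Return True if the server behind `cursor` reports itself as read-only.

    `cursor` is any DB-API 2.0 cursor connected to the MySQL endpoint
    under test (assumption: the driver returns these variables as 0/1).
    """
    cursor.execute("SELECT @@global.read_only, @@global.innodb_read_only")
    read_only, innodb_read_only = cursor.fetchone()
    # Either flag being set means CREATE USER will be rejected.
    return bool(int(read_only)) or bool(int(innodb_read_only))
```

Because Amazon can move the writer, a check like this only tells you the state at the moment it runs; it is a diagnostic, not a fix.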
So, to work around this, we store the cluster endpoint in Vault for each participating node in the cluster. Amazon always points this endpoint at the writer. This seems to work for the most part.

However, we are seeing other odd behavior that we have not been able to work around. The first issue is connections to the database. We are not sure what happened, but over time Vault created 230+ connections to an Aurora server. All of the connections were live and in a sleep state. This is problematic because it has the potential to exhaust all available connections to the database. When we attempted to DELETE the mount from Vault, we received the following error:
So far, we have been unsuccessful in dropping the mount. The permissions that have been set up for the Vault user are:
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, PROCESS, REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER ON *.* TO ...WITH GRANT OPTION
These are the same permissions we use on the other MySQL instances (RDS, Percona). We even went as far as dropping the Vault user from the database, and when we attempt to DELETE the mount, we receive:
Why does Vault need to check the database? Is it trying to clean up users it created? In our case, there were no Vault-created users in the database.
Two requests we have coming from our findings so far:
1. Allow unconditional deleting of mounts, or at least the ability to force drop the mount.
2. Provide an option as to whether or not Vault maintains persistent connections for a given mount.
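To make the setup concrete, here is a minimal sketch of the kind of connection config payload involved when pointing a mount at the cluster endpoint. Everything here is illustrative: `build_connection_config` is a hypothetical helper, and the field names `connection_url`, `max_open_connections`, and `max_idle_connections` are an assumption based on the MySQL backend's connection config as we understand it, so please check them against the documentation for your Vault version. The DSN follows the Go MySQL driver format.

```python
def build_connection_config(user, password, cluster_endpoint,
                            port=3306, max_open=2, max_idle=1):
    """Build a connection config payload that targets the Aurora cluster
    endpoint (which Amazon keeps pointed at the writer).

    Field names are assumptions; verify them for your Vault version.
    """
    # Go MySQL driver DSN: user:password@tcp(host:port)/
    dsn = f"{user}:{password}@tcp({cluster_endpoint}:{port})/"
    return {
        "connection_url": dsn,
        "max_open_connections": max_open,   # cap on the connection pool
        "max_idle_connections": max_idle,   # cap on idle (sleeping) connections
    }
```

Capping the pool this way would not explain how we ended up with 230+ sleeping connections, and it is not the same as the on/off option requested above, but it shows where such a setting would naturally live.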
One question we have:
1. How can we drop this mount from Vault?
We understand that HashiCorp may not have had Aurora in mind when they designed Vault. We will continue to test the product with Aurora and share our findings to help improve support on this front.
Regards,
Jerry