Vault/Consul corruption possibilities?


Will Hayworth

Jun 1, 2016, 1:58:16 PM
to Vault, Travis Hanna
Hi all!

We're running Vault with the Consul backend, both in high-availability mode. After a recent restart of the nodes (to rectify 500s we were seeing), we noticed that Vault no longer reports as initialized, so we can't unseal it. The Consul data directories still have plenty of data, but when I query the nodes I only see keys like "vault/core/leader/0003adbd-e9db-fa8c-a08e-6081c2794f17" (there are a TON of those) and one named "vault/core/lock". I've tried reading the Vault source and this looks like it might be problematic, but I'm not sure. I can reinitialize things and set up our PKI infrastructure, app-ids, etc. again, but doing so is (obviously) a pain in the butt. I'm not sure if this is a Vault or a Consul problem, but I figured this list would be a place to start.

Do you have any ideas about what's going on here? If we end up having to reinitialize our setup, are there any practices that would help prevent a recurrence?

Thanks,
Will
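
For anyone wanting to run the same check, here is a minimal sketch that lists the keys under the Vault prefix in Consul and reports whether anything besides leader entries is left. It rests on several assumptions: that Consul's HTTP API is reachable at the local address shown, that Vault's Consul backend uses the "vault/" prefix seen in the keys above, and that the keyring is stored at "vault/core/keyring" (consistent with the "keyring unexpectedly missing" error quoted later in this thread, but worth verifying against your Vault version). The function names are illustrative only.

```python
import json
import urllib.request

# Assumptions: Vault's Consul backend uses the "vault/" prefix and keeps
# its keyring at "<prefix>core/keyring". Verify both for your deployment.
KEYRING_SUFFIX = "core/keyring"
LEADER_SUFFIX = "core/leader/"


def fetch_keys(consul_addr="http://127.0.0.1:8500", prefix="vault/"):
    """List key names under `prefix` via Consul's KV HTTP API (?keys).

    Consul answers 404 when no keys match, which urlopen raises as HTTPError.
    """
    url = f"{consul_addr}/v1/kv/{prefix}?keys"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def summarize_keys(keys, prefix="vault/"):
    """Count leader entries and report whether the keyring key is present."""
    leader_entries = 0
    keyring_present = False
    other = 0
    for key in keys:
        if key.startswith(prefix + LEADER_SUFFIX):
            leader_entries += 1
        elif key == prefix + KEYRING_SUFFIX:
            keyring_present = True
        else:
            other += 1
    return {
        "leader_entries": leader_entries,
        "keyring_present": keyring_present,
        "other": other,
    }
```

Calling summarize_keys(fetch_keys()) against a cluster in the state described above would be expected to show many leader entries, one other key (the lock), and no keyring.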

Jeff Mitchell

Jun 1, 2016, 2:58:22 PM
to vault...@googlegroups.com
Hi Will,

It's really hard to say what's going on -- it'd be helpful to know
which version of Vault you're running, as well as logs from Vault and
Consul.

Thanks,
Jeff
> --
> This mailing list is governed under the HashiCorp Community Guidelines -
> https://www.hashicorp.com/community-guidelines.html. Behavior in violation
> of those guidelines may result in your removal from this mailing list.
>
> GitHub Issues: https://github.com/hashicorp/vault/issues
> IRC: #vault-tool on Freenode
> ---
> You received this message because you are subscribed to the Google Groups
> "Vault" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to vault-tool+...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/vault-tool/6fb63200-4b57-4786-8175-3d301cde6ea2%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Will Hayworth

Jun 1, 2016, 3:12:48 PM
to vault...@googlegroups.com
Sure thing, Jeff. :) (And thank you so much for responding!) I realized my omission of version info after the first email but didn't want to spam.
We're running Vault 0.5.2 and Consul 0.6.3. I've sent you the logs separately because they contain sensitive information (though I'm happy to try to strip that out and post the full stuff here).

For general consumption, we've been seeing logs like:

May 31 19:29:41 hostname docker/c7a3376a1444[4629]: 2016/05/31 19:29:41 [ERR] core: key rotation periodic upgrade check failed: Unexpected response code: 500
May 31 19:29:42 hostname docker/cbed24595186[4629]: 2016/05/31 19:29:42 [ERR] agent: failed to sync remote state: No cluster leader

And then, post restart:

May 31 22:47:19 hostname docker/c7a3376a1444[4629]: 2016/05/31 22:47:19 [INFO] core: acquired lock, enabling active operation
May 31 22:47:19 hostname docker/c7a3376a1444[4629]: 2016/05/31 22:47:19 [INFO] core: post-unseal setup starting
May 31 22:47:19 hostname docker/c7a3376a1444[4629]: 2016/05/31 22:47:19 [INFO] core: pre-seal teardown starting
May 31 22:47:19 hostname docker/c7a3376a1444[4629]: 2016/05/31 22:47:19 [INFO] core: pre-seal teardown complete
May 31 22:47:19 hostname docker/c7a3376a1444[4629]: 2016/05/31 22:47:19 [ERR] core: post-unseal setup failed: keyring unexpectedly missing


___________________________________________________________
Will Hayworth
Developer, Engagement Engine
Atlassian





Jeff Mitchell

Jun 1, 2016, 3:36:27 PM
to vault...@googlegroups.com
Hi Will,

I'll follow up privately so we can keep your data private. However,
for anyone wondering: the reason you're seeing so many leader entries
is that old ones are cleaned up on a periodic timer by a node that has
successfully become active. If active status is flapping, each node
that becomes active writes a new entry but loses active status again
before it starts removing old ones. So resolving the underlying issue
should fix that.

--Jeff
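
The accumulation Jeff describes can be illustrated with a toy model. This is not Vault's code: the function name and the cleanup interval are hypothetical, and the only behavior modeled is the one stated above, namely that each newly active node writes a leader entry and that old entries are removed only if the node stays active long enough for its periodic cleanup to fire.

```python
def simulate_leader_entries(active_durations, cleanup_interval):
    """Toy model: leader entries remaining after each activation.

    active_durations: seconds each successive node stays active.
    cleanup_interval: seconds a node must stay active before its periodic
                      cleanup fires (hypothetical value, not Vault's).
    """
    entries = 0
    history = []
    for duration in active_durations:
        entries += 1  # the newly active node writes its own leader entry
        if duration >= cleanup_interval:
            entries = 1  # cleanup removes every entry except the current one
        history.append(entries)
    return history


# Three short-lived leaders followed by one stable leader.
print(simulate_leader_entries([1, 2, 1, 120], cleanup_interval=60))
# prints [1, 2, 3, 1]
```

While active status keeps flapping the count only grows; once a node stays active past the interval, the stale entries are cleared.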