I'm going to answer your questions a little out of order...
> Also, why do you strongly discourage the dynamic key option? Is it just the
> audit reason that is discussed in the doc, or something else?
It's one of two major reasons. The thing to keep in mind about auditing
with the dynamic keys backend is that unless you can configure SSH to
not only log who logged in when but which key was used (which I don't
think you can do, but I'm happy to be proven wrong), you have no way
to correlate tokens with actions. It's an ordering problem: if three
users grab keys for a host, and then log into the machine in the
reverse of the order in which they grabbed the keys, you have no way
of knowing that this happened, as opposed to them logging in in order,
or in any other order.
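To make the ordering problem concrete, here is a minimal Python sketch.
The user names are made up, and it assumes all three users log in as
the same shared account and that the host records only that a login
happened, not which key was used.

```python
from itertools import permutations

# Hypothetical users who each fetched a dynamic key for the same host,
# in this order. This order is the only thing the key-issuing side knows.
key_fetch_order = ["alice", "bob", "carol"]

# Assumption: the host logs three logins but not which key each one
# used, so every assignment of users to login slots is equally
# consistent with what the host recorded.
possible_login_orders = list(permutations(key_fetch_order))

for order in possible_login_orders:
    print(" -> ".join(order))

print(len(possible_login_orders), "orderings fit the same logs")
```

With three users there are already six candidate orderings, and nothing
in the logs lets you pick one.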
The other major reason is that generating SSH keys is expensive in
terms of entropy. Vault could very quickly become a bottleneck if the
machine can't generate entropy fast enough.
> This isn't out of the question, though I'd need to do some work to get my
> security guys on board. They're pretty well sold on the idea of a bastion
> host proxying SSH.
I used to work at a company that did the bastion model. It was okay,
except for the various SSH features that were unsupported or dog-slow.
> Beyond that though, I'm wondering about the logistics of managing that.
> Right now, the one bastion host -- with one public IP
You'll definitely want more than one, so that a single machine going
down doesn't take out SSH access to all of your machines. So then
you have to figure out how to make the fact that there are multiple
bastion hosts transparent. At the previous company I mentioned, there
was a command that you used in place of SSH that would figure out
available bastion hosts and connect to one.
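I don't have that company's tool to share, but a wrapper along those
lines could look roughly like this Python sketch. The bastion host
names are hypothetical, and "available" is assumed to mean nothing more
than "accepts a TCP connection on port 22".

```python
import socket

# Hypothetical bastion host names; substitute your own.
BASTIONS = ["bastion1.example.com", "bastion2.example.com"]


def is_reachable(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_bastion(bastions, probe=is_reachable):
    """Return the first bastion that answers the probe, or None."""
    for host in bastions:
        if probe(host):
            return host
    return None


def ssh_command(target, bastions=BASTIONS, probe=is_reachable):
    """Build an ssh argv that jumps through the first live bastion."""
    bastion = pick_bastion(bastions, probe)
    if bastion is None:
        raise RuntimeError("no bastion host is reachable")
    # -J (ProxyJump) is available in OpenSSH 7.3 and later.
    return ["ssh", "-J", bastion, target]
```

A real wrapper would then hand the result to something like
`os.execvp("ssh", ssh_command(target))`, so that users type one command
and never need to know which bastion they went through.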
> If I were to go with transparent proxying directly to the instances with
> haproxy or nginx, that automation would need to be replaced with something
> that updated the proxy config. That's doable, but I believe I'd either need
> to allocate an IP on the proxy box for every host I want to proxy to, or
> assign a port for each machine. In either case, I'd have some way clients
> can (again through automation) get that mapping of ports to the machine
> they're trying to get to. There are certainly ways I can make that work,
> but it seems like it's actually quite a bit more complicated than the system
> I have now.
I did this in the past using port numbers, and I exposed a consul
agent's DNS ports to my other subnet so that clients there could query
for each host's SSH service. However, there is another potential
solution: haproxy can inspect a connection to know whether it is an
SSH connection request, and it can also look at SNI. So I believe you
could accomplish this with haproxy by using the method here:
http://blog.chmd.fr/ssh-over-ssl-episode-4-a-haproxy-based-configuration.html
but adding `-servername %h` to the openssl s_client command so that it
sends an SNI header.
On haproxy, you look at the SNI header to determine which host to send
the connection to. You could also, if you wished, inspect the data to
ensure that it is an SSH connection, although if the only thing
haproxy would be doing is proxying to SSH then this would be
unnecessary.
Note that I haven't tried this specific scenario out, but I have
(separately) both routed based on the SNI header within haproxy and
proxied SSH through haproxy (and it was quite performant).
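For the client side of the SNI approach, here is a small, untested
Python sketch that generates the matching ssh_config stanza. The host
names and port 443 are placeholders, and the use of `-quiet` is my
assumption about how you would want to invoke openssl s_client; only
`-servername %h` is the part described above.

```python
def ssh_config_stanza(host_pattern, proxy_host, proxy_port=443):
    """Return an ssh_config block that tunnels SSH through TLS with SNI.

    ssh expands %h to the target host name, so openssl sends that name
    as the SNI value for the proxy to route on.
    """
    proxy_command = (
        f"openssl s_client -quiet -connect {proxy_host}:{proxy_port} "
        "-servername %h"
    )
    return f"Host {host_pattern}\n    ProxyCommand {proxy_command}\n"


# Hypothetical names, for illustration only.
stanza = ssh_config_stanza("*.internal.example.com", "proxy.example.com")
print(stanza)
```

With a stanza like that in place, a plain `ssh` to a matching host goes
through the proxy without the user having to remember any port mapping.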
> Are there reasons why you don't think the SSH backend couldn't be modified
> to support the bastion model? Or are you just offering some suggestions for
> alternatives? The suggestions are appreciated, but I think the bastion
> model I'm describing is common, and it'd be useful to support it.
I'm offering suggestions for alternatives because a) the usage of
bastion hosts indicates that you are planning on using the dynamic key
model, which is very strongly discouraged; b) while I think the
bastion host concept is fine, multiple SSH jumps are quite slow,
performance-wise, and better solutions exist today; and c) I don't
know when there will be time to add that support. There's a very full
plate in the near term.
--Jeff