On Wed, Jul 9, 2014 at 11:06 AM, Darren Shepherd
<darren.s...@gmail.com> wrote:
> I'm kicking this thread off because I've been struggling with
> /etc/environment lately (not just that it went MIA, it's apparently coming
> back). I want to give a bigger picture to what I'm doing so that maybe we
> can see if there's a different approach or something. And ideally it would
> be nice to have my concerns addressed before stable is declared (probably
> not likely). I know this is long, but please read.
To clarify, it is coming back on EC2/OpenStack. The reason writing
IP addresses there stopped in 367.0.0 was that we moved support for
detecting them from a stand-alone script into cloudinit in order to
gracefully support reading from either an EC2-compatible metadata
service or an OpenStack config drive, which provides the same data.
The previous stand-alone script only worked with the metadata service
and caused OpenStack images booted on systems with only a config
drive to hang forever waiting on the non-existent metadata service.
>
> I've been working on a really fun side project for a while that has allowed
> me to do some fun experiments with running both KVM and Docker on CoreOS.
> Basically you can easily turn a CoreOS cluster into a platform for running
> both VMs and Containers. So a lot like OpenStack, but just really, really
> easy to run and Containers are first class citizens. I've been trying to
> package it up and release it out to the community so other people can play
> with it, but that's where I've been having problems. It's hard to ensure
> that it will work across EC2, GCE, Vagrant, Local install, iPXE, etc.
Indeed it is hard; fixing that is the eventual goal, but it is a work
in progress.
>
> This is my approach. I wanted to start with the assumption that you have a
> fleet cluster running. You then download my unit file and do "fleetctl
> start fancy.unit". And voilà! You have your own personal EC2. In order
> for this to work, I have to build upon what is available by default. So
> docker already gives me a consistent view of the world and all my stuff runs
> in containers. So that part is good. The big hangup is about the IP of the
> server. I need a consistent and reliable way to get the IP of the server.
> Since my installation method is from fleet, I can't rely on some special
> setup from cloud-config, also I don't want whatever cloud-config the user
> has screwing up things and making my stuff not work.
>
> So in order to get the IP information about the server I've been using
> /etc/environment and COREOS_PUBLIC_IPV4, COREOS_PRIVATE_IPV4. I've felt
> like I've been using some fragile back door by using this file. Here are
> my issues:
Well, your intuition was right: it is brittle. Setting those
values never worked properly except on EC2, OpenStack, GCE, and
Vagrant. We did document the feature in cloudinit that relies on those
($public_ipv4 and $private_ipv4), but I purposefully didn't document
the /etc/environment half of the implementation because I was doubtful
that it was a workable long-term solution. The primary reason for
adding this in the first place is that the current etcd implementation
is tied closely to a one-node-one-ip setup, so to get etcd working at
all we needed some way to accommodate that limitation. Eventually etcd
will be fixed to lift that limitation, so my hope is that the need for
this will be reduced.
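For reference, the documented half of the feature is plain textual
substitution: cloudinit expands those variables inside the submitted
cloud-config before applying it. A loose illustration of the idea in
Go (the real logic lives in coreos-cloudinit; the addresses here are
placeholders):

    package main

    import (
        "fmt"
        "strings"
    )

    func main() {
        // A user-provided cloud-config fragment using the variables.
        cloudConfig := "addr: $private_ipv4\npeer-addr: $public_ipv4\n"

        // cloudinit substitutes the values detected at boot; these
        // IPs are made up for the example.
        expanded := strings.NewReplacer(
            "$private_ipv4", "10.0.0.5",
            "$public_ipv4", "203.0.113.7",
        ).Replace(cloudConfig)
        fmt.Print(expanded)
    }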
>
> 1) It doesn't seem to be documented anywhere. Maybe it was never intended
> to be used, since 367 deleted /etc/environment.
> 2) It doesn't seem to exist or be populated in all environments. Bare metal
> and iPXE don't seem to have the file?
> 3) /etc/environment is a general purpose file. I don't know what may read
> it, but that files exists on other distros like Ubuntu, etc. So people for
> whatever reason may want to put more crap in that file, or overwrite it, in
> the end screwing with the consistency of the environment.
The scripts that write to this file try to update it rather than
overwrite it for this reason. Granted, they are all quick-n-dirty bash
scripts that are not immune to race conditions, but that was sufficient
for an initial proof-of-concept implementation.
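For what it's worth, here is a rough sketch of that
update-rather-than-overwrite behavior, written in Go for clarity (the
real scripts are bash, and this helper is my own invention, not the
actual code):

    package main

    import (
        "io/ioutil"
        "log"
        "strings"
    )

    // setEnvVar rewrites only the given key in an environment file,
    // preserving other entries a user or another tool has added
    // (blank lines are dropped). Without file locking this
    // read-modify-write is still racy.
    func setEnvVar(path, key, value string) error {
        data, _ := ioutil.ReadFile(path) // a missing file starts empty
        found := false
        var out []string
        for _, line := range strings.Split(string(data), "\n") {
            if strings.HasPrefix(line, key+"=") {
                line = key + "=" + value
                found = true
            }
            if line != "" {
                out = append(out, line)
            }
        }
        if !found {
            out = append(out, key+"="+value)
        }
        contents := strings.Join(out, "\n") + "\n"
        return ioutil.WriteFile(path, []byte(contents), 0644)
    }

    func main() {
        err := setEnvVar("/etc/environment", "COREOS_PUBLIC_IPV4", "203.0.113.7")
        if err != nil {
            log.Fatal(err)
        }
    }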
> 4) PUBLIC_IPV4 and PRIVATE_IPV4 do not seem well defined, and are sometimes empty
>
> It seems like knowing the primary IP and the possible externally NAT'd
> IP is a basic need for any service discovery. I want a very well-known and
> consistent way to determine this information. Here's what I propose, feel
> free to disagree, I don't assume to know everything.
This is the basic summary of the trouble here: it is simply impossible
to robustly guess what the "primary IP" of a host is, and even more
impossible to guess what an external NAT address might be. It
should be possible to at least guess well enough for many environments,
but that must be implemented carefully to avoid hanging for an
excessive amount of time during boot. There will always be plenty of
environments where the guesswork fails, and we need to be well behaved
and fast everywhere.
If a generic implementation for detecting these values returns to
CoreOS, I have a couple of requirements to avoid this being a
persistent thorn in our side:
- It must be integrated with coreos-cloudinit. We need to support
configuring networking statically in addition to DHCP, as well as
networks that don't have a route to the Internet. The primary way we
have for users to provide configuration of any kind is via a cloud
config, the primary consumer of the PUBLIC/PRIVATE_IPV4 values. This
is why I never "fixed" the old generic coreos-setup-environment script
to simply wait until it found a route to the public Internet: since
that implementation required coreos-setup-environment to finish before
coreos-cloudinit could run, waiting on networking could deadlock boot.
So cloudinit needs some fairly complex logic in order to be able to
behave sanely.
- It must have a pretty clear scope of what we can and cannot support.
Networking is complicated, and we need to be able to clearly document
when and where it is possible to rely on these values and when it
isn't. Ideally the line between the two doesn't hinge on some
arbitrary timeout during boot; inevitably there will be environments
where networking usually gets configured properly before the timeout
but sometimes is a little slower, leaving the values undefined and
then breaking services that depend on them in unexpected ways.
Right now I don't know if it is possible to meet those requirements.
The better solution is for services not to assume that networking is
simple and not to depend on being configured in this way. For example,
what I would like to see in etcd is something like this (disclaimer:
this is just my own pondering, not a fully fleshed out design, and
etcd's future may look quite different; a rough sketch of the
address-gathering piece follows the list):
- Support an arbitrary number of addresses per node, advertising all
possible addresses a node knows about to its peers.
- Peers would connect via the first working address, prioritizing
better-looking routes to try first.
- By default nodes would dynamically watch local addresses, updating
the list they advertise as needed.
- In addition to local addresses it should be possible to give a node
any number of other addresses to advertise in order to support NAT or
complex firewall setups.
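The address-gathering half of that list is straightforward; here is a
minimal sketch of enumerating and prioritizing local addresses (the
function name and the ordering policy are my own, not anything etcd
actually does today):

    package main

    import (
        "fmt"
        "net"
    )

    // advertisableAddrs returns every address a node could advertise,
    // with global unicast addresses ahead of link-local ones so peers
    // try the better-looking routes first. Loopback is skipped.
    func advertisableAddrs() ([]string, error) {
        addrs, err := net.InterfaceAddrs()
        if err != nil {
            return nil, err
        }
        var global, linkLocal []string
        for _, a := range addrs {
            ipnet, ok := a.(*net.IPNet)
            if !ok || ipnet.IP.IsLoopback() {
                continue
            }
            if ipnet.IP.IsGlobalUnicast() {
                global = append(global, ipnet.IP.String())
            } else if ipnet.IP.IsLinkLocalUnicast() {
                linkLocal = append(linkLocal, ipnet.IP.String())
            }
        }
        return append(global, linkLocal...), nil
    }

    func main() {
        addrs, err := advertisableAddrs()
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(addrs)
    }

The hard part is everything around it: deciding when that list is
stable enough to act on, and keeping it updated as addresses change.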
Most services don't need to be quite that smart since most services
aren't organized into a many-to-many cluster. A simple service like
an HTTP server should be configured not to care about the local
address at all, or even, when possible, what its absolute URL may be.
In the few cases where an absolute URL must be used (such as in a
redirect from HTTP to HTTPS) the Host: header in the request should
be used, as in the sketch below.
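To make that concrete, a minimal sketch in Go of an HTTP-to-HTTPS
redirect that needs no knowledge of the local address at all:

    package main

    import (
        "log"
        "net/http"
    )

    // Redirect to the HTTPS version of the same URL using the Host:
    // header from the request rather than any configured address.
    func redirectToHTTPS(w http.ResponseWriter, r *http.Request) {
        http.Redirect(w, r, "https://"+r.Host+r.RequestURI,
            http.StatusMovedPermanently)
    }

    func main() {
        log.Fatal(http.ListenAndServe(":80",
            http.HandlerFunc(redirectToHTTPS)))
    }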
The bit that lies between those two in complexity is service
discovery, where services self-publish themselves into etcd or
similar. It is handy when that can be done via a simple curl command
as the service starts and stops, and providing these variables would
facilitate that, but a more intelligent tool would be better. I think
it would be worthwhile to include a tool in CoreOS for this purpose.
It would include the logic you described above for determining a
reasonable default IP to advertise and optionally include all other
addresses as secondary choices. It would also be able to robustly
update and clean up state in etcd, and when used as a long-running
process it could use a TTL and periodic updates to ensure an unclean
shutdown doesn't leave stale data behind. Best of all, dealing with
this sort of logic at this level avoids the complexity of making sure
it works everywhere and doesn't risk deadlocking boot and cloudinit.
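To make the TTL idea concrete, here is a minimal sketch of that
long-running mode against etcd's v2 keys API on the default port (the
key path, address, and intervals are all made up for the example; a
real tool would also deregister cleanly on stop):

    package main

    import (
        "log"
        "net/http"
        "net/url"
        "strings"
        "time"
    )

    func main() {
        // Hypothetical key for this service instance.
        key := "http://127.0.0.1:4001/v2/keys/services/web/host1"
        body := url.Values{
            "value": {"10.0.0.5:8080"}, // address to advertise
            "ttl":   {"60"},            // expire if updates stop
        }.Encode()

        // Refresh at half the TTL so an unclean shutdown simply lets
        // the key expire instead of leaving stale data behind.
        for {
            req, err := http.NewRequest("PUT", key, strings.NewReader(body))
            if err != nil {
                log.Fatal(err)
            }
            req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
            if resp, err := http.DefaultClient.Do(req); err != nil {
                log.Print(err) // etcd may be briefly unavailable; retry
            } else {
                resp.Body.Close()
            }
            time.Sleep(30 * time.Second)
        }
    }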