Staging and Production Environments - Single or Separate CoreOS Clusters?


Jan Vincent Liwanag

Oct 25, 2014, 10:38:26 AM
to coreo...@googlegroups.com
Hi,

If I have two environments -- Staging and Production, would they be better off as separate CoreOS clusters? Or might it make more sense to create a single cluster and maybe tag which machines should run applications from staging or production?

I know that the likely answer to this is "it depends", but I'd like to understand what considerations I should weigh.

Thanks!

hydrajump

Oct 27, 2014, 8:05:10 PM
to coreo...@googlegroups.com
Hi Jan,

Great question! I've also been thinking about this. At the moment my team wants two separate AWS VPCs, one for staging and one for production. I've been considering a single VPC with a single CoreOS cluster
and, as you suggested, some form of tagging to isolate the containers.

I'd also like to hear opinions and advice on this.

Brandon Philips

Oct 28, 2014, 12:16:06 AM
to hydrajump, coreos-user
Hey Jan and Hydrajump-

I think it is reasonable to run production and staging workloads
side by side on the same cluster, as long as you can get the cgroup
limits and load balancing configuration 100% correct on your systems.
These two things can be rather tricky given the current state of
tooling and monitoring, but over time I think it will become easier
and easier to recommend this as a reasonable setup.
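
As a concrete illustration of the cgroup side (the unit name, image, and numbers here are hypothetical, not a prescribed setup), a fleet unit can pass resource limits straight to docker. Since the container processes end up under the docker daemon rather than the service's own cgroup, the limits are given as docker flags:

[Unit]
Description=Example production web app (hypothetical)
After=docker.service
Requires=docker.service

[Service]
# clean up any stale container from a previous run; "-" ignores failure
ExecStartPre=-/usr/bin/docker rm -f webapp
# -m caps memory, -c sets relative CPU shares for the container's cgroup
ExecStart=/usr/bin/docker run --name webapp -m 512m -c 512 mycompany/webapp
ExecStop=/usr/bin/docker stop webapp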

One idea, if you do run a separate cluster, is to run CoreOS alpha on
it. That is a great way to ensure the latest kernels, docker, etcd,
etc. work well with your application. Particularly when paired with
solid monitoring, you can help the entire community, including your
future self, by ensuring the platform you are building on continues
to run your application well.

tl;dr: with careful configuration and load balancing, I think it is
reasonable to run a single cluster.

Brandon

Rimas Mocevicius

Oct 28, 2014, 6:37:45 AM
to coreo...@googlegroups.com, wa...@hydrajump.com
Hey guys,

Let me put my 2 cents here too.

At work I set up most of our servers on CoreOS on GCE; only the two bare-metal servers in our office run Ubuntu+Docker.

As we have 5 different production projects, I set up one 3-node dev/test/staging cluster (on CoreOS alpha) shared by all of them.

I also have another two-node cluster behind a load balancer that hosts our private docker registry, a docker image builder (docker-in-docker), and an in-house
Deployment and Release Manager that controls our dev/test/staging/production releases and manages the clusters' units, e.g. checking cluster status, pulling/restarting/querying units, and rebuilding base docker images.
(Maybe when our Deployment and Release Manager is more mature I will try to convince my employer to open source it.)
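
For reference, the day-to-day operations such a manager wraps map onto plain fleetctl calls; a minimal sketch, with the unit name made up:

fleetctl list-machines              # check cluster status and metadata
fleetctl list-units                 # see which units run where, and their state
fleetctl status webapp@1.service    # systemctl-style status of one unit
fleetctl stop webapp@1.service      # stop a unit...
fleetctl start webapp@1.service     # ...and start it again to restart it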

So far so good with this simple and nice CoreOS cluster setup.

All the best

Rimas

hydrajump

Oct 28, 2014, 9:33:14 AM
to coreo...@googlegroups.com, wa...@hydrajump.com
Hi Rimas,

Thanks for sharing your setup.

If I understand what you've described, you have a single 3-node cluster used for all environments **and** a 2-node cluster just for the private docker registry and other operational services?

Why didn't you put the things from the 2-node cluster onto the 3-node cluster and have everything running there?

Also can you have just 2 nodes to form a cluster? Don't you need a minimum of 3 for etcd?

How do you keep the various environments from interfering with each other? For instance, a dev web app shouldn't be able to communicate with anything that's a production service?

Can you share anything, even by simply explaining how new code pushed to, say, GitHub is pulled down, a Docker image built, pushed to your private registry, and then deployed to your 3-node cluster using fleet (I assume you're using fleet)?

I'm trying to put together something similar and any advice you can share would be awesome!

Thanks.

- Jonathan

Btw are you on #coreos?

Rimas Mocevicius

Oct 28, 2014, 10:45:40 AM
to coreo...@googlegroups.com, wa...@hydrajump.com
Hi Jonathan,

Sure, I can go into more detail about my setup:

>If I understand what you've described, you have a single 3-node cluster used for all environments **and** a 2-node cluster just for the private docker registry and other operational services?
>Why didn't you put the things from the 2-node cluster onto the 3-node cluster and have everything running there?
My production clusters usually run the stable channel, or at least beta; that goes for the 2-node cluster too.
The dev/test/staging cluster is usually on alpha, as is my local coreos-vagrant setup.
That is why I have two separate clusters.
(I used to have everything in one cluster on alpha in my early days of using CoreOS in production, but once CoreOS matured to a stable channel, things changed.)

>Also can you have just 2 nodes to form a cluster? Don't you need a minimum of 3 for etcd?
It works fine for me with 2 nodes too, but for web-facing, real production clusters I always use 3-node clusters. (Strictly speaking, a 2-node etcd cluster has no fault tolerance: a majority of 2 is still 2, so losing either node blocks writes, whereas a 3-node cluster tolerates one failure.)

>How do you keep the various environments from interfering with each other? For instance, a dev web app shouldn't be able to communicate with anything that's a production service?
Config files differ per environment, with different connection settings, and in some cases I use different hosts files pointing at the different MySQL servers.
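
As a minimal sketch of that hosts-file trick (the paths, image name, and addresses are hypothetical), the environment-specific hosts file can simply be bind-mounted read-only over the container's /etc/hosts:

# staging container: "mysql" resolves to the staging DB server
docker run -v /etc/envs/hosts.staging:/etc/hosts:ro mycompany/webapp

# production container: same image, different hosts file
docker run -v /etc/envs/hosts.production:/etc/hosts:ro mycompany/webapp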

>Can you share anything, even by simply explaining how new code pushed to, say, GitHub is pulled down, a Docker image built, pushed to your private registry, and then deployed to your 3-node cluster using fleet (I assume you're using fleet)?

I do not rebuild docker images for dev/test/staging, as that is far too time-consuming.
The way my dev/test/staging works:
Each cluster node has an ssh container used for rsyncing code,
and a web server container with apache (plus all the necessary php etc. dependencies) and ssh too, just to mimic a real linux server for my developers.
The code is stored on the host disk (an extra one, which makes it easier to clone or to reattach to a rebuilt host) and mapped as a volume into the web and ssh containers.
Release scripts just pull the code from the git branch (with exclusions for the config files in some cases) depending on the environment, as sketched below.
That makes code releases very quick, and developers can see changes in seconds.
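
A rough sketch of what such a release script can look like; all paths, branch names, and the volume layout here are assumptions for illustration, not my actual scripts:

#!/bin/bash
# release.sh <env> -- sync the requested branch into the code volume
# that is bind-mounted into the web and ssh containers,
# e.g. docker run -v /mnt/code/staging:/var/www ...
set -e
ENV="$1"                        # dev | test | staging
CHECKOUT="/mnt/code/checkout"   # plain git working copy
CODE_DIR="/mnt/code/$ENV"       # extra host disk, mapped as a container volume

# bring the working copy up to date with the environment's branch
cd "$CHECKOUT"
git fetch origin && git checkout "$ENV" && git reset --hard "origin/$ENV"

# copy into the live volume, leaving per-environment config files alone
rsync -a --delete --exclude 'config/' "$CHECKOUT/" "$CODE_DIR/"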

For production I use the proper docker release cycle:
the Deployment Manager, in a docker-in-docker container (with ssh access as well), pulls the code from the staging server, adds the necessary config files, builds the docker image,
pushes it to the private registry (backed by google cloud storage, so it works with the two servers behind the LB), pulls the image on the production servers, and restarts the fleet service units.
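
In docker/fleetctl terms the production cycle looks roughly like this; the registry address, image tag, and unit name are all hypothetical:

# inside the docker-in-docker builder: build and push the release image
docker build -t registry.example.internal:5000/mycompany/webapp:v42 .
docker push registry.example.internal:5000/mycompany/webapp:v42

# then roll it out on the cluster (the unit's ExecStartPre is assumed
# to 'docker pull' the new tag before starting the container)
fleetctl stop webapp@1.service
fleetctl start webapp@1.service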

And of course fleet is used on all our coreos nodes, even on my one-node local coreos-vagrant machine :). At some point I'm looking to put kubernetes on top of fleet.

Also, if you are interested, I have a small GUI/wrapper app for coreos-vagrant on Mac: https://github.com/rimusz/coreos-osx-gui

> Btw are you on #coreos?
Yes, I'm on #coreos as rimusz; feel free to contact me there :)

Any more questions, just ask over here or on #coreos.

Regards

Rimas

Jan Vincent Liwanag

Nov 1, 2014, 12:26:13 AM
to coreo...@googlegroups.com, wa...@hydrajump.com
Thanks guys for the advice!

One thing I haven't figured out: given no tagging, on a cluster of 3, how does fleet decide where to run a certain application? Is there a chance that all applications end up running on a single node?

Rimas Mocevicius

Nov 1, 2014, 8:14:55 AM
to coreo...@googlegroups.com, wa...@hydrajump.com
If there is no tagging, all applications can end up running on a single node.
I asked the CoreOS guys before about how to spread the load evenly across nodes, but there is no such option in fleet yet.
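
For the tagging Jan asked about at the top of the thread, fleet does support scheduling hints in a unit's [X-Fleet] section; a minimal sketch, with the metadata key and unit name made up. Machines are tagged in their cloud-config:

#cloud-config
coreos:
  fleet:
    metadata: "env=production"

and units then pin themselves to matching machines and refuse to stack on one host:

[X-Fleet]
MachineMetadata=env=production
Conflicts=webapp@*.service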

Rob Szumski

Nov 1, 2014, 6:51:05 PM
to Rimas Mocevicius, coreo...@googlegroups.com, wa...@hydrajump.com
Since fleet 0.7, the cluster will attempt to level out the number of units running on each machine. This accomplishes really coarse load leveling, but it doesn’t take into account resource usage or anything like that.

Rimas Mocevicius

Nov 1, 2014, 7:52:13 PM
to coreo...@googlegroups.com, rmo...@gmail.com, wa...@hydrajump.com
Hi Rob,

Yes, but it only levels out units when, for example, you deploy units to a fresh cluster.
E.g. I have a 3-node cluster:
1) On the first deploy, 15 units get deployed, 5 per node.
2) But when one machine gets rebooted after an OS update, its units get moved to the two
remaining machines, and afterwards the units do not get rescheduled evenly across the 3 machines.
3) So I always have one empty machine until the next OS update.

It would be nice if fleet could reschedule units when a machine comes back after a reboot.
For high-availability unit setups it works just fine, but with e.g. 15 units across three machines, the machine
that was rebooted last always stays empty.

Rimas Mocevicius

Nov 2, 2014, 10:04:59 AM
to coreo...@googlegroups.com, rmo...@gmail.com, wa...@hydrajump.com
Rob,
So I looked more deeply into how to reschedule units when a cluster machine comes back after a reboot,
and how it can take some of the unit load off the other cluster machines
(as per the problem explained in my previous post).

The only thing I came up with is to have a one-shot systemd unit which runs commands like those below:

fleetctl unload unit1
fleetctl start unit1

fleetctl unload unit2
fleetctl start unit2

fleetctl unload unit3
fleetctl start unit3
...

So the script unloads and starts all required fleet units one by one, and the cluster then attempts to level out the number of units running on each machine.
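
A minimal sketch of that one-shot unit, assuming the fleetctl loop above lives in a hypothetical /opt/bin/rebalance-units.sh:

[Unit]
Description=Re-spread fleet units after this machine rejoins the cluster
After=fleet.service
Requires=fleet.service

[Service]
# Type=oneshot: run the script once to completion, no long-lived daemon
Type=oneshot
ExecStart=/opt/bin/rebalance-units.sh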

Rob, is there anything better you can recommend?

Thanks

Rimas

Rob Szumski

Nov 6, 2014, 8:55:24 PM
to Rimas Mocevicius, coreo...@googlegroups.com, wa...@hydrajump.com
I don’t know of anything better currently. If you have any more ideas about what a rebalancing feature should look like, opening a GitHub issue with a description/proposal would be appreciated. We can continue the discussion there :)

 - Rob

Rimas Mocevicius

Nov 7, 2014, 1:07:21 PM
to coreo...@googlegroups.com, rmo...@gmail.com, wa...@hydrajump.com
Sure, Rob.

I'll think about it and open a GitHub issue over the weekend.