A few questions about disaster recovery

karl.qu...@ticketfly.com

Mar 6, 2015, 9:04:00 PM

to scalr-...@googlegroups.com

Hi, all.

Before we can roll Scalr out internally, there are a few last-minute things that need to be taken care of, primarily disaster recovery.

I have two questions:

They both assume the following:

1. The Scalr server lives in AWS. A snapshot of the system is made every 12 hours. The Scalr server is reached via a scalr.mycorp.com DNS entry pointing to an EIP.

2. There was a catastrophic failure; the box is 100% unavailable, and all I have is a snapshot.

So, my questions:

1. Let's assume that I can bring the server up and things are working fine. What about all the nodes that were created / spun up since my last backup? E.g., I created the most recent snapshot at noon. At 4 PM I launched 5 new instances. How can I get Scalr to 'know' about those nodes? There will be no record of them in the snapshot, only in AWS. Will the scalr-agent phoning home be 'enough' to get things working again?

2. Let's say that I have nodes that are configured to talk to 1.2.3.4 as the Scalr server. What file(s) do I need to update on the clients so that they'll begin talking to the Scalr server located at 6.7.8.9? Can this even work?

Thanks!

-K

Thomas Orozco

Mar 10, 2015, 9:32:55 AM

to scalr-...@googlegroups.com, Daniele Testa

Hi Karl,

1. Those servers won't be able to re-register with Scalr. This is something we plan to change later this year, but we aren't there yet. In the meantime, we recommend setting up replication in addition to snapshots (this will be possible using the installer in 5.3, which we expect to release shortly; until then you can of course set it up manually).

2. You'd need to change the user data for those instances. This is obviously an intrusive process, so we very strongly recommend you use a domain name instead of an IP for the Scalr endpoint.
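As a rough illustration of the point about domain names, a small pre-launch check can flag a Scalr endpoint that is configured as a bare IP address rather than a hostname. This is only a sketch: `endpoint_uses_ip_literal` is a hypothetical helper, not part of Scalr, and the URLs are the example values from this thread.

```python
import ipaddress
from urllib.parse import urlparse


def endpoint_uses_ip_literal(endpoint_url):
    """Return True if the URL's host is an IP address rather than a DNS name.

    Hypothetical helper for illustration; not part of Scalr.
    """
    host = urlparse(endpoint_url).hostname
    if host is None:
        raise ValueError("no host found in %r" % endpoint_url)
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not parseable as IPv4/IPv6, so treat it as a DNS name.
        return False
    return True


if __name__ == "__main__":
    for url in ("http://1.2.3.4/", "https://scalr.mycorp.com/"):
        print(url, endpoint_uses_ip_literal(url))
```

With a hostname, moving the server only requires repointing DNS (or the EIP behind it); the instances' user data stays untouched.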

Cheers,


karl.qu...@ticketfly.com

Mar 10, 2015, 4:03:34 PM

to scalr-...@googlegroups.com, dan...@scalr.com

Thomas, thanks for the answers!

What, specifically, needs to be replicated? I'm not intimately familiar with the Scalr internals; I only know of the "big pieces". Can you point me to a "high availability best practices" document or list of things to do?

Is there any documentation on where that scalr-agent userdata lives? I ask mostly to satisfy my own curiosity, as well as to see how invasive the changes are. I 100% understand that I'd be messing around with things I shouldn't and that you won't be able to help if I come back asking for help w/ those instances.

Additionally, I have one other question:

I know that Scalr uses Chef solo/zero to install itself. Is there any official support or documentation for using a Chef run that I control to deploy the server? Basically, we'd like our entire infrastructure, including core parts like Scalr, to come up that way. Is using Chef to manage the Scalr config.yaml recommended or supported at all?

Thanks!

-K

Thomas Orozco

Mar 10, 2015, 6:39:35 PM

to scalr-...@googlegroups.com, Daniele Testa

Inline,

On Tue, Mar 10, 2015 at 8:03 PM, <karl.qu...@ticketfly.com> wrote:

> What, specifically, needs to be replicated? I'm not intimately familiar with the Scalr internals; I only know of the "big pieces". Can you point me to a "high availability best practices" document or list of things to do?

The MySQL databases need to be replicated (there are two databases: "scalr" and "mysql").
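For anyone setting this up by hand, a first step is usually a consistent dump of both databases to seed the replica. The following is a minimal sketch under stated assumptions: `build_seed_dump_command` and the output file name are hypothetical, credentials are assumed to come from a MySQL option file, and the flags are standard `mysqldump` options rather than anything Scalr-specific.

```python
import shlex


def build_seed_dump_command(databases=("scalr", "mysql"),
                            output_path="scalr-seed.sql"):
    """Build a mysqldump command line for seeding a replica.

    Hypothetical helper for illustration; not part of Scalr.
    """
    cmd = [
        "mysqldump",
        # Consistent snapshot for transactional (InnoDB) tables.
        "--single-transaction",
        # Record the binary log position as a comment in the dump.
        # Newer MySQL releases call this option --source-data.
        "--master-data=2",
        "--result-file=" + output_path,
        "--databases",
    ]
    cmd.extend(databases)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(shlex.quote(part) for part in build_seed_dump_command()))
```

The recorded binary log position is what the replica would be pointed at when replication is started; that part is left out here because it depends on the MySQL version in use.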

We don't have HA documentation right now (though we cover this during trainings for customers), but we'll probably add some in the near future (it's not that we don't want to, more so that I haven't found the time to publish it just yet).

> Is there any documentation on where that scalr-agent userdata lives? I ask mostly to satisfy my own curiosity, as well as to see how invasive the changes are. I 100% understand that I'd be messing around with things I shouldn't and that you won't be able to help if I come back asking for help w/ those instances.

In your cloud's user data (e.g. EC2 user data, OpenStack metadata).
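On EC2, user data can be inspected from inside the instance through the instance metadata service. The sketch below assumes the classic unauthenticated endpoint; instances that enforce IMDSv2 require a session token first, which is not shown. `read_user_data` is a hypothetical helper name, and what Scalr actually stores in user data is not covered in this thread.

```python
import urllib.request

# Standard EC2 instance metadata address for user data (link-local,
# reachable only from inside the instance).
USER_DATA_URL = "http://169.254.169.254/latest/user-data"


def read_user_data(opener=urllib.request.urlopen, timeout=2):
    """Fetch this instance's user data and return it as text.

    `opener` is injectable so the function can be exercised off-instance.
    Hypothetical helper for illustration; not part of Scalr.
    """
    with opener(USER_DATA_URL, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    try:
        print(read_user_data())
    except OSError as exc:
        # Expected when run anywhere other than an EC2 instance.
        print("could not read user data:", exc)
```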

> Additionally, I have one other question:
>
> I know that Scalr uses Chef solo/zero to install itself. Is there any official support or documentation for using a Chef run that I control to deploy the server? Basically, we'd like our entire infrastructure, including core parts like Scalr, to come up that way. Is using Chef to manage the Scalr config.yaml recommended or supported at all?

Scalr does use Chef Solo to configure itself (as well as a Chef binary that ships within the Scalr package), but I would suggest you view this as an implementation detail, for the following two reasons:

- It might change in the future (though that's a bit unlikely).

- You have to keep the installer cookbook in sync with the packages you deploy (which is why the cookbook itself is included in the packages).

If you nonetheless want to use the Scalr cookbook yourself (and not use the cookbook or Chef binaries that come with the installer), then you can find it here: https://github.com/Scalr/installer-ng/tree/master/files/scalr-server-cookbooks/scalr-server

Finally, please note that there are more config files than just the config.yml. If you want to execute Chef yourself, I very strongly recommend you nonetheless use our cookbook.

Cheers,

karl.qu...@ticketfly.com

Mar 10, 2015, 7:53:32 PM

to scalr-...@googlegroups.com, dan...@scalr.com

Awesome, thanks!

I'll keep that in mind w/ the database replication.

Ah, yeah, editing the instance metadata will be difficult once launched :).

Thanks for the feedback about the installer; I'll keep this in mind / pass it along.