Vitess persistent storage

Matteo Candido

May 25, 2016, 12:27:43 PM
to vitess
Hi,
I've exported a variable like this before running the vttablet-up.sh script:
export VTDATAROOT_VOLUME=/mnt/vitess/vtdataroot
where /mnt/vitess/vtdataroot is an NFS-mounted dir.
The same dir is mounted on each of the cluster's nodes.
Is this the correct way to get MySQL persistence after a vttablet pod crashes?
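
For context, this is roughly how the NFS mount is set up on each node (the server name and export path below are placeholders):

# hypothetical NFS server and export path
sudo mount -t nfs nfs-server:/export/vtdataroot /mnt/vitess/vtdataroot
df -h /mnt/vitess/vtdataroot  # verify the mount is present on every node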


Anthony Yeh

May 25, 2016, 5:36:41 PM
to vitess
I think this should work, but you'll be limited by the performance of MySQL over NFS.

The VTDATAROOT_VOLUME variable was added for the case where we mount a local SSD in each node, which is faster than the default location of a Kubernetes emptyDir volume.

Usually what we recommend is that you use NFS only for backups, by setting -file_backup_storage_root to point to the NFS mount. Then vttablet will pull down the latest backup as needed to seed a new replica. In this setup, you achieve durability by having a large number of small replicas; that is, you reshard to the point where each shard is small enough to restore quickly. Then you rely on an intelligent scheduler like Kubernetes to distribute replica pods across hardware, so that it's very unlikely all replicas for a given shard would go down at the same time.
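
For reference, a minimal sketch of what that looks like on the vttablet command line (the backup path below is a placeholder for your NFS mount):

# sketch only; /mnt/vitess/backups is a placeholder for your NFS mount
# (your usual vttablet flags are elided)
vttablet \
  -backup_storage_implementation file \
  -file_backup_storage_root /mnt/vitess/backups \
  -restore_from_backup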

We understand that not everyone will want or be able to use this setup immediately, so we plan to work on another example setup for Kubernetes that would use Persistent Volumes to achieve durability with a smaller number of replicas. The idea there would be to use block stores like EBS or PD that would be more efficient than NFS and would attach to the nodes automatically. However, we haven't started on that yet.
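
As a rough sketch of that direction (the claim name and size below are placeholders), the claim might look like this, created with kubectl create -f pvc.yaml:

# pvc.yaml (sketch; name and size are placeholders)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vtdataroot-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi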


Matteo Candido

May 26, 2016, 3:13:28 AM
to vitess
Yep, it was just a test.
I've already configured an NFS path for backup storage, but my question is: what happens if a node goes down and no backup has been taken?

From the Vitess guide and from what I've read in this forum, the backup procedure is not automatic. Is that correct?

Anthony Yeh

Jun 8, 2016, 7:18:44 PM
to vitess
Sorry for the delayed response. I've had to reduce my time allocated for emails in order to get RC1 out. :)

If a node goes down at a time when there's no backup, then you'll need to take a backup from another slave before you can recreate any tablets that went down. If all slaves go down at the same time and there is no backup, you lose that data (assuming you used the emptyDir approach). That's why having many replicas, distributed across AZs, is important when using emptyDir. Again, the alternative is to use some kind of persistent volume if you can't deploy enough replicas to make emptyDir safe, which is something we're working on.

You're right that currently you are responsible for triggering a backup on each shard at some interval that you decide.
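
For example, a sketch of triggering one (the vtctld address and tablet alias below are placeholders):

# one-off backup of a replica tablet
vtctlclient -server vtctld-host:15999 Backup test-0000000101

# hypothetical cron entry to take it nightly at 03:00
0 3 * * * vtctlclient -server vtctld-host:15999 Backup test-0000000101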

Brandon Philips

Jun 8, 2016, 8:39:10 PM
to vit...@googlegroups.com
On Wed, Jun 8, 2016 at 4:18 PM 'Anthony Yeh' via vitess <vit...@googlegroups.com> wrote:
Again, the alternative is to use some kind of persistent volume if you can't deploy enough replicas to make emptyDir safe, which is something we're working on.

What are you working on? Sorry, I'm having a hard time parsing this.

Anthony Yeh

Jun 8, 2016, 8:50:30 PM
to vit...@googlegroups.com
Currently our examples show how to use the emptyDir method. We are working on an alternate example config that shows how to use Kubernetes Persistent Volumes for your MySQL data dir instead.
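
To illustrate the idea with a plain MySQL container (the image, names, and password below are placeholders; in the Vitess setup the analogous mount point would be the vtdataroot dir):

# pod.yaml (sketch): the data dir survives Pod death because it lives on the claim
apiVersion: v1
kind: Pod
metadata:
  name: mysql-pv-example
spec:
  containers:
    - name: mysql
      image: mysql:5.6  # placeholder image
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: example-only  # placeholder; never use in production
      volumeMounts:
        - name: data
          mountPath: /var/lib/mysql  # MySQL's data dir
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: vtdataroot-claim  # placeholder claim name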


Shouichi Kamiya

Nov 18, 2016, 3:46:17 AM
to vitess
Is it reasonable to use the emptyDir approach in production? If so, how many replicas (read-only slaves) are needed? Or should I use PersistentVolume instead?

Anthony Yeh

Nov 18, 2016, 1:31:52 PM
to vitess
As a general rule, you should use PersistentVolume (PV) unless you really must have local disk. In addition to surviving Pod death (e.g. due to Node failure), PV currently has the advantage that flag rollouts and upgrades are faster: with emptyDir, each tablet has to restore from backup during a rollout, since it starts from scratch. It also means you spend less time waiting for a Pod to come back in the event that it must be moved to a different Node (e.g. if its former Node is drained).

If you do need local disk, the emptyDir approach would be reasonable if you have "enough" replicas in "enough" different zones. The meaning of "enough" depends on your tolerance for risk - the risk being that if all replicas die at the same time, you lose everything since the last backup.

Our examples used emptyDir because at the time, provisioning PVs was a manual process that was different depending on where you run Kubernetes. Now, Kubernetes has auto-provisioners for things like AWS EBS and GCE PD, so I think it makes sense to switch our default recommendation from emptyDir to PV.
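
For example, on GCE, dynamic provisioning comes down to defining a StorageClass (the class name and disk type below are placeholders; the API group is still beta as of Kubernetes 1.4):

# storageclass.yaml (sketch)
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd

A PVC can then request that class via the volume.beta.kubernetes.io/storage-class annotation.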

We're working on a second-generation Vitess-on-Kubernetes config in the form of a Helm chart.


It supports PersistentVolume as well as the upcoming StatefulSet feature. However, currently our chart only supports these features on a Kubernetes 1.4 cluster with Alpha features enabled. As of Kubernetes 1.5 (planned for Dec release), support will advance to Beta level.
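
Once that's ready, installing it should look something like this (hypothetical; assumes you've checked out the chart sources locally):

# hypothetical invocation; the chart path is a placeholder
helm install ./vitess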

Shouichi Kamiya

Nov 19, 2016, 2:36:35 AM
to vitess
Thank you for the detailed description! It's good to know that there's ongoing work that uses StatefulSet.