VTTablets Replication Controller


rishab...@tothenew.com

unread,
Jul 6, 2015, 5:49:12 AM7/6/15
to vit...@googlegroups.com
Hey Alain/Anthony,

While setting up and running things, I observed a serious issue with the vttablet script. It looks like the vttablet pods are not registered with any RC. Therefore, when a tablet goes down, there is no failsafe in place, and the data will be lost.

Can we get this script to use a Kubernetes replication controller, so that when vttablets go down, data loss can be handled in real time?

Thanks
Rishabh

Anthony Yeh

unread,
Jul 6, 2015, 6:34:56 AM7/6/15
to rishab...@tothenew.com, vit...@googlegroups.com
I agree that vttablets should be managed by replication controllers. It's another thing on my To Do list. :)

Ideally we could have one RC to produce all the replicas for a given shard. The missing piece when I last looked at it was that there's no concept of identity for replicas. Currently we use the vttablet-up.sh script to assign each tablet a unique ID, which is used to track the tablet in topology. I had to do that because replication controllers didn't have any way to assign unique IDs in a deterministic way.

We can't use random IDs because each tablet records its existence in topology by its ID. If that tablet disappears, we currently have no way to guarantee that it will be removed from topology. We get around that by having an external system that assigns consistent IDs - that is, the process that is brought up to replace the tablet that disappeared will be assigned the same ID as the one that disappeared. That way the new process knows which ID to clean up and take ownership of in topology.
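To make the consistent-ID scheme concrete, here is a minimal sketch in the spirit of vttablet-up.sh. The `UID_BASE` value, cell name, and alias format are illustrative assumptions, not the exact values the real script uses; the point is only that the same replica index always maps to the same UID, so a replacement process can claim the old topology record.

```shell
#!/bin/sh
# Sketch: deterministic tablet ID assignment. Each replica index
# always maps to the same UID, so a replacement tablet knows which
# topology record to clean up and take ownership of.
UID_BASE=100    # assumed base offset for this shard
CELL=test       # assumed cell name

for index in 0 1 2; do
  uid=$((UID_BASE + index))
  # The tablet alias (cell plus zero-padded UID) is what gets
  # recorded in topology.
  printf 'tablet alias: %s-%010d\n' "$CELL" "$uid"
done
```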

Our options are to figure out a way to not need consistent IDs assigned from outside, or to somehow get replication controllers to assign them. A few months ago, I talked to some Kubernetes folks who said the latter is a use case that they've seen come up in various situations. So they are interested in having an answer for it, although they were not sure what form it would take. I haven't checked yet whether there has been any progress on that front.

One thing that was pointed out to me is that normally if a container or pod fails, even if there is no replication controller, the kubelet will take care of restarting it with the same configuration on the same node. Using the current naive vttablet-up.sh script, the only way you could end up with a tablet failing and not getting replaced is if the node itself fails.

Let us know if you have any suggestions or find anything that might help here.

Thanks,
Anthony


rishab...@tothenew.com

unread,
Jul 6, 2015, 8:08:01 AM7/6/15
to vit...@googlegroups.com, eni...@google.com, rishab...@tothenew.com
Just wanted to ask: can't we use the ping result to know whether the tablet is working or not?

Also, just a thought: can't we use tagging to identify the pods in the group? That way, we could assign random IDs to the pods and just listen for all pods with a particular tag like "vttablet".
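For illustration, the tagging idea above would use Kubernetes labels. A rough sketch (the `component: vttablet` label key/value is an assumption, not an existing convention in the Vitess configs):

```yaml
# In the pod template:
metadata:
  labels:
    component: vttablet   # assumed label; identifies all tablet pods

# In anything that wants to address every tablet (e.g. a service):
selector:
  component: vttablet
```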

I used the basic scripts to bring up a Kubernetes cluster on AWS; when a pod fails, it is not getting recreated. I don't know if some additional flag is required to switch that functionality on.

One more thing: as far as I can tell, there is no health-check mechanism in place.

Thanks,
Rishabh


Anthony Yeh

unread,
Jul 6, 2015, 2:51:43 PM7/6/15
to rishab...@tothenew.com, vit...@googlegroups.com
We can use a health check to know if a tablet is currently working, but that signal alone can't tell us when the tablet is completely gone and we should stop trying to check its health. We could require some kind of heartbeat to keep a tablet's record in topology, but that would be a major architectural change with its own trade-offs.

If we were running only inside Kubernetes, it might be possible to use labels to target replicas, and not care about storing tablets in topology at all. But we don't want to be totally dependent on Kubernetes, so we'd still need to do our own load-balancing when outside it.

That's weird that the pod isn't being restarted. What kind of failure do you mean? The process died sometime after the pod first entered Running state? Or do you mean an error in scheduling of the pod itself (like if it gets stuck in Pending and never goes to Running)?

Now that the kubelet provides the "livenessProbe" and "readinessProbe" fields, it should be straightforward to hook these up. For example, we have this handler, which should be suitable for readinessProbe:


For the livenessProbe we should add a handler that just responds instantly to prove the process isn't deadlocked. Problems other than deadlock are unlikely to be helped by restarting the process - for example, if the tablet has fallen behind on replication.
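For reference, once such handlers exist, wiring them up would look roughly like this pod-spec fragment. This is only a sketch: the port number and the `/healthz` and `/debug/health` paths are assumptions for illustration, not confirmed vttablet endpoints.

```yaml
containers:
  - name: vttablet
    # ...image, command, etc...
    livenessProbe:
      httpGet:
        path: /healthz        # assumed: trivial handler that responds instantly
        port: 15002           # assumed vttablet port
      initialDelaySeconds: 30
      timeoutSeconds: 5
    readinessProbe:
      httpGet:
        path: /debug/health   # assumed path for the handler mentioned above
        port: 15002
      timeoutSeconds: 5
```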

This is also on my To Do list. :)

Robert Navarro

unread,
Oct 26, 2016, 7:15:20 PM10/26/16
to vitess, rishab...@tothenew.com
Sorry to dig up an old thread, but I had the same question.

I understand the concerns you have, Anthony, but couldn't we just set up a replication controller for each pod (and only have one pod behind it)?

This would allow for restarts and uniqueness in the cluster.

Anthony Yeh

unread,
Oct 26, 2016, 8:37:15 PM10/26/16
to vitess, rishab...@tothenew.com
Yes, it should work fine to create one RC of size 1 for each tablet. I didn't set up the example to do this just because I think it's sort of hacky. I wanted to do it the "right way" instead, which is with StatefulSet (formerly known as PetSet). However, I got pulled off of Kubernetes-related work due to other priorities. I'm switching to Kubernetes full-time now, so I plan to take a fresh look at all this once I catch up on the latest developments.
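For anyone finding this thread later, the one-RC-per-tablet workaround would look roughly like this. A sketch only: the names, label key, UID, and image are placeholders, not values from the Vitess example configs.

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: vttablet-100          # one RC per tablet, named by its assumed UID
spec:
  replicas: 1                 # exactly one pod behind each RC
  selector:
    tablet-uid: "100"
  template:
    metadata:
      labels:
        tablet-uid: "100"
    spec:
      containers:
        - name: vttablet
          image: vitess/lite  # assumed image
          # command/flags would pass the fixed UID to vttablet, so the
          # replacement pod reclaims the same topology record
```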