Alternative to etcd, possible sharded/partitioned KV store


Deepak Vij

Feb 7, 2020, 6:22:44 PM
to kubernetes-sig-scale

Hi all, I would like to reach out to you folks regarding alternatives to the current underlying “etcd” data store. I recently started looking into this topic. I also saw that this particular topic was discussed during the following three Scalability weekly meetings:

  • 12/19/2019
  • 12/05/2019
  • 11/07/2019

I remember that there was an ongoing discussion on possibly leveraging TiKV as the partitioned KV data store. Also, a while back I had an in-depth discussion with the Apache Ignite KV community on all this; they showed interest at the time. Apache Ignite is a partitioned, in-memory, persistence-backed KV data store. Unlike other KV data stores, Ignite could possibly be leveraged for the underlying caching as well; I am not sure about this, but it is worth looking into.

Also, I remember that one of the meeting notes mentioned an ongoing discussion about adjusting the “Watch” semantics as well. I am wondering where we are on all this: is this a prerequisite task before looking at an alternative KV data store as a possible replacement for “etcd”?

In any case, it would be good to sync up with you folks and hopefully learn more about all this. I will also try to sync up with you during the regular weekly meetings.

Regards,

Deepak Vij

Daniel Smith

Feb 7, 2020, 6:49:22 PM
to Deepak Vij, K8s API Machinery SIG, kubernetes-sig-scale
+K8s API Machinery SIG is actually the relevant sig for the persistence layer, not sig scale.

We've extensively discussed this in the past, e.g. here, here, here, here, here


Wojciech Tyczynski

Feb 10, 2020, 3:16:52 AM
to Daniel Smith, Deepak Vij, K8s API Machinery SIG, kubernetes-sig-scale
On Sat, Feb 8, 2020 at 12:49 AM 'Daniel Smith' via kubernetes-sig-scale <kubernetes...@googlegroups.com> wrote:
> +K8s API Machinery SIG is actually the relevant sig for the persistence layer, not sig scale.
>
> We've extensively discussed this in the past, e.g. here, here, here, here, here

Agree with Daniel - we've been discussing this a little bit in our meetings, but it was more like "we should have a good motivation for doing that".
Scalability may obviously be this motivating factor; that's why I said that having a POC proving that we can go X times higher than with etcd may be a good starting point.

But the ultimate decision maker for this effort would be SIG apimachinery (though please involve me in discussions if you start them). 

Daniel Smith

Feb 10, 2020, 11:29:36 AM
to Wojciech Tyczynski, Deepak Vij, K8s API Machinery SIG, kubernetes-sig-scale
On Mon, Feb 10, 2020 at 12:16 AM Wojciech Tyczynski <woj...@google.com> wrote:

> On Sat, Feb 8, 2020 at 12:49 AM 'Daniel Smith' via kubernetes-sig-scale <kubernetes...@googlegroups.com> wrote:
>> +K8s API Machinery SIG is actually the relevant sig for the persistence layer, not sig scale.
>>
>> We've extensively discussed this in the past, e.g. here, here, here, here, here
>
> Agree with Daniel - we've been discussing this a little bit in our meetings, but it was more like "we should have a good motivation for doing that".
> Scalability may obviously be this motivating factor; that's why I said that having a POC proving that we can go X times higher than with etcd may be a good starting point.
>
> But the ultimate decision maker for this effort would be SIG apimachinery (though please involve me in discussions if you start them).

Yeah, don't worry, in the unlikely event we undertook it, this would be a large effort and we'd make sure everyone got a chance to be involved. :)

Mike Spreitzer

Feb 11, 2020, 2:46:00 PM
to kubernetes-sig-scale
Can you say what problem you are trying to solve with a change to the apiserver's backing store?

I am doing some challenging work that stresses the Kubernetes watch notification rate (to Kubernetes watch clients) and storage volume. I am fairly confident that I can get the Kubernetes watch notification rate I need by using enough Kube apiservers. I am more concerned about data volume. For that, another possible solution approach that occurs to me is introducing sharding among multiple backing stores (etcd clusters) for a given object type. This would mean that for LIST and WATCH the ResourceVersion would be a version vector; I think that may be doable.
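
To make the "version vector" idea concrete, here is a minimal Go sketch of what a sharded ResourceVersion might look like. Everything in it (the ShardedRV type, the shard IDs, the encoding) is hypothetical; nothing of the sort exists in Kubernetes today.

package main

import (
	"fmt"
	"sort"
	"strings"
)

// ShardedRV is a hypothetical ResourceVersion for an object type whose
// storage is sharded across several etcd clusters: one revision per shard.
type ShardedRV map[string]int64 // shard ID -> revision observed on that shard

// Encode flattens the version vector into a single opaque string, which is
// what a client would carry in list/watch metadata instead of today's
// single integer.
func (rv ShardedRV) Encode() string {
	ids := make([]string, 0, len(rv))
	for id := range rv {
		ids = append(ids, id)
	}
	sort.Strings(ids) // deterministic order so equal vectors encode identically
	parts := make([]string, 0, len(ids))
	for _, id := range ids {
		parts = append(parts, fmt.Sprintf("%s=%d", id, rv[id]))
	}
	return strings.Join(parts, ",")
}

// AtLeast reports whether rv is component-wise >= other, i.e. whether state
// read at rv reflects everything a client already saw at other. With a plain
// integer ResourceVersion this is one comparison; with sharding it has to
// hold on every shard.
func (rv ShardedRV) AtLeast(other ShardedRV) bool {
	for id, rev := range other {
		if rv[id] < rev {
			return false
		}
	}
	return true
}

func main() {
	listRV := ShardedRV{"shard-0": 1041, "shard-1": 987}
	watchRV := ShardedRV{"shard-0": 1040, "shard-1": 990}

	fmt.Println(listRV.Encode())         // shard-0=1041,shard-1=987
	fmt.Println(listRV.AtLeast(watchRV)) // false: shard-1 is behind
}

The interesting part is AtLeast: unlike today's single-integer ResourceVersion, two sharded versions can be incomparable, which is one of the design questions such a scheme would have to settle.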

Thanks,
Mike

Deepak Vij

Feb 14, 2020, 8:01:10 PM
to Mike Spreitzer, kubernetes-sig-scale

True, sharding the underlying “etcd” datastore may have ramifications as far as resource versioning spanning all the shards is concerned. We have not thought through these design issues yet.

I was exploring replacing the “etcd” datastore with something like the in-memory Ignite KV data store or TiKV. Based on Daniel’s feedback, creating a clean “Storage layer” abstraction that is agnostic of the underlying data store would be a big effort, as the API server is currently tightly intertwined with etcd3. As Daniel suggested earlier, a shim layer that converts the etcd3 protocol to the Ignite/TiKV protocol may be the starting point. Thanks.
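
For what it is worth, here is a very rough Go sketch of what such a shim could look like. Everything in it (the Backend interface, the simplified request/response types, the package name) is hypothetical, and the real etcd3 gRPC surface is much larger, including transactions, leases, compaction, and watch.

package etcdshim

import "context"

// Minimal stand-ins for the etcd3 KV request/response shapes. A real shim
// would implement the generated etcdserverpb KV gRPC service instead of
// these simplified types.
type KeyValue struct {
	Key, Value  []byte
	ModRevision int64
}

type RangeRequest struct{ Key, RangeEnd []byte }

type RangeResponse struct {
	Kvs      []*KeyValue
	Revision int64 // header revision; surfaced by the API server as ResourceVersion
}

type PutRequest struct{ Key, Value []byte }

type PutResponse struct{ Revision int64 }

// Backend is a hypothetical adapter over Ignite, TiKV, or any other KV store
// that can serve ordered range scans and report a monotonically increasing
// revision for writes.
type Backend interface {
	Scan(ctx context.Context, start, end []byte) ([]*KeyValue, int64, error)
	Write(ctx context.Context, key, value []byte) (int64, error)
}

// Shim translates etcd3-style KV calls onto the backend, so the API server's
// existing etcd3 storage code would, in principle, not need to change.
type Shim struct{ backend Backend }

func (s *Shim) Range(ctx context.Context, r *RangeRequest) (*RangeResponse, error) {
	kvs, rev, err := s.backend.Scan(ctx, r.Key, r.RangeEnd)
	if err != nil {
		return nil, err
	}
	return &RangeResponse{Kvs: kvs, Revision: rev}, nil
}

func (s *Shim) Put(ctx context.Context, r *PutRequest) (*PutResponse, error) {
	rev, err := s.backend.Write(ctx, r.Key, r.Value)
	if err != nil {
		return nil, err
	}
	return &PutResponse{Revision: rev}, nil
}

A Range/Put translation like this would only be a starting point; faithfully reproducing etcd3's MVCC revisions, transactions, and watch semantics on top of another store is presumably where most of the effort lies.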

