What internal databases does Kubernetes use?


adnan ashraf

Oct 16, 2016, 11:37:46 PM
to Kubernetes developer/contributor discussion
I am a newbie and would appreciate it if someone could help me with the following:

Does Kubernetes use etcd as its only master data store?
If so, how does it provide the following database features that are missing in etcd?

- sorting based on multiple attributes (like order by SQL query)
- create complex queries (using multiple attributes and collection/foreign object join combinations in search query criteria)
- pagination on result of above complex query
- enforce uniqueness on attributes in a record (for example, user.email has to be unique across the whole etcd db)
- enforce data integrity rules (like foreign key enforcement in relational DB, cannot delete a record because it is referenced by another record)
- date function queries, group by queries


I am tasked with building an application that would use only etcd as its master database. Thanks so much.


Adnan

David Oppenheimer

Oct 17, 2016, 12:25:11 AM
to adnan ashraf, Kubernetes developer/contributor discussion
etcd isn't a database. It's a key-value store. Think of it like a hash table.
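
To make the hash-table analogy concrete, here is a minimal sketch in Python of the operations a key-value store gives you. It is an in-memory stand-in, not the etcd client API: the `TinyKV` class, its methods, and the `/registry/...` key layout are illustrative assumptions. Reads by exact key or by key prefix are what the store supports directly, whereas a query on a field inside the value means fetching and decoding everything on the client.

```python
import json
from typing import Dict, List, Optional, Tuple


class TinyKV:
    """In-memory stand-in for a key-value store (hypothetical, not the etcd API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def range_prefix(self, prefix: str) -> List[Tuple[str, str]]:
        # Results come back in key order; no other ordering can be requested.
        return [(k, self._data[k]) for k in sorted(self._data) if k.startswith(prefix)]


kv = TinyKV()
kv.put("/registry/pods/default/web-1", json.dumps({"node": "n2", "phase": "Running"}))
kv.put("/registry/pods/default/web-0", json.dumps({"node": "n1", "phase": "Pending"}))
kv.put("/registry/pods/kube-system/dns-0", json.dumps({"node": "n1", "phase": "Running"}))

# Lookup by key and listing by prefix are supported directly.
one = kv.get("/registry/pods/default/web-0")
default_keys = [k for k, _ in kv.range_prefix("/registry/pods/default/")]

# "WHERE node = 'n1'" has no index to use: the client scans and filters.
on_n1 = [k for k, v in kv.range_prefix("/registry/pods/") if json.loads(v)["node"] == "n1"]
```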


--
You received this message because you are subscribed to the Google Groups "Kubernetes developer/contributor discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-dev+unsubscribe@googlegroups.com.
To post to this group, send email to kubernetes-dev@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-dev/3b3c456d-bcfc-48de-b70d-1da37e3aea3a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

adnan ashraf

Oct 17, 2016, 12:31:31 AM
to Kubernetes developer/contributor discussion, adna...@gmail.com
Thanks for the reply. 

I need to understand what underlying database technology Kubernetes uses to store its data. Does it use etcd only as a caching layer while the master data is stored in some other data store (like MySQL)?

I have spent quite a bit of time understanding etcd and have used it as well. I am in the process of building a data-intensive application, and there is a push within my team to use etcd as the single source of truth for our data. I am struggling to justify using etcd as the only database when it is missing so many database features. I get the argument that since Kubernetes stores and manages all of its data in etcd, we should be able to do the same.

If I understood better how Kubernetes stores and caches data internally (and across which technologies), I would be better equipped for that discussion. Thanks.

David Oppenheimer

Oct 17, 2016, 1:00:58 AM
to adnan ashraf, Kubernetes developer/contributor discussion
etcd is the only storage system Kubernetes uses. It is not used as a caching layer for some other storage system.

etcd lacks many database features because it is not a database.

I see that you also posted your question to the etcd-dev mailing list; that's probably the best place to find out how you could use it in complex applications.



David Aronchick

Oct 17, 2016, 7:49:37 AM
to David Oppenheimer, adnan ashraf, Kubernetes developer/contributor discussion
As a follow-up, we very much do NOT recommend using the etcd instance that Kubernetes creates and uses for any other purpose. It is really an implementation detail, and using it for anything else risks destabilizing your cluster.

Eric Tune

Oct 17, 2016, 12:06:33 PM
to David Aronchick, David Oppenheimer, adnan ashraf, Kubernetes developer/contributor discussion
We chose a key-value store because we want to be able to scale to very large cluster sizes and high request rates, which we think is not possible with a traditional database. We chose not to use multi-object transactions (think of them as multi-row transactions) because relying on these would hurt our ability to re-architect the system into separate components in the future as needed.
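
As a rough illustration of what working without multi-object transactions means, the Python sketch below assumes a store that versions each key and that the application limits itself to compare-and-swap on one key at a time. `VersionedKV` and its methods are hypothetical names for this example, not the etcd or Kubernetes API. Each object is updated atomically on its own, but a change touching two objects is two separate steps, and a reader can observe the state in between.

```python
from typing import Dict, Optional, Tuple


class VersionedKV:
    """Hypothetical store: every key carries a version, writes are per key."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, str]] = {}

    def get(self, key: str) -> Optional[Tuple[int, str]]:
        return self._data.get(key)

    def compare_and_swap(self, key: str, expected_version: int, value: str) -> bool:
        # Version 0 means "the key must not exist yet".
        current_version = self._data.get(key, (0, ""))[0]
        if current_version != expected_version:
            return False  # someone else wrote first; caller re-reads and retries
        self._data[key] = (current_version + 1, value)
        return True


kv = VersionedKV()
created_a = kv.compare_and_swap("accounts/a", 0, "100")
created_b = kv.compare_and_swap("accounts/b", 0, "0")

# Moving 30 from a to b is two independent single-key writes.
version_a, _ = kv.get("accounts/a")
step_one = kv.compare_and_swap("accounts/a", version_a, "70")
in_between = (kv.get("accounts/a")[1], kv.get("accounts/b")[1])  # visible to readers
version_b, _ = kv.get("accounts/b")
step_two = kv.compare_and_swap("accounts/b", version_b, "30")

# A writer holding a stale version is rejected instead of overwriting.
stale_write = kv.compare_and_swap("accounts/a", version_a, "999")
```

Any invariant that spans both keys has to be restored by application logic if the second step fails, which is part of why building a correct system this way is harder.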

However, it is much harder to develop a correct system on top of a k-v store than on top of a relational database.

I would recommend that you stick with a relational database unless you clearly understand why you need a k-v store.

You should not be justifying to your team why to use a relational database. Your team should be justifying to you why they need etcd.

Regarding specific items:

- sorting based on multiple attributes (like order by SQL query)

This hasn't come up.  Most clients want one thing or all the things.

- create complex queries (using multiple attributes and collection/foreign object join combinations in search query criteria)

We don't have multi-object transactions, so having a database do joins doesn't make sense because things can be in weird states.
Joining usually has to be done manually with custom logic. (Example: controller-ref.)

- pagination on result of above complex query

Would be nice to have pagination for UI clients.

- enforce uniqueness on attributes in a record (for example, user.email has to be unique across the whole etcd db)

The name of an object gets uniqueness because it is part of the key name and etcd enforces key uniqueness.
We don't get enforced uniqueness on any other fields.

- enforce data integrity rules (like foreign key enforcement in relational DB, cannot delete a record because it is referenced by another record)

We don't get these features from a k-v store.

- date function queries, group by queries

We don't get these.  Haven't missed them.
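
Two of the answers above can be sketched in a few lines of Python, assuming a plain in-memory map standing in for the store. The `create` and `list_kind` helpers, the key layout, and the `owner` field are hypothetical and only loosely modelled on the controller-ref idea; this is not Kubernetes code. Name uniqueness falls out of create-if-absent on the key, uniqueness of any other field is not checked, references are not validated, and the "join" from pods to their owner is ordinary client-side logic.

```python
import json
from typing import Dict, List

store: Dict[str, str] = {}


def create(kind: str, name: str, obj: dict) -> bool:
    """Create-if-absent: the object name is part of the key, so names are unique."""
    key = f"/{kind}/{name}"
    if key in store:
        return False
    store[key] = json.dumps(obj)
    return True


def list_kind(kind: str) -> List[dict]:
    """Return all objects of one kind in key order, with the name filled in."""
    prefix = f"/{kind}/"
    return [dict(json.loads(v), name=k[len(prefix):])
            for k, v in sorted(store.items()) if k.startswith(prefix)]


first = create("users", "alice", {"email": "a@example.com"})
duplicate_name = create("users", "alice", {"email": "other@example.com"})
# Nothing stops a second object from reusing a non-key field such as email.
duplicate_email = create("users", "bob", {"email": "a@example.com"})

create("controllers", "web", {"replicas": 2})
create("pods", "web-0", {"owner": "web"})
create("pods", "web-1", {"owner": "web"})
create("pods", "orphan", {"owner": "gone"})  # dangling reference is accepted

# Manual join: group pods by owner, tolerating owners that do not exist.
controllers = {c["name"] for c in list_kind("controllers")}
owned: Dict[str, List[str]] = {}
dangling: List[str] = []
for pod in list_kind("pods"):
    if pod["owner"] in controllers:
        owned.setdefault(pod["owner"], []).append(pod["name"])
    else:
        dangling.append(pod["name"])
```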

