|Post: Brandon Philips Explains etcd||Phil Whelan||3/18/14 5:02 PM|
I just posted an interview with Brandon Philips that I did last week. It’s focused on etcd, with some discussion on how it relates to Docker.
|Re: [docker] Post: Brandon Philips Explains etcd||Evan Krall||3/18/14 6:47 PM|
I've got a couple questions / requests for clarification:
"It's a data-store for really important information. It's tolerant of nodes going down. It gives you a way to store configuration data with consistent changes and distributed locks. The data is always available and always correct."
Did he just claim that etcd is consistent, available, and partition-tolerant? This is generally considered to be impossible: two nodes that can't talk to each other (are partitioned) cannot possibly both accept writes and still contain the same data.
In the face of quorum loss, does etcd stay available for reads (possibly returning stale data), stay available for both reads and writes (possibly serving diverged views of the data), or refuse both reads and writes, guaranteeing that nobody receives data that could be incorrect?
"ZooKeeper is not recommended for virtual environments. This is the key reason ActiveState chose Doozerd over ZooKeeper when we added clustered configuration into our Cloud Foundry solution, Stackato."
You also brought this up in the blog post from last month about Docker, but you haven't provided much detail about why you think ZooKeeper is inappropriate for a virtual environment. What issues does ZooKeeper run into in virtual environments, and how do Doozerd and etcd avoid the same issues?
|Re: [docker] Post: Brandon Philips Explains etcd||Brandon Philips||3/18/14 9:00 PM|
On Tue, Mar 18, 2014 at 6:47 PM, Evan Krall <kr...@yelp.com> wrote:
You are right, etcd doesn't solve CAP. This is the problem with discussing distributed systems in an informal chat. :)
The underlying consensus algorithm for etcd is Raft; it is consistent and partition tolerant in CAP terms. What I meant by "available" is that the data is available for reads when quorum is lost.
In the face of quorum loss you can continue to read by default. If you
want to have consistent reads you can add the consistent=true flag.
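To make the distinction concrete, here is a minimal sketch of the two read modes Brandon describes (illustrative Python, not etcd's actual code; the function names are mine): a default read is answered from whatever state the local node has and may be stale during a partition, while a consistent read requires a quorum (a strict majority of the cluster) and is refused without one.

```python
# Sketch only: the read-path semantics described above, not real etcd code.

def majority(cluster_size):
    # A quorum is a strict majority: 2 of 3, 3 of 5, and so on.
    return cluster_size // 2 + 1

def read(key, local_data, consistent, reachable_nodes, cluster_size):
    if not consistent:
        # Default read: served from local state; may be stale in a partition.
        return local_data.get(key)
    if reachable_nodes < majority(cluster_size):
        # Consistent read: refused rather than possibly wrong.
        raise RuntimeError("no quorum: consistent read refused")
    # With quorum, the value can be confirmed through the leader.
    return local_data.get(key)

data = {"message": "hello"}

# A partitioned minority node can still serve a default read:
print(read("message", data, consistent=False, reachable_nodes=1, cluster_size=5))

# ...but a consistent read from that minority is refused:
try:
    read("message", data, consistent=True, reachable_nodes=2, cluster_size=5)
except RuntimeError as e:
    print(e)
```

This is the CP trade-off in miniature: during quorum loss you keep (possibly stale) read availability by default, and give up availability for consistency when you ask for it.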
|Re: [docker] Post: Brandon Philips Explains etcd||Evan Krall||3/18/14 9:03 PM|
Thanks for the clarification, Brandon; that's very helpful.
|Re: [docker] Post: Brandon Philips Explains etcd||Li Xiang||3/18/14 9:04 PM|
etcd is a CP system. As you state, a system with all three CAP properties is impossible. I believe the practical assumption is that in most cases a majority of nodes are working properly.
Consistency means the same data at the same time, and in ZK, etcd, or any similar system a logical clock is used to represent time. Doozer has a version as the logical clock for each key.
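As a concrete sketch of that logical-clock idea (illustrative Python, not any real client API; `Store` and `set_if_version` are hypothetical names): a per-key version lets a writer say "apply my change only if nobody has written since I read", which is the compare-and-swap style operation these systems build on.

```python
# Sketch: a version-guarded write using a per-key version as a logical clock.
# Not a real etcd/ZK/Doozer client; names are illustrative only.

class Store:
    def __init__(self):
        self._data = {}  # key -> (value, version)

    def get(self, key):
        # Unset keys read as (None, 0): version 0 means "never written".
        return self._data.get(key, (None, 0))

    def set_if_version(self, key, value, expected_version):
        """Write only if the key's version (logical clock) still matches."""
        _, version = self.get(key)
        if version != expected_version:
            return False  # someone else wrote first; caller must re-read and retry
        self._data[key] = (value, version + 1)
        return True

s = Store()
print(s.set_if_version("config", "a", 0))      # first write succeeds
print(s.set_if_version("config", "b", 0))      # stale version, rejected
print(s.set_if_version("config", "b", 1))      # read-modify-write succeeds
```

The point is that "same data at the same time" is judged against the version counter, not wall-clock time, so two concurrent writers can never both believe their update won.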
|Re: [docker] Post: Brandon Philips Explains etcd||Phil Whelan||3/19/14 10:30 AM|
On Tue, Mar 18, 2014 at 6:47 PM, Evan Krall <kr...@yelp.com> wrote:
I've got a couple questions / requests for clarification: "It's a data-store for really important information. It's tolerant of…"
Great point, Evan.
Thanks Brandon. I'll add an update to the post.
I should also note that this quote does not include certain assumptions that were mentioned later in the post, such as only using a small dataset. This is not your average key-value data-store. It's designed to do a specific job well.
I don't want to stray too far from the Docker path on this list, so I will try to be brief…
The reason below is why we previously went with Doozerd over ZK. We're creating a virtual appliance and we want it to be able to run anywhere.
|Re: [docker] Post: Brandon Philips Explains etcd||Ranjib Dey||3/19/14 11:29 AM|
I want to add a couple of other points on why etcd may be preferred over ZooKeeper:
1) operational efficiency:
a) as of now it's not possible to dynamically resize a ZooKeeper cluster, i.e. add more members without restarting the cluster
b) monitoring: it's still not possible to get stats about the entire ZK cluster in one API call; trivial things like finding out who the master is require multiple queries across the cluster. etcd provides a stats API (leader, followers, state, etc.) out of the box
c) deployment: ZooKeeper deployment requires some extra tooling (though this is not specific to ZK; it's true of most cluster-based services) to capture context (who the leader is, who the existing members are) during provisioning. etcd provides discovery/bootstrapping (this functionality is still being refined), where one can bootstrap a cluster by pointing it at a discovery endpoint (a pre-existing etcd cluster). etcd also provides dynamic configuration manipulation over the API.
2) client-side simplicity:
a) ZooKeeper does not provide locking directly; it provides primitives for building it. This means locks/barriers are implemented with the help of client-side logic as well, which has resulted in duplicated effort across the client libraries. etcd, on the other hand, provides a core set of modules (lock, leader election) out of the box. This does not stop anyone from using the atomic key/value manipulation operations directly, but it does let all bug fixes/effort go into one place, which every client can use
b) a simple HTTP-based, REST-like interface
ZooKeeper is awesome, and I think etcd addresses similar use cases but with much better operational benefits (and with cloud-based deployments in mind).
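To illustrate the lock point above (a sketch in Python, not etcd's actual lock module; `TtlLock` and its methods are hypothetical names): a server-side lock with a TTL can be implemented once and shared by every client, instead of each client library re-implementing the ZK lock recipe from primitives. The TTL means a crashed holder can't wedge the cluster forever.

```python
# Illustrative only: a TTL-based lock of the kind a server can offer as a
# built-in module, so clients don't each reimplement the recipe.
import time

class TtlLock:
    def __init__(self):
        self._holder = None
        self._expires = 0.0

    def acquire(self, owner, ttl, now=None):
        # 'now' is injectable for testing; real callers use the clock.
        now = time.monotonic() if now is None else now
        if self._holder is None or now >= self._expires:
            self._holder, self._expires = owner, now + ttl
            return True
        return False  # held by someone else and not yet expired

    def release(self, owner):
        if self._holder == owner:
            self._holder = None
            return True
        return False  # only the current holder may release

lock = TtlLock()
print(lock.acquire("node-a", ttl=5, now=0))   # free, so node-a gets it
print(lock.acquire("node-b", ttl=5, now=1))   # held and unexpired, refused
print(lock.acquire("node-b", ttl=5, now=10))  # TTL expired, node-b takes over
```

Centralizing this logic server-side is exactly the "all fixes in one place" benefit Ranjib describes: every client, in every language, gets the same (debugged) behavior.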