How to use etcd for task distribution?


王希刚

Oct 26, 2016, 3:05:08 AM
to etcd-dev
How to use etcd for task distribution?

Brandon Philips

Oct 26, 2016, 4:13:56 PM
to 王希刚, etcd-dev

On Wed, Oct 26, 2016 at 12:05 AM 王希刚 <wangxig...@gmail.com> wrote:
How to use etcd for task distribution?

--
You received this message because you are subscribed to the Google Groups "etcd-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to etcd-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

王希刚

Oct 26, 2016, 9:59:14 PM
to Brandon Philips, etcd-dev

I cannot use k8s for task distribution; I simply want to use etcd to build a simple task distribution system. For example: N producers put messages into etcd, and then N consumers get messages from etcd, similar to pub-sub.


Russell Haering

Oct 26, 2016, 10:24:42 PM
to 王希刚, Brandon Philips, etcd-dev

The simplest model, I think, is to publish jobs under a well-known prefix. Workers watch that prefix, then take out a lock on the job ID. There are a bunch of recipes around the web for etcd-backed locking.
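
A minimal sketch of that claim-by-lock step, in Python. The `FakeKV` class is a hypothetical in-memory stand-in for etcd, not a real client; its `put_if_absent` mimics an atomic create-if-missing write, which against a real cluster would be a transaction or one of the locking recipes. The `/jobs/` and `/locks/` prefixes and the worker names are made up for illustration, and watching is reduced to a plain key listing.

```python
import threading


class FakeKV:
    """Hypothetical in-memory stand-in for etcd (NOT a real client)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def put(self, key, value):
        with self._mutex:
            self._data[key] = value

    def put_if_absent(self, key, value):
        """Atomically create key; return True only if it did not exist."""
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key):
        with self._mutex:
            self._data.pop(key, None)

    def keys(self, prefix):
        with self._mutex:
            return sorted(k for k in self._data if k.startswith(prefix))


JOBS = "/jobs/"    # assumed well-known prefix for published jobs
LOCKS = "/locks/"  # assumed prefix for per-job locks


def try_claim(kv, job_id, worker):
    """Take the lock for job_id; only one worker can succeed."""
    return kv.put_if_absent(LOCKS + job_id, worker)


def finish(kv, job_id):
    """After successful processing: remove the job, then release the lock."""
    kv.delete(JOBS + job_id)
    kv.delete(LOCKS + job_id)


kv = FakeKV()
kv.put(JOBS + "42", "resize image")       # a producer publishes a job
first = try_claim(kv, "42", "worker-a")   # worker-a wins the lock
second = try_claim(kv, "42", "worker-b")  # worker-b loses and moves on
finish(kv, "42")                          # worker-a is done with the job
remaining = kv.keys(JOBS)
```

The important property is that the claim is a single atomic operation: a worker that reads the lock key and then writes it in two separate steps can race with another worker doing the same.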

If the worker successfully processes the job, it removes the job and then releases the lock. You have a lot of choices around handling failures. Your design should try to prevent situations where a job that crashes workers, or that simply fails to process, leads to a backlog of failing jobs that blocks processing of new ones. Often this means tracking failures (including ones where a worker fails to heartbeat its lock) and eventually moving the job to an error queue for human intervention.
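
One way to sketch the failure bookkeeping, in Python. Plain dictionaries stand in for etcd prefixes here, and the names `MAX_ATTEMPTS`, `run_job`, and the `handler` callable are assumptions for illustration. A real worker would also need some expiry on its lock so that a crashed worker counts as a failure; that part is not modeled.

```python
MAX_ATTEMPTS = 3  # assumed retry budget before a human has to look


def run_job(job_id, jobs, errors, attempts, handler):
    """Run one job; return 'done', 'retry', or 'error'.

    jobs, errors, and attempts are plain dicts standing in for etcd prefixes.
    """
    payload = jobs[job_id]
    try:
        handler(payload)
    except Exception as exc:
        attempts[job_id] = attempts.get(job_id, 0) + 1
        if attempts[job_id] >= MAX_ATTEMPTS:
            # Park the failing job so it stops competing with new work.
            errors[job_id] = {"payload": payload, "last_error": str(exc)}
            del jobs[job_id]
            return "error"
        return "retry"
    # Success: remove the job (the caller then releases its lock).
    del jobs[job_id]
    attempts.pop(job_id, None)
    return "done"


def handler(payload):
    """Example handler that rejects one specific payload."""
    if payload == "bad":
        raise ValueError("cannot process payload")


jobs = {"1": "ok", "2": "bad"}
errors, attempts = {}, {}

results = [run_job("1", jobs, errors, attempts, handler)]
while "2" in jobs:
    results.append(run_job("2", jobs, errors, attempts, handler))
```

Keeping the attempt count next to the job, rather than in worker memory, means the count survives a worker crash, which is the case the error queue is meant to catch.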

This model suffers from scaling limitations around lock contention, because every worker competes for a lock on every job. To scale it further you can introduce a hash ring to limit which workers compete for which jobs. For example, instead of having a single prefix, you could hash job IDs across 1024 different prefixes, and use a service registry or secondary locking scheme to ensure that several workers are watching each prefix at all times.
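
The prefix-hashing part could look roughly like this in Python. This is simple modulo bucketing over a fixed number of prefixes, not a full consistent-hash ring, and the `/jobs/NNNN/` key layout and function names are assumptions for illustration. Assigning workers to prefixes (the service registry or secondary locking scheme) is not shown.

```python
import hashlib

NUM_SHARDS = 1024  # the number of prefixes used in the example above


def shard_prefix(job_id, num_shards=NUM_SHARDS):
    """Map a job ID to one of num_shards key prefixes, deterministically."""
    digest = hashlib.sha256(job_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % num_shards
    return f"/jobs/{shard:04d}/"


def job_key(job_id):
    """Full key under which a producer would publish this job."""
    return shard_prefix(job_id) + job_id


# Every producer and worker computes the same prefix for the same job ID.
example_key = job_key("job-123")
prefixes = {shard_prefix(f"job-{i}") for i in range(10000)}
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()`, because the built-in is randomized per process for strings and producers and workers must agree on the mapping.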

At some point your throughput will be limited by the guarantees provided by etcd. You could shard the latter mechanism across clusters, but I'd start simple and see what meets your needs.

-Russell



Vick Khera

Nov 16, 2016, 9:13:59 AM
to etcd-dev, brandon...@coreos.com


On Wednesday, October 26, 2016 at 9:59:14 PM UTC-4, Xigang Wang wrote:

I cannot use k8s for task distribution; I simply want to use etcd to build a simple task distribution system. For example: N producers put messages into etcd, and then N consumers get messages from etcd, similar to pub-sub.


I use NATS <http://nats.io/> for the pub/sub needs of my application. 