On 12/10/12 7:46 PM, Michael Rose wrote:
Ordasity isn't unique in that regard. Not sure why anyone would use ZK as a queue. Storm uses ZeroMQ for broker-less communication.
Storm is clustered streaming computation. We use it a lot, for API consumers, log aggregation, and online learning.
It is JVM-based, but you can use shell bolts to run arbitrary scripts for processing.
Michael Rose
FullContact
Not sent from my iPhone
On 12/10/12 7:58 PM, Michael Rose wrote:
We've never had an issue with machines going away: if a machine dies, its tasks are reassigned elsewhere, and the tuples within that part of the tree never finish, so Storm replays them.
Higher-level constructs allow exactly-once semantics given compatible data sources.
I last looked at Ordasity a few months ago... I noticed you guys do a much better job of graceful load balancing / seamless deployment. One of our dislikes of Storm is the lack of built-in semantics for auto-rebalancing without downtime (or doing a blue/green style deployment).
Storm's framework seems to be at a higher level, whereas it seems you could build almost any clustered service using Ordasity. I also like the lack of a master.
A question I have about Ordasity: do you guys build any kind of real-time query workloads on it? We mainly use Storm for background processing and find its DRPC mechanism clunky, so I'm always looking for something more elegant with regard to real-time queries.
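As a rough illustration of the replay behaviour described above, here is a minimal Python sketch of at-least-once processing. It is not Storm's implementation (Storm tracks tuple trees inside the cluster); `process_with_replay`, `handler`, and `max_attempts` are hypothetical names made up for this example.

```python
from collections import deque


def process_with_replay(tuples, handler, max_attempts=3):
    """Process every tuple at least once, re-queuing any whose handler fails.

    Hypothetical helper for illustration only; not part of Storm.
    """
    queue = deque((item, 1) for item in tuples)
    completed, dropped = [], []
    while queue:
        item, attempt = queue.popleft()
        try:
            handler(item)
            completed.append(item)
        except Exception:
            if attempt < max_attempts:
                # The tuple never finished, so it goes back for another try.
                queue.append((item, attempt + 1))
            else:
                dropped.append(item)
    return completed, dropped
```

Because a replayed tuple may already have caused side effects before the failure, this gives at-least-once processing only, which is why exactly-once needs the higher-level constructs and compatible data sources mentioned above.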
On Monday, December 10, 2012 at 8:48 PM, Cliff Moon wrote:
Failure recovery. A thing that Storm does not have, to my knowledge (we last evaluated it a year ago).
On Dec 10, 2012 6:52 PM, "Kelly Sommers" <kell.s...@gmail.com> wrote:
Ordasity was pushed to GitHub the other day by Boundary. Ordasity is a library for building distributed stateful clustered services on the JVM. I'd love to see this become an agent service so that any language can benefit from its abilities because I think it has a ton of potential. I've seen some other projects use ZooKeeper as a queue, but Ordasity takes a different approach which I am liking a lot after Cliff described it to me a little bit. Ordasity uses ZooKeeper for all the coordination but the actual work occurs outside of Ordasity however you want.
Still trying to wrap my head around how it all works but I really like the ability to schedule and coordinate work in a masterless fashion.
Hopefully I got that all correct, I'm sure Cliff will correct me if I haven't :)
--
You received this message because you are subscribed to the Google Groups "Distributed Systems" group.
To post to this group, send email to distsys...@googlegroups.com.
To unsubscribe from this group, send email to distsys-discu...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
On Monday, December 10, 2012 at 9:03 PM, Cliff Moon wrote:
Ordasity processes Boundary's entire inbound data stream, so it's all real-time workload for us. And you're right about being able to build anything with it. Our philosophy towards building distributed systems tends towards unix-like: many small primitives that can be composed into what you want. At scale everything is custom; so why shoot yourself in the ass by building a framework?
On 12/10/12 8:26 PM, Michael Rose wrote:
I noticed you register work units in ZooKeeper. What would a work unit be? A hint to start consuming from a message queue or to listen on a certain port? It's not really using ZK as an MQ so much as a checkpointing and coordination mechanism, right?
Apologies for the questions, but I'd really love to understand Ordasity. :)
On Monday, December 10, 2012 at 9:41 PM, Cliff Moon wrote:
Correct, Ordasity does not use ZooKeeper as a queue. That would be incredibly stupid. Currently, for us, a work unit is all of a particular customer's traffic. Customers can connect to any number of (mostly) stateless collectors, which then expose a pubsub interface internally. Downstream components use Ordasity to claim work units and subscribe to the appropriate event streams from the collectors. We're reaching some scaling limits here, so we'll likely switch to making the work unit a vnode type of construct.
And it's important to note that Ordasity does not prescribe a transport mechanism *at all*. So, if ZeroMQ is your thing, then by all means use that to send event streams banging around the cluster. Ordasity simply exists for coordinating work units, surviving failures, and having the ability to manually push load around a running cluster without taking downtime.
Also, in order to survive failure during something like stream processing, you need to build in your own replay mechanisms. We use Kafka, coordinated by Ordasity, to replay state during hard failures in the cluster, i.e. JVMs going into GC death spirals.
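To make the work-unit idea concrete, here is a toy, in-memory Python sketch of nodes claiming work units and picking up the units of a failed node. It is not Ordasity's API and does not talk to ZooKeeper; the `Cluster` class, its methods, and the fair-share rule are all assumptions made for illustration.

```python
from math import ceil


class Cluster:
    """Toy stand-in for a coordination store (hypothetical, in-memory only)."""

    def __init__(self, work_units):
        self.work_units = list(work_units)
        self.claims = {}    # work unit -> name of the node that claimed it
        self.nodes = set()  # names of live nodes

    def join(self, node):
        self.nodes.add(node)

    def leave(self, node):
        """Simulate a node failure: its claims vanish, much as an ephemeral
        ZooKeeper node disappears when its session ends."""
        self.nodes.discard(node)
        self.claims = {u: n for u, n in self.claims.items() if n != node}

    def fair_share(self):
        """Upper bound on units per node if load is spread evenly."""
        return ceil(len(self.work_units) / max(len(self.nodes), 1))

    def claim(self, node):
        """Let `node` claim unclaimed units until it holds its fair share."""
        held = [u for u, n in self.claims.items() if n == node]
        for unit in self.work_units:
            if len(held) >= self.fair_share():
                break
            if unit not in self.claims:
                self.claims[unit] = node
                held.append(unit)
        return held
```

With four customer work units and two nodes, each node ends up with two units; if one node leaves, the next `claim` call on the survivor picks up the orphaned units. What a node does once it holds a unit (for example, subscribing to a collector's event stream) is outside the sketch, just as the transport is left to the application in the description above.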
Cliff,
If Ordasity was to enable any language to take part, my thoughts were
that Ordasity could run as an agent service on each node. What do you
think the best way to communicate with it would be? I'm guessing by
the looks of the descriptions it needs to be a bidirectional thing
because Ordasity may call your code at any moment?
https://github.com/boundary/ordasity
If you were to develop it in-house, what format or protocol do you
think would fit best?
Expressing some guidance might not be a bad idea :)
On 2012-12-10, at 11:59 PM, Cliff Moon <cl...@boundary.com> wrote:
Conceptually, it would not be terrible to support a model where each
worker would keep an open connection to its local agent. Worker
connects, announces a reasonable and unique name, asks for work and
then goes about its business. The openness of the connection can be
used as a proxy for the participation of the worker and everything
else would work as expected. The most difficult parts would probably
be bulletproofing the client protocol and figuring out how best to
time out the client connections for hung workers.
We don't really have the time to develop something like this
in-house, but patches are very welcome.
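The agent model described above could look roughly like the following Python sketch. Everything here is hypothetical (the `Agent` class, its method names, and the timeout rule); timestamps are passed in explicitly so the behaviour is easy to follow, and a real agent would sit on top of actual socket connections.

```python
class Agent:
    """Hypothetical local agent that tracks workers by connection liveness."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # worker name -> time of its last message
        self.assigned = {}   # worker name -> work units handed to it

    def connect(self, name, now):
        """A worker opens a connection and announces a unique name."""
        if name in self.last_seen:
            raise ValueError(f"worker name already in use: {name}")
        self.last_seen[name] = now
        self.assigned[name] = []

    def request_work(self, name, unit, now):
        """Record that `unit` was handed to the named worker."""
        self.last_seen[name] = now
        self.assigned[name].append(unit)

    def heartbeat(self, name, now):
        """Any message from the worker counts as a sign of life."""
        self.last_seen[name] = now

    def disconnect(self, name):
        """Connection closed: the worker stops participating.

        Returns the work units that are now free to be claimed elsewhere.
        """
        self.last_seen.pop(name, None)
        return self.assigned.pop(name, [])

    def reap_hung(self, now):
        """Drop workers silent for longer than the timeout; return their units."""
        hung = [n for n, t in self.last_seen.items() if now - t > self.timeout]
        released = []
        for name in hung:
            released.extend(self.disconnect(name))
        return released
```

The open connection doubles as the membership signal, and `reap_hung` is one simple answer to the hung-worker problem: a worker that keeps its socket open but stops sending anything is treated as gone once the timeout passes. Choosing that timeout well is the hard part noted above.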