On 7 Feb 2013, at 13:26, Alberto G. Corona wrote:
> I have an application in which the ordering of messages is important. I know that among two nodes, the receiver will receive all the messages and will receive them ordered.
>
The ordering guarantee is between two *processes* rather than between nodes. If P1 sends to P2, then all messages will either be delivered in order, or not at all. If P1 and P2 are on different nodes and the network becomes disconnected, then P2 has to explicitly `reconnect' to receive further messages, and in doing so acknowledges the possibility of lost messages and/or lost ordering.
> But what happens when two or more nodes are sending messages to a central node, where the outcome depends on the order (and even on the timing) of the messages? Think, for example, of a reverse auction service where the first bid gets the lot. Imagine that there are N bidders on N nodes. (In reverse auctions, many bids can be sent in less than a second.)
>
> The question is: do the messages arrive ordered by a timestamp put on them in the source node?
>
No. Ordering between more than two processes - P1 *and* P2 both sending to P3 - is undefined, even (or perhaps, especially!) if all three processes reside on the same node, let alone three different ones! This is exactly how Erlang does it, except that we do a bit better, because Erlang *can* sometimes break the ordering guarantee between two (peer) processes in the face of network outages and fast automatic reconnects.
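To make the guarantee concrete, here is a minimal, hedged sketch in plain Haskell (not using the distributed-process API): the only invariant a receiver can rely on is that the delivered sequence, restricted to any one sender, matches that sender's send order. The `Msg` type and `respectsFifo` predicate are illustrative names of mine, not library functions.

```haskell
import Data.List (isSubsequenceOf)

-- A message tagged with its sender.
data Msg = Msg { sender :: String, payload :: Int } deriving (Eq, Show)

-- The only guarantee: the delivery order at P3, filtered down to any one
-- sender, matches the order that sender used. Nothing constrains how the
-- two per-sender streams interleave.
respectsFifo :: [Msg] -> [Msg] -> [Msg] -> Bool
respectsFifo fromP1 fromP2 delivered =
     fromP1 `isSubsequenceOf` delivered
  && fromP2 `isSubsequenceOf` delivered

main :: IO ()
main = do
  let p1 = [Msg "P1" 1, Msg "P1" 2]
      p2 = [Msg "P2" 1, Msg "P2" 2]
  -- Both of these interleavings are legal deliveries at P3:
  print (respectsFifo p1 p2 [Msg "P1" 1, Msg "P2" 1, Msg "P1" 2, Msg "P2" 2])
  print (respectsFifo p1 p2 [Msg "P2" 1, Msg "P2" 2, Msg "P1" 1, Msg "P1" 2])
```

Both calls print `True`: many interleavings satisfy the per-sender FIFO property, which is precisely why the ordering across multiple senders is undefined.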
> If not, could it be implemented in the future as an improvement - a "transactional service", for example?
>
Not at the distributed-process library level, no. The overhead of synchronising a group of N distributed senders to guarantee total ordering in a single receiver is quite a bit higher than you might imagine.
Having said that....
> This is also important for the implementation of synchronization, failover, clustering, and distributed databases, and for other kinds of programming paradigms, like event sourcing.
>
Indeed this is a useful thing to have, but as I said, this is not something that the distributed-process layer should be doing. Erlang doesn't do this, but distributed databases written in Erlang, such as Riak, add this layer on top using vector clocks. I'd be happy to integrate this capability into Cloud Haskell's distributed-process-platform as an optional feature. Pull requests are most welcome! ;)
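For the curious, a minimal sketch of what such a vector-clock layer tracks, written in plain Haskell over `Data.Map` - the function names (`tick`, `merge`, `happenedBefore`, `concurrent`) are mine, not from Riak or any Cloud Haskell package:

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type NodeId = String
type VClock = Map NodeId Int

-- Bump this node's own entry before sending an event.
tick :: NodeId -> VClock -> VClock
tick n = Map.insertWith (+) n 1

-- On receipt, take the pointwise maximum of the two clocks.
merge :: VClock -> VClock -> VClock
merge = Map.unionWith max

-- a `happenedBefore` b iff every entry in a is <= the matching entry
-- in b (missing entries count as 0) and the clocks are not equal.
happenedBefore :: VClock -> VClock -> Bool
happenedBefore a b =
  a /= b && all (\(n, c) -> c <= Map.findWithDefault 0 n b) (Map.toList a)

-- Neither ordered: the events are concurrent, and the application (or a
-- Riak-style sibling-resolution step) must decide what to do.
concurrent :: VClock -> VClock -> Bool
concurrent a b = not (happenedBefore a b) && not (happenedBefore b a)
```

The `concurrent` case is the crux: a vector clock detects that two updates raced, but resolving the race is still left to the layer above.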
> Scenario 1
> Imagine that I want one or more mirror nodes, to be used just in case of failure of the main node. To do so, I forward the messages to the mirror nodes, so that their state stays synchronized with the main node.
>
> Scenario 2
> Instead of active-inactive synchronization for failover, the mirror nodes can be active, so they receive requests simultaneously from a load-distribution service, each one having a mirrored state - for example, a distributed database of books at Amazon or, more critically, some bank accounts. Then the messages must be transmitted and re-transmitted among the cluster respecting the original order in which the clients sent them.
>
Both of these require a lot more than just augmenting the node receiver queue with ordering guarantees. See for example, https://github.com/rabbitmq/rabbitmq-server/blob/master/src/gm.erl and then the essay near the top of https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_mirror_queue_coordinator.erl#L50.
Incidentally, do you know of any existing messaging infrastructure technologies that *do* offer ordering guarantees that hold over multiple distributed senders? I work on messaging technology for a living (at https://rabbitmq.com) and I've not come across this. You can guarantee the ordering between two endpoints in a messaging infrastructure, no more. If you want to order particular clients with respect to timestamps, then you've got to implement that yourself in the processing nodes outside the messaging backbone.
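As a sketch of what "implement that yourself in the processing nodes" might look like: stamp each message at its source and impose a deterministic total order in the receiver, breaking timestamp ties by sender id so every replica derives the same order. This is illustrative code of my own, and it assumes source clocks are roughly synchronised - a real deployment would need to account for clock skew (which is exactly where vector clocks come back in).

```haskell
import Data.List (sortOn)

-- A message stamped at its source. Ties on the timestamp are broken by
-- the sender id, so every node computes the same total order.
data Stamped = Stamped
  { ts   :: Int     -- source timestamp (assumes roughly synchronised clocks)
  , src  :: String  -- sender id, used as a deterministic tie-breaker
  , body :: String
  } deriving (Eq, Show)

-- Deterministic total order over a batch of messages, applied in the
-- processing node, outside the messaging backbone.
totalOrder :: [Stamped] -> [Stamped]
totalOrder = sortOn (\m -> (ts m, src m))
```

Note that this only orders messages the node has already buffered; deciding when a batch is complete (i.e., that no earlier-stamped message is still in flight) is the genuinely hard part.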
> I think that it does make sense to provide this service, covering these scenarios, at the framework level rather than at the application level.
>
I agree and disagree at the same time! Ha ha ha. :D
This kind of thing *does* belong in a framework, but not in the base-level Cloud Haskell library. To do this properly requires vector clocks, and the cost of doing that is actually very high. Not every application needs this, so putting it into the base Cloud Haskell layer not only complicates the semantics significantly, but also has a terrible effect on performance for those that do not need such strong guarantees.
Doing this at a level above distributed-process is fine though - see Jeff's https://github.com/jepst/distributed-process-global for example, which offers cluster control and global locking. But if you go ahead and put global locks around all send operations throughout your cluster, do please let me know how it performs at runtime - my expectation is that you'll have terrible throughput.
> I'm going too fast. Maybe there is something already developed for this, or a simpler, standardized solution.
>
I won't pretend to be a distributed systems expert - although Edsko probably does count as one of those - but in my somewhat limited experience in this area over the years, I've found that problems such as this are often underestimated and assumed to be easy to solve. A vector clock *will* solve some of these problems, but it introduces others, as it facilitates Availability and Partition tolerance but not Consistency. Consistency requires global synchronisation, which implies transactions, which implies something a la Paxos and friends. We have an open issue to implement distributed transactions in distributed-process-platform already: https://cloud-haskell.atlassian.net/browse/DPP-40 - we do not have an open issue to figure out how to do global/distributed deadlock detection, but then Erlang's gen_leader hasn't solved that problem yet either. And as I said, making 'send' a global transaction is not likely to yield very nice throughput.
Cheers,
Tim
Alberto,

It sounds like an interesting design, though I'm unsure what the formal semantics would look like. I'm not aware, for example, of any leaderless transaction management protocols - can you point me at some literature here? I'd be interested to learn.

Without synchronisation between peers, i.e., transactions, I cannot see how this would work without a broadcast protocol. As I say, I've not heard of any transaction management protocols that don't require a coordinator.

With a directed graph arranged as a ring overlay, I imagine that you could achieve lock-less ordering guarantees amongst all peers at the cost of 2*n hops per message, as we do in gm. The cost ends up being state transfer in either case.

I'd be delighted to see something that adds new distributed algorithms on top of Cloud Haskell, of course. We have numerous open issues that require coordination amongst peers already: process groups, mirrored supervisors, and group services (all filed against d-p-platform) all require this. I am planning to use distributed-process-global for this, but I'm unable to run the test suite on OSX and haven't had time to figure out why.

Of course, if you do implement this, the ordering will have to be enforced by an insulating process paired with the actual receiver. It might be worth adding this as a layer on top of the ManagedProcess API. The way that is evolving, you have one or more behaviours which are layered together to form a ProcessDefinition that defines the handlers for different kinds of messages. The order of the handler declarations determines the processing order once messages arrive, as these are passed to a selective receive. If your layer provides the ordering checks transparently, then it could simply be layered in as a behaviour. The handlers could either deal with coordination directly or cooperate with a registered process.
Or you could use an insulator process, of course. That won't work until 0.5.0 is released, however, or until the outstanding pull request is merged into the development branch of d-p, as the behaviour API depends on Message being serializable and consumable by user code.

Cheers,
Tim
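For what the core of such an insulating process might look like, here is a transport-agnostic resequencing sketch in plain Haskell. Every name here (`Seqd`, `Insulator`, `deliver`) is my own invention, not part of any Cloud Haskell API: each sender numbers its messages 1, 2, 3, ..., and the insulator releases messages to the real receiver in sequence, parking early arrivals until the gap before them is filled.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type Sender = String

-- A message tagged with its sender and a per-sender sequence
-- number (1, 2, 3, ...) assigned at the source.
data Seqd a = Seqd { sFrom :: Sender, sNum :: Int, sBody :: a }

-- Insulator state: the next expected number per sender, plus any
-- early arrivals parked until their predecessors show up.
data Insulator a = Insulator
  { nextExpected :: Map Sender Int
  , parked       :: Map (Sender, Int) a
  }

emptyInsulator :: Insulator a
emptyInsulator = Insulator Map.empty Map.empty

-- Feed one arrival; return the messages now releasable, in order.
deliver :: Seqd a -> Insulator a -> ([a], Insulator a)
deliver (Seqd who n x) ins@(Insulator nm pm)
  | n == want = drain who (Insulator (Map.insert who (n + 1) nm) pm) [x]
  | n > want  = ([], ins { parked = Map.insert (who, n) x pm })
  | otherwise = ([], ins)  -- stale or duplicate: drop it
  where want = Map.findWithDefault 1 who nm

-- After releasing one message, keep releasing any parked successors.
drain :: Sender -> Insulator a -> [a] -> ([a], Insulator a)
drain who ins@(Insulator nm pm) acc =
  let want = Map.findWithDefault 1 who nm
  in case Map.lookup (who, want) pm of
       Nothing -> (reverse acc, ins)
       Just x  -> drain who
                    (Insulator (Map.insert who (want + 1) nm)
                               (Map.delete (who, want) pm))
                    (x : acc)
```

Note that this only restores per-sender order - exactly the guarantee discussed above - and says nothing about ordering *across* senders, which is where the vector clocks and coordination come in.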