Hi Carlus,
> I have been working with JMS Queues and Topics for a number of years now. In that environment, the producer is guaranteed that the message is durable and persisted once the message arrives on the queue. If the message doesn't arrive on the queue, an exception is thrown and the producer can react to that exception. In disruptor, this does not seem to be the case. The producer would never know if a message arrived successfully or not, since there is not an "ack" back to the producer.
>
> Is there any recourse? Is there any design pattern that can be used to guarantee that messages are "received" by disruptor? Off of the top of my head, the only thing I can think of is fronting it with a queue, but that kind of defeats the whole purpose of the architecture, :D.
There are multiple levels of guarantee that you should separate here:
1. Under normal operation
2. Under a software failure (e.g. an exception in one of the consumers)
3. Under hardware failure (e.g. pulling the plug unexpectedly)
Under normal operation we get the guarantee that every event is processed by every consumer (almost by definition).
Under a software failure, you need to start making policy decisions. If a consumer encounters an exception, it defers to the ExceptionHandler it has been set up with (I think disruptor.handleExceptionsWith is the method that sets this up). By default this is a FatalExceptionHandler, which stops the consumer so that no further events are processed. You can, however, use any exception handling policy you want: it could continue with the next message, or use custom code to notify the producer, etc.
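As a rough sketch of a "continue and report" policy: the ExceptionHandler interface below is declared locally so the example compiles on its own. It is meant to mirror the shape of the Disruptor's interface as I remember it, so check it against the version you are using. RecordAndContinueHandler and failedSequences() are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in so the sketch compiles without the library. It is an
// assumption about the shape of the Disruptor's ExceptionHandler; verify
// against the real interface before copying.
interface ExceptionHandler<T> {
    void handleEventException(Throwable ex, long sequence, T event);
    void handleOnStartException(Throwable ex);
    void handleOnShutdownException(Throwable ex);
}

// Hypothetical policy: remember which sequences failed, then carry on.
final class RecordAndContinueHandler<T> implements ExceptionHandler<T> {
    private final List<Long> failedSequences = new ArrayList<>();

    @Override
    public synchronized void handleEventException(Throwable ex, long sequence, T event) {
        // Returning normally (rather than rethrowing) is what lets the
        // consumer move on to the next event.
        failedSequences.add(sequence);
    }

    @Override
    public void handleOnStartException(Throwable ex) {
        // In this sketch a consumer that cannot start is treated as fatal.
        throw new IllegalStateException("consumer failed to start", ex);
    }

    @Override
    public void handleOnShutdownException(Throwable ex) {
        // Nothing to clean up in this sketch.
    }

    // Returns a copy, so a producer-side component can poll it safely.
    public synchronized List<Long> failedSequences() {
        return new ArrayList<>(failedSequences);
    }
}
```

Something on the producer side could then poll failedSequences() and decide whether to resend, alert, or give up. That decision is exactly the policy part you have to choose yourself.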
Under hardware failure you can only preserve messages in one of two ways:
1. Consider a message received when it is journaled to disk (and flushed!). You may still lose messages that were in flight across the network or read from the network and not yet journaled.
2. Build a network protocol that supports replay. For example, at LMAX each event sent over the network includes the ring buffer sequence number - if the receiver gets message 20 but hasn't yet received message 19, it will NAK back to the sender requesting that 19 be resent. Using this system, when the hardware failure is resolved the service starts back up and automatically requests a replay of any messages it missed.
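For the first option, "journaled and flushed" could look roughly like the sketch below. It uses only the JDK's FileChannel. The Journal class and its record layout (sequence, length, payload bytes) are assumptions made up for illustration, not anything the disruptor provides.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only journal. Record layout (an assumption):
// 8-byte sequence, 4-byte payload length, then the payload bytes.
final class Journal implements AutoCloseable {
    private final FileChannel channel;

    Journal(Path file) throws IOException {
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    // Returns only after the record has been written and force() has
    // completed; only then should the message count as "received".
    void append(long sequence, String payload) throws IOException {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + Integer.BYTES + body.length);
        buf.putLong(sequence).putInt(body.length).put(body);
        buf.flip();
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        // The "and flushed!" part: ask the OS to push the data (and file
        // metadata) to the storage device rather than leave it in cache.
        channel.force(true);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Forcing on every message is slow. Writing a batch of events and forcing once per batch is a common compromise, but that is a tuning decision this sketch leaves out.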
There's nothing really built into the disruptor that does all this for you, because the approach you should take varies so much depending on your exact requirements. The ring buffer sequence number is a very useful unique ID for messages, though - LMAX uses it all over the place for this kind of thing.
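To make the sequence-number idea concrete, here is a minimal sketch of the receiver-side gap detection behind a NAK/replay protocol. GapDetector and its methods are hypothetical names, and this is not LMAX's actual implementation. It only works out which sequences to ask for again; sending the NAK and buffering out-of-order events are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical receiver-side helper: tracks the next expected sequence
// and which earlier sequences are still missing.
final class GapDetector {
    private long nextExpected;
    private final SortedSet<Long> outstanding = new TreeSet<>();

    GapDetector(long firstExpected) {
        this.nextExpected = firstExpected;
    }

    // Called for every message received. Returns the sequences that
    // should be NAKed back to the sender (empty if nothing is missing).
    List<Long> onMessage(long sequence) {
        List<Long> toNak = new ArrayList<>();
        if (sequence < nextExpected) {
            // A replayed (or duplicate) message: it no longer counts as missing.
            outstanding.remove(sequence);
            return toNak;
        }
        // Everything between what we expected and what arrived is a gap.
        for (long s = nextExpected; s < sequence; s++) {
            outstanding.add(s);
            toNak.add(s);
        }
        nextExpected = sequence + 1;
        return toNak;
    }

    // True once every requested replay has arrived.
    boolean isCaughtUp() {
        return outstanding.isEmpty();
    }
}
```

With the example from above: after receiving 18 and then 20, onMessage(20) returns [19]. Once the replayed 19 arrives, isCaughtUp() is true again. After a restart you would seed the detector with the last journaled sequence plus one, so the first message that arrives triggers a NAK for everything missed while the service was down.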