Fundamentally, the Disruptor minimizes overheads; there is no magic. So if task B needs the results of task A then, as you quite rightly say, it must wait until task A is complete.
What the Disruptor does is reduce the cost of that wait to a minimum while maximizing the opportunity to do other work in parallel. This is, from our measurements, very different from most other approaches, where the cost of the overheads is significantly more than that of even reasonably efficient code, let alone good code.
So, for example, let's assume that you are receiving messages, journalling them to disk, sending them to a cluster pair, receiving acknowledgements back from the cluster pair, and translating them to a usable form, all before processing them in your business logic. All of these tasks must be complete before the business logic can process the event, but each of them can operate in parallel, independently of one another. So the tasks can be carried out efficiently in parallel, and when they are all complete the business logic can process the message. With most common approaches the cost of the multi-threaded re-join is so vast that it can outweigh the other costs. The re-join with the Disruptor incurs no locks, so it is cheap, and there are big efficiency gains there.
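To make the lock-free re-join concrete, here is a minimal sketch in plain JDK Java (this is not the real Disruptor API; the names and structure are mine, purely for illustration). Three independent stages each advance a sequence as they pass over slots in a shared ring, and the business-logic stage spins, with no locks, until the minimum of the upstream sequences has passed the slot it wants:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of sequence gating (NOT the Disruptor's actual API).
// Three upstream stages run in parallel; the business logic waits on the
// minimum of their sequences -- the "re-join" -- without taking any lock.
public class DiamondSketch {
    static final int SIZE = 8;                 // ring capacity
    static final long[] ring = new long[SIZE]; // event slots
    // Each stage publishes the highest slot it has finished with.
    static final AtomicLong journalSeq   = new AtomicLong(-1);
    static final AtomicLong replicateSeq = new AtomicLong(-1);
    static final AtomicLong unmarshalSeq = new AtomicLong(-1);

    public static void main(String[] args) throws InterruptedException {
        final int events = 4;
        for (int i = 0; i < events; i++) ring[i % SIZE] = i * 10L;

        Thread journal   = stage(journalSeq, events);
        Thread replicate = stage(replicateSeq, events);
        Thread unmarshal = stage(unmarshalSeq, events);
        journal.start(); replicate.start(); unmarshal.start();

        long sum = 0;
        for (long next = 0; next < events; next++) {
            // Lock-free re-join: spin until every upstream stage has
            // passed this slot. min(...) is the gating sequence.
            while (Math.min(journalSeq.get(),
                   Math.min(replicateSeq.get(), unmarshalSeq.get())) < next) {
                Thread.onSpinWait();
            }
            sum += ring[(int) (next % SIZE)];  // "business logic"
        }
        journal.join(); replicate.join(); unmarshal.join();
        System.out.println(sum);               // prints 60 (0+10+20+30)
    }

    // A stage that "processes" each event and advances its sequence.
    static Thread stage(AtomicLong seq, int events) {
        return new Thread(() -> {
            for (long s = 0; s < events; s++) seq.set(s);
        });
    }
}
```

The point is that the join is just a read of a few sequences and a comparison: no lock, no context switch, no kernel involvement.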
Further, as Mike said, the batching effect allows slower stages to catch up with faster ones. Put simply, the Disruptor moves the costs to where they belong: your functional code, rather than the plumbing.
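The catching-up effect can be sketched with the same sequence idea (again, hypothetical names, not the real API): a consumer that has fallen behind reads the producer's published sequence once and then sweeps every available slot in one batch, so the coordination cost is paid per batch rather than per event:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of the batching effect (NOT the Disruptor's API).
// One published.get() can expose many slots at once; the consumer then
// processes all of them before coordinating again.
public class BatchingSketch {
    public static void main(String[] args) throws InterruptedException {
        final int events = 100;
        final long[] ring = new long[128];
        final AtomicLong published = new AtomicLong(-1);

        Thread producer = new Thread(() -> {
            for (int i = 0; i < events; i++) {
                ring[i] = i;
                published.set(i);            // publish slot i
            }
        });
        producer.start();

        long sum = 0, next = 0;
        while (next < events) {
            long available = published.get(); // one read may cover many slots
            if (available < next) { Thread.onSpinWait(); continue; }
            for (; next <= available; next++) {
                sum += ring[(int) next];      // process the whole batch
            }
        }
        producer.join();
        System.out.println(sum);              // 0+1+...+99 = 4950
    }
}
```

A consumer that falls behind naturally sees larger batches, which is exactly what lets it catch up.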
The effects that we measured early in the life of the Disruptor showed that the problem is not really "time of A" + "time of B" but the disproportionate cost of the gaps between A and B when using common concurrency approaches like locks and CAS. To maximize throughput you need to optimise the performance of A and B, but that is only reasonable; the huge win is that the Disruptor moves the limiting factor to where you are doing useful work rather than to where you are preparing to do work.
Of course if you are doing slow things like writing to disk or communicating over a network then writing code to be as efficient as possible will minimize the straight-line cost of the task. That is a question of mechanical sympathy and good design, but is not strictly related to the Disruptor itself and is very much dependent on the nature of the slow service.
I'd offer a couple of pieces of general, high-level advice for the tasks themselves, but as I said this is really a separate issue from the Disruptor itself: 1) Disks and networks are block devices; treat them as such. 2) Asynchrony is your friend; make all external interactions, particularly those with other business services, asynchronous.
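On the first point, a tiny JDK illustration of treating the disk as a block device: writing a byte at a time would issue many tiny operations, but wrapping the stream in a BufferedOutputStream groups them into block-sized writes, which is how the device wants to be driven:

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: buffer single-byte writes into 8 KiB blocks so the
// underlying device sees a small number of block-sized operations.
public class BlockWrites {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("blocks", ".bin");
        try (OutputStream out =
                 new BufferedOutputStream(Files.newOutputStream(tmp), 8192)) {
            for (int i = 0; i < 100_000; i++) {
                out.write(i & 0xFF);  // buffered: flushed in 8 KiB blocks
            }
        }                             // close() flushes the final partial block
        System.out.println(Files.size(tmp)); // all 100000 bytes made it to disk
        Files.delete(tmp);
    }
}
```

The same principle applies to the network: fill a packet or a block rather than paying the fixed per-operation cost for each tiny write.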
Hope this is not too vague and that it helps,
Dave