Batch processing and realtime processing have different properties and
tradeoffs. With Hadoop, you can run idempotent functions over all of
your data at once, but with high latency. With Storm, you can run
incremental functions with very low latency, but since you never see
the whole dataset at once, you can't support the same range of
functions.
It turns out that these two paradigms complement each other extremely
well. All of our applications have both a batch processing component
and a realtime processing component.
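To make the split concrete, here's a toy sketch in Python of how the two components can fit together for something simple like counting events. The function and class names are illustrative only, not the actual Hadoop or Storm APIs: the batch layer recomputes its view from scratch over the full dataset, the realtime layer incrementally folds in events that arrived since the last batch run, and queries merge the two views.

```python
from collections import Counter

def batch_layer(all_events):
    """High latency: recompute the view from the entire dataset."""
    return Counter(all_events)

class RealtimeLayer:
    """Low latency: update the view incrementally, one event at a time."""
    def __init__(self):
        self.counts = Counter()

    def process(self, event):
        self.counts[event] += 1

def query(batch_view, realtime_layer, key):
    """Merge both views to answer queries over all data seen so far."""
    return batch_view[key] + realtime_layer.counts[key]

# A batch run over the historical data...
batch_view = batch_layer(["a", "b", "a"])

# ...while the realtime layer absorbs new events as they stream in.
rt = RealtimeLayer()
for event in ["a", "c"]:
    rt.process(event)

print(query(batch_view, rt, "a"))  # 3: 2 from the batch view + 1 realtime
```

When the next batch run finishes, it covers the events the realtime layer counted, so the realtime state for that period can simply be discarded.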
You can see the slides from a presentation I gave about this
batch/realtime approach here:
http://www.slideshare.net/nathanmarz/the-secrets-of-building-realtime-big-data-systems
-Nathan