Can someone explain Ack'ing?


Kurtis

May 15, 2013, 3:00:35 PM
to storm...@googlegroups.com
We've built a topology. At the moment, it consists of a single Spout, several Bolts, and at least two "leaf bolts" (Bolts without an emit that save our data).

In our typical bolts, we always ack every tuple as soon as its output has been emitted. In fact, most of them implement IBasicBolt, which (to my understanding) calls ack() automatically.

So I am curious about how the ack'ing system works. Let's use my final bolt, which saves to the DB, as an example. At the moment, I call my save() function and then manually ack() the tuple.

What **exactly** happens if I don't ack() the tuple? What if I **don't** ack() the tuple immediately? What happens when I **do** ack() a tuple? Specifically, what effect does ack() have on the system and data?

Does this just push back one level (the previous Node(s)) or does it start all the way from the start? How does Storm know if a tuple has been ack()'d or not?

I apologize for the vagueness of my question. There are a lot of implementation details I don't quite understand yet, but they will be critical when we try to optimize certain aspects. In particular, I need to know how Storm will react when I start queuing up objects in my database (or other) bolts to do one large "bulk save" DB call, instead of paying the overhead of one call per tuple.

By the way, I know premature optimization is the root of all evil :) I'm just playing with ideas at this point and want to learn as much as I possibly can (within reason) before I go off and implement something that might clash with Storm's internals.

Thanks!
- Kurtis

Anthony Urso

May 15, 2013, 3:32:36 PM
to storm...@googlegroups.com
With most spouts, a tuple that is never acked will time out; Storm then calls fail() on the spout, which re-injects the unacked tuple.
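To make the ack/fail round trip concrete, here is a rough sketch of the bookkeeping a reliable spout typically does. It uses plain Java stand-ins, not Storm's real spout interface: the class name `ReplayingSpoutSketch` and its methods are hypothetical, and the timeout itself (which Storm handles for you) is not modeled. The idea is only that the spout remembers each emitted message under a message id, forgets it on ack, and puts it back in the queue on fail.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for a reliable spout; NOT Storm's real spout API.
class ReplayingSpoutSketch {
    private final Queue<String> source = new ArrayDeque<>();   // messages waiting to be emitted
    private final Map<Long, String> pending = new HashMap<>(); // emitted but not yet acked
    private long nextId = 0;

    ReplayingSpoutSketch(String... messages) {
        for (String m : messages) {
            source.add(m);
        }
    }

    // Emits the next message and remembers it under a fresh message id.
    // Returns the id, or -1 if there is nothing to emit.
    long nextTuple() {
        String msg = source.poll();
        if (msg == null) {
            return -1;
        }
        long id = nextId++;
        pending.put(id, msg);
        return id;
    }

    // The tuple tree completed: the message can be forgotten.
    void ack(long id) {
        pending.remove(id);
    }

    // The tuple tree failed or timed out: queue the message for replay.
    void fail(long id) {
        String msg = pending.remove(id);
        if (msg != null) {
            source.add(msg);
        }
    }

    int pendingCount() { return pending.size(); }
    int queuedCount() { return source.size(); }
}
```

A spout that emits without a message id gets neither callback, which is why this only applies to "most" spouts.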

You don't have to ack immediately. Storm is designed so that you can ack once you have completed all processing on the tuple for that bolt, as long as it all finishes within the configurable topology timeout.
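For the bulk-save case, that means a bolt can hold on to tuples and ack them only after the batch has been written. Below is a minimal sketch of that pattern using plain Java stand-ins; `AckSink`, `BulkStore`, and `BatchingSaveBolt` are hypothetical names, not Storm classes, and tuples are modeled as strings. In a real topology this would need a bolt that acks manually rather than an IBasicBolt, since the basic bolt acks for you.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the collector's ack call; NOT Storm's API.
interface AckSink {
    void ack(String tuple);
}

// Hypothetical stand-in for a database client with a bulk-write call.
interface BulkStore {
    void saveAll(List<String> rows);
}

class BatchingSaveBolt {
    private final int batchSize;
    private final AckSink collector;
    private final BulkStore store;
    private final List<String> buffer = new ArrayList<>();

    BatchingSaveBolt(int batchSize, AckSink collector, BulkStore store) {
        this.batchSize = batchSize;
        this.collector = collector;
        this.store = store;
    }

    // Buffer the tuple without acking it; flush once the batch is full.
    void execute(String tuple) {
        buffer.add(tuple);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // One bulk write, then ack every tuple that was in the batch.
    // If saveAll throws, nothing is acked and the tuples stay buffered.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        store.saveAll(new ArrayList<>(buffer));
        for (String t : buffer) {
            collector.ack(t);
        }
        buffer.clear();
    }
}
```

The catch is the timeout: a partial batch that sits in the buffer longer than `topology.message.timeout.secs` will be failed and replayed, so flush() should also be called on a timer that is comfortably shorter than that setting.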

There is a lot of good background on this topic here:


Cheers,
Anthony



Kurtis Mullins

May 20, 2013, 3:53:57 PM
to storm...@googlegroups.com
As mentioned in IRC -- this document explains a lot of this: https://github.com/nathanmarz/storm/wiki/Guaranteeing-message-processing
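For anyone finding this thread later, the short version of that document as I understand it: Storm tracks the whole tree of tuples descended from each spout tuple, and a failure or timeout anywhere in the tree causes a replay from the spout, not from the previous bolt. Acker tasks keep one 64-bit value per spout tuple, the XOR of every tuple id that has been created or acked in its tree; when that value reaches zero, the tree is complete. Here is a toy model of that bookkeeping. `AckerSketch` is a hypothetical name and this is a simplification for illustration, not Storm's actual acker code.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of XOR-based ack tracking; NOT Storm's real acker implementation.
class AckerSketch {
    // spout tuple (root) id -> XOR of all tuple ids created or acked so far
    private final Map<Long, Long> ackVals = new HashMap<>();

    // Called both when tuples are created and when they are acked.
    // Each tuple id is XORed in twice over its lifetime (once on creation,
    // once on ack), so the value returns to zero when the tree is done.
    // Returns true when the tree rooted at rootId is fully processed.
    boolean update(long rootId, long xorValue) {
        long v = ackVals.getOrDefault(rootId, 0L) ^ xorValue;
        if (v == 0L) {
            ackVals.remove(rootId);
            return true;
        }
        ackVals.put(rootId, v);
        return false;
    }

    boolean isPending(long rootId) {
        return ackVals.containsKey(rootId);
    }
}
```

For example, if the spout emits tuple t0, a bolt acks t0 while emitting c1 and c2, and two leaf bolts ack c1 and c2, the updates are t0, then t0 ^ c1 ^ c2, then c1, then c2, and only the last one brings the value to zero.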


- Kurtis


Kurtis

May 21, 2013, 11:52:01 PM
to storm...@googlegroups.com, anth...@cs.ucla.edu
Thanks Anthony. For some reason, Google wasn't giving me the latest messages and I didn't see yours earlier when I posted.