Hey,
just wanted to spread the word that there's been a minor release for
Nanite today (0.4.1.17), which is mostly of importance if you're using
the Redis backend for state storage. It fixes deprecation warnings
with newer versions of the Redis library, and an inconsistency when
dealing with intermediate results.
More importantly though, I'd like to lay out what else happened, and
what I'm planning on working on in the coming weeks.
Most of what I have on my list is dealing with reliability, since
right now there's simply a risk of losing messages when agents or
mappers suddenly go down. Messages in Nanite are ack'd before they are
delivered, and therefore lost when an unrecovered error occurs or the
process dies.
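
To illustrate why the timing of the ack matters, here's a toy sketch in
plain Ruby. ToyQueue and consume are made-up names for this example, not
Nanite or AMQP API, and the "broker" is just an in-memory array that
requeues whatever was not ack'd when the consumer goes away.

```ruby
# Toy in-memory broker: a message stays with the broker until it is ack'd.
# ToyQueue is hypothetical and only mimics the general ack/requeue idea.
class ToyQueue
  def initialize(messages)
    @ready = messages.dup
    @unacked = []
  end

  # Hand out the next message and remember it as unacknowledged.
  def pop
    msg = @ready.shift
    @unacked << msg if msg
    msg
  end

  # The consumer confirms the message; the broker forgets it.
  def ack(msg)
    @unacked.delete(msg)
  end

  # Simulates the consumer going away: unacked messages are requeued.
  def consumer_died
    @ready.concat(@unacked)
    @unacked.clear
  end

  def remaining
    @ready.dup
  end
end

# Runs the given block as the message handler. Any error in the handler
# is treated as the consumer dying.
def consume(queue, ack_first:)
  msg = queue.pop
  queue.ack(msg) if ack_first
  yield msg
  queue.ack(msg) unless ack_first
rescue StandardError
  queue.consumer_died
end

ack_early = ToyQueue.new(["job-1"])
consume(ack_early, ack_first: true) { |m| raise "crash while handling #{m}" }
ack_early.remaining   # => []         (the message is gone)

ack_late = ToyQueue.new(["job-1"])
consume(ack_late, ack_first: false) { |m| raise "crash while handling #{m}" }
ack_late.remaining    # => ["job-1"]  (the message can be delivered again)
```
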
I started working on improving that on the agent side, though it's not
yet fully done. In a branch [1] I introduced a feature that will allow
shutting down agents gracefully. When they still have work to do, and
you send them SIGINT or SIGTERM, they'll just disconnect from the
broker, clean up their pidfile, but won't shut down until all their
work is done. Due to the asynchronous nature of EventMachine I'm
relying on the user to tell me, through a method call, when the work is
done. Right now I'm not assuming that an actor method returning
means the task is done. That is mainly due to how we use it, because
we have actor methods kicking off longer running tasks that poll with
periodic timers. So that's up for discussion. I'll build something
similar for mappers, but less sophisticated, basically just waiting
for a grace period to fully shut down the process, not killing message
processing right in between.
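
Roughly, the bookkeeping behind such a graceful shutdown could look like
the sketch below. This is not the code from the branch; GracefulShutdown
and its method names are hypothetical, and it leaves out EventMachine and
the actual signal handling to stay self-contained.

```ruby
require "set"

# Hypothetical helper, not part of Nanite: tracks in-flight work so the
# process only exits once every task was explicitly marked as done.
class GracefulShutdown
  def initialize(on_disconnect:, on_exit:)
    @running = Set.new
    @stopping = false
    @exited = false
    @on_disconnect = on_disconnect
    @on_exit = on_exit
  end

  def stopping?
    @stopping
  end

  # Called when an actor method starts working on a request.
  def task_started(token)
    @running << token
  end

  # Called by user code when the work is really finished, which may be
  # long after the actor method returned (e.g. from a periodic timer).
  def task_done(token)
    @running.delete(token)
    finish_if_idle
  end

  # Meant to be called from a SIGINT/SIGTERM handler.
  def request_stop
    return if @stopping
    @stopping = true
    @on_disconnect.call # stop taking new work, clean up the pidfile
    finish_if_idle
  end

  private

  # Exit exactly once, and only when stopping with nothing left to do.
  def finish_if_idle
    return unless @stopping && @running.empty? && !@exited
    @exited = true
    @on_exit.call
  end
end

events = []
shutdown = GracefulShutdown.new(
  on_disconnect: -> { events << :disconnected },
  on_exit: -> { events << :exited }
)

shutdown.task_started("job-1")
shutdown.request_stop          # signal arrives while job-1 is running
after_signal = events.dup      # => [:disconnected]
shutdown.task_done("job-1")    # user code reports completion
events                         # => [:disconnected, :exited]
```

In a real agent, request_stop would be wired to the signals with
something like Signal.trap("TERM") { shutdown.request_stop }, and the
on_exit callback would stop the event loop.
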
We're using the branch code in production, and it's stable, so I'll
probably merge it soon, and make it a release 0.4.2.
Other than that, I'd like to improve the reliability of message
delivery itself. I might rely more on Redis to achieve that, but I'm
still playing with ideas. Basically I want to remove the ack before
dispatch in actors and combine that with the reliability work
outlined above: relying on either the user or the returning method to
signal that a task is done, and redelivering to another agent after a
certain time has elapsed. Both could be options, and timeouts for redelivery could
also be something to include. I'm peeking at Beetle [2] for
inspiration here, because that's something Nanite is missing
currently. When a message is gone, it's gone. That's sometimes
acceptable, but I'd like to improve on that, because at least for us,
it's not.
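
To make the redelivery idea a bit more concrete, here's one way it might
be sketched. Nothing here is decided, and none of it is Nanite or Beetle
API: RedeliveryTracker, its methods and the 30 second timeout are all
assumptions for illustration. Time is passed in explicitly to keep the
example deterministic.

```ruby
# Hypothetical sketch: remembers which agent got a message and hands it
# to another agent if no completion was reported before the deadline.
class RedeliveryTracker
  Pending = Struct.new(:payload, :agent, :deadline)

  def initialize(agents, timeout)
    @agents = agents
    @timeout = timeout
    @pending = {}
  end

  # Record the dispatch and return the agent that was picked.
  def dispatch(id, payload, now)
    agent = @agents.first
    @pending[id] = Pending.new(payload, agent, now + @timeout)
    agent
  end

  # The agent (or the user's code) reports that the work is finished.
  def done(id)
    @pending.delete(id)
  end

  # Reassign every overdue message to a different agent and return the
  # new assignments as { id => agent }.
  def redeliver_overdue(now)
    moved = {}
    @pending.each do |id, entry|
      next if now < entry.deadline
      candidates = @agents - [entry.agent]
      next if candidates.empty?
      entry.agent = candidates.first
      entry.deadline = now + @timeout
      moved[id] = entry.agent
    end
    moved
  end
end

tracker = RedeliveryTracker.new(["agent-a", "agent-b"], 30)
first_agent = tracker.dispatch("m1", "resize image", 0)  # => "agent-a"
tracker.dispatch("m2", "send mail", 0)
tracker.done("m2")                        # m2 finished in time
too_soon = tracker.redeliver_overdue(10)  # => {}
overdue = tracker.redeliver_overdue(30)   # => {"m1"=>"agent-b"}
```

One consequence of any scheme like this is that a slow agent and its
replacement may both end up processing the same message, so actors
would have to tolerate duplicates.
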
Let me know if you have comments or other things you'd like to see
improved or fixed in Nanite. I'm more than happy to fully outline the
ideas once I have, well, a good idea of them.
Cheers, Mathias
[1] http://github.com/ezmobius/nanite/tree/exit_hooks
[2] http://xing.github.com/beetle/
--
http://paperplanes.de
http://twitter.com/roidrage