I've been slowly migrating an application to Akka since Scala 2.10 came out and pushed me away from Scala actors; apologies in advance for any naive questions, as I'm still learning.
As a general picture, the application consists of multiple long-lived JVMs communicating over ActiveMQ. The standard deployment is to a single machine, but the services communicate over AMQ so that specific pieces of functionality can be moved to other boxes. As the migration and component rewrites have progressed, all that's left is actors communicating with each other over AMQ using akka-camel. The natural next step was to explore akka-remote.
My questions started out as "is this an abuse/unintended usage of akka-remote? Is akka-remote meant to be used outside of akka-cluster? Is it useful for communicating with JVMs on the same machine? What about network hiccups when talking to remote JVMs?"
I went ahead and implemented a decent amount of code for using akka-remote to talk to one of the services after bumping our Akka version to 2.3.3, and I have to say I'm pleased, especially compared to ActiveMQ. Local machine communication is flawless, but once I started testing with remote machines and doing "ifdown eth0; sleep 20; ifup eth0" network disruption tests, I was left with questions about how to handle quarantines. I looked at reference.conf and heeded the admonition to NOT change the quarantine timeout from 5 days, which means that restarting one of the actor systems appears to be the only way out of a quarantine.
So - what're the best practices concerning restarting the ActorSystem?
- I'm not clustering - these are a few long-lived "heavy" services, not just nodes spinning up to do small processing tasks
- Our general deployment is not HA, we don't usually have standbys waiting
- Restarting the JVM isn't optimal
* since the services are fairly substantial and there's a non-trivial amount of initialization including database hits to pre-fill caches, restarting the JVM is a possibility (less time than the remoting gate time), but isn't the first route I'd choose
* we (very rarely) run on non-Linux platforms and so tend to keep things in the JVM instead of relying on upstart/launchd/Windows services/etc
My only other thought is to run an additional ActorSystem for remoting.
- allows programmatic configuration (our runtime configuration system could change remoting settings and restart the remoting ActorSystem with the new settings)
- a quarantine situation would just require the remoting ActorSystem to be recreated, not a restart of the whole JVM
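To make the idea concrete, here is a rough sketch of what I have in mind. It is plain Scala with no Akka dependency: `Restartable`, `FakeSystem`, and the port numbers are all made-up names for illustration, and `FakeSystem` merely stands in for the dedicated remoting ActorSystem.

```scala
// Hypothetical helper (not part of Akka): owns one restartable resource
// built from a settings value, and can replace it without a JVM restart.
final class Restartable[S, R](create: S => R, close: R => Unit) {
  @volatile private var current: Option[R] = None

  // Start (or restart) the resource with the given settings.
  // The old instance is closed before the new one is created,
  // so the two never hold the same port at the same time.
  def restart(settings: S): R = synchronized {
    current.foreach(close)
    current = None
    val fresh = create(settings)
    current = Some(fresh)
    fresh
  }

  // The currently running instance, if any.
  def get: Option[R] = current

  // Close the current instance without creating a replacement.
  def stop(): Unit = synchronized {
    current.foreach(close)
    current = None
  }
}

object RestartableDemo {
  // Stand-in for the remoting ActorSystem; records whether it was closed.
  final class FakeSystem(val port: Int) { @volatile var closed = false }

  def main(args: Array[String]): Unit = {
    val remoting = new Restartable[Int, FakeSystem](
      port => new FakeSystem(port),
      system => { system.closed = true }
    )
    val first = remoting.restart(2552)  // initial start
    val second = remoting.restart(2553) // e.g. after a quarantine or a settings change
    println(s"first closed: ${first.closed}, second port: ${second.port}")
    remoting.stop()
  }
}
```

In the real thing I'd expect `create` to build the ActorSystem from a freshly generated Config, and `close` to call `shutdown()` followed by `awaitTermination()` (the 2.3 API). Any ActorRefs obtained from the old system would be dead after a restart, so the rest of the application would have to look them up again.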
However, one of the very earliest entries in the Akka documentation states "An ActorSystem is a heavyweight structure that will allocate 1…N Threads, so create one per logical application." I know creating multiple dispatchers in the same ActorSystem is fine, and that (at least historically) a dedicated dispatcher was sometimes recommended for some remoting cases. I also know that starting a new ActorSystem takes some time to create dispatchers, parse configs, etc. So my thinking is that the big yellow warning in the documentation is a general guideline for getting started with Akka, not a hard and fast rule.
Sorry for the long post. Can anybody give me some guidance on the situation?