dead letter error when sender dies immediately after sending the message

Chris Curtin

Mar 6, 2014, 9:28:50 AM
to akka...@googlegroups.com
Hi,

I'm working through the failure scenarios to understand where we could lose data and I'm receiving a dead letter error unexpectedly.

The model is a Collector Actor and a set of Worker Actors. The Collector is the parent of the Workers. A 'boss' supervises the Collector with a restart() SupervisorStrategy. Workers 'pull' work from the Collector: when a Worker finishes, it sends a 'done' request, and the Collector then finds the next thing to do and sends it back.

We are deliberately throwing an exception in the Collector's onReceive() method so it gets restarted to understand the flow. We are also making the Worker Actors sleep for 5 seconds to simulate activity when they receive work.

Frequently we see dead letter errors on the LAST message sent from the Collector before we throw the exception:

[INFO] [03/06/2014 08:59:04.801] [Main-akka.actor.default-dispatcher-2] [akka://Main/user/app/Supervisor/1394114329779_4] Message [com.silverpop.rnd.collector_failure.TestMessage] from Actor[akka://Main/user/app/Supervisor#-138733138] to Actor[akka://Main/user/app/Supervisor/1394114329779_4#1820908525] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

Based on the timestamps in the logs, it looks like the restart process is stopping the Worker (which is idle at this point) BEFORE the write to its mailbox, so the message is getting sent to dead letters.

Sample code:

    @Override
    public void onReceive(Object msg) {
        m_numProcessed++;

        TestMessage test = (TestMessage) msg;
        ActorRef sender = sender();
        log.error("Received message from: " + sender.path() + " " + test.getMessageNumber());

        // isTerminated() is only a snapshot: the Worker can still be stopped after this check.
        if (sender.isTerminated()) {
            log.error("Sender was terminated, so not sending the next message. " + sender.path());
        } else {
            int msgId = m_messageNumber++;
            log.error("Sending message id: " + msgId);
            sender.tell(new TestMessage(NUM_SECONDS, msgId), getSelf());
        }

        // Deliberately fail on every 20th message to force a restart of the Collector.
        boolean blowup = true;
        if (blowup && m_numProcessed % 20 == 0) {
            throw new RuntimeException("Blowing up the Collector");
        }
    }


Full logs:

[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Received message from: akka://Main/user/app/Supervisor/1394114329779_3 16
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Sending message id: 20
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Received message from: akka://Main/user/app/Supervisor/1394114329779_0 17
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Sending message id: 21
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Received message from: akka://Main/user/app/Supervisor/1394114329779_2 15
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Sending message id: 22
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Received message from: akka://Main/user/app/Supervisor/1394114329779_1 18
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor] Sending message id: 23
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-6] [akka://Main/user/app/Supervisor] Received message from: akka://Main/user/app/Supervisor/1394114329779_4 19
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-6] [akka://Main/user/app/Supervisor] Sending message id: 24
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-2] [akka://Main/user/app/Supervisor] Blowing up the Collector
java.lang.RuntimeException: Blowing up the Collector
at com.silverpop.rnd.collector_failure.CollectorFailure.onReceive(CollectorFailure.java:68)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-2] [akka://Main/user/app/Supervisor] Collector: I got a preRestart
[ERROR] [03/06/2014 08:59:04.797] [Main-akka.actor.default-dispatcher-6] [akka://Main/user/app/Supervisor/1394114329779_4] akka://Main/user/app/Supervisor/1394114329779_4 I got a post-stop message? Why?
[INFO] [03/06/2014 08:59:04.801] [Main-akka.actor.default-dispatcher-2] [akka://Main/user/app/Supervisor/1394114329779_4] Message [com.silverpop.rnd.collector_failure.TestMessage] from Actor[akka://Main/user/app/Supervisor#-138733138] to Actor[akka://Main/user/app/Supervisor/1394114329779_4#1820908525] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[ERROR] [03/06/2014 08:59:09.798] [Main-akka.actor.default-dispatcher-9] [akka://Main/user/app/Supervisor/1394114329779_0] akka://Main/user/app/Supervisor/1394114329779_0 I got a post-stop message? Why?
[ERROR] [03/06/2014 08:59:09.798] [Main-akka.actor.default-dispatcher-10] [akka://Main/user/app/Supervisor/1394114329779_3] akka://Main/user/app/Supervisor/1394114329779_3 I got a post-stop message? Why?


Looking at the timestamps, the other Workers (akka://Main/user/app/Supervisor/1394114329779_0, for example) finish their 5-second sleep before getting the postStop call, as you'd expect.

But akka://Main/user/app/Supervisor/1394114329779_4 gets the postStop call immediately, followed by the dead letter error.

Reading the documentation, shouldn't the mailbox for that Worker still exist while the Collector is doing its restart logic? Or is the issue that the child Actor is destroyed rather than going through its own 'restart' logic?

Java 1.7, Akka 2.3.0, running on a local machine from within IntelliJ.

Thanks,

Chris
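
A note on what the logs may be showing, offered as a sketch rather than a statement about Akka internals for this exact build: a stop request from the parent travels as a system message, which an actor handles ahead of ordinary messages, and whatever is still queued when the actor stops is routed to dead letters. Under that reading, Worker _4 had message 24 queued but not yet picked up, while the other Workers were already inside their 5-second sleep. The dependency-free Java model below imitates that ordering. ToyWorker, DeadLetterDemo, and every method on them are hypothetical names for illustration, not Akka classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy demo only: contrasts an idle worker with a busy one when a stop request arrives. */
final class DeadLetterDemo {
    /** Idle worker: the message is queued, then the stop request arrives before it runs. */
    static ToyWorker idleWorker() {
        ToyWorker w = new ToyWorker();
        w.tell("work-24");
        w.requestStop();
        w.runOnce();          // stop is handled first; queued work is dropped
        return w;
    }

    /** Busy worker: it has already picked up its message when the stop request arrives. */
    static ToyWorker busyWorker() {
        ToyWorker w = new ToyWorker();
        w.tell("work-20");
        w.runOnce();          // the message is processed first
        w.requestStop();
        w.runOnce();          // stop is handled with an empty mailbox
        return w;
    }

    public static void main(String[] args) {
        System.out.println("idle worker dead letters: " + idleWorker().deadLetters);
        System.out.println("busy worker dead letters: " + busyWorker().deadLetters);
    }
}

/** Toy stand-in for an actor with a mailbox; NOT an Akka type. */
final class ToyWorker {
    final Deque<String> userMailbox = new ArrayDeque<>();
    final List<String> processed = new ArrayList<>();
    final List<String> deadLetters = new ArrayList<>();
    private boolean stopRequested = false;
    private boolean stopped = false;

    /** Models tell(): enqueue, or go straight to dead letters if already stopped. */
    void tell(String msg) {
        if (stopped) {
            deadLetters.add(msg);
        } else {
            userMailbox.add(msg);
        }
    }

    /** Models the parent's stop request (assumed to be prioritised over user messages). */
    void requestStop() {
        stopRequested = true;
    }

    /** One scheduling turn: a pending stop request wins over any queued user message. */
    void runOnce() {
        if (stopped) {
            return;
        }
        if (stopRequested) {
            stopped = true;
            deadLetters.addAll(userMailbox);   // unprocessed messages become dead letters
            userMailbox.clear();
            return;
        }
        String next = userMailbox.poll();
        if (next != null) {
            processed.add(next);
        }
    }
}
```

In this model the idle worker loses its queued message and the busy one does not, which matches the pattern in the timestamps above. Whether the real message was queued or still in flight cannot be told from the logs alone.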

Björn Antonsson

Mar 7, 2014, 9:17:37 AM
to akka...@googlegroups.com
Hi Chris,

Have you overridden the default actor life cycle behavior of stopping all children, described in the documentation of preRestart?

B/
-- 
Björn Antonsson
Typesafe – Reactive Apps on the JVM
twitter: @bantonsson

Chris Curtin

Mar 7, 2014, 11:38:38 AM
to akka...@googlegroups.com
Hi Björn,

Thanks for the reply. It took me a few minutes to figure out what you were referencing, since we did have the SupervisorStrategy set to 'restart()', but then I found this:

"call the old instance’s preRestart hook (defaults to sending termination requests to all children and calling postStop)"

and then I stopped calling 'super.preRestart' and it started to work.

Am I correct that the SupervisorStrategy of the children has no bearing when Akka is doing the restart()?

Thanks for your help,

Chris
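
The effect of dropping the 'super.preRestart' call can be pictured with a small, dependency-free Java model. This is only a sketch of the hook's shape as the quoted documentation describes it: ToyParent, KeepChildrenParent, and their fields are hypothetical names, not Akka types, and the real hook also takes the failure and the message being processed as arguments.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a parent actor; NOT an Akka type. */
class ToyParent {
    final List<String> liveChildren = new ArrayList<>();
    final List<String> stoppedChildren = new ArrayList<>();

    ToyParent(String... childNames) {
        for (String name : childNames) {
            liveChildren.add(name);
        }
    }

    /** Default hook: stop every child, mirroring the documented default behaviour. */
    void preRestart() {
        stoppedChildren.addAll(liveChildren);
        liveChildren.clear();
    }

    /** The restart sequence invokes the hook on the old instance. */
    final void restart() {
        preRestart();
    }

    public static void main(String[] args) {
        ToyParent byDefault = new ToyParent("worker_0", "worker_1");
        byDefault.restart();
        System.out.println("default hook, live children: " + byDefault.liveChildren);

        ToyParent keeping = new KeepChildrenParent("worker_0", "worker_1");
        keeping.restart();
        System.out.println("overridden hook, live children: " + keeping.liveChildren);
    }
}

/** Overrides the hook without calling super.preRestart(), so the children survive. */
class KeepChildrenParent extends ToyParent {
    KeepChildrenParent(String... childNames) {
        super(childNames);
    }

    @Override
    void preRestart() {
        // Intentionally empty: children are left running across the restart.
    }
}
```

In the real actor, skipping 'super.preRestart' also skips the default call to postStop, so any cleanup placed there no longer runs on restart unless the override calls it explicitly.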

Björn Antonsson

Mar 10, 2014, 4:07:40 AM
to akka...@googlegroups.com, Chris Curtin
Hi Chris,

On 7 March 2014 at 17:40:39, Chris Curtin (ccu...@curtinhome.com) wrote:

Am I correct that the SupervisorStrategy of the children has no bearing when Akka is doing the restart()?


Yes, the supervisor strategy is only used when handling the failures of an actor. When the action has been decided, the life cycle hooks come into play.


B/

