Setting "auto.offset.reset" to latest in kafka consumer stops Lagom from retrying after failed event

Jim Zhai

Jun 28, 2017, 2:46:33 AM
to Lagom Framework Users
Hi everyone,

I recently changed "auto.offset.reset" to "latest" in the consumer settings of a Lagom service, and then noticed that when event processing fails, Lagom no longer retries automatically. Changing "auto.offset.reset" back to "earliest" re-enables the automatic retries. Is this expected behavior, or is there something wrong with my setup?

Here's a snippet from application.conf:

lagom.broker.kafka {
  # Set to empty string so that the brokers configuration will be used instead of service locator lookup
  service-name = ""
  brokers = "steamer-01.srvs.cloudkafka.com:9093,steamer-03.srvs.cloudkafka.com:9093,steamer-02.srvs.cloudkafka.com:9093"
}

kafka.configuration = {
  security.protocol = "SSL"
  ssl.truststore.location = "/Users/jim/Documents/cloudkarafka-keystore-staging/truststore.jks"
  ssl.truststore.password = "olympus"
  ssl.keystore.location = "/Users/jim/Documents/cloudkarafka-keystore-staging/keystore.jks"
  ssl.keystore.password = "olympus"
  ssl.keypassword = "olympus"
  # auto.offset.reset = "latest"
}

akka.kafka.producer.kafka-clients = ${kafka.configuration}
akka.kafka.consumer.kafka-clients = ${kafka.configuration}


As I understand it, "auto.offset.reset" is only considered when no committed offset is found for the consumer. So for a given instance of the application, it shouldn't affect event consumption at all once the application is up and running. However, the observed behavior looks as if the Kafka consumer disregards the stored offset and resets the offset based on this setting.
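
For readers following along, the rule described above can be sketched as a tiny model. This is not Kafka client code; the class, enum, and method names are made up for illustration, and the sketch ignores the case where a committed offset is out of range (in which the reset policy also applies).

```java
import java.util.OptionalLong;

/** Simplified model of how a consumer picks its starting offset. Not Kafka client code. */
final class OffsetResetModel {

    /** Hypothetical stand-in for the two auto.offset.reset values discussed here. */
    enum ResetPolicy { EARLIEST, LATEST }

    /**
     * @param committed offset last committed for the consumer group, if any
     * @param policy    reset policy used only when nothing is committed
     * @param logStart  first offset still available in the partition
     * @param logEnd    offset that the next produced record will receive
     */
    static long startingOffset(OptionalLong committed, ResetPolicy policy,
                               long logStart, long logEnd) {
        if (committed.isPresent()) {
            // A committed offset always wins; the reset policy is not consulted.
            return committed.getAsLong();
        }
        // Nothing committed yet: fall back to the reset policy.
        return policy == ResetPolicy.EARLIEST ? logStart : logEnd;
    }

    public static void main(String[] args) {
        // The partition holds offsets 0..9, so the next record gets offset 10.
        System.out.println(startingOffset(OptionalLong.empty(), ResetPolicy.LATEST, 0, 10));
        System.out.println(startingOffset(OptionalLong.empty(), ResetPolicy.EARLIEST, 0, 10));
        System.out.println(startingOffset(OptionalLong.of(7), ResetPolicy.LATEST, 0, 10));
    }
}
```

The key point is that the policy only matters on the branch where no offset has been committed, which is why the timing of the first commit is relevant to the behavior described in this post.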

Any explanation or clarification would be appreciated. Thank you.

Tim Moore

Jul 3, 2017, 10:57:36 PM
to Jim Zhai, Lagom Framework Users
Hi Jim,

Lagom batches offset commits (for performance reasons), so if your consumer fails within the first batch of messages consumed, no offset will be committed. Could that explain what's happening?


It sounds like the expected behavior of auto.offset.reset = latest is to skip messages. Could you talk more about what behavior you want to achieve?
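
The interaction between batched commits and the reset policy can be illustrated with a small model. This is only a sketch under assumptions: it is not Lagom or Kafka code, the class and method names are hypothetical, and it batches purely by count, whereas real batching may also be driven by time.

```java
import java.util.OptionalLong;

/** Toy model of batched offset commits. Not Lagom or Kafka code. */
final class BatchedCommitModel {
    private final int batchSize;                           // hypothetical: successes per commit
    private OptionalLong committed = OptionalLong.empty(); // nothing committed at startup
    private int uncommitted = 0;

    BatchedCommitModel(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Records one successfully processed offset; commits only when a batch is full. */
    void onSuccess(long offset) {
        uncommitted++;
        if (uncommitted >= batchSize) {
            committed = OptionalLong.of(offset + 1); // the committed value is the next offset to read
            uncommitted = 0;
        }
    }

    /** Where a restarted consumer resumes after a processing failure. */
    long restartOffset(boolean resetToLatest, long logStart, long logEnd) {
        if (committed.isPresent()) {
            return committed.getAsLong();
        }
        return resetToLatest ? logEnd : logStart;
    }

    public static void main(String[] args) {
        // Case 1: the very first event (offset 5) fails, so nothing was committed.
        BatchedCommitModel fresh = new BatchedCommitModel(1);
        System.out.println(fresh.restartOffset(true, 0, 6));  // latest: jumps to the log end, offset 5 is skipped
        System.out.println(fresh.restartOffset(false, 0, 6)); // earliest: back to the log start, offset 5 is replayed

        // Case 2: offset 5 succeeds and is committed, then offset 6 fails.
        BatchedCommitModel warmed = new BatchedCommitModel(1);
        warmed.onSuccess(5);
        System.out.println(warmed.restartOffset(true, 0, 7)); // resumes at 6, so the failed event is retried
    }
}
```

In this model, a failure before the first commit makes the restarted consumer fall back to the reset policy, so with latest the failed event is never seen again. Once any offset has been committed, the policy no longer matters.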

Thanks,
Tim

--
You received this message because you are subscribed to the Google Groups "Lagom Framework Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lagom-framework+unsubscribe@googlegroups.com.
To post to this group, send email to lagom-framework@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lagom-framework/5fbc37f6-1221-419f-be5e-417baa88416f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Tim Moore
Senior Engineer, Lagom, Lightbend, Inc.


Jim Zhai

Jul 4, 2017, 10:04:54 PM
to Lagom Framework Users, jim....@skiddoo.com.au
Hi Tim,

Thanks for taking the time to answer my question. Much appreciated.

After reading your reply I realized that this happens when the first event fails, which fits your explanation. Once at least one event has been processed, the offset is stored in Kafka and a subsequent failed event is retried. Our tests confirm this.

Our goal is to build a service that sends out email notifications on certain events. The service does not have persistence; it relies on Kafka for the at-least-once delivery guarantee. We'd also like to avoid sending duplicate emails for past events, as they should already have been handled by a legacy service, hence we're inclined to use auto.offset.reset = latest here. It's great that we can achieve both at almost zero cost.

Thanks for your help.

Tim Moore

Jul 5, 2017, 10:05:13 PM
to Jim Zhai, Lagom Framework Users
Sure thing, Jim. Thanks for the extra background. Please keep in mind that Kafka/Lagom's at-least-once guarantee means that you might receive those events more than once, so you might send duplicate emails even for new events. This can be a very tricky problem to solve fully, as you're probably already aware! :)
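
To make the duplicate-delivery point concrete, here is a minimal sketch of a guard that skips events already handled by the running process. Everything in it is an assumption for illustration: the class name, the string event IDs, and the list standing in for the real email call. Because the set lives in memory, it is lost on restart, so this only narrows the window for duplicates rather than closing it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of redelivery under at-least-once semantics with an in-memory guard. */
final class DuplicateEmailModel {
    private final Set<String> seenEventIds = new HashSet<>(); // lost on restart
    private final List<String> sentEmails = new ArrayList<>();

    /** Sends an email for the event unless this process has already handled its ID. */
    boolean handle(String eventId) {
        if (!seenEventIds.add(eventId)) {
            return false; // redelivery of an event already handled in this process
        }
        sentEmails.add(eventId); // stand-in for the real email call
        return true;
    }

    List<String> sentEmails() {
        return sentEmails;
    }

    public static void main(String[] args) {
        DuplicateEmailModel service = new DuplicateEmailModel();
        // "e2" is delivered twice, as can happen when a failure occurs before its offset is committed.
        for (String id : new String[] {"e1", "e2", "e2", "e3"}) {
            service.handle(id);
        }
        System.out.println(service.sentEmails());
    }
}
```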

Cheers,
Tim


Jim Zhai

Jul 5, 2017, 10:16:17 PM
to Lagom Framework Users, jim....@skiddoo.com.au
Yes, in our group we discussed the lack of an at-most-once guarantee. We decided that it's tolerable if we send duplicate emails once in a while.

We also plan to come back and apply a process manager pattern here, as the email sending is actually a three-step task that can be split further. At the moment we're still in the quick-and-dirty phase, trying to set up a few things that can work together.

Thanks again for the heads-up ^^


Tim
