Yes. It's about the changelog.
Note that changelogs are stored in compacted topics. Thus, even if you can
figure out the correct offsets, _old_ data will get deleted via
compaction on the broker side. It might be possible to disable log
compaction, but you cannot just apply log retention instead, as this might
result in data loss. If, on the other hand, you set the retention time to
infinite to avoid data loss, you get a topic that grows unbounded. So you
are in "bad shape" in either case...
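
To make the compaction point concrete, here is a toy model in plain Java. It is only a sketch and not Kafka code: the class `CompactionSketch`, the `LogRecord` type, and the `compact` method are hypothetical names made up for illustration, and real broker-side compaction runs asynchronously and leaves the active segment alone. The idea it shows is the one above: after compaction only the latest record per key survives, so an older offset no longer holds the state you would need to rewind to.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionSketch {

    /** One changelog record: its offset in the log, a key, and a value. */
    static final class LogRecord {
        final long offset;
        final String key;
        final String value;

        LogRecord(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return offset + ":" + key + "=" + value;
        }
    }

    /** Toy compaction: keep only the latest record per key, in offset order. */
    static List<LogRecord> compact(List<LogRecord> log) {
        Map<String, LogRecord> latest = new LinkedHashMap<>();
        for (LogRecord r : log) {
            // Remove first so that re-inserting moves the key to the end,
            // which keeps the surviving records ordered by offset.
            latest.remove(r.key);
            latest.put(r.key, r);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<LogRecord> log = new ArrayList<>();
        log.add(new LogRecord(0, "user-1", "count=1"));
        log.add(new LogRecord(1, "user-2", "count=1"));
        log.add(new LogRecord(2, "user-1", "count=2"));
        log.add(new LogRecord(3, "user-1", "count=3"));

        // Offsets 0 and 2 are gone after compaction, so the state of
        // "user-1" as of offset 0 or 2 cannot be restored from this log.
        System.out.println(compact(log));
    }
}
```

In this model, rewinding the changelog to offset 2 would find nothing there, which is why a correct offset alone does not give you the old state back.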
Also, it's quite a hard problem to figure out the correct offsets within
the changelog topic in the first place.
-Matthias
> application.id == group.id), and
> afterwards restart your Kafka Streams application. It will pick up the
> committed offset as set by the tool on startup.
>
> Note that this might not result in a consistent state of your
> application though, as your application would reuse its current state
> and the state will not be reset to the corresponding point (resetting
> the state to a point back in time is not possible atm).
>
> Maybe your application semantics are resilient to this and it's just
> fine for your application. If not, an alternative would be to wipe out
> the state completely, using bin/kafka-streams-application-reset
> (together with KafkaStreams#cleanUp()) and restart your application
> with an empty state to do the reprocessing. You will still need to use
> bin/kafka-consumer-groups to set the required start offsets.
>
> Which approach is better depends on your application semantics.
>
>
> -Matthias
>
> On 9/1/17 2:13 AM, peter...@gmail.com wrote:
> > Hi,
> >
> > Is it possible to start the streaming application from a custom offset
> > instead of from either "earliest" or "latest"?
> > My idea is to persist the offset and payload's timestamp pairs on a
> > daily basis. When we need to reprocess records from
> > day X we can just use those offsets as starting points. The retention
> > in our input Kafka topic is large enough so offsets
> > won't get invalidated frequently.
> > Is this the right way to achieve such "partial" reprocessing?
> >
> > Thanks,
> > Peter
> >
> > --
>
>
https://groups.google.com/d/msgid/confluent-platform/ce0fc587-ace4-4b6e-be83-a8a322a24ca9%40googlegroups.com