when_needed starts & continues snapshot with purged binlogs


Adil Karim

Oct 9, 2023, 8:52:10 AM
to debezium
Hi there,

I'm facing a persistent issue with snapshotting when using Debezium 2.3 with MySQL (managed by DigitalOcean). If I shut down the connectors for a while and the database automatically purges the binlogs in the meantime, then on restart the connector attempts to snapshot from the last recorded binlog location and runs through the entire snapshot process (around 24 hours for us) before finally concluding that the binlog has been purged and restarting.

My understanding of `when_needed` is that it checks that the binlogs and GTID are available on the server before proceeding, but it does not seem to do that. From the docs:

```when_needed - the connector runs a snapshot upon startup whenever it deems it necessary. That is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.```

If I run `show binary logs` on the server before starting the connector, I can see that the binlog has already been purged. I can also see on the topic `dbserver1` (which is also the topic prefix) that the last binlog recorded is the purged one, but I'm not sure why the connector reads from there if we're using `when_needed`.
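To make the manual check concrete, here is a minimal sketch of comparing a stored binlog file name against the output of `show binary logs`. It is only an illustration of the check described above, not Debezium's own logic; the function names and the sample rows are made up for the example, and the rows are assumed to carry the file name in the first column.

```python
def available_binlogs(rows):
    """Collect file names from SHOW BINARY LOGS rows.

    Each row is assumed to look like (log_name, file_size, ...),
    with the binlog file name in the first column.
    """
    return {row[0] for row in rows}


def offset_is_resumable(offset_file, rows):
    """Return True if the binlog file named in the stored offset still exists."""
    return offset_file in available_binlogs(rows)


# Hypothetical server state: the older file has already been purged.
rows = [("binlog.033415", 1048576), ("binlog.033416", 2048)]

print(offset_is_resumable("binlog.033414", rows))
print(offset_is_resumable("binlog.033416", rows))
```

If the first call reports that the file is gone, streaming from the stored position cannot work, whatever the connector decides to do about snapshotting.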

Is there something I'm missing here? Are we meant to reset the binlog location manually somehow before proceeding with a snapshot?

Thank you

jiri.p...@gmail.com

Oct 10, 2023, 12:26:23 AM
to debezium
Hi,

could you please share the complete log?

J.

Adil Karim

Nov 17, 2023, 7:35:26 AM
to debezium
Sure. I've attached it (with some data redacted). 

`binlog.033414` was purged before the restart, and Debezium keeps trying to pull the same binlog.

Adil
debezium_binlog_purge.log

jiri.p...@gmail.com

Nov 20, 2023, 8:49:42 AM
to debezium
OK, this looks like a real corner case. You started the snapshot, and Debezium recorded the position from which it should resume streaming. The binlog was then truncated. The snapshot finished, but streaming could not resume because the binlog was no longer available. So the connector failed and you restarted it. According to the stored offsets we already have a binlog position, but the snapshot was not completed successfully, so the snapshot is re-executed; streaming still cannot start because the original binlog position stored in the offsets was intentionally reused.

As there are other considerations at play, I'd say we need to accept the limitation that the first snapshot must succeed for the mechanism to work properly; otherwise we might open a different can of worms.
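The sequence described in this reply can be sketched as a toy decision function. This is a simplified model of the behaviour as explained in the thread, not Debezium source code: `StoredOffset`, `plan_restart` and the returned strings are hypothetical names, and the branch for a completed snapshot with a purged binlog simply follows the `when_needed` documentation quoted earlier, which the thread itself does not confirm.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class StoredOffset:
    binlog_file: str          # position recorded when the snapshot started
    snapshot_completed: bool  # whether the first snapshot ever finished cleanly


def plan_restart(offset: Optional[StoredOffset], server_logs: Set[str]) -> str:
    """Toy model of what happens on connector restart (assumption, see lead-in)."""
    if offset is None:
        return "snapshot, then stream from current position"
    if not offset.snapshot_completed:
        # The corner case: the snapshot is re-executed, but the original
        # binlog position in the offsets is kept for streaming.
        if offset.binlog_file in server_logs:
            return "re-run snapshot, then stream from stored position"
        return "re-run snapshot, then fail: stored position purged"
    if offset.binlog_file not in server_logs:
        # Behaviour per the quoted documentation for when_needed.
        return "new snapshot, then stream from current position"
    return "stream from stored position"


logs = {"binlog.033415", "binlog.033416"}
print(plan_restart(StoredOffset("binlog.033414", False), logs))
```

With an incomplete first snapshot and a purged file, the model lands in the failing branch every time, which matches the restart loop reported above.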

J.

Adil Karim

Jan 2, 2024, 1:10:15 PM
to debe...@googlegroups.com
The same thing seems to happen if you do this much later on as well. This just happened to us now. The Connect cluster was scheduled onto another node due to a Kubernetes upgrade so we had some downtime (around 10 minutes). In the intervening period the binlog was truncated and couldn't be found again - you then have to start from scratch.


Adil Karim

jiri.p...@gmail.com

Jan 3, 2024, 1:25:04 AM
to debezium
Hi,

but in that case a different solution needs to be found. It is really important to keep the binlog for some time, just to cover downtime periods. Why is that not possible with DigitalOcean MySQL? Have you tried reaching their support to ask about the truncation policy and maybe request some retention time?
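As a rough illustration of the retention point, the sketch below checks whether a binlog retention window leaves headroom over the worst downtime you expect. The function name, the safety factor and the sample numbers are assumptions for the example. On stock MySQL 8.0 the retention window is set by the `binlog_expire_logs_seconds` system variable; a managed service may expose it differently or not at all, which is what the question to support would clarify.

```python
def retention_covers_downtime(retention_seconds, worst_downtime_seconds,
                              safety_factor=2.0):
    """Return True if binlogs are kept long enough to survive an outage.

    safety_factor is an arbitrary margin chosen for this example so that
    retention is comfortably longer than the longest expected downtime.
    """
    return retention_seconds >= worst_downtime_seconds * safety_factor


# Hypothetical numbers: a 10 minute outage against two retention windows.
ten_minutes = 10 * 60
print(retention_covers_downtime(5 * 60, ten_minutes))
print(retention_covers_downtime(24 * 60 * 60, ten_minutes))
```

A window shorter than the outage means the stored position is gone by the time the connector comes back, regardless of the snapshot mode.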

J.
