DB history topic close to 1G

Liang Mou

Feb 3, 2025, 9:29:03 PM

to debezium

Hi there,

One of our MySQL connectors is failing with the error below. After checking, the database history topic is close to 1 GB, which is about 100 times larger than for our other, working connectors. Since we set the retention of the database history topic to basically forever, what's the best way to reduce the amount of data in the topic?

[2025-02-04 00:32:06,107] ERROR WorkerSourceTask{id=db0007-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
java.lang.IllegalStateException: The database history couldn't be recovered. Consider to increase the value for database.history.kafka.recovery.poll.interval.ms
    at io.debezium.relational.history.KafkaDatabaseHistory.recoverRecords(KafkaDatabaseHistory.java:309)
    at io.debezium.relational.history.AbstractDatabaseHistory.recover(AbstractDatabaseHistory.java:112)
    at io.debezium.relational.history.DatabaseHistory.recover(DatabaseHistory.java:163)
    at io.debezium.relational.HistorizedRelationalDatabaseSchema.recover(HistorizedRelationalDatabaseSchema.java:62)
    at io.debezium.schema.HistorizedDatabaseSchema.recover(HistorizedDatabaseSchema.java:38)
    at io.debezium.connector.mysql.MySqlConnectorTask.validateAndLoadDatabaseHistory(MySqlConnectorTask.java:353)
    at io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:107)
    at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:133)
    at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:208)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

Thanks,

Liang

Chris Cranford

Feb 4, 2025, 2:17:05 AM

to debe...@googlegroups.com

Hi Liang -

We have put some side work into a schema history compaction tool [1], but it isn't production-ready yet, so I'm afraid the alternatives are:

- Deleting the history topic and re-creating it using snapshot.mode=recovery (Debezium 3+) or snapshot.mode=schema_only_recovery (Debezium 1.x/2.x)
- Deleting the history topic and the offsets, paired with a re-snapshot if applicable.

Just be aware that for the first option, you need to guarantee that there have been no schema changes on the captured tables between the connector's current read position (its offset) and now. The history topic is re-created from the current table structure metadata, so any inconsistency will lead to connector failures when the connector streams older events with a different number of columns or different column types.
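
For illustration, here is a minimal Python sketch of the configuration change behind the first option. It only builds the updated connector settings; deleting the topic and resubmitting the config are left out. The topic name, the starting snapshot.mode and the helper name recovery_config are hypothetical, and the older mode name used here (schema_only_recovery) is the one I understand the 1.x/2.x connector documentation to list, so please check it against your version.

```python
import json


def recovery_config(base_config, debezium_major):
    """Return a copy of base_config with snapshot.mode set for history recovery.

    Assumption: Debezium 3+ uses "recovery"; earlier versions use
    "schema_only_recovery". Verify against the docs for your version.
    """
    mode = "recovery" if debezium_major >= 3 else "schema_only_recovery"
    updated = dict(base_config)  # copy so the caller's dict is left untouched
    updated["snapshot.mode"] = mode
    return updated


# Hypothetical existing connector settings; only the keys shown matter here.
current = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.history.kafka.topic": "dbhistory.db0007",  # hypothetical name
    "snapshot.mode": "when_needed",  # hypothetical starting value
}

print(json.dumps(recovery_config(current, debezium_major=1), indent=2))
```

Once the history topic has been rebuilt, snapshot.mode would normally be set back to its previous value so the recovery mode is not left in place permanently.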

Thanks,
-cc


Liang Mou

Feb 4, 2025, 12:14:08 PM

to debezium

Thanks Chris. Initially I tried tuning the parameters below but couldn't get them to work. Eventually I went with the first approach you mentioned, which worked; fortunately, there were no schema changes in that timeframe. Looking forward to your long-term fix, as I noticed the sizes of some other DB history topics are also growing fast.

database.history.kafka.recovery.poll.interval.ms=5000
database.history.kafka.recovery.attempts=300
task.shutdown.graceful.timeout.ms=300000
rebalance.timeout.ms=120000
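
As a rough sanity check on the two recovery settings above, this small Python sketch computes how much time they allow for polls that return no records. It assumes each recovery attempt corresponds to one empty poll that waits up to the poll interval, which is a simplification of how the connector actually counts attempts; the function name is made up for this example.

```python
def empty_poll_budget_ms(poll_interval_ms, attempts):
    """Upper bound on time spent in empty polls before recovery gives up.

    Assumption: one attempt equals one poll that returns no records and
    waits for at most poll_interval_ms.
    """
    return poll_interval_ms * attempts


# Values taken from the settings listed above.
budget_ms = empty_poll_budget_ms(poll_interval_ms=5000, attempts=300)
print(f"{budget_ms} ms, about {budget_ms / 60000:.0f} minutes")
```

With these values the budget comes to 1,500,000 ms, or about 25 minutes of empty polls.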