Update connector config `incremental.snapshot.chunk.size` while running snapshot


Thanh Thảo Huỳnh

Sep 13, 2022, 12:32:41 PM
to debezium
Hi,
I'm using Debezium 2.0.0.Beta1. At first, I didn't set this config `incremental.snapshot.chunk.size` (the default value is 1024). Then I triggered an incremental snapshot and realized the snapshot speed was very slow.
Can I increase `incremental.snapshot.chunk.size` while a snapshot is running? Would that improve the performance of this snapshot?

Thank you.

Chris Cranford

Sep 14, 2022, 8:51:46 AM
to debe...@googlegroups.com, Thanh Thảo Huỳnh
Hi, yes, you can modify the value while the connector is performing the snapshot. This works because the incremental snapshot is paused while the connector rebalances after the configuration change, and it resumes once the connector restarts.
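
For readers who want to try this, here is a minimal sketch of one way to change the value on a running connector, assuming the connector is managed through the Kafka Connect REST API (`PUT /connectors/{name}/config`, which replaces the whole configuration). The worker URL, connector name, and existing config below are hypothetical placeholders. If the connector is managed by an operator such as Strimzi, the change would normally go into the connector resource instead.

```python
import json
import urllib.request


def build_config_update(connect_url, connector_name, current_config, chunk_size):
    """Build a PUT /connectors/{name}/config request with a new chunk size."""
    new_config = dict(current_config)
    # Kafka Connect expects every configuration value as a string.
    new_config["incremental.snapshot.chunk.size"] = str(chunk_size)
    return urllib.request.Request(
        url=f"{connect_url}/connectors/{connector_name}/config",
        data=json.dumps(new_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical values, for illustration only.
request = build_config_update(
    "http://localhost:8083",
    "inventory-connector",
    {"connector.class": "io.debezium.connector.mysql.MySqlConnector"},
    16384,
)

# Sending the request needs a live Connect worker, so it is left commented out:
# urllib.request.urlopen(request)
```

Because the PUT replaces the full configuration, the sketch copies the existing config and changes only the one key.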

Chris

Thanh Thảo Huỳnh

Sep 14, 2022, 10:22:53 AM
to debezium
Thanks, Chris.

I doubled `incremental.snapshot.chunk.size` four times, from 1024 to 16384, but the snapshot speed didn't increase.
I base this on the Debezium metrics that I exposed and visualized in Grafana with this query:
sum(rate(debezium_metrics_totalnumberofeventsseen{plugin="$connector_type",name="$connector_name",context="snapshot"}[5m]))
I attached the chart below.

Do you have any ideas for increasing the incremental snapshot speed?

Thank you.
 
Screen Shot 2022-09-14 at 21.12.03.png

jiri.p...@gmail.com

Sep 15, 2022, 2:16:42 AM
to debezium
Please try to raise it to 256 or 512 K. Also make sure that max.batch.size and max.queue.size are enlarged too.
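
As a rough illustration of this advice, the sketch below checks a connector config for the relationships between the three settings. The first check reflects the documented guidance that `max.queue.size` should be larger than `max.batch.size`. The second check (batch size at least as large as the chunk size) is only a heuristic assumed here, not a documented rule. The fallback values 2048 and 8192 are assumed to be the Debezium defaults for `max.batch.size` and `max.queue.size`; the function name is hypothetical.

```python
def check_snapshot_tuning(config):
    """Return a list of warnings about chunk, batch, and queue sizing."""
    chunk = int(config.get("incremental.snapshot.chunk.size", 1024))
    batch = int(config.get("max.batch.size", 2048))   # assumed default
    queue = int(config.get("max.queue.size", 8192))   # assumed default

    warnings = []
    if queue <= batch:
        warnings.append("max.queue.size should be larger than max.batch.size")
    if batch < chunk:
        # Heuristic only: a chunk larger than a batch is split across batches.
        warnings.append("max.batch.size is smaller than the chunk size")
    return warnings


# Raising only the chunk size leaves the batch size at its default.
only_chunk_raised = check_snapshot_tuning(
    {"incremental.snapshot.chunk.size": "524288"}
)

# All three settings enlarged together.
all_raised = check_snapshot_tuning(
    {
        "incremental.snapshot.chunk.size": "524288",
        "max.batch.size": "1048576",
        "max.queue.size": "5242880",
    }
)
```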

J.

Thanh Thảo Huỳnh

Sep 15, 2022, 4:32:59 AM
to debezium
Thank you.

I will try it.

Thanh Thảo Huỳnh

Sep 25, 2022, 10:56:05 PM
to debezium
Hi,

I increased these configs:
incremental.snapshot.chunk.size: 524288
max.batch.size: 1048576
max.queue.size: 5242880

The throughput is ~500K records/minute.
How can I decrease the interval of polling source records?
I decreased `offset.flush.interval.ms` from 60000 to 30000, but it doesn't seem to help (the throughput is still ~500K records/minute).
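
To put these numbers side by side, here is a small arithmetic sketch. It only converts the figures quoted in this message and assumes the ~500K records/minute reading is exact; it makes no claim about where the time is spent.

```python
records_per_minute = 500_000   # throughput reported in this message
chunk_size = 524_288           # incremental.snapshot.chunk.size from this message

records_per_second = records_per_minute / 60
chunks_per_minute = records_per_minute / chunk_size
seconds_per_chunk = chunk_size / records_per_second

print(f"{records_per_second:.0f} records/s")
print(f"{chunks_per_minute:.2f} chunks/minute")
print(f"{seconds_per_chunk:.1f} s per chunk")
```

At this rate, one chunk of 524288 rows takes roughly a minute to pass through.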

Thank you.

jiri.p...@gmail.com

Sep 27, 2022, 2:38:02 AM
to debezium
There is no poll pause; polling is done as soon as the previous batch is processed. Don't forget that streaming messages are sent in parallel too. How does the QueueRemainingCapacity metric look over a period of time?
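
One way to read that metric is sketched below, under the assumption that QueueRemainingCapacity reports the free slots in the connector's internal queue and that its upper bound is `max.queue.size`. The function name and the sample readings are hypothetical.

```python
def queue_utilization(remaining_capacity, max_queue_size):
    """Fraction of the internal queue that is in use (0.0 empty, 1.0 full)."""
    return 1.0 - remaining_capacity / max_queue_size


max_queue_size = 5_242_880  # max.queue.size from the earlier message

# Hypothetical metric readings, for illustration only.
mostly_empty = queue_utilization(5_000_000, max_queue_size)
nearly_full = queue_utilization(100_000, max_queue_size)
```

Under these assumptions, a queue that stays mostly empty suggests that reading from the database is the limiting step, while a queue that stays nearly full suggests that events are not being consumed fast enough downstream.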

J

Thanh Thảo Huỳnh

Sep 27, 2022, 2:54:21 AM
to debezium
Hi,

I attached the chart below.
Query: debezium_metrics_queueremainingcapacity{strimzi_io_kind=~"KafkaConnect.*",strimzi_io_cluster="$strimzi_connect_cluster_name"}

Thanks.
Screen Shot 2022-09-27 at 13.48.45.png

Thanh Thảo Huỳnh

Sep 27, 2022, 3:03:52 AM
to debezium
So the only way to increase throughput is to increase incremental.snapshot.chunk.size, right?

jiri.p...@gmail.com

Sep 27, 2022, 8:27:07 AM
to debezium
In this case, it seems so.

J.
