Debezium MySQL Connector not creating new topics after tables added to whitelist (0.9.0)


Arun Prasadh

Mar 28, 2021, 1:39:32 PM
to debezium
Hi, 

As the title says, we had to add more tables to our existing MySQL source connectors' whitelist. When we restarted the connector, not all of the expected topics were created; some of the newly added tables are missing from the topic list.

We are on Debezium 0.9.0 and have "snapshot.new.tables" set to "parallel". There are no errors in the Connect logs and the connector is running just fine. Our max heap is set to 10G as well.

I will provide the config, but first wanted to know if this is a known issue even with DBZ-175?

Additionally, with 'snapshot.new.tables' set to 'parallel', do we still need to perform the backfilling process when we add new tables (create a new dummy connector for the new tables and snapshot them; once that completes, add the new tables to the existing connector and delete the dummy)?
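For anyone following along, the dummy-connector approach mentioned above amounts to deriving a one-off config from the existing one. A minimal sketch of that derivation (the helper name and the `_backfill` suffix are my own, not Debezium conventions):

```python
# Sketch: derive a temporary "backfill" connector config from an existing one.
# Helper name and the "_backfill" suffix are illustrative, not Debezium conventions.

def make_backfill_config(existing, new_tables):
    """Build a one-off snapshot connector config covering only the new tables.

    The temporary connector needs its own database.server.id (it acts as a
    separate MySQL replication client) and its own server name, so its topics
    and database history do not clash with the original connector's.
    """
    cfg = dict(existing)  # shallow copy; the original config stays untouched
    cfg["table.whitelist"] = ",".join(new_tables)
    cfg["snapshot.mode"] = "initial"  # snapshot the new tables from the beginning
    cfg["database.server.id"] = str(int(existing["database.server.id"]) + 1)
    cfg["database.server.name"] = existing["database.server.name"] + "_backfill"
    return cfg


existing = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.server.id": "184054",
    "database.server.name": "prod",
    "table.whitelist": "shop.orders,shop.customers",
    "snapshot.mode": "when_needed",
}

backfill = make_backfill_config(existing, ["shop.invoices", "shop.refunds"])
print(backfill["table.whitelist"])      # shop.invoices,shop.refunds
print(backfill["database.server.id"])   # 184055
```

Note that changing database.server.name also changes the default topic prefix, so routing transforms like the RegexRouter in the config below may need adjusting if the backfilled events should land on the same topics as the original connector's.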


Also, is there any limit on how many tables we can add to the whitelist? We have multiple connectors with over 50 tables each, and one connector with close to 100 tables.


Arun Prasadh

Mar 29, 2021, 8:45:00 AM
to debezium
Here is the config I promised.


connector.class=io.debezium.connector.mysql.MySqlConnector
snapshot.locking.mode=minimal
transforms.unwrap.delete.handling.mode=rewrite
transforms.AddPrefix.type=org.apache.kafka.connect.transforms.RegexRouter
tasks.max=1
database.history.kafka.topic=historic_<db_name>_<category>_<version>
transforms=unwrap,dropPrefix,AddPrefix
transforms.dropPrefix.regex=<db_name>.(.*)
table.whitelist="<list of 120 tables; comma separated; format= db_name.table_name>"
transforms.AddPrefix.replacement=<env_name>_$0
database.jdbc.driver=com.mysql.cj.jdbc.Driver
decimal.handling.mode=double
transforms.AddPrefix.regex=.*
snapshot.new.tables=parallel
offset_flush_timeout_ms=10000
database.history.skip.unparseable.ddl=true
heartbeat.topics.prefix=debezium-heartbeat
transforms.unwrap.type=io.debezium.transforms.UnwrapFromEnvelope
database.whitelist=<db_name>
snapshot.fetch.size=100000
transforms.dropPrefix.replacement=$1
bigint.unsigned.handling.mode=long
database.user=<db_user>
database.server.id=<server_id>
database.history.kafka.bootstrap.servers=<host>:9092
time.precision.mode=connect
database.server.name=<server_name>
transforms.dropPrefix.type=org.apache.kafka.connect.transforms.RegexRouter
database.port=<port>
inconsistent.schema.handling.mode=warn
offset_flush_interval_ms=60000
database.serverTimezone=UTC
database.hostname=<db_host_name>
database.password=<db_pwd>
errors.tolerance=all
database.history=io.debezium.relational.history.KafkaDatabaseHistory
snapshot.mode=when_needed


Hieu Lam Tri

Apr 26, 2021, 7:14:03 AM
to debezium
Hi Arun,
I think when you update your connector's config, it only streams data when there is an update on the source table; it doesn't stream from the beginning.
I asked this question on Gitter, and Jiri advised doing the following:
1. Stop your old connector and update it.
2. Add a new connector that includes the new tables. This will stream those tables from the beginning.
3. Delete the connector from step 2 and start your original connector again.
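Against the Kafka Connect REST API, the three steps above could look roughly like this (a sketch; the endpoint, connector names, and config file names are placeholders to adapt):

```shell
#!/bin/sh
# Sketch of the three-step backfill against the Kafka Connect REST API.
# Host, connector names, and JSON files are placeholders.
CONNECT=${CONNECT:-http://localhost:8083}

# 1. Stop the existing connector and update its whitelist to include the new tables.
curl -X PUT "$CONNECT/connectors/prod-mysql/pause"
curl -X PUT "$CONNECT/connectors/prod-mysql/config" \
     -H 'Content-Type: application/json' -d @updated-config.json

# 2. Register a temporary connector that snapshots only the new tables from the beginning.
curl -X POST "$CONNECT/connectors" \
     -H 'Content-Type: application/json' -d @backfill-connector.json

# 3. Once the snapshot completes, delete the temporary connector and resume the original.
curl -X DELETE "$CONNECT/connectors/prod-mysql-backfill"
curl -X PUT "$CONNECT/connectors/prod-mysql/resume"
```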

Hope that helps.
Regards,
Hieu