Debezium suddenly began inspecting all tables (including all columns) after a large batch change (1,000 rows or more)


Rafli J

Jun 26, 2024, 6:01:17 AM
to debezium
Hi,
I work with Debezium and CitusDB (PostgreSQL) in a multi-tenant environment. I use one connector per node, with multiple schemas (over 500) and dozens of tables. This is my connector configuration:
{
  "name": "worker1",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "***",
    "database.port": "***",
    "database.user": "***",
    "database.password": "***",
    "database.dbname": "***",
    "database.server.name": "worker1",
    "plugin.name": "pgoutput",
    "slot.name": "worker1",
    "publication.name": "dbz_publication",
    "table.include.list": "^(.*)\\.(Rosters|Leaves|Permits|Overtimes|GeneralSettings|DateProcessingTools|ApprovalStatuses|EmployeeTransactions|PermitTransactions|LeaveTransactions|EmployeeLeaves|OvertimeTransactions|Employees)$",
    "transforms": "Reroute",
    "transforms.Reroute.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.Reroute.regex": "(.*)\\.(.*)\\.(.*)",
    "transforms.Reroute.replacement": "$3",
    "tombstones.on.delete": "false",
    "time.precision.mode": "connect",
    "decimal.handling.mode": "double",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "false",
    "value.converter.schemas.enable": "false",
    "snapshot.mode": "never",
    "schema.refresh.mode": "columns_diff",
    "table.ignore.builtin": "true",
    "topic.creation.enable": "true",
    "topic.prefix": "dbz-",
    "topic.creation.default.replication.factor": 1,
    "topic.creation.default.partitions": 10,
    "topic.creation.default.cleanup.policy": "compact",
    "topic.creation.default.compression.type": "lz4",
    "max.batch.size": "1024"
  }
}
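For reference, you can check how many relations the publication actually covers with a standard catalog query against pg_publication_tables; with over 500 schemas this comes to thousands of tables:

-- Count the relations covered by the publication across all schemas.
SELECT count(*)
FROM   pg_publication_tables
WHERE  pubname = 'dbz_publication';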
The connector works fine, but I ran into an issue when I made a large batch update of 1,000 rows at once: Debezium suddenly inspected all tables (including their columns), even tables that were not part of the update.
How can I prevent this unnecessary table inspection? During that process Debezium did not publish any changes to Kafka; once it completed, Debezium resumed capturing and publishing changes to Kafka as expected.
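For context, the batch change that triggers this is a single statement touching up to 1,000 rows in one tenant schema, roughly like the sketch below (the tenant_001 schema and the Id/UpdatedAt columns are made up for illustration; Employees is one of the captured tables):

-- Illustrative only: one statement updating ~1,000 rows in a single
-- tenant schema; the schema and column names here are hypothetical.
UPDATE tenant_001."Employees"
SET    "UpdatedAt" = now()
WHERE  "Id" IN (SELECT "Id" FROM tenant_001."Employees" LIMIT 1000);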

jiri.p...@gmail.com

Jun 27, 2024, 4:21:07 AM
to debezium
Hi,

This happens when PostgreSQL sends Relation messages into the logical decoding stream, so the question is what triggered the database to send that many Relation messages.
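If you want to verify that, one rough sketch (assuming you stop the connector so the slot is free, and that you have replication privileges) is to peek at the slot without consuming it and count the pgoutput message types; Relation messages carry the type byte 'R':

-- Peek at pending changes on the slot without advancing it.
-- The first byte of each pgoutput message is its type:
-- 'B' Begin, 'C' Commit, 'R' Relation, 'I' Insert, 'U' Update, 'D' Delete.
SELECT chr(get_byte(data, 0)) AS msg_type, count(*)
FROM   pg_logical_slot_peek_binary_changes(
         'worker1', NULL, 10000,
         'proto_version', '1',
         'publication_names', 'dbz_publication')
GROUP  BY 1;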

Jiri

Rafli J

Jun 27, 2024, 5:55:38 AM
to debezium
Hi Jiri,

This happens because I often change table data in batches of up to 1,000 rows at a time.

jiri.p...@gmail.com

Jun 27, 2024, 7:21:02 AM
to debezium
That still does not explain why PostgreSQL emits them in the first place; something unusual must be forcing it to do so. All in all, this is not something Debezium can influence, so it would probably be best to discuss the matter with the PostgreSQL community.
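As a first data point for that discussion you could, for example, log all DDL for a while and check whether any schema changes coincide with the batch updates (just an idea, not a confirmed cause; requires superuser):

-- Log every DDL statement so schema changes around the batch update
-- become visible in the server log.
ALTER SYSTEM SET log_statement = 'ddl';
SELECT pg_reload_conf();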

Jiri
