Not capturing CDC for table w/o PK even with specified message.key.columns

138 views
Skip to first unread message

Mitchell Bregman

unread,
Sep 11, 2023, 9:39:23 AM9/11/23
to debezium
Hi there. First time posting in this group, let me know if I am missing context.

I have a Postgres 14 database with schema named: device and table named: health_checks. This table does not have a PK, rather the combination of columns: device_id,metric,created_at will create a unique key. So in my application.properties, I have specified the message.key.columns: 
public.health_checks:device_id,metric,created_at. I should also note that this is not the only table in my setup, however this is the only table without a PK.

When deploying the server to track change events on this database, we are not receiving changes on the health_checks table, but are getting messages for dbz_heartbeats and other tables for example. Does adding the message.key.columns property require me to include a key columns for all other tables in this database as well? I figured not including would automatically assume to use the PK for the other tables.

If it's helpful to include my entire application.properties I can share, let me know

Any insight helps! Thanks so much.


Mitchell Bregman

unread,
Sep 11, 2023, 11:51:57 AM9/11/23
to debezium
Sorry, clarification: schema name is: public and database name is: device

Chris Cranford

unread,
Sep 11, 2023, 1:02:28 PM9/11/23
to debe...@googlegroups.com
Hi Mitchell,

You can safely supply a "message.key.columns" configuration only for the key-less table and that should suffice.  All other captured tables will use the configured primary key instead. 

For the lack of changes from the table, can you confirm that the "table.include.list" is specifying the table as "public.health_checks".  Additionally, if you're using pgoutput, can you confirm which "publication.autocreate.mode" you are using?  If this is set to "FILTERED" or "DISABLED", make sure that the publication includes the "health_checks" table or that the publication was created with ALL TABLES.

The application.properties would be helpful to identify if some other configuration is the culprit.

Thanks,
Chris
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/debezium/94fe1ecf-7cad-4201-8edc-b0c95779d646n%40googlegroups.com.

Mitchell Bregman

unread,
Sep 11, 2023, 5:29:45 PM9/11/23
to debezium

For some more context, I have deployed Debezium against a bunch of other DBs, however all tables within them contain PKs. This is our first time coming across a DB without PK set. With this DB, we are receiving messages from all tables that have a PK, but the table (device_health_checks) we do not get messages and result in slot lag growing fast due to messages not being flushed.

Attached is the application.properties (omitting some sensitive info)

I also should include that we have implemented a custom ChangeConsumer as such:

@Override public void handleBatch(List<ChangeEvent<Object, Object>> records, RecordCommitter<ChangeEvent<Object, Object>> committer) throws InterruptedException { for (ChangeEvent<Object, Object> record : records) { LOGGER.trace("Received event '{}'", record); Object rv = record.value(); if (rv == null) { rv = ""; } PutRecordRequest putRecord; try { UUID uuid = UUID.randomUUID(); String uuidAsString = uuid.toString(); putRecord = PutRecordRequest.builder() .partitionKey((record.key() != null) ? md5(getString(record.key())) : uuidAsString) .streamName(streamNameMapper.map(record.destination())) .data(SdkBytes.fromByteArray(getBytes(rv))) .build(); client.putRecord(putRecord); } catch (NoSuchAlgorithmException e) { LOGGER.warn("Exception putting record to Kinesis due to hashing algorithm used: {}", e); } committer.markProcessed(record); } committer.markBatchFinished(); }

Let me know if anything seems off to you!

application.properties

jiri.p...@gmail.com

unread,
Sep 12, 2023, 12:09:36 AM9/12/23
to debezium
Hi,

what is the replica identity for the problematic table? Could you check if you'd get data from snapshot? Could you please share the full log?

J.

Mitchell Bregman

unread,
Sep 12, 2023, 10:48:58 AM9/12/23
to debezium
Hi J,

To answer your questions:
1. The replica identity is "default".
2. Data comes through during a snapshot
3. Attached is full log.

Let me know if I can provide any other context!

Mitch
demo.log

Chris Cranford

unread,
Sep 12, 2023, 10:58:16 AM9/12/23
to debe...@googlegroups.com
Hi Mitchell,

Do you get changes if you adjust the replica identity to FULL?

Chris

Mitchell Bregman

unread,
Sep 12, 2023, 2:21:54 PM9/12/23
to debezium
No changes upon adjusting replica identity to FULL. Any other ideas?

Mitchell Bregman

unread,
Sep 12, 2023, 2:23:41 PM9/12/23
to debezium
Still processing messages for other tables, just not the target table (device_health_checks)

FWIW - log included here to show REPLICA IDENTITY as FULL:

2023-09-12 18:04:29,839 INFO  [io.deb.con.pos.PostgresSchema] (debezium-postgresconnector-my-server-change-event-source-coordinator) REPLICA IDENTITY for 'public.device_health_settings' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns

2023-09-12 18:04:29,840 INFO  [io.deb.con.pos.PostgresSchema] (debezium-postgresconnector-my-server-change-event-source-coordinator) REPLICA IDENTITY for 'public.device_health_checks' is 'FULL'; UPDATE AND DELETE events will contain the previous values of all the columns

2023-09-12 18:04:29,842 INFO  [io.deb.con.pos.PostgresSchema] (debezium-postgresconnector-my-server-change-event-source-coordinator) REPLICA IDENTITY for 'public.device_health_history' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns

2023-09-12 18:04:29,843 INFO  [io.deb.con.pos.PostgresSchema] (debezium-postgresconnector-my-server-change-event-source-coordinator) REPLICA IDENTITY for 'public.dbz_heartbeats' is 'DEFAULT'; UPDATE and DELETE events will contain previous values only for PK columns


jiri.p...@gmail.com

unread,
Sep 13, 2023, 8:42:08 AM9/13/23
to debezium
Hi,

could you please try to enable snapshot and captur the full log too? It seems there is only beginning.

J.

Mitchell Bregman

unread,
Sep 13, 2023, 10:21:26 AM9/13/23
to debezium
I can try to enable snapshot, but for the purposes of this deployment we only want data that is moving forward as there will be a TTL regardless in our destination, so we do not really care for snapshot, but let me know attempt it and see. Will get back to you!

Mitchell Bregman

unread,
Sep 13, 2023, 10:48:26 AM9/13/23
to debezium
Does it require a snapshot if we are using message.key.columns?

jiri.p...@gmail.com

unread,
Sep 14, 2023, 4:28:04 AM9/14/23
to debezium
No, it does not, I want to make sure first that the table is properly handled. Please enable capturing of just this one to simplify logging.

J.

Mitchell Bregman

unread,
Sep 14, 2023, 12:03:56 PM9/14/23
to debezium
Hi Jiri,

Thanks so much for looking into this. Attached are the logs with settings as you described (just this one table, snapshot initial).

debezium.log

jiri.p...@gmail.com

unread,
Sep 15, 2023, 5:02:21 AM9/15/23
to debezium
ok, so it correctly detected. Could you please share the DDL of the table so I can try to reproduce it locally?

J.

Mitchell Bregman

unread,
Sep 15, 2023, 11:03:54 AM9/15/23
to debezium
Here is DDL:

create table device_health_checks ( camera_external_id uuid not null, metric "DeviceHealthMetric" not null, calculated numeric, severity "DeviceHealthSeverity" not null, created_at timestamp with time zone default CURRENT_TIMESTAMP not null, health_score smallint, serial text, camera_id integer, enable_case_create boolean default true not null, case_priority "DeviceHealthCasePriority" default 'medium'::"DeviceHealthCasePriority" not null, device_type text ) partition by RANGE (created_at);

Mitchell Bregman

unread,
Sep 20, 2023, 10:37:53 AM9/20/23
to debezium
Hey Jiri - any ideas?

jiri.p...@gmail.com

unread,
Sep 25, 2023, 12:20:38 AM9/25/23
to debezium
Hi,

I beleive the problem might be in  metric "DeviceHealthMetric" not null - this is custom type. Could you try to do the same on a table without this column?


J.
Reply all
Reply to author
Forward
0 new messages