We also thought that the process might be slowed down by DML running during the sink process. We ran a new test in a controlled environment, and the performance remained constant throughout the process.
I've attached an image which shows the bytes in/out during the snapshot+sink process. This test included 7 tables: 5 with just a few rows, 1 with 21M rows, and another with ~6M rows. It is clear from the image that the snapshot throughput is much higher than the sink throughput. Is there a way to mitigate this?
I'm running CDC in Docker containers. The machine has 64GB of RAM and 4 cores.
{
  "name": "xxxx",
  "config": {
    "connector.class": "io.debezium.connector.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics": "TOPICNAME",
    "connection.url": "XXXXX",
    "connection.username": "USER",
    "connection.password": "PW",
    "quote.identifiers": "true",
    "schema.evolution": "none",
    "insert.mode": "upsert",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "delete.enabled": "true",
    "primary.key.mode": "record_key",
    "primary.key.fields": "PK1,PK2,PK3,PK4",
    "table.name.format": "XXXXX"
  }
}