Hi Debezium Team,
I'm running the embedded Debezium Engine for MySQL in a large-scale environment and hitting severe performance problems. I'm looking for guidance on optimization.
Environment Setup
Debezium Engine (not Kafka Connect)
Scale: ~150 databases × ~400 tables each ≈ 58,000-60,000 tables total
Started from an old binlog position (catching up on a backlog)
Current throughput: only 500-600 records/second
Problem: at this rate, the engine cannot catch up with ongoing database changes
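To make "cannot catch up" concrete, here is the arithmetic (a sketch with illustrative numbers; the backlog size and incoming change rate below are assumptions, not measurements from our system):

```java
// Back-of-envelope catch-up math for a CDC pipeline.
// If the processing rate does not exceed the incoming change rate,
// the binlog backlog grows forever; otherwise the drain time is
// backlog / (processingRate - incomingRate).
public class CatchUpMath {

    /** Seconds needed to drain the backlog, or -1 if it can never drain. */
    public static long secondsToDrain(long backlogRecords,
                                      long incomingPerSec,
                                      long processedPerSec) {
        long surplus = processedPerSec - incomingPerSec;
        if (surplus <= 0) {
            return -1; // backlog grows (or stays flat): we never catch up
        }
        // Round up so a partial final second still counts.
        return (backlogRecords + surplus - 1) / surplus;
    }

    public static void main(String[] args) {
        // Assumed: 10M-record backlog, ~1,000 changes/s incoming.
        System.out.println(secondsToDrain(10_000_000L, 1_000L, 600L));   // -1: 600/s never drains it
        System.out.println(secondsToDrain(10_000_000L, 1_000L, 5_000L)); // 2500s at a Maxwell-like rate
    }
}
```

So as long as throughput stays below the incoming change rate, no amount of waiting closes the gap.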
Configuration
Kafka Producer Settings
```properties
linger.ms=15
batch.size=32768
retries=5
delivery.timeout.ms=160000
request.timeout.ms=30000
# Serialization
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
# SASL/Security
sasl_enable=true
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
```
Debezium Engine Settings
```properties
# Connector
connector.class=io.debezium.connector.mysql.MySqlConnector
topic.prefix=prod
# Snapshot
snapshot.mode=no_data
snapshot.locking.mode=none
# Batching
max.batch.size=600
max.queue.size=2400
# Offset Storage
offset.storage=org.apache.kafka.connect.storage.KafkaOffsetBackingStore
offset.storage.replication.factor=3
offset.storage.partitions=1
offset.flush.interval.ms=0
offset.flush.timeout.ms=10000
offset.storage.topic.producer.acks=1
# Schema History
schema.history.internal.producer.acks=1
schema.history.internal.recover.from.snapshot=true
schema.history.internal.kafka.recovery.poll.interval.ms=10000000
database.history.store.only.captured.tables.ddl=true
# Data Handling
decimal.handling.mode=double
include.schema.changes=true
# Converters
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false
converter.schemas.enable=false
# Heartbeat & Partitioning
heartbeat.interval.ms=30000
enable.custom.partitioner=true
partitioner.class=com.pe.dataplatform.common.utils.kafkawriter.KafkaCustomPartitioner
# Timeouts
consumer.request.timeout.ms=30000
producer.request.timeout.ms=30000
```
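For reference, the 500-600 records/second figure is measured roughly like this: a counter incremented from the engine's change-event consumer and sampled over elapsed time (a simplified sketch; `ThroughputMeter` is our own helper, not a Debezium class):

```java
import java.util.concurrent.atomic.AtomicLong;

// Simple throughput meter: count events as they are handled,
// then compute the average rate over a known elapsed window.
public class ThroughputMeter {
    private final AtomicLong total = new AtomicLong();

    /** Called from the engine's notifying(...) consumer for each record or batch. */
    public void add(long records) {
        total.addAndGet(records);
    }

    /** Average records/second over the given elapsed milliseconds. */
    public double ratePerSecond(long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return total.get() * 1000.0 / elapsedMillis;
    }
}
```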
Questions
1. The engine currently cannot keep up with ongoing database changes. Is there a way to increase throughput?
2. We previously used Maxwell on this workload, and it processed 3,000-5,000 records per second.
3. Are we missing anything on the configuration side?