MySQL connector - GTID purged


David Ešner

May 13, 2024, 3:43:06 AM
to debezium
Hi, we are running the Debezium Engine and we are running into issues when the connector is stopped and restarted after a longer period (15+ minutes). It seems that after the restart the GTIDs the connector needs no longer exist on the server, and the only way to fix it is to perform the initial snapshot again (it works when I set the snapshot mode to `when_needed`).

My question is, is there any special mechanism in the connector that would somehow mark those GTIDs to be purged? 

Or is there any server-side setting to increase the period for which the GTIDs are kept, so that we do not need to perform an initial snapshot when the connector is stopped for more than 15 minutes? It seems that the `binlog_expire_logs_seconds=2592000` variable has no effect.
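For reference, the retention-related state can be inspected directly on the server; a few read-only queries, assuming MySQL 8.0 syntax (older versions use `expire_logs_days` instead of `binlog_expire_logs_seconds`):

```
-- Which transactions the server knows about and which are already gone from the binlog
SELECT @@global.gtid_executed, @@global.gtid_purged;

-- Configured time-based binlog retention
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';

-- The oldest file listed here bounds how far back a client can resume streaming
SHOW BINARY LOGS;
```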

This is the log that leads to failure:
```
Stopping down connector
Some of the GTIDs needed to replicate have been already purged
GTIDs known by the server but not processed yet 'a1cc6bc0-bf13-11ed-bae9-028ced4c5c33:29836-29866', for replication are available only 'a1cc6bc0-bf13-11ed-bae9-028ced4c5c33:29842-29866'
Server has already purged 'a1cc6bc0-bf13-11ed-bae9-028ced4c5c33:1-29841' GTIDs
The current GTID set 'a1cc6bc0-bf13-11ed-bae9-028ced4c5c33:1-29866' does not contain the GTID set 'a1cc6bc0-bf13-11ed-bae9-028ced4c5c33:1-29835' required by the connector
GTID Set retained: 'a1cc6bc0-bf13-11ed-bae9-028ced4c5c33:1-29835'
```

Thank you


David Ešner

May 13, 2024, 3:52:31 AM
to debezium
Just to add: when I set gtid_mode=OFF and the connector uses the binlog position instead, everything works fine even after the connector was stopped for a long period.

On Monday, May 13, 2024 at 9:43:06 AM UTC+2, David Ešner wrote:

jiri.p...@gmail.com

May 14, 2024, 6:06:17 AM
to debezium
Hi,

Debezium looks at the gtid_purged variable, which contains the list of transactions whose binlog entries have already been purged, and refuses to start if that list includes any transaction the connector has not yet processed.
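Roughly speaking, the check amounts to something like the following (a sketch in plain SQL, not the connector's actual code; the example GTID set is copied from the failure log earlier in the thread):

```
-- GTID set the connector has already processed, as stored in its offsets
SET @processed := 'a1cc6bc0-bf13-11ed-bae9-028ced4c5c33:1-29835';

-- 1 = everything the server has purged was already processed, so streaming can resume
-- 0 = transactions the connector still needs are gone, so a new snapshot is required
SELECT GTID_SUBSET(@@global.gtid_purged, @processed) AS safe_to_resume;
```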

Is there any other process that purges the binlog? It is a bit surprising that the connector is able to resume when GTIDs are off. That would indicate the purging is not done periodically, but that another element is in play as well.

Jiri

David Ešner

May 14, 2024, 11:16:38 AM
to debezium
Hi Jiri, thank you very much for your answer!

I tested this on AWS RDS and on a DigitalOcean DB cluster; in both cases we also have a read-only (RO) replica connected to the master from which the Debezium connector is syncing. Could these RO replicas be the cause?

I tried to set up a standalone server without any RO replica, and with GTID mode ON it seems to work just fine. Is there perhaps any setting I am missing on the servers that have a replica?
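One thing that may be worth checking on the RDS side (an assumption on my part, not something confirmed in this thread): RDS purges binary logs on its own schedule regardless of `binlog_expire_logs_seconds`, and the retention is controlled by an RDS-specific setting. Something along these lines, on RDS only:

```
-- Show the current RDS binlog retention ('binlog retention hours' = NULL means
-- the logs may be removed as soon as RDS no longer needs them)
CALL mysql.rds_show_configuration;

-- Keep binlogs for e.g. 7 days (value is in hours)
CALL mysql.rds_set_configuration('binlog retention hours', 168);
```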

On Tuesday, May 14, 2024 at 12:06:17 PM UTC+2, jiri.p...@gmail.com wrote:

jiri.p...@gmail.com

May 16, 2024, 1:17:12 AM
to debezium
Hi,

I'd recommend trying it locally, with an RO replica. The trouble with cloud services is that you never know what other factors/changes/services they bring into play without your knowledge.

If it works locally, then the only option is to contact the cloud provider's support and ask them.

Jiri

David Ešner

May 27, 2024, 4:00:35 AM
to debezium
I can confirm that it works fine in other setups, local or even in GCP managed SQL with an RO replica. Now I am just very curious what kind of setup or process in the aforementioned environments causes the binlog to be purged. I will try to investigate with the cloud provider and will share my findings here for future reference. But for now, it seems that this is truly related to the source setup.

Thanks for your help.

 

On Thursday, May 16, 2024 at 7:17:12 AM UTC+2, jiri.p...@gmail.com wrote:

David Ešner

Jul 8, 2024, 2:51:48 AM
to debezium
Unfortunately, this is a frequently recurring problem and we have not been able to find the cause yet. Is there any way to force Debezium to use the binlog position only, rather than GTIDs?

Thank you

On Monday, May 27, 2024 at 10:00:35 AM UTC+2, David Ešner wrote:

Chris Cranford

Jul 8, 2024, 10:05:42 AM
to debezium
Hi - 

Currently GTID-based positioning is always enabled for MariaDB, and for MySQL it is only enabled if GTID_MODE=ON on the server.
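So the way to fall back to plain binlog positions with the MySQL connector is to turn GTIDs off on the server, which matches what was observed earlier in this thread. A minimal sketch of that transition on a self-managed server (gtid_mode must be stepped down one value at a time; on managed services this is normally done through the parameter group or instance settings, and it affects every replication consumer, so treat it with care):

```
-- Step down from ON one value at a time, letting replicas catch up between steps
SET GLOBAL gtid_mode = ON_PERMISSIVE;
SET GLOBAL gtid_mode = OFF_PERMISSIVE;
SET GLOBAL gtid_mode = OFF;
```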

Chris 

David Ešner

Jul 9, 2024, 8:21:55 AM
to debezium
Thank you Chris! 

On Monday, July 8, 2024 at 4:05:42 PM UTC+2, Chris Cranford wrote:

rohit singh

Aug 7, 2024, 6:56:33 AM
to debezium
Hi Team,

Is there any solution for this one? I am also getting the same issue.

The problem is that I have to keep snapshot.mode set to "never" because of a DB access issue.

Please let me know if there is any solution for handling this problem.


Connector configuration:

```
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"tasks.max": "1",
"database.hostname": "*.*.*.*",
"database.connectionTimeZone": "UTC",
"database.port": "3306",
"config.providers": "file",
"config.providers.file.class": "org.apache.kafka.common.config.provider.FileConfigProvider",
"database.user": "${file:/etc/debezium/secrets/config.properties:database.user}",
"database.password": "${file:/etc/debezium/secrets/config.properties:database.password}",
"database.server.name": "dev-on-prem-1",
"database.include.list": "distribution",
"table.include.list": "distribution.reportaudittest",
"topic.prefix": "nimbus_dev.ATTTEST.VBC.alpha",
"schema.history.internal.kafka.bootstrap.servers": "<Bootstrap Servers>",
"schema.history.internal.kafka.topic": "nimbus_dev.ATTTEST.VBC.alpha.distribution.history",
"schema.history.window.size.minutes": "1440",
"schema.history.internal.kafka.recovery.attempts": 100,
"schema.history.internal.producer.security.protocol": "SSL",
"schema.history.internal.producer.ssl.truststore.location": "/etc/debezium/secrets/truststore.jks",
"schema.history.internal.producer.ssl.truststore.password": "prmcert",
"schema.history.internal.producer.ssl.keystore.location": "/etc/debezium/secrets/keystore.jks",
"schema.history.internal.producer.ssl.keystore.password": "prmcert",
"schema.history.internal.producer.ssl.key.password": "prmcert",
"schema.history.internal.consumer.security.protocol": "SSL",
"schema.history.internal.consumer.ssl.truststore.location": "/etc/debezium/secrets/truststore.jks",
"schema.history.internal.consumer.ssl.truststore.password": "prmcert",
"schema.history.internal.consumer.ssl.keystore.location": "/etc/debezium/secrets/keystore.jks",
"schema.history.internal.consumer.ssl.keystore.password": "prmcert",
"schema.history.internal.consumer.ssl.key.password": "prmcert",
"schema.history.internal.store.only.captured.tables.ddl":"false",
"include.schema.changes": "true",
"snapshot.mode":"never",
"transforms": "route",
"transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
"transforms.route.regex": "^nimbus_dev.ATTTEST.VBC.alpha$",
"transforms.route.replacement": "nimbus_dev.ATTTEST.VBC.alpha.schema-changes",
"schema.history.internal.producer.group.id": "dev-nimbus-vbr-consumer-1",
"schema.history.internal.consumer.group.id": "dev-nimbus-vbr-consumer-1",
"group.id": "dev-nimbus-vbr-consumer-1",
"key.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"key.converter.schema.registry.url": "<schema registry>",
"value.converter.schema.registry.url": "<schema registry>",
"plugin.path": "/kafka/connect/",
"tombstones.on.delete": "false"


Thanks
Rohit Singh