Documents are duplicated on MongoDB Replication

72 views
Skip to first unread message

Yuto Umeda

unread,
Sep 22, 2021, 7:41:24 AM9/22/21
to Tungsten Replicator Discuss
version: Tungsten Replicator 6.1.13 build 45
replication: MySQL on RDS(v5.7.22) -> Extractor -> Applier -> MongoDB Atlas

Hi, we use Tungsten Replicator extracts from MySQL and applies to MongoDB.
When executing UPDATE a record on MySQL, it seems that a document on MongoDB that is same as the record will be updated as well, but sometimes the document is not updated but inserted as new document.
An example is shown below.

----
ex.) We have a table has columns 'id'(PK, int) and 'name'(varchar) and that is replicated to mongo.

MySQL: {id: 1, name: "John"} 
MongoDB: {_id: "aaaaa", id: 1, name: "John"}

Then Update the record on MySQL {id: 1, name: "Paul"}, the documents are duplicated like below.

MySQL: {id: 1, name: "Paul"} 
MongoDB: {_id: "aaaaa", id: 1, name: "John"}, {_id: "bbbbb", id: 1, name: "Paul"}
---


We investigated this matter by checking THL log files.
However, logs are same as when UPDATE works properly and we haven't found out the cause yet.

We have questions below.

1. Does Applier use 'upsert' when updating a document on mongo?
2. Does Applier find a target document by using a primary key on MySQL?
3. Is there any way to avoid duplication?

Sorry for my bad english,
Thanks

Chris Parker

unread,
Sep 22, 2021, 9:13:44 AM9/22/21
to tungsten-repl...@googlegroups.com
Hi 

Thanks for the information.  Let me investigate this for you.

Could you possibly send me the config from the extractor and applier?

If you go onto each host and connect as tungsten (sudo su - tungsten) and then issue `tpm reverse` on each and send me the output that will help me to see your config.

Thanks
Chris

Chris Parker
Customer Success Director, EMEA & APAC
Continuent Ltd., a Delaware Corporation



--
You received this message because you are subscribed to the Google Groups "Tungsten Replicator Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tungsten-replicator...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tungsten-replicator-discuss/8acf596b-9bbb-4975-9177-0899d6234eadn%40googlegroups.com.

Yuto Umeda

unread,
Sep 23, 2021, 11:58:59 PM9/23/21
to Tungsten Replicator Discuss
Hi Chris, thank you for your reply.

Results of 'tpm reverse' the below (some values such as credentials are replaced for dummy values)

---
##
# Extractor
##

$ tpm reverse
# Installed from tungsten@myservice-tungsten-extraction:/opt/continuent/software/tungsten-replicator-for-mysql-extractor-6.1.13-45
# Configuration built from /etc/tungsten/tungsten.ini
# Defaults for all data services and hosts
tools/tpm configure defaults \
--auto-recovery-max-attempts=10 \
--disable-security-controls=true \
--install-directory=/opt/continuent \
--mysql-allow-intensive-checks=true \
'--profile-script=~/.bash_profile' \
--skip-validation-check=InstallerMasterSlaveCheck \
--skip-validation-check=MySQLBinaryLogsEnabledCheck \
--skip-validation-check=MySQLMyISAMCheck \
--skip-validation-check=MySQLPermissionsCheck \
--skip-validation-check=RowBasedBinaryLoggingCheck \
--user=tungsten
# Options for the myservice data service
tools/tpm configure myservice \
--master=myservice-tungsten-extraction \
--members=myservice-tungsten-extraction
# Options for myservice-tungsten-extraction
tools/tpm configure --hosts=myservice-tungsten-extraction \
--java-mem-size=3072 \
--rmi-port=10001
# Options for myservice-tungsten-extraction
tools/tpm configure myservice \
--hosts=myservice-tungsten-extraction \
--datasource-mysql-conf=/dev/null \
--enable-heterogeneous-service=true \
--privileged-master=false \
--property=replicator.filter.pkey.addColumnsToDeletes=true \
--property=replicator.filter.pkey.addPkeyToInserts=true \
--property=replicator.filter.replicate.do=myschema.mytable \
--property=replicator.service.comments=true \
--replication-host=xxxxxx.rds.amazonaws.com \
--replication-password=XXXXXXXXX \
--replication-user=myuser \
--svc-extractor-filters=dropcatalogdata,replicate,colnames,pkey \
--thl-log-retention=7d \
--thl-port=2112


##
# Applier
##

$ tpm reverse
# Installed from tungsten@myservice-tungsten-target:/opt/continuent/software/tungsten-replicator-for-mongodb-6.1.13-45
# Configuration built from /etc/tungsten/tungsten.ini
# Defaults for all data services and hosts
tools/tpm configure defaults \
--auto-recovery-max-attempts=10 \
--disable-security-controls=false \
--enable-rmi-authentication=false \
--enable-rmi-ssl=false \
--enable-thl-ssl=false \
--install-directory=/opt/continuent \
--mysql-allow-intensive-checks=true \
'--profile-script=~/.bash_profile' \
--user=tungsten
# Options for the myservice data service
tools/tpm configure myservice \
--master=myservice-tungsten-extraction \
--members=myservice-tungsten-target
# Options for myservice-tungsten-target
tools/tpm configure --hosts=myservice-tungsten-target \
--java-mem-size=3072 \
--rmi-port=10001
# Options for myservice-tungsten-target
tools/tpm configure myservice \
--hosts=myservice-tungsten-target \
--datasource-type=mongodb \
--master-thl-port=2112 \
'--property=replicator.applier.dbms.connectString=mongodb+srv://${replicator.global.db.user}:${replicator.global.db.password}@${replicator.global.db.host}/?retryWrites=true&w=majority' \
--property=replicator.filter.dropcolumn.definitionsFile=/opt/continuent/share/dropcolumn.json \
--property=replicator.filter.rename.definitionsFile=/opt/continuent/share/rename.csv \
--property=replicator.filter.replicate.do=myschema.mytable \
--replication-host=xxxxxxx.mongodb.net \
--replication-password=XXXXXXXX \
--replication-port=27017 \
--replication-user=myuser \
--role=slave \
--svc-applier-filters=dropstatementdata,replicate,dropcolumn,rename \
--thl-log-retention=7d \
--thl-port=2112
---

Thanks

2021年9月22日水曜日 22:13:44 UTC+9 chris....@continuent.com:

Yuto Umeda

unread,
Sep 28, 2021, 5:05:04 AM9/28/21
to Tungsten Replicator Discuss
Hi, Chris
I would appreciate it if you would update me on this matter.
Thank you so much for your investigating.

2021年9月24日金曜日 12:58:59 UTC+9 Yuto Umeda:

Chris Parker

unread,
Sep 28, 2021, 3:11:39 PM9/28/21
to tungsten-repl...@googlegroups.com
Hi Yuto,

I have some more information for you from our engineering team, specifically answers to your three questions:

  1. Does Applier use 'upsert' when updating a document on mongo?
-> yes
  1. Does Applier find a target document by using a primary key on MySQL?
-> yes
  1. Is there any way to avoid duplication?
-> add unique index on mongodb side 

Can you confirm if you have keys setup in mongo? If not, we recommend that you create unique keys that match the primary keys within MySQL.

This doc page may also be useful:


Thanks
Chris

Chris Parker
Customer Success Director, EMEA & APAC
Continuent Ltd., a Delaware Corporation

Yuto Umeda

unread,
Sep 29, 2021, 3:39:03 AM9/29/21
to Tungsten Replicator Discuss
Hi Chris, Thank you for your quick reply.


> add unique index on mongodb side
> Can you confirm if you have keys setup in mongo? If not, we recommend that you create unique keys that match the primary keys within MySQL.
We don't have unique keys in mongo, so we will try. Thanks.

Sorry I have one more question,
Does status of applier change to OFFLINE:ERROR if applier cannot update a document due to a unique key on mongo?
If changes to OFFLINE:ERROR, is there any way to ignore the error and keep status ONLINE when getting the error due to unique keys?

Thanks
2021年9月29日水曜日 4:11:39 UTC+9 chris....@continuent.com:

Chris Parker

unread,
Sep 29, 2021, 4:23:51 AM9/29/21
to tungsten-repl...@googlegroups.com
Hi Yuto,

Correct, on a dupliocate key error the replicator will go into an OFFLINE:ERROR state, the only way to pass this is to then skip the error.

In the trepctl status output you will see pendingErrorSeqNo

Using this value you can then issue trepctl online -skip-seqno xx where xx is the value from above.

There is currently no way to tell the replicator to do this automatically, as that could introduce significant data drift and present further issues, therefore this needs to be a manual choice/operation.

Thanks
Chris

Chris Parker
Customer Success Director, EMEA & APAC
Continuent Ltd., a Delaware Corporation

chris....@continuent.com
+44 (0)7790 685 269


Reply all
Reply to author
Forward
0 new messages