Status: New
Owner: ----
Labels: Type-Defect Priority-Medium FoundIn
New issue 1108 by
vait...@gmail.com: Replication from MySQL into Vertica 7
(delete fails) with tungsten-replicator-3.0.0-524
https://code.google.com/p/tungsten-replicator/issues/detail?id=1108
What steps will reproduce the problem?
1. Create database 'myschema' in MySQL or MariaDB (either one works) with the following table:
CREATE TABLE country
(
id CHAR(2),
name VARCHAR(64) NOT NULL,
capital_city_id INT,
PRIMARY KEY(id)
) ENGINE = INNODB;
2. Add the following definitions for the regular and stage tables to Vertica 7:
CREATE TABLE myschema.country
(
id CHAR(2) NOT NULL PRIMARY KEY,
name VARCHAR(64) NOT NULL
)
ORDER BY id;
CREATE TABLE myschema.stage_xxx_country
(
tungsten_opcode CHAR(1) ,
tungsten_seqno INT ,
tungsten_row_id INT ,
tungsten_commit_timestamp TIMESTAMP ,
id CHAR(2) ,
name VARCHAR(64)
) ORDER BY tungsten_seqno, tungsten_row_id;
3. Use the dropcolumn.js filter with the following JSON:
[
{
"schema": "myschema",
"table": "country",
"columns": [
"capital_city_id"
]
}
]
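As configured above, the dropcolumn filter should strip capital_city_id from each row event on the extractor before it reaches the applier. A minimal sketch of that intended behaviour (hypothetical illustration only, not the actual dropcolumn.js implementation; plain column maps stand in for the replicator's row structures):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class DropColumnSketch {
    // Remove the configured columns from a row image, mirroring what the
    // dropcolumn filter is expected to do on the extractor side.
    static Map<String, Object> dropColumns(Map<String, Object> row, Set<String> dropped) {
        Map<String, Object> filtered = new LinkedHashMap<>(row);
        filtered.keySet().removeAll(dropped);
        return filtered;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", "AD");
        row.put("name", "Andorra");
        row.put("capital_city_id", 6579);
        System.out.println(dropColumns(row, Set.of("capital_city_id")));
        // prints {id=AD, name=Andorra}
    }
}
```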
4. Run the master and slave with the following configurations:
Master (extractor):
./tools/tpm configure myschema \
--install-directory=/opt/continuent \
--topology=master-slave \
--enable-heterogenous-service=true \
--disable-relay-logs=true \
--master=db3mysql \
--members=db3mysql,vertica \
--replication-user=tungsten \
--replication-password=XXXXXXX \
--mysql-enable-enumtostring=true \
--mysql-enable-settostring=true \
--mysql-use-bytes-for-string=false \
--svc-extractor-filters=replicate,enumtostring,settostring,pkey,colnames,dropcolumn \
--property=replicator.filter.replicate.do=myschema.country \
--property=replicator.filter.pkey.addColumnsToDeletes=true \
--property=replicator.filter.pkey.addPkeyToInserts=true \
--property=replicator.filter.dropcolumn.definitionsFile=/opt/continuent/share/dropcolumn.json \
--java-file-encoding=UTF8 \
--start
Slave (applier):
./tools/tpm configure myschema \
--master-thl-host=db3mysql \
--install-directory=/opt/continuent/myschema/applier \
--topology=master-slave \
--enable-heterogenous-service=true \
--datasource-type=vertica \
--master=db3mysql \
--members=db3mysql,vertica \
--hosts=vertica \
--replication-user=vertica \
--replication-password=XXXXXXX \
--batch-enabled=true \
--batch-load-language=js \
--batch-load-template=vertica6 \
--vertica-dbname=myschema \
--replication-host=vertica \
--replication-port=5433 \
--property=replicator.applier.dbms=com.continuent.tungsten.replicator.applier.batch.SimpleBatchApplier \
--skip-validation-check=InstallerMasterSlaveCheck \
--svc-applier-block-commit-size=25000 \
--svc-applier-block-commit-interval=3s
5. Fill the MySQL table:
INSERT INTO `country` VALUES('AD', 'Andorra', 6579);
INSERT INTO `country` VALUES('AE', 'United Arab Emirates', 610);
INSERT INTO `country` VALUES('AF', 'Afghanistan', 102889);
INSERT INTO `country` VALUES('AG', 'Antigua and Barbuda', 191177);
INSERT INTO `country` VALUES('AI', 'Anguilla', NULL);
INSERT INTO `country` VALUES('AL', 'Albania', 224153);
INSERT INTO `country` VALUES('AM', 'Armenia', 249432);
6. Delete a row from MySQL:
DELETE FROM `country` WHERE id = 'AD';
What is the expected output?
Expected the row with id = 'AD' to be deleted from Vertica.
What do you see instead?
LOAD DATA ROW count does not match: sql=COPY myschema.stage_xxx_country
FROM '/tmp/staging/myschema/staging0/myschema-country-33643.csv' DIRECT
NULL 'null' DELIMITER ',' ENCLOSED BY '"' expected_copy_rows=1
And the content of '/tmp/staging/myschema/staging0/myschema-country-33643.csv':
<<<CSV
"D","4","1","2015-03-11 13:53:22.000","AC",null,null
>>> CSV
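This is exactly why COPY reports a row-count mismatch: the stage table defines six columns, but the DELETE row in the CSV carries seven fields, the trailing null being the dropped capital_city_id. A quick sketch of the mismatch (the plain split on ',' is safe here because no field contains an embedded comma):

```java
public class CsvFieldCount {
    // Count the fields in one CSV line; adequate here because no field
    // in this row contains an embedded comma.
    static int fieldCount(String row) {
        return row.split(",").length;
    }

    public static void main(String[] args) {
        // The DELETE row as written to the staging CSV: seven fields.
        String deleteRow = "\"D\",\"4\",\"1\",\"2015-03-11 13:53:22.000\",\"AC\",null,null";
        // The stage table defines only six columns: tungsten_opcode,
        // tungsten_seqno, tungsten_row_id, tungsten_commit_timestamp, id, name.
        int stageColumns = 6;
        System.out.println("csvFields=" + fieldCount(deleteRow)
                + " stageColumns=" + stageColumns);
        // prints csvFields=7 stageColumns=6
    }
}
```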
What version of the product are you using?
3.0.0-524
On what operating system?
CentOS 6
Please provide any additional information below.
When I update any record, the CSV contains the following correct DELETE data:
<<<CSV
"D","2","1","2015-03-11 13:44:32.000","AD",null
"I","2","2","2015-03-11 13:44:32.000","AD","AD UPDATE"
>>> CSV
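The UPDATE case works because it is rendered as a DELETE row followed by an INSERT row, and both match the six-column stage layout. A sketch reproducing that rendering (the csvRow helper is hypothetical, with simplified quoting, and only illustrates the CSV shape shown above):

```java
public class UpdateRowsSketch {
    // Hypothetical helper: render one staging-CSV row in the format shown above.
    static String csvRow(String opcode, int seqno, int rowId, String ts, String... cols) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(opcode).append("\",\"").append(seqno)
          .append("\",\"").append(rowId).append("\",\"").append(ts).append('"');
        for (String c : cols)
            sb.append(',').append(c == null ? "null" : "\"" + c + "\"");
        return sb.toString();
    }

    public static void main(String[] args) {
        String ts = "2015-03-11 13:44:32.000";
        // An UPDATE becomes a DELETE row plus an INSERT row, each carrying
        // the stage table's six columns.
        System.out.println(csvRow("D", 2, 1, ts, "AD", null));
        System.out.println(csvRow("I", 2, 2, ts, "AD", "AD UPDATE"));
        // prints:
        // "D","2","1","2015-03-11 13:44:32.000","AD",null
        // "I","2","2","2015-03-11 13:44:32.000","AD","AD UPDATE"
    }
}
```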
Notes:
I also did some investigation and suspect the error comes not from my
configuration but from this code in
src/java/com/continuent/tungsten/replicator/applier/batch/SimpleBatchApplier.java:
// Fetch column names and values.
List<ColumnSpec> keySpecs = orc.getKeySpec();
List<ColumnSpec> colSpecs = orc.getColumnSpec();
ArrayList<ArrayList<ColumnVal>> keyValues = orc.getKeyValues();

// If the colspecs are empty, use keyspecs. This gets around an
// upstream bug in column metadata generation for tables without
// primary keys (Issue 916).
if (colSpecs.size() == 0)
    colSpecs = keySpecs;

// Get information about the table definition.
Table tableMetadata = this.getTableMetadata(schema, table, colSpecs, keySpecs);

// Insert each column into the CSV file.
writeValues(seqno, commitTimestamp, service, tableMetadata, keySpecs, keyValues, DELETE);
Here you can see that colSpecs is replaced by keySpecs, which is why the
CSV is produced with all key fields. Maybe I need to configure the applier
somehow to tell it that I don't need the extra (capital_city_id) column?
On the other hand, I can see that SimpleBatchApplier is able to produce a
correct CSV for a DELETE statement (e.g. myschema-country-2.csv).
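One possible direction, sketched only as an illustration and not the project's actual fix: restrict the key columns used for DELETE rows to the columns that actually exist in the target stage table, so a column removed by the dropcolumn filter never reaches the CSV writer. Plain strings stand in for ColumnSpec here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class KeySpecFilterSketch {
    // Keep only the key columns that exist in the target stage table, so a
    // column removed by the dropcolumn filter is never written to the CSV.
    static List<String> restrictToStageColumns(List<String> keySpecs, Set<String> stageColumns) {
        List<String> result = new ArrayList<>();
        for (String spec : keySpecs)
            if (stageColumns.contains(spec))
                result.add(spec);
        return result;
    }

    public static void main(String[] args) {
        List<String> keySpecs = List.of("id", "name", "capital_city_id");
        Set<String> stageColumns = Set.of("id", "name");
        System.out.println(restrictToStageColumns(keySpecs, stageColumns));
        // prints [id, name]
    }
}
```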