I am in the process of evaluating / testing the mk-table-checksum with --replicate switch to check that my slaves and their masters are in sync.
2 of my test cases failed. I am curious if I am missing a command-line switch or configuration OR if I am asking mk-table-checksum to do something that it doesn't do.
Here are the scenarios that failed. They are all DDL related.
MASTER: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2 SLAVE: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2
The environment is isolated --> there are no connections to the instances apart from mine.
Before each test case is executed, I confirm that the mast and slave are in sync. 1- mk-table-checksum -u {name} -p{pwd} h=green,P=3620 --replicate mysql.checksum --emptyrepltbl 2- mk-table-checksum -u {name} -p{pwd} --replicate mysql.checksum --replcheck 1 h=192.168.0.7,P=3610
First test that failed: Add a table to the slave Expected results: The --replcheck switch would note the DDL is different. Actual Result: The --replcheck returned nothing, implying that the master and slave are in sync.
Second test that failed: Drop a table from the slave that exists on both the slave and master. Expected result: The --replcheck switch would note the DDL is different. Actual Result: The --replcheck returned nothing, implying that the master and slave are in sync. Also, Replication broke because of the missing table on the slave: <From SHOW SLAVE STATUS \G> Last_Errno: 1146 Last_Error: Error 'Table 'sugarcrm.calls' doesn't exist' on query. Default database: 'sugarcrm'. Query: 'REPLACE /*sugarcrm.calls:1/1*/ INTO mysql.checksum (db, tbl, chunk, boundaries, this_cnt, this_crc) SELECT 'sugarcrm', 'calls', 0 AS chunk_num, '1=1', COUNT(*) AS cnt, LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc, 1, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'), LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := MD5(CONCAT_WS('#', `id`, `date_entered`, `date_modified`, `assigned_user_id`, `modified_user_id`, `created_by`, `team_id`, `name`, `duration_hours`, `duration_minutes`, `date_start`, `date_end`, `parent_type`, `status`, `direction`, `parent_id`, `description`, `deleted`, `reminder_time`, `outlook_id`, CONCAT(ISNULL(`date_entered`), ISNULL(`date_modified`), ISNULL(`assigned_user_id`), ISNULL(`modified_user_id`), ISNULL(`created_by`), ISNULL(`team_id`), ISNULL(`name`), ISNULL(`duration_hours`), ISNULL(`duration_minutes`), ISNULL(`date_start`), ISNULL(`date_end`), ISNULL(`parent_type`), ISNULL(`status`),
> I am in the process of evaluating / testing the mk-table-checksum with > --replicate switch to check that my slaves and their masters are in sync.
> 2 of my test cases failed. I am curious if I am missing a command-line > switch or configuration OR if I am asking mk-table-checksum to do > something that it doesn't do.
> Here are the scenarios that failed. They are all DDL related.
> MASTER: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) > using readline 5.2 > SLAVE: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) > using readline 5.2
> The environment is isolated --> there are no connections to the > instances apart from mine.
> Before each test case is executed, I confirm that the mast and slave are > in sync. > 1- mk-table-checksum -u {name} -p{pwd} h=green,P=3620 --replicate > mysql.checksum --emptyrepltbl > 2- mk-table-checksum -u {name} -p{pwd} --replicate mysql.checksum > --replcheck 1 h=192.168.0.7,P=3610
> First test that failed: Add a table to the slave > Expected results: The --replcheck switch would note the DDL is different. > Actual Result: The --replcheck returned nothing, implying that the > master and slave are in sync.
Are you trying to sync schemas (table ddls), too? mk-table-checksum cannot do this. It doesn't check schemas for differences; it assumes that tables on source and destination are the same.
> Second test that failed: Drop a table from the slave that exists on both > the slave and master. > Expected result: The --replcheck switch would note the DDL is different. > Actual Result: The --replcheck returned nothing, implying that the > master and slave are in sync. Also, Replication broke because of the > missing table on the slave: > <From SHOW SLAVE STATUS \G> > Last_Errno: 1146 > Last_Error: Error 'Table 'sugarcrm.calls' doesn't > exist' on query. Default database: 'sugarcrm'. Query: 'REPLACE > /*sugarcrm.calls:1/1*/ INTO mysql.checksum (db, tbl, chunk, boundaries, > this_cnt, this_crc) SELECT 'sugarcrm', 'calls', 0 AS chunk_num, '1=1', > COUNT(*) AS cnt, > LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc, 1, 16), 16, 10) > AS UNSIGNED)), 10, 16), 16, '0'), > LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := MD5(CONCAT_WS('#', `id`, > `date_entered`, `date_modified`, `assigned_user_id`, `modified_user_id`, > `created_by`, `team_id`, `name`, `duration_hours`, `duration_minutes`, > `date_start`, `date_end`, `parent_type`, `status`, `direction`, > `parent_id`, `description`, `deleted`, `reminder_time`, `outlook_id`, > CONCAT(ISNULL(`date_entered`), ISNULL(`date_modified`), > ISNULL(`assigned_user_id`), ISNULL(`modified_user_id`), > ISNULL(`created_by`), ISNULL(`team_id`), ISNULL(`name`), > ISNULL(`duration_hours`), ISNULL(`duration_minutes`), > ISNULL(`date_start`), ISNULL(`date_end`), ISNULL(`parent_type`), > ISNULL(`status`),
Does sugarcrm.calls exist on the master but not on the slave?
Also, given your command line opts, it looks like you're using an older version of Maatkit. What does mk-table-checksum --version say, please?
I forgot to mention that there is also --schema which checksums the
tables' schemas instead of their data. This may be relevant to what
you're trying to do.
On Oct 22, 8:29 am, Daniel Nichter <dan...@percona.com> wrote:
> > I am in the process of evaluating / testing the mk-table-checksum with
> > --replicate switch to check that my slaves and their masters are in sync.
> > 2 of my test cases failed. I am curious if I am missing a command-line
> > switch or configuration OR if I am asking mk-table-checksum to do
> > something that it doesn't do.
> > Here are the scenarios that failed. They are all DDL related.
> > MASTER: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486)
> > using readline 5.2
> > SLAVE: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486)
> > using readline 5.2
> > The environment is isolated --> there are no connections to the
> > instances apart from mine.
> > Before each test case is executed, I confirm that the mast and slave are
> > in sync.
> > 1- mk-table-checksum -u {name} -p{pwd} h=green,P=3620 --replicate
> > mysql.checksum --emptyrepltbl
> > 2- mk-table-checksum -u {name} -p{pwd} --replicate mysql.checksum
> > --replcheck 1 h=192.168.0.7,P=3610
> > First test that failed: Add a table to the slave
> > Expected results: The --replcheck switch would note the DDL is different.
> > Actual Result: The --replcheck returned nothing, implying that the
> > master and slave are in sync.
> Are you trying to sync schemas (table ddls), too? mk-table-checksum cannot do
> this. It doesn't check schemas for differences; it assumes that tables on
> source and destination are the same.
> > Second test that failed: Drop a table from the slave that exists on both
> > the slave and master.
> > Expected result: The --replcheck switch would note the DDL is different.
> > Actual Result: The --replcheck returned nothing, implying that the
> > master and slave are in sync. Also, Replication broke because of the
> > missing table on the slave:
> > <From SHOW SLAVE STATUS \G>
> > Last_Errno: 1146
> > Last_Error: Error 'Table 'sugarcrm.calls' doesn't
> > exist' on query. Default database: 'sugarcrm'. Query: 'REPLACE
> > /*sugarcrm.calls:1/1*/ INTO mysql.checksum (db, tbl, chunk, boundaries,
> > this_cnt, this_crc) SELECT 'sugarcrm', 'calls', 0 AS chunk_num, '1=1',
> > COUNT(*) AS cnt,
> > LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc, 1, 16), 16, 10)
> > AS UNSIGNED)), 10, 16), 16, '0'),
> > LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := MD5(CONCAT_WS('#', `id`,
> > `date_entered`, `date_modified`, `assigned_user_id`, `modified_user_id`,
> > `created_by`, `team_id`, `name`, `duration_hours`, `duration_minutes`,
> > `date_start`, `date_end`, `parent_type`, `status`, `direction`,
> > `parent_id`, `description`, `deleted`, `reminder_time`, `outlook_id`,
> > CONCAT(ISNULL(`date_entered`), ISNULL(`date_modified`),
> > ISNULL(`assigned_user_id`), ISNULL(`modified_user_id`),
> > ISNULL(`created_by`), ISNULL(`team_id`), ISNULL(`name`),
> > ISNULL(`duration_hours`), ISNULL(`duration_minutes`),
> > ISNULL(`date_start`), ISNULL(`date_end`), ISNULL(`parent_type`),
> > ISNULL(`status`),
> Does sugarcrm.calls exist on the master but not on the slave?
> Also, given your command line opts, it looks like you're using an older
> version of Maatkit. What does mk-table-checksum --version say, please?
Thanks for the quick response... I have not been able to successfully validate my email with Google Groups, so I need to respond with my native email client.
[1] I would like to know when schema's are out of sync. Is this an option in maatkit? Can I use the information_schema database to check for DDL inconsistencies? [2] Originally, sugarcrm.calls existed on both the master and the slave. For the test, I dropped the table on the slave. It exists on the master. [3] Version: mk-table-checksum Ver 1.1.26 Distrib 1877 Changeset 1871 [OS: Linux lenny 2.6.26-1-686 #1 SMP Fri Mar 13 18:08:45 UTC 2009 i686 GNU/Linux]
Tucker, Gabriel wrote: > Hi, > I am in the process of evaluating / testing the mk-table-checksum with > --replicate switch to check that my slaves and their masters are in sync. > 2 of my test cases failed. I am curious if I am missing a command-line > switch or configuration OR if I am asking mk-table-checksum to do > something that it doesn't do. > Here are the scenarios that failed. They are all DDL related. > MASTER: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) > using readline 5.2 > SLAVE: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) > using readline 5.2 > The environment is isolated --> there are no connections to the > instances apart from mine. > Before each test case is executed, I confirm that the mast and slave are > in sync. > 1- mk-table-checksum -u {name} -p{pwd} h=green,P=3620 --replicate > mysql.checksum --emptyrepltbl > 2- mk-table-checksum -u {name} -p{pwd} --replicate mysql.checksum > --replcheck 1 h=192.168.0.7,P=3610 > First test that failed: Add a table to the slave > Expected results: The --replcheck switch would note the DDL is different. > Actual Result: The --replcheck returned nothing, implying that the > master and slave are in sync.
Are you trying to sync schemas (table ddls), too? mk-table-checksum cannot do this. It doesn't check schemas for differences; it assumes that tables on source and destination are the same.
> Second test that failed: Drop a table from the slave that exists on both > the slave and master. > Expected result: The --replcheck switch would note the DDL is different. > Actual Result: The --replcheck returned nothing, implying that the > master and slave are in sync. Also, Replication broke because of the > missing table on the slave: > <From SHOW SLAVE STATUS \G> > Last_Errno: 1146 > Last_Error: Error 'Table 'sugarcrm.calls' doesn't > exist' on query. Default database: 'sugarcrm'. Query: 'REPLACE > /*sugarcrm.calls:1/1*/ INTO mysql.checksum (db, tbl, chunk, boundaries, > this_cnt, this_crc) SELECT 'sugarcrm', 'calls', 0 AS chunk_num, '1=1', > COUNT(*) AS cnt, > LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc, 1, 16), 16, 10) > AS UNSIGNED)), 10, 16), 16, '0'), > LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := MD5(CONCAT_WS('#', `id`, > `date_entered`, `date_modified`, `assigned_user_id`, `modified_user_id`, > `created_by`, `team_id`, `name`, `duration_hours`, `duration_minutes`, > `date_start`, `date_end`, `parent_type`, `status`, `direction`, > `parent_id`, `description`, `deleted`, `reminder_time`, `outlook_id`, > CONCAT(ISNULL(`date_entered`), ISNULL(`date_modified`), > ISNULL(`assigned_user_id`), ISNULL(`modified_user_id`), > ISNULL(`created_by`), ISNULL(`team_id`), ISNULL(`name`), > ISNULL(`duration_hours`), ISNULL(`duration_minutes`), > ISNULL(`date_start`), ISNULL(`date_end`), ISNULL(`parent_type`), > ISNULL(`status`),
Does sugarcrm.calls exist on the master but not on the slave?
Also, given your command line opts, it looks like you're using an older version of Maatkit. What does mk-table-checksum --version say, please? -Daniel
> Thanks for the quick response... I have not been able to successfully > validate my email with Google Groups, so I need to respond with my > native email client.
> [1] I would like to know when schema's are out of sync. Is this an > option in maatkit? Can I use the information_schema database to check > for DDL inconsistencies?
Yes, --schema does this. When using --schema, a lot of other options are disabled, so you cannot, for example, checksum both schemas and data in one run of the script. But, see below about your version of Maatkit...
> [2] Originally, sugarcrm.calls existed on both the master and the > slave. For the test, I dropped the table on the slave. It exists on > the master.
Hm, I should probably test that this is not a bug. In general, no script should ever produce slave errors but first let's...
> [3] Version: mk-table-checksum Ver 1.1.26 Distrib 1877 Changeset 1871 > [OS: Linux lenny 2.6.26-1-686 #1 SMP Fri Mar 13 18:08:45 UTC 2009 i686 > GNU/Linux]
...have you upgrade. :-) v1877 is quite old; the latest is 4790. You can download newer versions at http://code.google.com/p/maatkit/downloads/list. Some bugs were fixed in mk-table-sync and mk-loadavg just after 4790 was release so if you plan on using either of these two scripts you should also:
> > I am in the process of evaluating / testing the mk-table-checksum with > > --replicate switch to check that my slaves and their masters are in > sync.
> > 2 of my test cases failed. I am curious if I am missing a command-line > > switch or configuration OR if I am asking mk-table-checksum to do > > something that it doesn't do.
> > Here are the scenarios that failed. They are all DDL related.
> > MASTER: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) > > using readline 5.2 > > SLAVE: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) > > using readline 5.2
> > The environment is isolated --> there are no connections to the > > instances apart from mine.
> > Before each test case is executed, I confirm that the mast and slave are > > in sync. > > 1- mk-table-checksum -u {name} -p{pwd} h=green,P=3620 --replicate > > mysql.checksum --emptyrepltbl > > 2- mk-table-checksum -u {name} -p{pwd} --replicate mysql.checksum > > --replcheck 1 h=192.168.0.7,P=3610
> > First test that failed: Add a table to the slave > > Expected results: The --replcheck switch would note the DDL is > different. > > Actual Result: The --replcheck returned nothing, implying that the > > master and slave are in sync.
> Are you trying to sync schemas (table ddls), too? mk-table-checksum > cannot do > this. It doesn't check schemas for differences; it assumes that tables on > source and destination are the same.
On Thu, Oct 22, 2009 at 11:03 AM, Daniel Nichter <dan...@percona.com> wrote: > Tucker, Gabriel wrote:
>> Dan,
>> Thanks for the quick response... I have not been able to successfully >> validate my email with Google Groups, so I need to respond with my native >> email client.
>> [1] I would like to know when schema's are out of sync. Is this an >> option in maatkit? Can I use the information_schema database to check for >> DDL inconsistencies?
> Yes, --schema does this. When using --schema, a lot of other options are > disabled, so you cannot, for example, checksum both schemas and data in one > run of the script. But, see below about your version of Maatkit...
> [2] Originally, sugarcrm.calls existed on both the master and the slave. >> For the test, I dropped the table on the slave. It exists on the master.
> Hm, I should probably test that this is not a bug. In general, no script > should ever produce slave errors but first let's...
> [3] Version: mk-table-checksum Ver 1.1.26 Distrib 1877 Changeset 1871 >> [OS: Linux lenny 2.6.26-1-686 #1 SMP Fri Mar 13 18:08:45 UTC 2009 i686 >> GNU/Linux]
> ...have you upgrade. :-) v1877 is quite old; the latest is 4790. You can > download newer versions at http://code.google.com/p/maatkit/downloads/list > . > Some bugs were fixed in mk-table-sync and mk-loadavg just after 4790 was > release so if you plan on using either of these two scripts you should also:
>> > I am in the process of evaluating / testing the mk-table-checksum with >> > --replicate switch to check that my slaves and their masters are in >> sync.
>> > 2 of my test cases failed. I am curious if I am missing a command-line >> > switch or configuration OR if I am asking mk-table-checksum to do >> > something that it doesn't do.
>> > Here are the scenarios that failed. They are all DDL related.
>> > MASTER: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) >> > using readline 5.2 >> > SLAVE: mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) >> > using readline 5.2
>> > The environment is isolated --> there are no connections to the >> > instances apart from mine.
>> > Before each test case is executed, I confirm that the mast and slave >> are >> > in sync. >> > 1- mk-table-checksum -u {name} -p{pwd} h=green,P=3620 --replicate >> > mysql.checksum --emptyrepltbl >> > 2- mk-table-checksum -u {name} -p{pwd} --replicate mysql.checksum >> > --replcheck 1 h=192.168.0.7,P=3610
>> > First test that failed: Add a table to the slave >> > Expected results: The --replcheck switch would note the DDL is >> different. >> > Actual Result: The --replcheck returned nothing, implying that the >> > master and slave are in sync.
>> Are you trying to sync schemas (table ddls), too? mk-table-checksum >> cannot do >> this. It doesn't check schemas for differences; it assumes that tables on >> source and destination are the same.
On Thu, Oct 22, 2009 at 11:03 AM, Daniel Nichter <dan...@percona.com> wrote: >> [2] Originally, sugarcrm.calls existed on both the master and the >> slave. For the test, I dropped the table on the slave. It exists on >> the master.
> Hm, I should probably test that this is not a bug. In general, no script > should ever produce slave errors but first let's...
It is not a bug. The --replicate option does
INSERT INTO <checksum table> SELECT <checksum query> FROM <table>
This will definitely break replication if the table doesn't exist on the slave, and with the --replicate option, there is no attempt to connect to other servers, so there'd be no way to know about the missing table anyway. I don't think this behavior is easy or necessary to change.
mk-table-checksum is generally meant to checksum data, with the --schema option being a limited way to find differrences in table schemas.
I upgraded maatkit and ran the mk-table-checksum with the --schema switch. It works as I would expect.
My thoughts are first to confirm that the schema matches on the Master and the Slave. The only way I see I can accomplish this is by running the --schema option from both the Master and the Slave. If the schema is correct, then I will check the data by executing mk-table-checksum with the --replicate and --replicate-check from the Master.