Re: [percona-group] corrupt partitioned table after server crash

2,840 views
Skip to first unread message

dileep ch

unread,
Nov 20, 2012, 12:28:43 AM11/20/12
to percona-d...@googlegroups.com
Hi,
  Can we know what updates are done on the table.And let me know the permissions of .frm file for the table  commute_amplify_10000.

Regards,
Dileep Chittaluri.



On Tue, Nov 20, 2012 at 5:03 AM, Dane Miller <da...@optimalsocial.com> wrote:
Hi,

After a server crash last night, one of our tables is corrupt and unusable.  It is a partitioned InnoDB table, where partitions had just been updated (added/dropped) seconds before the server crashed.  This is an EC2 instance, but the crash looks similar to power failure on a physical server -- the syslog entries stop at a certain time, then 10min later we see boot messages starting up. I naively assumed InnoDB tables were crash-safe, but it appears that partitioning breaks this contract.  I'd love to understand this better and would appreciate any insights.

Thanks,
Dane


Here are details...

mysql> select * from commute_amplify_10000;
ERROR 1696 (HY000): Failed to read from the .par file


EC2 m1.small
Percona Server 5.5.24-rel26.0-256.lucid
# uname -a
Linux amplify-db1.optorb.com 2.6.32-317-ec2 #36-Ubuntu SMP Fri Jul 8 17:53:11 UTC 2011 i686 GNU/Linux

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 10.04.3 LTS
Release: 10.04
Codename: lucid

binlogs:
 local ephemeral storage
 /dev/sdb on /mnt type xfs (rw,noatime,nodiratime,nobarrier)
data:
 EBS
 /dev/sdj on /opt/mysql type xfs (rw,noatime,nodiratime,nobarrier)

mysql> check table commute_amplify_10000\G
*************************** 1. row ***************************
   Table: slicer.commute_amplify_10000
      Op: check
Msg_type: Error
Msg_text: Failed to read from the .par file
*************************** 2. row ***************************
   Table: slicer.commute_amplify_10000
      Op: check
Msg_type: Error
Msg_text: Incorrect information in file: './slicer/commute_amplify_10000.frm'
*************************** 3. row ***************************
   Table: slicer.commute_amplify_10000
      Op: check
Msg_type: error
Msg_text: Corrupt
3 rows in set (0.00 sec)

-rw-rw---- 1 mysql mysql    0 2012-11-19 08:17 /opt/mysql/data/slicer/commute_amplify_10000.par


--
You received this message because you are subscribed to the Google Groups "Percona Discussion" group.
To post to this group, send email to percona-d...@googlegroups.com.
 
 

dileep ch

unread,
Nov 20, 2012, 12:41:28 AM11/20/12
to percona-d...@googlegroups.com
Hi Dane,
It might happened because of corrupted .par file. The immediate solution for that is to remove the present .par file and replace the recent .par file from recent backups and then restart the service of mysql. And now try to read the data from that partitioned table. This will be successful process. And if you want to change the partitions type just take a fullbackup before you do and do the updates om partitions.

Regards,
Dileep Chittaluri.

Dane Miller

unread,
Nov 20, 2012, 7:14:04 PM11/20/12
to percona-d...@googlegroups.com
I examined the binlogs and discovered that our application no longer uses this table "commute_amplify_10000".  The table is empty, and there have been no inserts or updates to this table in the past 5 days.  The activity on this table consists entirely of partitioning queries --  adding partitions: ALTER TABLE ... REORGANIZE PARTITION pmax ..., and dropping partitions: ALTER TABLE ... DROP PARTITION ...


"Drop partitions" ran at 8:17:02, and then the sever crashed 25 seconds later at 8:17:27 (timestamp of last heartbeat).  Here are the final ALTER TABLE queries from 8:17:02:

ALTER TABLE commute_amplify_10000 DROP PARTITION commute_amplify_10000_135328230;
ALTER TABLE commute_amplify_10000 DROP PARTITION commute_amplify_10000_135328320;
ALTER TABLE commute_amplify_10000 DROP PARTITION commute_amplify_10000_135328410;

Dane

Dane Miller

unread,
Nov 20, 2012, 7:17:04 PM11/20/12
to percona-d...@googlegroups.com
Here are the permissions of the .frm file...

-rw-rw---- 1 mysql mysql  18604 2012-11-19 08:17 /var/lib/mysql/slicer/commute_amplify_10000.frm

Dane

On Monday, November 19, 2012 9:29:26 PM UTC-8, dileep ch wrote:

Dane Miller

unread,
Nov 20, 2012, 7:22:46 PM11/20/12
to percona-d...@googlegroups.com
Thanks for the recovery tips.  We failed over to a replica and took this instance out of operation, so recovery isn't the issue.  I'm hoping to understand the failure better to avoid in the future.

Is this a known limitation of mysql partitions -- that they're not crash safe?  Or perhaps a risk of crashing during DDL in general?  A bug?

Dane

Justin Swanhart

unread,
Nov 20, 2012, 7:47:15 PM11/20/12
to percona-d...@googlegroups.com
Hi,

In MySQL a crash during DDL is typically very bad operationally.
MySQL does not have two-phase commit between storage-engine
independent files on disk which describe tables (.frm, .par, .opt
files, etc) and the underlying storage for the table itself. This
means that a crash during the drop of a partition or table, or during
the reorganization of a table can cause MySQL can enter into an
inconsistent state with respect to the files on disk and the table
definitions themselves.

--Justin

Dane Miller

unread,
Nov 21, 2012, 1:57:46 PM11/21/12
to percona-d...@googlegroups.com
On Tue, Nov 20, 2012 at 4:47 PM, Justin Swanhart
<justin....@percona.com> wrote:
> In MySQL a crash during DDL is typically very bad operationally.
> MySQL does not have two-phase commit between storage-engine
> independent files on disk which describe tables (.frm, .par, .opt
> files, etc) and the underlying storage for the table itself. This
> means that a crash during the drop of a partition or table, or during
> the reorganization of a table can cause MySQL can enter into an
> inconsistent state with respect to the files on disk and the table
> definitions themselves.

Thanks Justin. This is unfortunate, but it makes sense. This implies
that our partitioning strategy is riskier than I realized. We're
doing anywhere from 10-30 ALTER TABLEs per hour to add/drop
partitions.

This makes table partitioning with mysql less appealing. Will have to
ponder futher.

Dane
Reply all
Reply to author
Forward
0 new messages