Very high disk reading IO and slave lagged when using TokuDB


Jun Zhang

Jul 30, 2016, 5:40:42 PM
to Percona Discussion
Hi guys,

    I'm running three 5.7.13-6-log Percona Servers, one master and two slaves. All of them show very high read IO but low write IO, and the slaves are lagging behind the master. Can anyone tell me why, and how to resolve it?

The machines have 128 GB of memory, a 1.7 TB SSD, and 24 cores (Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz).


"root@localhost Sat Jul 30 17:54:41 2016 17:54:41 [(none)]>status;
--------------
mysql  Ver 14.14 Distrib 5.7.13-6, for Linux (x86_64) using  6.2


Connection id:          5479
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          '/home/work/mysql_query.log'
Using delimiter:        ;
Server version:         5.7.13-6-log Percona Server (GPL), Release 6, Revision e3d58bb
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:            /tmp/mysqld.sock
Uptime:                 1 hour 3 min 14 sec


Threads: 92  Questions: 1999752  Slow queries: 57  Opens: 1276  Flush tables: 1  Open tables: 1178  Queries per second avg: 527.082


[mysqld]
port     = 3306
socket   = /tmp/mysqld.sock
datadir  = /data/mysql/ssd
pid-file = /tmp/mysqld.pid

innodb_numa_interleave

slow-query-log
slow-query-log-file = mysql-slow.log

log-error = mysql-err.log

explicit_defaults_for_timestamp

character-set-server   = utf8
default-storage-engine = InnoDB
open_files_limit       = 655350

connect_timeout          = 120
wait_timeout             = 86400
interactive_timeout      = 86400
innodb_lock_wait_timeout = 600

max_connections    = 1000
max_allowed_packet = 32M
max_connect_errors = 1000000

innodb_buffer_pool_size      = 10G
innodb_buffer_pool_instances = 6
innodb_log_file_size         = 512M

transaction_isolation = READ-COMMITTED

innodb_log_buffer_size = 96M
innodb_io_capacity     = 3000

innodb_file_per_table = 1

innodb_flush_method            = O_DIRECT
innodb_flush_log_at_trx_commit = 0


# Must disable InnoDB asynchronous IO if backing up InnoDB tables via the TokuBackup utility.
innodb_use_native_aio   = 0  # If tmpdir is on tmpfs, AIO is disabled anyway; tmpfs doesn't support AIO.
innodb_read_io_threads  = 6
innodb_write_io_threads = 6
innodb_old_blocks_time  = 1000


query_cache_size  = 0
query_cache_type  = 0
thread_cache_size = 60
join_buffer_size  = 8M

gtid_mode = ON
server_id = 1
master-info-repository    = table
relay-log-info-repository = table
enforce-gtid-consistency

log-slave-updates

log-bin       = mysql-bin
log-bin-index = mysql-bin.index
binlog_format = row

skip_slave_start
slave-parallel-workers = 2

relay-log       = mysql-relay
relay-log-index = mysql-relay.index

# tokudb
plugin-dir  = /usr/lib64/mysql/plugin/
plugin-load = ha_tokudb.so
tokudb_cache_size      = 90G
tokudb_directio        = 1  # 0: buffered IO, 1: direct IO
tokudb_commit_sync     = 0
tokudb_row_format      = tokudb_zlib
tokudb_read_block_size = 128K
tokudb_read_buf_size   = 128K

tokudb-data-dir = /data/mysql/ssd/tokudbData
tokudb-log-dir  = /data/mysql/ssd/tokudbLog

# buffers & caches
table_open_cache        = 2048
table_definition_cache  = 2048
max_heap_table_size     = 96M
sort_buffer_size        = 2M
join_buffer_size        = 2M
tmp_table_size          = 96M
key_buffer_size         = 8M
read_buffer_size        = 2M
read_rnd_buffer_size    = 16M
bulk_insert_buffer_size = 32M


[root@localhost tokudbData]# sar -p -d 1
Linux 3.10.0-327.22.2.el7.x86_64 (localhost)         07/30/2016      _x86_64_        (24 CPU)


06:00:23 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
06:00:24 PM       sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:24 PM       sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:24 PM       sdc   9750.00 796209.00  16008.00     83.30      3.52      0.36      0.09     91.30


06:00:24 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
06:00:25 PM       sda      2.00      0.00     40.00     20.00      0.00      0.00      0.00      0.00
06:00:25 PM       sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:25 PM       sdc  10342.00 838181.00  18248.00     82.81      3.14      0.31      0.09     94.90


06:00:25 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
06:00:26 PM       sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:26 PM       sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:26 PM       sdc   9280.00 747696.00  16316.00     82.33      2.84      0.31      0.10     94.80


06:00:26 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
06:00:27 PM       sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:27 PM       sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:27 PM       sdc   5141.00 396756.00  12564.00     79.62      1.36      0.27      0.14     72.50


06:00:27 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
06:00:28 PM       sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:28 PM       sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:00:28 PM       sdc   3356.00 260632.00   5926.00     79.43      0.84      0.25      0.18     61.80


[root@localhost tokudbData]# xfs_info /dev/sdc1
meta-data=/dev/sdc1              isize=256    agcount=4, agsize=113636224 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=454544896, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=221945, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@localhost tokudbData]# mount|grep sdc1
/dev/sdc1 on /data/mysql/ssd type xfs (rw,noatime,nodiratime,seclabel,attr2,nobarrier,inode64,noquota)


"root@localhost Sat Jul 30 18:04:37 2016 18:04:37 [zabbix_server]>show create table history\G
*************************** 1. row ***************************
       Table: history
Create Table: CREATE TABLE `history` (
  `itemid` bigint(20) unsigned NOT NULL,
  `clock` int(11) NOT NULL DEFAULT '0',
  `value` double(16,4) NOT NULL DEFAULT '0.0000',
  `ns` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`itemid`,`clock`,`ns`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin ROW_FORMAT=TOKUDB_FAST
/*!50100 PARTITION BY RANGE (`clock`)
SUBPARTITION BY HASH (`itemid`)
SUBPARTITIONS 6
(PARTITION p20160727_22 VALUES LESS THAN (1469628000) ENGINE = TokuDB,
 PARTITION p20160730_22 VALUES LESS THAN (1469887200) ENGINE = TokuDB,
 PARTITION p20160802_22 VALUES LESS THAN (1470146400) ENGINE = TokuDB,
 PARTITION p20160805_22 VALUES LESS THAN (1470405600) ENGINE = TokuDB,
 PARTITION p20160808_22 VALUES LESS THAN (1470664800) ENGINE = TokuDB,
 PARTITION p20160811_22 VALUES LESS THAN (1470924000) ENGINE = TokuDB,
 PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE = TokuDB) */
1 row in set (0.01 sec)

"root@localhost Sat Jul 30 18:04:43 2016 18:04:43 [zabbix_server]>show create table history_uint\G
*************************** 1. row ***************************
       Table: history_uint
Create Table: CREATE TABLE `history_uint` (
  `itemid` bigint(20) unsigned NOT NULL,
  `clock` int(11) NOT NULL DEFAULT '0',
  `value` bigint(20) unsigned NOT NULL DEFAULT '0',
  `ns` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`itemid`,`clock`,`ns`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin ROW_FORMAT=TOKUDB_FAST
/*!50100 PARTITION BY RANGE (`clock`)
SUBPARTITION BY HASH (`itemid`)
SUBPARTITIONS 6
(PARTITION p20160727_22 VALUES LESS THAN (1469628000) ENGINE = TokuDB,
 PARTITION p20160730_22 VALUES LESS THAN (1469887200) ENGINE = TokuDB,
 PARTITION p20160802_22 VALUES LESS THAN (1470146400) ENGINE = TokuDB,
 PARTITION p20160805_22 VALUES LESS THAN (1470405600) ENGINE = TokuDB,
 PARTITION p20160808_22 VALUES LESS THAN (1470664800) ENGINE = TokuDB,
 PARTITION p20160811_22 VALUES LESS THAN (1470924000) ENGINE = TokuDB,
 PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE = TokuDB) */
1 row in set (0.00 sec)


Jaime Crespo

Aug 5, 2016, 8:27:18 AM
to percona-d...@googlegroups.com
> Can anyone tell me why, and how to resolve it?

Jun,


You probably need a DB consultancy. You should ask the Percona folks for one: they can guide you far better than random internet comments, they are worth every penny invested, and they are proper professionals.

Cheers,





--
Jaime Crespo

Peter Zaitsev

Aug 5, 2016, 8:35:05 AM
to percona-discussion
Hi Jun,

A couple of thoughts here. It looks like you have a lot of reads coming from disk. There are a few things I would consider:

1) Direct IO for TokuDB is not always the best choice. You might get better performance by disabling it and setting tokudb_cache_size to 32GB or so - this allows the OS page cache to be used for caching compressed data, which is good for some workloads (see the sketch after these points).

2) You have tokudb_read_block_size set to 128K, which is good for HDDs but often not so good for SSDs - block sizes of 16KB-32KB might work better for an IO-bound workload (also shown in the sketch below).

3) I would suggest installing Percona Monitoring and Management and sharing your TokuDB stats, similar to these - it might give more detail on what is happening.
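For example, something along these lines in my.cnf could be a starting point for 1) and 2) - the sizes here are assumptions to tune against your own workload, not tested recommendations:

# 1) buffered IO plus a smaller TokuDB cache, leaving the rest of RAM to the OS page cache
tokudb_directio   = 0
tokudb_cache_size = 32G

# 2) smaller read blocks, usually a better fit for SSDs and IO-bound point reads
tokudb_read_block_size = 16K

Note that, as far as I know, tokudb_read_block_size only applies to newly created tables, so existing tables keep their 128K blocks until rebuilt (for example with ALTER TABLE ... ENGINE=TokuDB). Comparing the cachetable miss counters from SHOW ENGINE TOKUDB STATUS before and after the change should show whether more reads are being served from cache.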








--
Peter Zaitsev, CEO, Percona
Tel: +1 888 401 3401 ext 7360   Skype:  peter_zaitsev


