tokudb read bench issue :(

237 views
Skip to first unread message

raphael veyrin

unread,
Jul 4, 2014, 4:49:47 AM7/4/14
to tokud...@googlegroups.com
I've read in many place the overall benefits of tokudb, mainly on insert speed but not much on read,  so in set it up on a 5.5.38 mariadb under linux.
Server is a monster (32 cores, 256g ram, 256g ssd)

I dump an innodb table of 1.5M rows to make my comparison. (copying then converting to tokudb)

My reference has multiple indexes and a partition

CREATE TABLE IF NOT EXISTS `table1_tokudb` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `field1` int(10) unsigned NOT NULL,
  `field2` int(10) unsigned NOT NULL,
  `field3` int(10) unsigned NOT NULL,
  `ts_date` date NOT NULL,
  `ts_time` time NOT NULL,
  `value` text,
  `utc` datetime NOT NULL,
  `field4` int(10) unsigned NOT NULL,
  `field5` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`,`ts_date`),
  UNIQUE KEY `unique_logvalue_by_date` (`field1`,`field2`,`field3`,`ts_date`,`ts_time`),
  KEY `date_index` (`ts_date`),
  KEY `time_index` (`ts_time`),
  KEY `sensor_index` (`field1`),
  KEY `attribute_index` (`field3`),
  KEY `sensor_endpoint_attribute` (`field1`,`field2`,`field3`)
) ENGINE=TokuDB  DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
/*!50100 PARTITION BY HASH (MONTH(ts_date))
PARTITIONS 13 */ AUTO_INCREMENT=140694470 ;

The dataset in innodb requires 88Mo for data and 260 for Index



When i run select without cache i'm disappointed by the result and i'm wondering what is causing this issue without finding it.

SELECT SQL_NO_CACHE field1,count(*) FROM compare.table1 GROUP BY field1;
SELECT SQL_NO_CACHE field1,count(*) FROM compare.table1_tokudb GROUP BY field1;


average of tenth call gives the following results:
Innodb table: 0.24 second
tokudb table: 1.49 second


Is it because dataset resides in memory and tokudb shines when it does not fit anymore ?
Is it because innodb has right indexes
Is it because of partition ?
key block size ?

anyway the select rate is 7 times slower, and i'm wondering if something is wrong or if it's an average response time


any help appreciated











Phil

unread,
Jul 11, 2014, 10:02:06 AM7/11/14
to tokud...@googlegroups.com
I think TokuDB will ignore your KEY_BLOCK_SIZE, but it wouldn't hurt to omit that and also the ROW_FORMAT and let TokuDB use its defaults.

That aside, yes TokuDB shines more when the data doesn't fit into RAM.

Other benefits could be seen by creating relevant clustering indexes that suit your application.

Phil

yoku ts.

unread,
Jul 14, 2014, 8:23:55 AM7/14/14
to raphael veyrin, tokud...@googlegroups.com
Hi,

I used to experience near you.
In my opinion, TokuDB is slow when using partitioning without partition pruning.

mysql> SHOW CREATE TABLE t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `num` int(10) unsigned NOT NULL,
  `val` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`num`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8
1 row in set (0.01 sec)

mysql> SHOW CREATE TABLE t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `num` int(10) unsigned NOT NULL,
  `val` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`num`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY HASH (num)
PARTITIONS 20 */
1 row in set (0.00 sec)

Two tables store same data, but.

mysql> SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.50 sec)

mysql> SELECT COUNT(*) FROM t2;
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+
1 row in set (2.93 sec)

t2 (partitioning by hash) is slower than t1(not partitioning).




--
You received this message because you are subscribed to the Google Groups "tokudb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tokudb-user...@googlegroups.com.
To post to this group, send email to tokud...@googlegroups.com.
Visit this group at http://groups.google.com/group/tokudb-user.
For more options, visit https://groups.google.com/d/optout.

joe

unread,
Jul 14, 2014, 9:10:13 AM7/14/14
to tokud...@googlegroups.com
Yoku:

The select count(*) will perform a full table scan. Partitioning the table into 20 partitions will result in 20 files that all need to be opened and scanned serially. This is the behavior of both TokuDB and InnoDB. A more common use case of partitioning that we see is the use of range partitioning due to the data management advantages since modern storage infrastructures effectively spread the data across physical spindles.

Raphael:

TokuDB offers big benefits to reduce io. It looks like the dataset you've created will fit into RAM so I'd expect comparable performance between the two engines. The difference you posted seems large, if you can package up your use case along with the data and post it, I can take a look to see if there's a reason for this big of a difference.

Joe

Rich Prohaska

unread,
Jul 14, 2014, 10:17:04 AM7/14/14
to tokud...@googlegroups.com
I found that 'select count(*) from t1' uses tokudb bulk fetch, while the 'select count(*) from t2' does not.  Tokudb bulk fetch is an algorithm inside of the tokudb storage engine that speeds up range queries.  This algorithm is NOT being used for the second query since we did not patch the partition storage engine with a NEW handler API that enables it.  We will enable tokudb bulk fetch, which will speed up range queries on partitioned tables, in the sept tokudb release.   Thanks for reporting it.

yoku ts.

unread,
Jul 14, 2014, 11:28:25 PM7/14/14
to tokud...@googlegroups.com
Hi, joe, Rich,

Yes, I saw using partitioning without partition pruning is slower than not using partitioning one, with TokuDB and XtraDB(InnoDB).
But I thought TokuDB slows down hardly, more than XtraDB slows down.

I tested three cases, table without partitioning, table with hash partitioning, table with range partitioning, each TokuDB and XtraDB.

(I'm sorry about these entries wirtten in Japanese, please see SQL statements)

My result, TokuDB with partition by range using pruning is the slowest case.
(I saw query type is "index", whole index scan, only in that case)

This means "TokuDB can't use tokudb bulk fetch"?
If pruning partition always avoids tokudb bulk fetch, I should turn off partitioning for tokudb bulk fetch, shouldn't I?


yoku,







Rich Prohaska

unread,
Jul 15, 2014, 10:17:43 AM7/15/14
to tokud...@googlegroups.com
I ran your experiment with 1M rows on t1 (no partitions), t2 (10 hash partitions), and t3 (several range partitions) on a debug build.

select count(*) from t1, 1.4 seconds, tokudb bulk fetch ON because tokudb::prelock_range is called.
select count(*) from t2, 5.4 seconds, tokudb bulk fetch OFF because tokudb::prelock_range is NOT called.
select count(*) from t3, 5.7 seconds, tokudb bulk fetch OFF because tokudb::prelock_range is NOT called.

The TokuDB bug is that index scans for partitions do not use bulk fetch when they could, and performance suffers.

select count(*) from t1 where num>900000, 0.82 seconds, tokudb bulk fetch ON because tokudb::prelock_range called from MRR code.
select count(*) from t2 where num>900000, 0.86 seconds, tokudb bulk fetch ON, 10 calls to tokudb::prelock_range called frm MRR code for each of the 10 partitions.
select count(*) from t3 where num>900000, 2.55 seconds, tokudb bulk fetch OFF, 2 full index scans with no calls to tokudb::prelock_range.

Same bug as the first case.
To unsubscribe from this group and stop receiving emails from it, send an email to tokudb-user+unsubscribe@googlegroups.com.

yoku ts.

unread,
Jul 15, 2014, 11:53:10 AM7/15/14
to tokud...@googlegroups.com
Hi Rich,
Thank you for your explaining in detail.

Would you let me know more about the bug?

* select count(*) from t1 is NOT affected
* select count(*) from t2 is affected (this query have to call tokudb::prelock_range)
* select count(*) from t3 is affected (same as t2)

* select count(*) from t1 where num > 900000 is NOT affected
* select count(*) from t2 where num > 900000 is NOT affected
* Is *NOT* select count(*) from t3 where num > 900000 affected? (just my guess)
 (Can't this query really use tokudb::prelock_range?)

If my guess is correct, TokuDB with partitioning works well when 
"fetching few rows from each partitions" more than "fetching many rows from one partition"?

Regards,


yoku,


To unsubscribe from this group and stop receiving emails from it, send an email to tokudb-user...@googlegroups.com.

Rich Prohaska

unread,
Jul 15, 2014, 1:57:10 PM7/15/14
to tokud...@googlegroups.com
For the three cases in which TokuDB bulk fetch is OFF, performance is slower than it can be.  These are the cases that we intend on fixing. 

raphael veyrin

unread,
Jul 17, 2014, 7:26:20 AM7/17/14
to tokud...@googlegroups.com, pou...@gmail.com
interesting, i'll give it a try 

raphael veyrin

unread,
Jul 17, 2014, 7:34:59 AM7/17/14
to tokud...@googlegroups.com
I was aware of the benefits when out of ram dataset, and i was expecting a similar performance and i'm posting because it's far from similar (and all the blog post i've found talk about the insert).

i've test with range select (but maybe on different partition), i'll reitera test without partition or partition pruned

thanks

Rich Prohaska

unread,
Jul 18, 2014, 9:19:38 AM7/18/14
to tokud...@googlegroups.com
The tokudb-7.1.8-rc4 github tag fixes the slow index scans that sometimes occur on partitioned tokudb tables.
Reply all
Reply to author
Forward
0 new messages