what happened to set unique_checks=0 for Toku


MARK CALLAGHAN

May 11, 2016, 11:46:28 AM
to percona-d...@googlegroups.com
A long time ago someone added an option to the Python insert benchmark client to use 'set unique_checks=0' for TokuDB, and that made CPU-bound loads faster because Toku could skip checking the uniqueness of the PK on insert. This doesn't do much to reduce IO load when the inserts are in PK order, but it saves on CPU.

I don't see a benefit from using that for Toku in Percona Server 5.6.26. Did the optimization get removed?

We have the optimization for MyRocks. Unfortunately we didn't use the existing option name, so it comes from 'set rocksdb_skip_unique_check=1' until this is fixed:
https://github.com/facebook/mysql-5.6/issues/246
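
For reference, the session setup looks roughly like this (a minimal sketch; the table is hypothetical and the exact statements in iibench.py may differ):

  -- TokuDB (and InnoDB): ask the engine to skip unique checks for this session
  SET unique_checks=0;
  -- MyRocks uses its own variable until the issue above is fixed
  SET rocksdb_skip_unique_check=1;
  -- then run the PK-ordered bulk inserts, e.g. against a hypothetical table t(id, val)
  INSERT INTO t (id, val) VALUES (1, 'a'), (2, 'b'), (3, 'c');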


--
Mark Callaghan
mdca...@gmail.com

Peter Zaitsev

May 11, 2016, 12:48:17 PM
to percona-discussion
Hi Mark,

Thanks for sharing! Did you see unique_checks=0 with TokuDB improving performance in some of the earlier versions?

--



--
Peter Zaitsev, CEO, Percona
Tel: +1 888 401 3401 ext 7360   Skype:  peter_zaitsev



MARK CALLAGHAN

May 11, 2016, 1:13:19 PM
to percona-d...@googlegroups.com
I don't remember. I think that change to iibench.py came from the original Toku team, so Zardosht or Leif might remember.
Mark Callaghan
mdca...@gmail.com

George Lorch

May 11, 2016, 1:56:09 PM
to Percona Discussion
OK, so it seems there is still code in place for this option, but I do not like the idea at all. Allowing a user to intentionally corrupt/de-sync their indices is a terrible idea in any storage engine. I cannot say whether it is functional today or not, though. It is possible that the implementation of other features (such as Read Free Replication) might have unintentionally broken the behavior. I have added https://tokutek.atlassian.net/browse/DB-994 to our roadmap to investigate a practical and safe implementation of this idea.



Peter Zaitsev

May 11, 2016, 2:12:41 PM
to percona-discussion
George,

This functionality has existed for InnoDB for years. Users should not use it routinely, but when loading data which has already been validated it can often be quite a performance speedup.
This might be considered an option to be used only by loading software which validates the data before loading.
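
As a sketch of that pattern (assuming the data file has already been validated for duplicates; the table and file names here are made up):

  -- data already validated externally, so skip the checks for the load only
  SET unique_checks=0;
  LOAD DATA INFILE '/tmp/validated.csv' INTO TABLE t;
  -- restore the default once the load is done
  SET unique_checks=1;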


MARK CALLAGHAN

May 11, 2016, 2:16:26 PM
to percona-d...@googlegroups.com
From the one page I checked, with InnoDB this only applies to unique constraints on secondary indexes. It doesn't help with the PK because it can't do anything for the PK: with a b-tree you have to read the page before adding the inserted row to it anyway.

For RocksDB this helps with unique secondary indexes and with the PK.

George Lorch

May 11, 2016, 2:16:52 PM
to percona-d...@googlegroups.com
According to the documentation, if it is to be trusted, InnoDB will always do the uniqueness check on the PK and skip the check only on unique secondaries; this seems safe. In the TokuDB implementation, there is no such test: it will allow uniqueness checks to be skipped on the PK as well as the secondaries, which is what can cause out-of-sync indices.
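
To illustrate the hazard described above (a hypothetical example; whether the duplicate is silently accepted depends on the engine and version):

  CREATE TABLE t (id INT PRIMARY KEY, u INT, UNIQUE KEY (u)) ENGINE=TokuDB;
  SET unique_checks=0;
  INSERT INTO t VALUES (1, 10);
  -- if the PK check is skipped, this duplicate id may be accepted as a blind write,
  -- leaving the PK and the unique secondary index out of sync
  INSERT INTO t VALUES (1, 20);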




--
George O. Lorch III
Software Engineer, Percona
US/Arizona (GMT -7)
skype: george.ormond.lorch.iii

Abdelhak Errami

May 11, 2016, 3:00:16 PM
to Percona Discussion
George:

I think if unique_checks is ON, we should be able to record somewhere, like in the main FT or the information_schema tables, that the table was pre-sorted, so that if down the road we find that the table became corrupted we know what may have caused it; this way everyone wins :)

Abdel

