Issue while benchmarking Percona XtraDB Cluster (beta)

Davey Shafik

Mar 26, 2012, 12:49:26 AM

to percona-d...@googlegroups.com

Hey,

A few weeks (months?) ago I set up a 4-node cluster with the alpha of
Percona XtraDB Cluster for several reasons:

I was looking at a number of traditional MySQL "clustering" and
scaling techniques, and trying out some new ideas of mine; I decided
to evaluate percona cluster at the same time and to do some
exploratory benchmarks against traditional MySQL replication with one
semi-sync slave, and the rest as async slaves.

For all setups I used AWS micro instances, which, if you know anything
about micros, are a flawed platform for benchmarks because the
available resources are not guaranteed (they are burstable). I placed
the read nodes in the traditional replication setup behind an ELB, as
well as *all* nodes in the percona cluster behind a (different) ELB.

So, with that in mind, my benchmark consisted of importing a single
350MB sql dump (again, a terrible benchmark, as it would only hit one
node!); because of the flaws I won't dwell on the results, but they
were simply:

- Traditional replication: 250 qps
- Percona Cluster: 1000 qps

This weekend I decided to benchmark Percona Cluster beta, and this
time decided to use 4 Large instances, with an XL instance to run the
benchmarks to alleviate any issues of using micros.

First I ran into issues installing the deb packages on both Debian
Squeeze and Ubuntu Oneiric:

dpkg: error processing
/var/cache/apt/archives/percona-xtradb-cluster-common-5.5_5.5.20-23.4-3738.<distro>_all.deb
(--unpack):
trying to overwrite '/usr/share/info/mysql.info.gz', which is also in
package libmysqlclient18 5.5.21-rel25.0-227.<distro>

This was resolved by doing:

$ dpkg -i --force-overwrite \
    /var/cache/apt/archives/percona-xtradb-cluster-common-5.5_5.5.20-23.4-3738.<distro>_all.deb \
    && apt-get -f install

I ended up using Oneiric after fixing this.

My config looks like this:

#---------------------------------------------
[mysqld]
datadir=/data
user=mysql

binlog_format=ROW

wsrep_provider=/usr/lib/libgalera_smm.so

# Primary
wsrep_cluster_address=gcomm://
# Secondary
# wsrep_cluster_address=gcomm://10.28.78.185

wsrep_slave_threads=2
wsrep_cluster_name=engineyard
wsrep_sst_method=xtrabackup

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

#---------------------------------------------

/data is an EBS volume, and I'm using xtrabackup as my SST method.

I am using sysbench to benchmark; this is my setup:

# 1M rows, imported via the ELB (takes 40s, 25K writes/s)
$ sysbench --test=oltp --mysql-table-engine=innodb \
    --oltp-table-size=1000000 \
    --mysql-host=xdbc-cluster-1125795606.us-east-1.elb.amazonaws.com \
    --mysql-user=root prepare

The MAX(id) is 3999999 (not sure why…), but there are 1M records.

# 10K requests, reconnecting on each transaction, complex test mode,
# variable number of threads
$ sysbench --num-threads=<INT> --max-requests=10000 --test=oltp \
    --oltp-table-size=1000000 \
    --mysql-host=xdbc-cluster-1125795606.us-east-1.elb.amazonaws.com \
    --mysql-user=root --oltp-reconnect-mode=transaction \
    --oltp-test-mode=complex run

When I run with 16 threads, I consistently get (for example):

ALERT: failed to execute mysql_stmt_execute(): Err1062 Duplicate entry
'500519' for key 'PRIMARY'
FATAL: database error, exiting...

Same with 14 threads; sporadically with 12 threads, rarely with 8
threads, but it happens even with 4 threads…

When it does work, I'm looking at ~1200-2100 qps (see below), but
obviously this is an issue if we're seeing collisions with only 8
concurrent connections.

Any thoughts? I'm going to try using rsync instead of xtrabackup.

- Davey

=========

sysbench 0.4.12: multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 4

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are
returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 10000
Threads started!
Done.

OLTP test statistics:
queries performed:
read: 140000
write: 50000
other: 30000
total: 220000
transactions: 10000 (59.90 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 190000 (1138.11 per sec.)
other operations: 30000 (179.70 per sec.)

Test execution summary:
total time: 166.9441s
total number of events: 10000
total time taken by event execution: 667.5932
per-request statistics:
min: 56.87ms
avg: 66.76ms
max: 552.94ms
approx. 95 percentile: 72.36ms

Threads fairness:
events (avg/stddev): 2500.0000/4.18
execution time (avg/stddev): 166.8983/0.02

sysbench 0.4.12: multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 8

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are
returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 10000
Threads started!
Done.

OLTP test statistics:
queries performed:
read: 140098
write: 50033
other: 30011
total: 220142
transactions: 10000 (111.35 per sec.)
deadlocks: 7 (0.08 per sec.)
read/write requests: 190131 (2117.19 per sec.)
other operations: 30011 (334.19 per sec.)

Test execution summary:
total time: 89.8034s
total number of events: 10000
total time taken by event execution: 718.1731
per-request statistics:
min: 57.88ms
avg: 71.82ms
max: 393.15ms
approx. 95 percentile: 90.65ms

Threads fairness:
events (avg/stddev): 1250.0000/4.30
execution time (avg/stddev): 89.7716/0.02
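
As a quick sanity check on the two runs above, the reported rates can be recomputed from the totals, and compared with what Little's law (throughput = concurrency / latency) predicts from the average latency. This is only a sketch; the helper name `rates` is made up for illustration, and the numbers are copied from the output above.

```python
def rates(threads, total_s, txns, rw_requests, avg_latency_ms):
    """Return (tps, read/write requests per second, tps predicted by Little's law)."""
    tps = txns / total_s
    rw_per_s = rw_requests / total_s
    predicted_tps = threads / (avg_latency_ms / 1000.0)  # concurrency / latency
    return tps, rw_per_s, predicted_tps

runs = [
    # threads, total time (s), transactions, read/write requests, avg latency (ms)
    (4, 166.9441, 10000, 190000, 66.76),
    (8, 89.8034, 10000, 190131, 71.82),
]

for run in runs:
    tps, rw_per_s, predicted = rates(*run)
    print(f"{run[0]} threads: {tps:.2f} tps, {rw_per_s:.2f} rw/s, "
          f"Little's law predicts {predicted:.2f} tps")
```

Throughput roughly doubles from 4 to 8 threads while the average latency barely moves, which suggests (but does not prove) that these runs are bound by per-transaction latency rather than by cluster capacity.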

Davey Shafik

Mar 26, 2012, 10:50:02 AM

to percona-d...@googlegroups.com

Just wanted to note that using rsync for sst did not alleviate this issue.

- Davey

Alex Yurchenko

Mar 26, 2012, 12:26:22 PM

to percona-d...@googlegroups.com

On 2012-03-26 17:50, Davey Shafik wrote:
> Just wanted to note that using rsync for sst did not alleviate this
> issue.

It won't - the issue is in sysbench configuration and the number of
rows. You're simply getting cluster-wide collisions (too few rows) and
that is freaking sysbench out.

Try --mysql-ignore-duplicates=1 option with sysbench.

>> The MAX(id) is 3999999 (not sure why…), but there is 1M records.

The auto-increment increment in a 4-node cluster is 4.
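
The arithmetic can be sketched as follows. It assumes all 1M rows were inserted through a single node and that MySQL hands out ids as offset, offset + increment, offset + 2 * increment, and so on; the function name is made up for illustration. Under those assumptions, a MAX(id) of 3999999 corresponds to a node with auto-increment offset 3.

```python
def auto_increment_ids(offset, increment, rows):
    """Ids one node would hand out: offset, offset + increment, ... (rows values)."""
    return range(offset, offset + increment * rows, increment)

NODES = 4          # cluster size from the thread
ROWS = 1_000_000   # --oltp-table-size

for offset in range(1, NODES + 1):
    ids = auto_increment_ids(offset, NODES, ROWS)
    print(f"offset {offset}: first id {ids[0]}, MAX(id) {ids[-1]}, rows {len(ids)}")
```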

Regards,
Alex

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

Davey Shafik

Mar 26, 2012, 12:40:07 PM

to percona-d...@googlegroups.com

> It won't - the issue is in sysbench configuration and the number of rows.
> You're simply getting cluster-wide collisions (too few rows) and that is
> freaking sysbench out.

Can you explain this? too *few* rows? How can I alleviate this issue
without the flag? Is it possible?

- Davey

> --
> You received this message because you are subscribed to the Google Groups
> "Percona Discussion" group.
> To post to this group, send email to percona-d...@googlegroups.com.
> To unsubscribe from this group, send email to
> percona-discuss...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/percona-discussion?hl=en.
>

Vadim Tkachenko

Mar 26, 2012, 12:50:28 PM

to percona-d...@googlegroups.com

Davey,

By default sysbench uses the special distribution, which means that even
with 1000000 rows you basically update the same rows again and again.

I suggest you try --oltp-dist-type=uniform for sysbench.
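
To get a feel for how much a skewed distribution raises the chance of two clients touching the same row, here is a toy model. It is not sysbench's actual special distribution: it simply assumes that 1% of the rows receive 75% of the picks (the figures from the sysbench banner), spread evenly inside that hot set, with the rest spread evenly over the remaining rows.

```python
def same_row_probability(rows, hot_fraction, hot_share):
    """Probability that two independent picks land on the same row (toy hot-spot model)."""
    hot_rows = rows * hot_fraction
    cold_rows = rows - hot_rows
    # Both picks in the hot set and equal, or both in the cold set and equal.
    return hot_share ** 2 / hot_rows + (1 - hot_share) ** 2 / cold_rows

ROWS = 1_000_000
uniform = 1 / ROWS
special = same_row_probability(ROWS, hot_fraction=0.01, hot_share=0.75)
print(f"uniform: {uniform:.2e}  hot-spot model: {special:.2e}  "
      f"ratio: {special / uniform:.0f}x")
```

In this simplified model a pair of picks collides about 56 times more often than with a uniform distribution over the same 1M rows.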

Thanks,
Vadim

--
Vadim Tkachenko, CTO, Percona Inc.
Phone +1-925-400-7377, Skype: vadimtk153
Schedule meeting: http://tungle.me/VadimTkachenko

Join us at Percona Live: MySQL Conference And Expo 2012
http://www.percona.com/live/mysql-conference-2012/

Alex Yurchenko

Mar 26, 2012, 1:16:19 PM

to percona-d...@googlegroups.com

On 2012-03-26 19:50, Vadim Tkachenko wrote:
> Davey,
>
> sysbench by default uses special distribution, which with 1000000
> rows
> you basically
> update the same rows again and again.
>
> I may suggest you try --oltp-dist-type=uniform for sysbench.
>
> Thanks,
> Vadim
>
>
>
> On Mon, Mar 26, 2012 at 9:40 AM, Davey Shafik <m...@daveyshafik.com>
> wrote:
>>> It won't - the issue is in sysbench configuration and the number of
>>> rows.
>>> You're simply getting cluster-wide collisions (too few rows) and
>>> that is
>>> freaking sysbench out.
>>
>> Can you explain this? too *few* rows? How can I alleviate this issue
>> without the flag? Is it possible?

In general - no. In multi-master setup there is always _some_
probability of cluster-wide conflict. Sysbench just should not freak out
and simply should retry the transaction. Increasing the number of rows
or changing distribution will only reduce the probability of that
conflict, but not eliminate it completely.
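
A minimal sketch of what "simply retry the transaction" could look like on the client side. Everything here is hypothetical: `ClusterConflict` stands in for whatever error the client receives on a failed commit, and the fake transaction only simulates conflicts; no real database driver is involved.

```python
class ClusterConflict(Exception):
    """Hypothetical stand-in for the error a client sees when a commit loses a conflict."""

def run_with_retry(transaction, max_attempts=5):
    """Run `transaction` until it commits; re-raise if `max_attempts` all conflict."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction(), attempt
        except ClusterConflict:
            if attempt == max_attempts:
                raise

def make_flaky_transaction(failures):
    """Build a fake transaction that conflicts `failures` times, then commits."""
    remaining = [failures]

    def transaction():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ClusterConflict("conflict detected at commit")
        return "committed"

    return transaction

result, attempts = run_with_retry(make_flaky_transaction(failures=2))
print(result, "after", attempts, "attempts")
```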

Regards,
Alex


Baron Schwartz

Mar 26, 2012, 5:03:26 PM

to percona-d...@googlegroups.com

I think it is helpful to understand the locking semantics of Percona XtraDB Cluster. It is one of the few semantic differences compared to using a single server with InnoDB.

1. With InnoDB, locking is pessimistic. That means that if a row must be locked, it is locked immediately and held until the end of the transaction. It's possible for a cycle in locks to be created, which causes a deadlock. But we will never come to the commit and discover that we need a lock on a row we didn't get. No after-the-fact surprises.

2. With Percona XtraDB Cluster, the transaction is isolated to the node on which it runs, with no cross-node communication until commit time. That means that locking is pessimistic *locally* but optimistic *cluster-wide*. It is possible to finish all the work and try to commit, and then discover that there was a conflict on another node, at which point the transaction fails.
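
The difference can be illustrated with a toy model of commit-time conflict detection. This is not Galera's real certification algorithm, and the class and method names are invented; it only shows how two transactions can both do all their work locally and have one of them fail at commit.

```python
class ToyCluster:
    """Toy model of optimistic, commit-time conflict detection (first committer wins)."""

    def __init__(self):
        self.commit_seq = 0     # number of write sets committed so far
        self.last_writer = {}   # row id -> commit sequence number that last wrote it

    def begin(self):
        """A transaction remembers the cluster state it started from."""
        return {"start_seq": self.commit_seq, "rows": set()}

    def write(self, txn, row_id):
        """Writes are only recorded locally; no other node is contacted."""
        txn["rows"].add(row_id)

    def commit(self, txn):
        """Fail if any written row was committed by another transaction after we began."""
        for row_id in txn["rows"]:
            if self.last_writer.get(row_id, 0) > txn["start_seq"]:
                return False
        self.commit_seq += 1
        for row_id in txn["rows"]:
            self.last_writer[row_id] = self.commit_seq
        return True

cluster = ToyCluster()
t1 = cluster.begin()          # runs on one node
t2 = cluster.begin()          # runs concurrently on another node
cluster.write(t1, 500519)
cluster.write(t2, 500519)     # same row, nobody notices yet
print(cluster.commit(t1))     # first committer wins
print(cluster.commit(t2))     # conflict is discovered only at commit time
```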

- Baron

Henrik Ingo

Apr 19, 2012, 1:27:18 AM

to percona-d...@googlegroups.com

On Mon, Mar 26, 2012 at 7:49 AM, Davey Shafik <m...@daveyshafik.com> wrote:
> - Traditional replication: 250 qps
> - Percona Cluster: 1000 qps

Just to point out that this is normal if you compare asynchronous vs
synchronous replication with something single-threaded. It is
basically a meaningless measurement; your sysbench tests with multiple
threads will be more interesting.

> I am using sysbench to benchmark, this is my setup:
>
> # 1M rows, imported in via the ELB (takes 40s, 25K writes/s)
> $ sysbench --test=oltp --mysql-table-engine=innodb
> --oltp-table-size=1000000
> --mysql-host=xdbc-cluster-1125795606.us-east-1.elb.amazonaws.com
> --mysql-user=root prepare
>
> The MAX(id) is 3999999 (not sure why…), but there is 1M records.
>
> # 10K requests; reconnecting on each transaction, complex test mode:
> Variable threads
> $ sysbench --num-threads=<INT> --max-requests=10000 --test=oltp
> --oltp-table-size=1000000
> --mysql-host=xdbc-cluster-1125795606.us-east-1.elb.amazonaws.com
> --mysql-user=root --oltp-reconnect-mode=transaction
> --oltp-test-mode=complex run

Note that sysbench 0.5 ignores oltp-reconnect-mode; the functionality
is not there.

> When I run with 16 threads, I consistently get (for example):
>
> ALERT: failed to execute mysql_stmt_execute(): Err1062 Duplicate entry
> '500519' for key 'PRIMARY'
> FATAL: database error, exiting...

Others have suggested that this is due to rollbacks; however, sysbench
does handle those properly. Well, at least sysbench 0.5 does; I don't
know if you used 0.4.

One thing you are missing is that when benchmarking a Galera cluster
you must set
--oltp-auto-inc=off

This is because sysbench assumes a continuous sequence for the primary
key, whereas with a Galera cluster you end up with "holes" in your
auto_increment sequence. Setting --oltp-auto-inc=off makes sysbench
generate the primary key values itself, so the sequence is continuous.
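
The size of those "holes" can be estimated with a short sketch. It assumes the table was loaded through one node with an auto-increment increment of 4 and an offset of 3 (an assumption, chosen because it matches the MAX(id) of 3999999 reported earlier in the thread), and that the benchmark expects ids 1..1000000 to exist.

```python
NODES = 4
ROWS = 1_000_000
OFFSET = 3  # assumption: consistent with MAX(id) = 3999999 for 1M rows

existing_ids = range(OFFSET, OFFSET + NODES * ROWS, NODES)  # 3, 7, 11, ...
assumed_ids = range(1, ROWS + 1)                            # continuous 1..N

# Membership tests on a range of ints are constant time.
hits = sum(1 for i in assumed_ids if i in existing_ids)
print(f"{hits} of {ROWS} assumed ids exist ({hits / ROWS:.0%})")
```

Under these assumptions only a quarter of the ids in the assumed range correspond to a real row.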

I think the above error may be due to this, at least if you were
running sysbench 0.5.

henrik

--
henri...@avoinelama.fi
+358-40-8211286 skype: henrik.ingo irc: hingo
www.openlife.cc

My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559

Hai Zhang

Apr 19, 2012, 3:01:30 AM

to percona-d...@googlegroups.com

Hi, there,

I also have a question: how should I test the performance difference between MySQL async replication and semi-sync replication? Can anyone give me some ideas? I mean, which key points MUST I focus on during testing?

Thanks so much.


--

Regards,

ZhangHai (MySQL DBA)

CNTV

Alex Yurchenko

Apr 19, 2012, 7:27:42 AM

to percona-d...@googlegroups.com

On 2012-04-19 10:01, Hai Zhang wrote:
> Hi, there,
>
> I also have one question that how to test the difference of
> performance
> between mysql async-replication and semi-replication. Anyone gives me
> some
> idea? I mean, which points are key during the testing i MUST focus
> on.
> Thanks so much.

Well, latencies and throughput are all you can measure directly.

From what I saw, when benchmarking MySQL async replication you'll
always have an issue of slave lag. (I suspect it may be the case with
semi-sync as well.) Different people treat it differently. Some correct
the throughput for the lag, some just note how long it was. Neither
approach gives you a real picture, because the main difference here may
not be in performance.
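
One way to "correct the throughput for the lag" is to count the run as finished only once the slave has applied everything. The sketch below shows that calculation; the function name is made up and the numbers are purely hypothetical, not taken from any run in this thread.

```python
def lag_corrected_tps(transactions, run_seconds, catch_up_seconds):
    """Throughput measured until the slave has caught up, not just until the master is done."""
    return transactions / (run_seconds + catch_up_seconds)

# Hypothetical numbers for illustration only.
raw = lag_corrected_tps(100_000, 200.0, 0.0)
corrected = lag_corrected_tps(100_000, 200.0, 50.0)
print(f"raw: {raw:.0f} tps, lag-corrected: {corrected:.0f} tps")
```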

The whole benchmark thing, as I understand it, tries to achieve maximum
score by maxing out your hardware. In production you never want to max
out your hardware, so it will operate in a different mode. What becomes
more important in production is _qualitative_ properties of the
software. So if you're testing, focus on how well the software fits your
requirements, not only on how fast it is. It is not that likely that
performance will be a decisive factor for you (although of course it may
be).

Regards,
Alex

--

Hai Zhang

Apr 19, 2012, 11:43:54 PM

to percona-d...@googlegroups.com

Alex, thanks so much for your words.

Earlier I used sysbench to test the async and semi-sync replication modes, but the result was not what I expected (maybe I am wrong). As I understand it, a master in async replication mode should handle more queries and run faster than one in semi-sync mode, because a semi-sync master must wait for an ACK from at least one slave, up to the configured timeout. Waiting costs time.

Here is the related info:

- Version

MySQL 5.5.23

- Cmd

sysbench --num-threads=32 --max-requests=1000000 --test=oltp --oltp-table-size=1000000 --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --mysql-host=Master_IP

- Output from Master

Mode                iostat       sysbench
semi-replication    ~2900 tps    555 T/s | 83 ms (95%)
async-replication   ~2000 tps    489 T/s | 92 ms (95%)

Given these results, I doubt whether my testing method is sound, which is why I posted my question here. I am still trying to understand semi-sync replication better. Please correct me where I am wrong. Thanks, all.
