Big performance hit after migrating from standalone MySQL server to MariaDB Galera cluster


nonshatter

Sep 25, 2015, 12:28:48 PM
to codership
Hi,

We have a process that runs overnight on several machines, with around 20 processes that all connect to the database and do a lot of inserts against two tables.

We normally truncate the tables and insert around 25 million rows during each daily run. All of this used to take about 6 hours when we had a standalone MySQL instance (No replication, InnoDB tables).

Now that we have pointed the process at our shiny new MariaDB Galera cluster (load balanced using HAProxy), we are seeing a massive drop in performance, with the process taking about 15 hours to run.

Obviously there's a trade-off between high availability and performance, but we didn't expect it to be so significant. I'm no database expert, but can anyone kindly point me in the right direction to try to track down the bottleneck, or suggest how to speed things up?

Here's our galera config:

[galera]
wsrep_provider=/path/to/libgalera_smm.so
wsrep_cluster_address=gcomm://x.x.x.x,y.y.y.y,z.z.z.z
wsrep_slave_threads=1
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
query_cache_size=0


The application itself runs in a giant loop, doing something like this. No changes have been made to it since pointing it at MariaDB, but there might be ways of optimising it?

FOR ( 1..1,500,000 )
{
  IF ( SELECT COUNT(1) FROM table1 WHERE id=$id )
  {  
    DELETE FROM table1 WHERE id=$id
    DELETE FROM table2 WHERE id=$id
  }
  INSERT record INTO table1 ...
  FOR ( 1 to 20 )
  {
    INSERT record INTO table2
  }
  COMMIT;
}

Thanks in advance


Jay Janssen

Sep 25, 2015, 1:10:33 PM
to nonshatter, codership
So there is 1 transaction for every iteration of the outer loop?

There are a few things to consider with Galera over "normal" replication:
  • Every commit (auto-commit or otherwise) incurs a round trip to the other nodes in the cluster. This will be a lot slower than a single local commit in InnoDB, especially as the Galera nodes get further apart.
  • Even if you tested on a cluster of one node, if Galera was in operation, each transaction must also 'certify' before finishing a commit (this happens after replication when cluster size > 1). This too slows down commits.
  • Galera doesn't like large transactions, so if any of the transactions or statements above affects a lot of rows, it can interfere with other replication happening on the cluster at the same time.
  • Galera uses something called 'flow control' to limit apply lag across the cluster. This kicks in early and often, so 20 processes writing simultaneously across the cluster with only 1 applier thread on each node can cause a backlog.
The usual workarounds for the above are:
  • Use small transactions
  • Add parallelism to your processing
  • Increase wsrep_slave_threads to improve the node apply rate (see the sketch below)
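
For the last point, a minimal my.cnf sketch of what that looks like (8 is only an illustrative starting point, not a recommendation for your particular hardware):

[galera]
# apply replicated write-sets in parallel; your posted config has this at 1
wsrep_slave_threads=8

The first two points are application-side changes, so there's no config snippet for those.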




--
Jay Janssen
Managing Consultant, Percona

alexey.y...@galeracluster.com

Sep 25, 2015, 6:38:22 PM
to Jay Janssen, nonshatter, codership
All true, plus HAProxy will at least double the client-server latency if
not limit the overall throughput.

On 2015-09-25 20:10, Jay Janssen wrote:
> So there is 1 transaction for every iteration of the outer loop?
>
> There are a few things to consider with Galera over "normal"
> replication:
>
> - Every commit (auto-commit or otherwise) incurs a roundtrip to the
> other nodes in the cluster. This will be a lot lower than a single
> local
> commit in Innodb, especially as the Galera nodes get further away
> - Even if you tested on a cluster of one node, if Galera was in
> operation, each transaction must also 'certify' before finishing a
> commit
> (this happens after replication when cluster size > 1). This too
> slows
> down commits
> - Galera doesn't like large transactions, so if any of the
> transaction(s) or statements above affects a lot of rows it can
> cause
> problems with other replication on the cluster that may also be
> happening.
> - Galera uses something called 'flow control' to limit apply lag
> across
> the cluster. This kicks in early and often, so your 20 processes
> doing
> simultaneous writes across the cluster, but only 1 applier thread on
> each
> node can cause backlog.
>
> Usually the workarounds for compensating for the above is:
>
> - Use small transactions
> - Add parallelism to your processing
> - Increase wsrep_slave_threads to improve node apply rate

James Wang

Sep 28, 2015, 5:07:21 AM
to codership, jay.j...@percona.com, lloyd...@googlemail.com
Sir, 
What is the solution please?  Use an alternative to HAProxy? 

James Wang

Sep 28, 2015, 5:11:55 AM
to codership, lloyd...@googlemail.com
Hi Jay,

What is the recommended value for wsrep_slave_threads please?  We tuned it to 32.
By parallelism, did you mean writing some script with multi-threaded capability to import the data?
Small transactions: how small is small, please?  Would you please elaborate with an example?  For example, commit every 1000 iterations?

Thanks a lot in advance

alexey.y...@galeracluster.com

Sep 28, 2015, 5:49:47 AM
to James Wang, codership, jay.j...@percona.com, lloyd...@googlemail.com
On 2015-09-28 12:07, James Wang wrote:
> Sir,
> What is the solution please? Use an alternative to HAProxy?

Any middleman will double the hops - hence double the latency. Plus
HAProxy is single-threaded, and user-space connection proxying is VERY
expensive, maybe more expensive than processing the actual query. Draw
your own conclusions.

Nothing beats direct connections to the server.

James Wang

Sep 28, 2015, 6:14:43 AM
to codership, jwang...@gmail.com, jay.j...@percona.com, lloyd...@googlemail.com


On Monday, 28 September 2015 10:49:47 UTC+1, Alexey Yurchenko wrote:
On 2015-09-28 12:07, James Wang wrote:
> Sir,
> What is the solution please?  Use an alternative to HAProxy?

Any middle man will double the hops - hence double the latency. Plus
HAProxy is single threaded and user space connection proxying is VERY
expensive, maybe more expensive than processing the actual query. Make
you conclusions.

Nothing beats direct connections to the server.


True, indeed.  However, how do you achieve auto-fail-over then?
Thanks in advance

Jay Janssen

Sep 28, 2015, 7:52:37 AM
to James Wang, codership, lloyd...@googlemail.com
On Mon, Sep 28, 2015 at 5:11 AM, James Wang <jwang...@gmail.com> wrote:
Hi Jay,

What is the recommend value for wsrep_slave_threads please?  We tuned it to 32.

Greater than 1, but probably not much bigger than 1-2x your CPU cores, in my opinion. Practically, in most workloads you won't see wsrep_apply_window much above 1, but there are definitely cases where parallel apply makes a big difference.
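
If you want to experiment without a restart, wsrep_slave_threads is a dynamic variable, so something along these lines should work (16 is just a hypothetical 2x-8-core figure, not a recommendation):

SET GLOBAL wsrep_slave_threads = 16;
-- then see how much parallel apply you actually get
SHOW GLOBAL STATUS LIKE 'wsrep_apply_window';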

 
By parallelism, did you mean to write some script with multi-threaded capability to import data?

Yes, I mean that the work is spread out and parallelized within the application.

 
Small transactions: how small is small please?  Would you please elaborate by an example?  For example, do a commit every 1000 iterations?

I think the only way to understand what large transactions do (and to get an idea of how big is too big) is to use myq_status in my myq-tools repo here: https://github.com/jayjanssen/myq-tools -- use the 'wsrep' view on one of your nodes.

This tool will show you replication traffic, second by second. Large transactions going through will show up in a few ways:
  • One or more "stalls" in normal replication traffic, where the counters go to 0. These can last a second or two, or considerably longer.
  • The stalls are usually accompanied by some large 'data' increases. For example, your normal replication traffic may be 100-200KB, but a large transaction going through may bump that up to 10MB or more.
I don't have a quick copy-paste of this to show you, but it's easy to reproduce.
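
If you want a quick spot check without the tool, the same signals are in the raw status counters it reads. Sampling something like this a few seconds apart on any node will show the per-second change (these are standard Galera status variables):

SHOW GLOBAL STATUS WHERE Variable_name IN
  ('wsrep_replicated_bytes',      -- total write-set bytes this node has replicated
   'wsrep_local_recv_queue',      -- write-sets queued waiting to be applied locally
   'wsrep_flow_control_paused');  -- fraction of time replication was paused by flow control

A stall plus a big jump in wsrep_replicated_bytes between samples is the large-transaction signature described above.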

James Wang

Sep 28, 2015, 9:05:28 AM
to codership, jwang...@gmail.com, lloyd...@googlemail.com
Thanks a lot.

Pablo Medina

Sep 28, 2015, 9:46:15 AM
to James Wang, codership, lloyd...@googlemail.com
James,

If you are interested in an alternative to HAProxy and you are using Java, you may take a look at: https://github.com/despegar/galera-java-client

Regards,
Pablo


James Wang

Sep 28, 2015, 9:51:16 AM
to codership, jwang...@gmail.com, lloyd...@googlemail.com
Thanks a lot.  Shall give it a try.

alexey.y...@galeracluster.com

Sep 28, 2015, 1:25:29 PM
to James Wang, codership, jay.j...@percona.com, lloyd...@googlemail.com
On 2015-09-28 13:14, James Wang wrote:
> On Monday, 28 September 2015 10:49:47 UTC+1, Alexey Yurchenko wrote:
>>
>> On 2015-09-28 12:07, James Wang wrote:
>> > Sir,
>> > What is the solution please? Use an alternative to HAProxy?
>>
>> Any middle man will double the hops - hence double the latency. Plus
>> HAProxy is single threaded and user space connection proxying is VERY
>> expensive, maybe more expensive than processing the actual query. Make
>> you conclusions.
>>
>> Nothing beats direct connections to the server.
>>
>
>
> True, indeed. However, how do you achieve auto-fail-over then?
> Thanks in advance

Client application logic, libglb, DNS load balancing - sky is the limit
;)

James Wang

Sep 29, 2015, 4:28:07 AM
to codership, jwang...@gmail.com, jay.j...@percona.com, lloyd...@googlemail.com


On Monday, 28 September 2015 18:25:29 UTC+1, Alexey Yurchenko wrote:
>> Any middle man will double the hops - hence double the latency. Plus
>> HAProxy is single threaded and user space connection proxying is VERY
>> expensive, maybe more expensive than processing the actual query. Make
>> you conclusions.
>>
>> Nothing beats direct connections to the server.
>>
>
>
> True, indeed.  However, how do you achieve auto-fail-over then?
> Thanks in advance

>Client application logic, libglb, DNS load balancing - sky is the limit ;)


Thanks a lot. 

James Wang

Sep 29, 2015, 6:38:15 AM
to codership
Hi,

Any feedback/update after Jay and Alexey's suggestions please?

Lloyd Roles

Sep 29, 2015, 6:48:13 AM
to James Wang, codership
Hi, 

I have run a couple of tests to try to isolate what is causing the performance issue.

After taking HAProxy out of the picture and pointing the application at one of the cluster nodes, I am seeing a significant improvement (about 35% faster), at the cost of having no failover, of course.

This is a good article which I am just digesting: 

    http://www.severalnines.com/blog/benchmark-load-balancers-mysqlmariadb-galera-cluster

This should convince operations that we need to try something else. 

As for Jay's suggestion of increasing the number of applier threads, I haven't got that far yet. I'll let you know if that yields more speed.

Thanks

James Wang

Sep 29, 2015, 7:26:32 AM
to codership, jwang...@gmail.com
Thanks a lot for the update.

Have you tried increasing wsrep_slave_threads? (We tuned it to 32 - roughly 4 times the CPU count.)

Also, in your outer for loop, there is a commit in each iteration.  How about committing, say, every 50 iterations - something like the sketch below?
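
Roughly, in the same pseudocode style as the original post (50 is only an example batch size, not a tested number):

FOR ( 1..1,500,000 )
{
  ... same DELETE / INSERT work as before ...
  IF ( iteration % 50 == 0 )
  {
    COMMIT;
  }
}
COMMIT;

The trade-off is that a failure now rolls back up to 50 iterations' worth of work, and each write-set becomes roughly 50x larger, so the batch size shouldn't be pushed so high that it runs into the large-transaction problems Jay described.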

Warren Johnson

Aug 4, 2019, 1:31:18 PM
to codership
Four years on, I wanted to hop in and make a comment in case anyone comes across this. 

By the very nature of Galera, the entire cluster has to agree to do an insert. This communication causes latency as you'd expect.  Individual inserts can suffer greatly if done in large quantities.  A solution is to group your inserts into single statements.

Instead of 100 single-row statements like

INSERT INTO my_table (col1) VALUES (value1)
...
INSERT INTO my_table (col1) VALUES (value100)

do one big multi-row insert:

INSERT INTO my_table (col1) VALUES (value1), (value2), ..., (value100)

The enormous latency comes from the nodes having to agree 100 times to do something.  Instead, have them agree once.
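
Applied to the original poster's loop, for example, the 20 single-row inserts into table2 per outer iteration could become one statement. A rough sketch, with made-up column names since the real schema was never posted:

INSERT INTO table2 (id, col_a, col_b)
VALUES
  ($id, 'a1',  'b1'),
  ($id, 'a2',  'b2'),
  -- ... 17 more rows ...
  ($id, 'a20', 'b20');

One caveat: that loop already commits once per outer iteration, so the 20 inserts are already a single write-set, and the multi-row form mainly saves client-server round trips there. The 100-agreements-versus-one saving applies in full when each INSERT is its own autocommit transaction.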