Slow MaxScale performance compared to direct connection


dongy...@gmail.com

Nov 25, 2015, 11:29:57 PM
to MaxScale
Hi everyone,

I have been trying out MaxScale in my Ubuntu VM. I just wanted to see it working with the simplest possible setup, yet I have found that MaxScale significantly lowers the throughput of the database compared to a direct connection. I think it is similar to what was discussed in this thread (https://groups.google.com/forum/#!searchin/maxscale/slow/maxscale/xYJ5LAlHtx0/yR8gRMwiOiAJ) about a year ago, and that thread did not reach a clear answer to the problem.

I am using MariaDB 10.0.21 and MaxScale 1.2.1. I have run the TPC-C benchmark with OLTPBenchmark (http://oltpbenchmark.com/wiki/index.php?title=Main_Page - site is broken atm). I just use a single readconnroute service with a listener and nothing else. The throughput difference between MaxScale and a direct connection is about 15~20x:

1) with MaxScale: Rate limited reqs/s: Results(nanoSeconds=30001768896, measuredRequests=350) = 11.665978803225297 requests/sec
2) with direct connection: Rate limited reqs/s: Results(nanoSeconds=30002113184, measuredRequests=5859) = 195.2862441411154 requests/sec

What could be the reason behind such a huge overhead? Any help would be greatly appreciated. The following is the maxscale.cnf that I am currently using:

[maxscale]
threads=4

[Read Connection Router]
type=service
router=readconnroute
servers=server1
user=root
passwd=
enable_root_user=true

[Read Connection Listener]
type=listener
service=Read Connection Router
protocol=MySQLClient
port=3600
socket=/home/dyoon/maxscale/readconn.sock

[server1]
type=server
address=127.0.0.1
port=3400
protocol=MySQLBackend
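
To be explicit about the two setups being compared: the benchmark connects either through MaxScale's listener on port 3600 or straight to the backend on port 3400, i.e. roughly the difference between these two connections (purely illustrative, not how the benchmark itself connects):

mysql -u root -h 127.0.0.1 -P 3600 -e "SELECT 1"   # through MaxScale
mysql -u root -h 127.0.0.1 -P 3400 -e "SELECT 1"   # direct connection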

Thanks,
Dong Young

James Wang

Nov 26, 2015, 6:19:48 AM
to MaxScale, dongy...@gmail.com
Hi,

Do you have figures for HAProxy please?

Markus

Nov 26, 2015, 6:29:57 AM
to maxs...@googlegroups.com
Hi,

Your configuration will never be faster than a direct connection
because you only use one server. Compared to a direct connection to
the database, you have to take an extra network hop through MaxScale,
which results in slower connections and lower benchmark results. This
is true for all proxy solutions regardless of the implementation.

If you're looking for a throughput increase, you need more than one
server. MaxScale gives the largest throughput increase when the
backend server is the bottleneck, because that is when it can load
balance queries to the slave servers with fewer ongoing operations.
I'd suggest running a benchmark with MaxScale in front of a cluster
(a three-node Galera cluster, for example) and comparing those
results with a single server.

If the benchmark has transactions or modifies data, I'd suggest trying
the benchmark with the readwritesplit module. It can detect write
operations and direct them to the master while still load balancing
the read operations across all available slaves. For the best read
performance, SELECT statements should not be wrapped in transactions.
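
As a rough sketch only, such a service section could look something
like the following (the listener port, the extra server names
server2/server3 and the credentials are placeholders; each server
also needs its own [serverN] section like the [server1] you already
have):

[RW Split Router]
type=service
router=readwritesplit
servers=server1,server2,server3
user=maxscale_user
passwd=maxscale_pwd

[RW Split Listener]
type=listener
service=RW Split Router
protocol=MySQLClient
port=3601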

Markus
--
Markus Mäkelä, Software Engineer
MariaDB Corporation
t: +358 40 7740484 | Skype: markus.j.makela

James Wang

Nov 26, 2015, 6:49:41 AM
to MaxScale

Agreed.

However, 15-20 times slower is more than I can accept.

Mudd, Simon

Nov 26, 2015, 5:02:08 PM
to MaxScale

> On 26 Nov 2015, at 05:29, dongy...@gmail.com wrote:

… [stuff omitted]

This reminds me somewhat of performance_schema in MySQL. Perhaps slightly off-topic but perhaps not.

Profiling the maxscale binary will probably help find where it spends most of its time, but that will depend on the specific run, so it may be
helpful but will not be generic.
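
For example, on Linux something like the following against a running instance would give a rough first picture (the sampling options here are
just an illustration, and useful output assumes the maxscale binary was built with symbols):

perf record -F 99 -g -p $(pidof maxscale) -- sleep 30
perf report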

While there are people who complain about the P_S overhead in MySQL, and obviously MaxScale needs to be as fast as possible, providing some
internal metrics in MaxScale which can be queried from outside might be quite interesting and might help both developers and users see
where MaxScale is busy and what is causing any “slowness”.

The question above clearly indicates a perceived problem, and as Markus says it’s impossible to stick a proxy in between a client and a server
without triggering some sort of overhead. But it’s clearly much better, if possible, to be able to measure things and come up with some numbers,
so that the developers can see where to focus their time, or what may be responsible for the perceived loss in performance.

So I would welcome a future development of MaxScale including something like MySQL's performance_schema, which would
allow us to answer such questions more easily.

Simon

dongy...@gmail.com

Nov 26, 2015, 6:11:02 PM
to MaxScale, dongy...@gmail.com
On Thursday, November 26, 2015 at 3:19:48 AM UTC-8, James Wang wrote:
FYI, I got the throughput figure for HAProxy (1.6). Again, I used the simplest setup for HAProxy as well: a single MariaDB server + HAProxy. The throughput with HAProxy is as follows:

Rate limited reqs/s: Results(nanoSeconds=30003744026, measuredRequests=5172) = 172.3784870154258 requests/sec

This throughput is roughly what I expected from MaxScale as well (about 10~20% overhead). The following is the config I used for HAProxy:

global
defaults
timeout client 30s
timeout server 30s
timeout connect 30s

listen mysql-cluster
bind 127.0.0.1:3800
mode tcp
option mysql-check user haproxy_check
balance roundrobin
server mysql-1 127.0.0.1:3400 check
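
Note that 'option mysql-check user haproxy_check' assumes a haproxy_check user exists on the MariaDB server for the health check to succeed. The benchmark then simply points at the HAProxy frontend on 127.0.0.1:3800 instead of the backend on 3400, roughly the equivalent of:

mysql -u root -h 127.0.0.1 -P 3800 -e "SELECT 1"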

dongy...@gmail.com

Nov 26, 2015, 6:16:45 PM
to MaxScale
On Thursday, November 26, 2015 at 3:29:57 AM UTC-8, Markus Mäkelä wrote:
Thank you for the reply. I agree with you and totally understand that using a proxy incurs an overhead and that performance will not improve unless you use multiple servers to distribute the workload. However, I did not expect to see such a huge overhead (i.e., an approx. 95% decrease in performance) with a simple configuration, which is why I reported and asked about the issue in this forum. People would normally expect a 5~20% overhead, and I suppose the number I got from HAProxy looks reasonable in that sense. I just wanted to know whether this is a known issue or whether I am simply configuring MaxScale incorrectly.

dongy...@gmail.com

Nov 27, 2015, 12:47:26 AM
to MaxScale, dongy...@gmail.com
I have found out what has been causing the huge overhead: it was the number of threads. It works best with 'threads=1', while previously I was testing with 'threads=4'. MaxScale starts to hog the CPU as more threads are polling, which in turn degrades the performance of the DBMS significantly. I guess it depends on the caliber of the system on which you run MaxScale + MariaDB, but it seems that 'threads=1' is enough for small systems. I wonder why 'maxscale_template.cnf' has 'threads=4' as the default...

FYI, the following is the throughput I get w.r.t. # of threads in MaxScale:

thread = 1
00:27:43,403 (DBWorkload.java:707) INFO - Rate limited reqs/s: Results(nanoSeconds=30000304382, measuredRequests=5067) = 168.89828634672685 requests/sec

thread = 2
00:28:35,503 (DBWorkload.java:707) INFO - Rate limited reqs/s: Results(nanoSeconds=30001305569, measuredRequests=3444) = 114.79500424003697 requests/sec

thread = 4
00:30:39,698 (DBWorkload.java:707) INFO - Rate limited reqs/s: Results(nanoSeconds=30006436827, measuredRequests=618) = 20.59558099360599 requests/sec

thread = 8
00:31:46,944 (DBWorkload.java:707) INFO - Rate limited reqs/s: Results(nanoSeconds=30001248566, measuredRequests=220) = 7.3330281410128695 requests/sec
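
To be clear, the only change between these runs was the threads value in the [maxscale] section of the config I posted above, e.g. for the fastest case:

[maxscale]
threads=1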


Markus

Nov 27, 2015, 12:52:16 AM
to maxs...@googlegroups.com
Hi,

Good to hear that the reason for the huge overhead was found. We know
that a fixed default value isn't optimal, and we've implemented
automatic detection of CPU cores for 1.3.0, which should make MaxScale
perform better without fine-tuning the thread count. With this we can
probably remove the threads parameter from the template configuration file.

Markus

James Wang

Nov 27, 2015, 4:03:44 AM
to MaxScale, dongy...@gmail.com


On Friday, 27 November 2015 05:47:26 UTC, dongy...@gmail.com wrote:
… [quoted message omitted]


Good to know you found the issue.

Is the overhead in the 10-20% range now?


Martin Brampton

Nov 27, 2015, 6:37:36 AM
to maxs...@googlegroups.com
I'm glad to hear you've resolved the issue. The configuration guide
recommends starting with a single thread, but the default configuration
does have 4 threads.

We're interested to know exactly how the performance came to be so bad.
What was the exact configuration of your VM? Was it a public service
(VPS) or a VM on your own system?
--
Martin Brampton, Principal Software Engineer
MariaDB Corporation | t: +44 1751-432935 | Skype: blacksheepresearch