Connection pooling in proxysql

5,407 views
Skip to first unread message

Liang Guo

unread,
Jan 17, 2016, 8:54:31 PM1/17/16
to proxysql
Hi,

I have a question regarding connection pooling in proxysql. From notes, so-called connection pool are available idle backend connections to mysql servers, and they are used to complete new client connection requests fast. Once a connection from the pool is used for a client connection, it will stay used by that client until client disconnects. So, a backend connection from the connection pool cannot be shared by more than one active client connection at any time. Or, with proxysql connection pool, there cannot be N client connections and M backend connections, where N > M (e.g. N = 1000 and M = 100). Is my understanding correct?

Thanks,
Liang

René Cannaò

unread,
Jan 17, 2016, 9:04:49 PM1/17/16
to proxysql
Hi Liang,

Your definition of connection pool is generally correct : a connection will stay used by that client until client disconnects.
Although, ProxySQL doesn't have a simple connection pool: it implements multiplexing.

Quoting from an article I recently wrote:

When a client connects to ProxySQL no connection to any backend is established. When a client issues a request ProxySQL determines if the request needs a connection to the backend or not, and only if required it forwards the request to a backend. As soon as the request is completed, ProxySQL determines if the connection to the backend is still required, and if not it returns it to a connection pool.
That is: the connection to the backend is returned to the connection pool not when the client disconnects, but when the request is completed, assuming that it can be reused by other clients (for example if there are no active transactions).
Similarly, when a backend connection is linked to a client connection, ProxySQL will ensure that the environment is set correctly : schema, charset, autocommit, etc .
In other words, ProxySQL doesn't just implement Persistent Connection, but also Connection Multiplexing. In fact, ProxySQL can handle hundreds of thousands of clients, yet forward all their traffic to few connections to the backend.
So ProxySQL can handle N client connections and M backend connections , where N > M (even N thousands times bigger than M).

Liang Guo

unread,
Jan 17, 2016, 11:35:13 PM1/17/16
to proxysql
Hi Rene,

Thanks for the fast reply. Sorry, I missed the "connection multiplexing" and saw it in your blog post. I'd like to try our proxysql for testing. It's a bit unclear how to write a good proxysql.cnf, is there some example cnf file for reference. I have seen 2 styles in different posts.

I find the following 2 [mysql] config parameters in the old github repo "proxysql-old", and there the doc still listed mysql_share_connections was experimental. From your recent presentation slides, you mentioned that "some edge cases not supported". What kind of edge cases do you refer to?

mysql_connection_pool_enabled
mysql_share_connections

Thanks,
Liang

René Cannaò

unread,
Jan 18, 2016, 12:57:36 PM1/18/16
to proxysql
Hi Liang,

Although proxysql is configurable through config file, my goal is to make it configurable completely from its admin interface. Among other reasons, it should be possible to configure and re-configure proxysql dynamically without ever logging into the box where it is installed (this is indeed already possible) and use relation between tables to create complex configuration.

With regards on how to configure it, please have a look at https://github.com/sysown/proxysql/tree/master/doc .
If there is anything that isn't clear enough I will be happy to assist with any question.

"mysql_share_connections" and "mysql_connection_pool_enabled" : these are configuration that do not exist anymore. They were used in "proxysql-old" , the old/prototype version of ProxySQL .
Now the current global variables are described here : https://github.com/sysown/proxysql/blob/master/doc/global_variables.md
Although the documentation is not in sync with the development , and I will have to soon update the documentation .

With regards to "some edge cases not supported" : at the time of the presentation, ProxySQL had only 1 criteria to determine if multiplexing had to be disabled: if a transaction was running, multiplexing was disabled until the transaction would rollback or commit.
The edge cases that were not supported but now are supported are LOCK TABLES , SELECT GET_LOCK() , TEMPORARY TABLE ,  and the use of session variables. Now these edge cases are all supported.

Connections multiplexing is now always enabled, and the current version doesn't allow to disable it, at all!!
Although, a user of ProxySQL had a specific need to disable connections multiplexing because it hit a bug in Galera/MariaDB : https://mariadb.atlassian.net/browse/MDEV-8338
In ProxySQL this is tracked in https://github.com/sysown/proxysql/issues/468 .
This enhancement (and others) are available on branch v1.1.1 : https://github.com/sysown/proxysql/tree/v1.1.1 .
If you are running testing, I recommend to use branch v1.1.1

To disable multiplexing is as easy as:
UPDATE global_variables SET variable_value='false' WHERE variable_name='mysql-multiplexing';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;

Regards,
René

Liang Guo

unread,
Jan 18, 2016, 8:16:26 PM1/18/16
to proxysql
Hi Rene,

I think a good file based configuration should be still useful. For example, initial deploy to several hosts using chef would require a predefined config file, and admin interface may be helpful with dynamic change at runtime.

For connection pooling, if it is always enabled. What about the number of backend connections in the pool? Is it configurable through config file or admin interface? It'd be useful to configure it in both way, I guess.

Regarding query routing rule, if I just want to use one host group which may contain 1 mysql server only, I'd prefer the fast forwarding mode, right? In that case, do I need `mysql_query_rules:{}` in the config file? I apologize if this is already documented in the configuration.md. I will look closely at it when I am ready to set up proxysql and try it out.

From you blog post on benchmarking against maxscale, it's interesting that Proxysql has much better response time and QPS than Maxscale. How is Proxysql different from maxscale in terms of async network i/o handling (poll, epoll) such that Proxysql performs better? Maxscale does 2 read(), several malloc (for byffer) per request serving. In addition to that, routing logic. Proxysql has some routing logic too. How Proxysql differentiate in terms of performance? 

Very interesting project. I will definitely try it out.

Thanks,
Liang

René Cannaò

unread,
Jan 20, 2016, 5:33:05 PM1/20/16
to proxysql
Hi Laing,

answer inline:


On Monday, January 18, 2016 at 5:16:26 PM UTC-8, Liang Guo wrote:
Hi Rene,

I think a good file based configuration should be still useful. For example, initial deploy to several hosts using chef would require a predefined config file, and admin interface may be helpful with dynamic change at runtime.

I agree with this point. In fact, it is in the todo list:
https://github.com/sysown/proxysql/issues/406

With regards to deploy to several hosts using Chef or similar, an option is to deploy the database file (proxysql.db) instead of the config file.

For connection pooling, if it is always enabled. What about the number of backend connections in the pool? Is it configurable through config file or admin interface? It'd be useful to configure it in both way, I guess.

You can configure both in config file (max_connections in mysql_servers group) and through the admin interface (field max_connections in mysql_servers table) the maximum number of connections to the backend, on a per backend basis.
In the admin interface you can also configure the maximum number of free connections as a percentage of max_connections. The setting is mysql-free_connections_pct in the global_variables table, and the default is 10.
That means that if serverA is configured with max_connections=1000 , proxysql keeps up to 100 free connections.

Regarding query routing rule, if I just want to use one host group which may contain 1 mysql server only, I'd prefer the fast forwarding mode, right? In that case, do I need `mysql_query_rules:{}` in the config file? I apologize if this is already documented in the configuration.md. I will look closely at it when I am ready to set up proxysql and try it out.

The fast forwarding mode as described in http://www.proxysql.com/2015/06/sql-load-balancing-benchmark-comparing.html doesn't exist anymore (even if you can find some reference in the code, it needs some cleanup) .
The reason why fast forward doesn't exist anymore is that it disables all the features that makes ProxySQL unique, like multiplexing, query caching, query rewrite, real time analysis of the queries, throttling, error detection and retry, the ability to track number of queries executed per server and bytes transferred, the possibility to run "SHOW PROCESSLIST" in the admin interface and see what is running, transaction timeout, query timeout, etc etc.
If a user doesn't need any of these features, he/she probably doesn't need ProxySQL (or any proxy that understand the MySQL Protocol) and they can get away with HAProxy.
If you want to use one host group which may contain 1 mysql server only, you do not need any mysql_query_rules : you only need to configure the default_hostgroup in mysql_users table (or default_hostgroup in mysql_users{} in config file).
The same apply if you have many servers in a Galera cluster: no need to configure any entry in mysql_query_rules, you only need to define a default_hostgroup .ProxySQL will automatically handle any node failure and re-execute the query when needed, without sending any error to the client.

 
From you blog post on benchmarking against maxscale, it's interesting that Proxysql has much better response time and QPS than Maxscale. How is Proxysql different from maxscale in terms of async network i/o handling (poll, epoll) such that Proxysql performs better? Maxscale does 2 read(), several malloc (for byffer) per request serving. In addition to that, routing logic. Proxysql has some routing logic too. How Proxysql differentiate in terms of performance? 

Actually, MaxScale has a slightly better response time than ProxySQL at very low concurrency (1-4 connections, more details below). But as the number of connections increases, the response time and throughput of ProxySQL is way better than MaxScale.
ProxySQL uses poll() for all the async network I/O . I think one of the reasons ProxySQL has a so great throughput is that there are very few critical sections between the various threads, therefore most of the work done by each thread doesn't interfere with others. To achieve this, MySQL sessions aren't shared by threads, and once a MySQL session (client connection) is assigned to a thread, it will be processed by the same thread until it disconnect. This also means that each thread calls poll() for different file descriptions (the only exception are the ports that ProxySQL uses to accept connections). Although, the connection pool is shared: in future versions I will try to improve this, but according to my benchmark the connection pool (and multiplexing) isn't a bottleneck, at all.
I have been wondering why ProxySQL is slightly slower than MaxScale at low concurrency, and I think I finally fix it:
https://github.com/sysown/proxysql/commit/5917fece65d038e9ca532f8caafd8e1b2bdbc77a
In other words, libmariadbclient was calling poll() after ProxySQL .
I need to run new benchmarks and get new numbers :-)
Thinking out loud, I also need to upgrade libmariadbclient , but as I had to work around several bugs in this version of libmariadbclient (including severe memory leaks), an upgrade will require a lot a lot of testing. As there is no big incentive in upgrading the library, for now I am focusing on other features.

The fact that there is very few critical section allows to scale almost linearly.
Most of the memory allocated in a thread is deallocated in the same thread: jemalloc is very happy of this!

Each thread has its own copy of query rules: this improves performance, but only if there are a lot of rules.


Very interesting project. I will definitely try it out.

Thanks you! :)
 

Steven Hsu

unread,
Dec 14, 2016, 9:07:41 PM12/14/16
to proxysql
Hi Rene,

Is there a documentation on how does connection pooling / multiplexing work on version 1.3.1? 
We are hitting some backend connection timeout errors (10s) - however when the error occurred, the backend was accepting connections normally. 


On Sunday, January 17, 2016 at 5:54:31 PM UTC-8, Liang Guo wrote:
Reply all
Reply to author
Forward
0 new messages