Quick Tutorial on Admin Interface

666 views
Skip to first unread message

René Cannaò

unread,
Apr 15, 2014, 5:33:02 PM4/15/14
to prox...@googlegroups.com

Hostgroups


The majority of the proxies are not able to do read/write split, some of them have a very bad implementation on it , while only few perform read/write split the proper way.


ProxySQL has a completely innovative and original methodology that groups servers into hostgroups, and queries are routed to hostgroups according to specific criteria.

In fact, it is possible to group servers into hostgroups based on their functionality, and based on the nature of the queries ProxySQL will route to a host in a defined hostgroup.

This approach makes read/write split a simple subcase of queries routing into hostgroups:

- write statements are sent to a hostgroup with all the masters

- read statements are sent to a hostgroup with all the slaves


Note that I used the plural “masters” and not “master” on purpose : this means that ProxySQL not only supports the traditional setup of one master and multiple slaves, but also supports setups with multiple masters like NDB and Galera.

These multiple masters can have several slaves all configured in another hostgroup.


To be able to support a variety of replication setups, by design ProxySQL is not replication aware. ProxySQL doesn’t perform any sort of check on replication, and this means that:

- it doesn’t know if replication is broken ;

- it doesn’t know if replication is implemented via native replication, external replication, or any sort of synchronous replication ;

- it doesn’t know if you have replication at all . Why not ? Maybe your slaves replicate from a server that is not in your masters pool, or maybe it is updated only via ETL .


As surprising it may sound, it is a feature the fact that ProxySQL is not replication aware: in fact, it allows you to create arbitrarily complex setups. Although this come with a price: you need to monitor your setup with external tools, and if needed you must re-configure ProxySQL (operation that can be done online).


While ProxySQL is in theory able to handle an arbitrarily large number of hostgroups, currently there is a hardcoded limit of 64 hostgroups.

What can you do with so many hostgroups? Let me give you some examples:

- you can create a hostgroup that manage the majority of the writes;

- you can create a hostgroup to serve the majority of your reads;

- you can create a hostgroup that connect your application to a completely different cluster that manages queue, and use the same database connection;

- you can create a hostgroup that connect your application to a completely different cluster that manages a chat, and use the same database connection;

- you can create a hostgroup to run slow reporting queries against a specific set of servers;

- you can create a hostgroup to run ETL write statements against a set of servers;

- you can create a hostgroup to run ETL select queries against another set of servers;

- you can create a hostgroup to query servers on a different datacenter .


From the example above it looks clear why ProxySQL needs to be not replication aware: if ProxySQL is replication aware, some of the above setups are not possible as there is no relations between hostgroups!


Important note: a single query cannot target multiple hostgroups. That means that you can use the same database connection to query data from multiple hostgroups, but a single query cannot join data from multiple hostgroups.


I am sure one of the questions crossing your mind right now is: how to configure hostgroups?

Hostgroups were not configured in the config file, and actually the config file is not where you can configure hostgroups (at least, not in current release).

So again, from where we can configure hostgroups? And how?


The admin interface : first introduction and hostgroups


If you remember, during the configuration of ProxySQL using proxy_interactive_configurator.pl you were asked to configure an admin interface port and a username and password for it.

To refresh your memory, the defaults were:

proxy_admin_port=6032

proxy_admin_user=admin

proxy_admin_password=admin


You can connect to the admin interface using any application able to send queries and to process resultset using the mysql protocol. For instance, you can use MySQL monitor.


Let’s try that!


vegaicm@voyager:~$ mysql -u admin -padmin -h 127.0.0.1 -P6032

Welcome to the MySQL monitor.  Commands end with ; or \g.

Your MySQL connection id is 3049257792

Server version: 5.1.30 (ProxySQL)


Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.


Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.


Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


mysql>


Welcome to the ProxySQL admin interface!

To distinguish if you are connected to ProxySQL on port 6033 (that forwards traffic to mysqld backends) or on port 6032 (admin interface) I recommend to change the prompt of your mysql monitor client:


mysql> \R proxysql-admin>

PROMPT set to 'proxysql-admin> '

proxysql-admin>


One of the first commands I normally run when connecting to a mysql server is “SHOW DATABASES” , or “SHOW PROCESSLIST” . If you run the same you will get an interesting (but expected!) surprise:


proxysql-admin> SHOW DATABASES;

ERROR 1047 (08S01): near "SHOW": syntax error

proxysql-admin> SHOW PROCESSLIST;

ERROR 1047 (08S01): near "SHOW": syntax error


While the admin interface of ProxySQL can communicate using the MySQL protocol, it is not a mysql server and several commands are not available.

In fact, all the commands you executed in the admin interface are analyzed by ProxySQL looking for well defined commands, and any not recognized command is forwarded to an embedded sqlite3 database .


The first command you should look into is SHOW TABLES :


proxysql-admin> SHOW TABLES;

+------------------+

| tables           |

+------------------+

| server_status    |

| servers          |

| hostgroups       |

| users            |

| global_variables |

| debug_levels     |

| query_rules      |

+------------------+

7 rows in set (0.00 sec)



Great! That is the first step to see what is going on inside ProxySQL .


Let start having a look at some of these tables:


proxysql-admin> SHOW CREATE TABLE server_status\G

*************************** 1. row ***************************

      table: server_status

Create Table: CREATE TABLE server_status ( status INT NOT NULL PRIMARY KEY, status_desc VARCHAR NOT NULL, UNIQUE(status_desc) )

1 row in set (0.00 sec)



The output is not exactly as you would expect from a mysql server, but it is absolutely easy to understand.


Let looks two more tables:


proxysql-admin> SHOW CREATE TABLE servers\G

*************************** 1. row ***************************

      table: servers

Create Table: CREATE TABLE servers ( hostname VARCHAR NOT NULL , port INT NOT NULL DEFAULT 3306 , read_only INT NOT NULL DEFAULT 1, status INT NOT NULL DEFAULT ('OFFLINE') REFERENCES server_status(status) , PRIMARY KEY(hostname, port) )

1 row in set (0.01 sec)


proxysql-admin> SHOW CREATE TABLE hostgroups\G

*************************** 1. row ***************************

      table: hostgroups

Create Table: CREATE TABLE hostgroups ( hostgroup_id INT NOT NULL DEFAULT 0, hostname VARCHAR NOT NULL , port INT NOT NULL DEFAULT 3306, FOREIGN KEY (hostname, port) REFERENCES servers (hostname, port) , PRIMARY KEY (hostgroup_id, hostname, port) )

1 row in set (0.00 sec)



Now that we have a better understanding of some tables structures, let look at their content:



proxysql-admin> SELECT * FROM server_status;

+--------+--------------+

| status | status_desc  |

+--------+--------------+

| 0      | OFFLINE_HARD |

| 1      | OFFLINE_SOFT |

| 2      | SHUNNED      |

| 3      | ONLINE       |

+--------+--------------+

4 rows in set (0.00 sec)


proxysql-admin> SELECT * FROM servers;

+-----------+-------+-----------+--------+

| hostname  | port  | read_only | status |

+-----------+-------+-----------+--------+

| 127.0.0.1 | 23389 | 0         | 3      |

| 127.0.0.1 | 23390 | 1         | 3      |

| 127.0.0.1 | 23391 | 1         | 3      |

+-----------+-------+-----------+--------+

3 rows in set (0.00 sec)


proxysql-admin> SELECT * FROM hostgroups;

+--------------+-----------+-------+

| hostgroup_id | hostname  | port  |

+--------------+-----------+-------+

| 1            | 127.0.0.1 | 23389 |

| 0            | 127.0.0.1 | 23389 |

| 1            | 127.0.0.1 | 23390 |

| 1            | 127.0.0.1 | 23391 |

+--------------+-----------+-------+

4 rows in set (0.00 sec)


Table server_status describe 4 statuses: without going into details (we will discuss these another time), note that ProxySQL considers active all hosts that are ONLINE .


What hosts are online ?


proxysql-admin> SELECT s.*, ss.status_desc FROM servers s NATURAL JOIN server_status ss WHERE status_desc='ONLINE';

+-----------+-------+-----------+--------+-------------+

| hostname  | port  | read_only | status | status_desc |

+-----------+-------+-----------+--------+-------------+

| 127.0.0.1 | 23389 | 0         | 3      | ONLINE      |

| 127.0.0.1 | 23390 | 1         | 3      | ONLINE      |

| 127.0.0.1 | 23391 | 1         | 3      | ONLINE      |

+-----------+-------+-----------+--------+-------------+

3 rows in set (0.00 sec)



To which hostgroups they belong?



proxysql-admin> SELECT h.hostgroup_id hs_id , s.hostname , s.port , s.read_only ro, ss.status_desc desc FROM servers s NATURAL JOIN server_status ss NATURAL JOIN hostgroups h WHERE status_desc='ONLINE' ORDER BY hostgroup_id, hostname, port;

+-------+-----------+-------+------+--------+

| hs_id | hostname  | port  | ro   | desc   |

+-------+-----------+-------+------+--------+

| 0     | 127.0.0.1 | 23389 | 0    | ONLINE |

| 1     | 127.0.0.1 | 23389 | 0    | ONLINE |

| 1     | 127.0.0.1 | 23390 | 1    | ONLINE |

| 1     | 127.0.0.1 | 23391 | 1    | ONLINE |

+-------+-----------+-------+------+--------+

4 rows in set (0.01 sec)







How these tables are populated ?

a) consider table server_status as a read-only table (it is not, but please don’t edit. Future versions of ProxySQL will prevent changes in this table);

b) table servers is populated with the hosts specified in proxysql.cnf , variable mysql_servers ;

c) table hostgroups is populated at startup with the follow criteria : all the servers that are alive (user mysql_usage_user is used to do that check) go to hostgroup 1, and the ones with read_only=0 go also to hostgroup 0.


Very important notes:

1) these tables are populated from the content of proxysql.cnf only if they are empty. That means that once these tables are initialized, adding or removing servers from proxysql.cnf won’t have any effect unless you delete the content of such tables. That is the reason why in previous tutorial we delete these files before updating proxysql.cnf ;

2) the split on hostgroup 0 and hostgroup 1 in automatically done only if the tables are empty: adding new servers won’t automatically assign them to any hostgroup .



The next table you should look into is query_rules :


proxysql-admin> SHOW CREATE TABLE query_rules\G

*************************** 1. row ***************************

      table: query_rules

Create Table: CREATE TABLE query_rules (rule_id INT NOT NULL PRIMARY KEY, active INT NOT NULL DEFAULT 0, username VARCHAR, schemaname VARCHAR, flagIN INT NOT NULL DEFAULT 0, match_pattern VARCHAR NOT NULL, negate_match_pattern INT NOT NULL DEFAULT 0, flagOUT INT NOT NULL DEFAULT 0, replace_pattern VARCHAR, destination_hostgroup INT NOT NULL DEFAULT 0, audit_log INT NOT NULL DEFAULT 0, performance_log INT NOT NULL DEFAULT 0, cache_tag INT NOT NULL DEFAULT 0, invalidate_cache_tag INT NOT NULL DEFAULT 0, invalidate_cache_pattern VARCHAR, cache_ttl INT NOT NULL DEFAULT 0)

1 row in set (0.00 sec)


Currently the table has 16 columns, but right now only 11 columns are in use.

The unused columns will support features that will be available in near future releases of ProxySQL.

Let’s have a look at the columns currently used and their content:


proxysql-admin> SELECT rule_id, active, username user, schemaname schema, flagIN, match_pattern, negate_match_pattern neg, flagOUT, replace_pattern replace, destination_hostgroup hs_id, cache_ttl ttl FROM query_rules ORDER BY rule_id ;

+---------+--------+------+--------+--------+---------------------+------+---------+---------+-------+------+

| rule_id | active | user | schema | flagIN | match_pattern       | neg  | flagOUT | replace | hs_id | ttl  |

+---------+--------+------+--------+--------+---------------------+------+---------+---------+-------+------+

| 10      | 1      | NULL | NULL   | 0      | ^SELECT             | 1    | 0       | NULL    | 0     | -1   |

| 20      | 1      | NULL | NULL   | 0      | \s+FOR\s+UPDATE\s*$ | 0    | 0       | NULL    | 0     | -1   |

| 10000   | 1      | NULL | NULL   | 0      | .*                  | 0    | 0       | NULL    | 1     | 30   |

+---------+--------+------+--------+--------+---------------------+------+---------+---------+-------+------+

3 rows in set (0.00 sec)



If you are familiar with iptables you know that it features chains, each chain has rules, rules are defined for packets, and each rule has an optional target.

query_rules in ProxySQL works very similar to this, with some variants.

Each row in query_rules represents a rule.


A rule can be active (active=1) and therefore processed, or inactive (active=0) and thus ignored.

Rules are ordered and processed in ascending order ( ORDER BY rule_id ).

Rules are applied to query statements (COM_QUERY).

Rules are grouped in chains represented by flagIN. If a query is processed in a chain, rules from other chains are ignored. By default, all queries are processed by the chain with flagIN=0 .

Rules match queries using:

- exact match on username or NULL ( NULL matches any username )

- exact match on schema name or NULL ( NULL matches any schema name )

- regex using match_pattern inverted by negate_match_pattern

- flagIN (if set by a previous rule).

If a query is matched again a rule, a series of action can be performed:

- if flagOUT is not 0, the query will be flagged with the value of it and from now on will be examined only against rules with the same flagIN;

- if replace_pattern is not null, the query will be rewritten using the replace pattern;

- destination_groups specifies to which hostgroup the query will be sent;

- cache_ttl specifies the caching timeout .

Depending from the value of cache_ttl, further action is performed:

a) cache_ttl = -1 : the query won’t be cached, and will be immediately forwarded to the backend;

b) cache_ttl = 0 : this has the special meaning of not setting any timeout, and the queries will be processed against the next rule(s);

c) cache_ttl > 0 : the query will be cached, and no more query rules will be processed.

In other words, through cache_ttl you specify when no more query rules need to be processed. Therefore it is very important in which order the query rules are processed.


The example above means:

rule10) any query that doesn’t start with “SELECT” needs to be send to hostgroup 0 , without caching;

rule20) any query that doesn’t end with “FOR UPDATE” needs to be send to hostgroup 0 , without caching; (note: since rule10 has filtered out any query not starting with “SELECT”, rule20 processes only queries starting with “SELECT”);

rule1000) any query (match_pattern= ‘.*’ is a catch all, and it is recommended to have it as last rule) that reaches rule1000 is cached for 30 seconds and sent to hostgroup 1.


The above is a very simple of read/write split together with query caching.


Defining queries patterns, their destination_hostgroup and their cache_ttl , it is possible to specify which queries needs to be cached and and/or which queries need to be sent to slaves.


For example, adding the follow rule you can specify that if the current schema is maindb , all the SELECT statements against the users table won’t be cached but read from slave:


INSERT INTO query_rules (rule_id, active, username, schemaname, match_pattern, destination_hostgroup,cache_ttl) VALUES (30, 1, NULL, ’maindb’, ’^SELECT.* FROM users .*’, 1, -1);



Applying query rules


When the query_rules table is modified the rules in it are not applied immediately. ProxySQL reload the query_rules tables after issuing the FLUSH QUERY RULES command in the admin interface:


proxysql-admin> flush query rules;

Query OK, 3 rows affected (0.00 sec)



Statuses of query rules


The admin interface allows also to get some status about the query rules.

Running the command DUMP RUNTIME QUERY RULES , ProxySQL will populate a table named runtime_query_rules with the current rules loaded in memory, and how many queries these rules have processed (hits columns). Below an example:



proxysql-admin> dump runtime query rules;

Query OK, 3 rows affected (0.01 sec)


proxysql-admin> SELECT * FROM runtime_query_rules\G

*************************** 1. row ***************************

                rule_id: 10

                   hits: 40006

               username: NULL

             schemaname: NULL

                 flagIN: 0

          match_pattern: ^SELECT

    negate_match_pattern: 1

                flagOUT: 0

        replace_pattern: NULL

  destination_hostgroup: 0

              audit_log: 0

        performance_log: 0

              cache_tag: 0

    invalidate_cache_tag: 0

invalidate_cache_pattern: NULL

              cache_ttl: -1

*************************** 2. row ***************************

                rule_id: 20

                   hits: 0

               username: NULL

             schemaname: NULL

                 flagIN: 0

          match_pattern: \s+FOR\s+UPDATE\s*$

    negate_match_pattern: 0

                flagOUT: 0

        replace_pattern: NULL

  destination_hostgroup: 0

              audit_log: 0

        performance_log: 0

              cache_tag: 0

    invalidate_cache_tag: 0

invalidate_cache_pattern: NULL

              cache_ttl: -1

*************************** 3. row ***************************

                rule_id: 10000

                   hits: 21879

               username: NULL

             schemaname: NULL

                 flagIN: 0

          match_pattern: .*

    negate_match_pattern: 0

                flagOUT: 0

        replace_pattern: NULL

  destination_hostgroup: 1

              audit_log: 0

        performance_log: 0

              cache_tag: 0

    invalidate_cache_tag: 0

invalidate_cache_pattern: NULL

              cache_ttl: 30

3 rows in set (0.01 sec)



Note that the table runtime_query_rules is not automatically updated. You need to run the command DUMP RUNTIME QUERY RULES to refresh its content.

Future versions of ProxySQL won’t need to run the command DUMP RUNTIME QUERY RULES, but currently this is the only options.



Statuses of query cache.


ProxySQL also exports statuses related to the query cache.

To get statistics about the query cache you need to run the command DUMP RUNTIME QUERY CACHE (also this command will become obsolete in future versions of ProxySQL) and query the table runtime_query_cache . Example below:


proxysql-admin> DUMP RUNTIME QUERY CACHE;

Query OK, 1 row affected (0.00 sec)


proxysql-admin> SELECT size_keys+size_values+size_metas `Size`, * FROM runtime_query_cache\G

*************************** 1. row ***************************

          Size: 33494365

current_entries: 12842

     size_keys: 410944

    size_values: 32621109

    size_metas: 462312

     count_SET: 21879

 count_SET_ERR: 0

     count_GET: 341913

  count_GET_OK: 258149

  count_purged: 9037

        dataIN: 56608772

       dataOUT: 308666606

1 row in set (0.00 sec)




From these values it is easy to understand how efficient is the query cache:

count_GET_OK*100/count_GET gives you the hit ratio (75.5%)

count_SET/count_GET gives you the set/get ratio

dataIN/dataOUT gives you information about how much data was written and read (in bytes) from the query cache.


size_keys+size_values+size_metas returns the total amount of memory used by the query cache.

Make sure you have configured mysql_query_cache_size in config file : the default query cache size is very small, only 1MB.

Right now is not possible to dynamically change mysql_query_cache_size , but future versions of ProxySQL will have this feature.

Reply all
Reply to author
Forward
0 new messages