Proxysql with Aurora cluster endpoint


Sulav Regmi

Mar 1, 2024, 10:02:45 AM
to proxysql
Hello,

I'm seeing Aurora cluster read/write endpoints behave differently from RDS. Hostgroup 1 is the only writer hostgroup and the rest are reader hostgroups. According to the stats_mysql_errors table, the reader (xx-db-rr.xx.internal) is being used as a member of hostgroup 1. The exact same configuration with an RDS primary and a replica showed no issues like this. Please see the table contents below. Am I misunderstanding something?

mysql> SELECT hostgroup_id, hostname FROM mysql_servers;
+--------------+----------------------+
| hostgroup_id | hostname             |
+--------------+----------------------+
| 1            | xx-db-rw.xx.internal |
| 2            | xx-db-rr.xx.internal |
| 2            | xx-db-rw.xx.internal |
| 3            | xx-db-rr.xx.internal |
+--------------+----------------------+
4 rows in set (0.00 sec)


---------------------

mysql> SELECT * FROM mysql_replication_hostgroups;
+------------------+------------------+------------+---------+
| writer_hostgroup | reader_hostgroup | check_type | comment |
+------------------+------------------+------------+---------+
| 1                | 2                | read_only  |         |
+------------------+------------------+------------+---------+
1 row in set (0.00 sec)

---------------------

mysql> SELECT hostgroup, hostname, last_error FROM stats_mysql_errors\G
*************************** 1. row ***************************
hostgroup: 1
hostname: xx-db-rr.xx.internal
last_error: Cannot execute statement in a READ ONLY transaction.
*************************** 2. row ***************************
hostgroup: 1
hostname: xx-db-rr.xx.internal
last_error: The MySQL server is running with the --read-only option so it cannot execute this statement
2 rows in set (0.01 sec)
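
A possible way to narrow this down, sketched under the assumption that the read_only monitor is what moves servers between hostgroups 1 and 2: compare the runtime server list with what the monitor last recorded for each host. The table and column names are from the ProxySQL admin interface as I understand them; the LIMIT is arbitrary.

```sql
-- Where ProxySQL currently places each backend at runtime
-- (this can differ from mysql_servers after the monitor acts).
SELECT hostgroup_id, hostname, status
FROM runtime_mysql_servers
ORDER BY hostgroup_id, hostname;

-- What the read_only check last returned for each host.
SELECT hostname, port, time_start_us, read_only, error
FROM monitor.mysql_server_read_only_log
ORDER BY time_start_us DESC
LIMIT 10;
```

If the reader endpoint appears in hostgroup 1 in the first result, the second result should show which read_only value the monitor saw for it.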


Thank you.

Sulav Regmi

Mar 18, 2024, 12:22:59 PM
to proxysql

I think I know what the issue might be (I have yet to confirm it). The `check_type` in `mysql_replication_hostgroups` should be `innodb_read_only` instead of the default `read_only`, or, to stay compatible during an RDS -> Aurora migration, `read_only|innodb_read_only`, since `mysql-monitor_writer_is_also_reader` set to `true` keeps the node in both the reader and writer hostgroups.
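
A minimal sketch of that change through the admin interface, assuming the single replication hostgroup row shown earlier (writer_hostgroup = 1). This is untested against the setup in this thread.

```sql
-- Switch the check used to tell writers from readers.
-- 'read_only|innodb_read_only' treats a server as read-only
-- if either variable reports it as such.
UPDATE mysql_replication_hostgroups
SET check_type = 'read_only|innodb_read_only'
WHERE writer_hostgroup = 1;

-- Apply the change and persist it.
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;

-- Confirm what is active at runtime.
SELECT * FROM runtime_mysql_replication_hostgroups;
```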

Sulav Regmi

Mar 19, 2024, 3:39:13 PM
to proxysql
A new issue around this is that the mysql_replication_hostgroups data from the proxysql.cnf file doesn't get applied when the check_type is set to read_only|innodb_read_only.

mysql_replication_hostgroups=
({
  writer_hostgroup = 1
  reader_hostgroup = 2
  check_type       = "read_only|innodb_read_only"
  comment          = "Database"
})

The above results in,

mysql> SELECT * FROM mysql_replication_hostgroups;
+------------------+------------------+------------+---------+
| writer_hostgroup | reader_hostgroup | check_type | comment |
+------------------+------------------+------------+---------+
| 1                | 2                | read_only  |         |
+------------------+------------------+------------+---------+
1 row in set (0.00 sec)
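
One possible explanation, which is an assumption and not confirmed in this thread: ProxySQL reads proxysql.cnf only when its on-disk database does not exist yet (or when started with --initial or --reload), so an existing database can keep serving the older row. The empty comment column in the output above would be consistent with that. A sketch for checking and re-reading the file from the admin interface:

```sql
-- Compare the in-memory, on-disk, and runtime copies of the table.
SELECT * FROM mysql_replication_hostgroups;
SELECT * FROM disk.mysql_replication_hostgroups;
SELECT * FROM runtime_mysql_replication_hostgroups;

-- Re-read the server-related sections of proxysql.cnf into memory,
-- then apply and persist them.
LOAD MYSQL SERVERS FROM CONFIG;
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```

If the row still shows `read_only` after this, the cause is something else, for example the value being rejected while the file is parsed.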


René Cannaò

Mar 19, 2024, 3:41:38 PM
to Sulav Regmi, proxysql
Hi Sulav,

ProxySQL supports Aurora with dedicated monitoring. Please use that, instead of replication hostgroups.
Details here:

Thanks,
René

Sulav Regmi

Mar 19, 2024, 5:18:25 PM
to proxysql
Thanks René. I need to get this set up quickly, which is why I'm trying to do this the easier way for now. I'll eventually look into using the dedicated monitoring.

René Cannaò

Mar 19, 2024, 6:02:06 PM
to Sulav Regmi, proxysql
Hi Sulav,

I respectfully disagree here, sorry.
I think the "easy way" is the right way: you have been struggling with this for over 2 weeks because you are using the wrong set of monitoring tools.
Use the right set of tools, that is the easy way: use the native support for AWS Aurora.

Thanks,
René

Sulav Regmi

Apr 1, 2024, 11:32:29 PM
to proxysql
Hey René,

Thanks. I have now switched to using mysql_aws_aurora_hostgroups.


mysql> SELECT hostgroup_id, hostname, max_connections, max_replication_lag FROM mysql_servers;
+--------------+------------------------------------------------------------------+-----------------+---------------------+
| hostgroup_id | hostname                                                         | max_connections | max_replication_lag |
+--------------+------------------------------------------------------------------+-----------------+---------------------+
| 1            | xx-db-rw.xx.internal                                             | 200             | 5                   |
| 1            | xx-xx-rw.endpoint.us-east-1.rds.amazonaws.com                    | 1000            | 0                   |
| 2            | xx-db-rr.xx.internal                                             | 200             | 5                   |
| 2            | xx-db-rw.xx.internal                                             | 200             | 0                   |
| 2            | xx-xx-rr.endpoint.us-east-1.rds.amazonaws.com                    | 1000            | 0                   |
| 2            | xx-xx-rw.endpoint.us-east-1.rds.amazonaws.com                    | 1000            | 0                   |
| 3            | xx-db-rr.xx.internal                                             | 200             | 0                   |
| 5            | xx-xx-mirror.endpoint.us-east-1.rds.amazonaws.com                | 200             | 0                   |
+--------------+------------------------------------------------------------------+-----------------+---------------------+


The xx-db-rw.xx.internal and xx-db-rr.xx.internal endpoints point to the cluster rw and rr endpoints respectively, and it looks like ProxySQL discovers the instance endpoints and inserts them into the servers table.
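
For readers following along, a row in mysql_aws_aurora_hostgroups for this layout might look roughly like the sketch below. The domain_name value is a hypothetical placeholder (it is the suffix shared by the instance endpoints and starts with a dot), and the interval and timeout values are illustrative, not the ones used here.

```sql
-- Hypothetical Aurora hostgroup definition: writers in hostgroup 1,
-- readers in hostgroup 2, discovered instances matched by domain suffix.
INSERT INTO mysql_aws_aurora_hostgroups
    (writer_hostgroup, reader_hostgroup, active, aurora_port,
     domain_name, check_interval_ms, check_timeout_ms,
     writer_is_also_reader, comment)
VALUES
    (1, 2, 1, 3306,
     '.xxxx.us-east-1.rds.amazonaws.com', 1000, 1000,
     1, 'Aurora cluster');

-- Apply and persist.
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```

With writer_is_also_reader set to 1, the writer instance is also listed in the reader hostgroup, which matches the rw endpoint appearing under hostgroup 2 in the table above.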

The issue I'm currently seeing, which I didn't see with the RDS backend, is a high number of aborted clients on Aurora when running load tests. In general everything works fine, but under heavy load errors like these show up:

MySQL_Session.cpp:1690:handler_again___status_PINGING_SERVER(): [ERROR] Ping timeout during ping on xx-db-rw.xx.internal:3306 after 200092us (timeout 200ms)
MySQL_Monitor.cpp:5913:monitor_AWS_Aurora_thread_HG(): [ERROR] Timeout on AWS Aurora health check for xx-xx-rw.endpoint.us-east-1.rds.amazonaws.com:3306 after 1008ms. If the server is overload, increase mysql_aws_aurora_hostgroups.check_timeout_ms
MySQL_Monitor.cpp:6054:monitor_AWS_Aurora_thread_HG(): [ERROR] Error after 1000ms on server hs-db-rw.staging.internal:3306 : timeout check
mysql_connection.cpp:1178:handler(): [ERROR] Connect timeout on xx-db-rw.xx.internal:3306 : exceeded by 3077us
MySQL_Monitor.cpp:5862:monitor_AWS_Aurora_thread_HG(): [ERROR] Error on AWS Aurora check for xx-xx-rw.endpoint.us-east-1.rds.amazonaws.com:3306 after 1001ms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: timeout or error in creating new connection: Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 110
MySQL_Session.cpp:3101:handler_again___status_CHANGING_USER_SERVER(): [ERROR] Change user timeout during COM_CHANGE_USER on xx-db-rw.xx.internal , 3306

They happen for about 30 seconds and then resolve themselves without any intervention. I even increased connect_timeout_server to 2s and monitor_read_only_timeout to 1.5s. The database is fine during that time: I can connect to it and query it directly, but for some reason ProxySQL says it can't reach it. Aurora CPU reaches about 20% usage, which is nothing out of the ordinary for load tests. I'm talking to AWS about whether there's anything on the Aurora side. Any ideas, or anything I can look out for on the ProxySQL end? Thanks!
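
For reference, the log lines above name two of the settings involved: mysql_aws_aurora_hostgroups.check_timeout_ms and mysql-monitor_connect_timeout. My reading, which is an assumption, is that the 200ms ping timeout corresponds to mysql-ping_timeout_server. A sketch of raising them follows; the values are illustrative, and this treats the symptom, not whatever is slowing the connections under load.

```sql
-- Current Aurora monitor settings.
SELECT writer_hostgroup, check_interval_ms, check_timeout_ms
FROM mysql_aws_aurora_hostgroups;

-- Raise the Aurora health-check timeout (illustrative value, in ms).
UPDATE mysql_aws_aurora_hostgroups
SET check_timeout_ms = 2000
WHERE writer_hostgroup = 1;
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;

-- Raise the monitor connect timeout and the backend ping timeout
-- (illustrative values, in ms).
UPDATE global_variables SET variable_value = '2000'
WHERE variable_name = 'mysql-monitor_connect_timeout';
UPDATE global_variables SET variable_value = '1000'
WHERE variable_name = 'mysql-ping_timeout_server';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;

-- Recent Aurora health checks recorded by the monitor.
SELECT * FROM monitor.mysql_server_aws_aurora_log
ORDER BY time_start_us DESC
LIMIT 20;
```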
