HAProxy and WSREP has not yet prepared node for application use


Eric Howey

Oct 30, 2021, 2:28:17 PM
to codership
I'm using Galera with HAProxy in an attempt to keep an ancient PHP application as available as possible. I've run into a scenario where, if WSREP is not ready on a node, the PHP application and HAProxy do not try a different node; the request just fails and the user sees errors. Ideally, if the node isn't fully ready, I'd like HAProxy to direct queries to a different node that is ready. For some reason I just can't get it to work right.

I can see two ways to fix this, but I'm not experienced enough with Galera to work out either:
  1. If Galera is not fully ready, MariaDB should not accept any connections.
  2. If Galera is not fully ready, HAProxy should know it and mark the server as down.

Here is my MariaDB config:
https://gist.github.com/erichowey/b88795e19a68d174dddf55383e488919

Here is my HAProxy config:
https://gist.github.com/erichowey/6a9fe7022cb6ad6d71208454ca57e136

And here is what error the PHP app is throwing:
https://gist.github.com/erichowey/b4c31f89dd7832e5e7a8c55b1f67497d

Thank You!


Gabor Orosz

Oct 30, 2021, 3:04:44 PM
to codership
Hi,

The first one can be achieved with a notification script: https://galeracluster.com/library/documentation/notification-cmd.html
For the second one, you have to set up a simple TCP or HTTP service on each backend, which will provide health information about the given MariaDB instance to HAProxy:

- http://cbonte.github.io/haproxy-dconv/2.5/configuration.html#option%20httpchk
- http://cbonte.github.io/haproxy-dconv/2.5/configuration.html#4.2-option%20tcp-check

I think you can get some inspiration from Percona's old approach by searching for galeracheck, but please do not use xinetd...

Best regards,
GOro
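
A minimal sketch of the second approach (an HTTP health service that HAProxy can poll) might look like the following. This is not the galeracheck script itself, only an illustration. It assumes the `mysql` command-line client can log in locally without prompting (for example through an option file); the port 9200 and the names `fetch_local_state`, `health_response`, and `serve` are hypothetical.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

SYNCED = "4"  # wsrep_local_state value of a Synced node


def fetch_local_state():
    """Read wsrep_local_state from the local server via the mysql CLI.

    Assumption: credentials come from an option file such as ~/.my.cnf.
    """
    out = subprocess.run(
        ["mysql", "-N", "-B", "-e", "SHOW STATUS LIKE 'wsrep_local_state'"],
        capture_output=True, text=True, timeout=5, check=True,
    ).stdout
    # Expected output is one tab-separated line: "wsrep_local_state\t4"
    return out.split()[-1]


def health_response(state):
    """Map a wsrep_local_state value to an (HTTP status, body) pair."""
    if state == SYNCED:
        return 200, b"synced\n"
    return 503, b"not synced\n"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            state = fetch_local_state()
        except Exception:
            state = None  # any failure to query counts as "down"
        code, body = health_response(state)
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_OPTIONS = do_GET  # httpchk may probe with OPTIONS instead of GET

    def log_message(self, format, *args):
        pass  # keep periodic health probes out of the log


def serve(port=9200):
    """Run the health endpoint until interrupted (hypothetical port)."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` on each database node would give HAProxy an endpoint that answers 200 only while the node is Synced and 503 otherwise, which is the signal an HTTP health check needs to mark a backend up or down.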

Shawn M.

Oct 30, 2021, 4:55:58 PM
to codership
Maybe just use ProxySQL instead of HAProxy? It's better suited for this job as an SQL load balancer and doesn't require any extra configuration or external scripts to handle down Galera nodes quite gracefully, plus you get other benefits, such as the time-based query cache to help reduce your loads in some cases.

Regards,
Shawn



Jaikiran Pai

Oct 31, 2021, 3:51:34 AM
to Gabor Orosz, codership
Hello Gabor,

On 31/10/21 12:34 am, Gabor Orosz wrote:
> Hi,
>
> The first one can be achieved with a notification script:
> https://galeracluster.com/library/documentation/notification-cmd.html
> For the second one, you have to setup a simple tcp or an http service on
> each backend, which will provide health information about the given MariaDB
> instance to HAproxy:
>
> - http://cbonte.github.io/haproxy-dconv/2.5/configuration.html#option%20httpchk
> - http://cbonte.github.io/haproxy-dconv/2.5/configuration.html#4.2-option%20tcp-check
>
> I think you can get some inspiration from the old Percona's approach by
> searching for galeracheck, but please do not use xinetd...

I'm curious - why not xinetd? I don't have extensive experience with
Galera but from the examples I had seen for these health checks we ended
up using xinetd in one of our products. I'm now wondering if that's not
a good thing.

-Jaikiran


Eric Howey

Oct 31, 2021, 3:23:27 PM
to codership
Hi All,

Thank you for the answers. This is extremely helpful! I'm looking through the galeracheck code for inspiration. It seems that it checks if wsrep_local_state is set to 4. If it is, the node is considered up. If it's not, the node is considered down. Looking at this document: https://galeracluster.com/library/documentation/node-states.html#node-state-changes

It states that if wsrep_local_state is set to 4, then wsrep_ready is set to 1. I have 3 servers and their sole purpose is to ensure high availability of my databases. Given my use case and what I'm trying to accomplish, I think it would be best for me to simply check whether wsrep_ready is set to 1 and, if so, assume the node is alive and ready to process transactions. Does that sound correct? I believe I can accomplish this with a tcp-check in HAProxy.
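
To make the comparison concrete, here is a small sketch of a check that looks at both variables rather than only one. It assumes the input is the tab-separated text printed by `mysql -N -B -e "SHOW STATUS LIKE 'wsrep_%'"`; the function names are hypothetical. Because the server usually reports wsrep_ready as ON/OFF while the documentation quoted above speaks of 1, the sketch accepts either spelling.

```python
def parse_status(rows_text):
    """Parse tab-separated SHOW STATUS output into a {name: value} dict."""
    status = {}
    for line in rows_text.splitlines():
        name, _, value = line.partition("\t")
        if name.strip():
            status[name.strip()] = value.strip()
    return status


def node_is_ready(status):
    """Return True only if the node reports itself both ready and Synced."""
    # Accept "ON" (as shown by SHOW STATUS) and "1" (as worded in the docs).
    ready = status.get("wsrep_ready", "").upper() in ("ON", "1")
    synced = status.get("wsrep_local_state") == "4"
    return ready and synced
```

Requiring both conditions is the stricter of the two options discussed here: a node that is missing either variable, or that reports any state other than 4, is treated as down.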

Eric Howey

Nov 1, 2021, 6:04:39 PM
to codership
I ended up writing a PHP script to check if wsrep_local_state is set to 4 and using HAProxy's HTTP health check (option httpchk). Thank you all!!

Gabor Orosz

Nov 3, 2021, 8:39:40 AM
to codership
Hi,

systemd implements all the functionality that the xinetd-based solution depends on, so socket activation would be enough.

Br,
GOro
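
For readers unfamiliar with socket activation: systemd opens the listening socket itself and hands it to the service as an inherited file descriptor, announced through the LISTEN_PID and LISTEN_FDS environment variables, with descriptors starting at 3. A minimal sketch of the receiving side, with hypothetical function names and no particular unit file assumed, could be:

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first descriptor passed under systemd's protocol


def inherited_fds(environ=os.environ, pid=None):
    """Return the descriptors passed in by systemd, or [] if there are none."""
    pid = os.getpid() if pid is None else pid
    if environ.get("LISTEN_PID") != str(pid):
        return []  # variables absent or intended for another process
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def activation_socket():
    """Wrap the first inherited descriptor, or return None if not activated."""
    fds = inherited_fds()
    return socket.socket(fileno=fds[0]) if fds else None
```

A health-check service written this way can fall back to binding its own port when `activation_socket()` returns None, so the same script works both under systemd and when started by hand.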