Haproxy config help

Isreal Varela

Jul 5, 2021, 11:02:11 AM
to okd-wg

Hello, I am new to OKD and decided to jump in at the deep end. I am trying to spin up a cluster, but for some reason my nodes never show as healthy in HAProxy. I can connect to each node directly without issue, but going through HAProxy fails with an L4 timeout. Link to the cfg I am using: https://github.com/cragr/okd4_files/blob/master/haproxy.cfg

I have turned the firewalls off and run tcpdump, which shows traffic going from the proxy node to a backend node.

From the HAProxy system I can connect to the backend nodes, but I cannot connect to them through the proxy because they never show as healthy. I am not sure what I am missing, and any help would be great.
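
For reference, when no other check option is configured, a `check` on a `server` line makes HAProxy open a plain TCP connection to the backend, and an L4 timeout means that connection did not complete in time. A minimal Python sketch of the same kind of probe, run from the proxy host, can show what a layer 4 check would see. The function name `l4_check` and the 2-second default timeout are illustrative assumptions and are not taken from the linked cfg.

```python
import socket


def l4_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port completes within timeout.

    This only mimics a layer 4 (connect-only) health check; it sends no data.
    """
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `l4_check("192.168.100.200", 6443)` would probe the bootstrap node's API port mentioned later in this thread. A `False` result from the proxy host, when curl from the same host succeeds, would suggest the problem lies in how the check is issued rather than in the network path.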

Thanks,
Isreal

Brian Innes

Jul 5, 2021, 11:51:58 AM
to okd-wg
Hello Isreal, welcome to the OKD community.  

Can you give a little background on your setup so we can understand it and help resolve the issue?  Are you running on a public cloud infrastructure, on a local VM infrastructure (VMware / oVirt) or bare metal?

How did you do the OKD install and setup?  Are you following a set of instructions, if so can you provide the link?

Depending on your requirements, you can run an OKD cluster without an external HAProxy, so a good first step may be to get the cluster up and running on the local network and then add HAProxy if required.

It is difficult to provide help without a little more information.

Isreal Varela

Jul 5, 2021, 2:04:29 PM
to okd-wg
and updated to include the following versions:
openshift-install version
openshift-install 4.7.0-0.okd-2021-05-22-050008

Fedora CoreOS 33.20210426.3.0

I have tried a few other versions but I keep running into the same issue. I moved the DNS record for api-int.okd.xxx.com away from HAProxy and pointed it directly at the bootstrap node to bring up the first control plane node.

I can connect directly to the bootstrap node on :6443 and :22623 but as soon as I move api-int or api back to the haproxy system, I get:
[core@okd4-bootstrap ~]$ curl -ILk https://api.okd.xxx.com:6443/healthz
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to api.okd.xxx.com:6443

[core@okd4-bootstrap ~]$ dig +short api-int.okd.xxx.com
192.168.100.210 <- Haproxy Node

I am guessing that is because the nodes never show green in HAProxy.

Pointing directly at the bootstrap node:
[core@okd4-bootstrap ~]$ curl -k https://192.168.100.200:22623/healthz -IL
HTTP/2 200  
content-length: 0
date: Mon, 05 Jul 2021 16:39:57 GMT

[core@okd4-bootstrap ~]$ curl -k https://192.168.100.200:6443/healthz
ok

When I move the DNS of api-int to the bootstrap node:
[core@okd4-bootstrap ~]$ dig +short api-int.okd.xxx.com
192.168.100.200

[core@okd4-bootstrap ~]$ curl https://api-int.okd.xxx.com:22623/healthz -ILk
HTTP/2 200  
content-length: 0
date: Mon, 05 Jul 2021 16:57:06 GMT


Haproxy is showing the nodes are actively dropping the connection:
Jul  5 12:46:52 okd4-services haproxy[2267]: 192.168.100.201:50480 [05/Jul/2021:12:46:52.134] okd4_k8s_api_fe okd4_k8s_api_be/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0
I just can't figure out why. All firewalls are off, the HAProxy system can curl the bootstrap node without issue, and SELinux is off. I am not sure what else to try.
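
The log line above follows HAProxy's `option tcplog` layout, and splitting it into fields can narrow things down. The sketch below is a rough Python parser for that layout; the regular expression and the name `parse_tcplog` are illustrative assumptions, not an HAProxy tool. In this line the server field is `<NOSRV>`, meaning HAProxy had no usable server to pick (consistent with every backend failing its health check), and the connect time Tc is -1, meaning no connection to a backend was established.

```python
import re

# Field order per HAProxy's documented TCP log format:
# client [accept_date] frontend backend/server Tw/Tc/Tt bytes_read
# termination_state actconn/feconn/beconn/srv_conn/retries srv_queue/backend_queue
TCPLOG_RE = re.compile(
    r"(?P<client>\S+) \[(?P<accept_date>[^\]]+)\] "
    r"(?P<frontend>\S+) (?P<backend>[^/\s]+)/(?P<server>\S+) "
    r"(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tt>\+?\d+) "
    r"(?P<bytes_read>\+?\d+) (?P<term_state>\S{2}) "
    r"(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/"
    r"(?P<srv_conn>\d+)/(?P<retries>\+?\d+) "
    r"(?P<srv_queue>\d+)/(?P<backend_queue>\d+)"
)


def parse_tcplog(line):
    """Return the tcplog fields of a log line as a dict, or None if no match."""
    match = TCPLOG_RE.search(line)
    return match.groupdict() if match else None


line = (
    "Jul  5 12:46:52 okd4-services haproxy[2267]: 192.168.100.201:50480 "
    "[05/Jul/2021:12:46:52.134] okd4_k8s_api_fe okd4_k8s_api_be/<NOSRV> "
    "-1/-1/0 0 SC 1/1/0/0/0 0/0"
)
fields = parse_tcplog(line)

# 'S' = aborted or refused on the server side,
# 'C' = the proxy was still waiting for the server connection to establish.
print(fields["server"], fields["tc"], fields["term_state"])
```

Because the failure is recorded before any server is chosen, the stats page (which server is marked DOWN and with which check status) is likely a more useful next place to look than the client-facing logs.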

I am running in a Proxmox environment with UPI. The goal is to have a three-node control plane and two workers behind HAProxy. I can try to stand up a single-node control plane and see if that goes better.

Thanks for any help you might be able to provide.

John Fortin

Jul 5, 2021, 2:27:31 PM
to Isreal Varela, okd-wg
Have you turned off the bootstrap node and removed it from HAProxy?  If not, that can cause issues.



Isreal Varela

Jul 7, 2021, 2:43:47 PM
to okd-wg
Just to close the loop on this, I was able to replace HAProxy with Nginx and my stack came up without issue. I am confused by this, but I have never really been able to get HAProxy to work in other home lab setups in the past.

Thanks everyone

Eduardo Lúcio Amorim Costa

Jul 7, 2021, 4:59:48 PM
to okd-wg
Hi "isr...@gmail.com"! =D

My knowledge of OpenShift (OKD) isn't deep - I actually still have a few things I need help with from our community here - but I've had a perfect experience with the setup below for HAProxy...

```
#---------------------------------------------------------------------
# Global settings.
#---------------------------------------------------------------------
global
   maxconn 20000
   log /dev/log local0 info
   chroot /var/lib/haproxy
   pidfile /var/run/haproxy.pid
   user haproxy
   group haproxy
   daemon

   # Turn on stats unix socket.
   stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# Common defaults that all the "listen" and "backend" sections will use if not designated
# in their block.
#---------------------------------------------------------------------
defaults
   log global
   maxconn 20000
   mode http
   option dontlognull
   option http-server-close
   option httplog
   option redispatch
   retries 3
   timeout check 10s
   timeout client 300s
   timeout connect 10s
   timeout http-keep-alive 10s
   timeout http-request 10s
   timeout queue 1m
   timeout server 300s

listen stats
   bind :9000
   mode http
   option forwardfor except 127.0.0.0/8
   stats enable
   stats uri /

frontend okd4_k8s_api_fe
   bind :6443
   default_backend okd4_k8s_api_be
   mode tcp
   option tcplog

backend okd4_k8s_api_be
   balance roundrobin
   mode tcp
   # server okd4-bootstrap 10.3.0.4:6443 check
   server okd4-control-plane-1 10.3.0.5:6443 check
   server okd4-control-plane-2 10.3.0.6:6443 check
   server okd4-control-plane-3 10.3.0.7:6443 check

frontend okd4_machine_config_server_fe
   bind :22623
   default_backend okd4_machine_config_server_be
   mode tcp
   option tcplog

backend okd4_machine_config_server_be
   balance roundrobin
   mode tcp
   # server okd4-bootstrap 10.3.0.4:22623 check
   server okd4-control-plane-1 10.3.0.5:22623 check
   server okd4-control-plane-2 10.3.0.6:22623 check
   server okd4-control-plane-3 10.3.0.7:22623 check

frontend okd4_http_ingress_traffic_fe
   bind *:80
   default_backend okd4_http_ingress_traffic_be
   mode tcp
   option tcplog

backend okd4_http_ingress_traffic_be
   balance roundrobin
   mode tcp
   server okd4-compute-1 10.3.0.8:80 check
   server okd4-compute-2 10.3.0.9:80 check

frontend okd4_https_ingress_traffic_fe
   bind *:443
   default_backend okd4_https_ingress_traffic_be
   mode tcp
   option tcplog

backend okd4_https_ingress_traffic_be
   balance roundrobin
   mode tcp
   server okd4-compute-1 10.3.0.8:443 check
   server okd4-compute-2 10.3.0.9:443 check
```

So I suggest you compare it with yours and see whether anything differs. Note in particular the parameter `balance roundrobin` (not `source`). This helped me avoid some problems.

Another, more general suggestion is to check that your servers have enough hardware resources, as shown in this table (a shortage is often a "hidden" source of a lot of problems):
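
To illustrate the difference between the two `balance` modes mentioned above: `roundrobin` hands each new connection to the next server in turn, while `source` hashes the client address so a given client keeps landing on the same server. The Python sketch below only mimics that behaviour. The CRC32 hash and the function names are assumptions for illustration and are not HAProxy's actual algorithm.

```python
import itertools
import zlib

SERVERS = ["okd4-control-plane-1", "okd4-control-plane-2", "okd4-control-plane-3"]


def roundrobin_picker(servers):
    """Return a picker that hands out servers in turn, ignoring the client."""
    cycle = itertools.cycle(servers)
    return lambda client_ip: next(cycle)


def source_picker(servers):
    """Return a picker that maps a client IP to one fixed server via a hash."""
    return lambda client_ip: servers[zlib.crc32(client_ip.encode()) % len(servers)]


rr = roundrobin_picker(SERVERS)
src = source_picker(SERVERS)
client = "192.168.100.201"

print([rr(client) for _ in range(3)])   # a different server on each call
print([src(client) for _ in range(3)])  # the same server on every call
```

With source hashing, one client can stay pinned to a single backend, so that client's traffic is never spread across the other servers. Round robin avoids that pinning.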

```
NAME            ROLE            CPU  RAM (GB)
OKD_BOOTSTRAP   bootstrap       4[V] 8~16
OKD_MASTER_1    master          4[V] 8~16
OKD_MASTER_2    master          4[V] 8~16
OKD_MASTER_3    master          4[V] 8~16
OKD_WORKER_1    worker          4[V] 12~16
OKD_WORKER_2    worker          4[V] 12~16
OKD_SERVICES    DNS/LB/web/NFS  4[V] 4

 _ [V] Nested virtualization enabled (if your nodes run on a Hypervisor).
```

Hope I was helpful! Success! =D