Vault on Kubernetes - TLS Handshake Errors

Dillon Courts

Feb 7, 2019, 1:25:52 PM
to Vault
I am running a Vault cluster (3 instances, v1.0.2) on Kubernetes behind a Kubernetes Service. Each Vault instance has TLS configured with a wildcard cert for *.domain.com. I am using the MySQL storage backend on a 3-node k8s cluster, and one Vault pod runs on each node as part of a Deployment.

Vault is working just fine, but all my pods are logging continuous TLS handshake errors:

2019-02-07T18:09:31.516Z [INFO]  http: TLS handshake error from 10.240.0.4:49697: EOF
2019-02-07T18:09:31.614Z [INFO]  http: TLS handshake error from 168.63.129.16:53269: EOF
2019-02-07T18:09:36.518Z [INFO]  http: TLS handshake error from 10.240.0.4:49767: EOF
2019-02-07T18:09:44.205Z [INFO]  http: TLS handshake error from 10.240.0.5:55704: EOF
2019-02-07T18:09:46.621Z [INFO]  http: TLS handshake error from 168.63.129.16:53455: EOF
2019-02-07T18:09:51.522Z [INFO]  http: TLS handshake error from 10.240.0.4:50834: EOF
2019-02-07T18:10:04.211Z [INFO]  http: TLS handshake error from 10.240.0.5:56017: EOF
2019-02-07T18:10:06.528Z [INFO]  http: TLS handshake error from 10.240.0.4:51127: EOF
2019-02-07T18:10:16.532Z [INFO]  http: TLS handshake error from 10.240.0.4:51276: EOF
2019-02-07T18:10:19.216Z [INFO]  http: TLS handshake error from 10.240.0.5:56209: EOF
2019-02-07T18:10:26.632Z [INFO]  http: TLS handshake error from 168.63.129.16:54049: EOF
2019-02-07T18:10:29.218Z [INFO]  http: TLS handshake error from 10.240.0.5:56408: EOF
2019-02-07T18:10:31.539Z [INFO]  http: TLS handshake error from 10.240.0.4:51524: EOF
2019-02-07T18:10:34.222Z [INFO]  http: TLS handshake error from 10.240.0.5:56539: EOF
2019-02-07T18:10:36.540Z [INFO]  http: TLS handshake error from 10.240.0.4:51593: EOF
2019-02-07T18:10:41.543Z [INFO]  http: TLS handshake error from 10.240.0.4:59672: EOF
2019-02-07T18:10:44.227Z [INFO]  http: TLS handshake error from 10.240.0.5:56756: EOF
2019-02-07T18:10:46.545Z [INFO]  http: TLS handshake error from 10.240.0.4:59817: EOF
2019-02-07T18:10:46.640Z [INFO]  http: TLS handshake error from 168.63.129.16:54297: EOF
2019-02-07T18:10:56.547Z [INFO]  http: TLS handshake error from 10.240.0.4:60025: EOF
2019-02-07T18:10:56.643Z [INFO]  http: TLS handshake error from 168.63.129.16:54490: EOF
2019-02-07T18:11:01.644Z [INFO]  http: TLS handshake error from 168.63.129.16:54548: EOF
2019-02-07T18:11:06.552Z [INFO]  http: TLS handshake error from 10.240.0.4:60147: EOF
2019-02-07T18:11:06.645Z [INFO]  http: TLS handshake error from 168.63.129.16:54584: EOF
2019-02-07T18:11:09.237Z [INFO]  http: TLS handshake error from 10.240.0.5:57124: EOF
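
To see which sources account for the errors, the log lines can be tallied per source IP. The following is only a minimal sketch, assuming the log format shown above; the names `LINE_RE` and `count_by_source` are made up for illustration and are not part of Vault.

```python
import re
from collections import Counter

# Matches lines such as:
# 2019-02-07T18:09:31.516Z [INFO]  http: TLS handshake error from 10.240.0.4:49697: EOF
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+\[INFO\]\s+http: TLS handshake error from "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+): (?P<err>.+)$"
)


def count_by_source(lines):
    """Return a Counter mapping source IP to number of handshake errors."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("ip")] += 1
    return counts


# Three lines copied from the log above.
sample = [
    "2019-02-07T18:09:31.516Z [INFO]  http: TLS handshake error from 10.240.0.4:49697: EOF",
    "2019-02-07T18:09:31.614Z [INFO]  http: TLS handshake error from 168.63.129.16:53269: EOF",
    "2019-02-07T18:09:36.518Z [INFO]  http: TLS handshake error from 10.240.0.4:49767: EOF",
]

for ip, n in count_by_source(sample).most_common():
    print(ip, n)
```

Feeding it the full pod log (for example, the saved output of `kubectl logs`) would show whether the errors come only from the three addresses seen here.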

I know the two 10.240.0.x IPs are the other nodes in the cluster. I assume 168.63.129.16 is a master node, but I'm not sure (I'm running on Azure AKS). Note that I have turned off the liveness and readiness probes, so I don't believe they are the cause. I have also tried adding "disable_clustering": true to my Vault config, with no luck.
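
An EOF during the handshake generally means the peer closed the connection before sending any TLS data, which is what a TCP-only check (such as a load-balancer health probe) would do. Whether that is what happens here is an assumption, not something the logs confirm. A minimal sketch for testing it, where `tcp_probe` is a hypothetical helper name:

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Open a TCP connection and close it without sending any TLS data.

    Returns True if the TCP connect succeeded, False otherwise.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            # Nothing is sent; leaving the block closes the socket,
            # so a TLS listener never receives a ClientHello.
            return True
    except OSError:
        return False
```

Calling `tcp_probe("<pod-ip>", 8200)` against a single Vault pod and watching that pod's log for a matching handshake error from the caller's address would indicate whether plain TCP connects produce these entries.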

Does anyone know how to resolve this? I realize this is probably more of a Kubernetes question than a Vault question, but I thought this would be a good place to start.

I am concerned because after about a day, 2 of my Vault pods become unresponsive, and I'm wondering whether that is related to the TLS handshake errors.

Vault config:

{
  "ui": true,
  "disable_clustering": true,
  "listener": {
    "tcp": {
      "address": "0.0.0.0:8200",
      "tls_disable": false,
      "tls_prefer_server_cipher_suites": true,
      "tls_cipher_suites": "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA",
      "tls_cert_file": "/vault/tls/tls.crt",
      "tls_key_file": "/vault/tls/tls.key",
      "tls_disable_client_certs": true
    }
  },
  "storage": {
    "mysql": {
      "address": "<address>",
      "username": "<username>",
      "password": "<password>",
      "ha_enabled": "true",
      "table": "vault",
      "lock_table": "vault_lock"
    }
  },
  "seal": {
    "azurekeyvault": {
      "tenant_id": "<tenant_id>",
      "client_id": "<client_id>",
      "client_secret": "<client_secret>",
      "vault_name": "<vault_name>",
      "key_name": "<vault_key>"
    }
  }
}


Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: vault
  labels:
    app: vault
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: vault
    spec:
      hostNetwork: true
      containers:
      - name: vault
        image: "vault:1.0.2"
        imagePullPolicy: IfNotPresent
        command: ["vault", "server", "-config", "/vault/config/config.json"]
        ports:
        - containerPort: 8200
          name: api
        - containerPort: 8201
          name: cluster-address
        # readinessProbe:
        #   # Ready depends on preference
        #   httpGet:
        #     path: /v1/sys/health?standbycode=204&uninitcode=204&
        #     port: 8200
        #     scheme: HTTPS
        securityContext:
          readOnlyRootFilesystem: true
          capabilities:
            add:
            - IPC_LOCK
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: VAULT_LOG_LEVEL
          value: "info"
        - name: VAULT_API_ADDR
        - name: VAULT_CLUSTER_ADDR
          value: "https://$(POD_IP):8201"
        volumeMounts:
        - name: vault-config
          mountPath: /vault/config/
        - name: vault-root
          mountPath: /root/
        - name: ssl
          mountPath: /vault/tls
        resources: {}
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: vault
      volumes:
      - name: vault-config
        configMap:
          name: "vault-config"
      - name: vault-root
        emptyDir: {}
      - name: ssl
        secret:
          secretName: ssl


Service:

apiVersion: v1
kind: Service
metadata:
  name: vault
  labels:
    app: vault
spec:
  type: LoadBalancer
  ports:
  - port: 443
    protocol: TCP
    targetPort: 8200
    name: api
  selector:
    app: vault


Thanks!