Hi John
I am getting the following error:
Error from server (NotFound): secrets "k8ssandra-test-superuser" not found
I am quite new (and honestly terrible at K8s stuff, so I am just following the steps mentioned in the documents to the best of my knowledge). Just to clarify, I have done the following (sorry, long email):
- Created a project in GCP
- Made it the current project
- Enabled the Kubernetes Engine API in it
- Created a cluster (3 nodes, 20GB SSD, 2 CPU, 8GB RAM each; a rough gcloud equivalent is sketched after the cluster-info output below)
- Connected to the cluster
- Both Helm and kubectl were already available
- Checked that the cluster is running - kubectl cluster-info
output
Kubernetes master is running at https://...
GLBCDefaultBackend is running at https://.../api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
KubeDNS is running at https://.../api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://.../api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
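For reference, creating a cluster of that shape from the command line would look roughly like this (the cluster name is just a placeholder, and this assumes the gcloud CLI is already configured for the project; the zone matches the node labels shown later in the thread):
gcloud container clusters create k8ssandra-cluster --zone europe-west2-a --num-nodes 3 --machine-type e2-standard-2 --disk-type pd-ssd --disk-size 20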
Next, executed the following commands
- helm repo add k8ssandra https://helm.k8ssandra.io/
- helm repo add traefik https://helm.traefik.io/traefik
- helm repo update
- helm install k8ssandra-tools k8ssandra/k8ssandra
- helm install k8ssandra-cluster-a k8ssandra/k8ssandra-cluster
Executed kubectl get pods to check that the pods are running.
Next, I downloaded the traefik.values.yaml file and executed the following commands
- helm repo add traefik https://helm.traefik.io/traefik
- helm install traefik traefik/traefik -n traefik --create-namespace -f traefik.values.yaml
- kubectl get services -n traefik (waited for the external IP address)
Now I have the external IP address (which I suppose I can use to connect from the outside world).
But this fails:
CASS_USER=$(kubectl get secret k8ssandra-test-superuser -o json | jq -r '.data.username' | base64 --decode)
Error from server (NotFound): secrets "k8ssandra-test-superuser" not found
Just to be clear, I suppose your recommendation (in the documents) is not to use K3d. I tried it, but the cluster creation failed, so I moved to using the traefik.values.yaml file instead.
- wget -q -O - https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash
- k3d cluster create \
--k3s-server-arg "--no-deploy" \
--k3s-server-arg "traefik" \
--port "80:32080@loadbalancer" \
--port "443:32443@loadbalancer" \
--port "9000:32090@loadbalancer" \
--port "9042:32091@loadbalancer" \
--port "9142:32092@loadbalancer"
The above failed.
Forgot to add: http://external_IP:9000/dashboard works, but cqlsh doesn't.
From: Manu Chadha
Sent: 21 December 2020 20:51
To: John Sanda
Cc: Manu Chadha; K8ssandra Users
Subject: RE: K8ssandra on GCP
Hi John
I didn’t change any files.
On running `kubectl get secrets | grep superuser`, I get k8ssandra-superuser.
I am able to get the username and password now using
CASS_USER=$(kubectl get secret k8ssandra-superuser -o json | jq -r '.data.username' | base64 --decode)
CASS_PASS=$(kubectl get secret k8ssandra-superuser -o json | jq -r '.data.password' | base64 --decode)
But I am not able to connect to the machine using "cqlsh -u username -p password external_ip 9042"
Connection error: ('Unable to connect to any servers', {'35.242.186.196': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})
The machine is, however, pingable from an external network using the external IP returned by
kubectl get services -n traefik
The service has the following ports open - 9042:32091/TCP,9142:32092/TCP,9000:32090/TCP,80:32080/TCP,443:32443/TCP
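For what it's worth, a quick way to check whether a given port is actually reachable from outside the cluster (assuming netcat is installed on the client machine) is something like:
nc -vz <external_ip> 9042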
To answer your other question, I followed the link you mentioned.
K3d wasn't available by default on the GCP console (unlike Helm and kubectl), so I first installed it.
Then I executed the following command, but it kept failing:
k3d cluster create --k3s-server-arg "--no-deploy" --k3s-server-arg "traefik" --port "80:32080@loadbalancer" --port "443:32443@loadbalancer" --port "9000:32090@loadbalancer" --port "9042:32091@loadbalancer" --port "9142:32092@loadbalancer"
INFO[0000] Created network 'k3d-k3s-default'
INFO[0000] Created volume 'k3d-k3s-default-images'
INFO[0001] Creating node 'k3d-k3s-default-server-0'
INFO[0002] Pulling image 'docker.io/rancher/k3s:v1.19.4-k3s1'
INFO[0013] Creating LoadBalancer 'k3d-k3s-default-serverlb'
INFO[0015] Pulling image 'docker.io/rancher/k3d-proxy:v3.4.0'
ERRO[0018] Failed to start container
ERRO[0018] Failed to create node 'k3d-k3s-default-serverlb'
ERRO[0018] Failed to create loadbalancer
ERRO[0018] Error response from daemon: driver failed programming external connectivity on endpoint k3d-k3s-default-serverlb (122d623da1891f4f5f7ab795597843ee022d014d181e0be3a038188fb0fa73f7): Error starting userland proxy: listen tcp 0.0.0.0:80: bind: address already in use
ERRO[0018] Failed to create cluster >>> Rolling Back
INFO[0018] Deleting cluster 'k3s-default'
INFO[0018] Deleted k3d-k3s-default-server-0
INFO[0018] Deleted k3d-k3s-default-serverlb
INFO[0018] Deleting cluster network '4e1cc603ca33a7fbfa2056b81bc671116d95b8da23c25686db9bb7b39a90f388'
FATA[0018] Cluster creation FAILED, all changes have been rolled back!
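The bind error suggests something in that environment is already listening on port 80. A possible workaround (just a sketch, untested) would be to map different host ports instead, for example:
k3d cluster create --k3s-server-arg "--no-deploy" --k3s-server-arg "traefik" --port "8080:32080@loadbalancer" --port "8443:32443@loadbalancer" --port "9000:32090@loadbalancer" --port "9042:32091@loadbalancer" --port "9142:32092@loadbalancer"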
So I ignored step 1 and did steps 2, 3 and 4 from the link. In the end, I did see an external IP when I ran
kubectl get services -n traefik
Could you suggest what I might be doing wrong and why I am not able to connect using the cqlsh command?
Thanks
Manu
Hi John
The output of kubectl get storageclass for me is:
The output of kubectl describe pod k8ssandra-dc1-default-sts-0 is:
Name: k8ssandra-dc1-default-sts-0
Namespace: default
Priority: 0
Node: gke-k8ssandra-cluster-default-pool-ed69f9d6-9nlx/10.154.0.4
Start Time: Mon, 21 Dec 2020 15:42:08 +0000
Labels: app.kubernetes.io/managed-by=cass-operator
cassandra.datastax.com/cluster=k8ssandra
cassandra.datastax.com/datacenter=dc1
cassandra.datastax.com/node-state=Started
cassandra.datastax.com/rack=default
cassandra.datastax.com/seed-node=true
controller-revision-hash=k8ssandra-dc1-default-sts-569497476c
statefulset.kubernetes.io/pod-name=k8ssandra-dc1-default-sts-0
Annotations: kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container cassandra; cpu request for init container jmx-credentials
Status: Running
IP: 10.84.2.7
IPs:
IP: 10.84.2.7
Controlled By: StatefulSet/k8ssandra-dc1-default-sts
Init Containers:
server-config-init:
Container ID: docker://63d1bdff7fc5a133b06107236120bfe0e0abb23e42ce67070ad267df13c52181
Image: datastax/cass-config-builder:1.0.3
Image ID: docker-pullable://datastax/cass-config-builder@sha256:5d01e624674d1216f10445f23cc9ed97330dc09cda3115096fbf70d67ef22125
Port: <none>
Host Port: <none>
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 21 Dec 2020 15:42:28 +0000
Finished: Mon, 21 Dec 2020 15:42:34 +0000
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 256M
Requests:
cpu: 1
memory: 256M
Environment:
CONFIG_FILE_DATA: {"cassandra-yaml":{},"cluster-info":{"name":"k8ssandra","seeds":"k8ssandra-seed-service"},"datacenter-info":{"graph-enabled":0,"name":"dc1","solr-enabled":0,"spark-enabled":0},"jvm-options":{"initial_heap_size":"800M","max_heap_size":"800M"}}
POD_IP: (v1:status.podIP)
HOST_IP: (v1:status.hostIP)
USE_HOST_IP_FOR_BROADCAST: false
RACK_NAME: default
PRODUCT_VERSION: 3.11.7
PRODUCT_NAME: cassandra
DSE_VERSION: 3.11.7
Mounts:
/config from server-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-2bcs9 (ro)
jmx-credentials:
Container ID: docker://4e0210ed7c17aea4b23b48777d8e80e4c798521ba6b6c29db3c263a635d4979d
Image: busybox
Image ID: docker-pullable://busybox@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678
Port: <none>
Host Port: <none>
Args:
/bin/sh
-c
echo -n "$JMX_USERNAME $JMX_PASSWORD" > /config/jmxremote.password
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 21 Dec 2020 15:42:35 +0000
Finished: Mon, 21 Dec 2020 15:42:35 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Environment:
JMX_USERNAME: <set to the key 'username' in secret 'k8ssandra-cluster-a-reaper-secret-k8ssandra'> Optional: false
JMX_PASSWORD: <set to the key 'password' in secret 'k8ssandra-cluster-a-reaper-secret-k8ssandra'> Optional: false
Mounts:
/config from server-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-2bcs9 (ro)
Containers:
cassandra:
Container ID: docker://79195e8e5704cda5d97f0c2d0e6d3f476037350f41c73f3276caeedb2f83d4f0
Image: jsanda/mgmtapi-3_11:v0.1.13-k8c-88
Image ID: docker-pullable://jsanda/mgmtapi-3_11@sha256:d50652c7be5c38264703636f75dce53b561fff5e766a42e753592fc5276f8a5d
Ports: 9042/TCP, 9142/TCP, 7000/TCP, 7001/TCP, 7199/TCP, 8080/TCP, 9103/TCP, 9160/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Mon, 21 Dec 2020 15:42:53 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Liveness: http-get http://:8080/api/v0/probes/liveness delay=15s timeout=1s period=15s #success=1 #failure=3
Readiness: http-get http://:8080/api/v0/probes/readiness delay=20s timeout=1s period=10s #success=1 #failure=3
Environment:
LOCAL_JMX: no
DS_LICENSE: accept
DSE_AUTO_CONF_OFF: all
USE_MGMT_API: true
MGMT_API_EXPLICIT_START: true
DSE_MGMT_EXPLICIT_START: true
Mounts:
/config from server-config (rw)
/etc/encryption/ from encryption-cred-storage (rw)
/var/lib/cassandra from server-data (rw)
/var/log/cassandra from server-logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-2bcs9 (ro)
server-system-logger:
Container ID: docker://3cac924cd9fd0d7baa226b5bec60c85fa8a86da39c0df2a7913451c1bd60cd02
Image: busybox
Image ID: docker-pullable://busybox@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678
Port: <none>
Host Port: <none>
Args:
/bin/sh
-c
tail -n+1 -F /var/log/cassandra/system.log
State: Running
Started: Mon, 21 Dec 2020 15:42:53 +0000
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 64M
Requests:
cpu: 100m
memory: 64M
Environment: <none>
Mounts:
/var/log/cassandra from server-logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-2bcs9 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
server-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: server-data-k8ssandra-dc1-default-sts-0
ReadOnly: false
server-config:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
server-logs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
encryption-cred-storage:
Type: Secret (a volume populated by a Secret)
SecretName: dc1-keystore
Optional: false
default-token-2bcs9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-2bcs9
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
The value is referenced in cassdc.yaml - https://github.com/k8ssandra/k8ssandra/blob/main/charts/k8ssandra-cluster/templates/cassdc.yaml
size: {{ .Values.size }}
Am I correct that the current cluster has only 1 node?
I changed the value to 3 in the values.yaml file and upgraded the installation. I can now see 3 stateful set pods.
k8ssandra-dc1-default-sts-0 2/2 Running 0 28h
k8ssandra-dc1-default-sts-1 0/2 Pending 0 5m24s
k8ssandra-dc1-default-sts-2 2/2 Running 0 5m24s
but one of them is stuck on Pending!
Interestingly, however, the replication problem is now solved, I suppose because I am using QUORUM and, with 2 of the 3 nodes up, the command succeeded for dc1:3 replication.
myapp | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc1': '3'}
coordinator | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc1': '3'}
What might be causing one of the pods to be stuck at 0/2?
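For reference, the upgrade itself was applied by editing values.yaml; an equivalent way to do it without editing the file (assuming the release name used earlier) would be roughly:
helm upgrade k8ssandra-cluster-a k8ssandra/k8ssandra-cluster --set size=3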
From: Manu Chadha
Sent: 23 December 2020 12:58
To: K8ssandra Users; John Sanda
Subject: RE: NoHostAvailable error in K8ssandra
Or could it be that I am actually running a DC with only one node (the default setting of K8ssandra)? I checked https://github.com/k8ssandra/k8ssandra/blob/main/charts/k8ssandra-cluster/values.yaml and found that size is 1. Does this size correspond to the number of nodes in a data center?
From: Manu Chadha
Sent: 23 December 2020 12:46
To: K8ssandra Users; John Sanda
Subject: RE: NoHostAvailable error in K8ssandra
Hi
Unfortunately, I have not made any progress in resolving the issue. Some help would be much appreciated.
The only place where I have seen ports getting exposed is in traefik.values.yaml. I believe the nodes in a cluster talk to each other on port 7000. Unless K8ssandra takes care of managing inter-node communication internally (maybe using pod IPs etc.), should 7000 be added in traefik.values.yaml (see the sketch after the snippet below)? I honestly doubt it, as I think these ports work on the external IP and not on pod or node IPs.
ports:
  traefik:
    expose: true
    nodePort: 32090
  web:
    nodePort: 32080
  websecure:
    nodePort: 32443
  cassandra:    <- add another entry like this one with expose: true, port: 7000, nodePort: 7000?
    expose: true
    port: 9042
    nodePort: 32091
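If port 7000 did have to be exposed through Traefik, I imagine the entry would look something like the following (the entrypoint name and nodePort here are made up), although my understanding is that inter-node traffic stays on the pod network and never goes through Traefik, so this is probably unnecessary:
  cassandra-internode:
    expose: true
    port: 7000
    nodePort: 32070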
From: Manu Chadha
Sent: 22 December 2020 11:52
To: K8ssandra Users <k8ssand...@googlegroups.com>
Subject: NoHostAvailable error in K8ssandra
Hi
I have created a 3-node Kubernetes cluster and installed K8ssandra with its default values. I supposed K8ssandra would create a 3-node Cassandra cluster.
I then created a keyspace with NetworkTopologyStrategy and a datacenter replication factor of 3:
CREATE KEYSPACE codingjedi WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3'} ;
Then, when I tried to insert data into a table, I got a NoHostAvailable error:
k8ssandra-superuser@cqlsh:codingjedi> INSERT INTO tablename ... VALUES ... IF NOT EXISTS;
NoHostAvailable:
I assume the above error happens if the replication factor or consistency level cannot be satisfied. Am I correct? If so, what might be causing the error, considering that there are 3 nodes?
To debug further, I changed the replication factor to 1 and things work (it didn't work for RF=2 either).
I suspect that maybe inter-node communication is blocked, though I don't know whether that is due to a fault in my configuration, firewall issues, or K8ssandra itself.
How could I debug it further?
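One check that might help (a sketch; nodetool will likely need the JMX credentials from the reaper secret, since LOCAL_JMX is set to "no" on the Cassandra container) is to exec into a Cassandra pod and look at the ring:
kubectl exec -it k8ssandra-dc1-default-sts-0 -c cassandra -- nodetool -u <jmx_username> -pw <jmx_password> status
If fewer than 3 nodes show up as UN (Up/Normal), that would explain why RF=3 requests fail while RF=1 works.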
Thanks
Limits:
cpu: 1
memory: 256M
Requests:
cpu: 1
memory: 256M
This means that each Cassandra pod is requesting 1 CPU. We can change it to request less, but honestly you would be better off by using a machine with more CPUs.
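A quick way to confirm what the Cassandra container is actually requesting (just a sketch using kubectl's jsonpath output; the pod name is the one from above):
kubectl get pod k8ssandra-dc1-default-sts-0 -o jsonpath='{.spec.containers[?(@.name=="cassandra")].resources}'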
It has warnings:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 43m (x3 over 43m) default-scheduler pod has unbound immediate PersistentVolumeClaims
Warning FailedScheduling 83s (x30 over 43m) default-scheduler 0/3 nodes are available: 3 Insufficient cpu.
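The first warning can be checked directly; something along these lines should show whether the claim for the new pod has bound and which storage class it uses (the PVC name is the one from the pod description below):
kubectl get pvc server-data-k8ssandra-dc1-default-sts-1
kubectl get storageclass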
Guessing: maybe I had to change the size in multiple places, which I haven't.
The complete output is
Name: k8ssandra-dc1-default-sts-1
Namespace: default
Priority: 0
Node: <none>
Labels: app.kubernetes.io/managed-by=cass-operator
cassandra.datastax.com/cluster=k8ssandra
cassandra.datastax.com/datacenter=dc1
cassandra.datastax.com/node-state=Ready-to-Start
cassandra.datastax.com/rack=default
controller-revision-hash=k8ssandra-dc1-default-sts-569497476c
statefulset.kubernetes.io/pod-name=k8ssandra-dc1-default-sts-1
Annotations: kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container cassandra; cpu request for init container jmx-credentials
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/k8ssandra-dc1-default-sts
Init Containers:
server-config-init:
Image: datastax/cass-config-builder:1.0.3
Port: <none>
Host Port: <none>
Limits:
cpu: 1
memory: 256M
Requests:
cpu: 1
memory: 256M
Environment:
CONFIG_FILE_DATA: {"cassandra-yaml":{},"cluster-info":{"name":"k8ssandra","seeds":"k8ssandra-seed-service"},"datacenter-info":{"graph-enabled":0,"name":"dc1","solr-enabled":0,"spark-enabled":0},"jvm-options":{"initial_heap_size":"800M","max_heap_size":"800M"}}
POD_IP: (v1:status.podIP)
HOST_IP: (v1:status.hostIP)
USE_HOST_IP_FOR_BROADCAST: false
RACK_NAME: default
PRODUCT_VERSION: 3.11.7
PRODUCT_NAME: cassandra
DSE_VERSION: 3.11.7
Mounts:
/config from server-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ftj87 (ro)
jmx-credentials:
Image: busybox
Port: <none>
Host Port: <none>
Args:
/bin/sh
-c
echo -n "$JMX_USERNAME $JMX_PASSWORD" > /config/jmxremote.password
Requests:
cpu: 100m
Environment:
JMX_USERNAME: <set to the key 'username' in secret 'k8ssandra-cluster-a-reaper-secret-k8ssandra'> Optional: false
JMX_PASSWORD: <set to the key 'password' in secret 'k8ssandra-cluster-a-reaper-secret-k8ssandra'> Optional: false
Mounts:
/config from server-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ftj87 (ro)
Containers:
cassandra:
Image: jsanda/mgmtapi-3_11:v0.1.13-k8c-88
Ports: 9042/TCP, 9142/TCP, 7000/TCP, 7001/TCP, 7199/TCP, 8080/TCP, 9103/TCP, 9160/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Requests:
cpu: 100m
Liveness: http-get http://:8080/api/v0/probes/liveness delay=15s timeout=1s period=15s #success=1 #failure=3
Readiness: http-get http://:8080/api/v0/probes/readiness delay=20s timeout=1s period=10s #success=1 #failure=3
Environment:
LOCAL_JMX: no
DS_LICENSE: accept
DSE_AUTO_CONF_OFF: all
USE_MGMT_API: true
MGMT_API_EXPLICIT_START: true
DSE_MGMT_EXPLICIT_START: true
Mounts:
/config from server-config (rw)
/etc/encryption/ from encryption-cred-storage (rw)
/var/lib/cassandra from server-data (rw)
/var/log/cassandra from server-logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ftj87 (ro)
server-system-logger:
Image: busybox
Port: <none>
Host Port: <none>
Args:
/bin/sh
-c
tail -n+1 -F /var/log/cassandra/system.log
Limits:
cpu: 100m
memory: 64M
Requests:
cpu: 100m
memory: 64M
Environment: <none>
Mounts:
/var/log/cassandra from server-logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ftj87 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
server-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: server-data-k8ssandra-dc1-default-sts-1
ReadOnly: false
server-config:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
server-logs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
encryption-cred-storage:
Type: Secret (a volume populated by a Secret)
SecretName: dc1-keystore
Optional: false
default-token-ftj87:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-ftj87
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 43m (x3 over 43m) default-scheduler pod has unbound immediate PersistentVolumeClaims
Warning FailedScheduling 96s (x30 over 43m) default-scheduler 0/3 nodes are available: 3 Insufficient cpu.
From: John Sanda
Sent: 23 December 2020 14:46
To: Manu Chadha
Cc: K8ssandra Users
I am happy to increase it to the right size. What configuration do you recommend? I went with 2 CPU and 8GB because that was the minimum production recommendation for C*.
Does K8ssandra need more as a minimum?
Also, if you don't mind, could you please explain to me what the limit means? I am unable to work out the math. The cluster has 6 vCPUs. If there are 3 C* nodes (nodes per DC = 3 from size: {{ .Values.size }}), then there are still 3 CPUs left. What am I not considering in the math?
Yes, 6 CPUs in total, 2 per node. Also, 8 GB RAM per node, 24GB in total.
The output of kubectl top pods (all namespaces) is below (to be honest, I don't think I know what I am looking for, but the numbers don't seem staggering).
NAMESPACE NAME CPU(cores) MEMORY(bytes)
default cass-operator-86d4dc45cd-588c8 6m 13Mi
default grafana-deployment-66557855cc-j7476 6m 234Mi
default k8ssandra-cluster-a-grafana-operator-k8ssandra-5b89b64f4f-8pbxk 6m 321Mi
default k8ssandra-cluster-a-reaper-k8ssandra-847c99ccd8-dsnj4 3m 241Mi
default k8ssandra-cluster-a-reaper-operator-k8ssandra-87d56d56f-wn8hw 2m 12Mi
default k8ssandra-dc1-default-sts-0 250m 1513Mi
default k8ssandra-dc1-default-sts-2 166m 1443Mi
default k8ssandra-tools-kube-prome-operator-6bcdf668d4-ndhw9 1m 17Mi
default prometheus-k8ssandra-cluster-a-prometheus-k8ssandra-0 31m 169Mi
kube-system event-exporter-gke-77cccd97c6-xxmjp 1m 12Mi
kube-system fluentd-gke-5rdqb 6m 176Mi
kube-system fluentd-gke-f4f7c 19m 181Mi
kube-system fluentd-gke-h74tv 9m 170Mi
kube-system fluentd-gke-scaler-54796dcbf7-8p55m 0m 6Mi
kube-system gke-metrics-agent-cdb9f 1m 22Mi
kube-system gke-metrics-agent-cjprt 1m 21Mi
kube-system gke-metrics-agent-pjqn8 2m 22Mi
kube-system kube-dns-7bb4975665-hsqtw 4m 36Mi
kube-system kube-dns-7bb4975665-sfrdr 3m 34Mi
kube-system kube-dns-autoscaler-645f7d66cf-2b4zp 1m 4Mi
kube-system kube-proxy-gke-k8ssandra-cluster-default-pool-1b1cc22a-3455 2m 14Mi
kube-system kube-proxy-gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t 5m 14Mi
kube-system kube-proxy-gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0 2m 14Mi
kube-system l7-default-backend-678889f899-bkm68 1m 2Mi
kube-system metrics-server-v0.3.6-64655c969-65qmc 1m 19Mi
kube-system prometheus-to-sd-dwdq5 0m 7Mi
kube-system prometheus-to-sd-fjprd 0m 7Mi
kube-system prometheus-to-sd-xbh7r 2m 7Mi
kube-system stackdriver-metadata-agent-cluster-level-7c568dcbb6-h5pwn 35m 26Mi
traefik traefik-57788644cc-twx5n 5m 20Mi
To me, the issue seems to be that C* needs 51% CPU per node (I don't know why, though). That much capacity is available on nodes gke-k8ssandra-cluster-default-pool-1b1cc22a-3455 and gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t, but the 3rd node has other processes running which reduce its available CPU:
- In node gke-k8ssandra-cluster-default-pool-1b1cc22a-3455, the default namespace takes 56% CPU and the kube-system namespace takes around 23%. The main default pod there is k8ssandra-dc1-default-sts-2.
- In node gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t, default takes 51% and kube-system takes 25%. The main default pod there is k8ssandra-dc1-default-sts-0.
- In node gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0, default takes 42% and kube-system takes 15%, leaving no room for k8ssandra-dc1-default-sts-1.
What do you recommend? If I increase the CPU to 4, then the problem would be solved, though 2-3 CPUs per node would be wasted!
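Putting rough numbers on it from the kubectl describe node output below: each node reports 1930m allocatable CPU; on gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0 the existing pods already request 1199m, leaving about 1930m - 1199m = 731m, which is less than the 1000m the Cassandra pod requests. That matches the "3 Insufficient cpu" scheduling message.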
From: Manu Chadha
Sent: 23 December 2020 16:04
To: John Sanda
Cc: K8ssandra Users
Subject: RE: NoHostAvailable error in K8ssandra
Here you go, sir (and I am grateful that you are giving this your time). I have also attached the file for easy reading.
-----
Name: gke-k8ssandra-cluster-default-pool-1b1cc22a-3455
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=e2-standard-2
cloud.google.com/gke-nodepool=default-pool
cloud.google.com/gke-os-distribution=cos
failure-domain.beta.kubernetes.io/region=europe-west2
failure-domain.beta.kubernetes.io/zone=europe-west2-a
kubernetes.io/hostname=gke-k8ssandra-cluster-default-pool-1b1cc22a-3455
Annotations: container.googleapis.com/instance_id: 8182738576015624683
node.alpha.kubernetes.io/ttl: 0
node.gke.io/last-applied-node-labels: cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 22 Dec 2020 09:07:57 +0000
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: gke-k8ssandra-cluster-default-pool-1b1cc22a-3455
AcquireTime: <unset>
RenewTime: Wed, 23 Dec 2020 15:59:50 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
CorruptDockerOverlay2 False Wed, 23 Dec 2020 15:56:25 +0000 Tue, 22 Dec 2020 09:07:59 +0000 NoCorruptDockerOverlay2 docker overlay2 is functioning properly
FrequentUnregisterNetDevice False Wed, 23 Dec 2020 15:56:25 +0000 Tue, 22 Dec 2020 09:07:59 +0000 NoFrequentUnregisterNetDevice node is functioning properly
FrequentKubeletRestart False Wed, 23 Dec 2020 15:56:25 +0000 Tue, 22 Dec 2020 09:07:59 +0000 NoFrequentKubeletRestart kubelet is functioning properly
FrequentDockerRestart False Wed, 23 Dec 2020 15:56:25 +0000 Tue, 22 Dec 2020 09:07:59 +0000 NoFrequentDockerRestart docker is functioning properly
FrequentContainerdRestart False Wed, 23 Dec 2020 15:56:25 +0000 Tue, 22 Dec 2020 09:07:59 +0000 NoFrequentContainerdRestart containerd is functioning properly
KernelDeadlock False Wed, 23 Dec 2020 15:56:25 +0000 Tue, 22 Dec 2020 09:07:59 +0000 KernelHasNoDeadlock kernel has no deadlock
ReadonlyFilesystem False Wed, 23 Dec 2020 15:56:25 +0000 Tue, 22 Dec 2020 09:07:59 +0000 FilesystemIsNotReadOnly Filesystem is not read-only
NetworkUnavailable False Tue, 22 Dec 2020 09:07:58 +0000 Tue, 22 Dec 2020 09:07:58 +0000 RouteCreated NodeController create implicit route
MemoryPressure False Wed, 23 Dec 2020 15:59:39 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 23 Dec 2020 15:59:39 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 23 Dec 2020 15:59:39 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 23 Dec 2020 15:59:39 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.154.0.7
ExternalIP: 34.105.169.69
InternalDNS: gke-k8ssandra-cluster-default-pool-1b1cc22a-3455.europe-west2-a.c.k8ssandra-299315.internal
Hostname: gke-k8ssandra-cluster-default-pool-1b1cc22a-3455.europe-west2-a.c.k8ssandra-299315.internal
Capacity:
attachable-volumes-gce-pd: 127
cpu: 2
ephemeral-storage: 5971884Ki
hugepages-2Mi: 0
memory: 8172624Ki
pods: 110
Allocatable:
attachable-volumes-gce-pd: 127
cpu: 1930m
ephemeral-storage: 134979166
hugepages-2Mi: 0
memory: 6207568Ki
pods: 110
System Info:
Machine ID: bc2c3c203be3a7759f925425a85124e9
System UUID: bc2c3c20-3be3-a775-9f92-5425a85124e9
Boot ID: e7f11a34-3ac5-4376-a56a-b29f765c1c53
Kernel Version: 4.19.112+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.1
Kubelet Version: v1.16.15-gke.4901
Kube-Proxy Version: v1.16.15-gke.4901
PodCIDR: 10.84.0.0/24
PodCIDRs: 10.84.0.0/24
ProviderID: gce://k8ssandra-299315/europe-west2-a/gke-k8ssandra-cluster-default-pool-1b1cc22a-3455
Non-terminated Pods: (9 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default k8ssandra-cluster-a-reaper-k8ssandra-847c99ccd8-dsnj4 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30h
default k8ssandra-dc1-default-sts-2 1 (51%) 1 (51%) 256M (4%) 256M (4%) 116m
kube-system event-exporter-gke-77cccd97c6-xxmjp 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30h
kube-system fluentd-gke-h74tv 100m (5%) 1 (51%) 200Mi (3%) 500Mi (8%) 30h
kube-system gke-metrics-agent-pjqn8 3m (0%) 0 (0%) 50Mi (0%) 50Mi (0%) 30h
kube-system kube-dns-7bb4975665-sfrdr 260m (13%) 0 (0%) 110Mi (1%) 210Mi (3%) 30h
kube-system kube-proxy-gke-k8ssandra-cluster-default-pool-1b1cc22a-3455 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30h
kube-system prometheus-to-sd-dwdq5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30h
traefik traefik-57788644cc-twx5n 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1563m (80%) 2 (103%)
memory 633487360 (9%) 1052917760 (16%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
attachable-volumes-gce-pd 0 0
Events: <none>
Name: gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=e2-standard-2
cloud.google.com/gke-nodepool=default-pool
cloud.google.com/gke-os-distribution=cos
failure-domain.beta.kubernetes.io/region=europe-west2
failure-domain.beta.kubernetes.io/zone=europe-west2-a
kubernetes.io/hostname=gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t
Annotations: container.googleapis.com/instance_id: 6741938848215599595
node.alpha.kubernetes.io/ttl: 0
node.gke.io/last-applied-node-labels: cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 22 Dec 2020 09:07:57 +0000
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t
AcquireTime: <unset>
RenewTime: Wed, 23 Dec 2020 15:59:46 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
FrequentUnregisterNetDevice False Wed, 23 Dec 2020 15:56:20 +0000 Tue, 22 Dec 2020 09:07:58 +0000 NoFrequentUnregisterNetDevice node is functioning properly
FrequentKubeletRestart False Wed, 23 Dec 2020 15:56:20 +0000 Tue, 22 Dec 2020 09:07:58 +0000 NoFrequentKubeletRestart kubelet is functioning properly
FrequentDockerRestart False Wed, 23 Dec 2020 15:56:20 +0000 Tue, 22 Dec 2020 09:07:58 +0000 NoFrequentDockerRestart docker is functioning properly
FrequentContainerdRestart False Wed, 23 Dec 2020 15:56:20 +0000 Tue, 22 Dec 2020 09:07:58 +0000 NoFrequentContainerdRestart containerd is functioning properly
KernelDeadlock False Wed, 23 Dec 2020 15:56:20 +0000 Tue, 22 Dec 2020 09:07:58 +0000 KernelHasNoDeadlock kernel has no deadlock
ReadonlyFilesystem False Wed, 23 Dec 2020 15:56:20 +0000 Tue, 22 Dec 2020 09:07:58 +0000 FilesystemIsNotReadOnly Filesystem is not read-only
CorruptDockerOverlay2 False Wed, 23 Dec 2020 15:56:20 +0000 Tue, 22 Dec 2020 09:07:58 +0000 NoCorruptDockerOverlay2 docker overlay2 is functioning properly
NetworkUnavailable False Tue, 22 Dec 2020 09:07:58 +0000 Tue, 22 Dec 2020 09:07:58 +0000 RouteCreated NodeController create implicit route
MemoryPressure False Wed, 23 Dec 2020 15:59:19 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 23 Dec 2020 15:59:19 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 23 Dec 2020 15:59:19 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 23 Dec 2020 15:59:19 +0000 Tue, 22 Dec 2020 09:07:57 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.154.0.5
ExternalIP: 35.242.186.196
InternalDNS: gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t.europe-west2-a.c.k8ssandra-299315.internal
Hostname: gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t.europe-west2-a.c.k8ssandra-299315.internal
Capacity:
attachable-volumes-gce-pd: 127
cpu: 2
ephemeral-storage: 5971884Ki
hugepages-2Mi: 0
memory: 8172640Ki
pods: 110
Allocatable:
attachable-volumes-gce-pd: 127
cpu: 1930m
ephemeral-storage: 134979166
hugepages-2Mi: 0
memory: 6207584Ki
pods: 110
System Info:
Machine ID: 1f819bfba543c43d2d679c9112217707
System UUID: 1f819bfb-a543-c43d-2d67-9c9112217707
Boot ID: d993f96b-fefa-4a62-a6df-09877a48a3c1
Kernel Version: 4.19.112+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.1
Kubelet Version: v1.16.15-gke.4901
Kube-Proxy Version: v1.16.15-gke.4901
PodCIDR: 10.84.1.0/24
PodCIDRs: 10.84.1.0/24
ProviderID: gce://k8ssandra-299315/europe-west2-a/gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t
Non-terminated Pods: (9 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default k8ssandra-dc1-default-sts-0 1 (51%) 1 (51%) 256M (4%) 256M (4%) 30h
kube-system fluentd-gke-5rdqb 100m (5%) 1 (51%) 200Mi (3%) 500Mi (8%) 30h
kube-system fluentd-gke-scaler-54796dcbf7-8p55m 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30h
kube-system gke-metrics-agent-cdb9f 3m (0%) 0 (0%) 50Mi (0%) 50Mi (0%) 30h
kube-system kube-dns-7bb4975665-hsqtw 260m (13%) 0 (0%) 110Mi (1%) 210Mi (3%) 30h
kube-system kube-dns-autoscaler-645f7d66cf-2b4zp 20m (1%) 0 (0%) 10Mi (0%) 0 (0%) 30h
kube-system kube-proxy-gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30h
kube-system l7-default-backend-678889f899-bkm68 10m (0%) 10m (0%) 20Mi (0%) 20Mi (0%) 30h
kube-system prometheus-to-sd-xbh7r 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1493m (77%) 2010m (104%)
memory 664944640 (10%) 1073889280 (16%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
attachable-volumes-gce-pd 0 0
Events: <none>
Name: gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=e2-standard-2
cloud.google.com/gke-nodepool=default-pool
cloud.google.com/gke-os-distribution=cos
failure-domain.beta.kubernetes.io/region=europe-west2
failure-domain.beta.kubernetes.io/zone=europe-west2-a
kubernetes.io/hostname=gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0
Annotations: container.googleapis.com/instance_id: 6505924889152326123
node.alpha.kubernetes.io/ttl: 0
node.gke.io/last-applied-node-labels: cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 22 Dec 2020 09:08:05 +0000
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0
AcquireTime: <unset>
RenewTime: Wed, 23 Dec 2020 15:59:46 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
KernelDeadlock False Wed, 23 Dec 2020 15:56:34 +0000 Tue, 22 Dec 2020 09:08:08 +0000 KernelHasNoDeadlock kernel has no deadlock
ReadonlyFilesystem False Wed, 23 Dec 2020 15:56:34 +0000 Tue, 22 Dec 2020 09:08:08 +0000 FilesystemIsNotReadOnly Filesystem is not read-only
CorruptDockerOverlay2 False Wed, 23 Dec 2020 15:56:34 +0000 Tue, 22 Dec 2020 09:08:08 +0000 NoCorruptDockerOverlay2 docker overlay2 is functioning properly
FrequentUnregisterNetDevice False Wed, 23 Dec 2020 15:56:34 +0000 Tue, 22 Dec 2020 09:08:08 +0000 NoFrequentUnregisterNetDevice node is functioning properly
FrequentKubeletRestart False Wed, 23 Dec 2020 15:56:34 +0000 Tue, 22 Dec 2020 09:08:08 +0000 NoFrequentKubeletRestart kubelet is functioning properly
FrequentDockerRestart False Wed, 23 Dec 2020 15:56:34 +0000 Tue, 22 Dec 2020 09:08:08 +0000 NoFrequentDockerRestart docker is functioning properly
FrequentContainerdRestart False Wed, 23 Dec 2020 15:56:34 +0000 Tue, 22 Dec 2020 09:08:08 +0000 NoFrequentContainerdRestart containerd is functioning properly
NetworkUnavailable False Tue, 22 Dec 2020 09:08:06 +0000 Tue, 22 Dec 2020 09:08:06 +0000 RouteCreated NodeController create implicit route
MemoryPressure False Wed, 23 Dec 2020 15:59:27 +0000 Tue, 22 Dec 2020 09:08:05 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 23 Dec 2020 15:59:27 +0000 Tue, 22 Dec 2020 09:08:05 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 23 Dec 2020 15:59:27 +0000 Tue, 22 Dec 2020 09:08:05 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 23 Dec 2020 15:59:27 +0000 Tue, 22 Dec 2020 09:08:06 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.154.0.6
ExternalIP: 34.105.242.148
InternalDNS: gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0.europe-west2-a.c.k8ssandra-299315.internal
Hostname: gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0.europe-west2-a.c.k8ssandra-299315.internal
Capacity:
attachable-volumes-gce-pd: 127
cpu: 2
ephemeral-storage: 5971884Ki
hugepages-2Mi: 0
memory: 8172640Ki
pods: 110
Allocatable:
attachable-volumes-gce-pd: 127
cpu: 1930m
ephemeral-storage: 134979166
hugepages-2Mi: 0
memory: 6207584Ki
pods: 110
System Info:
Machine ID: efbb8f4aab3ae8d9f37258ed1ce97ce7
System UUID: efbb8f4a-ab3a-e8d9-f372-58ed1ce97ce7
Boot ID: 35338e12-064e-457a-b004-494b2027ddef
Kernel Version: 4.19.112+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.1
Kubelet Version: v1.16.15-gke.4901
Kube-Proxy Version: v1.16.15-gke.4901
PodCIDR: 10.84.2.0/24
PodCIDRs: 10.84.2.0/24
ProviderID: gce://k8ssandra-299315/europe-west2-a/gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0
Non-terminated Pods: (12 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default cass-operator-86d4dc45cd-588c8 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30h
default grafana-deployment-66557855cc-j7476 250m (12%) 1 (51%) 256Mi (4%) 1Gi (16%) 30h
default k8ssandra-cluster-a-grafana-operator-k8ssandra-5b89b64f4f-8pbxk 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30h
default k8ssandra-cluster-a-reaper-operator-k8ssandra-87d56d56f-wn8hw 100m (5%) 100m (5%) 20Mi (0%) 30Mi (0%) 30h
default k8ssandra-tools-kube-prome-operator-6bcdf668d4-ndhw9 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30h
default prometheus-k8ssandra-cluster-a-prometheus-k8ssandra-0 200m (10%) 100m (5%) 25Mi (0%) 25Mi (0%) 30h
kube-system fluentd-gke-f4f7c 100m (5%) 1 (51%) 200Mi (3%) 500Mi (8%) 30h
kube-system gke-metrics-agent-cjprt 3m (0%) 0 (0%) 50Mi (0%) 50Mi (0%) 30h
kube-system kube-proxy-gke-k8ssandra-cluster-default-pool-1b1cc22a-xtk0 100m (5%) 0 (0%) 0 (0%) 0 (0%) 30h
kube-system metrics-server-v0.3.6-64655c969-65qmc 48m (2%) 143m (7%) 105Mi (1%) 355Mi (5%) 30h
kube-system prometheus-to-sd-fjprd 0 (0%) 0 (0%) 0 (0%) 0 (0%) 30h
kube-system stackdriver-metadata-agent-cluster-level-7c568dcbb6-h5pwn 98m (5%) 48m (2%) 202Mi (3%) 202Mi (3%) 30h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1199m (62%) 2391m (123%)
memory 858Mi (14%) 2186Mi (36%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
attachable-volumes-gce-pd 0 0
Events: <none>