What could be the reason for the many pod restarts?


Dmitry Bobrovsky

Jul 9, 2021, 5:58:20 AM
to rook-dev
Hello!

rook-ceph 1.4.
Kubernetes 1.19.
This is a development cluster deployed on a single node.
I see many pod restarts, and sometimes Ceph becomes completely unavailable, but I don't understand the reason for this.

k -n rook-ceph get pod
NAME                                            READY   STATUS    RESTARTS   AGE
rook-ceph-mgr-a-755d89bdb4-2xprg                1/1     Running   17         120d
rook-ceph-mon-a-544fb6f8b8-mzqxn                1/1     Running   16         120d
rook-ceph-operator-6bb96b97d6-285jm             1/1     Running   35         120d
rook-ceph-osd-0-75b996c7dd-lr24q                1/1     Running   119        120d
rook-ceph-rgw-object-store-a-74554f7977-hztkp   1/1     Running   168        26h
rook-ceph-tools-54c767bfbb-pdhpr                1/1     Running   39         293d

How can this be diagnosed?
Where should I start investigating this situation?

Travis Nielsen

Jul 12, 2021, 10:58:06 AM
to Dmitry Bobrovsky, rook-dev
The liveness probes may be failing. Check with "kubectl describe pod <pod>" to see the liveness probe status. How much memory does the node have? It may need more.

Travis
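A minimal diagnostic pass along the lines Travis suggests might look like this (the pod name below is the OSD pod from the listing above; restart counts and reasons will of course differ per cluster, and `kubectl top` assumes metrics-server is installed):

```shell
# Show events and liveness-probe status for the frequently restarting OSD pod
kubectl -n rook-ceph describe pod rook-ceph-osd-0-75b996c7dd-lr24q

# Why did the previous container instance terminate?
# "OOMKilled" points at memory pressure; "Error" suggests a probe/crash issue.
kubectl -n rook-ceph get pod rook-ceph-osd-0-75b996c7dd-lr24q \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'

# Logs from the container instance that was restarted
kubectl -n rook-ceph logs --previous rook-ceph-osd-0-75b996c7dd-lr24q

# Current memory/CPU usage of the node (requires metrics-server)
kubectl top node
```

If `lastState.terminated.reason` is `OOMKilled`, the node likely needs more memory or the Ceph daemons need lower resource limits; if the describe output shows repeated "Liveness probe failed" events, the daemons are responding too slowly, which on a single-node development cluster is also commonly a memory problem.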

To view this discussion on the web visit https://groups.google.com/d/msgid/rook-dev/4f578f4c-6d8e-4d07-90f7-5396d51fbc1dn%40googlegroups.com.
