Disappearing k8s master


he...@andrewhowden.com

Apr 21, 2018, 4:23:07 AM
to Kubernetes user discussion and Q&A
Hola All,

I am currently struggling with a bug that I don't really know how to resolve. In essence, my k8s master has disappeared.

Specifically, it's not available:

- Via kubectl
- In the admin panel
- To the (single) node within the cluster

All signs point to it being switched off, or separated by a network partition. However, I'm not 100% sure how to debug such a case.

Firstly, it's a personal account, so this particular outage is "fine" (read: I feel stupid when people say my website is down, but other than that it doesn't matter). However, I also use these clusters in a professional capacity, for larger workloads -- and that's super scary.

Secondly, it's a cluster that only runs preemptible nodes. They're way cheaper and I don't care about small downtimes.

From monitoring, it looks like it died at about 6:19 PM on April 13 (I think AEST?). The cluster itself is fairly talkative until 2018-03-13 10:24:27, after which it either logs nothing further or the logs are dropped. I would guess the former -- the visible incident happens when the node gets rotated out, not when the master dies.

I'm kind of at a loss. Everything is still there in case a helpful Google Cloud person visits these forums (I want to understand the root cause so it doesn't happen to other, more important accounts) -- but has anyone else seen this?

Mayur Nagekar

Apr 21, 2018, 4:25:59 AM
to kubernet...@googlegroups.com
What's the output of `kubectl get nodes`?


--
You received this message because you are subscribed to the Google Groups "Kubernetes user discussion and Q&A" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-users+unsubscribe@googlegroups.com.
To post to this group, send email to kubernetes-users@googlegroups.com.
Visit this group at https://groups.google.com/group/kubernetes-users.
For more options, visit https://groups.google.com/d/optout.



--

Thanks,

Mayur

he...@andrewhowden.com

Apr 21, 2018, 4:31:24 AM
to Kubernetes user discussion and Q&A
On Saturday, April 21, 2018 at 10:25:59 AM UTC+2, Mayur Nagekar wrote:
> What's the output of `kubectl get nodes`?

__USER_NAME__@__PROJECT_NAME__:~$ kubectl get nodes -v=9
I0421 10:27:53.952335 620 loader.go:357] Config loaded from file /home/__USER_NAME__/.kube/config
I0421 10:27:53.953721 620 round_trippers.go:417] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.8.6 (linux/amd64) kubernetes/6260bb0" https://104.155.__XXX__.__XXX__/api
I0421 10:28:23.958470 620 round_trippers.go:436] GET https://104.155.__XXX__.__XXX__/api in 30004 milliseconds
I0421 10:28:23.958494 620 round_trippers.go:442] Response Headers:
I0421 10:28:23.958559 620 cached_discovery.go:126] skipped caching discovery info due to Get https://104.155.__XXX__.__XXX__/api: dial tcp 104.155.__XXX__.__XXX__:443: i/o timeout
I0421 10:28:23.958586 620 helpers.go:225] Connection error: Get https://104.155.__XXX__.__XXX__/api: dial tcp 104.155.__XXX__.__XXX__:443: i/o timeout
F0421 10:28:23.958602 620 helpers.go:120] Unable to connect to the server: dial tcp 104.155.__XXX__.__XXX__:443: i/o timeout

(Anything of the form __${FOO}__ above is a value that I sanitized.)
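
The trace above ends in an "i/o timeout" while dialing port 443, rather than a refused connection. As a rough illustration (not part of the original post), here is a minimal Python sketch that classifies a TCP connection attempt. The function name `probe_tcp` and its return labels are made up for this example. A timeout usually means packets are being dropped somewhere (host down, firewall, network partition), while a refusal usually means the host is reachable but nothing is listening on that port.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Try a plain TCP connect and classify the outcome.

    Returns one of: "open", "timeout", "refused", or "error: <details>".
    Hypothetical helper for illustration; it does not speak TLS or HTTP.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the deadline (similar to kubectl's i/o timeout).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this port.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and so on.
        return f"error: {exc}"
```

Calling `probe_tcp(master_ip, 443)` with the master's address (sanitized in the log above) from a few different networks may help narrow down whether the endpoint is unreachable everywhere or only from one location.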

he...@andrewhowden.com

Apr 21, 2018, 4:35:49 AM
to Kubernetes user discussion and Q&A
On Saturday, April 21, 2018 at 10:31:24 AM UTC+2, he...@andrewhowden.com wrote:
> I0421 10:27:53.953721 620 round_trippers.go:417] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.8.6 (linux/amd64) kubernetes/6260bb0" https://104.155.__XXX__.__XXX__/api

For the record, I have checked that this IP is the one I intended to connect to.

Mayur Nagekar

Apr 21, 2018, 3:50:13 PM
to kubernet...@googlegroups.com
Can you log in to the master and run `docker images` and `docker ps -a`? Is the control plane intact/sane?

--

Thanks,

Mayur

Mayur Nagekar

Apr 21, 2018, 3:51:28 PM
to kubernet...@googlegroups.com
Also, I'd highly recommend opening a bug in the k8s repo on GitHub. Responses are much faster there.
If you do, please let me know or, better, copy me. I am @miyurz there.
--

Thanks,

Mayur

Andrew Howden

Apr 22, 2018, 2:55:20 AM
to kubernet...@googlegroups.com
Hola

On Sat., 21 Apr. 2018, 9:50 pm Mayur Nagekar, <mayur....@gmail.com> wrote:
> Can you login into the master and run `docker images` and `docker ps -a` ? is the control plane intact/sane ?

Can't log in to the master; it's managed by Google Kubernetes Engine. Thus the bug ^^ It does seem 100% dead, though.
--
Andrew Howden
Careful, well crafted web development.
--
W: https://andrewhowden.com/
PGP: https://pgp.andrewhowden.com (79BAC08A6ED1FF1EABE350A7587D3B3A961D2D2D)
--

Mayur Nagekar

Apr 22, 2018, 10:12:01 AM
to kubernet...@googlegroups.com
Oh, if it's managed then you'll have to write to GKE support. There's not much one can do without ssh-ing into the master to investigate.

--

Thanks,

Mayur

Andrew Howden

Apr 22, 2018, 1:08:03 PM
to kubernet...@googlegroups.com
So far as I can see, there *is* no support 😞 


Mayur Nagekar

Apr 22, 2018, 2:29:57 PM
to kubernet...@googlegroups.com
I think you can try this, if you haven't already:

1. Sign in to the console.

2. On the left side of the top bar, select your project from Project List.

3. Open the menu (the "Gallery Menu" icon) at the top left, and click Support.

5. Your current support package will be displayed.

Reference: https://support.google.com/cloud/#topic=6255036&contact=1



--

Thanks,

Mayur

Andrew Howden

May 2, 2018, 12:33:22 PM
to kubernet...@googlegroups.com
