Fluentd / ElasticSearch / Kibana setup on GKE

Sean Jezewski

Aug 23, 2016, 7:18:20 PM
to Kubernetes user discussion and Q&A, seanwj...@gmail.com
I'm trying to set up logging on my GKE cluster by following the suggestions from this thread, but have hit a wall.

Specifically, I've:

1) Created a new cluster on GKE with Google Cloud Logging disabled
2) Ran `kubectl create -f ...` on each of the following 5 manifests (exact commands sketched below the list):

es-controller.yaml
es-service.yaml
kibana-controller.yaml
kibana-service.yaml

fluentd-es.yaml
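
In other words, nothing fancier than this (I didn't pass a namespace flag; the manifests put everything in kube-system, as the output below shows):

kubectl create -f es-controller.yaml
kubectl create -f es-service.yaml
kubectl create -f kibana-controller.yaml
kubectl create -f kibana-service.yaml
kubectl create -f fluentd-es.yaml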

However I'm seeing some odd results. My service/rc/pods for Kibana and Elasticsearch are created, then die off very quickly.

Here's what it looks like before those components die off:

$kubectl get all --namespace=kube-system
NAME                                                           DESIRED        CURRENT       AGE
elasticsearch-logging-v1                                       2              2             4s
kibana-logging-v1                                              1              1             3s
kube-dns-v17.1                                                 2              2             2h
kubernetes-dashboard-v1.1.1                                    1              1             2h
l7-default-backend-v1.0                                        1              1             2h
NAME                                                           CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
default-http-backend                                           10.3.245.50    <nodes>       80/TCP          2h
elasticsearch-logging                                          10.3.242.22    <none>        9200/TCP        5s
heapster                                                       10.3.246.252   <none>        80/TCP          2h
kibana-logging                                                 10.3.241.94    <nodes>       5601/TCP        3s
kube-dns                                                       10.3.240.10    <none>        53/UDP,53/TCP   2h
kubernetes-dashboard                                           10.3.242.135   <none>        80/TCP          2h
NAME                                                           READY          STATUS        RESTARTS        AGE
elasticsearch-logging-v1-lq5gx                                 1/1            Running       0               4s
elasticsearch-logging-v1-p78mn                                 1/1            Running       0               4s
fluentd-elasticsearch                                          1/1            Running       0               6s
heapster-v1.1.0-2096339923-4j1vu                               2/2            Running       0               2h
kibana-logging-v1-km83q                                        1/1            Running       0               3s
kube-dns-v17.1-aqteq                                           3/3            Running       0               2h
kube-dns-v17.1-qcm9z                                           3/3            Running       0               2h
kube-proxy-gke-pachyderm-log-test-default-pool-8429ab58-1sio   1/1            Running       0               2h
kube-proxy-gke-pachyderm-log-test-default-pool-8429ab58-b7tw   1/1            Running       0               2h
kube-proxy-gke-pachyderm-log-test-default-pool-8429ab58-eead   1/1            Running       0               2h
kubernetes-dashboard-v1.1.1-aajls                              1/1            Running       0               2h
l7-default-backend-v1.0-filol                                  1/1            Running       0               2h

Normally I'd get the logs using the `--previous` flag, but that doesn't seem to work here (presumably because the pods are deleted rather than restarted). If I'm quick enough, though, I can grab the logs before they disappear, and nothing looks out of the ordinary.

$kubectl logs elasticsearch-logging-v1-lq5gx --namespace=kube-system
I0823 21:16:06.814369       5 elasticsearch_logging_discovery.go:42] Kubernetes Elasticsearch logging discovery
I0823 21:16:07.829359       5 elasticsearch_logging_discovery.go:75] Found ["10.0.1.4" "10.0.2.6"]

Which looks pretty normal to me. Grabbing the logs from the kibana pod quickly, I see:

$kubectl logs kibana-logging-v1-km83q --namespace=kube-system
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
{"@timestamp":"2016-08-23T21:16:10.223Z","level":"error","node_env":"production","error":"Request error, retrying -- connect ECONNREFUSED"}
{"@timestamp":"2016-08-23T21:16:10.227Z","level":"warn","message":"Unable to revive connection: http://elasticsearch-logging:9200/","node_env":"production"}
{"@timestamp":"2016-08-23T21:16:10.227Z","level":"warn","message":"No living connections","node_env":"production"}
{"@timestamp":"2016-08-23T21:16:10.229Z","level":"info","message":"Unable to connect to elasticsearch at http://elasticsearch-logging:9200. Retrying in 2.5 seconds.","node_env":"production"}

Which makes sense, considering the Elasticsearch pods die off very quickly.
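
I suppose I could double-check connectivity from another pod with something like the following while everything is still up (syntax from memory; the pod name and busybox image are arbitrary, and the `.kube-system` suffix just qualifies the service's namespace), but I'd expect it to fail for the same reason:

kubectl run es-check --rm -it --restart=Never --image=busybox -- wget -qO- http://elasticsearch-logging.kube-system:9200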

The fluentd pod reports an error connecting to Elasticsearch, but again, that's not surprising given how quickly Elasticsearch dies off.

Clearly, I'm missing some steps in configuring these services to connect to each other. Any advice along these lines is appreciated.

I know that I should probably be using a DaemonSet to spin up the fluentd pod, but I thought I could get these components wired together before worrying about that step. Maybe that's not the case. Either way, I'd appreciate any pointers on setting that up as well.

Vishnu Kannan

Aug 23, 2016, 7:32:33 PM
to kubernet...@googlegroups.com, seanwj...@gmail.com, Piotr Szczesniak
+Piotr

Sean Jezewski

Sep 27, 2016, 7:58:23 PM
to Kubernetes user discussion and Q&A, seanwj...@gmail.com, pszcz...@google.com
Hey folks - anything I can do to help triage this? I'm still seeing the same behavior (the ES controller/service get created, then die quickly), but I now have a bit more in the logs from the ES logging pod (below).

I'm picking this project back up in earnest and need to get logging working.

$kubectl logs elasticsearch-logging-v1-wolo0 --namespace=kube-system
I0927 22:44:51.019820       5 elasticsearch_logging_discovery.go:42] Kubernetes Elasticsearch logging discovery
I0927 22:45:02.041152       5 elasticsearch_logging_discovery.go:75] Found ["10.0.1.4" "10.0.2.4"]
I0927 22:45:12.045874       5 elasticsearch_logging_discovery.go:75] Found ["10.0.1.4" "10.0.2.4"]
I0927 22:45:12.046053       5 elasticsearch_logging_discovery.go:87] Endpoints = ["10.0.1.4" "10.0.2.4"]
[2016-09-27 22:45:13,103][INFO ][node                     ] [Wysper] version[1.5.2], pid[9], build[62ff986/2015-04-27T09:21:06Z]
[2016-09-27 22:45:13,103][INFO ][node                     ] [Wysper] initializing ...
[2016-09-27 22:45:13,108][INFO ][plugins                  ] [Wysper] loaded [], sites []
[2016-09-27 22:45:16,172][INFO ][node                     ] [Wysper] initialized
[2016-09-27 22:45:16,173][INFO ][node                     ] [Wysper] starting ...
[2016-09-27 22:45:16,305][INFO ][transport                ] [Wysper] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.0.2.4:9300]}
[2016-09-27 22:45:16,314][INFO ][discovery                ] [Wysper] kubernetes-logging/o28T_eq1RgWNoKkjrq4sdA
[2016-09-27 22:45:19,390][INFO ][cluster.service          ] [Wysper] detected_master [Fixer][Nmfzb8bBTjedQrO_eDL8ig][elasticsearch-logging-v1-e7dvr][inet[/10.0.1.4:9300]]{master=true}, added {[Fixer][Nmfzb8bBTjedQrO_eDL8ig][elasticsearch-logging-v1-e7dvr][inet[/10.0.1.4:9300]]{master=true},}, reason: zen-disco-receive(from master [[Fixer][Nmfzb8bBTjedQrO_eDL8ig][elasticsearch-logging-v1-e7dvr][inet[/10.0.1.4:9300]]{master=true}])
[2016-09-27 22:45:19,397][INFO ][http                     ] [Wysper] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.0.2.4:9200]}
[2016-09-27 22:45:19,397][INFO ][node                     ] [Wysper] started



The only difference I can think of is that I'm not using the 'kube-up.sh' script to create my cluster (though I've confirmed that cloud logging is disabled on it). At this point that's the only thing left to try, apart from upgrading to k8s 1.4.
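
For anyone trying to reproduce, checking the logging setting should be something like the following (the cluster name comes from my node names above, <zone> is a placeholder, and I'm going from memory on the exact field name -- it should show a logging service of 'none' when cloud logging is off):

gcloud container clusters describe pachyderm-log-test --zone <zone> | grep -i logging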

Piotr Szczesniak

Sep 28, 2016, 12:32:30 PM
to Sean Jezewski, Kubernetes user discussion and Q&A, seanwj...@gmail.com
Hi Sean,

Apologies for the delay.

There are two problems with your setup:
1) fluentd-elasticsearch should run on every single node; in your setup it runs on only one. In Kubernetes it is currently a manifest pod that gets uploaded to every node during cluster setup. There is an effort to migrate it to a DaemonSet (see https://github.com/kubernetes/kubernetes/pull/32088); it will be hard to do this cleanly before #32088 is merged, but you can try to reuse the DaemonSet yaml file from that PR.
2) You are running all of the pods in the kube-system namespace. The addon manager will kill any pods in that namespace that were not created during cluster setup (see https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/addon-manager for details). You should use a different namespace. A rough sketch of both changes is below.
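
Untested, and with the image tag and volume mounts below as placeholders that you should replace with whatever is in the yaml attached to that PR:

# Use a dedicated namespace so the addon manager leaves your pods alone.
kubectl create namespace logging

# Recreate the Elasticsearch / Kibana pieces there (adjust or drop any
# metadata.namespace fields in the manifests first).
kubectl create --namespace=logging \
  -f es-controller.yaml -f es-service.yaml \
  -f kibana-controller.yaml -f kibana-service.yaml

# Run fluentd on every node as a DaemonSet instead of a single pod.
kubectl create --namespace=logging -f - <<EOF
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: gcr.io/google_containers/fluentd-elasticsearch:1.20   # placeholder tag
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
EOF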

Piotr

Sean Jezewski

Sep 28, 2016, 1:05:07 PM
to Piotr Szczesniak, Kubernetes user discussion and Q&A, Sean Jezewski
Ah, that makes sense. I'll try those suggestions and report back.

Thanks!
