VOLTHA deployment in a 3-node Kubernetes cluster


Jerry Travlos

Sep 5, 2018, 7:34:02 AM
to VOLTHA Discuss
Hello,

I'm trying to deploy VOLTHA in a 3-node Kubernetes cluster.
The 3 target nodes are Ubuntu Server 16.04 VMs running on a bare-metal server.
I'm using a second bare-metal server (Ubuntu 16.04) as a development machine.
All machines can ping each other.

I installed Kubernetes on the 3-node cluster by following https://guide.opencord.org/prereqs/k8s-multi-node.html.
Then I installed kubectl by following https://kubernetes.io/docs/tasks/tools/install-kubectl/ and Helm by following https://guide.opencord.org/prereqs/helm.html.
Finally, I'm trying to deploy VOLTHA by following https://guide.opencord.org/charts/voltha.html.

Could someone clarify the following:

1. Should the process of installing the VOLTHA Helm chart (https://guide.opencord.org/charts/voltha.html) be repeated on all target nodes?

2. What if I want to test some code changes and need to redeploy the modified VOLTHA?

3. Is this the proper way to deploy VOLTHA in a multi-node Kubernetes cluster, or should the installer-script approach described in voltha/install/BuildingTheInstaller.md be followed?

Thanks
Jerry

David Bainbridge

Sep 5, 2018, 9:44:45 AM
to Jerry Travlos, VOLTHA Discuss
On Wed, Sep 5, 2018 at 4:34 AM Jerry Travlos <makis....@gmail.com> wrote:
1. Should the process of installing the VOLTHA Helm chart (https://guide.opencord.org/charts/voltha.html) be repeated on all target nodes?

The Helm command only needs to be executed on one of the nodes.
 
2. What if I want to test some code changes and need to redeploy the modified VOLTHA?

This gets a little trickier, and there are likely multiple ways to accomplish what you want. In essence, you need to create a new version of the Docker image that contains your changes and update the Kubernetes cluster with that new image. It might be easiest to delete the VOLTHA instance and just reinstall it via Helm. You won't need to re-install Kubernetes, just VOLTHA using Helm.
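
A minimal sketch of that rebuild-and-reinstall loop, under stated assumptions: the registry address, image name and tag below are hypothetical placeholders, a plain `docker build` is assumed to be enough to produce the image, and Helm 2 syntax is assumed. The chart also has to be pointed at the new image through its values, which is not shown here. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical placeholders; adjust to your own registry and image.
REGISTRY="${REGISTRY:-localhost:5000}"
IMAGE="${IMAGE:-voltha-voltha}"
TAG="${TAG:-dev}"

# Print each command by default; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

redeploy() {
    # Build and publish an image that contains the code changes.
    run docker build -t "$REGISTRY/$IMAGE:$TAG" .
    run docker push "$REGISTRY/$IMAGE:$TAG"
    # Remove the existing release, then install the chart again (Helm 2 syntax).
    run helm delete --purge voltha
    run helm install -n voltha voltha
}

redeploy
```

Kubernetes itself is left untouched; only the VOLTHA release is removed and installed again.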
 
3. Is this the proper way to deploy VOLTHA in a multi-node Kubernetes cluster, or should the installer-script approach described in voltha/install/BuildingTheInstaller.md be followed?

Using Helm is the correct way.
 

--
You received this message because you are subscribed to the Google Groups "VOLTHA Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to voltha-discus...@opencord.org.
To post to this group, send email to voltha-...@opencord.org.
Visit this group at https://groups.google.com/a/opencord.org/group/voltha-discuss/.
To view this discussion on the web visit https://groups.google.com/a/opencord.org/d/msgid/voltha-discuss/3730a328-d23d-4ba9-a90a-c2ab4ce86efe%40opencord.org.
For more options, visit https://groups.google.com/a/opencord.org/d/optout.

makis....@gmail.com

Sep 17, 2018, 7:11:35 AM
to VOLTHA Discuss, makis....@gmail.com
Hi,

I deployed VOLTHA in a 3-node cluster ("helm install -n voltha voltha") using a local Docker registry with images from VOLTHA 1.4.

The voltha container fails to start because the lookup of the KV store's IP address fails:
cord@node1:~/helm-charts$ kubectl get pod -n voltha
NAME                                        READY     STATUS             RESTARTS   AGE
default-http-backend-5c6d95c48-mprgr        1/1       Running            0          15m
freeradius-6d49d9588b-xxtxx                 1/1       Running            0          15m
netconf-75796c6558-pvpfr                    1/1       Running            0          15m
nginx-ingress-controller-566c84c9fd-frwwz   1/1       Running            0          15m
ofagent-57b8c8d77d-rxsh7                    1/1       Running            0          15m
vcli-5dd959d78f-mmms2                       1/1       Running            0          15m
vcore-0                                     1/1       Running            0          15m
voltha-6dd5f6d69-5zj8x                      0/1       CrashLoopBackOff   4          15m

cord@node1:~/helm-charts$ kubectl -n voltha logs voltha-6dd5f6d69-7q2qm
2018-09-13 11:17:23.600161 I | KV-store etcd at etcd-cluster.default.svc.cluster.local:2379
2018-09-13 11:17:35.811178 I | etcd-cluster.default.svc.cluster.local name resolution failed 1 time(s) retrying...
2018-09-13 11:17:46.946168 I | etcd-cluster.default.svc.cluster.local name resolution failed 2 time(s) retrying...
2018-09-13 11:17:56.048312 I | etcd-cluster.default.svc.cluster.local name resolution failed 3 time(s) retrying...
2018-09-13 11:17:58.053973 I | etcd-cluster.default.svc.cluster.local name resolution failed 4 time(s) retrying...
2018-09-13 11:18:00.058159 I | etcd-cluster.default.svc.cluster.local name resolution failed 5 time(s) retrying...
2018-09-13 11:18:02.158812 I | etcd-cluster.default.svc.cluster.local name resolution failed 6 time(s) retrying...
2018-09-13 11:18:04.163029 I | etcd-cluster.default.svc.cluster.local name resolution failed 7 time(s) retrying...
2018-09-13 11:18:06.231449 I | etcd-cluster.default.svc.cluster.local name resolution failed 8 time(s) retrying...
2018-09-13 11:18:08.235489 I | etcd-cluster.default.svc.cluster.local name resolution failed 9 time(s) retrying...
2018-09-13 11:18:10.239725 I | etcd-cluster.default.svc.cluster.local name resolution failed 10 time(s) retrying...
2018-09-13 11:18:12.243695 I | etcd-cluster.default.svc.cluster.local name resolution failed 10 times giving up
2018-09-13 11:18:12.243735 I | Can't proceed without KV store's vIP address: %slookup etcd-cluster.default.svc.cluster.local on 10.233.0.3:53: no such host

It seems that nslookup also fails from inside a busybox pod:
cord@node1:~/helm-charts$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server:    10.233.0.3
Address 1: 10.233.0.3 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'kubernetes.default'
command terminated with exit code 1

Here is resolv.conf:
cord@node1:~/helm-charts$ kubectl exec busybox cat /etc/resolv.conf
nameserver 10.233.0.3
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
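
For reference, a small sketch of how the search list and ndots setting above expand a short name. It assumes glibc-style resolver rules (names with fewer dots than ndots go through the search list first); the busybox nslookup used in the test may not follow them exactly. The function name is illustrative.

```shell
#!/bin/sh
# Values copied from the resolv.conf shown above.
SEARCH="default.svc.cluster.local svc.cluster.local cluster.local"
NDOTS=5

# Print, in order, the names a glibc-style resolver would try for $1.
candidates() {
    name=$1
    case $name in
        *.) echo "$name"; return ;;   # a trailing dot means the name is absolute
    esac
    only_dots=$(printf '%s' "$name" | tr -cd '.')
    dots=${#only_dots}
    if [ "$dots" -lt "$NDOTS" ]; then
        # Fewer dots than ndots: search suffixes first, literal name last.
        for suffix in $SEARCH; do echo "$name.$suffix"; done
        echo "$name"
    else
        # Enough dots: literal name first, then the search suffixes.
        echo "$name"
        for suffix in $SEARCH; do echo "$name.$suffix"; done
    fi
}

candidates kubernetes.default
```

Under these rules the second candidate is kubernetes.default.svc.cluster.local, which is the name the kube-dns sidecar probes, so querying that fully qualified name directly helps separate a search-list problem from a connectivity problem.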

But I don't see any failures in the kube-dns pods:
cord@node1:~/helm-charts$ kubectl get pod -n kube-system
NAME                                    READY     STATUS    RESTARTS   AGE
calico-node-6spkf                       1/1       Running   0          17d
calico-node-qqdb2                       1/1       Running   0          17d
calico-node-zwkws                       1/1       Running   1          17d
kube-apiserver-node1                    1/1       Running   1          17d
kube-apiserver-node2                    1/1       Running   1          17d
kube-controller-manager-node1           1/1       Running   0          17d
kube-controller-manager-node2           1/1       Running   0          17d
kube-dns-7bd4d5fbb6-85glj               3/3       Running   13         2d
kube-dns-7bd4d5fbb6-c6tlh               3/3       Running   18         2d
kube-proxy-node1                        1/1       Running   1          17d
kube-proxy-node2                        1/1       Running   1          17d
kube-proxy-node3                        1/1       Running   1          17d
kube-scheduler-node1                    1/1       Running   0          17d
kube-scheduler-node2                    1/1       Running   0          17d
kubedns-autoscaler-679b8b455-c89zg      1/1       Running   0          17d
kubernetes-dashboard-55fdfd74b4-gx6gg   1/1       Running   0          17d
nginx-proxy-node3                       1/1       Running   1          17d
tiller-deploy-5c688d5f9b-sshbh          1/1       Running   0          17d

cord@node1:~/helm-charts$ kubectl -n kube-system logs kube-dns-7bd4d5fbb6-c6tlh kubedns
I0914 13:30:50.684351       1 dns.go:48] version: 1.14.10
I0914 13:30:50.689707       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0914 13:30:50.689784       1 server.go:121] FLAG: --alsologtostderr="false"
I0914 13:30:50.689798       1 server.go:121] FLAG: --config-dir="/kube-dns-config"
I0914 13:30:50.689807       1 server.go:121] FLAG: --config-map=""
I0914 13:30:50.689813       1 server.go:121] FLAG: --config-map-namespace="kube-system"
I0914 13:30:50.689818       1 server.go:121] FLAG: --config-period="10s"
I0914 13:30:50.689826       1 server.go:121] FLAG: --dns-bind-address="0.0.0.0"
I0914 13:30:50.689832       1 server.go:121] FLAG: --dns-port="10053"
I0914 13:30:50.689841       1 server.go:121] FLAG: --domain="cluster.local."
I0914 13:30:50.689849       1 server.go:121] FLAG: --federations=""
I0914 13:30:50.689857       1 server.go:121] FLAG: --healthz-port="8081"
I0914 13:30:50.689863       1 server.go:121] FLAG: --initial-sync-timeout="1m0s"
I0914 13:30:50.689868       1 server.go:121] FLAG: --kube-master-url=""
I0914 13:30:50.689875       1 server.go:121] FLAG: --kubecfg-file=""
I0914 13:30:50.689881       1 server.go:121] FLAG: --log-backtrace-at=":0"
I0914 13:30:50.689889       1 server.go:121] FLAG: --log-dir=""
I0914 13:30:50.689896       1 server.go:121] FLAG: --log-flush-frequency="5s"
I0914 13:30:50.689901       1 server.go:121] FLAG: --logtostderr="true"
I0914 13:30:50.689907       1 server.go:121] FLAG: --nameservers=""
I0914 13:30:50.689912       1 server.go:121] FLAG: --stderrthreshold="2"
I0914 13:30:50.689918       1 server.go:121] FLAG: --v="2"
I0914 13:30:50.689923       1 server.go:121] FLAG: --version="false"
I0914 13:30:50.689932       1 server.go:121] FLAG: --vmodule=""
I0914 13:30:50.690050       1 server.go:169] Starting SkyDNS server (0.0.0.0:10053)
I0914 13:30:50.690382       1 server.go:179] Skydns metrics enabled (/metrics:10055)
I0914 13:30:50.690396       1 dns.go:188] Starting endpointsController
I0914 13:30:50.690402       1 dns.go:191] Starting serviceController
I0914 13:30:50.690502       1 dns.go:184] Configuration updated: {TypeMeta:{Kind: APIVersion:} Federations:map[] StubDomains:map[] UpstreamNameservers:[]}
I0914 13:30:50.693195       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0914 13:30:50.693222       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0914 13:30:51.197440       1 dns.go:222] Initialized services and endpoints from apiserver
I0914 13:30:51.197469       1 server.go:137] Setting up Healthz Handler (/readiness)
I0914 13:30:51.197485       1 server.go:142] Setting up cache handler (/cache)
I0914 13:30:51.197494       1 server.go:128] Status HTTP port 8081
I0917 09:25:15.717215       1 dns.go:601] Could not find endpoints for service "freeradius" in namespace "voltha". DNS records will be created once endpoints show up.
I0917 09:25:15.811153       1 dns.go:601] Could not find endpoints for service "netconf" in namespace "voltha". DNS records will be created once endpoints show up.
I0917 09:25:16.039319       1 dns.go:601] Could not find endpoints for service "vcore" in namespace "voltha". DNS records will be created once endpoints show up.
I0917 09:26:09.983292       1 dns.go:601] Could not find endpoints for service "etcd-cluster" in namespace "default". DNS records will be created once endpoints show up.
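
The last log line says kube-dns has no endpoints for the etcd-cluster service, so no DNS record exists for it yet. A sketch that splits the name voltha is looking up into service and namespace and prints the kubectl checks worth running; the function name is illustrative, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Split a name of the form <service>.<namespace>.svc.<domain> and print
# kubectl commands that inspect the Service, its Endpoints and the pods.
checks_for() {
    fqdn=$1
    service=${fqdn%%.*}      # label before the first dot
    rest=${fqdn#*.}          # everything after the first dot
    namespace=${rest%%.*}    # second label
    echo "kubectl get svc $service -n $namespace"
    echo "kubectl get ep $service -n $namespace"
    echo "kubectl get pods -n $namespace -o wide"
}

checks_for etcd-cluster.default.svc.cluster.local
```

If the Endpoints object is empty, the name cannot resolve regardless of how healthy kube-dns itself is.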


cord@node1:~/helm-charts$ kubectl -n kube-system logs kube-dns-7bd4d5fbb6-c6tlh dnsmasq
I0916 17:51:08.776540       1 main.go:74] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I0916 17:51:08.776929       1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I0916 17:51:08.869265       1 nanny.go:119]
W0916 17:51:08.869293       1 nanny.go:120] Got EOF from stdout
I0916 17:51:08.869352       1 nanny.go:116] dnsmasq[9]: started, version 2.78 cachesize 1000
I0916 17:51:08.869390       1 nanny.go:116] dnsmasq[9]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I0916 17:51:08.869421       1 nanny.go:116] dnsmasq[9]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0916 17:51:08.869446       1 nanny.go:116] dnsmasq[9]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0916 17:51:08.869469       1 nanny.go:116] dnsmasq[9]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0916 17:51:08.869535       1 nanny.go:116] dnsmasq[9]: reading /etc/resolv.conf
I0916 17:51:08.869563       1 nanny.go:116] dnsmasq[9]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0916 17:51:08.869586       1 nanny.go:116] dnsmasq[9]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0916 17:51:08.869609       1 nanny.go:116] dnsmasq[9]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0916 17:51:08.869633       1 nanny.go:116] dnsmasq[9]: using nameserver 10.233.0.3#53
I0916 17:51:08.869714       1 nanny.go:116] dnsmasq[9]: read /etc/hosts - 7 addresses

cord@node1:~/helm-charts$ kubectl -n kube-system logs kube-dns-7bd4d5fbb6-c6tlh sidecar
I0916 17:51:15.199545       1 main.go:51] Version v1.14.8.3
I0916 17:51:15.199834       1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
I0916 17:51:15.199947       1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
I0916 17:51:15.203488       1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}

DNS service is up:
cord@node1:~/helm-charts$ kubectl get svc --namespace=kube-system
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.233.0.3      <none>        53/UDP,53/TCP   17d
kubernetes-dashboard   ClusterIP   10.233.55.167   <none>        443/TCP         17d
tiller-deploy          ClusterIP   10.233.31.12    <none>        44134/TCP       17d

DNS endpoints are exposed:
cord@node1:~/helm-charts$ kubectl get ep kube-dns --namespace=kube-system
NAME       ENDPOINTS                                                     AGE
kube-dns   10.233.71.10:53,10.233.71.14:53,10.233.71.10:53 + 1 more...   17d

The VM is running Ubuntu server 16.04 and uses a static IP address:
cord@node1:~/helm-charts$ cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto enp0s8
iface enp0s8 inet static
address 10.85.185.188
gateway 10.85.185.145
netmask 255.255.255.224
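
A quick sanity check on the addresses as posted (they may have been altered before posting): the sketch below computes the network address of the interface and of the gateway under the given netmask. The function names are illustrative.

```shell
#!/bin/sh
# Print the network address for a dotted-quad IPv4 address ($1) and netmask ($2).
network_of() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both values on dots into eight positional parameters.
    set -- $ip $mask
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Succeed when addresses $1 and $2 share a network under netmask $3.
same_subnet() {
    [ "$(network_of "$1" "$3")" = "$(network_of "$2" "$3")" ]
}

network_of 10.85.185.188 255.255.255.224   # address: prints 10.85.185.160
network_of 10.85.185.145 255.255.255.224   # gateway: prints 10.85.185.128

if same_subnet 10.85.185.188 10.85.185.145 255.255.255.224; then
    echo "gateway is inside the interface subnet"
else
    echo "gateway is outside the interface subnet"
fi
```

With the values shown, the gateway falls in a different /27 than the interface address, so the netmask and gateway are worth double-checking on each node.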

Version of kubectl:
cord@node1:~/helm-charts$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:00:59Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:00:59Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

I have already tried various solutions suggested on the internet, with no luck.
Could some DNS configuration be missing or wrong?
Any ideas on how to solve this, or what to check further?

Thanks

makis....@gmail.com

Sep 24, 2018, 10:11:29 AM
to VOLTHA Discuss, makis....@gmail.com
Hi,

Could someone please give a hint on the following:

The voltha container no longer crashes,
but etcd-operator logs errors and warnings such as these (full etcd-operator log attached):
time="2018-09-24T11:35:00Z" level=error msg="failed to reconcile: fail to add new member (etcd-cluster-0001): context deadline exceeded" cluster-name=etcd-cluster pkg=cluster

time="2018-09-24T11:35:10Z" level=error msg="failed to reconcile: lost quorum" cluster-name=etcd-cluster pkg=cluster

time="2018-09-24T11:36:13Z" level=error msg="failed to update members: list members failed: creating etcd client failed: grpc: timed out when dialing" cluster-name=etcd-cluster pkg=cluster

time="2018-09-24T11:38:37Z" level=warning msg="all etcd pods are dead." cluster-name=etcd-cluster pkg=cluster
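
To get an overview of a log like this, a small sketch that counts lines per log level. It assumes every line carries a `level=<name>` field followed by a space, as in the excerpt above; the function name is illustrative.

```shell
#!/bin/sh
# Read log lines on stdin and print one "<level>=<count>" line per level.
count_levels() {
    sed -n 's/.* level=\([a-z]*\) .*/\1/p' | sort | uniq -c | awk '{print $2 "=" $1}'
}

# The four lines quoted above; prints error=3 and warning=1.
count_levels <<'EOF'
time="2018-09-24T11:35:00Z" level=error msg="failed to reconcile: fail to add new member (etcd-cluster-0001): context deadline exceeded" cluster-name=etcd-cluster pkg=cluster
time="2018-09-24T11:35:10Z" level=error msg="failed to reconcile: lost quorum" cluster-name=etcd-cluster pkg=cluster
time="2018-09-24T11:36:13Z" level=error msg="failed to update members: list members failed: creating etcd client failed: grpc: timed out when dialing" cluster-name=etcd-cluster pkg=cluster
time="2018-09-24T11:38:37Z" level=warning msg="all etcd pods are dead." cluster-name=etcd-cluster pkg=cluster
EOF
```

The same function can be run over the whole attachment with `count_levels < voltha-etcd-operator.log`.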

Nevertheless, voltha-etcd-operator is in the Running state.
Can we continue with VOLTHA testing despite these errors and warnings?

We have also applied the workaround for the known etcd-operator bug described in https://guide.opencord.org/charts/voltha.html.
Could the errors above be related to this bug?

Thanks,
Jerry
voltha-etcd-operator.log

makis....@gmail.com

Sep 28, 2018, 7:18:42 AM
to VOLTHA Discuss, makis....@gmail.com

Saurav Das

Sep 28, 2018, 1:48:41 PM
to makis....@gmail.com, VOLTHA Discuss
Questions regarding the deployment of VOLTHA in SEBA/CORD are best asked on the SEBA mailing list.

