Tectonic Terraform installer fails on Azure

Kai Timmer

Oct 6, 2017, 12:06:52 PM

to CoreOS User

I attempted a Tectonic cluster installation on Azure. In the end, the installer failed with the following message:

null_resource.tectonic (remote-exec): A dependency job for tectonic.service failed. See 'journalctl -xe' for details.

Error applying plan:

1 error(s) occurred:

* null_resource.tectonic: 1 error(s) occurred:

* Script exited with non-zero exit status: 1

Logged in on the master, I get the following output from journalctl:

Oct 06 16:03:52 tectonic-test-master-0 python[963]: 2017/10/06 16:03:52.421590 WARNING Failed to flush firewall

Oct 06 16:03:52 tectonic-test-master-0 kubelet-wrapper[933]: W1006 16:03:52.957218 933 cni.go:189] Unable to update cni config: No networks found in /etc/kubernetes/cni/net.d

Oct 06 16:03:52 tectonic-test-master-0 kubelet-wrapper[933]: E1006 16:03:52.957351 933 kubelet.go:2136] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Oct 06 16:03:55 tectonic-test-master-0 kubelet-wrapper[933]: E1006 16:03:55.675555 933 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:400: Failed to list *v1.Service: Get https://tectonic-test-api.docker-intl.example.com:443/api/v1/services?resourceVersion=0: dial tcp 13.90.197.190:443: i/o timeout

Oct 06 16:03:55 tectonic-test-master-0 kubelet-wrapper[933]: E1006 16:03:55.678435 933 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:408: Failed to list *v1.Node: Get https://tectonic-test-api.docker-intl.example.com:443/api/v1/nodes?fieldSelector=metadata.name%3Dtectonic-test-master-0&resourceVersion=0: dial tcp 13.90.197.190:443: i/o timeout

Oct 06 16:03:55 tectonic-test-master-0 kubelet-wrapper[933]: E1006 16:03:55.678448 933 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://tectonic-test-api.docker-intl.example.com:443/api/v1/pods?fieldSelector=spec.nodeName%3Dtectonic-test-master-0&resourceVersion=0: dial tcp 13.90.197.190:443: i/o timeout

What can I do to get Tectonic running?

Regards,

Kai

Rob Szumski

Oct 6, 2017, 3:18:26 PM

to Kai Timmer, CoreOS User

Hmm, looks like a bootstrapping failure. This is sometimes caused by slow S3 download speeds. Can you try restarting the machine and/or the tectonic service?

- Rob

--
You received this message because you are subscribed to the Google Groups "CoreOS User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to coreos-user...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kai Timmer

Oct 9, 2017, 5:11:52 AM

to CoreOS User

Hi,

I retried a bunch of times, always with the same result. I also tried a terraform destroy and restarted the whole process, but still ended up with this error.

Any other hints?

Thanks,

Kai

Rob Szumski

Oct 11, 2017, 2:42:58 PM

to Kai Timmer, CoreOS User

Are there any crashlooping pods? There is a root kubeconfig generated by the installer that should be able to give you access to use kubectl. If the API server is not up, something else must be going on.
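Since the kubelet log earlier in the thread shows `dial tcp ...:443: i/o timeout`, a plain TCP connect test can help tell whether the API endpoint is reachable at all before digging into pods. The following is only a minimal Python sketch, not part of the installer; the helper name `api_reachable` and the 5-second default timeout are illustrative choices.

```python
import socket


def api_reachable(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Illustrative helper: it only checks TCP reachability, not TLS or
    whether the Kubernetes API server is healthy.
    """
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Calling `api_reachable("tectonic-test-api.docker-intl.example.com")` (the hostname from the log above) from the master and from your workstation should show whether the timeout is specific to one side. A `False` result only says the port could not be reached; it does not say why.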

What does `systemctl status tectonic.service` show in terms of failures in the ExecStartPre section?

- Rob

Kai Timmer

Oct 17, 2017, 6:52:35 AM

to CoreOS User

Hi again,

So, with the new version of the Tectonic installer (tectonic_1.7.5-tectonic.1), the installer finishes fine, but I still can't get the tectonic service up and running.

At startup, tectonic.service complains about a missing dependency, bootkube.service, which can't be started and writes the following log:

eviction_manager.go:238] eviction manager: unexpected err: failed GetNode: node 'tectonic-test-master-0' not found

cni.go:189] Unable to update cni config: No networks found in /etc/kubernetes/cni/net.d

kubelet.go:2136] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready:

cni.go:189] Unable to update cni config: No networks found in /etc/kubernetes/cni/net.d

It seems like the problem is still the same as with the old installer, but the new one doesn't wait for the script to return and just "completes" successfully without verifying that the service is actually started.

Regards,

Kai

ste...@retrogaming.org

Oct 17, 2017, 11:16:35 AM

to CoreOS User

Hi,

Jon Mosco

Oct 19, 2017, 1:05:02 PM

to CoreOS User

Same issue here with VMware and the newest Tectonic installer. Tectonic reports success, but the nodes are not configured correctly and produce similar errors:

Oct 19 17:03:57 tectonic01 kubelet-wrapper[755]: W1019 17:03:57.394651 755 cni.go:189] Unable to update cni config: No networks found in /etc/kubernetes/cni/net.d

Oct 19 17:03:57 tectonic01 kubelet-wrapper[755]: E1019 17:03:57.395274 755 kubelet.go:2136] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Oct 19 17:03:57 tectonic01 kubelet-wrapper[755]: E1019 17:03:57.744746 755 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://controllers.example.com:443/api/v1/pods?fieldSelector=spec.nodeName%3Dtectonic01&resourceVersion=0: EOF

Alex Somesan

Oct 23, 2017, 2:28:13 PM

to CoreOS User

Hello Jon, Kai,

I suspect this to be a case of misconfigured/overlapping address ranges.

Can you both please post the contents of your tfvars files here (minus any sensitive information)? At the very least, please make sure to include all parameters that specify network address ranges.
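For illustration, an overlap check on such ranges could look like the minimal Python sketch below, which uses the standard `ipaddress` module. The two Tectonic values are the ones posted in this thread; `vnet_cidr` and its value are hypothetical placeholders for whatever address range your virtual network actually uses, and `find_overlaps` is an illustrative helper, not part of the installer.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges):
    """Return pairs of names whose CIDR ranges overlap.

    `ranges` maps a label to a CIDR string, e.g. {"a": "10.0.0.0/16"}.
    """
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]


ranges = {
    "tectonic_cluster_cidr": "10.2.0.0/16",  # value posted in this thread
    "tectonic_service_cidr": "10.3.0.0/16",  # value posted in this thread
    "vnet_cidr": "10.0.0.0/16",  # hypothetical: substitute your own range
}

print(find_overlaps(ranges))
```

With these three example values the list is empty, so none of them overlap. Any pair that does show up is worth a closer look.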

Also of interest is the log output of the flannel pods from the same nodes where you saw the tectonic.service failures.

Thank you!
Alex

Stefan Ernst

Oct 23, 2017, 4:42:06 PM

to CoreOS User

Hi,

so this is my tfvars. I left most of it at the defaults (copied from the example tfvars) and simply changed the logon / DNS config. I stripped out all the comments, so this is everything that is actually configured:

tectonic_admin_email = "admin@..."

tectonic_admin_password_hash = "...."

tectonic_azure_client_secret = "...."

tectonic_azure_external_dns_zone_id = "/subscriptions/..../resourceGroups/dns/providers/Microsoft.Network/dnszones/my.cloud"

tectonic_azure_location = "eastus"

tectonic_azure_ssh_key = "/Users/.../.ssh/id_rsa_azure.pub"

tectonic_base_domain = "my.cloud"

tectonic_calico_network_policy = false

tectonic_cl_channel = "stable"

tectonic_cluster_cidr = "10.2.0.0/16"

tectonic_cluster_name = "tectonic-test"

tectonic_etcd_count = "0"

tectonic_experimental = false

tectonic_license_path = "/Users/.../tectonic-test/tectonic-license.txt"

tectonic_master_count = "1"

tectonic_pull_secret_path = "/Users/.../tectonic-test/config.json"

tectonic_service_cidr = "10.3.0.0/16"

tectonic_stats_url = "https://stats-collector.tectonic.com"

tectonic_vanilla_k8s = false

tectonic_worker_count = "3"

Cheers

Stefan

Kai Timmer

Oct 24, 2017, 8:39:47 AM

to CoreOS User

Hello,

so this is my config file with all the values that are set to something other than the default:

tectonic_admin_email = "m...@email.com"

tectonic_admin_password_hash = "$hashedfoobar"

tectonic_azure_client_secret = "80f5911f-c926-4368-a9f2-9a9c768a836a"

tectonic_azure_cloud_environment = "AZUREPUBLICCLOUD"

tectonic_azure_etcd_storage_type = "Premium_LRS"

tectonic_azure_etcd_vm_size = "Standard_DS2_v2"

tectonic_azure_external_dns_zone_id = "/subscriptions/id/resourceGroups/docker-intl-test/providers/Microsoft.Network/dnszones/docker-intl.mydomain.rocks"

tectonic_azure_external_resource_group = "docker-intl-test"

tectonic_azure_location = "westeurope"

tectonic_azure_master_storage_type = "Premium_LRS"

tectonic_azure_master_vm_size = "Standard_DS2_v2"

tectonic_azure_ssh_key = "~/.ssh/id_rsa.pub"

tectonic_azure_worker_storage_type = "Premium_LRS"

tectonic_azure_worker_vm_size = "Standard_DS2_v2"

tectonic_base_domain = "docker-intl.mydomain.rocks"

tectonic_calico_network_policy = false

tectonic_cl_channel = "stable"

tectonic_cluster_cidr = "10.2.0.0/16"

tectonic_cluster_name = "tectonic-test"

tectonic_etcd_count = "0"

tectonic_experimental = false

tectonic_license_path = "~/tectonic/tectonic-license.txt"

tectonic_master_count = "1"

tectonic_pull_secret_path = "~/tectonic/pull-secret.json"

tectonic_service_cidr = "10.3.0.0/16"

tectonic_stats_url = "https://stats-collector.tectonic.com"

tectonic_vanilla_k8s = false

tectonic_worker_count = "2"

I hope this helps track the issue down.

Regards,

Kai

Alex Somesan

Oct 30, 2017, 11:51:40 AM

to CoreOS User

Kai,

Sorry for the late response.

On a very quick skim through your tfvars I see at least two issues:

1) The value for tectonic_azure_cloud_environment is incorrect. It should be one of the values detailed here: https://www.terraform.io/docs/providers/azurerm/index.html#environment

OTOH, for the public cloud you don't have to specify it, since that is the default in Terraform.

2) The paths to the various files requested by the installer (SSH key, Tectonic license and pull secret) have to be absolute, not relative. This is a known limitation of Terraform.

So instead of tectonic_azure_ssh_key = "~/.ssh/id_rsa.pub" you actually need to have tectonic_azure_ssh_key = "/path/to/home/directory/.ssh/id_rsa.pub"

The same goes for tectonic_license_path and tectonic_pull_secret_path.
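If you would rather not edit the paths by hand, a small script can expand the leading `~` for you. This is only a minimal Python sketch, assuming a POSIX system and simple `key = "value"` lines like the ones in the tfvars above; `expand_tilde_paths` is an illustrative helper, not an installer feature.

```python
import os
import re

# Matches lines such as: tectonic_license_path = "~/tectonic/license.txt"
# Group 1: key, equals sign and opening quote; group 2: the ~ path;
# group 3: closing quote and anything after it.
_TILDE_VALUE = re.compile(r'^(\s*\w+\s*=\s*")(~[^"]*)(".*)$')


def expand_tilde_paths(tfvars_text):
    """Return tfvars text with quoted values starting with ~ made absolute."""
    out = []
    for line in tfvars_text.splitlines():
        m = _TILDE_VALUE.match(line)
        if m:
            # os.path.expanduser replaces a leading ~ with the home directory.
            line = m.group(1) + os.path.expanduser(m.group(2)) + m.group(3)
        out.append(line)
    return "\n".join(out)
```

Read your tfvars file into a string, pass it through `expand_tilde_paths`, and write the result back. Lines whose values do not start with `~` are left untouched.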

Let me know if that helped.

Alex
