Error in preflight while creating cluster on Baremetal-VMs


Abhishek kaushal

May 5, 2021, 9:14:11 AM
to gce-discussion

Error in preflight while creating cluster on Bare metal

I am trying to create an on-prem cluster on bare metal with the following infrastructure:

Master node: 10.10.100.181
Worker node: 10.10.100.180
VIP details:
    10.10.100.180   10.10.100.181
2   10.10.100.41    10.10.100.46
3   10.10.100.42    10.10.100.47
4   10.10.100.43    10.10.100.48
5   10.10.100.44    10.10.100.49
6   10.10.100.45    10.10.100.50

I get an error when I try to create the cluster with this command: ./bmctl create cluster -c cluster1

Error log:
File name: node-network.log
TASK [vip_test : fail] *********************************************************
fatal: [10.10.100.181]: FAILED! => {"changed": false, "msg": "vip 10.10.100.41 is already used (should not be pingable)"}
fatal: [10.10.100.180]: FAILED! => {"changed": false, "msg": "vip 10.10.100.41 is already used (should not be pingable)"}
fatal: [localhost]: FAILED! => {"changed": false, "msg": "vip 10.10.100.41 is already used (should not be pingable)"}

NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
10.10.100.180              : ok=21   changed=7    unreachable=0    failed=1    skipped=8    rescued=0    ignored=0   
10.10.100.181              : ok=25   changed=9    unreachable=0    failed=1    skipped=7    rescued=0    ignored=0   
localhost                  : ok=15   changed=6    unreachable=0    failed=1    skipped=13   rescued=0    ignored=0   

F0504 17:02:50.723876       1 main.go:48] exit status 2

File name: 10.10.100.180.log

TASK [machine_preflight_check : Run kubeadm join preflight check] **************
fatal: [10.10.100.181]: FAILED! => {"changed": true, "cmd": "/usr/bin/kubeadm join phase preflight --config /tmp/kubeadm_join_defaults.yaml --ignore-preflight-errors=FileExisting-crictl,FileExisting-conntrack,FileExisting-ethtool,FileExisting-ip,FileExisting-iptables,FileExisting-socat,FileExisting-tc,KubeletVersion,FileContent--proc-sys-net-bridge-bridge-nf-call-iptables,FileContent--proc-sys-net-ipv4-ip_forward,Swap,ImagePull,SystemVerification", "delta": "0:00:01.547685", "end": "2021-05-04 17:04:54.244483", "msg": "non-zero return code", "rc": 1, "start": "2021-05-04 17:04:52.696798", "stderr": "W0504 17:04:52.777586   49628 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.\n\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/\n\t[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.2. Latest validated version: 19.03\nerror execution phase preflight: couldn't validate the identity of the API Server: Get https://kube-apiserver:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: dial tcp: lookup kube-apiserver on 10.10.100.101:53: no such host\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["W0504 17:04:52.777586   49628 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.", "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/", "\t[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.2. Latest validated version: 19.03", "error execution phase preflight: couldn't validate the identity of the API Server: Get https://kube-apiserver:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: dial tcp: lookup kube-apiserver on 10.10.100.101:53: no such host", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "[preflight] Running pre-flight checks", "stdout_lines": ["[preflight] Running pre-flight checks"]}
...ignoring

TASK [machine_preflight_check : Check kubeadm join preflight result] ***********
skipping: [10.10.100.181] => (item=W0504 17:04:52.777586   49628 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.) 
skipping: [10.10.100.181] => (item= [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/) 
skipping: [10.10.100.181] => (item= [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.2. Latest validated version: 19.03) 
skipping: [10.10.100.181] => (item=error execution phase preflight: couldn't validate the identity of the API Server: Get https://kube-apiserver:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: dial tcp: lookup kube-apiserver on 10.10.100.101:53: no such host) 
skipping: [10.10.100.181] => (item=To see the stack trace of this error execute with --v=5 or higher) 

TASK [Clear artifacts] *********************************************************
changed: [10.10.100.181]

TASK [Summarize preflight check] ***********************************************
ok: [10.10.100.181] => {
    "results": {
        "check_apparmor_pass": true,
        "check_cgroup_pass": true,
        "check_disks_pass": "false",
        "check_dns_pass": true,
        "check_firewall_pass": true,
        "check_gcr_pass": true,
        "check_googleapis_pass": true,
        "check_kubeadm_pass": true,
        "check_package_availability_pass": true,
        "check_package_repo_update_pass": true,
        "check_podman_absence_pass": true,
        "check_selinux_pass": true,
        "check_time_sync_pass": true,
        "check_ufw_pass": true
    }
}

TASK [Look for failed checks that are not ignored] *****************************
skipping: [10.10.100.181] => (item={'key': 'check_selinux_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_time_sync_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_apparmor_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_ufw_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_firewall_pass', 'value': True}) 
ok: [10.10.100.181] => (item={'key': 'check_disks_pass', 'value': 'false'})
skipping: [10.10.100.181] => (item={'key': 'check_dns_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_cgroup_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_gcr_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_googleapis_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_kubeadm_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_podman_absence_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_package_repo_update_pass', 'value': True}) 
skipping: [10.10.100.181] => (item={'key': 'check_package_availability_pass', 'value': True}) 

TASK [Reach a final verdict] ***************************************************
fatal: [10.10.100.181]: FAILED! => {"changed": false, "msg": "preflight check failed"}

PLAY RECAP *********************************************************************
10.10.100.181              : ok=40   changed=16   unreachable=0    failed=1    skipped=22   rescued=0    ignored=3   

F0504 17:04:55.421867       1 main.go:201] exit status 2




The cluster YAML file is cluster1.yaml:

gcrKeyPath: /home/<<USER>>/baremetal/bmctl-workspace/.sa-keys/anthos-sandbox-anthos-baremetal-gcr.json
sshPrivateKeyPath: /home/<<USER>>/.ssh/id_rsa
gkeConnectAgentServiceAccountKeyPath: /home/<<USER>>/baremetal/bmctl-workspace/.sa-keys/anthos-sandbox-anthos-baremetal-connect.json
gkeConnectRegisterServiceAccountKeyPath: /home/<<USER>>/baremetal/bmctl-workspace/.sa-keys/panthos-sandbox-anthos-baremetal-register.json
cloudOperationsServiceAccountKeyPath: /home/<<USER>>/baremetal/bmctl-workspace/.sa-keys/anthos-sandbox-anthos-baremetal-cloud-ops.json
---
apiVersion: v1
kind: Namespace
metadata:
  name: cluster-cluster1
---
kind: Cluster
metadata:
  name: cluster1
  namespace: cluster-cluster1
spec:
  type: hybrid
  anthosBareMetalVersion: 1.6.2
  gkeConnect:
    projectID: anthos-sandbox
  controlPlane:
    nodePoolSpec:
      nodes:
      - address: 10.10.100.181
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  loadBalancer:
    mode: bundled
    ports:
      controlPlaneLBPort: 443
    vips:
      controlPlaneVIP: 10.10.100.41
      ingressVIP: 10.10.100.42
    addressPools:
    - name: pool1
      addresses:
      - 10.10.100.42-10.10.100.50
    projectID: pk-anthos-sandbox
    location: us-central1
  storage:
    lvpNodeMounts:
      path: /mnt/localpv-disk
      storageClassName: local-disks
    lvpShare:
      path: /mnt/localpv-share
      storageClassName: local-shared
      numPVUnderSharedPath: 5
---
kind: NodePool
metadata:
  name: node-pool-1
  namespace: cluster-cluster1
spec:
  clusterName: cluster1
  nodes:
  - address: 10.10.100.180

Adebisi Ibirogba

May 5, 2021, 5:04:49 PM
to gce-discussion
Thanks for reporting this issue.

I read through your post and noticed that your deployment is failing at the preflight VIP check with the message "vip 10.10.100.41 is already used (should not be pingable)".
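
As a quick sanity check on the posted config itself: as far as I know, the cluster config template expects controlPlaneVIP to sit outside the load balancer address pools and ingressVIP to sit inside one of them. Here is a minimal Python sketch of that check using only the standard library. The function names (expand_range, in_pools, check_vips) are made up for illustration and are not part of bmctl.

```python
import ipaddress


def expand_range(spec):
    """Return (first, last) addresses for an 'a-b' range or a CIDR block."""
    if "-" in spec:
        first, last = (ipaddress.ip_address(part.strip()) for part in spec.split("-", 1))
        return first, last
    network = ipaddress.ip_network(spec, strict=False)
    return network[0], network[-1]


def in_pools(vip, pools):
    """True if vip falls inside any of the given address pool entries."""
    address = ipaddress.ip_address(vip)
    return any(first <= address <= last for first, last in map(expand_range, pools))


def check_vips(control_plane_vip, ingress_vip, pools):
    """Return a list of problems found; an empty list means the layout looks consistent."""
    problems = []
    if in_pools(control_plane_vip, pools):
        problems.append("controlPlaneVIP must not be inside an address pool")
    if not in_pools(ingress_vip, pools):
        problems.append("ingressVIP must be inside an address pool")
    return problems


# Values copied from the posted cluster1 config.
print(check_vips("10.10.100.41", "10.10.100.42", ["10.10.100.42-10.10.100.50"]))  # prints []
```

With the values from your config this returns an empty list, so the VIP layout itself looks consistent, and the failure points to something on the network already answering on 10.10.100.41.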

I then looked up the error and found a recommendation to use the --force flag as a first step:

      bmctl create cluster -c cluster_name --force

It looks like the cluster was never created, so there is no need to reset it first.
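
Before forcing past the check, it may be worth confirming whether the VIP really answers ping from the workstation, since the message says it "should not be pingable". Below is a rough Python sketch of such a probe. It is not how bmctl implements its preflight check (I have not seen that code); ping_once and vips_in_use are hypothetical helper names, and the ping flags assume the Linux iputils version of ping.

```python
import subprocess


def ping_once(address, timeout_s=1):
    """Return True if `address` answers one ICMP echo request.

    Assumes Linux iputils ping: -c is the packet count, -W the reply timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def vips_in_use(vips, probe=ping_once):
    """Return the VIPs that respond to the probe, i.e. that something already holds."""
    return [vip for vip in vips if probe(vip)]


# Example, run from the admin workstation with the VIPs from the cluster config:
#   print(vips_in_use(["10.10.100.41", "10.10.100.42"]))
```

If 10.10.100.41 shows up in the result, some other device or a leftover from an earlier attempt is holding the address. In that case picking a free address for controlPlaneVIP, or releasing the one in use, is probably a better fix than --force.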

If the above does not help, feel free to open an issue on the public issue tracker, where an engineer will be able to assist.