Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-09T21:51:06Z", GoVersion:"go1.9.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.3", GitCommit:"f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd", GitTreeState:"clean", BuildDate:"2017-11-08T18:27:48Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
When I ran spark-submit against the k8s master, the driver pod got stuck in the Waiting: PodInitializing state.
I ran kubectl describe node on the node where the driver pod is running, and this is what I got. I do see that resources are overcommitted, but I expected the Kubernetes scheduler not to schedule a pod onto a node whose resources are overcommitted or that is in the Not Ready state. In this case the node is in the Ready state, but I observe the same behaviour when the node is "Not Ready".
Name: **********
Roles: worker
Labels: beta.kubernetes.io/arch=amd64
node-role.kubernetes.io/worker=true
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: <none>
CreationTimestamp: Tue, 31 Jul 2018 09:59:24 -0400
Conditions:
Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
----            ------  -----------------                ------------------               ------                      -------
OutOfDisk       False   Tue, 14 Aug 2018 09:31:20 -0400  Tue, 31 Jul 2018 09:59:24 -0400  KubeletHasSufficientDisk    kubelet has sufficient disk space available
MemoryPressure  False   Tue, 14 Aug 2018 09:31:20 -0400  Tue, 31 Jul 2018 09:59:24 -0400  KubeletHasSufficientMemory  kubelet has sufficient memory available
DiskPressure    False   Tue, 14 Aug 2018 09:31:20 -0400  Tue, 31 Jul 2018 09:59:24 -0400  KubeletHasNoDiskPressure    kubelet has no disk pressure
Ready           True    Tue, 14 Aug 2018 09:31:20 -0400  Sat, 11 Aug 2018 00:41:27 -0400  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: *****
Hostname: ******
Capacity:
cpu: 16
memory: 125827288Ki
pods: 110
Allocatable:
cpu: 16
memory: 125724888Ki
pods: 110
System Info:
Machine ID: *************
System UUID: **************
Boot ID: 1493028d-0a80-4f2f-b0f1-48d9b8910e9f
Kernel Version: 4.4.0-1062-aws
OS Image: Ubuntu 16.04.4 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://Unknown
Kubelet Version: v1.8.3
Kube-Proxy Version: v1.8.3
PodCIDR: ******
ExternalID: **************
Non-terminated Pods: (11 in total)
Namespace    Name                                                          CPU Requests  CPU Limits  Memory Requests  Memory Limits
---------    ----                                                          ------------  ----------  ---------------  -------------
kube-system  calico-node-gj5mb                                             250m (1%)     0 (0%)      0 (0%)           0 (0%)
kube-system  kube-proxy-****************************************           100m (0%)     0 (0%)      0 (0%)           0 (0%)
kube-system  prometheus-prometheus-node-exporter-9cntq                     100m (0%)     200m (1%)   30Mi (0%)        50Mi (0%)
logging      elasticsearch-elasticsearch-data-69df997486-gqcwg             400m (2%)     1 (6%)      8Gi (6%)         16Gi (13%)
logging      fluentd-fluentd-elasticsearch-tj7nd                           200m (1%)     0 (0%)      612Mi (0%)       0 (0%)
rook         rook-agent-6jtzm                                              0 (0%)        0 (0%)      0 (0%)           0 (0%)
rook         rook-ceph-osd-10-6-42-250.accel.aws-cardda.cb4good.com-gwb8j  0 (0%)        0 (0%)      0 (0%)           0 (0%)
spark        accelerate-test-5-a3bfb8a597e83d459193a183e17f13b5-exec-1     2 (12%)       0 (0%)      10Gi (8%)        12Gi (10%)
spark        accelerate-testing-1-8ed0482f3bfb3c0a83da30bb7d433dff-exec-5  2 (12%)       0 (0%)      10Gi (8%)        12Gi (10%)
spark        accelerate-testing-2-8cecc18bb42f31a386c6304bd63e9eba-driver  1 (6%)        0 (0%)      2Gi (1%)         2432Mi (1%)
spark        accelerate-testing-2-e8bd0607cc693bc8ae25cc6dc300b2c7-driver  1 (6%)        0 (0%)      2Gi (1%)         2432Mi (1%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests  CPU Limits  Memory Requests  Memory Limits
------------  ----------  ---------------  -------------
7050m (44%)   1200m (7%)  33410Mi (27%)    45874Mi (37%)
Events: <none>
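
As a sanity check on the "Allocated resources" figures, the per-pod values from the table can be summed by hand. The sketch below is only arithmetic on the numbers pasted above; the shortened pod names are abbreviations for readability, and the truncating percentage is an assumption inferred from the output (1200m of 16000m is 7.5%, which is displayed as 7%). Since the scheduler places pods by comparing summed requests against the node's Allocatable values, the request totals are the ones that matter for the overcommit question.

```python
# Per-pod (cpu_request_m, cpu_limit_m, mem_request_Mi, mem_limit_Mi),
# copied from the "Non-terminated Pods" table. Names are shortened here.
PODS = [
    ("calico-node",        250,    0,     0,     0),
    ("kube-proxy",         100,    0,     0,     0),
    ("node-exporter",      100,  200,    30,    50),
    ("elasticsearch-data", 400, 1000,  8192, 16384),   # 8Gi / 16Gi
    ("fluentd",            200,    0,   612,     0),
    ("rook-agent",           0,    0,     0,     0),
    ("rook-ceph-osd",        0,    0,     0,     0),
    ("test-5-exec-1",     2000,    0, 10240, 12288),   # 10Gi / 12Gi
    ("testing-1-exec-5",  2000,    0, 10240, 12288),
    ("testing-2-driver-a", 1000,   0,  2048,  2432),   # 2Gi / 2432Mi
    ("testing-2-driver-b", 1000,   0,  2048,  2432),
]

# Allocatable values from the node description.
ALLOCATABLE_CPU_M = 16 * 1000            # 16 cores in millicores
ALLOCATABLE_MEM_MI = 125724888 / 1024    # Ki converted to Mi


def totals(pods):
    """Sum each numeric column across all pods."""
    columns = zip(*(row[1:] for row in pods))
    return tuple(sum(col) for col in columns)


def percent(used, allocatable):
    """Percentage of allocatable, truncated (assumed to match the display)."""
    return int(used * 100 / allocatable)


cpu_req, cpu_lim, mem_req, mem_lim = totals(PODS)
print("CPU requests:   ", cpu_req, "m,", percent(cpu_req, ALLOCATABLE_CPU_M), "%")
print("CPU limits:     ", cpu_lim, "m,", percent(cpu_lim, ALLOCATABLE_CPU_M), "%")
print("Memory requests:", mem_req, "Mi,", percent(mem_req, ALLOCATABLE_MEM_MI), "%")
print("Memory limits:  ", mem_lim, "Mi,", percent(mem_lim, ALLOCATABLE_MEM_MI), "%")
```

The sums come out to 7050m, 1200m, 33410Mi and 45874Mi, matching the summary line, so by these numbers both requests and limits on this node are well under 100 percent of Allocatable.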