VMware vSphere 6.5 Host Resources Deep Dive Free 12


Kym Cavrak

May 3, 2024, 1:51:26 AM
to diaketica

The VMware vSphere 6.5 Host Resources Deep Dive is a guide to building consistent high-performing ESXi hosts. It is written for administrators, architects, consultants, aspiring VCDX-es and people eager to learn more about the elements that control the behavior of CPU, memory, storage and network resources.

The book covers four main topics: CPU, Memory, Storage, and Networking. The in-depth read can only be described as a serious deep dive for people who seriously want to understand what is going on under the covers of their VMware environment.




The VMware vSphere 6.5 Host Resources Deep Dive is a guide to building consistent high-performing ESXi hosts. This book was written for administrators, architects, consultants, aspiring VCDX-es and people eager to learn more about the elements that control the behavior of CPU, memory, storage and network resources. This week we discuss this new book with the authors, Frank Denneman and Niels Hagoort.

Apologies if any of this goes over stuff you've already tried! CPU affinity (and indeed latency sensitivity) should do what you want for VMs, but according to Denneman & Hagoort's deep dive[*]: "Be aware that it never dedicates the physical CPU to the VM as the VMkernel schedules all its processes across all available pCPUs, regardless of any custom setting a VM has."

A very good read on this, which I highly recommend, can be had with the "VMware vSphere 6.5 - Host Resource Deep Dive" by Frank Denneman and Niels Hagoort. This is still available free I believe, details here : -vsphere-6-5-host-resources-deep-dive-ebook/

Now the Host Deep Resources Deep Dive, Part 3 might be a slightly confusing title. It is part 3 because we already did Part 1 at VMworld 2016 and Part 2 in 2017. We will bring a new awesome way of delivering host resources knowledge in that session. More on that later.

If you have read the VMware vSphere Clustering Technical Deepdive books, you already know what to expect in terms of technical level of content; however, the sneak peeks show that Frank and Niels have gone even deeper than you can imagine.

The Tanzu Kubernetes Grid (TKG) cluster is made up of a set of Virtual Machines configured as Kubernetes nodes and joined into a cluster. The creation and desired state maintenance of those clusters involves a hierarchy of Kubernetes custom resources and controllers deployed into the Supervisor cluster. The first blog/video in this TKG Troubleshooting series focused on an introduction to Kubernetes custom resources and the Cluster API open source project, along with the creation of the Tanzu Kubernetes Cluster (TKC) resource. Part 1 blog/video is available here. The TKC and associated TKG Controller are the top level in the hierarchy of resources that reconcile the TKG cluster. This part 2 blog/video will look at how the TKC is decomposed into a set of resources that are an implementation of Cluster API. It will also look at how the ClusterAPI resources are further reconciled into Virtual Machine resources that hold the configuration required to create VMs on vSphere and form a Kubernetes cluster. To jump directly to the Part 2 video click Here

Hierarchy of Custom Resources and Controllers

The simplified view is that the Tanzu Kubernetes Cluster (TKC) resource is reconciled by the TKG Controller into a set of resources or objects that are an implementation of Cluster API. The TKG Controller also creates the vSphere Provider specific Virtual Machine Resources. The reconciliation process for all of these resources encompasses a set of controllers that are responsible for monitoring the health and updating appropriate configuration and status for the particular set of resources they watch. As we drill into troubleshooting in the next video in the series, you will see that each of the controllers has a log of its activities. Transitions that happen to individual resources can be found by describing that resource.
Additionally, we have implemented a pattern where lower level activity is reflected into higher level objects in summary form. So troubleshooting starts by describing the TKC resource and potentially viewing the log for the TKG controller.

TKG Controller Creates Cluster API Objects

TKG reconciles the TKC resource into a set of YAML specifications that define the Cluster API custom resources. The objects are logically grouped into Cluster, Control Plane and Workers. Further, Cluster API separates the generic specification that would be independent of the particular provider platform (vSphere, Azure, AWS, etc.) from, in our case, vSphere specific configuration. The Cluster object contains the definition of the cluster itself and makes reference to WCPCluster and KubeadmControlPlane objects which hold vSphere specific configuration and the KubeAdm code to turn the virtual machines into Kubernetes nodes. The WCPMachineTemplate contains the definition for the underlying Virtual Machines that will become the control plane nodes. Note that WCP stands for Workload Control Plane and is the engineering name for the vCenter service that implements the Supervisor Cluster. More interesting from a troubleshooting perspective than the Spec is the Status section of the objects. The Cluster reports on the status of ControlPlane and Infrastructure availability, while the WCPCluster is focused on the specifics of Infrastructure availability - particularly Networking and Load Balancing. So if you saw an infrastructure error at the TKC resource level or the Cluster resource level, you might check the WCPCluster resource for more detail.
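
The drill-down described above can be sketched in a few lines of Python. This is only an illustration, assuming the text of a `kubectl describe` call has already been captured and that its Conditions section uses the "Last Transition Time / Status / Type" layout seen in the status listings in this post. The function names, the `SAMPLE` text (with one condition deliberately set to False) and the `NEXT_RESOURCE` map are hypothetical; the only mapping taken from the post is that an infrastructure error points to the WCPCluster resource.

```python
# Hypothetical describe-style text; InfrastructureReady is set to False
# on purpose so the example has something to report.
SAMPLE = """\
Status:
  Conditions:
    Last Transition Time:  2021-07-13T18:46:40Z
    Status:                True
    Type:                  Ready
    Last Transition Time:  2021-07-13T18:38:15Z
    Status:                False
    Type:                  InfrastructureReady
"""

# From the post: an infrastructure error at the TKC or Cluster level
# suggests looking at the WCPCluster resource next.
NEXT_RESOURCE = {"InfrastructureReady": "WCPCluster"}


def parse_conditions(describe_text):
    """Collect time/status/type triples from describe-style text."""
    conditions = []
    current = {}
    for raw in describe_text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Last Transition Time":
            current = {"time": value}          # a new condition starts here
        elif key == "Status" and "time" in current:
            current["status"] = value
        elif key == "Type" and "status" in current:
            current["type"] = value
            conditions.append(current)
            current = {}
    return conditions


def not_true(conditions):
    """Return the types of all conditions whose status is not True."""
    return [c["type"] for c in conditions if c["status"] != "True"]


def next_resource(condition_type):
    """Suggest which resource to describe next, or None if unknown."""
    return NEXT_RESOURCE.get(condition_type)
```

Calling `not_true(parse_conditions(text))` on the output of each describe command gives a short list of condition types to follow down the hierarchy.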

Cluster

Status:
  Conditions:
    Last Transition Time: 2021-07-13T18:46:40Z
    Status: True
    Type: Ready
    Last Transition Time: 2021-07-13T18:46:40Z
    Status: True
    Type: ControlPlaneReady
    Last Transition Time: 2021-07-13T18:38:15Z
    Status: True
    Type: InfrastructureReady
  Control Plane Initialized: true
  Control Plane Ready: true
  Infrastructure Ready: true
  Observed Generation: 3
  Phase: Provisioned

WCPCluster

Status:
  Conditions:
    Last Transition Time: 2021-07-13T18:38:14Z
    Status: True
    Type: Ready
    Last Transition Time: 2021-07-13T18:38:06Z
    Status: True
    Type: ClusterNetworkReady
    Last Transition Time: 2021-07-13T18:38:14Z
    Status: True
    Type: LoadBalancerReady
    Last Transition Time: 2021-07-13T18:38:06Z
    Status: True
    Type: ResourcePolicyReady
  Ready: true
  Resource Policy Name: tkg-cluster

The Control Plane objects are the next level of detail that make up the Cluster. Each of the control plane nodes is a Machine in Cluster API. There will be a generic Machine resource and a provider specific Machine resource called WCPMachine for each control plane node. There is also a KubeAdmConfig resource that holds an abstraction of the Kubeadm config that will be used to set up the nodes as Kubernetes clusters. Worker nodes are configured in a very similar process to Control Plane nodes.

Let's check the Status of the Control Plane Machine by issuing the kubectl describe machine "ControlPlaneNodeName" command. This is where you can see that the Infrastructure is Ready and each of the Kubernetes components are Healthy (Controller Manager, API, EtcD, Scheduler, etc). Errors that show up in Infrastructure would necessitate a look at the corresponding WCPMachine. Notice that the WCPMachine status contains the VM ip and the condition of the Virtual Machine.
As we will see in a moment, this information was reflected back to this resource from the VirtualMachine resource that is the source of truth for the Virtual Machine.

Control Plane Machine

Status:
  Bootstrap Ready: true
  Conditions:
    Last Transition Time: 2021-07-13T18:45:12Z
    Status: True
    Type: Ready
    Last Transition Time: 2021-07-13T18:56:53Z
    Status: True
    Type: APIServerPodHealthy
    Last Transition Time: 2021-07-13T18:38:24Z
    Status: True
    Type: BootstrapReady
    Last Transition Time: 2021-07-17T10:58:54Z
    Status: True
    Type: ControllerManagerPodHealthy
    Last Transition Time: 2021-07-18T13:16:49Z
    Status: True
    Type: EtcdMemberHealthy
    Last Transition Time: 2021-07-13T18:56:53Z
    Status: True
    Type: EtcdPodHealthy
    Last Transition Time: 2021-07-13T18:56:50Z
    Status: True
    Type: HealthCheckSucceeded
    Last Transition Time: 2021-07-13T18:45:12Z
    Status: True
    Type: InfrastructureReady
    Last Transition Time: 2021-07-13T18:56:49Z
    Status: True
    Type: NodeHealthy
    Last Transition Time: 2021-07-13T18:56:53Z
    Status: True
    Type: SchedulerPodHealthy
  Infrastructure Ready: true
  Last Updated: 2021-07-13T18:56:50Z
  Node Ref:
    API Version: v1
    Kind: Node
    Name: tkg-cluster-control-plane-xztwk
    UID: 74b8ce8c-a282-4caf-9545-abd3696361ba
  Observed Generation: 3
  Phase: Running

Control Plane WCPMachine

Status:
  Conditions:
    Last Transition Time: 2021-07-13T18:45:12Z
    Status: True
    Type: Ready
    Last Transition Time: 2021-07-13T18:45:12Z
    Status: True
    Type: VMProvisioned
  Ready: true
  Vm ID: 4214a174-360d-09f4-2672-6bf3bc323984
  Vm Ip: 192.168.120.8
  Vmstatus: ready

Once the Machine Objects are ready, the VirtualMachine custom resources are created and the VMService (also known as the VM Operator Controller) reconciles those resources into API calls to vCenter to instantiate the actual VMs. Note the difference between a VirtualMachine resource, which holds the specification and status of the VM, and the actual Virtual Machine created in vCenter. This has been a point of confusion for those new to Kubernetes and the VM Service.

The VirtualMachine Status is very specific to a vSphere deployed VM. kubectl describe vm "VMName" shows things like which host the VM is deployed on, Power Status, IP Address, and the Managed Object Reference ID (MOID).

Virtual Machine

Status:
  Bios UUID: 4214a174-360d-09f4-2672-6bf3bc323984
  Change Block Tracking: false
  Conditions:
    Last Transition Time: 2021-07-13T18:48:59Z
    Status: True
    Type: Ready
    Last Transition Time: 2021-07-13T18:38:26Z
    Status: True
    Type: VirtualMachinePrereqReady
  Host: esx-02a.corp.local
  Instance UUID: 50143365-56f2-636e-8207-40a932645926
  Network Interfaces:
    Connected: true
    Ip Addresses:
      192.168.120.8/24
      fe80::250:56ff:fe94:a347/64
    Mac Address: 00:50:56:94:a3:47
    Connected: true
    Ip Addresses:
      fe80::748b:89ff:fee1:2fda/64
    Mac Address: 76:8b:89:e1:2f:da
    Connected: true
    Ip Addresses:
      fe80::fc5e:bff:feea:a037/64
    Mac Address: fe:5e:0b:ea:a0:37
    Connected: true
    Ip Addresses:
      fe80::3c67:bfff:fe47:8886/64
    Mac Address: 3e:67:bf:47:88:86
    Connected: true
    Ip Addresses:
      fe80::ece4:b8ff:fe48:5c32/64
    Mac Address: ee:e4:b8:48:5c:32
    Connected: true
    Ip Addresses:
      fe80::6cf6:88ff:fe55:62ed/64
    Mac Address: 6e:f6:88:55:62:ed
    Connected: true
    Ip Addresses:
      fe80::40f2:69ff:fe7a:6d51/64
    Mac Address: 42:f2:69:7a:6d:51
  Phase: Created
  Power State: poweredOn
  Unique ID: vm-5021
  Vm Ip: 192.168.120.8

Cluster API Controllers

This blog is not meant to be an exhaustive explanation of the reconciliation of each Custom Resource by the Controllers that watch them, but more to
provide insight into the hierarchy of resources and basic navigation up and down the stack when attempting to troubleshoot an issue. When an error surfaces in the TKC resource, describing the appropriate lower level custom resources often provides the root cause. Sometimes that is not the case and further investigation needs to be done in the Controller that reconciles the resource. You see this information through the kubectl logs "Controller Name" -n "Namespace Name" command. In this case it is useful to have some understanding of which controllers are acting on a particular resource. Let's take a high level look at a couple of these controllers. Note that most controllers are deployed as a ReplicaSet with three pods in order to increase availability. You may need to check the logs for multiple pods to determine which one is the leader and is writing log info.

ClusterAPI Controller (CAPI) reconciles all of the ClusterAPI resources except the WCP specific objects. Reconciliation includes ensuring the health and desired state of the objects as well as reflecting appropriate information between various objects.

ClusterAPI Controller for WCP (CAPW) reconciles the WCPMachine resources and creates the VirtualMachine resources. It also interacts with VirtualNetwork resources and can be a starting point for troubleshooting networking issues.

Virtual Machine Operator Controller reconciles the VirtualMachine resource into Virtual Machines through API calls to vCenter. The VMOperator makes use of the VirtualMachineImage resource, which holds the base image for the VM, and the VirtualMachineClass, which defines available resources (vCPU, RAM and soon GPUs) for the VM.
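
Because each controller runs as several replica pods, checking logs usually means repeating the same kubectl logs command per pod. The following is a rough Python sketch of that loop, not a VMware tool. It assumes the (namespace, pod name) pairs have already been collected; the three CAPW pod names are copied from the pod listing in this post, while `logs_commands` and `busiest_pod` are hypothetical helpers. Picking the pod with the most log lines is only a heuristic for "the one writing log info", since the post does not say how the leader identifies itself. Pods with more than one container may also need a container selected with `-c`; the container name is not given in this post.

```python
# (namespace, pod name) pairs as they appear in the pod listing in this post.
PODS = [
    ("vmware-system-capw", "capw-controller-manager-58d769bd99-w2h7p"),
    ("vmware-system-capw", "capw-controller-manager-58d769bd99-w56hq"),
    ("vmware-system-capw", "capw-controller-manager-58d769bd99-z2rft"),
]


def logs_commands(pods, controller_prefix):
    """Build one 'kubectl logs <pod> -n <namespace>' command per replica."""
    commands = []
    for namespace, pod in pods:
        # Replica pod names start with the controller name plus a hash suffix.
        if pod.startswith(controller_prefix + "-"):
            commands.append(["kubectl", "logs", pod, "-n", namespace])
    return commands


def busiest_pod(log_by_pod):
    """Heuristic (an assumption): the pod with the most log lines is
    treated as the replica that is actively writing log info."""
    return max(log_by_pod, key=lambda pod: len(log_by_pod[pod].splitlines()))
```

Each returned command is an argument list that can be passed to `subprocess.run(cmd, capture_output=True, text=True)` on a machine with access to the Supervisor cluster; the captured output per pod can then be fed to `busiest_pod`.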

vmware-system-capw   capi-controller-manager-644998658d-9mg7d                         2/2   Running   2   5d19h
vmware-system-capw   capi-controller-manager-644998658d-ffgdt                         2/2   Running   0   4d14h
vmware-system-capw   capi-controller-manager-644998658d-kxk6v                         2/2   Running   3   5d19h
vmware-system-capw   capi-kubeadm-bootstrap-controller-manager-65b8d5c4dc-b9q76       2/2   Running   0   4d14h
vmware-system-capw   capi-kubeadm-bootstrap-controller-manager-65b8d5c4dc-rqbfg       2/2   Running   3   5d19h
vmware-system-capw   capi-kubeadm-bootstrap-controller-manager-65b8d5c4dc-vxv7m       2/2   Running   0   5d19h
vmware-system-capw   capi-kubeadm-control-plane-controller-manager-8565c86bbf-9ss6p   2/2   Running   3   5d19h
vmware-system-capw   capi-kubeadm-control-plane-controller-manager-8565c86bbf-c22jl   2/2   Running   0   4d14h
vmware-system-capw   capi-kubeadm-control-plane-controller-manager-8565c86bbf-dwhsc   2/2   Running   1   5d19h
vmware-system-capw   capw-controller-manager-58d769bd99-w2h7p                         2/2   Running   2   5d19h
vmware-system-capw   capw-controller-manager-58d769bd99-w56hq                         2/2   Running   0   4d14h
vmware-system-capw   capw-controller-manager-58d769bd99-z2rft                         2/2   Running   3   5d19h
vmware-system-vmop   vmware-system-vmop-controller-manager-7578487c6f-85tkc           2/2   Running   1   5d19h
vmware-system-vmop   vmware-system-vmop-controller-manager-7578487c6f-pxd88           2/2   Running   6   5d19h
vmware-system-vmop   vmware-system-vmop-controller-manager-7578487c6f-wbth6           2/2   Running   0   4d14h

The goal of this blog was not to make you an expert on Cluster API and the vSphere with Tanzu implementation, but to drive a basic understanding of the structure of custom resources. The TanzuKubernetesCluster (TKC) is the top resource. The command kubectl describe tkc "ClusterName" is the troubleshooting starting point and will tell you the status of the cluster. Depending on where error messages show up, you generally will look at WCPMachine and VirtualMachine objects for more detail.
Errors in these components - or if they don't exist - mean you go to the associated controller and check the logs for a root cause. The following video will walk through some of the material in this blog and look at the resources in a live environment. Follow-on videos in this series will look at common errors and walk through some troubleshooting scenarios.

Tanzu Kubernetes Grid (TKG) Troubleshooting Deep Dive Part 2 Video

Quick Links to Entire Troubleshooting Blog/Video Series.

Abstract: Define Kubernetes custom resources, Cluster API and show how they are used to Lifecycle TKG clusters. Focus on how that impacts troubleshooting
Blog: Troubleshooting Tanzu Kubernetes Grid Clusters - Part 1
Video: Troubleshooting Tanzu Kubernetes Grid Clusters - Part 1

Abstract: Show how the TKC is decomposed into a set of resources that are an implementation of Cluster API. Focus on how that impacts troubleshooting
Blog: Troubleshooting Tanzu Kubernetes Grid Clusters - Part 2
Video: Troubleshooting Tanzu Kubernetes Grid Clusters - Part 2

Abstract: Cluster creation milestones and identify common failures and remediation at each level
Blog: Troubleshooting Tanzu Kubernetes Grid Clusters - Part 3
Video: Troubleshooting Tanzu Kubernetes Grid Clusters - Part 3

Special Thanks to Winnie Kwon, VMware Senior Engineer.
Her engineering documents and willingness to answer many questions were the basis for the creation of this series of blogs/videos.
