A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate. As Pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (that is, the Job) is complete. Deleting a Job will clean up the Pods it created. Suspending a Job will delete its active Pods until the Job is resumed again.
A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example, due to a node hardware failure or a node reboot).
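For illustration, a minimal Job along these lines runs a single Pod to completion; the pi name matches the kubectl examples later on this page, while the image and command are placeholders for your own workload:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never  # the Job controller, not the kubelet, replaces failed Pods
  backoffLimit: 4           # retry up to 4 times before marking the Job failed
```

You can create it with kubectl apply -f job.yaml and follow its progress with kubectl get jobs.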
When the control plane creates new Pods for a Job, the .metadata.name of the Job is part of the basis for naming those Pods. The name of a Job must be a valid DNS subdomain value, but this can produce unexpected results for the Pod hostnames. For best compatibility, the name should follow the more restrictive rules for a DNS label. Even when the name is a DNS subdomain, the name must be no longer than 63 characters.
The requested parallelism (.spec.parallelism) can be set to any non-negative value. If it is unspecified, it defaults to 1. If it is specified as 0, then the Job is effectively paused until it is increased.
NonIndexed (default): the Job is considered complete when there have been .spec.completions successfully completed Pods. In other words, each Pod completion is interchangeable with any other. Note that Jobs that have null .spec.completions are implicitly NonIndexed.
Indexed: the Job is considered complete when there is one successfully completed Pod for each index. For more information about how to use this mode, see Indexed Job for Parallel Processing with Static Work Assignment.
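A sketch of an Indexed Job that combines parallelism with completions; the name and the busybox command are illustrative (JOB_COMPLETION_INDEX is the environment variable the control plane injects into Pods of Indexed Jobs):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-job          # hypothetical name
spec:
  completions: 5             # one successful Pod is required for each index 0..4
  parallelism: 3             # at most 3 Pods run concurrently
  completionMode: Indexed
  template:
    spec:
      containers:
      - name: worker
        image: busybox:1.36
        command: ["sh", "-c", "echo processing item $JOB_COMPLETION_INDEX"]
      restartPolicy: Never
```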
A container in a Pod may fail for a number of reasons, such as because the process in it exited with a non-zero exit code, or the container was killed for exceeding a memory limit, etc. If this happens, and .spec.template.spec.restartPolicy = "OnFailure", then the Pod stays on the node, but the container is re-run. Therefore, your program needs to handle the case when it is restarted locally, or else specify .spec.template.spec.restartPolicy = "Never". See pod lifecycle for more information on restartPolicy.
An entire Pod can also fail, for a number of reasons, such as when the Pod is kicked off the node (node is upgraded, rebooted, deleted, etc.), or if a container of the Pod fails and .spec.template.spec.restartPolicy = "Never". When a Pod fails, the Job controller starts a new Pod. This means that your application needs to handle the case when it is restarted in a new Pod. In particular, it needs to handle temporary files, locks, incomplete output and the like caused by previous runs.
By default, each Pod failure is counted towards the .spec.backoffLimit limit; see pod backoff failure policy. However, you can customize handling of Pod failures by setting the Job's pod failure policy.
Additionally, you can choose to count the Pod failures independently for each index of an Indexed Job by setting the .spec.backoffLimitPerIndex field (for more information, see backoff limit per index).
When the feature gates PodDisruptionConditions and JobPodFailurePolicy are both enabled, and the .spec.podFailurePolicy field is set, the Job controller does not consider a terminating Pod (a Pod that has a .metadata.deletionTimestamp field set) as a failure until that Pod is terminal (its .status.phase is Failed or Succeeded). However, the Job controller creates a replacement Pod as soon as the termination becomes apparent. Once the Pod terminates, the Job controller evaluates .backoffLimit and .podFailurePolicy for the relevant Job, taking this now-terminated Pod into consideration.
There are situations where you want to fail a Job after some amount of retries, for example due to a logical error in configuration. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6. Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s, ...) capped at six minutes.
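As a sketch, a Job that gives up after three retries instead of the default six (the name, image, and the deliberately failing command are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: flaky-task           # hypothetical name
spec:
  backoffLimit: 3            # mark the Job failed after 3 retries (default: 6)
  template:
    spec:
      containers:
      - name: main
        image: busybox:1.36
        command: ["sh", "-c", "exit 1"]   # always fails, to exercise the back-off
      restartPolicy: Never
```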
When you run an Indexed Job, you can choose to handle retries for Pod failures independently for each index. To do so, set .spec.backoffLimitPerIndex to specify the maximum number of Pod failures per index.
When the per-index backoff limit is exceeded for an index, Kubernetes considers the index as failed and adds it to the .status.failedIndexes field. The succeeded indexes, those with successfully executed Pods, are recorded in the .status.completedIndexes field, regardless of whether you set the backoffLimitPerIndex field.
Note that a failing index does not interrupt execution of other indexes. Once all indexes finish for a Job where you specified a backoff limit per index, if at least one of those indexes did fail, the Job controller marks the overall Job as failed, by setting the Failed condition in the status. The Job gets marked as failed even if some, potentially nearly all, of the indexes were processed successfully.
You can additionally limit the maximum number of indexes marked failed by setting the .spec.maxFailedIndexes field. When the number of failed indexes exceeds the maxFailedIndexes field, the Job controller triggers termination of all remaining running Pods for that Job. Once all Pods are terminated, the entire Job is marked failed by the Job controller, by setting the Failed condition in the Job status.
Additionally, you may want to use the per-index backoff along with a pod failure policy. When using per-index backoff, there is a new FailIndex action available which allows you to avoid unnecessary retries within an index.
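Putting the per-index fields together, a sketch (the name and command are illustrative; note that backoffLimitPerIndex only applies to Indexed Jobs):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-backoff-limit-per-index   # hypothetical name
spec:
  completions: 10
  parallelism: 3
  completionMode: Indexed    # per-index backoff requires an Indexed Job
  backoffLimitPerIndex: 1    # allow at most 1 retry per index
  maxFailedIndexes: 5        # once 5 indexes fail, terminate the whole Job
  template:
    spec:
      containers:
      - name: worker
        image: busybox:1.36
        command: ["sh", "-c", "echo $JOB_COMPLETION_INDEX"]
      restartPolicy: Never
```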
In some situations, you may want to have better control when handling Pod failures than the control provided by the Pod backoff failure policy, which is based on the Job's .spec.backoffLimit. These are some examples of use cases:

- To optimize costs of running workloads by avoiding unnecessary Pod restarts, you can terminate a Job as soon as one of its Pods fails with an exit code indicating a software bug.
- To guarantee that your Job finishes even if there are disruptions, you can ignore Pod failures caused by disruptions so that they don't count towards the .spec.backoffLimit limit of retries.
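The example discussed in the next two paragraphs was along these lines; this is a reconstruction that matches the rules described below (the Job name, image, and command are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-pod-failure-policy-example   # hypothetical name
spec:
  completions: 12
  parallelism: 3
  template:
    spec:
      restartPolicy: Never   # required for Jobs that use a podFailurePolicy
      containers:
      - name: main
        image: docker.io/library/bash:5
        command: ["bash"]
        args: ["-c", "echo 'Hello world!' && sleep 5 && exit 42"]
  backoffLimit: 6
  podFailurePolicy:
    rules:
    - action: FailJob        # rule 1: fail the whole Job on exit code 42
      onExitCodes:
        containerName: main
        operator: In
        values: [42]
    - action: Ignore         # rule 2: don't count disruptions towards backoffLimit
      onPodConditions:
      - type: DisruptionTarget
```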
In the example above, the first rule of the Pod failure policy specifies that the Job should be marked failed if the main container fails with the 42 exit code. The following are the rules for the main container specifically:

- an exit code of 0 means that the container succeeded
- an exit code of 42 means that the entire Job failed
- any other exit code represents that the container failed, and hence the entire Pod; the Pod will be re-created if the total number of restarts is below backoffLimit, and if the backoffLimit is reached the entire Job is marked Failed
The second rule of the Pod failure policy, specifying the Ignore action for failed Pods with the condition DisruptionTarget, excludes Pod disruptions from being counted towards the .spec.backoffLimit limit of retries.
You can configure a success policy, in the .spec.successPolicy field, to meet the above use cases. This policy can declare Job success based on the succeeded Pods. After the Job meets the success policy, the Job controller terminates the lingering Pods. A success policy is defined by rules. Each rule can take one of the following forms:

- When you specify succeededIndexes only, once all indexes specified in succeededIndexes succeed, the Job controller marks the Job as succeeded.
- When you specify succeededCount only, once the number of succeeded indexes reaches succeededCount, the Job controller marks the Job as succeeded.
- When you specify both succeededIndexes and succeededCount, once the number of succeeded indexes from the subset specified in succeededIndexes reaches succeededCount, the Job controller marks the Job as succeeded.
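The example discussed in the next paragraph had both fields set; a reconstruction consistent with that description looks roughly like this (the Job name and template are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-success-policy   # hypothetical name
spec:
  parallelism: 10
  completions: 10
  completionMode: Indexed    # a success policy requires an Indexed Job
  successPolicy:
    rules:
    - succeededIndexes: "0,2-3"  # only these indexes count towards the rule
      succeededCount: 1          # one success among indexes 0, 2, or 3 suffices
  template:
    spec:
      containers:
      - name: main
        image: busybox:1.36
        command: ["sh", "-c", "echo $JOB_COMPLETION_INDEX"]
      restartPolicy: Never
```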
In the example above, both succeededIndexes and succeededCount have been specified. Therefore, the Job controller will mark the Job as succeeded and terminate the lingering Pods when any one of the specified indexes, 0, 2, or 3, succeeds. The Job that meets the success policy gets the SuccessCriteriaMet condition. After the removal of the lingering Pods is issued, the Job gets the Complete condition.
When a Job completes, no more Pods are created, but the Pods are usually not deleted either. Keeping them around allows you to still view the logs of completed Pods to check for errors, warnings, or other diagnostic output. The Job object also remains after it is completed so that you can view its status. It is up to the user to delete old Jobs after noting their status. Delete the Job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml). When you delete the Job using kubectl, all the Pods it created are deleted too.
By default, a Job will run uninterrupted unless a Pod fails (restartPolicy=Never) or a container exits in error (restartPolicy=OnFailure), at which point the Job defers to the .spec.backoffLimit described above. Once .spec.backoffLimit has been reached, the Job will be marked as failed and any running Pods will be terminated.
Another way to terminate a Job is by setting an active deadline. Do this by setting the .spec.activeDeadlineSeconds field of the Job to a number of seconds. The activeDeadlineSeconds applies to the duration of the Job, no matter how many Pods are created. Once a Job reaches activeDeadlineSeconds, all of its running Pods are terminated and the Job status will become type: Failed with reason: DeadlineExceeded.
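A sketch of a Job with a deadline (the name is illustrative; the pi computation mirrors the earlier example):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout      # hypothetical name
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100 # terminate all running Pods 100s after the Job starts
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

Note that both the Job spec and the Pod template spec within the Job have an activeDeadlineSeconds field, so make sure you set it at the level you intend.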
Note that a Job's .spec.activeDeadlineSeconds takes precedence over its .spec.backoffLimit. Therefore, a Job that is retrying one or more failed Pods will not deploy additional Pods once it reaches the time limit specified by activeDeadlineSeconds, even if the backoffLimit is not yet reached.
Keep in mind that the restartPolicy applies to the Pod, and not to the Job itself: there is no automatic Job restart once the Job status is type: Failed. That is, the Job termination mechanisms activated with .spec.activeDeadlineSeconds and .spec.backoffLimit result in a permanent Job failure that requires manual intervention to resolve.
Finished Jobs are usually no longer needed in the system. Keeping them around in the system will put pressure on the API server. If the Jobs are managed directly by a higher-level controller, such as CronJobs, the Jobs can be cleaned up by CronJobs based on the specified capacity-based cleanup policy.
Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job.
When the TTL controller cleans up the Job, it deletes the Job in a cascading manner, that is, it deletes the Job's dependent objects, such as Pods, together with the Job. Note that when the Job is deleted, its lifecycle guarantees, such as finalizers, will be honored.
If the field is set to 0, the Job will be eligible to be automatically deleted immediately after it finishes. If the field is unset, this Job won't be cleaned up by the TTL controller after it finishes.
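For example, a sketch that makes the Job and its Pods eligible for deletion 100 seconds after the Job finishes (the name is illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl          # hypothetical name
spec:
  ttlSecondsAfterFinished: 100  # eligible for cascading deletion 100s after finishing
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```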
It is recommended to set the ttlSecondsAfterFinished field, because unmanaged Jobs (Jobs that you created directly, and not indirectly through other workload APIs such as CronJob) have a default deletion policy of orphanDependents, causing Pods created by an unmanaged Job to be left around after that Job is fully deleted. Even though the control plane eventually garbage collects the Pods from a deleted Job after they either fail or complete, those lingering Pods can sometimes degrade cluster performance or, in the worst case, take the cluster offline as a result of that degradation.