cc @kubernetes/sig-storage-pr-reviews
@msau42 commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
Did you see issues with generating too many events? The events framework should already be aggregating the same events.
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
@msau42, yes, the same events are being aggregated now, but based on the current volume manager implementation I can see more than five identical events being sent to a pod per second. As a result, I think it's better to set an interval to reduce the number of duplicate events.
@msau42 commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -109,18 +115,25 @@ type desiredStateOfWorldPopulator struct {
podStatusProvider status.PodStatusProvider
desiredStateOfWorld cache.DesiredStateOfWorld
pods processedPods
+ failedProcessPods failedProcessPods
Can you add a comment here describing what this is for?
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+ pod *v1.Pod, msg string) {
+ dswp.failedProcessPods.Lock()
+ defer dswp.failedProcessPods.Unlock()
+
+ // Define a fixed event interval to avoid sending same events too often.
+ const failureEventInterval = 10 * time.Second
Can you make this configurable?
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
volumeSpec.Name(),
uniquePodName,
err)
+
+ dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))
Should this one use the same msg as its log message?
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+ pod *v1.Pod, msg string) {
+ dswp.failedProcessPods.Lock()
+ defer dswp.failedProcessPods.Unlock()
+
+ // Define a fixed event interval to avoid sending same events too often.
+ const failureEventInterval = 10 * time.Second
I'm not sure this interval is worth exposing as a config option. In my mind, this event only exists to show detailed error information to the end user when they run "kubectl describe pod" to check the pod status. So I tend to think a fixed time interval is fine here?
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
volumeSpec.Name(),
uniquePodName,
err)
+
+ dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))
Do we want to expose an internal data structure like "desiredStateOfWorld" to end users in the pod event?
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -109,18 +115,25 @@ type desiredStateOfWorldPopulator struct {
podStatusProvider status.PodStatusProvider
desiredStateOfWorld cache.DesiredStateOfWorld
pods processedPods
+ failedProcessPods failedProcessPods
Sure, let me update the code
@xingzhou: The following test failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-e2e-kops-aws | 9814ea1 | link | /test pull-kubernetes-e2e-kops-aws |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@jsafrane commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -335,6 +377,15 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
delete(dswp.pods.processedPods, podName)
}
+// Removes the specified pod records from failedProcessedPods
+func (dswp *desiredStateOfWorldPopulator) deleteFailedProcessPod(
Looking around, deleteFailedProcessPod is called in processPodVolumes when everything succeeds and in findAndRemoveDeletedPods when a pod is removed from DSW.
Is there a possibility that a pod is created, processPodVolumes fails (e.g. PVC is not bound yet), message gets logged (and thus stored in failedProcessPods), but the pod is not stored in DSW, because all its volumes failed in processPodVolumes? The stored message will be never deleted when the pod is deleted.
@msau42 commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
volumeSpec.Name(),
uniquePodName,
err)
+
+ dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))
Hm I guess not. This is fine then.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+ pod *v1.Pod, msg string) {
+ dswp.failedProcessPods.Lock()
+ defer dswp.failedProcessPods.Unlock()
+
+ // Define a fixed event interval to avoid sending same events too often.
+ const failureEventInterval = 10 * time.Second
I'm thinking more from a support standpoint. If it turns out that even this interval is not long enough, it won't be easy to turn it off in production.
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
volumeSpec.Name(),
uniquePodName,
err)
+
+ dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))
OK, I'll change this in a later update.
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+ pod *v1.Pod, msg string) {
+ dswp.failedProcessPods.Lock()
+ defer dswp.failedProcessPods.Unlock()
+
+ // Define a fixed event interval to avoid sending same events too often.
+ const failureEventInterval = 10 * time.Second
Fine, I can try changing it into a config option then.
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -335,6 +377,15 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
delete(dswp.pods.processedPods, podName)
}
+// Removes the specified pod records from failedProcessedPods
+func (dswp *desiredStateOfWorldPopulator) deleteFailedProcessPod(
Yes, that could happen, and the result is that the map keeps growing. One solution I can imagine right now is to add another goroutine in the DSW populator to check the records in the map: if a message is old enough (older than failureEventInterval), we delete it from the map. What do you think, @msau42 and @jsafrane?
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+ pod *v1.Pod, msg string) {
+ dswp.failedProcessPods.Lock()
+ defer dswp.failedProcessPods.Unlock()
+
+ // Define a fixed event interval to avoid sending same events too often.
+ const failureEventInterval = 10 * time.Second
@msau42, would you please confirm the following changes I'm planning to make:
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-e2e-gce | bd92c1e | link | /test pull-kubernetes-e2e-gce |
| pull-kubernetes-e2e-kops-aws | bd92c1e | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-e2e-gce-device-plugin-gpu | bd92c1e | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
—
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-e2e-gce | bd92c1e | link | /test pull-kubernetes-e2e-gce |
| pull-kubernetes-e2e-kops-aws | bd92c1e | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-e2e-gce-device-plugin-gpu | bd92c1e | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-kubemark-e2e-gce | bd92c1e | link | /test pull-kubernetes-kubemark-e2e-gce |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-e2e-gce | bd92c1e | link | /test pull-kubernetes-e2e-gce |
| pull-kubernetes-e2e-kops-aws | bd92c1e | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-e2e-gce-device-plugin-gpu | bd92c1e | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-kubemark-e2e-gce | bd92c1e | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-node-e2e | bd92c1e | link | /test pull-kubernetes-node-e2e |
The current patch uses a goroutine to delete the messages in the map; please review that first.
For the config option issue, @msau42, please see my previous comments, thanks.
@pospispa requested changes on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
>
// Process volume spec for each volume defined in pod
for _, podVolume := range pod.Spec.Volumes {
volumeSpec, volumeGidValue, err :=
dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
if err != nil {
- glog.Errorf(
- "Error processing volume %q for pod %q: %v",
- podVolume.Name,
- format.Pod(pod),
- err)
+ msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+ glog.Errorf(msg)
+ dswp.recordFailedProcessPodEvent(pod, msg)
Currently, the error message is logged (glog.Errorf(msg)) into a log file several times per second.
Is it possible to move the glog.Errorf(msg) into dswp.recordFailedProcessPodEvent() func so that the error message is logged into a log file only once in 10 seconds?
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
>
// Process volume spec for each volume defined in pod
for _, podVolume := range pod.Spec.Volumes {
volumeSpec, volumeGidValue, err :=
dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
if err != nil {
- glog.Errorf(
- "Error processing volume %q for pod %q: %v",
- podVolume.Name,
- format.Pod(pod),
- err)
+ msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+ glog.Errorf(msg)
+ dswp.recordFailedProcessPodEvent(pod, msg)
Yes, that can be an option. But still, as I commented in the ticket, the log messages should align with the trace of program execution, so that the log shows how the program is running.
@msau42 and @jsafrane, would you please share your thoughts on this?
@msau42 commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
>
// Process volume spec for each volume defined in pod
for _, podVolume := range pod.Spec.Volumes {
volumeSpec, volumeGidValue, err :=
dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
if err != nil {
- glog.Errorf(
- "Error processing volume %q for pod %q: %v",
- podVolume.Name,
- format.Pod(pod),
- err)
+ msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+ glog.Errorf(msg)
+ dswp.recordFailedProcessPodEvent(pod, msg)
I think as long as it logs the first time it's fine. It's good to not spam the logs too.
@msau42 commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+ pod *v1.Pod, msg string) {
+ dswp.failedProcessPods.Lock()
+ defer dswp.failedProcessPods.Unlock()
+
+ // Define a fixed event interval to avoid sending same events too often.
+ const failureEventInterval = 10 * time.Second
I think having a default value if option is not specified is fine.
@msau42 commented on this pull request.
I wish we had the new events framework now, which will do client-side throttling.
Doing the cleanup here correctly is difficult, but I can't really think of anything better.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
delete(dswp.pods.processedPods, podName)
}
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+ stopCh <-chan struct{}) {
+
+ doCleanup := func() {
+ dswp.failedProcessPodMessages.Lock()
+ defer dswp.failedProcessPodMessages.Unlock()
+
+ for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+ if time.Since(lastTime) > failureEventInterval {
Thinking about when this interval becomes configurable, if it is set too high, then this could cause kubelet memory to not get cleaned up for a very long time.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -131,6 +152,9 @@ func (dswp *desiredStateOfWorldPopulator) Run(sourcesReady config.SourcesReady,
dswp.hasAddedPodsLock.Lock()
dswp.hasAddedPods = true
dswp.hasAddedPodsLock.Unlock()
+
+ go dswp.cleanupFailedProcessPodMessage(stopCh)
I haven't thought deeply about this, but I wonder if there is a more reliable way we can cleanup besides polling periodically.
Can we detect when the pod is removed from kubelet's list and clean up at that time?
@jsafrane commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
>
// Process volume spec for each volume defined in pod
for _, podVolume := range pod.Spec.Volumes {
volumeSpec, volumeGidValue, err :=
dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
if err != nil {
- glog.Errorf(
- "Error processing volume %q for pod %q: %v",
- podVolume.Name,
- format.Pod(pod),
- err)
+ msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+ glog.Errorf(msg)
+ dswp.recordFailedProcessPodEvent(pod, msg)
Yes, please log at a lower frequency; every 100ms spams the log quite a lot.
@jsafrane commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -131,6 +152,9 @@ func (dswp *desiredStateOfWorldPopulator) Run(sourcesReady config.SourcesReady,
dswp.hasAddedPodsLock.Lock()
dswp.hasAddedPods = true
dswp.hasAddedPodsLock.Unlock()
+
+ go dswp.cleanupFailedProcessPodMessage(stopCh)
Kubelet's PodManager does not provide any events, only Get methods. I was not able to quickly find the place where a new container is started when a pod is scheduled though.
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
>
// Process volume spec for each volume defined in pod
for _, podVolume := range pod.Spec.Volumes {
volumeSpec, volumeGidValue, err :=
dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
if err != nil {
- glog.Errorf(
- "Error processing volume %q for pod %q: %v",
- podVolume.Name,
- format.Pod(pod),
- err)
+ msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+ glog.Errorf(msg)
+ dswp.recordFailedProcessPodEvent(pod, msg)
OK, let me change the log frequency as well; it will follow the same interval as the pod events.
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
delete(dswp.pods.processedPods, podName)
}
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+ stopCh <-chan struct{}) {
+
+ doCleanup := func() {
+ dswp.failedProcessPodMessages.Lock()
+ defer dswp.failedProcessPodMessages.Unlock()
+
+ for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+ if time.Since(lastTime) > failureEventInterval {
Yes, but any better ideas on improving this? Maybe setting a maximum time threshold?
@msau42 commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
delete(dswp.pods.processedPods, podName)
}
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+ stopCh <-chan struct{}) {
+
+ doCleanup := func() {
+ dswp.failedProcessPodMessages.Lock()
+ defer dswp.failedProcessPodMessages.Unlock()
+
+ for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+ if time.Since(lastTime) > failureEventInterval {
Brainstorming an idea: findAndAddNewPods() calls podManager.GetPods(). When polling, can we compare the pod event to the podManager.GetPods() list? If the pod is not in that list or is terminated, then we can remove it from the event cache. I haven't thought about whether this can have race conditions.
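The reconciliation idea above could be sketched as follows. This is only an illustration of the approach, not the PR's code: in reality the active set would come from podManager.GetPods() (also filtering out terminated pods), and the cache would be the failure-message map, here both reduced to plain maps.

```go
package main

import "fmt"

// pruneAgainstActivePods drops cached failure records for pods that are no
// longer in kubelet's active pod list, so deleted pods don't leak entries.
func pruneAgainstActivePods(cache map[string]string, activePods map[string]bool) {
	for podName := range cache {
		if !activePods[podName] {
			delete(cache, podName) // pod is gone: its failure record is stale
		}
	}
}

func main() {
	cache := map[string]string{
		"ns/pod-a": "error processing volume",
		"ns/pod-b": "error processing volume",
	}
	active := map[string]bool{"ns/pod-a": true} // pod-b was deleted
	pruneAgainstActivePods(cache, active)
	fmt.Println(len(cache)) // only pod-a's record remains
}
```

Compared with the timer-based cleanup, this ties record lifetime to actual pod lifetime rather than to an interval, at the cost of depending on the accuracy of the pod list at polling time.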
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
dswp.pods.processedPods[podName] = true
}
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+ pod *v1.Pod, msg string) {
+ dswp.failedProcessPods.Lock()
+ defer dswp.failedProcessPods.Unlock()
+
+ // Define a fixed event interval to avoid sending same events too often.
+ const failureEventInterval = 10 * time.Second
@msau42, one more question here. I'm trying to add one more kubelet flag for this option, but the current guidance in the code says:
// In general, please try to avoid adding flags or configuration fields,
// we already have a confusingly large amount of them.
type KubeletConfiguration struct {
Do you know if we have other ways to make this configurable?
@xingzhou commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
delete(dswp.pods.processedPods, podName)
}
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+ stopCh <-chan struct{}) {
+
+ doCleanup := func() {
+ dswp.failedProcessPodMessages.Lock()
+ defer dswp.failedProcessPodMessages.Unlock()
+
+ for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+ if time.Since(lastTime) > failureEventInterval {
When implementing this method, we should consider how to remove the stale events, e.g. by checking dswp.podPreviouslyProcessed(uniquePodName); if the pod was already processed, we remove the stale event.
@msau42 commented on this pull request.
In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:
> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
delete(dswp.pods.processedPods, podName)
}
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+ stopCh <-chan struct{}) {
+
+ doCleanup := func() {
+ dswp.failedProcessPodMessages.Lock()
+ defer dswp.failedProcessPodMessages.Unlock()
+
+ for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+ if time.Since(lastTime) > failureEventInterval {
My main thought is that there are two reasons to remove events from the cache:
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: xingzhou
We suggest the following additional approver: vishh
Assign the PR to them by writing /assign @vishh in a comment when ready.
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS files. You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment
updated the patch according to the comments, made the following changes, please take a look:
dswp.recordFailedProcessPodEvent as well
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: msau42, xingzhou
We suggest the following additional approver: vishh
Assign the PR to them by writing /assign @vishh in a comment when ready.
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS files. You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment
—
/test all
Tests are more than 96 hours old. Re-running tests.
@xingzhou PR needs rebase
@xingzhou, please rebase so it can be merged
@xingzhou, can you please rebase the PR?
[APPROVALNOTIFIER] This PR is NOT APPROVED
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: vishh
Assign the PR to them by writing /assign @vishh in a comment when ready.
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
@xingzhou: The following test failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-unit | 2dba862 | link | /test pull-kubernetes-unit |
—
/test pull-kubernetes-unit
/lgtm
@kubernetes/sig-node-pr-reviews, PTAL and approve
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jsafrane, msau42, xingzhou
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: vishh
Assign the PR to them by writing /assign @vishh in a comment when ready.
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
—
/test all
Tests are more than 96 hours old. Re-running tests.
—
@xingzhou: The following test failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
—
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-e2e-kops-aws | 2dba862 | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-e2e-kops-aws | 2dba862 | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-node-e2e | 2dba862 | link | /test pull-kubernetes-node-e2e |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-e2e-kops-aws | 2dba862 | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-node-e2e | 2dba862 | link | /test pull-kubernetes-node-e2e |
| pull-kubernetes-e2e-gce | 2dba862 | link | /test pull-kubernetes-e2e-gce |
| pull-kubernetes-typecheck | 2dba862 | link | /test pull-kubernetes-typecheck |
/retest
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-node-e2e | 2dba862 | link | /test pull-kubernetes-node-e2e |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-typecheck | 2dba862 | link | /test pull-kubernetes-typecheck |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-typecheck | 2dba862 | link | /test pull-kubernetes-typecheck |
| pull-kubernetes-verify | 2dba862 | link | /test pull-kubernetes-verify |
| pull-kubernetes-e2e-kops-aws | 2dba862 | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-e2e-gce | 2dba862 | link | /test pull-kubernetes-e2e-gce |
| pull-kubernetes-node-e2e | 2dba862 | link | /test pull-kubernetes-node-e2e |
| pull-kubernetes-bazel-test | 2dba862 | link | /test pull-kubernetes-bazel-test |
| pull-kubernetes-bazel-build | 2dba862 | link | /test pull-kubernetes-bazel-build |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-typecheck | 2dba862 | link | /test pull-kubernetes-typecheck |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
/retest
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-node-e2e | 2dba862 | link | /test pull-kubernetes-node-e2e |
| pull-kubernetes-bazel-test | 2dba862 | link | /test pull-kubernetes-bazel-test |
| pull-kubernetes-integration | 2dba862 | link | /test pull-kubernetes-integration |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
| pull-kubernetes-typecheck | 2dba862 | link | /test pull-kubernetes-typecheck |
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-verify | 2dba862 | link | /test pull-kubernetes-verify |
| pull-kubernetes-bazel-build | 2dba862 | link | /test pull-kubernetes-bazel-build |
—
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-node-e2e | 2dba862 | link | /test pull-kubernetes-node-e2e |
| pull-kubernetes-bazel-test | 2dba862 | link | /test pull-kubernetes-bazel-test |
| pull-kubernetes-integration | 2dba862 | link | /test pull-kubernetes-integration |
| pull-kubernetes-typecheck | 2dba862 | link | /test pull-kubernetes-typecheck |
| pull-kubernetes-kubemark-e2e-gce | 2dba862 | link | /test pull-kubernetes-kubemark-e2e-gce |
| pull-kubernetes-verify | 2dba862 | link | /test pull-kubernetes-verify |
| pull-kubernetes-bazel-build | 2dba862 | link | /test pull-kubernetes-bazel-build |
| pull-kubernetes-e2e-gce-device-plugin-gpu | 2dba862 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu |
@xingzhou: The following tests failed, say /retest to rerun them all:
| Test name | Commit | Details | Rerun command |
|---|---|---|---|
| pull-kubernetes-e2e-kops-aws | 2dba862 | link | /test pull-kubernetes-e2e-kops-aws |
| pull-kubernetes-node-e2e | 2dba862 | link | /test pull-kubernetes-node-e2e |
| pull-kubernetes-bazel-test | 2dba862 | link | /test pull-kubernetes-bazel-test |
@xingzhou, can you please rebase and fix compilation errors? We're still interested in the PR.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Closed #58273.
@fejta-bot: Closed this PR.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
—