Re: [kubernetes/kubernetes] Log volume process failure event to pod. (#58273)

Michelle Au

Jan 16, 2018, 2:17:34 PM

cc @kubernetes/sig-storage-pr-reviews

Michelle Au

Jan 16, 2018, 6:12:59 PM

@msau42 commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(

Did you see issues with generating too many events? The events framework should already be aggregating the same events.

Tom Xing

Jan 16, 2018, 8:32:49 PM

@xingzhou commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(

@msau42, yes, the same events are being aggregated, but with the current volume manager implementation I can see more than five identical events sent to the pod per second. So I think it's better to set an interval to reduce the number of duplicate events.
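
For context, a minimal, self-contained sketch of the kind of interval-based throttling being discussed; the type, map, and reason string below are illustrative assumptions, not this PR's actual code:

package throttle

import (
    "sync"
    "time"

    v1 "k8s.io/api/core/v1"
    "k8s.io/client-go/tools/record"
)

// failureEventThrottler remembers when each message was last sent and
// suppresses repeats that arrive within the configured interval.
type failureEventThrottler struct {
    sync.Mutex
    lastSent map[string]time.Time
    interval time.Duration // e.g. 10 * time.Second
    recorder record.EventRecorder
}

func (t *failureEventThrottler) recordEvent(pod *v1.Pod, msg string) {
    t.Lock()
    defer t.Unlock()
    if last, ok := t.lastSent[msg]; ok && time.Since(last) < t.interval {
        return // the same message was sent recently; drop this one
    }
    t.lastSent[msg] = time.Now()
    // "FailedProcessVolume" is an assumed reason string for illustration.
    t.recorder.Event(pod, v1.EventTypeWarning, "FailedProcessVolume", msg)
}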

Michelle Au

Jan 19, 2018, 8:14:19 PM

@msau42 commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -109,18 +115,25 @@ type desiredStateOfWorldPopulator struct {
 	podStatusProvider         status.PodStatusProvider
 	desiredStateOfWorld       cache.DesiredStateOfWorld
 	pods                      processedPods
+	failedProcessPods         failedProcessPods

Can you add a comment here describing what this is for?


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+	pod *v1.Pod, msg string) {
+	dswp.failedProcessPods.Lock()
+	defer dswp.failedProcessPods.Unlock()
+
+	// Define a fixed event interval to avoid sending same events too often.
+	const failureEventInterval = 10 * time.Second

Can you make this configurable?


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
 				volumeSpec.Name(),
 				uniquePodName,
 				err)
+
+			dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))

Should this one use the same msg as its log message?

Tom Xing

Jan 23, 2018, 1:20:08 AM

@xingzhou commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+	pod *v1.Pod, msg string) {
+	dswp.failedProcessPods.Lock()
+	defer dswp.failedProcessPods.Unlock()
+
+	// Define a fixed event interval to avoid sending same events too often.
+	const failureEventInterval = 10 * time.Second

I'm not sure this interval is worth making configurable. In my mind, this event only exists to show detailed error information to the end user when they run "kubectl describe pod" to check the pod status. So I tend to think a fixed time interval is OK here?

Tom Xing

Jan 23, 2018, 1:21:16 AM
> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
 				volumeSpec.Name(),
 				uniquePodName,
 				err)
+
+			dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))

Do we want to expose an internal data structure like "desiredStateOfWorld" to end users in the pod event?

Tom Xing

Jan 23, 2018, 1:22:05 AM
> @@ -109,18 +115,25 @@ type desiredStateOfWorldPopulator struct {
 	podStatusProvider         status.PodStatusProvider
 	desiredStateOfWorld       cache.DesiredStateOfWorld
 	pods                      processedPods
+	failedProcessPods         failedProcessPods

Sure, let me update the code

k8s-ci-robot

Jan 23, 2018, 2:14:53 AM

@xingzhou: The following test failed, say /retest to rerun them all:

Test name  Commit  Rerun command
pull-kubernetes-e2e-kops-aws  9814ea1  /test pull-kubernetes-e2e-kops-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Jan Šafránek

Jan 23, 2018, 9:16:50 AM

@jsafrane commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -335,6 +377,15 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
 	delete(dswp.pods.processedPods, podName)
 }
 
+// Removes the specified pod records from failedProcessedPods
+func (dswp *desiredStateOfWorldPopulator) deleteFailedProcessPod(

Looking around, deleteFailedProcessPod is called in processPodVolumes when everything succeeds and in findAndRemoveDeletedPods when a pod is removed from the DSW.

Is there a possibility that a pod is created, processPodVolumes fails (e.g. the PVC is not bound yet), the message gets logged (and thus stored in failedProcessPods), but the pod is never stored in the DSW because all of its volumes failed in processPodVolumes? The stored message will then never be deleted when the pod is deleted.

Michelle Au

Jan 23, 2018, 5:32:40 PM

@msau42 commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
 				volumeSpec.Name(),
 				uniquePodName,
 				err)
+
+			dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))

Hm I guess not. This is fine then.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+	pod *v1.Pod, msg string) {
+	dswp.failedProcessPods.Lock()
+	defer dswp.failedProcessPods.Unlock()
+
+	// Define a fixed event interval to avoid sending same events too often.
+	const failureEventInterval = 10 * time.Second

I'm thinking more from a support standpoint. If it turns out that even this interval is not long enough, it won't be easy to turn this off in production.

Tom Xing

Jan 24, 2018, 2:05:56 AM

@xingzhou commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -288,6 +302,8 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
 				volumeSpec.Name(),
 				uniquePodName,
 				err)
+
+			dswp.recordFailedProcessPodEvent(pod, fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err))

OK, I'll change this in a later update.

Tom Xing

Jan 24, 2018, 2:07:41 AM
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+	pod *v1.Pod, msg string) {
+	dswp.failedProcessPods.Lock()
+	defer dswp.failedProcessPods.Unlock()
+
+	// Define a fixed event interval to avoid sending same events too often.
+	const failureEventInterval = 10 * time.Second

Fine, I can try changing it to a config option then.

Tom Xing

Jan 24, 2018, 2:10:03 AM
> @@ -335,6 +377,15 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
 	delete(dswp.pods.processedPods, podName)
 }
 
+// Removes the specified pod records from failedProcessedPods
+func (dswp *desiredStateOfWorldPopulator) deleteFailedProcessPod(

Yes, that could happen, and the result is that the map keeps growing. One solution I can imagine right now is to add another goroutine in the DSW to check the records in the map: if a message is old enough (older than failureEventInterval), delete it from the map. What do you think, @msau42 and @jsafrane?
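
For reference, a sketch of the goroutine being described, reusing names that appear in this PR's diffs (failedProcessPodMessages, failureEventInterval) together with wait.Until from k8s.io/apimachinery/pkg/util/wait; the exact shape is an assumption, not the final patch:

// Assumes failedProcessPodMessages wraps a mutex and a map[string]time.Time
// of message -> last-recorded time.
func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
    stopCh <-chan struct{}) {
    doCleanup := func() {
        dswp.failedProcessPodMessages.Lock()
        defer dswp.failedProcessPodMessages.Unlock()
        for msg, lastTime := range dswp.failedProcessPodMessages.messages {
            // A record older than the interval would simply be re-sent on
            // the next failure, so dropping it loses nothing.
            if time.Since(lastTime) > failureEventInterval {
                delete(dswp.failedProcessPodMessages.messages, msg)
            }
        }
    }
    // Re-run the cleanup every interval until stopCh is closed.
    wait.Until(doCleanup, failureEventInterval, stopCh)
}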

Tom Xing

Jan 25, 2018, 12:31:15 AM
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+	pod *v1.Pod, msg string) {
+	dswp.failedProcessPods.Lock()
+	defer dswp.failedProcessPods.Unlock()
+
+	// Define a fixed event interval to avoid sending same events too often.
+	const failureEventInterval = 10 * time.Second

@msau42, would you please confirm the following changes I'm planning to make (a rough sketch follows below):

  1. Add a new kubelet option named "process-volume-failure-event-interval" and pass it down to the volume manager (DSW).
  2. If the user does not provide this option on the command line, disable reporting this event to the pod.
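
A rough sketch of how such a flag could be wired up with pflag; the flag name comes from point 1 above, while the struct and field names are made up for illustration:

package options

import (
    "time"

    "github.com/spf13/pflag"
)

// volumeManagerFlags is a hypothetical holder for the new setting.
type volumeManagerFlags struct {
    ProcessVolumeFailureEventInterval time.Duration
}

func (f *volumeManagerFlags) addFlags(fs *pflag.FlagSet) {
    // A zero default (flag omitted) would mean "do not report the event",
    // per point 2 above.
    fs.DurationVar(&f.ProcessVolumeFailureEventInterval,
        "process-volume-failure-event-interval", 0,
        "Interval between repeated volume processing failure events for a pod; 0 disables the events.")
}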

k8s-ci-robot

Jan 25, 2018, 1:57:42 AM

@xingzhou: The following tests failed, say /retest to rerun them all:

Test name  Commit  Rerun command
pull-kubernetes-e2e-gce  bd92c1e  /test pull-kubernetes-e2e-gce
pull-kubernetes-e2e-kops-aws  bd92c1e  /test pull-kubernetes-e2e-kops-aws
pull-kubernetes-e2e-gce-device-plugin-gpu  bd92c1e  /test pull-kubernetes-e2e-gce-device-plugin-gpu
pull-kubernetes-kubemark-e2e-gce  bd92c1e  /test pull-kubernetes-kubemark-e2e-gce
pull-kubernetes-node-e2e  bd92c1e  /test pull-kubernetes-node-e2e
pull-kubernetes-unit  bd92c1e  /test pull-kubernetes-unit
pull-kubernetes-bazel-build  bd92c1e  /test pull-kubernetes-bazel-build
pull-kubernetes-bazel-test  bd92c1e  /test pull-kubernetes-bazel-test
pull-kubernetes-verify  bd92c1e  /test pull-kubernetes-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Tom Xing

Jan 25, 2018, 4:19:46 AM

The current patch uses a goroutine to delete the messages in the map; please take a first look.
For the config option issue, @msau42, please see my previous comments, thanks.

Pavel Pospisil

Jan 25, 2018, 9:31:49 AM

@pospispa requested changes on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

>  
 	// Process volume spec for each volume defined in pod
 	for _, podVolume := range pod.Spec.Volumes {
 		volumeSpec, volumeGidValue, err :=
 			dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
 		if err != nil {
-			glog.Errorf(
-				"Error processing volume %q for pod %q: %v",
-				podVolume.Name,
-				format.Pod(pod),
-				err)
+			msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+			glog.Errorf(msg)
+			dswp.recordFailedProcessPodEvent(pod, msg)

Currently, the error message is logged (glog.Errorf(msg)) to the log file several times per second.
Is it possible to move the glog.Errorf(msg) call into the dswp.recordFailedProcessPodEvent() func so that the error message is logged only once every 10 seconds?

Tom Xing

Jan 25, 2018, 9:13:25 PM

@xingzhou commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

>  
 	// Process volume spec for each volume defined in pod
 	for _, podVolume := range pod.Spec.Volumes {
 		volumeSpec, volumeGidValue, err :=
 			dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
 		if err != nil {
-			glog.Errorf(
-				"Error processing volume %q for pod %q: %v",
-				podVolume.Name,
-				format.Pod(pod),
-				err)
+			msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+			glog.Errorf(msg)
+			dswp.recordFailedProcessPodEvent(pod, msg)

Yes, that could be an option. But still, as I commented in the ticket, the log messages should align with the trace of program execution, so that from the log I can see how the program is running.
@msau42 and @jsafrane, would you please share your thoughts on this?

Michelle Au

Jan 25, 2018, 11:02:00 PM

@msau42 commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

>  
 	// Process volume spec for each volume defined in pod
 	for _, podVolume := range pod.Spec.Volumes {
 		volumeSpec, volumeGidValue, err :=
 			dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
 		if err != nil {
-			glog.Errorf(
-				"Error processing volume %q for pod %q: %v",
-				podVolume.Name,
-				format.Pod(pod),
-				err)
+			msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+			glog.Errorf(msg)
+			dswp.recordFailedProcessPodEvent(pod, msg)

I think as long as it logs the first time, it's fine. It's good not to spam the logs either.
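
A sketch of what folding the log line into the throttled path might look like, with assumed names (including the recorder field and the reason string); glog.Errorf matches the call in the PR's diff:

func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
    pod *v1.Pod, msg string) {
    dswp.failedProcessPodMessages.Lock()
    defer dswp.failedProcessPodMessages.Unlock()
    lastTime, ok := dswp.failedProcessPodMessages.messages[msg]
    if ok && time.Since(lastTime) < failureEventInterval {
        return
    }
    dswp.failedProcessPodMessages.messages[msg] = time.Now()
    // Both the log line and the event now fire at most once per interval,
    // which still guarantees they fire on the first failure.
    glog.Errorf(msg)
    dswp.recorder.Event(pod, v1.EventTypeWarning, "FailedProcessVolume", msg)
}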

Michelle Au

Jan 25, 2018, 11:04:47 PM
> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+	pod *v1.Pod, msg string) {
+	dswp.failedProcessPods.Lock()
+	defer dswp.failedProcessPods.Unlock()
+
+	// Define a fixed event interval to avoid sending same events too often.
+	const failureEventInterval = 10 * time.Second

I think having a default value if the option is not specified is fine.

Michelle Au

Jan 25, 2018, 11:15:41 PM

@msau42 commented on this pull request.

I wish we had the new events framework now, which will do client-side throttling.

Doing the cleanup here correctly is difficult, but I can't really think of anything better.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
 	delete(dswp.pods.processedPods, podName)
 }
 
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+	stopCh <-chan struct{}) {
+
+	doCleanup := func() {
+		dswp.failedProcessPodMessages.Lock()
+		defer dswp.failedProcessPodMessages.Unlock()
+
+		for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+			if time.Since(lastTime) > failureEventInterval {

Thinking ahead to when this interval becomes configurable: if it is set too high, this could cause kubelet memory to not get cleaned up for a very long time.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -131,6 +152,9 @@ func (dswp *desiredStateOfWorldPopulator) Run(sourcesReady config.SourcesReady,
 	dswp.hasAddedPodsLock.Lock()
 	dswp.hasAddedPods = true
 	dswp.hasAddedPodsLock.Unlock()
+
+	go dswp.cleanupFailedProcessPodMessage(stopCh)

I haven't thought deeply about this, but I wonder if there is a more reliable way we can clean up besides polling periodically.

Can we detect when the pod is removed from kubelet's list and clean up at that time?

Jan Šafránek

Jan 26, 2018, 9:51:46 AM

@jsafrane commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

>  
 	// Process volume spec for each volume defined in pod
 	for _, podVolume := range pod.Spec.Volumes {
 		volumeSpec, volumeGidValue, err :=
 			dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
 		if err != nil {
-			glog.Errorf(
-				"Error processing volume %q for pod %q: %v",
-				podVolume.Name,
-				format.Pod(pod),
-				err)
+			msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+			glog.Errorf(msg)
+			dswp.recordFailedProcessPodEvent(pod, msg)

Yes, please log at a lower frequency; every 100ms spams the log quite a lot.

Jan Šafránek

Jan 26, 2018, 9:55:47 AM
> @@ -131,6 +152,9 @@ func (dswp *desiredStateOfWorldPopulator) Run(sourcesReady config.SourcesReady,
 	dswp.hasAddedPodsLock.Lock()
 	dswp.hasAddedPods = true
 	dswp.hasAddedPodsLock.Unlock()
+
+	go dswp.cleanupFailedProcessPodMessage(stopCh)

Kubelet's PodManager does not provide any events, only Get methods. I was not able to quickly find the place where a new container is started when a pod is scheduled though.

Tom Xing

Jan 28, 2018, 8:36:57 PM

@xingzhou commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

>  
 	// Process volume spec for each volume defined in pod
 	for _, podVolume := range pod.Spec.Volumes {
 		volumeSpec, volumeGidValue, err :=
 			dswp.createVolumeSpec(podVolume, pod.Name, pod.Namespace, mountsMap, devicesMap)
 		if err != nil {
-			glog.Errorf(
-				"Error processing volume %q for pod %q: %v",
-				podVolume.Name,
-				format.Pod(pod),
-				err)
+			msg := fmt.Sprintf(errEventTemplate, podVolume.Name, format.Pod(pod), err)
+			glog.Errorf(msg)
+			dswp.recordFailedProcessPodEvent(pod, msg)

OK, let me change the log frequency as well; it will follow the same interval as the pod events.

Tom Xing

Jan 28, 2018, 8:38:33 PM
> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
 	delete(dswp.pods.processedPods, podName)
 }
 
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+	stopCh <-chan struct{}) {
+
+	doCleanup := func() {
+		dswp.failedProcessPodMessages.Lock()
+		defer dswp.failedProcessPodMessages.Unlock()
+
+		for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+			if time.Since(lastTime) > failureEventInterval {

Yes, but any better ideas on improving this? Maybe setting a maximum time threshold?

Michelle Au

Jan 29, 2018, 2:34:51 PM

@msau42 commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
 	delete(dswp.pods.processedPods, podName)
 }
 
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+	stopCh <-chan struct{}) {
+
+	doCleanup := func() {
+		dswp.failedProcessPodMessages.Lock()
+		defer dswp.failedProcessPodMessages.Unlock()
+
+		for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+			if time.Since(lastTime) > failureEventInterval {

Brainstorming an idea: findAndAddNewPods() calls podManager.GetPods(). When polling, can we compare the cached pod events to the podManager.GetPods() list? If the pod is not in that list, or is terminated, then we can remove it from the event cache. I haven't thought about whether this can have race conditions.
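
A rough sketch of that brainstorm, with assumed names throughout; it supposes the failure cache were keyed by pod UID (types is k8s.io/apimachinery/pkg/types) so entries can be matched against podManager.GetPods():

// Hypothetical pruning pass run from the populator loop.
func (dswp *desiredStateOfWorldPopulator) pruneFailureRecords() {
    active := make(map[types.UID]bool)
    for _, pod := range dswp.podManager.GetPods() {
        // Terminated pods no longer need failure events.
        if pod.Status.Phase != v1.PodSucceeded && pod.Status.Phase != v1.PodFailed {
            active[pod.UID] = true
        }
    }
    dswp.failedProcessPodMessages.Lock()
    defer dswp.failedProcessPodMessages.Unlock()
    for uid := range dswp.failedProcessPodMessages.byPodUID {
        if !active[uid] {
            // The pod is gone or finished; drop its cached record.
            delete(dswp.failedProcessPodMessages.byPodUID, uid)
        }
    }
}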

Tom Xing

Jan 29, 2018, 9:42:53 PM

@xingzhou commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -326,6 +343,28 @@ func (dswp *desiredStateOfWorldPopulator) markPodProcessed(
 	dswp.pods.processedPods[podName] = true
 }
 
+func (dswp *desiredStateOfWorldPopulator) recordFailedProcessPodEvent(
+	pod *v1.Pod, msg string) {
+	dswp.failedProcessPods.Lock()
+	defer dswp.failedProcessPods.Unlock()
+
+	// Define a fixed event interval to avoid sending same events too often.
+	const failureEventInterval = 10 * time.Second

@msau42, one more question here. I'm trying to add one more kubelet flag for this option, but the current guidance in the code says:

// In general, please try to avoid adding flags or configuration fields,
// we already have a confusingly large amount of them.
type KubeletConfiguration struct {

Do you know if we have other ways to make this configurable?

Tom Xing

Jan 29, 2018, 10:04:00 PM
> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
 	delete(dswp.pods.processedPods, podName)
 }
 
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+	stopCh <-chan struct{}) {
+
+	doCleanup := func() {
+		dswp.failedProcessPodMessages.Lock()
+		defer dswp.failedProcessPodMessages.Unlock()
+
+		for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+			if time.Since(lastTime) > failureEventInterval {

In this method we need to consider how to remove stale events, e.g.:

  1. A pod has a single volume that fails, but some time later the failed volume is successfully processed. In this case we also need to check the pod via dswp.podPreviouslyProcessed(uniquePodName) and, if it was processed, remove the stale event.
  2. A pod has several failed volumes, e.g. VolumeA and VolumeB, and both fail to process. If VolumeA is later successfully processed, we should remove the stale event for VolumeA and keep the event message for VolumeB. Do we still need a timeout mechanism to remove VolumeA's event message in that case?

Michelle Au

Jan 29, 2018, 11:39:41 PM

@msau42 commented on this pull request.


In pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:

> @@ -335,6 +372,24 @@ func (dswp *desiredStateOfWorldPopulator) deleteProcessedPod(
 	delete(dswp.pods.processedPods, podName)
 }
 
+// Removes the stale pod error records from failedProcessedPodMessages
+func (dswp *desiredStateOfWorldPopulator) cleanupFailedProcessPodMessage(
+	stopCh <-chan struct{}) {
+
+	doCleanup := func() {
+		dswp.failedProcessPodMessages.Lock()
+		defer dswp.failedProcessPodMessages.Unlock()
+
+		for msg, lastTime := range dswp.failedProcessPodMessages.messages {
+			if time.Since(lastTime) > failureEventInterval {

My main thought is that there are two reasons to remove events from the cache (see the sketch below):

  1. All of the pod's volumes were successfully processed.
  2. The pod is terminated/deleted, so kubelet doesn't handle it anymore.
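
Those two conditions could be expressed as a single predicate; podPreviouslyProcessed appears earlier in this thread, while the function itself and the phase check are hypothetical:

// shouldDropFailureRecord reports whether a cached failure record for the
// given pod can be removed.
func (dswp *desiredStateOfWorldPopulator) shouldDropFailureRecord(
    pod *v1.Pod, podName volumetypes.UniquePodName) bool {
    if pod == nil {
        return true // the pod was deleted; kubelet no longer handles it
    }
    // Either every volume was eventually processed, or the pod finished.
    return dswp.podPreviouslyProcessed(podName) ||
        pod.Status.Phase == v1.PodSucceeded ||
        pod.Status.Phase == v1.PodFailed
}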

k8s-ci-robot

Feb 2, 2018, 1:03:21 AM

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xingzhou
We suggest the following additional approver: vishh

Assign the PR to them by writing /assign @vishh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

Tom Xing

Feb 2, 2018, 1:59:08 AM

Updated the patch according to the comments. I made the following changes, please take a look:

  1. Added a new kubelet flag to configure the event report interval.
  2. Moved the error log into dswp.recordFailedProcessPodEvent as well.
  3. Refined the error event map cleanup logic.

Michelle Au

Feb 6, 2018, 6:28:58 PM

/lgtm

k8s-ci-robot

Feb 6, 2018, 6:29:15 PM

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: msau42, xingzhou


We suggest the following additional approver: vishh

Assign the PR to them by writing /assign @vishh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

Kubernetes Submit Queue

Feb 6, 2018, 6:29:33 PM

/test all

Tests are more than 96 hours old. Re-running tests.

Kubernetes Submit Queue

Feb 9, 2018, 3:22:43 AM

@xingzhou PR needs rebase

Jan Šafránek

Feb 12, 2018, 5:22:18 AM

@xingzhou, please rebase so it can be merged

Jan Šafránek

Feb 20, 2018, 10:33:45 AM

@xingzhou, can you please rebase the PR?

k8s-ci-robot

Feb 21, 2018, 10:05:15 PM

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: msau42, xingzhou

To fully approve this pull request, please assign additional approvers.


We suggest the following additional approver: vishh

Assign the PR to them by writing /assign @vishh in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Kubernetes Submit Queue

Feb 21, 2018, 10:05:15 PM

/lgtm cancel //PR changed after LGTM, removing LGTM. @msau42 @verult @xingzhou

k8s-ci-robot

Feb 21, 2018, 10:35:58 PM

@xingzhou: The following test failed, say /retest to rerun them all:

Test name  Commit  Rerun command
pull-kubernetes-unit  2dba862  /test pull-kubernetes-unit

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Tom Xing

Feb 21, 2018, 10:37:52 PM

/test pull-kubernetes-unit

Jan Šafránek

Feb 22, 2018, 4:02:24 AM

/lgtm
@kubernetes/sig-node-pr-reviews, PTAL and approve

k8s-ci-robot

Feb 22, 2018, 4:02:54 AM

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jsafrane, msau42, xingzhou


To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: vishh

Assign the PR to them by writing /assign @vishh in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Kubernetes Submit Queue

Feb 26, 2018, 3:24:59 AM

/test all

Tests are more than 96 hours old. Re-running tests.

k8s-ci-robot

Mar 2, 2018, 3:29:15 AM

@xingzhou: The following tests failed, say /retest to rerun them all:

Test name  Commit  Rerun command
pull-kubernetes-kubemark-e2e-gce  2dba862  /test pull-kubernetes-kubemark-e2e-gce
pull-kubernetes-e2e-kops-aws  2dba862  /test pull-kubernetes-e2e-kops-aws
pull-kubernetes-e2e-gce-device-plugin-gpu  2dba862  /test pull-kubernetes-e2e-gce-device-plugin-gpu
pull-kubernetes-node-e2e  2dba862  /test pull-kubernetes-node-e2e
pull-kubernetes-e2e-gce  2dba862  /test pull-kubernetes-e2e-gce
pull-kubernetes-typecheck  2dba862  /test pull-kubernetes-typecheck
pull-kubernetes-bazel-test  2dba862  /test pull-kubernetes-bazel-test
pull-kubernetes-integration  2dba862  /test pull-kubernetes-integration
pull-kubernetes-bazel-build  2dba862  /test pull-kubernetes-bazel-build
pull-kubernetes-verify  2dba862  /test pull-kubernetes-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Pavel Pospisil

Mar 2, 2018, 11:31:56 AM

/retest

k8s-ci-robot

Mar 2, 2018, 11:35:49 AM

@xingzhou: The following tests failed, say /retest to rerun them all:

Test name  Commit  Rerun command
pull-kubernetes-kubemark-e2e-gce  2dba862  /test pull-kubernetes-kubemark-e2e-gce
pull-kubernetes-e2e-gce-device-plugin-gpu  2dba862  /test pull-kubernetes-e2e-gce-device-plugin-gpu
pull-kubernetes-typecheck  2dba862  /test pull-kubernetes-typecheck
pull-kubernetes-verify  2dba862  /test pull-kubernetes-verify
pull-kubernetes-e2e-kops-aws  2dba862  /test pull-kubernetes-e2e-kops-aws
pull-kubernetes-e2e-gce  2dba862  /test pull-kubernetes-e2e-gce
pull-kubernetes-node-e2e  2dba862  /test pull-kubernetes-node-e2e
pull-kubernetes-bazel-test  2dba862  /test pull-kubernetes-bazel-test
pull-kubernetes-bazel-build  2dba862  /test pull-kubernetes-bazel-build
pull-kubernetes-integration  2dba862  /test pull-kubernetes-integration

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Tomas Smetana

Apr 4, 2018, 3:24:45 AM

/retest

k8s-ci-robot

Apr 4, 2018, 3:30:13 AM

@xingzhou: The following tests failed, say /retest to rerun them all:

Test name  Commit  Rerun command
pull-kubernetes-e2e-kops-aws  2dba862  /test pull-kubernetes-e2e-kops-aws
pull-kubernetes-node-e2e  2dba862  /test pull-kubernetes-node-e2e
pull-kubernetes-bazel-test  2dba862  /test pull-kubernetes-bazel-test
pull-kubernetes-integration  2dba862  /test pull-kubernetes-integration
pull-kubernetes-e2e-gce-device-plugin-gpu  2dba862  /test pull-kubernetes-e2e-gce-device-plugin-gpu
pull-kubernetes-typecheck  2dba862  /test pull-kubernetes-typecheck
pull-kubernetes-kubemark-e2e-gce  2dba862  /test pull-kubernetes-kubemark-e2e-gce
pull-kubernetes-verify  2dba862  /test pull-kubernetes-verify
pull-kubernetes-bazel-build  2dba862  /test pull-kubernetes-bazel-build
pull-kubernetes-e2e-gce  2dba862  /test pull-kubernetes-e2e-gce

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Jan Šafránek

May 9, 2018, 7:28:01 AM

@xingzhou, can you please rebase and fix compilation errors? We're still interested in the PR.

Tomas Smetana

Jul 31, 2018, 8:23:59 AM

@xingzhou if you don't have time to work on this I may try to create another PR (rebasing yours). As @jsafrane mentioned, we are still interested in this. I'm only a bit unsure about the new kubelet flag: it's an API change that will definitely make the PR difficult to push through.

fejta-bot

Oct 29, 2018, 9:09:30 AM

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot

Nov 28, 2018, 8:56:42 AM

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten

fejta-bot

Dec 28, 2018, 9:39:21 AM

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.


Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Kubernetes Prow Robot

Dec 28, 2018, 9:39:42 AM

Closed #58273.

Kubernetes Prow Robot

Dec 28, 2018, 9:39:43 AM

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
