Re: [kubernetes/kubernetes] [e2e test failure] [sig-api-machinery] Aggregator Should be able to support the 1.7 Sample API Server using the current Aggregator (#50945)

Eric Chiang

Sep 20, 2017, 12:12:53 PM

Reopened #50945.



Kubernetes Submit Queue

Sep 20, 2017, 12:13:19 PM

[MILESTONENOTIFIER] Milestone Labels Complete

@k8s-merge-robot

Issue label settings:

  • sig/api-machinery: Issue will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/bug: Fixes a bug discovered during the current release.
Additional instructions are available here. The commands available for adding these labels are documented here.

Eric Chiang

Sep 20, 2017, 12:14:35 PM

This has started failing again in our GKE test suite: https://k8s-testgrid.appspot.com/release-master-blocking#gke

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke/15715#sig-api-machinery-aggregator-should-be-able-to-support-the-17-sample-api-server-using-the-current-aggregator

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/apimachinery/aggregator.go:69
attempting to delete a newly created flunders resource
Expected error:
    <*errors.StatusError | 0xc4211b4990>: {
        ErrStatus: {
            TypeMeta: {Kind: "", APIVersion: ""},
            ListMeta: {SelfLink: "", ResourceVersion: "", Continue: ""},
            Status: "Failure",
            Message: "the server could not find the requested resource",
            Reason: "NotFound",
            Details: {
                Name: "",
                Group: "",
                Kind: "",
                UID: "",
                Causes: [
                    {
                        Type: "UnexpectedServerResponse",
                        Message: "unknown",
                        Field: "",
                    },
                ],
                RetryAfterSeconds: 0,
            },
            Code: 404,
        },
    }
    the server could not find the requested resource
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/apimachinery/aggregator.go:430

cc @kubernetes/sig-api-machinery-test-failures
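For context, the step that fails boils down to creating a flunder through the aggregated API and then deleting it. Here is a minimal sketch of that flow, assuming the sample API server's wardle.k8s.io/v1alpha1 flunders resource and using current client-go signatures; the group, namespace, object name, and kubeconfig wiring are illustrative, not the exact test code:

    package main

    import (
        "context"
        "fmt"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
        "k8s.io/apimachinery/pkg/runtime/schema"
        "k8s.io/client-go/dynamic"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        // Assumed wiring: build a client from the local kubeconfig; the e2e framework
        // constructs its clients differently, so this is illustrative only.
        config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        client, err := dynamic.NewForConfig(config)
        if err != nil {
            panic(err)
        }

        // Assumed GroupVersionResource served by the sample API server via the aggregator.
        flunders := schema.GroupVersionResource{Group: "wardle.k8s.io", Version: "v1alpha1", Resource: "flunders"}

        obj := &unstructured.Unstructured{Object: map[string]interface{}{
            "apiVersion": "wardle.k8s.io/v1alpha1",
            "kind":       "Flunder",
            "metadata":   map[string]interface{}{"name": "test-flunder"},
        }}

        ctx := context.Background()
        // Create the flunder through the aggregated API.
        if _, err := client.Resource(flunders).Namespace("default").Create(ctx, obj, metav1.CreateOptions{}); err != nil {
            panic(err)
        }
        // Then delete it; the delete is the call that comes back 404 in the failure above.
        err = client.Resource(flunders).Namespace("default").Delete(ctx, "test-flunder", metav1.DeleteOptions{})
        fmt.Println("delete error:", err)
    }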

Walter Fender

Sep 20, 2017, 4:55:26 PM

/assign @cheftako

Eric Chiang

Sep 20, 2017, 5:42:51 PM

Jaice Singer DuMars

Sep 21, 2017, 11:02:06 AM

@cheftako @ericchiang - we need to determine (today if possible) if this is truly release blocking. If so, please add the release-blocker label. And, if not, how do we best continue work on this for 1.8.x/1.9.0?

Jordan Liggitt

Sep 21, 2017, 1:17:01 PM

This test has two passes and four failures on the same commit.

I'm seeing GKE-specific authz grants in that test that are incorrect:

3b9485b#diff-c944d1288edcaf37beebab811603bfd8L164

That commit removed the wait for the authz grant to become effective (which can lead to flakes), and it granted superuser permissions to all users, which is incorrect and invalidates any other authz-related tests run in parallel with this test.
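For reference, the removed wait was essentially a poll against the authorization API until the new grant is observed. A rough sketch of that shape, not the framework's actual helper; the user, verb, group, resource, and timeouts are placeholders:

    package e2esketch

    import (
        "context"
        "time"

        authorizationv1 "k8s.io/api/authorization/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/wait"
        authorizationv1client "k8s.io/client-go/kubernetes/typed/authorization/v1"
    )

    // waitForAuthorizationUpdate polls a SubjectAccessReview until an RBAC grant is
    // reported as effective, or the timeout expires. Placeholder values throughout.
    func waitForAuthorizationUpdate(ctx context.Context, c authorizationv1client.SubjectAccessReviewInterface) error {
        review := &authorizationv1.SubjectAccessReview{
            Spec: authorizationv1.SubjectAccessReviewSpec{
                User: "system:anonymous",
                ResourceAttributes: &authorizationv1.ResourceAttributes{
                    Group:     "wardle.k8s.io",
                    Resource:  "flunders",
                    Verb:      "list",
                    Namespace: "default",
                },
            },
        }
        return wait.PollUntilContextTimeout(ctx, time.Second, time.Minute, true,
            func(ctx context.Context) (bool, error) {
                resp, err := c.Create(ctx, review, metav1.CreateOptions{})
                if err != nil {
                    return false, err
                }
                // Keep polling until the authorizer reports the access as allowed.
                return resp.Status.Allowed, nil
            })
    }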

Jordan Liggitt

Sep 21, 2017, 1:25:26 PM

Eric Chiang

Sep 21, 2017, 4:47:31 PM

I can add the wait back.

Eric Chiang

Sep 21, 2017, 5:15:59 PM

Actually, after staring at this test for about half an hour, I can't figure out what different users exist or what permissions they're being granted. ClientSet, InternalClientset, and AggregatorClient are all initialized from the same config, so I don't see how one would be able to create an RBAC binding but another would fail later.

https://github.com/kubernetes/kubernetes/blob/6808e800c9e556185d6c2a0f87ab1e777a8402d1/test/e2e/framework/framework.go#L156-L161

@cheftako any thoughts here?
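For anyone following along, the framework wiring linked above is roughly the following shape. This is a paraphrase, not the actual framework.go code; the internal clientset is omitted and names are illustrative. The point is that every client is built from the same rest.Config, so they all present the same credentials:

    package e2esketch

    import (
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/rest"

        aggregatorclient "k8s.io/kube-aggregator/pkg/client/clientset_generated/clientset"
    )

    // buildClients paraphrases the framework wiring: both clients are constructed from
    // the same rest.Config and therefore authenticate as the same user.
    func buildClients(config *rest.Config) (kubernetes.Interface, aggregatorclient.Interface, error) {
        clientSet, err := kubernetes.NewForConfig(config)
        if err != nil {
            return nil, nil, err
        }
        aggClient, err := aggregatorclient.NewForConfig(config) // same config, same identity
        if err != nil {
            return nil, nil, err
        }
        return clientSet, aggClient, nil
    }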

Walter Fender

Sep 21, 2017, 6:36:43 PM

I honestly think the GKE-specific BindClusterRole is a red herring. It is needed so the client has permission to perform one of the setup steps (I think it was either to create the wardler cluster role or to bind that role to the anonymous user). Once that setup step is complete we no longer need that cluster role bound, so I don't think it's related.

Jordan Liggitt

Sep 21, 2017, 6:39:21 PM

I don't see how one would be able to create an RBAC binding but another would fail later.

The GKE authorizer allows the “bind” verb, so the client can create a binding to the cluster-admin role. It cannot create a role directly unless it has permissions via RBAC. Since we don’t have a way to determine the username associated with iclient, binding to all authenticated users was the workaround.
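Sketched out, the workaround looks something like the binding below, with illustrative names and rbac/v1 types rather than the v1beta1 ones from that era: a ClusterRoleBinding from the system:authenticated group to cluster-admin, which only requires the "bind" verb to create. Because it grants cluster-admin to every authenticated user, any authz-sensitive test running in parallel would see inflated permissions:

    package e2esketch

    import (
        "context"

        rbacv1 "k8s.io/api/rbac/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // Illustration of the workaround described above: bind every authenticated user to
    // cluster-admin. Names are placeholders; the test-era code used rbac v1beta1 types.
    func bindAllAuthenticatedToClusterAdmin(ctx context.Context, client kubernetes.Interface) error {
        binding := &rbacv1.ClusterRoleBinding{
            ObjectMeta: metav1.ObjectMeta{Name: "wardler-all-authenticated-admin"},
            RoleRef: rbacv1.RoleRef{
                APIGroup: rbacv1.GroupName,
                Kind:     "ClusterRole",
                Name:     "cluster-admin", // creating this binding only needs the "bind" verb on GKE
            },
            Subjects: []rbacv1.Subject{{
                APIGroup: rbacv1.GroupName,
                Kind:     rbacv1.GroupKind,
                Name:     "system:authenticated", // every authenticated user, hence the concern above
            }},
        }
        _, err := client.RbacV1().ClusterRoleBindings().Create(ctx, binding, metav1.CreateOptions{})
        return err
    }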

Jordan Liggitt

Sep 21, 2017, 6:40:43 PM

I agree that the point at which the tests are failing indicates that the previous authorization issues are not the cause.

Kubernetes Submit Queue

Sep 21, 2017, 7:06:22 PM

Closed #50945 via #52816.

Walter Fender

Sep 21, 2017, 9:11:52 PM

/open

k8s-ci-robot

Sep 21, 2017, 9:11:59 PM

Reopened #50945.

Walter Fender

Sep 21, 2017, 9:12:05 PM

/reopen

Walter Fender

Sep 21, 2017, 9:12:17 PM

So there is a lot more information to work with now, but the error is still occurring. I am still looking into this.

Jordan Liggitt

Sep 22, 2017, 9:05:48 PM

@cheftako any update on the investigation?

Aaron Crickenberger

Sep 25, 2017, 12:53:12 PM

https://storage.googleapis.com/k8s-gubernator/triage/index.html?test=aggregator

Friendly v1.8 release team ping. This failure still seems to be happening; is it actively being worked on? Does this need to be in the v1.8 milestone?

Kubernetes Submit Queue

Sep 26, 2017, 3:34:21 PM

Closed #50945 via #53030.
