So, I don't even know where to begin, and I haven't 100% pinned down the root cause yet, but I will share what we just wrestled with because it literally drove us to drinking…
We are integrating Tekton into our Quarkus platform, and the first hurdle was getting the Tekton Operator (v0.67.0) installed on EKS and configured the way we wanted. Finally, after weeks, we got it running to our liking.
Once we got Tekton running, we built two pipelines - one using a PVC and one not. Applying these directly to Tekton, in either Minikube or EKS, worked as expected.
Finally, we moved into Quarkus and pulled in the Quarkus Tekton extension (originally v1.0.0, then v1.0.1). We built a service that created the equivalent of the pipeline run we had applied manually and then used that service to kick off Tekton.
This is where it gets interesting...
We could apply the Tekton pipeline runs, etc. directly to the cluster or Minikube, and they worked.
We could invoke our Tekton service locally, as a JVM instance, pointing at either Minikube or the EKS cluster, and both pipelines worked.
When we deployed the Tekton service as a native instance, the pipeline not using the PVC worked, but the one using a PVC failed with the error below. We validated it was specific to the PVC by changing the other pipeline to use a PVC as well and got the same error. In all cases, we never even got to Tekton because of this.
"message": "Failure executing: POST at: https://172.20.0.1:443/apis/tekton.dev/v1beta1/namespaces/pipeline/pipelineruns. Message: admission webhook \"webhook.pipeline.tekton.dev\" denied the request: mutation failed: cannot decode incoming new object: json: unknown field \"claims\". Received status: Status(apiVersion=v1, code=400, details=null, kind=Status, message=admission webhook \"webhook.pipeline.tekton.dev\" denied the request: mutation failed: cannot decode incoming new object: json: unknown field \"claims\", metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=BadRequest, status=Failure, additionalProperties={})."
What we went through:
We opened up all of our ClusterRoles (temporarily), thinking maybe we had configured them wrong - did not work.
We downgraded the Tekton Operator to v0.66 - did not work.
We switched from the Tekton Operator to regular Pipelines - did not work.
Since the error referred to an unknown/extra property, we thought it might be the Kubernetes client, or the Tekton extension and its models, so we overrode all of those dependencies across the board (which did change the param value type from ArrayOrString to ParamValue) - did not work.
We dumped out the spec being sent to Tekton, found the "claims" property being sent over as an empty array ([]), and nulled it out - this worked! Again, the unmodified spec worked as a JVM instance; only natively did it fail.
We hacked the code, essentially going from the commented-out section to the section below it:
// Before (fails natively for PVC-backed runs):
/*
.item(() -> tektonClient.v1beta1().pipelineRuns()
        .inNamespace(tektonConfig.namespace())
        .resource(r).create())
.invoke(pr -> log.debug("Applied pipeline run"))
.runSubscriptionOn(Infrastructure.getDefaultWorkerPool())
*/
// After (workaround: null out the empty "claims" list first):
.item(() -> {
    for (WorkspaceBinding b : r.getSpec().getWorkspaces()) {
        if (b.getVolumeClaimTemplate() != null) {
            // The spec was serialized with "claims": []; nulling the field
            // drops it entirely, and the admission webhook then accepts it.
            b.getVolumeClaimTemplate().getSpec().getResources().setClaims(null);
        }
    }
    String pipelineRun = r.toString(); // debug: dump the spec being sent
    return tektonClient.v1beta1().pipelineRuns()
            .inNamespace(tektonConfig.namespace())
            .resource(r).create();
})
Hypothesis:
Since we could apply the pipelines directly, we felt the pipelines and Tekton were not the problem. Running locally or against the cluster in JVM mode worked, so the code did not seem to be the problem either. Running natively with the same service, only the PVC pipeline failed - so it had to be either something cluster-related or something specific to running natively.
The fact that we had to manually change that specific property to null instead of [] makes us think it is something specific to the native instance, similar to the cases where we need to apply @RegisterForReflection.
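If that hypothesis holds and this is a reflection-metadata gap in the native image, the usual shape of the workaround looks something like the following. This is purely a guess, not a confirmed fix, and the specific Fabric8 model classes named here are our suspects, not verified culprits:

```java
import io.quarkus.runtime.annotations.RegisterForReflection;
import io.fabric8.kubernetes.api.model.PersistentVolumeClaimSpec;
import io.fabric8.kubernetes.api.model.ResourceRequirements;

// Registers the suspect model classes so their full field/serialization
// metadata survives native-image closed-world analysis.
@RegisterForReflection(targets = {
        ResourceRequirements.class,
        PersistentVolumeClaimSpec.class
})
public class TektonModelReflection {
}
```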
Why this was difficult to debug:
This was extremely difficult to debug initially because there were a lot of moving parts, and even now we do not know for sure that this is the actual problem. We had to reconcile the versions of the extension and of the Kubernetes client and its extensions (i.e., Kubernetes Client + Tekton extension + models). Tekton itself had been challenging to get working outside of Quarkus, so we also wondered whether the problem was there, related to the CRDs or to differences between the operator and a plain install.
It would be really nice if extensions listed the versions of what they reference (i.e., Kubernetes Client vX.y.z); that would have saved us a lot of pain digging into all the pieces, especially where CRDs or moving systems like Tekton are involved.
Hi KimJohn,
Does not sound fun.
I'm struggling to fully grok what/where you actually changed the code and where you think things should change.
I grok it's some diff in behaviour between JVM mode and native - we should track that down, because that shouldn't happen.
Could you open an issue with some more details, or even a PR with the code change, which I think belongs in one of the Quarkus or Quarkiverse extensions?
/max
https://xam.dk/about
--
You received this message because you are subscribed to the Google Groups "Quarkus Development mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to quarkus-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/quarkus-dev/fc11352f-ed0f-41c9-9dda-3ffc7a00d230n%40googlegroups.com.