Question on AllocateLoadBalancerNodePorts KEP

Tim Hockin

Jul 4, 2021, 1:33:23 AM7/4/21
to kubernetes-sig-network
To Lars in particular (as author) but anyone feel free to weigh in.

This KEP adds a field to express "don't allocate node ports in LB
mode". If I set that field to false, it will not allocate ports. What
do we think SHOULD happen if I set that field to false *and* specify a
value in `.spec.ports[0].nodePort` (but maybe not in `ports[1]`)?

I expected it to allocate the NodePort I specified (or fail), but not
allocate anything for ports that do not have a NodePort value. It
actually allocates neither, but also does not clear the NodePort
value. The result is that we list a port in the API which is not
allocated. That will confuse things like kube-proxy, especially
since someone else COULD allocate that port for real.

So, should it:

a) Allocate specifically requested ports
b) Clear the NodePort field (we usually do not mutate user input)
c) Fail the API operation

?

Tim

Tim Hockin

Jul 4, 2021, 1:51:06 AM7/4/21
to kubernetes-sig-network
```
diff --git a/pkg/registry/core/service/storage/rest.go b/pkg/registry/core/service/storage/rest.go
index f24c79b2d26..c696f7d91a6 100644
--- a/pkg/registry/core/service/storage/rest.go
+++ b/pkg/registry/core/service/storage/rest.go
@@ -279,10 +279,16 @@ func (al *RESTAllocStuff) releaseAllocatedResources(svc *api.Service) {
 }

 func shouldAllocateNodePorts(service *api.Service) bool {
-	if utilfeature.DefaultFeatureGate.Enabled(features.ServiceLBNodePortControl) {
-		return *service.Spec.AllocateLoadBalancerNodePorts
+	if service.Spec.Type == api.ServiceTypeNodePort {
+		return true
 	}
-	return true
+	if service.Spec.Type == api.ServiceTypeLoadBalancer {
+		if utilfeature.DefaultFeatureGate.Enabled(features.ServiceLBNodePortControl) {
+			return *service.Spec.AllocateLoadBalancerNodePorts
+		}
+		return true
+	}
+	return false
 }

 // externalTrafficPolicyUpdate adjusts ExternalTrafficPolicy during service update if needed.
@@ -424,8 +430,7 @@ func (rs *REST) Update(ctx context.Context, name string, objInfo rest.UpdatedObj
 		releaseNodePorts(oldService, nodePortOp)
 	}
 	// Update service from any type to NodePort or LoadBalancer, should update NodePort.
-	if service.Spec.Type == api.ServiceTypeNodePort ||
-		(service.Spec.Type == api.ServiceTypeLoadBalancer && shouldAllocateNodePorts(service)) {
+	if service.Spec.Type == api.ServiceTypeNodePort || service.Spec.Type == api.ServiceTypeLoadBalancer {
 		if err := updateNodePorts(oldService, service, nodePortOp); err != nil {
 			return nil, false, err
 		}
@@ -1163,8 +1168,7 @@ func (al *RESTAllocStuff) allocServiceNodePortsNew(service *api.Service, dryRun

 	// TODO: This creates nodePorts if needed. In the future nodePorts may be cleared if *never* used.
 	// But for now we stick to the KEP "don't allocate new node ports but do not deallocate existing node ports if set"
-	if service.Spec.Type == api.ServiceTypeNodePort ||
-		(service.Spec.Type == api.ServiceTypeLoadBalancer && shouldAllocateNodePorts(service)) {
+	if service.Spec.Type == api.ServiceTypeNodePort || service.Spec.Type == api.ServiceTypeLoadBalancer {
 		if err := initNodePorts(service, nodePortOp); err != nil {
 			txn.Revert()
 			return nil, err
@@ -1186,6 +1190,9 @@ func initNodePorts(service *api.Service, nodePortOp *portallocator.PortAllocatio
 	svcPortToNodePort := map[int]int{}
 	for i := range service.Spec.Ports {
 		servicePort := &service.Spec.Ports[i]
+		if servicePort.NodePort == 0 && !shouldAllocateNodePorts(service) {
+			continue
+		}
 		allocatedNodePort := svcPortToNodePort[int(servicePort.Port)]
 		if allocatedNodePort == 0 {
 			// This will only scan forward in the service.Spec.Ports list because any matches
@@ -1238,6 +1245,9 @@ func updateNodePorts(oldService, newService *api.Service, nodePortOp *portalloca

 	for i := range newService.Spec.Ports {
 		servicePort := &newService.Spec.Ports[i]
+		if servicePort.NodePort == 0 && !shouldAllocateNodePorts(newService) {
+			continue
+		}
 		nodePort := ServiceNodePort{Protocol: servicePort.Protocol, NodePort: servicePort.NodePort}
 		if nodePort.NodePort != 0 {
 			if !containsNumber(oldNodePortsNumbers, int(nodePort.NodePort)) && !portAllocated[int(nodePort.NodePort)] {
```

lars.g...@est.tech

Jul 4, 2021, 3:46:41 AM7/4/21
to kubernetes-sig-network
It should be alternative a) Allocate specifically requested ports.

Explicitly defined nodePorts are allocated regardless of allocateLoadBalancerNodePorts.

I tested on v1.22.0-beta.0.420+5fe522c2376e9a and it seems to work as intended.

```
apiVersion: v1
kind: Service
metadata:
  name: nodeport-disabled-np-ipv6
spec:
  allocateLoadBalancerNodePorts: false
  ipFamilies:
  - IPv6
  selector:
    app: mconnect
  ports:
  - port: 5001
    name: mconnect
    nodePort: 30007
  - port: 5003
    name: ctraffic
  type: LoadBalancer
```

```
# kubectl get svc nodeport-disabled-np-ipv6
NAME                        TYPE           CLUSTER-IP        EXTERNAL-IP   PORT(S)                   AGE
nodeport-disabled-np-ipv6   LoadBalancer   fd00:4000::cc2a   1000::55      5001:30007/TCP,5003/TCP   8m22s
```

I also tested external connectivity to node:30007 with both proxy-mode=ipvs and proxy-mode=iptables, and it works.

> The result is we list a port in the API which is not allocated.

That "should not be possible", since the kube-proxy code is not altered by this feature; it is purely an api-server change. So if you can see the nodePort, so can kube-proxy.


Regards,
Lars E

lars.g...@est.tech

Jul 4, 2021, 4:14:04 AM7/4/21
to kubernetes-sig-network
The reasoning for selecting a) Allocate specifically requested ports, I think, was that in a cluster where a default of allocateLoadBalancerNodePorts: false is set by a mutating webhook, the behavior should be consistent with clusters without one.

However, I can't find this in the KEP, nor in the PR discussions. So IMO it should be documented.

Antonio Ojea

Jul 4, 2021, 5:39:48 AM7/4/21
to lars.g...@est.tech, kubernetes-sig-network
If a), then it seems we have a bug. As Tim explained, even though the nodePort shows up in the API, the Service REST code doesn't allocate the port, so the API allows multiple Services with the same nodePort value.
Kube-proxy will work with one Service because it consumes the NodePort from the API and doesn't care about the allocators; they are only needed to validate and avoid duplicate resources (ClusterIPs and NodePorts) in the API. However, kube-proxy will fail (somehow) if we create a second Service with the same NodePort.
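
The point about the allocator being the API's only duplicate check can be sketched with a toy allocator (all names here are illustrative, not the real kubernetes registry code):

```go
package main

import "fmt"

// toyPortAllocator mimics the apiserver-side bookkeeping: it is the only
// place duplicate nodePorts get caught. Kube-proxy never consults it.
type toyPortAllocator struct {
	allocated map[int]bool
}

func newToyPortAllocator() *toyPortAllocator {
	return &toyPortAllocator{allocated: map[int]bool{}}
}

// Allocate reserves a specific port, failing if it is already taken.
func (a *toyPortAllocator) Allocate(port int) error {
	if a.allocated[port] {
		return fmt.Errorf("port %d is already allocated", port)
	}
	a.allocated[port] = true
	return nil
}

func main() {
	alloc := newToyPortAllocator()

	// First Service explicitly requests nodePort 30007; the allocator reserves it.
	fmt.Println(alloc.Allocate(30007)) // <nil>

	// If the buggy path skips this call for the first Service, nothing stops a
	// second Service from claiming 30007; with allocation done, it is rejected.
	fmt.Println(alloc.Allocate(30007)) // port 30007 is already allocated
}
```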




Tim Hockin

Jul 4, 2021, 12:07:06 PM7/4/21
to Antonio Ojea, lars.g...@est.tech, kubernetes-sig-network
I think I agree on a).

The patch I posted here seems to fix it in my dev branch. I can see about pulling that to the front of the queue and peeling it off into a PR.


Tim Hockin

Jul 4, 2021, 6:13:04 PM7/4/21
to Antonio Ojea, lars.g...@est.tech, kubernetes-sig-network