ClusterIP immutability


khn...@gmail.com

Oct 8, 2020, 8:01:13 PM
to kubernetes-sig-network
Sending this at the risk of driving more insanity into an already insane API type.

Why is Service.Spec.ClusterIP immutable? Is there a historical reason for that?

Things to note:
- In reality it is not that immutable. I can patch Service.Spec.Type from ClusterIP to ExternalName and then back to ClusterIP, setting the desired IP in the second step.

- I can do the same with headless => ExternalName => ClusterIP (but I cannot go headless => ClusterIP directly, *sigh*).
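
The two patch sequences above can be sketched as a toy model. This is not the apiserver's validation code: `ServiceSpec`, `update_allowed`, `path_allowed` and the address `10.0.0.10` are hypothetical names and values chosen for illustration, and the rule encodes only the behavior described in this message.

```python
from dataclasses import dataclass

HEADLESS = "None"  # the literal string that marks a headless Service


@dataclass(frozen=True)
class ServiceSpec:
    type: str             # "ClusterIP", "NodePort", "LoadBalancer" or "ExternalName"
    cluster_ip: str = ""  # "" means unset, "None" means headless


def update_allowed(old: ServiceSpec, new: ServiceSpec) -> bool:
    """Toy rule: clusterIP is frozen unless ExternalName is on either side."""
    if "ExternalName" in (old.type, new.type):
        # An ExternalName Service carries no clusterIP, so the field may be
        # cleared on the way in and set to anything on the way out.
        return new.type != "ExternalName" or new.cluster_ip == ""
    return old.cluster_ip == new.cluster_ip


def path_allowed(*specs: ServiceSpec) -> bool:
    """Check every consecutive update in a sequence of specs."""
    return all(update_allowed(a, b) for a, b in zip(specs, specs[1:]))


headless = ServiceSpec("ClusterIP", HEADLESS)
external = ServiceSpec("ExternalName")
with_ip = ServiceSpec("ClusterIP", "10.0.0.10")

print(path_allowed(headless, with_ip))            # False: direct change is rejected
print(path_allowed(headless, external, with_ip))  # True: detour via ExternalName
```

Under these assumptions the field is only nominally immutable: any value can be reached in two updates.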

My ultimate goal in starting this discussion is to remove that restriction and align the API to a more predictable, common behavior (removing *lots* of code and tests along the way), but I'd settle for "OK, it shouldn't be immutable, and we should fix that in some vNext API".

thoughts?

Kal

Antonio Ojea

Oct 9, 2020, 2:56:01 AM
to khn...@gmail.com, kubernetes-sig-network
This is a really interesting topic. Let me share my understanding and my view on it:

ClusterIP, NodePort and LoadBalancer are related and build on one another, so it makes sense that you cannot change the ClusterIP when moving between them. They solve the problem of how to expose a ClusterIP externally (quoting from the code):
- ServiceTypeLoadBalancer means a service will be exposed via an external load balancer (if the cloud provider supports it), in addition to 'NodePort' type.
- ServiceTypeNodePort means a service will be exposed on one port of every node, in addition to 'ClusterIP' type.
- ServiceTypeClusterIP means a service will only be accessible inside the cluster, via the ClusterIP.

However, ClusterIP has a special case: ClusterIP = None. It means that the service is headless: it generates endpoints and DNS records, but does not have a ClusterIP.
The ServiceTypeExternalName means a service consists of only a reference to a CNAME.

So far so good. I think we can differentiate between Services that use IPs (ClusterIP, NodePort and LoadBalancer) and Services that do not need them (ClusterIP=None and ExternalName).
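
As a rough illustration of that split, the sketch below walks the "in addition to" chain from the quoted code comments and classifies a Service as IP-backed or DNS-only. The names `BUILDS_ON`, `layers` and `addressing` are hypothetical helpers, not part of any Kubernetes library.

```python
# Each exposure type builds on the one it maps to, per the quoted comments.
BUILDS_ON = {"LoadBalancer": "NodePort", "NodePort": "ClusterIP"}


def layers(service_type):
    """Return the chain of types a Service type is layered on."""
    chain = [service_type]
    while chain[-1] in BUILDS_ON:
        chain.append(BUILDS_ON[chain[-1]])
    return chain


def addressing(service_type, cluster_ip=""):
    """Return "ip" for Services backed by a virtual IP, "dns" otherwise."""
    if service_type == "ExternalName":
        return "dns"  # only a reference to a CNAME
    if service_type == "ClusterIP" and cluster_ip == "None":
        return "dns"  # headless: endpoints and DNS records, no virtual IP
    if service_type in ("ClusterIP", "NodePort", "LoadBalancer"):
        return "ip"
    raise ValueError(f"unknown Service type: {service_type!r}")


print(layers("LoadBalancer"))               # ['LoadBalancer', 'NodePort', 'ClusterIP']
print(addressing("ClusterIP", "None"))      # dns
print(addressing("NodePort", "10.0.0.10"))  # ip
```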

Considering this, and thinking about a possible "vNext" API:

I think that ClusterIP=None is something we should remove if we can. IMHO it is complicated to consume, ...

If the main problem is mutating from an "IP type" Service to a "DNS type" Service, should we make this difference explicit and create new API types for each?

ClusterIPs can be directly reachable with IPv6, because you can assign public IPs to Services. That moves a Layer 4 problem to Layer 3, which has its own nuances, but I don't think it needs any change to the current API, just a change to the definition "service will only be accessible inside" ...


--
You received this message because you are subscribed to the Google Groups "kubernetes-sig-network" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-sig-ne...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-sig-network/22bb2e9d-b282-408e-990f-ff30ac07587bn%40googlegroups.com.

Sandor Szuecs

Oct 9, 2020, 10:56:55 AM
to Antonio Ojea, khn...@gmail.com, kubernetes-sig-network
On Fri, 9 Oct 2020 at 08:55, Antonio Ojea <antonio.o...@gmail.com> wrote:
> However, ClusterIP has a special case, that is ClusterIP = None. It means that the service is Headless, it generates endpoints and DNS records, and does not have a ClusterIP.
> The ServiceTypeExternalName means a service consists of only a reference to a CNAME.

If ClusterIP=None generates endpoints, it's the perfect fit for most ingress controllers.

Best, sandor

Tim Hockin

Oct 9, 2020, 11:59:48 AM
to Sandor Szuecs, Antonio Ojea, khn...@gmail.com, kubernetes-sig-network
It makes more sense if you look at it evolutionarily.

First we had `type=(ClusterIP | NodePort | LoadBalancer)`. Making the
clusterIP field immutable made sense and made it easier to be
confident in the implementation. But it had to be possible to move
between those types, so `type` was a mutable field.

Then we added Headless (`type:ClusterIP && clusterIP: None`). That
was a design mistake and I honestly do not recall the justification.
But keeping the clusterIP field immutable still made sense.

Then we added `type: ExternalName`. Since `type` was mutable, we kept
with the pattern. This was the real failure, IMO. ExternalName
doesn't need a ClusterIP, and we validate that it is empty (so we
don't allocate finite IPs for no reason). So now we more or less HAD
to allow you to clear `clusterIP`. But only in that specific
circumstance. Blech.

It seems exceedingly unlikely that anyone uses the ability to change
to/from ExternalName for real reasons.

Now, what can we do about it? We can make ClusterIP mutable, maybe.
That seems like a vector for mistakes (users and controllers) for
relatively little value. We could make it mutable to/from "None".
That at least has some value, but it complicates every consumer of the
API (vs "just delete and recreate the service") and is kind of a
breaking change. We could move headless to a type, but that would be
a breaking change. We could disable the conversion to/from
ExternalName, but that would be a breaking change.

Sadly, I don't think it's possible to tighten the spec, and I think
loosening it would be a lot of complexity for little value. I am
willing to be proven wrong (eager, in fact), but my feeling is that we
probably just have to live with the mistakes and try not to make it
worse.

I am hoping Gateway will supersede most non-trivial use-cases in
Service, and we can deprecate (but not remove) the older functionality
in favor of it. In the limit, Service becomes a selector and ports
list, and all the LB stuff moves to Gateway.

Thoughts?

Tim

Sandor Szuecs

Oct 9, 2020, 1:00:59 PM
to Tim Hockin, Antonio Ojea, khn...@gmail.com, kubernetes-sig-network
Hi!

On Fri, 9 Oct 2020 at 17:59, Tim Hockin <tho...@google.com> wrote:
> -----------8<------------
>
> I am hoping Gateway will supersede most non-trivial use-cases in
> Service, and we can deprecate (but not remove) the older functionality
> in favor of it. In the limit, Service becomes a selector and ports
> list, and all the LB stuff moves to Gateway.
>
> Thoughts?

I don't have much to say about Service. As long as we can use Services to map an ingress to endpoints or EndpointSlices, I am happy.

The Gateway topic is different. I think there are a lot of weaknesses in that group of objects, which would block us from using it or even migrating our 20k Ingress objects.
I don't want to go too far off-topic.

Best, sandor
--

Tim Hockin

Oct 9, 2020, 1:05:55 PM
to Sandor Szuecs, Antonio Ojea, khn...@gmail.com, kubernetes-sig-network
On Fri, Oct 9, 2020 at 10:00 AM Sandor Szuecs <sandor...@zalando.de> wrote:
>
> Hi!
>
> On Fri, 9 Oct 2020 at 17:59, Tim Hockin <tho...@google.com> wrote:
>>
>> -----------8<------------
>> I am hoping Gateway will supersede most non-trivial use-cases in
>> Service, and we can deprecate (but not remove) the older functionality
>> in favor of it. In the limit, Service becomes a selector and ports
>> list, and all the LB stuff moves to Gateway.
>>
>> Thoughts?
>
>
> I have not much to say about service as long as we can group with them an ingress to endpoints or endpointSlices, I am happy.

Exactly. That's what Service is really good at. The rest should be
outsourced, whether to Ingress or Gateway or ...

> Gateway topic is different. I think there are a lot of weaknesses in the object group, that would be a blocker for us to use or even migrate our 20k ingress objects.
> I don't want to write too much offtopic.

Have you expressed these issues to the Gateway folks? It's really
important that Gateway considers those use-cases if it is to succeed.

Casey Callendrello

Oct 9, 2020, 1:06:28 PM
to Tim Hockin, Sandor Szuecs, Antonio Ojea, khn...@gmail.com, kubernetes-sig-network
On Fri, Oct 9, 2020 at 5:59 PM 'Tim Hockin' via kubernetes-sig-network <kubernetes-...@googlegroups.com> wrote:

> I am hoping Gateway will supersede most non-trivial use-cases in
> Service, and we can deprecate (but not remove) the older functionality
> in favor of it. In the limit, Service becomes a selector and ports
> list, and all the LB stuff moves to Gateway.

Oh, interesting. I'd just sort of assumed that the Gateway API was going to be (approximately) L7. In other words, a better Ingress. It totally passed me by that it would be a replacement for the L3 infrastructure that is Service + kube-proxy (and friends). Of course, the TCPRoute makes that pretty clear.

Do you foresee a set of default GatewayClasses in every cluster that approximate what kube-proxy provides? It is "legal" to run a cluster with no StorageClasses (albeit a less useful one), but it's hard to conceive of a cluster without any load balancing.

--Casey "the other Casey" Callendrello

Tim Hockin

Oct 9, 2020, 1:11:50 PM
to Casey Callendrello, Sandor Szuecs, Antonio Ojea, khn...@gmail.com, kubernetes-sig-network
On Fri, Oct 9, 2020 at 10:06 AM Casey Callendrello <c...@redhat.com> wrote:
>
> On Fri, Oct 9, 2020 at 5:59 PM 'Tim Hockin' via kubernetes-sig-network <kubernetes-...@googlegroups.com> wrote:
>>
>>
>> I am hoping Gateway will supersede most non-trivial use-cases in
>> Service, and we can deprecate (but not remove) the older functionality
>> in favor of it. In the limit, Service becomes a selector and ports
>> list, and all the LB stuff moves to Gateway.
>>
>
> Oh, interesting. I'd just sort of assumed that the Gateway API was going to be (approximately) L7. In other words, a better Ingress. It totally passed me by that it would be a replacement for the L3 infrastructure that is Service + kube-proxy (and friends). Of course, the TCPRoute makes that pretty clear.

Gateway is shaping up to be the "Grand Unified Theory" :)

> Do you foresee a set of default GatewayClasses with every cluster that approximate what kube-proxy provides? It is "legal" to run a cluster with no StorageClasses (albeit of less use), but it's hard to conceive of a cluster without any load balancing.

Strictly speaking, kube-proxy and Services are optional. You have to
do some interesting things to live without them, but it can be made to
work.

I do hope we end up with some well-known class names that represent
the various Service `type` values and then kube-proxy (and
work-alikes) can be just another controller implementing the API.
It's a bit early for that, but you can start to see it emerging, I
think.

Oleg Atamanenko

Oct 9, 2020, 1:59:10 PM
to Tim Hockin, Antonio Ojea, Sandor Szuecs, khn...@gmail.com, kubernetes-sig-network
Hello! 

One use case for ExternalName is migrating Services from one namespace to another: we want the old Service to CNAME to the new one, so ExternalName is useful here.
A bit of history: when we started using Kubernetes (it was 1.4) we put most of the applications in a single namespace, and now we are moving them to separate namespaces.
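
A minimal sketch of such an alias, built as a plain manifest dictionary. The helper `external_name_alias` and the names `api`, `shared` and `payments` are hypothetical, and the target hostname assumes the default `cluster.local` cluster domain.

```python
def external_name_alias(name, old_namespace, new_namespace,
                        cluster_domain="cluster.local"):
    """Build a Service manifest that leaves a CNAME behind in the old namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": old_namespace},
        "spec": {
            "type": "ExternalName",
            # In-cluster DNS name of the Service in its new namespace.
            "externalName": f"{name}.{new_namespace}.svc.{cluster_domain}",
        },
    }


alias = external_name_alias("api", "shared", "payments")
print(alias["spec"]["externalName"])  # api.payments.svc.cluster.local
```

Clients that still resolve `api.shared` would then follow the CNAME to the Service in `payments`.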

Regards,
Oleg.
 

--
Best regards,
Oleg.

Tim Hockin

Oct 9, 2020, 2:06:21 PM
to Oleg Atamanenko, Antonio Ojea, Sandor Szuecs, khn...@gmail.com, kubernetes-sig-network
On Fri, Oct 9, 2020 at 10:59 AM Oleg Atamanenko
<oleg.at...@gmail.com> wrote:
>
> Hello!
>
> One of the use-cases for ExternalNames is Services migration from one namespace to another - wanting the old Service to CNAME to the new one, so ExternalName is useful here.
> A bit of history: when we started using kubernetes (It was 1.4) we put most of the applications in the single namespace and now we are moving them to separate namespaces.

That's a very real use-case, but doesn't really require in-place
mutation, does it? I.e., you could delete the service and re-create it.
That leaves a small period where DNS has no record at all - maybe that
IS a problem? Interesting.

Sandor Szuecs

Oct 10, 2020, 3:20:50 PM
to Tim Hockin, Antonio Ojea, khn...@gmail.com, kubernetes-sig-network
On Fri 9. Oct 2020 at 19:05 Tim Hockin <tho...@google.com> wrote:
> Have you expressed these issues to the Gateway folks? It's really
> important that Gateway considers those use-cases if it is to succeed.

I tried to be involved from the beginning, specifying use cases to gather requirements. We talked, but we seem to disagree.
I remember that "you are too far ahead" was one answer.
Best, sandor