kube-proxy question


Budai Laszlo

Jan 22, 2024, 11:00:56 AM
to kubernetes-...@googlegroups.com

Hello Everyone!


I am checking the networking (services) in Kubernetes 1.29.0 and I see a significant difference compared with earlier versions (I checked against 1.27 and 1.28).

If I list the nat/KUBE-SERVICES chain with iptables, I cannot see the destination ports (the service ports) being matched there:

root@worker1:~# iptables -t nat -L KUBE-SERVICES -n
Chain KUBE-SERVICES (2 references)
target     prot opt source               destination         
KUBE-SVC-Z6GDYMWE5TV2NNJN  tcp  --  0.0.0.0/0            10.108.188.27        /* kubernetes-dashboard/dashboard-metrics-scraper cluster IP */
KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  0.0.0.0/0            10.96.0.1            /* default/kubernetes:https cluster IP */
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:dns cluster IP */
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */
KUBE-SVC-JD5MR3NA4I4DYORP  tcp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:metrics cluster IP */
KUBE-SVC-CEZPIJSAUFW5MYPQ  tcp  --  0.0.0.0/0            10.96.200.153        /* kubernetes-dashboard/kubernetes-dashboard cluster IP */
KUBE-NODEPORTS  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

but if I list the same (similar?) chain with nft then I see the service ports:

root@worker1:~# nft list chain nat KUBE-SERVICES
table ip nat {
    chain KUBE-SERVICES {
        meta l4proto tcp ip daddr 10.108.188.27  tcp dport 8000 counter packets 0 bytes 0 jump KUBE-SVC-Z6GDYMWE5TV2NNJN
        meta l4proto tcp ip daddr 10.96.0.1  tcp dport 443 counter packets 0 bytes 0 jump KUBE-SVC-NPX46M4PTMTKRN6Y
        meta l4proto udp ip daddr 10.96.0.10  udp dport 53 counter packets 0 bytes 0 jump KUBE-SVC-TCOU7JCQXEZGVUNU
        meta l4proto tcp ip daddr 10.96.0.10  tcp dport 53 counter packets 0 bytes 0 jump KUBE-SVC-ERIFXISQEP7F7OF4
        meta l4proto tcp ip daddr 10.96.0.10  tcp dport 9153 counter packets 0 bytes 0 jump KUBE-SVC-JD5MR3NA4I4DYORP
        meta l4proto tcp ip daddr 10.96.200.153  tcp dport 443 counter packets 0 bytes 0 jump KUBE-SVC-CEZPIJSAUFW5MYPQ
         fib daddr type local counter packets 233 bytes 14427 jump KUBE-NODEPORTS
    }
}

I know nftables is an alpha feature of kube-proxy, but in my case it is not enabled, and I was expecting kube-proxy to work in iptables mode. The kube-proxy image is 1.29.0 and the command line for kube-proxy is:

      - command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)

My kube-proxy ConfigMap has mode: "", and according to the documentation this should mean iptables.

My node runs the kernel coming with Ubuntu 22.04.3:

root@worker1:~# uname -a
Linux worker1 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Can you please give me some direction on why this is happening, and where I can read more about it?


Thank you in advance for any help.

Kind regards,
Laszlo


Mikaël Cluseau

Jan 22, 2024, 11:36:27 AM
to Budai Laszlo, kubernetes-...@googlegroups.com
Hi Budai,

iptables now works as a translation layer on top of nftables in many distros, and what it produces looks very much like the ruleset you are showing. Example:

$ docker run --rm -it --cap-add NET_ADMIN debian:bookworm
root@1bff23dfb827:/# apt update && apt install -y nftables iptables
[...]
update-alternatives: using /usr/sbin/iptables-legacy to provide /usr/sbin/iptables (iptables) in auto mode
update-alternatives: using /usr/sbin/ip6tables-legacy to provide /usr/sbin/ip6tables (ip6tables) in auto mode
update-alternatives: using /usr/sbin/iptables-nft to provide /usr/sbin/iptables (iptables) in auto mode
update-alternatives: using /usr/sbin/ip6tables-nft to provide /usr/sbin/ip6tables (ip6tables) in auto mode
update-alternatives: using /usr/sbin/arptables-nft to provide /usr/sbin/arptables (arptables) in auto mode
update-alternatives: using /usr/sbin/ebtables-nft to provide /usr/sbin/ebtables (ebtables) in auto mode
Processing triggers for libc-bin (2.36-9+deb12u1) ...
root@1bff23dfb827:/# iptables -A FORWARD -j REJECT -d 1.2.3.4
root@1bff23dfb827:/# nft list ruleset
# Warning: table ip filter is managed by iptables-nft, do not touch!
table ip filter {
chain FORWARD {
type filter hook forward priority filter; policy accept;
ip daddr 1.2.3.4 counter packets 0 bytes 0 reject
}
}
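
By the way, you can check which variant a given iptables binary is with `iptables -V`: it prints "(nf_tables)" or "(legacy)" after the version number (and on Debian/Ubuntu, `update-alternatives --display iptables` shows which binary the iptables name currently points at). For example:

$ iptables -V
iptables v1.8.7 (nf_tables)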





Laszlo Budai

Jan 23, 2024, 6:44:00 AM
to kubernetes-sig-network
Hi Mikaël,

Thank you for the answer.

I have just tested adding a rule using the iptables command:
root@worker1:~# iptables -t nat -N MY-CHAIN
root@worker1:~# iptables -t nat -A KUBE-SERVICES -p tcp -d 192.168.233.1/32 --dport 8008 -j MY-CHAIN
root@worker1:~# iptables -t nat -nL KUBE-SERVICES

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination        
KUBE-SVC-TQPWH5NN6VPUUVCQ  tcp  --  0.0.0.0/0            10.103.93.175        /* dex/dex:dex cluster IP */
KUBE-SVC-OAJHUCXZJOSLNQJH  tcp  --  0.0.0.0/0            10.111.53.108        /* dex/example-app cluster IP */
KUBE-SVC-JD5MR3NA4I4DYORP  tcp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:metrics cluster IP */
KUBE-SVC-CEZPIJSAUFW5MYPQ  tcp  --  0.0.0.0/0            10.96.200.153        /* kubernetes-dashboard/kubernetes-dashboard cluster IP */
KUBE-SVC-AFRTN5RCZN6KOMKE  tcp  --  0.0.0.0/0            10.99.254.249        /* dex/ldap-backend cluster IP */
KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  0.0.0.0/0            10.96.0.1            /* default/kubernetes:https cluster IP */
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:dns cluster IP */
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */
KUBE-SVC-Z6GDYMWE5TV2NNJN  tcp  --  0.0.0.0/0            10.108.188.27        /* kubernetes-dashboard/dashboard-metrics-scraper cluster IP */
KUBE-NODEPORTS  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
MY-CHAIN   tcp  --  0.0.0.0/0            192.168.233.1        tcp dpt:8008
root@worker1:~#

As you can see, the iptables command lists that rule correctly (it includes the destination port information). Is kube-proxy currently using the nft command in 1.29?

Thank you,
Laszlo

Laszlo Budai

Jan 23, 2024, 6:44:04 AM
to kubernetes-sig-network
Hello Mikaël,

I'm not sure whether my previous message is still waiting for moderation or whether I failed to send it... so here I am again:

I did a test by adding a rule to the nat/KUBE-SERVICES chain using iptables, and when I check the rules I can see the destination port there:

root@worker1:~# iptables -t nat -N MY-CHAIN
root@worker1:~# iptables -t nat -A KUBE-SERVICES -p tcp -d 192.168.233.1/32 --dport 8008 -j MY-CHAIN
root@worker1:~# iptables -t nat -nL KUBE-SERVICES

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination        
...

MY-CHAIN   tcp  --  0.0.0.0/0            192.168.233.1        tcp dpt:8008
root@worker1:~#

I can also see the same rule with nft as well:
root@worker1:~# nft list chain nat KUBE-SERVICES
table ip nat {
chain KUBE-SERVICES {
...
meta l4proto tcp ip daddr 192.168.233.1 tcp dport 8008 counter packets 0 bytes 0 jump MY-CHAIN
}
}

But if I add a rule using nft then iptables will not show it correctly:
root@worker1:~# nft add rule ip nat KUBE-SERVICES ip daddr 192.168.233.2 tcp dport 38888 jump MY-CHAIN
root@worker1:~# iptables -t nat -nL KUBE-SERVICES

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination        
...
MY-CHAIN   tcp  --  0.0.0.0/0            192.168.233.1        tcp dpt:8008
MY-CHAIN   tcp  --  0.0.0.0/0            192.168.233.2      
root@worker1:~# 

So from these tests I conclude that kube-proxy is using nft by default. Is this correct, or am I missing something?

Thanks,
Laszlo

Dan Winship

Jan 23, 2024, 7:59:06 AM
to Laszlo Budai, kubernetes-sig-network
On 1/23/24 05:49, Laszlo Budai wrote:
> Hello Mikaël,
>
> I'm not sure whether my previous message is still waiting for
> moderation, or I've missed to send it ... so here I am again:

(If you join the list your messages won't get moderated)

> I did a test by adding a rule to the iptables nat/KUBE-SERVICES using
> iptables, and when I check the rules, then I can see the destination
> port there:
>
> root@worker1:~# iptables -t nat -N MY-CHAIN
> root@worker1:~# iptables -t nat -A KUBE-SERVICES -p tcp -d
> 192.168.233.1/32 --dport 8008 -j MY-CHAIN
> root@worker1:~# iptables -t nat -nL KUBE-SERVICES

"iptables -L" is not a very good command... it tries to guess what
information you care about and present it in table form, while
preserving backward compatibility for people trying to parse the output
based on what old versions did, etc. If you do "iptables -S" instead
you'll get the full rule.
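
For example, on a host whose iptables binary fully understands the installed rules, a cluster-IP rule that -L shows without the port renders in full with -S, something like:

iptables -t nat -S KUBE-SERVICES | grep 'default/kubernetes'
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y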

(I don't know why the output would be different with kube 1.29 than
before... you didn't change iptables versions at the same time?)

-- Dan


Budai Laszlo

Jan 23, 2024, 11:31:04 AM
to Dan Winship, kubernetes-sig-network
Hello Dan!

On 23.01.2024 14:58, Dan Winship wrote:
> (If you join the list your messages won't get moderated)

I did join the list yesterday, so I guess it takes some time until my messages are accepted automatically (I can see the list policy says that messages from new members are moderated, which is normal).



>> I did a test by adding a rule to the iptables nat/KUBE-SERVICES using
>> iptables, and when I check the rules, then I can see the destination
>> port there:
>>
>> root@worker1:~# iptables -t nat -N MY-CHAIN
>> root@worker1:~# iptables -t nat -A KUBE-SERVICES -p tcp -d
>> 192.168.233.1/32 --dport 8008 -j MY-CHAIN
>> root@worker1:~# iptables -t nat -nL KUBE-SERVICES
>
> "iptables -L" is not a very good command... it tries to guess what
> information you care about and present it in table form, while
> preserving backward compatibility for people trying to parse the output
> based on what old versions did, etc. If you do "iptables -S" instead
> you'll get the full rule.
>
> (I don't know why the output would be different with kube 1.29 than
> before... you didn't change iptables versions at the same time?)

I did test with "iptables -S" and also with "iptables-save". Neither of them shows the destination port in the KUBE-SERVICES chain.

root@worker1:~# iptables -t nat -S KUBE-SERVICES
-N KUBE-SERVICES
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 10.100.241.240/32 -p tcp -m comment --comment "default/mydep cluster IP" -j KUBE-SVC-5K4KPF3R3ZZJPT44
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
root@worker1:~# iptables --version
iptables v1.8.7 (nf_tables)


I did another test: on the _same_ cluster I downgraded the kube-proxy image to 1.28.0, and once the new kube-proxy pods had started, the destination ports were there in the iptables output. Then I set it back to the original 1.29.0 and the ports were gone again. So there is definitely something different in kube-proxy 1.29, and that's what I'm trying to figure out.


Kind regards,
Laszlo


Dan Winship

Jan 23, 2024, 12:15:49 PM
to Budai Laszlo, kubernetes-sig-network
On 1/23/24 10:26, Budai Laszlo wrote:
> I did another test: on the _same_ cluster I just downgraded the
> kube-proxy image to 1.28.0 and once the new kube-proxy pods have started
> the destination port was there in the iptables output. Then set it back
> to the original 1.29.0 and the ports are gone again. So definitely there
> is something different in kube-proxy 1.29. And that's what I'm trying to
> figure it out.

Oh, the image comes with its own iptables binary. So you probably are
changing iptables binary versions when you change kube-proxy versions.
And the version in the 1.29 kube-proxy image (presumably iptables-nft
1.8.8) is apparently translating the iptables rules to nftables in a
new, slightly-cleverer way that your old iptables binary doesn't
recognize and can't translate back. We've seen similar problems before
(and unfortunately, we're seeing the same thing in the nftables backend
now too when you have different versions of nft in the kube-proxy image
and on the host).

Anyway, the actual rules that end up in the kernel are fine, it's just
that the copy of the iptables binary you are using doesn't recognize
them. :-/
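
One way to see this directly (the pod name below is just an example) is to compare the iptables version on the host with the one shipped in the kube-proxy image:

iptables -V
kubectl -n kube-system exec kube-proxy-xxxxx -- iptables -V

If the two versions differ, rules written by the newer binary may not be rendered back correctly by the older one, even though the kernel applies them just fine.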

-- Dan

Budai Laszlo

Jan 23, 2024, 6:20:08 PM
to Dan Winship, kubernetes-sig-network
Oh boy! Indeed, listing the rules with the iptables binary from the kube-proxy 1.29 image looks OK. This was a strange issue. Thank you so much for the clarification.
One more thing related to iptables: do I understand correctly that the iptables tool (iptables-nft) is still using nftables under the hood?

Thank you,
Laszlo

Dan Winship

Jan 24, 2024, 9:40:01 AM
to Budai Laszlo, kubernetes-sig-network
On 1/24/24 08:37, Budai Laszlo wrote:
> Hello again,
>
> I want to try out the new nftables mode of kube-proxy, I have added the
> following two options to the kube-proxy command in my daemonset:
>         - --feature-gates=NFTablesProxyMode=true
>         - --proxy-mode=nftables
>
> If I set the mode: "nftables" in the configMap, then the proxy fails and
> it's complaining that the mode can only be ipvs or iptables

That means it thinks the feature gate isn't set. I think you're running
into the problem where it's really non-obvious how config file options
and command-line options override each other. If you are using a config
file at all, you should generally set everything in the config file
rather than on the command line. (In particular in this case, you should
set the feature gate in the config file.)
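
For example, assuming a kubeadm-style setup where config.conf comes from the kube-proxy ConfigMap, the embedded KubeProxyConfiguration would need something like:

kubectl -n kube-system edit configmap kube-proxy
# in the embedded config.conf, alongside the existing fields:
#   apiVersion: kubeproxy.config.k8s.io/v1alpha1
#   kind: KubeProxyConfiguration
#   featureGates:
#     NFTablesProxyMode: true
#   mode: "nftables"
kubectl -n kube-system rollout restart daemonset kube-proxy

Once it is actually running in nftables mode, `nft list tables ip` on a node should show a kube-proxy table.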

-- Dan

> so I had to
> set it to mode: "", but now when checking the kube-proxy logs I can see
> the following as the first line:
>
> I0124 13:09:15.763432       1 server_others.go:72] "Using iptables proxy"
>
> How can I ensure that my proxy runs now in nftables mode? Are there any
> differences that I should see?
>
> Thank you,
> Laszlo

Budai Laszlo

Jan 24, 2024, 10:19:10 AM
to Dan Winship, kubernetes-sig-network
Hello again,

I want to try out the new nftables mode of kube-proxy, so I have added the following two options to the kube-proxy command in my DaemonSet:
        - --feature-gates=NFTablesProxyMode=true
        - --proxy-mode=nftables

If I set mode: "nftables" in the ConfigMap, then the proxy fails, complaining that the mode can only be ipvs or iptables, so I had to set it back to mode: "". But now, when checking the kube-proxy logs, I can see the following as the first line:


I0124 13:09:15.763432       1 server_others.go:72] "Using iptables proxy"

How can I ensure that my proxy now runs in nftables mode? Are there any differences that I should see?

Thank you,
Laszlo


Budai Laszlo

Jan 25, 2024, 7:48:11 AM
to Dan Winship, kubernetes-sig-network
Hi again,

I managed to get my kube-proxy to work in nftables mode (using the command-line options, because I couldn't figure out the right settings in the config file). I would like to know if there is some diagram/illustration of how the tables/chains/rules are set up in this nftables mode. For iptables we have something like:
+------------------+     +--------------------+     +--------------------+     +----------------+
| nat/PREROUTING   | --> | nat/KUBE-SERVICES  | --> | nat/KUBE-SVC-####  | --> | KUBE-SEP-####  |
| nat/OUTPUT       |     +--------------------+     +--------------------+     +----------------+
+------------------+

Is there some document where I could read more about these?

Thank you,
Laszlo

Dan Winship

Jan 25, 2024, 10:57:19 AM
to Budai Laszlo, kubernetes-sig-network
On 1/25/24 07:48, Budai Laszlo wrote:
> Hi again,
>
> I managed to get my kube-proxy to work in nftables mode (using the
> command line options  because I couldn't figure out the right settings
> in the config file). I would like to know if there is some
> diagram/illustration about how the tables/chains/rules are set up in
> this nftables mode? For iptables we have something  like:
> +------------------+     +--------------------+     +--------------------+     +----------------+
> | nat/PREROUTING   | --> | nat/KUBE-SERVICES  | --> | nat/KUBE-SVC-####  | --> | KUBE-SEP-####  |
> | nat/OUTPUT       |     +--------------------+     +--------------------+     +----------------+
> +------------------+
>
> Is there some document where I could read more about these?

There's a README in the source tree,
https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/nftables/README.md.
At the moment it's higher-level and more about "how nftables works, and
how kube-proxy interacts with that" than it is about the nitty-gritty of
what our rules look like. It would be good to document that better, but
at the moment everything is still changing... (eg,
https://github.com/kubernetes/kubernetes/pull/122296 changed the
implementation of the LoadBalancerSourceRanges feature and
https://github.com/kubernetes/kubernetes/pull/122692 changed where
various "drop" and "reject" rules happen from.)

The KEP,
https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/3866-nftables-proxy/README.md,
has some discussion of how the rules will work (particularly in the
section "Low level design"), though that is now slightly out of date, and
some of the things I talked about there turned out not to work. But if
you read that, and then look at the output of `nft list table ip
kube-proxy` you should be able to understand what's going on...
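
For example, one way to follow along is to diff the table before and after creating a Service (the manifest name here is just a placeholder):

nft list table ip kube-proxy > before.nft
kubectl apply -f my-service.yaml
nft list table ip kube-proxy > after.nft
diff -u before.nft after.nft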

-- Dan