I have an issue that hasn't manifested itself as a known or visible problem, but I'd like to understand what I'm seeing in the logs on our cluster's NAT gateway host. We have a very restrictive forwarding policy: all forwarded traffic from within the cluster is blocked unless the destination host is whitelisted. I have seen entries like this in the iptables logs we write whenever a packet is dropped in the FORWARD chain.
Feb 5 11:42:27 gw01 kernel: IPTABLES: FORWARD DROPPED IN=eth0 OUT=ib0 SRC=192.168.200.67 DST=192.168.208.200 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=62848 DF PROTO=TCP SPT=59735 DPT=8005 WINDOW=17920 RES=0x00 SYN URGP=0
In the entry above, SRC is the IP of a compute node with only GigE, and DST is the IB address of our metadata host.
This gateway host is on both our TCP/GigE network and our IB network and acts as the NAT gateway for all systems. Half of our cluster has only GigE connections (192.168.200.0/22) and half has IB (192.168.208.0/22) plus GigE (192.168.200.0/22). What I'm curious about is why the FhGFS clients on GigE are attempting to communicate with our metadata host on its IB address. Are the clients checking whether they can use the IB network for FhGFS and then falling back to TCP?
I identified this as related to FhGFS because the DPT value is the same as the connMetaPortTCP and connMetaPortUDP settings on our metadata host.
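To double-check my reading of that log entry, here is a minimal Python sketch that pulls the KEY=VALUE fields out of the line and reports which of our two subnets the source and destination fall in. The helper names (parse_fields, classify) are just mine for illustration, and the subnets are the two ranges described above.

```python
import ipaddress
import re

# The two cluster networks described in this post.
GIGE_NET = ipaddress.ip_network("192.168.200.0/22")
IB_NET = ipaddress.ip_network("192.168.208.0/22")

# The dropped-packet entry from the gateway log.
LOG_LINE = (
    "Feb 5 11:42:27 gw01 kernel: IPTABLES: FORWARD DROPPED IN=eth0 OUT=ib0 "
    "SRC=192.168.200.67 DST=192.168.208.200 LEN=60 TOS=0x00 PREC=0x00 TTL=63 "
    "ID=62848 DF PROTO=TCP SPT=59735 DPT=8005 WINDOW=17920 RES=0x00 SYN URGP=0"
)


def parse_fields(line):
    """Return the KEY=VALUE pairs of an iptables LOG line as a dict of strings."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def classify(addr):
    """Name the cluster network an address belongs to ('GigE', 'IB' or 'other')."""
    ip = ipaddress.ip_address(addr)
    if ip in GIGE_NET:
        return "GigE"
    if ip in IB_NET:
        return "IB"
    return "other"


fields = parse_fields(LOG_LINE)
print("SRC", fields["SRC"], "is on", classify(fields["SRC"]))
print("DST", fields["DST"], "is on", classify(fields["DST"]))
print("destination port:", fields["DPT"])
```

This only confirms that the packet is going from a GigE address to an IB address on the metadata port; it says nothing about why the client picked that destination.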
If this is the expected behavior of clients trying the IB interface and then falling back to TCP, is there a way to tell clients "you are to only try TCP on network 192.168.200.0/22" rather than have them attempt to connect to the metadata host on a network they have no physical access to? The connNetFilterFile option looks like it might let us tell our GigE-only nodes to only try reaching the metadata host on 192.168.200.0/22, but I'm unsure whether it would help.
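For concreteness, this is roughly what I have in mind for the GigE-only nodes. It is an untested sketch: the file path is made up, and the one-subnet-per-line CIDR format is my reading of the option's description, so please correct me if either is wrong.

```text
# In the client config on GigE-only nodes (path below is hypothetical):
connNetFilterFile = /etc/fhgfs/netfilter-gige

# Contents of /etc/fhgfs/netfilter-gige (assumed format: one allowed subnet per line):
192.168.200.0/22
```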
Thanks,
- Trey