On 6/25/20 9:34 PM, Girish Moodalbail wrote:
> Hello Dumitru, Han,
>
> So, we applied this patchset and gave it a spin on our large scale
> cluster and saw a significant reduction in the number of logical flows
> in lr_in_ip_input table. Before this patch there were around 1.6M flows
> in lr_in_ip_input table. However, after the patch we see about 26K
> flows. So that is a significant reduction in the number of logical flows.
>
> In lr_in_ip_input, I see
>
> * priority 92 flows matching ARP requests for dnat_and_snat IPs on
> distributed gateway port with is_chassis_resident() and
> corresponding ARP reply
> * priority 91 flows matching ARP requests for dnat_and_snat IPs on
> distributed gateway port with !is_chassis_resident() and
> corresponding drop
> * priority 90 flow matching ARP request for dnat_and_snat IPs and
> corresponding ARP replies
>
> So far so good.
Hi Girish,
Great, thanks for testing out the series and confirming that it's
working ok.
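
As a rough illustration of the three priorities listed above (this is only a sketch, not the ovn-northd implementation), the flows can be read as a first-match lookup in descending priority order. The function name `arp_action` and its two boolean parameters are hypothetical, and treating the priority 90 flow as a catch-all is an assumption made for the example.

```python
def arp_action(on_distributed_gw_port: bool, chassis_resident: bool) -> str:
    """Return the action of the highest-priority matching flow.

    Hypothetical model of the lr_in_ip_input ARP flows for a
    dnat_and_snat IP; the names and the catch-all are assumptions.
    """
    flows = [
        # (priority, match predicate, action)
        (92, on_distributed_gw_port and chassis_resident, "arp_reply"),
        (91, on_distributed_gw_port and not chassis_resident, "drop"),
        (90, True, "arp_reply"),  # assumed catch-all for the NAT IP
    ]
    # Logical flows are evaluated from the highest priority down.
    for _priority, matches, action in sorted(flows, key=lambda f: -f[0]):
        if matches:
            return action
    return "no_match"


print(arp_action(on_distributed_gw_port=True, chassis_resident=False))
```

Under these assumptions, only the chassis where the NAT entry is resident answers ARP requests arriving on the distributed gateway port; every other chassis drops them.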
>
> However, not directly related to this patch per se but directly related
> to the behaviour of ARP and dnat_and_snat IP, on the OVN chassis we are
> seeing a significant number of OpenFlow flows in table 27 (around 2.3M
> OpenFlow flows). This table gets populated from logical flows in
> table=19 (ls_in_l2_lkup) of logical switch.
>
> The two logical flows in ls_in_l2_lkup that are contributing to the huge
> number of OpenFlow flows are shown below (for the entire logical flow
> entries, please see:
> https://gist.github.com/girishmg/57b3005030d421c59b30e6c36cfc9c18)
>
> Priority=75 flow
> =============
> This flow looks like below (where 169.254.0.0/29 is the dnat_and_snat
> subnet and 192.168.0.1 is the logical_switch's gateway IP)
>
> table=19(ls_in_l2_lkup ), priority=75 , match=(flags[1] == 0 &&
> arp.op == 1 && arp.tpa == { 169.254.3.107, 169.254.1.85, 192.168.0.1,
> 169.254.10.155, 169.254.1.6}), action=(outport = "stor-sdn-test1"; output;)
>
> What this flow says is that any ARP request packet from the switch
> heading towards the default gateway or any of those 1-to-1 NAT IPs is
> sent out through the port towards the ovn_cluster_router's ingress
> pipeline. The question, though, is why any Pod on the logical switch
> would send an ARP request for an IP that is not in its subnet. A packet
> from a Pod towards a non-subnet IP should ARP only for the default
> gateway IP.
>
This is a bug. I'll start working on a fix and send a patch for it soon.
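
To make the point concrete, here is a minimal sketch that splits the arp.tpa set from the priority 75 flow into addresses inside and outside the logical switch subnet. The /24 prefix for the switch subnet is an assumption (the thread only gives the gateway IP 192.168.0.1), and `split_arp_targets` is a hypothetical helper, not part of OVN.

```python
import ipaddress


def split_arp_targets(targets, switch_subnet):
    """Split ARP target IPs into (on_subnet, off_subnet) lists."""
    net = ipaddress.ip_network(switch_subnet)
    on_subnet, off_subnet = [], []
    for target in targets:
        if ipaddress.ip_address(target) in net:
            on_subnet.append(target)
        else:
            off_subnet.append(target)
    return on_subnet, off_subnet


# arp.tpa set copied from the priority 75 flow quoted above.
targets = [
    "169.254.3.107",
    "169.254.1.85",
    "192.168.0.1",
    "169.254.10.155",
    "169.254.1.6",
]
# Assumed prefix length; the real switch subnet is not given in the thread.
on_subnet, off_subnet = split_arp_targets(targets, "192.168.0.0/24")
print("on subnet:", on_subnet)
print("off subnet:", off_subnet)
```

With that assumed subnet, only the gateway IP is an address a Pod on the switch would ever ARP for; the dnat_and_snat IPs are off-subnet and would be reached via the default gateway.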
> Priority=80 Flow
> =============
> This flow looks like below
>
> table=19(ls_in_l2_lkup ), priority=80 , match=(eth.src == {
> 0a:58:c0:a8:00:01, 6a:93:f4:55:aa:a7, ae:92:2d:33:24:ea,
> ba:0a:d3:7d:bc:e8, b2:2f:40:4d:d9:2b} && (arp.op == 1 || nd_ns)),
> action=(outport = "_MC_flood"; output;)
>
> The question again for this flow is why there would be self-originated
> ARP requests for the dnat_and_snat IPs from inside the node's logical
> switch. I can see how this is a possibility on the switch that has a
> `localnet port` on it and to which the distributed router connects
> through a gateway port.
>
This is also a bug, similar to the one above. We should only deal with
the external_mac values that might be used on this port. I'll fix it
soon too.
Thanks,
Dumitru
> > <zho...@gmail.com> wrote:
> >
> > Sorry Girish, I can't promise for now. I will see if I have time in
> > the next couple of weeks, but welcome anyone to volunteer on this if
> > it is urgent.
> >
> > On Mon, Jun 15, 2020 at 10:56 AM Girish Moodalbail
> > <gmood...@gmail.com>