Hello EVPN working group team,
I'm working on a Barefoot switch using the build from
https://sonic-jenkins.westus2.cloudapp.azure.com/job/barefoot/job/buildimage-bf-all/105/ and trying to put together a rudimentary L2 EVPN tunnel. However, things are misbehaving, and I'm trying to figure out whether I'm doing something wrong or whether there's a bug.
I've set up EVPN in FRR directly and configured a remote side tunnel to connect to it. FRR is happy and trades all the necessary routes, VNIs, and MAC information with the SONiC FRR instance.
The switch starts in the default L3 configuration, which assigns 10.0.0.28/31 to Ethernet56, 10.0.0.30/31 to Ethernet60, and 10.1.0.1 to Loopback0. The only thing special is that the BGP configuration has been removed. I then configure the SONiC switch with the following command sequence:
# Remove the L3 address from Ethernet60, then make VLAN100 and add Ethernet60 as an untagged VLAN100 port
config interface ip rem Ethernet60 10.0.0.30/31
config vlan add 100
config vlan member add -u 100 Ethernet60
# Create a VXLAN VTEP called vtep, make an NVO for it called nvo, and map VNI 1001 to the existing VLAN100.
config vxlan add vtep 10.1.0.1
config vxlan evpn_nvo add nvo vtep
config vxlan map add vtep 100 1001
Everything seems to play nice right up until I get to adding the VNI/VLAN mapping. At that point I get a series of logs that imply some internal issue (since I don't get to pick the number of decap mappers with any of those commands):
Jan 27 22:10:51.375106 sonic NOTICE swss#orchagent: :- create_tunnel: create_tunnel:encapmaplist[0]=0x290000000003ac
Jan 27 22:10:51.375106 sonic NOTICE swss#orchagent: :- create_tunnel: create_tunnel:encapmaplist[1]=0x290000000003ae
Jan 27 22:10:51.376114 sonic ERR syncd[24]: 2021-01-27 22:10:51.375754 ERROR (BF_SAI:(file=null):(func=null):0) - sai_create_tunnel:294: Only one tunnel decap mapper is supported, while 2 passed
Jan 27 22:10:51.376114 sonic ERR syncd[24]: :- sendApiResponse: api SAI_COMMON_API_CREATE failed in syncd mode: SAI_STATUS_NOT_SUPPORTED
Jan 27 22:10:51.376155 sonic INFO /supervisord: syncd 2021-01-27 22:10:51.375754 BF_SAI ERROR - sai_create_tunnel:294: Only one tunnel decap mapper is supported, while 2 passed
Jan 27 22:10:51.376295 sonic ERR syncd[24]: :- processQuadEvent: attr: SAI_TUNNEL_ATTR_TYPE: SAI_TUNNEL_TYPE_VXLAN
Jan 27 22:10:51.376295 sonic ERR syncd[24]: :- processQuadEvent: attr: SAI_TUNNEL_ATTR_UNDERLAY_INTERFACE: oid:0x6000000000049
Jan 27 22:10:51.376330 sonic ERR syncd[24]: :- processQuadEvent: attr: SAI_TUNNEL_ATTR_DECAP_MAPPERS: 2:oid:0x290000000003ab,oid:0x290000000003ad
Jan 27 22:10:51.376360 sonic ERR syncd[24]: :- processQuadEvent: attr: SAI_TUNNEL_ATTR_ENCAP_MAPPERS: 2:oid:0x290000000003ac,oid:0x290000000003ae
Jan 27 22:10:51.376360 sonic ERR syncd[24]: :- processQuadEvent: attr: SAI_TUNNEL_ATTR_ENCAP_SRC_IP: 10.1.0.1
Jan 27 22:10:51.376402 sonic ERR syncd[24]: :- processQuadEvent: attr: SAI_TUNNEL_ATTR_PEER_MODE: SAI_TUNNEL_PEER_MODE_P2MP
Jan 27 22:10:51.376585 sonic ERR swss#orchagent: :- create: create status: SAI_STATUS_NOT_SUPPORTED
Jan 27 22:10:51.376667 sonic ERR swss#orchagent: :- createTunnelHw: Error creating tunnel vtep: Can't create a tunnel object
Jan 27 22:10:51.378746 sonic NOTICE swss#orchagent: :- addOperation: Vxlan tunnel map entry 'map_1001_Vlan100' for tunnel 'vtep' was created
Jan 27 22:10:51.389269 sonic INFO systemd-udevd[5710]: Using default interface naming scheme 'v240'.
Jan 27 22:10:51.389650 sonic INFO systemd-udevd[5710]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jan 27 22:10:51.393617 sonic INFO kernel: [87342.409754] Bridge: port 2(vtep-100) entered blocking state
Jan 27 22:10:51.393666 sonic INFO kernel: [87342.409760] Bridge: port 2(vtep-100) entered disabled state
Jan 27 22:10:51.393698 sonic INFO kernel: [87342.410347] device vtep-100 entered promiscuous mode
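For anyone trying to confirm the same symptom, here is a minimal shell sketch that pulls the mapper count out of one of the syncd lines above. The log line is copied verbatim from this message; the variable names are only illustrative, and I'm assuming the attribute value is always formatted as a count followed by a comma-separated list of OIDs, as it is here.

```shell
# One syncd line copied verbatim from the log above.
line='Jan 27 22:10:51.376330 sonic ERR syncd[24]: :- processQuadEvent: attr: SAI_TUNNEL_ATTR_DECAP_MAPPERS: 2:oid:0x290000000003ab,oid:0x290000000003ad'

# Strip everything up to and including the attribute name,
# leaving the value, e.g. "2:oid:...,oid:...".
value=${line##*SAI_TUNNEL_ATTR_DECAP_MAPPERS: }

# The field before the first ':' is the declared list length
# (assumption: the value is always "<count>:<oid>,<oid>,...").
count=${value%%:*}

# Cross-check by counting the "oid:" entries actually listed.
oids=$(printf '%s\n' "$value" | awk -F'oid:' '{print NF-1}')

echo "decap mappers: declared=$count listed=$oids"
```

Both numbers come out as 2 for the line above, which matches the "Only one tunnel decap mapper is supported, while 2 passed" error from BF_SAI. On the switch, the same extraction could presumably be run against lines returned by grepping the syslog for SAI_TUNNEL_ATTR_DECAP_MAPPERS.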
After that I get a frequently repeating stanza of logging:
Jan 28 01:22:33.109160 sonic WARNING swss#orchagent: :- addTunnelUser: VTEP not yet active.user=0 remote_vtep=10.1.0.2
Jan 28 01:22:33.109160 sonic WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 10.1.0.2
In the resulting state, I get results like the following:
root@sonic:/etc/sonic# show vxlan remotevni 10.1.0.2
+---------+--------------+-------+
| VLAN    | RemoteVTEP   |   VNI |
+=========+==============+=======+
| Vlan100 | 10.1.0.2     |  1001 |
+---------+--------------+-------+
Total count : 1
root@sonic:/etc/sonic# show vxlan vlanvnimap
+---------+-------+
| VLAN    |   VNI |
+=========+=======+
| Vlan100 |  1001 |
+---------+-------+
Total count : 1
root@sonic:/etc/sonic# show vxlan remotevtep
+-------+-------+-------------------+--------------+
| SIP   | DIP   | Creation Source   | OperStatus   |
+=======+=======+===================+==============+
+-------+-------+-------------------+--------------+
Total count : 0
Having torn apart the code a bit, this makes sense: the tunnel is partially created but not active. The practical behavior of the system matches the shown state. Sending ARPs through the tunnel causes the Linux FDB on the switch to update, and likewise causes the MAC table in FRR to show remote MAC/tunnel endpoints on the switch (show evpn mac vni 1001). However, traffic doesn't pass on the switch in either direction.
Is there something in my configuration that's causing trouble here, or am I just bumping up against one of the rough edges of the current EVPN prototypes? Should I file a bug report? The resulting config_db.json, as well as the two FRR configurations, can be provided on request.
Best regards,
John-Michael O'Brien
Dev Ops Engineer