I'm doing this on a Celestica seastone-2 (Broadcom td3.x7) running the latest 202111 build.
I'm trying to run a dnsmasq DHCP server on the SONiC switch itself, serving IP addresses to clients connected to the switch ports. I've been able to get this to work intermittently, but often the DHCP packets seem to get dropped or eaten by the switch before they reach dnsmasq.
For example, Ethernet0 is up as a routed port:
admin@myriad-brcm-1:~$ show interface status
Interface  Lanes    Speed  MTU   FEC  Alias  Vlan    Oper  Admin  Type             Asym PFC
---------  -------  -----  ----  ---  -----  ------  ----  -----  ---------------  --------
Ethernet0  1,2,3,4  100G   1500  rs   QSFP1  routed  up    up     QSFP28 or later  N/A
...
admin@myriad-brcm-1:~$ show ip interfaces
Interface  Master  IPv4 address/mask  Admin/Oper  BGP Neighbor  Neighbor IP
---------  ------  -----------------  ----------  ------------  -----------
Ethernet0          172.16.4.1/29      up/up       N/A           N/A
If I manually give the Linux host connected to Ethernet0 the IP 172.16.4.2, I can then ping 172.16.4.1, ssh into the SONiC switch, and so on. All good. I can see this traffic with tcpdump on the SONiC switch:
root@myriad-brcm-1:/home/admin# tcpdump -n -e -i Ethernet0
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on Ethernet0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:03:22.702341 08:c0:eb:a9:6e:bc > 0c:48:c6:97:24:e6, ethertype IPv4 (0x0800), length 98: 172.16.4.2 > 172.16.4.1: ICMP echo request, id 4, seq 1, length 64
12:03:22.702459 0c:48:c6:97:24:e6 > 08:c0:eb:a9:6e:bc, ethertype IPv4 (0x0800), length 98: 172.16.4.1 > 172.16.4.2: ICMP echo reply, id 4, seq 1, length 64
12:03:23.712905 08:c0:eb:a9:6e:bc > 0c:48:c6:97:24:e6, ethertype IPv4 (0x0800), length 98: 172.16.4.2 > 172.16.4.1: ICMP echo request, id 4, seq 2, length 64
12:03:23.713023 0c:48:c6:97:24:e6 > 08:c0:eb:a9:6e:bc, ethertype IPv4 (0x0800), length 98: 172.16.4.1 > 172.16.4.2: ICMP echo reply, id 4, seq 2, length 64
12:03:24.736889 08:c0:eb:a9:6e:bc > 0c:48:c6:97:24:e6, ethertype IPv4 (0x0800), length 98: 172.16.4.2 > 172.16.4.1: ICMP echo request, id 4, seq 3, length 64
12:03:24.737008 0c:48:c6:97:24:e6 > 08:c0:eb:a9:6e:bc, ethertype IPv4 (0x0800), length 98: 172.16.4.1 > 172.16.4.2: ICMP echo reply, id 4, seq 3, length 64
12:03:25.760845 08:c0:eb:a9:6e:bc > 0c:48:c6:97:24:e6, ethertype IPv4 (0x0800), length 98: 172.16.4.2 > 172.16.4.1: ICMP echo request, id 4, seq 4, length 64
12:03:25.760972 0c:48:c6:97:24:e6 > 08:c0:eb:a9:6e:bc, ethertype IPv4 (0x0800), length 98: 172.16.4.1 > 172.16.4.2: ICMP echo reply, id 4, seq 4, length 64
12:03:27.840681 08:c0:eb:a9:6e:bc > 0c:48:c6:97:24:e6, ethertype ARP (0x0806), length 60: Request who-has 172.16.4.1 tell 172.16.4.2, length 46
12:03:27.840747 0c:48:c6:97:24:e6 > 08:c0:eb:a9:6e:bc, ethertype ARP (0x0806), length 42: Reply 172.16.4.1 is-at 0c:48:c6:97:24:e6, length 28
However, if I run dhclient on the Linux box, it sends DHCPv4 requests, but they never appear in the tcpdump output, and dnsmasq never receives them:
root@node-1:/home/admin# dhclient -4 -v enp1s0f0
Internet Systems Consortium DHCP Client 4.4.1
Copyright 2004-2018 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/enp1s0f0/08:c0:eb:a9:6e:bc
Sending on LPF/enp1s0f0/08:c0:eb:a9:6e:bc
Sending on Socket/fallback
DHCPDISCOVER on enp1s0f0 to 255.255.255.255 port 67 interval 3 (xid=0x4764f25c)
DHCPDISCOVER on enp1s0f0 to 255.255.255.255 port 67 interval 6 (xid=0x4764f25c)
...
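For reference, the packet I expect to see on the wire is a UDP broadcast from 0.0.0.0 port 68 to 255.255.255.255 port 67 carrying a BOOTP/DHCP payload. Below is a minimal Python sketch of that payload, using the MAC and xid from the dhclient output above. The `build_dhcp_discover` helper is hypothetical and only illustrates the field layout; it leaves the flags field at 0 as an assumption and includes none of the extra options a real dhclient sends.

```python
import struct

# Magic cookie that precedes the DHCP options field.
DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])


def build_dhcp_discover(mac: str, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER payload (UDP payload only, no IP/UDP headers)."""
    chaddr = bytes.fromhex(mac.replace(":", ""))
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,           # op: BOOTREQUEST
        1,           # htype: Ethernet
        6,           # hlen: length of a MAC address
        0,           # hops
        xid,         # transaction ID chosen by the client
        0,           # secs
        0,           # flags (assumption: broadcast bit not set)
        bytes(4),    # ciaddr: 0.0.0.0, client has no address yet
        bytes(4),    # yiaddr
        bytes(4),    # siaddr
        bytes(4),    # giaddr: 0.0.0.0, no relay involved
        chaddr,      # client MAC, null-padded to 16 bytes by struct
        bytes(64),   # sname
        bytes(128),  # file
    )
    # Option 53 (message type) = 1 (DISCOVER), then option 255 (end).
    options = DHCP_MAGIC_COOKIE + bytes([53, 1, 1]) + bytes([255])
    return header + options


payload = build_dhcp_discover("08:c0:eb:a9:6e:bc", 0x4764F25C)
print(len(payload), payload[4:8].hex())
```

Since the client has no address yet, the source IP is 0.0.0.0, which matters when thinking about any filter that matches on source address.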
Curiously, I *do* see IPv6 router discovery packets. So it looks like whatever is filtering or consuming the packets is only doing so for DHCPv4.
I've configured the CTRLPLANE ACL to allow all traffic, if that is relevant:
"ACL_RULE": {
    "CTRL|ACE_ACCEPT": {
        "PACKET_ACTION": "ACCEPT",
        "PRIORITY": "2",
        "SRC_IP": "0.0.0.0/0"
    }
},
"ACL_TABLE": {
    "CTRL": {
        "policy_desc": "CTRLPLANE ACL",
        "services": [
            "ANY"
        ],
        "type": "CTRLPLANE"
    }
}
I do not have any other ACLs. The dhcp_relay feature is disabled.
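As a sanity check that the SRC_IP prefix in that rule does not exclude DHCP clients (which send DISCOVER from source 0.0.0.0), here is a small sketch using Python's standard `ipaddress` module. The `rule_matches_source` helper is hypothetical; it only tests the prefix and does not model how SONiC actually applies control plane ACLs.

```python
import ipaddress
import json

# The ACL_RULE fragment from my config, wrapped in braces so it parses as JSON.
ACL_CONFIG = json.loads("""
{
  "ACL_RULE": {
    "CTRL|ACE_ACCEPT": {
      "PACKET_ACTION": "ACCEPT",
      "PRIORITY": "2",
      "SRC_IP": "0.0.0.0/0"
    }
  }
}
""")


def rule_matches_source(rule: dict, src_ip: str) -> bool:
    """Return True if src_ip falls inside the rule's SRC_IP prefix."""
    network = ipaddress.ip_network(rule["SRC_IP"])
    return ipaddress.ip_address(src_ip) in network


rule = ACL_CONFIG["ACL_RULE"]["CTRL|ACE_ACCEPT"]
# Source of a DHCPDISCOVER, then the statically addressed host.
print(rule_matches_source(rule, "0.0.0.0"))
print(rule_matches_source(rule, "172.16.4.2"))
```

Both sources fall inside 0.0.0.0/0, so as far as the prefix goes, I would not expect this rule to be what drops the DISCOVER packets.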
Any clues? This was working previously and then stopped, but I haven't been able to figure out what changed.
Thanks,
Ben