[stratum-dev] BMv2 standalone instances unable to forward traffic

A Sydney

Apr 26, 2021, 1:41:00 PM
to strat...@lists.stratumproject.org
Hi Stratum folks,
It appears that the standalone BMv2 instances are not forwarding traffic. Here are more details:

1. I used Max's instructions to create standalone BMv2 instances running on Debian 10.
2. I created an SdnTest app on ONOS with the identical pipeconf, bmv2.json, and p4info.txt files from the ngsdn-tutorial app (i.e. the same working versions of these files from the ngsdn-tutorial VM are used in a different setup with a different ONOS controller and 6 standalone bmv2 switches). (https://github.com/opennetworkinglab/ngsdn-tutorial).
3. I created a custom 6-node Clos topology and I've attached the corresponding netcfg.json in the tarball.
4. ONOS starts up just fine, and I ensured that all apps (similar to the ones running on ngsdn-tutorial) are activated (see onos_cli_log in the attached tarball).
5. I then push netcfg.json to ONOS, and the ONOS logs show a few "Failed to register Bandwidth" and "Failed to register Port" errors (see onos_log in the attached tarball). Note: After digging a bit, it appears that these messages originate from ResourceDeviceListener.java on line 162 when "log.isTraceEnabled()" is not set in ConsistentResourceStore on line 122 (so this may not actually be an issue?). In any case, onos_cli_log shows that all devices are connected and available. Interestingly, the "Tx" portstat remains at 0. Given that the lldpprovider app is running, I would expect at least LLDP traffic to be flowing.

6. As an example, I logged onto leaf1 to take a look at the logs, and from the terminal, I see the following (See leaf1_stratum_terminal_log for more details):
"""
E20210426 16:01:18.918056  5458 p4_service.cc:311] Failed to write forwarding entries to node 1:
"""
So when the switch starts up, the initial "chassis_config_file" lists all 8 ports on all switches. However, netcfg.json specifies only 6 of the 8. When I look at the ONOS web GUI, I still see all 8 ports, which suggests there may be an issue pushing the updated chassis_config_file to the switch?

7. When I look through the detailed debug log for leaf1 (see leaf1_log_written_to_file), I only see 1 LLDP pkt (i.e. ethtype 88cc), though tcpdump from the controller shows that the controller is indeed sending numerous discovery pkts.
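
For anyone repeating the check in item 7, here is a minimal Python sketch of how LLDP frames can be picked out of raw Ethernet frames by their ethertype (0x88cc). It assumes you already have the frames as bytes, for example exported from a tcpdump capture. The helper names are illustrative and are not part of Stratum or ONOS.

```python
import struct

ETHERTYPE_LLDP = 0x88CC  # the "88cc" ethtype mentioned in item 7
ETHERTYPE_VLAN = 0x8100  # 802.1Q tag; the real ethertype follows the 4-byte tag


def ethertype(frame: bytes) -> int:
    """Return the ethertype of a raw Ethernet frame, skipping one 802.1Q tag."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (etype,) = struct.unpack_from("!H", frame, 12)
    if etype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            raise ValueError("truncated 802.1Q tag")
        (etype,) = struct.unpack_from("!H", frame, 16)
    return etype


def count_lldp(frames) -> int:
    """Count how many of the given raw frames carry LLDP."""
    return sum(1 for f in frames if ethertype(f) == ETHERTYPE_LLDP)
```

Comparing this count on the controller side and on the switch side gives a quick indication of where the discovery packets are being lost.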

PS. I've attached a tarball with the various log files and a ReadMe describing the contents of each file.

Can you kindly provide feedback on perhaps why traffic (at least LLDP) is not forwarded by the switches?

BTW, I also tried connecting to leaf1 via "util/p4rt-sh" and installing flow rules to forward traffic between ports 5 and 6, but a quick ping showed that traffic was not flowing. I also used tcpdump on leaf1 to verify that indeed no pkts were making it onto ports 5 and 6 (except DHCP, which is expected).

Thanks,
-Syd


debug.tar.xz

A Sydney

May 3, 2021, 3:58:04 PM
to strat...@lists.stratumproject.org

I was able to resolve the issue. Below is a quick summary:
Problem: The ONOS GUI did not show any links and further investigation showed that LLDP traffic was not observed in the stratum_bmv2 log.
Solution: I specified the "cpu_port" argument when stratum_bmv2 starts up (i.e. cpu_port was set to 255, which is the same port used in the p4 program).

PS. Feel free to read below for more details:

Cheers!
-Syd

On Mon, Apr 26, 2021 at 1:41 PM A Sydney <asydn...@gmail.com> wrote:
Hi Stratum folks,
It appears that the standalone BMv2 instances are not forwarding traffic. Here are more details:

1. I used Max's instructions to create standalone BMv2 instances running on Debian 10.
2. I created an SdnTest app on ONOS with the identical pipeconf, bmv2.json, and p4info.txt files from the ngsdn-tutorial app (i.e. the same working versions of these files from the ngsdn-tutorial VM are used in a different setup with a different ONOS controller and 6 standalone bmv2 switches). (https://github.com/opennetworkinglab/ngsdn-tutorial).
3. I created a custom 6-node Clos topology and I've attached the corresponding netcfg.json in the tarball.
4. ONOS starts up just fine, and I ensured that all apps (similar to the ones running on ngsdn-tutorial) are activated (see onos_cli_log in the attached tarball).
5. I then push netcfg.json to ONOS, and the ONOS logs show a few "Failed to register Bandwidth" and "Failed to register Port" errors (see onos_log in the attached tarball). Note: After digging a bit, it appears that these messages originate from ResourceDeviceListener.java on line 162 when "log.isTraceEnabled()" is not set in ConsistentResourceStore on line 122 (so this may not actually be an issue?). In any case, onos_cli_log shows that all devices are connected and available. Interestingly, the "Tx" portstat remains at 0. Given that the lldpprovider app is running, I would expect at least LLDP traffic to be flowing.

6. As an example, I logged onto leaf1 to take a look at the logs, and from the terminal, I see the following (See leaf1_stratum_terminal_log for more details):
"""
E20210426 16:01:18.918056  5458 p4_service.cc:311] Failed to write forwarding entries to node 1:

This occurs when an attempt is made to install a flow that already exists.
 
"""
So when the switch starts up, the initial "chassis_config_file" lists all 8 ports on all switches. However, netcfg.json specifies only 6 of the 8. When I look at the ONOS web GUI, I still see all 8 ports, which suggests there may be an issue pushing the updated chassis_config_file to the switch?

7. When I look through the detailed debug log for leaf1 (see leaf1_log_written_to_file), I only see 1 LLDP pkt (i.e. ethtype 88cc), though tcpdump from the controller shows that the controller is indeed sending numerous discovery pkts.

This turns out to be the log entry written when lldphostprovider adds the LLDP rule to the switch.
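
When sifting through the Stratum logs for entries like the p4_service.cc error in item 6, a small parser can help separate errors from informational lines. The sketch below is Python and assumes every line follows the same prefix layout as the single line quoted above (severity letter, date, time, thread id, file:line). The names `LogRecord`, `parse_line`, and `errors_only` are illustrative.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Layout inferred from the quoted line:
# E20210426 16:01:18.918056  5458 p4_service.cc:311] message
LOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{8}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] ?"
    r"(?P<message>.*)$"
)


class LogRecord(NamedTuple):
    severity: str
    source_file: str
    line: int
    message: str


def parse_line(text: str) -> Optional[LogRecord]:
    """Parse one log line; return None if it does not match the layout."""
    m = LOG_LINE.match(text.rstrip("\n"))
    if m is None:
        return None
    return LogRecord(m["severity"], m["file"], int(m["line"]), m["message"])


def errors_only(lines: Iterable[str]) -> Iterator[LogRecord]:
    """Yield only records with severity E (error) or F (fatal)."""
    for text in lines:
        rec = parse_line(text)
        if rec is not None and rec.severity in ("E", "F"):
            yield rec
```

Grouping the resulting records by `source_file` and `line` makes it easier to see whether an error repeats, as with the duplicate flow installs described in item 6.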
 
PS. I've attached a tarball with the various log files and a ReadMe describing the contents of each file.

Can you kindly provide feedback on perhaps why traffic (at least LLDP) is not forwarded by the switches?

Using tcpdump, I observed that LLDP traffic was sent from the controller and made it to the switch. However, the bmv2 logs did not register any LLDP traffic on port 255 (i.e. the CPU port specified in the p4 program). I took a look at how stratum_bmv2 is started in the ngsdn-tutorial VM and observed that "cpu_port" was one of the arguments and was set to 255. For some reason, I thought that 255 was the default CPU port, but it turns out that the default is 64. So I specified the "cpu_port" argument in testrig; pkt-ins/outs now work, and hence LLDP works.
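
To catch this kind of mismatch before starting a switch, one option is to compare the cpu_port that a stratum_bmv2 command line would use against the CPU port the P4 program expects. The following is a minimal Python sketch. The values 64 and 255 are taken from this thread. How the flag is parsed (one or two leading dashes, `=` or a separate token, last occurrence wins) is an assumption about the command-line syntax, and the function names are illustrative.

```python
from typing import Optional, Sequence

STRATUM_BMV2_DEFAULT_CPU_PORT = 64  # default reported in this thread
P4_PROGRAM_CPU_PORT = 255           # CPU port used by the P4 program in this thread


def cpu_port_from_args(argv: Sequence[str],
                       default: int = STRATUM_BMV2_DEFAULT_CPU_PORT) -> int:
    """Return the cpu_port an invocation would use (assumed flag syntax)."""
    port = default
    for i, arg in enumerate(argv):
        if not arg.startswith("-"):
            continue
        name, sep, value = arg.lstrip("-").partition("=")
        if name != "cpu_port":
            continue
        if sep:                       # -cpu_port=255 or --cpu_port=255
            port = int(value)
        elif i + 1 < len(argv):       # -cpu_port 255
            port = int(argv[i + 1])
    return port


def check_cpu_port(argv: Sequence[str],
                   p4_cpu_port: int = P4_PROGRAM_CPU_PORT) -> Optional[str]:
    """Return None if the ports agree, else a human-readable warning."""
    actual = cpu_port_from_args(argv)
    if actual == p4_cpu_port:
        return None
    return (f"cpu_port mismatch: switch would use {actual}, "
            f"P4 program expects {p4_cpu_port}")
```

For example, `check_cpu_port(["stratum_bmv2"])` returns a warning because the flag is absent and the default applies, while `check_cpu_port(["stratum_bmv2", "-cpu_port=255"])` returns `None`.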
 

BTW, I also tried connecting to leaf1 via "util/p4rt-sh" and installing flow rules to forward traffic between ports 5 and 6, but a quick ping showed that traffic was not flowing. I also used tcpdump on leaf1 to verify that indeed no pkts were making it onto ports 5 and 6 (except DHCP, which is expected).
This was a result of the behavior of the protocols in my environment. I had to add new tables and their corresponding ONOS components, and now traffic flows just fine.
 

Maximilian Pudelko

May 11, 2021, 2:33:04 PM
to A Sydney, strat...@lists.stratumproject.org
Glad you could make it work and thanks for writing up the solution.

Max
