P4+OpenFlow Topology leaf-spine

Tiago Vieira

Oct 22, 2020, 2:27:03 PM
to Trellis Developers, andre....@gmail.com

Hi!

 

We want to deploy the topology illustrated in the attached file topology_P4+OF.jpg in ONOS using 3 physical switches: two OpenFlow Edgecore AS7712 32x as leaves and one P4 Edgecore Wedge 100BF 32x as spine. There is also one host attached to each leaf switch. Our first goal is simply to have connectivity between these two hosts. However, with all 3 switches set up, packets can't get through the P4 switch and we can't figure out why. By contrast, if we use only the two OpenFlow switches linked directly, without the P4 switch, connectivity between the hosts is easily established.

 

Here is the result of the t3-troubleshoot-pingall command in onos-cli:

 

onos@root > t3-troubleshoot-pingall
*** NIB is invalid. Snapshots for the NIB have been auto-filled: ***
Load current network states from ONOS stores
FlowNib created 22-10-2020 02:30 from SNAPSHOT
GroupNib created 22-10-2020 02:30 from SNAPSHOT
LinkNib created 22-10-2020 02:30 from SNAPSHOT
HostNib created 22-10-2020 02:30 from SNAPSHOT
DeviceNib created 22-10-2020 02:30 from SNAPSHOT
DriverNib created 22-10-2020 02:30 from SNAPSHOT
MastershipNib created 22-10-2020 02:30 from SNAPSHOT
EdgePortNib created 22-10-2020 02:30 from SNAPSHOT
RouteNib created 22-10-2020 02:30 from SNAPSHOT
NetworkConfigNib created 22-10-2020 02:30 from SNAPSHOT
MulticastRouteNib created 22-10-2020 02:30 from SNAPSHOT
Tracing between all ipv4 hosts
Error executing command: class org.onosproject.net.pi.model.PiTableId cannot be cast to class org.onosproject.net.flow.IndexTableId (org.onosproject.net.pi.model.PiTableId and org.onosproject.net.flow.IndexTableId are in unnamed module of loader org.apache.felix.framework.BundleWiringImpl$BundleClassLoader @51901873)

 

Active apps:

onos@root > apps -a -s
*   5 org.onosproject.route-service        2.2.3    Route Service Server
*   7 org.onosproject.hostprovider         2.2.3    Host Location Provider
*   8 org.onosproject.lldpprovider         2.2.3    LLDP Link Provider
*   9 org.onosproject.optical-model        2.2.3    Optical Network Model
*  10 org.onosproject.openflow-base        2.2.3    OpenFlow Base Provider
*  11 org.onosproject.openflow             2.2.3    OpenFlow Provider Suite
*  40 org.onosproject.fpm                  2.2.3    FIB Push Manager (FPM) Route Receiver
*  41 org.onosproject.dhcprelay            2.2.3    DHCP Relay Agent
*  44 org.onosproject.drivers              2.2.3    Default Drivers
*  47 org.onosproject.protocols.grpc       2.2.3    gRPC Protocol Subsystem
*  48 org.onosproject.protocols.gnmi       2.2.3    gNMI Protocol Subsystem
*  49 org.onosproject.generaldeviceprovider 2.2.3    General Device Provider
*  50 org.onosproject.drivers.gnmi         2.2.3    gNMI Drivers
*  51 org.onosproject.protocols.gnoi       2.2.3    gNOI Protocol Subsystem
*  52 org.onosproject.drivers.gnoi         2.2.3    gNOI Drivers
*  53 org.onosproject.protocols.p4runtime  2.2.3    P4Runtime Protocol Subsystem
*  54 org.onosproject.p4runtime            2.2.3    P4Runtime Provider
*  55 org.onosproject.drivers.p4runtime    2.2.3    P4Runtime Drivers
*  56 org.onosproject.pipelines.basic      2.2.3    Basic Pipelines
*  57 org.onosproject.drivers.stratum      2.2.3    Stratum Drivers
*  58 org.onosproject.drivers.barefoot     2.2.3    Barefoot Drivers
*  59 org.onosproject.drivers.bmv2         2.2.3    BMv2 Drivers
*  98 org.onosproject.pipelines.fabric     2.2.3    Fabric Pipeline
* 117 org.onosproject.gui                  2.2.3    ONOS Legacy GUI
* 119 org.onosproject.hostprobingprovider  2.2.3    Host Probing Provider
* 132 org.onosproject.mcast                2.2.3    Multicast traffic control
* 140 org.onosproject.netcfghostprovider   2.2.3    Network Config Host Provider
* 164 org.onosproject.portloadbalancer     2.2.3    Port Load Balance Service
* 169 org.onosproject.proxyarp             2.2.3    Proxy ARP/NDP
* 173 org.onosproject.routeradvertisement  2.2.3    IPv6 RA Generator
* 176 org.onosproject.segmentrouting       2.2.3    Segment Routing
* 178 org.onosproject.t3                   2.2.3    Trellis Troubleshooting Toolkit
* 198 org.opencord.fabric-tofino           1.0.0    Tofino-enabled Fabric Pipeconf

 

 

The P4 switch is running the stratumproject/stratum-bf:9.0.0-4.14.49-OpenNetworkLinux Docker image, modified to start with the following entrypoint:
ENTRYPOINT ["start-stratum.sh", "--bf_sim"]

 

 

Steps performed:

1- Installation and activation of fabric-tofino-1.0.0.oar (https://repo1.maven.org/maven2/org/opencord/fabric-tofino/)

We have tried both the 1.0.0 and 1.1.0 versions.

2- In onos-cli: onos@root > cfg set org.onosproject.ra.RouterAdvertisementManager raOptionPrefixStatus true

3- Configuration of the network through the REST API (in this order):

 

 

  • Configuration for southbound ports of the leaf switches:

 

{
    "ports": {
        "of:0000b86a9737507b/35": {
            "interfaces": [
                {
                    "name": "vlan-vm",
                    "ips": [
                        "10.66.3.254/24"
                    ],
                    "vlan-tagged": [
                        9
                    ]
                }
            ]
        },
        "of:0000b86a9737547b/9": {
            "interfaces": [
                {
                    "name": "vlan-olt",
                    "ips": [
                        "10.66.4.254/24"
                    ],
                    "vlan-tagged": [
                        9
                    ]
                }
            ]
        }
    }
}

 

  •    Configuration for P4 switch:

 

{
    "devices": {
        "device:switch-p4": {
            "segmentrouting": {
                "ipv4NodeSid": 101,
                "ipv4Loopback": "192.1.1.248",
                "ipv6NodeSid": 111,
                "ipv6Loopback": "fe80::290:fbff:fe62:c581",
                "routerMac": "00:90:fb:62:c5:81",
                "isEdgeRouter": false,
                "adjacencySids": []
            },
            "ports": {
                "144": {
                    "name": "swp4-eth144",
                    "speed": 100000000000,
                    "number": 144,
                    "removed": false,
                    "type": "fiber"
                },
                "32": {
                    "name": "swp4-eth32",
                    "speed": 40000000000,
                    "number": 32,
                    "removed": false,
                    "type": "fiber"
                }
            },
            "basic": {
                "name": "swp4",
                "managementAddress": "grpc://10.112.106.248:28000?device_id=1",
                "driver": "stratum-tofino",
                "pipeconf": "org.opencord.fabric.tofino.montara_sde_9_0_0"
            }
        }
    }
}

 

 

  • Configuration for OpenFlow switches:

 

{
   "devices":{
      "of:0000b86a9737547b":{
         "segmentrouting":{
            "ipv4NodeSid":247,
            "ipv4Loopback":"192.1.1.247",
            "ipv6NodeSid":257,
            "ipv6Loopback":"abcd::ba6a:97ff:fe37:547b",
            "routerMac":"b8:6a:97:37:54:7b",
            "isEdgeRouter":true,
            "adjacencySids":[  ]
         },
         "ports":{
            "1":{
               "name":"leaf247-eth1",
               "speed":100000000000,
               "enabled":true,
               "number":1,
               "removed":false,
               "type":"fiber"
            },
            "9":{
               "name":"leaf247-eth9",
               "speed":100000000000,
               "enabled":true,
               "number":9,
               "removed":false,
               "type":"fiber"
            }
         },
         "basic":{
            "name":"Leaf-sw247"
         }
      },
      "of:0000b86a9737507b":{
         "segmentrouting":{
            "ipv4NodeSid":246,
            "ipv4Loopback":"192.1.1.246",
            "ipv6NodeSid":256,
            "ipv6Loopback":"abcd::ba6a:97ff:fe37:507b",
            "routerMac":"b8:6a:97:37:50:7b",
            "isEdgeRouter":true,
            "adjacencySids":[  ]
         },
         "ports":{
            "21":{
               "name":"leaf246-eth21",
               "speed":40000000000,
               "enabled":true,
               "number":21,
               "removed":false,
               "type":"fiber"
            },
            "35":{
               "name":"leaf246-eth35",
               "speed":10000000000,
               "enabled":true,
               "number":35,
               "removed":false,
               "type":"fiber"
            }
         },
         "basic":{
            "name":"Leaf-sw246"
         }
      }
   }
}
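
For readers reproducing step 3, here is a minimal sketch of how one of the JSON payloads above could be wrapped in a REST request. It assumes the ONOS network configuration endpoint at /onos/v1/network/configuration; the base URL, the credentials, and the helper name build_netcfg_request are illustrative assumptions, not details taken from this thread. The request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_netcfg_request(base_url, payload, user, password):
    """Build (but do not send) a POST request carrying a netcfg JSON payload.

    base_url, user and password are assumptions for illustration;
    substitute the values of your own ONOS instance.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/onos/v1/network/configuration",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical controller address and credentials.
req = build_netcfg_request("http://localhost:8181", {"devices": {}}, "onos", "rocks")
print(req.get_method(), req.full_url)
# To actually push the config: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect each payload before it reaches the controller, which helps when the payloads must be applied in a specific order.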

 

  

This experiment used ONOS version 2.2.3, but we obtained the same results with v2.3.0 and v2.4.0.

 

Thank you for your help and attention.

 

Best Regards,

Tiago

topology_P4+OF.jpg
onos_logs_v2.2.3.txt

Carmelo Cascone

Oct 22, 2020, 5:17:36 PM
to Tiago Vieira, Trellis Developers, andre....@gmail.com
I’m not 100% sure, but I suspect the issue comes from the fact that you provide port configuration for the P4 switch via netcfg, which is likely inconsistent with the port configuration used by Stratum.

I suggest you do the following:

- Run Stratum with a custom chassis_config file to provide port speed and other parameters you need. You can set the CHASSIS_CONFIG env as explained here. You can find the default chassis configs here.
- Remove the entire “ports” block from the netcfg (i.e., remove everything under devices/device:switch-p4/ports). ONOS will discover ports using gNMI.
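
The second suggestion above, removing the static "ports" block for the P4 switch, can be sketched as follows. This is only an illustration: drop_static_ports is a hypothetical helper, and the sample dictionary is a trimmed-down version of the netcfg posted earlier in the thread.

```python
import copy


def drop_static_ports(netcfg, device_id):
    """Return a copy of a netcfg payload without the device's "ports" block."""
    cleaned = copy.deepcopy(netcfg)  # leave the caller's payload untouched
    cleaned["devices"][device_id].pop("ports", None)
    return cleaned


# Trimmed-down version of the P4 switch netcfg from the original post.
netcfg = {
    "devices": {
        "device:switch-p4": {
            "ports": {"144": {"name": "swp4-eth144", "number": 144}},
            "basic": {"name": "swp4", "driver": "stratum-tofino"},
        }
    }
}

cleaned = drop_static_ports(netcfg, "device:switch-p4")
print(sorted(cleaned["devices"]["device:switch-p4"]))
```

The "segmentrouting" and "basic" sections are left as they are; only the statically declared ports go away, so that port discovery is left to the switch.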

Hope it helps
Carmelo


Charles Chan

Oct 22, 2020, 7:15:54 PM
to Tiago Vieira, Trellis Developers, andre....@gmail.com, Carmelo Cascone
Hi Tiago,

+1 to Carmelo's suggestion that you should let ONOS learn the port speed information from Stratum.

Please note that:
- P4 support for T3 is still a work in progress. Our hope is to make it support an OpenFlow / P4 hybrid topology like the one you have, but we are not there yet.
- The P4 pipeline (fabric-tofino) doesn't support IPv6 yet.

Also, it would be helpful if you can attach the onos-diagnostics output.

Thanks,
Charles Chan, Ph.D.
Member of Technical Staff, Open Networking Foundation


Tiago Vieira

Oct 23, 2020, 1:06:23 PM
to Trellis Developers, cha...@opennetworking.org, Trellis Developers, andre....@gmail.com, car...@opennetworking.org, Tiago Vieira
Hi!

Thank you for your suggestions. Unfortunately, nothing changed and the error is still happening...


We've also changed all the QSFP ports to 100G and left the chassis_config file with its default configuration, like:
...
singleton_ports {
  id: 132
  name: "31/0"
  slot: 1
  port: 31
  speed_bps: 100000000000
  config_params {
    admin_state: ADMIN_STATE_ENABLED
  }
  node: 1
}
...
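
To compare what Stratum is configured with against the port numbers referenced elsewhere, the singleton_ports entries can be listed with a short script. This is a minimal sketch, assuming the text-format layout quoted above; list_singleton_ports is a hypothetical helper, and the sample contains only the single entry shown in this message, so it says nothing about the rest of the file.

```python
import re


def list_singleton_ports(chassis_config_text):
    """Extract id, name and speed_bps from each top-level singleton_ports block."""
    ports = []
    current = None
    depth = 0  # brace nesting level, so nested blocks are ignored
    for raw in chassis_config_text.splitlines():
        line = raw.strip()
        if depth == 0 and line.startswith("singleton_ports"):
            current = {}
            ports.append(current)
        elif current is not None and depth == 1:
            match = re.match(r'(id|name|speed_bps):\s*"?([^"]+)"?$', line)
            if match:
                key, value = match.groups()
                current[key] = value if key == "name" else int(value)
        depth += line.count("{") - line.count("}")
        if depth == 0:
            current = None
    return ports


sample = """singleton_ports {
  id: 132
  name: "31/0"
  slot: 1
  port: 31
  speed_bps: 100000000000
  config_params {
    admin_state: ADMIN_STATE_ENABLED
  }
  node: 1
}"""

ports = list_singleton_ports(sample)
print(ports)
```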

The only change we made was setting the media_type parameter from copper to optical in port_map.json:
...
{
      "connector": 32,
      "device_id": 0,
      "mac_block": 25,
      "media_type": "optical",
      "lane0": {
        "mac_ch": 0,
        "tx_lane": 2,
        "tx_pn_swap": 1,
        "rx_lane": 2,
        "rx_pn_swap": 0,
        "serdes_params": {
          "tx_eq_pre": 6,
          "tx_eq_post": 0,
          "tx_eq_attn": 0
        }
      },...


So, as far as I understood from Charles' reply, this is not working because ONOS doesn't support this kind of topology, right?


Best Regards,
Tiago

Charles Chan

Oct 23, 2020, 1:54:21 PM
to Tiago Vieira, Trellis Developers, andre....@gmail.com, car...@opennetworking.org, Yi Tseng
Hi Tiago,

No, I wasn't suggesting that.
Only the T3 troubleshooting tools and IPv6 don't work. IPv4 should still work with the current topology.

  • I am not sure why you need to modify the port_map.json. I never did that in the past. Maybe Carmelo knows more.
  • We have recently noticed an issue of wrong port mapping on switches with the same model name but different hardware revisions. The default port mapping file in Stratum was written based on a pre-production Wedge 100BF-32x and I am not sure if we have updated it (Yi may know about this). I would suggest you first make sure you are configuring the correct port. If you attach to the Stratum container, it should show you the bf-sde prompt. From there, run
    bf-sde > pm
    bf-sde.pm > show -a
    This should tell you the correct mapping between the front panel port (something like 1/0) and the logical port number (something like 128).
  • I would also suggest disabling auto-negotiation and trying again. This can be done by adding the following line to the config_params section of the chassis_config, right after the admin_state line.
    autoneg: TRI_STATE_FALSE
Do you have all the ports coming up after providing the chassis_config to Stratum? Let's try to get to that point first before investigating the control plane config.
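
The auto-negotiation change suggested above can be applied to every port entry with a small text transformation. This is a minimal sketch that treats the chassis_config as plain text rather than parsing the protobuf; disable_autoneg is a hypothetical helper, and the sample is a shortened port entry modelled on the one quoted earlier in the thread.

```python
def disable_autoneg(chassis_config_text):
    """Insert 'autoneg: TRI_STATE_FALSE' right after each admin_state line."""
    lines = chassis_config_text.splitlines()
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        if line.strip().startswith("admin_state:"):
            following = lines[i + 1].strip() if i + 1 < len(lines) else ""
            if not following.startswith("autoneg:"):  # avoid adding it twice
                indent = line[: len(line) - len(line.lstrip())]
                out.append(indent + "autoneg: TRI_STATE_FALSE")
    return "\n".join(out)


sample = """singleton_ports {
  id: 132
  config_params {
    admin_state: ADMIN_STATE_ENABLED
  }
}"""

patched = disable_autoneg(sample)
print(patched)
```

Running the function on its own output leaves the text unchanged, so it is safe to apply more than once.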


Thanks,
Charles Chan, Ph.D.
Member of Technical Staff, Open Networking Foundation

Tiago Vieira

Oct 26, 2020, 2:41:23 PM
to Trellis Developers, cha...@opennetworking.org, Trellis Developers, andre....@gmail.com, car...@opennetworking.org, y...@opennetworking.org, Tiago Vieira
Hi Charles


I verified that both the port_map.json and chassis_config files are consistent with the port info in the Stratum container, except that port_map.json has an 'extra' connector 33. Also, as I said before, I changed every 'media_type' parameter to optical; otherwise the switch links wouldn't appear in ONOS. Those links (to the OpenFlow switches) were automatically discovered by ONOS, so I assume the physical connection is OK:




I also followed your suggestion to add the autoneg line in chassis_config file.


Regards,
Tiago

Charles Chan

Oct 26, 2020, 8:28:57 PM
to Tiago Vieira, Trellis Developers, andre....@gmail.com, car...@opennetworking.org, y...@opennetworking.org
Hi Tiago,

It looks like you were trying to attach a picture but I am unable to view it.

Anyway, having all links showing up means the chassis_config is likely correct. We should now move forward and check what's wrong on the control plane side.
Are you able to provide the full onos-diagnostics as I suggested in the previous email?
Also, I noticed that proxyarp app is activated. We shouldn't need that in the latest Trellis and it may interfere with other apps. Please try to deactivate it.
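
Deactivating the proxyarp app can also be scripted through the REST API. The sketch below only builds the request and assumes the ONOS applications endpoint, where a DELETE on /onos/v1/applications/{name}/active deactivates an app; the base URL, the credentials, and the helper name build_deactivate_request are assumptions for illustration.

```python
import base64
import urllib.request


def build_deactivate_request(base_url, app_name, user, password):
    """Build (but do not send) a DELETE request that deactivates an ONOS app."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/onos/v1/applications/{app_name}/active",
        headers={"Authorization": f"Basic {token}"},
        method="DELETE",
    )


# Hypothetical controller address and credentials.
deactivate = build_deactivate_request(
    "http://localhost:8181", "org.onosproject.proxyarp", "onos", "rocks"
)
print(deactivate.get_method(), deactivate.full_url)
# To actually deactivate the app: urllib.request.urlopen(deactivate)
```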


Thanks,
Charles Chan, Ph.D.
Member of Technical Staff, Open Networking Foundation


Tiago Vieira

Oct 27, 2020, 6:40:45 AM
to Trellis Developers, cha...@opennetworking.org, Trellis Developers, andre....@gmail.com, car...@opennetworking.org, y...@opennetworking.org, Tiago Vieira
Hi Charles

The full onos-diagnostics output is in the attached file.

Thank you,
Tiago
onos-diags.tar.gz