MoM of today's OCP SONiC call 8/27/2019


MSREDDY P

Aug 28, 2019, 10:19:16 AM
to sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
  • MoM of today's OCP SONiC call  8/27/2019.

Topics discussed
  • Dynamic Port BreakOut HLD - LNKD 
Review (Q & A):
  • Can't SONiC query the SAI API to fetch the breakout capabilities?
  • Generic question: Why is breakout supported only per interface? Why can't it be per device? Platforms may not allow breakout on certain ports because of silicon issues, or because the feature is not ready for those ports in this release.
  • Can the breakout feature support a range of ports together?
  • Can ASIC vendors support breakout on a range/group of ports?
  • What do platform vendors need to do to support this feature? It seems vendors should provide a platforms.ini file.
  • Can this feature show the user the list of supported breakout modes?
  • Can the breakout feature enforce lanes and aliases for the SONiC application?
  • Why can't we define platform files per HWSKU?
  • Can this HLD cover the platform LED feature?
  • What about configuration validation during port breakout? Can this integrate with the MGMT framework CVL library? Yes.

Thanks,
Madhu


  • MoM of today's OCP SONiC call  8/20/2019.

Topics discussed
  • MC-LAG HLD - Nephos 
Review (Q & A):
  • Can MC-LAG be supported on sub-port interfaces?
  • Update the scope of L2/L3 MC-LAG in the HLD.
  • Can MC-LAG support multicast?
  • Do you have scale numbers w.r.t. FDB/ARP/route sync across MC-LAG failures?
  • How can we isolate packet flooding between MC-LAG and non-MC-LAG in the same broadcast domain?
  • Update the HLD with test cases for MC-LAG failover (link/node level) scenarios.

Thanks,
Madhu


MoM of today's OCP SONiC call  8/13/2019.

Topics discussed
  • SONiC management framework - BRCM & DELL
Review (Q & A)
  • Can the click CLI coexist with the mgmt-framework? Yes.
  • Does the mgmt-framework support existing click CLI commands? Yes, click-based CLI commands will be migrated to the klish-based CLI.
  • Can the click-based CLI be deprecated? No.
  • Can the mgmt-framework support external AAA servers for authentication? Please add details to the HLD.
  • Add the AAA auth failure workflow to the REST SET workflow.
  • Does the mgmt-framework handle end-to-end error handling or a feedback loop? No, that is out of scope.
  • Why are we pulling the telemetry container into the mgmt container? We don't run multiple gNMI servers in SONiC, and are requesting the community to rename the sonic-telemetry server and make it part of the mgmt-framework.
  • Will the output of the click-based CLI change?
  • Does the mgmt-framework support the notion of a startup config?
  • Does the mgmt-framework support CLI show commands that reflect the configDB?
  • Can the mgmt-framework support show running config?
  • Feature timelines - the current scope is to propose the mgmt-framework; separate feature HLDs will follow.


MoM of today's OCP SONiC call  8/6/2019.

Topics discussed
  • Sub port interface design - Winda
Review (Q & A)
  • How is a sub-port interface different from a VLAN interface in SONiC? Ans: A VLAN interface is a bridge port in SONiC.
  • Rename the dot1Q table? Since there is already a VLAN interface table, a dot1Q interface table is a little confusing; the community suggested going with a sub-port/interface table.
  • How about a separate sub-interface/port manager for sub-port interfaces?
  • Does the sub-port feature use sonic-cli or direct native calls? It uses Linux iproute2 calls.
  • Do you expect iproute2 upgrades to be needed to support the sub-port feature? No.
  • What is the use case for MTU on a sub-interface?
  • Can sub-port interfaces be supported on port-breakout interfaces?
  • Do you see any issues with the naming convention w.r.t. port breakouts and sub-ports?
  • Is there any limit on sub-port interfaces? Yes, refer to the scalability section [750 per switch].
  • A few questions on sub-port functionality: If a packet enters untagged, how is it routed to a sub-port interface?
  • What is the miss-policy support with sub-port interfaces? Such packets could be dropped - debatable.
  • Define the behavior for untagged and miss-policy packets arriving at a physical port. How does SONiC process these packets?
  • Do physical and sub-port interfaces share the same neighbor table or use different ones?
  • Add a section to the HLD on cross-functional / port properties when the port is a layer 3 / layer 2 port.


Announcements:
  • 201908 release - will be delayed to 10/2019
  • Please send out PRs to the SONiC mailing lists.
  • OCP Amsterdam [Europe] - end of Sept.

MoM of today's OCP SONiC call  07/23/2019.

Topics discussed
  • Debug framework design spec - BRCM
Review (Q & A)
  • What is the impact on the current show tech dump?
  • Can the framework support getting the tech dump for a specified time slice/range?
  • Does the framework support any schema for debug event triggers?
  • Where does this framework run, and can the user turn it off?
  • Will the framework export debug data in JSON format?

MoM of today's OCP SONiC call  07/16/2019.

Topics discussed
  • Egress Mirror support and ACL action capability check 
Review (Q & A)
  • Is this feature backward compatible? Yes [SONiC-to-SONiC].
  • Is there any requirement for egress mirroring to have all packet modifications done in the mirrored copy? No such support.
  • What is the behavior when the maximum number of egress sessions is programmed? - Not a requirement.
  • If both ingress and egress mirroring are enabled for the same packet, do we see two mirror copies? Yes; this might need a fix.
  • Does SONiC have any limit on the number of egress mirror sessions it supports? - It depends on the ASIC limit.
  • Does this design support truncating the mirrored copy? Is it part of the SONiC/SAI spec? Need to check.

  • SONiC Image Build Time Improvements (MLNX)
Review (Q & A)

  • Does the design use parallel builds? Yes, it makes use of all the CPU threads (12).
  • How much build-time improvement can we see if we discount the kernel? - About 1 h (the Linux kernel is built in a separate thread).
  • How is Docker BuildKit different from native Docker? - DBK is written entirely for Docker images and supports isolated users instead of multiple users.

    Announcements
    • 201908 release tracking
    • Repurposing the sub-group meetings to design meetings.

    MoM of today's OCP SONiC call  07/09/2019.

    Topics discussed

    • PDE (Platform Development Environment) /PDDF (Platform Driver Development Framework)- BRCM
    Review (Q&A)
    • Is PDE specific to the BRCM chipset? Not necessarily; whoever supports SAI can use it.
    • What interfaces does PDE provide for the ASIC and platform? The PDDF data-driven framework (JSON APIs) and the existing driver APIs.
    • Can the framework allow vendor extensions? PDDF supports vendor extensions.
    • How is PDE packaged? PDE can be built along with the full SONiC image and dockers, or as an individual docker.
    • Can custom plugins (ex: BMC) integrate with PDE? Yes.
    • Can we load PDE onto multiple targets? Possible.

    Announcements
    • PR reviews ownership - checkout the 201908 release tracking page

    MoM of today's OCP SONiC call  06/25/2019.

    Topics discussed

    • VRF design discussion  - Nephos (Jeffrey) 
    Review (Q&A)
    • How is VRF configured in the Linux kernel? As of now, though there is a CLI wrapper, SONiC ultimately uses Linux Netlink calls. [Community has some suggestions - Liat may help here with our examples]
    • Questions on the config_db migration script for VRF config migration? Offline discussions will continue, along with PR feedback.
    • What is the design decision behind creating an empty interface entry INTERFACE|Ethernet0:{} in config_db? Multiple things: 1) SAI, 2) code complexity behind the resource migration, etc. There is a section in the PR where feedback can be provided.
    • There is a request to add the VRF ID alongside the interface name in the next hop. The decision seems to be to go with the minimal configuration needed to support the SONiC system design.
    • Can we safely assume the VRF design supports Linux kernel versions later than 4.9? Yes.
    What next? 
    • PR discussion could be extended to next meeting based on the PR feedback. [Jeffery/Prince]

    MoM of today's OCP SONiC call  06/18/2019.

    Topics discussed

    • Error Handling  - BRCM (Santhosh)
    Review (Q&A)
    We had a great discussion with a lot of input from the community; here is some of it. Feel free to add any missing comments here.
    • How does the framework support multiple CRUD failures?
    [Ben]: See below
    • Do you provide a knob to switch off the error handling feature? Is a knob necessary?
    [Ben]: No knob is necessary. The error handling proposal is a framework that is available for a) implementation of error reporting in SWSS on a feature-by-feature basis and b) application processing of such errors. Both a) and b) are implementation choices that can be made on a feature-by-feature basis. And if an application does not want to process a supported error, then it can just ignore it.
    • Can applications get out-of-order notifications from the feedback loop? How should that case be handled? Ex: the user does create/delete/create; do you expect the error feedback to come in order?
    [Ben]: The specific comment was that the key/values used to refer to APP_DB (or other) in an ERROR_DB report may not be specific enough to distinguish between different error events. The example given (by Nikos) was a route add-withdraw-add case - since the APP_DB table entry may be the same between the 2 adds, then, if there's an error report, how does the application (FRR in this case) know which of the adds failed? We will come back on this point. 
    • What is the design decision behind a new Error DB? Why can't we merge error attributes into APP DB? 
    [Ben]: We thought about both options, and decided that the ERROR_DB gave a bit more flexibility and avoided changing existing application tables. It was not a clear decision, but we see no reason to move away from it. 
    • What is the mechanism to synchronize route CRUD between APP DB and the new Error DB?
    [Ben]: See above
    • Is the new Error DB a mirror of APP DB?
    [Ben]: Not really - but each error table entry points to a corresponding entry in another table (usually APP_DB)
    • The current design mentions an approach that stops propagating failed/error routes to the neighbors. This may not be right as per the RFC; the routes should propagate even though they failed due to some policy. (Nikos)
    [Ben]: This topic went beyond the scope of the framework (#1 above) and into the BGP doc (#2). We will set up a separate offline discussion for this.

    Overall feedback - The feedback loop is necessary to address SAI fatal errors. However, the community requested that the design dissociate/decouple the feedback loop as much as possible so that applications are free to react to and handle errors in their own way.
    [Ben]: That's exactly how it's set up today.
    One option suggested - the framework should be more generic and should accommodate an opaque error context for the applications.
    [Ben]: This is a different topic - see above ("The specific comment was that the key/values ....")

    Xin will extend an offline discussion on this topic, stay tuned.


    Announcements 
    • SONiC Release 201908 tracking page - Xin can you post the link
    • Action Item for community - Signup for PR reviews

    MoM of today's OCP SONiC call  06/04/2019.

    Topics discussed
    • STP/PVST - Sandeep (BRCM)
    Q & A 
    • Can this STP feature be disabled at compile time? BRCM will explore this (compile-time/run-time options to disable/enable the STP/PVST feature).
    • Is warm reboot not supported for PVST? The community requested that more details be added to the design.
    • Multiple questions on the design decision not to program STP states into the kernel: 1) With the current STP design, STP states are not populated in the kernel, so the ASIC and kernel will be out of sync; what is the downside? 2) If a port/VLAN is not blocking in the kernel but is blocked in the ASIC, what is the behavior of ARP/ping/OSPF in this scenario? BRCM should document the scenarios.
    • The community requested documentation of the ASIC and kernel out-of-sync scenarios - AI: BRCM.
    • There should be no drop if the HW says forwarding? Yes.
    • Is there a mechanism to program the states into the kernel? BRCM to explore it.
    • If a trap is configured on a port that is blocked, does the packet come to the CPU? Yes, based on the trap configuration.
    • When a port is blocked in HW, which packets should still be sent? - HW shouldn't block L2 packets/LACP exchanges but should drop L3 packets.
    • Can CoPP be programmed to trap to the CPU? Yes.

    • HLD on NAT  - Kiran Kella (BRCM)

    Q & A 
    • Does it support payload/embedded headers (ALGs - application-level gateways)? Not right now.
    • Discussion to continue at the next sub-group meeting.
    Announcements
    • Next sub-group meetings: HLD on NAT, sFlow

    MoM of today's OCP SONiC SUB GROUP call  05/28/2019.

    Topics discussed:
    • Status on MLAG Design discussions - Nephos Team

    Q & A 
    • Does this solution address L3 MLAG alone? Both L3 and L2. It seems the L2 MLAG HLD needs some updates.
    • Does MCLAG support multicast? The Nephos team will update the HLD with all the use cases and missing pieces.
    • When is the next meeting to discuss MCLAG? June 11th
    • The community requested an updated MCLAG HLD from the Nephos team before June 11th.

    Action Items/Announcements
    • Will it be possible to discuss topics other than MCLAG in sub-group calls? Yes. Xin will work on it and adjust the cadence.
    • The community requested that user scenarios be included/updated in HLDs for review.
    • Ben Gale (BRCM) will present a proposal on MCLAG in the next few weeks.
    • The community is requested to review the MCLAG PR below before the next sub-group meeting (06/11/2019).
    • Here are the PR and design presentation:
      1.  MCLAG video - https://www.youtube.com/watch?v=shFEKjBp66Q&feature=youtu.be
      2.  MCLAG PR - https://github.com/Azure/SONiC/pull/325

    MoM of today's OCP SONiC call  05/21/2019.

    Topics discussed
    • L2 - FDB/MAC enhancements - Anil (Broadcom)

    Q & A 
    • Is FDB aging per device? Yes.
    • Does FDB aging support per-second granularity? Yes.
    • Can MAC aging be supported per port and per VLAN? Anil will add support to the proposal.
    • Can the design restrict the warning logs generated by the VLAN range feature? Broadcom will consider this in the proposal [aggregated log, etc.].
    • Does this feature need SAI support from vendors? No new SAI attributes; Broadcom will list the SAI APIs currently used for this feature.
    • How are VLAN range updates implemented? The VLAN range is consolidated at config_db and applied down to the hardware in a single shot, with no deletes and adds.
    • Do we have the FDB type in the FDB entry? Yes [static vs dynamic], and it will be displayed in show commands.
    • How are FDB flushes optimized on topology/STP events? Outside of the ASIC; in the case of Broadcom, flushes are quick.
    • How does a system-wide FDB flush work? It should be handled by SAI, by going over all the ports and VLANs; this is vendor specific.
    • The community asked for MAC aging and MAC move scale numbers. Broadcom will add them to the proposal.
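
As a rough illustration of the VLAN range answer above (consolidate at config_db, then apply to hardware in one shot), the Python sketch below collapses a set of VLAN IDs into contiguous ranges and computes one delta against what is already programmed, so unchanged VLANs are left alone. The function names and the plain in-memory sets are hypothetical; this is one reading of "no deletes and adds", not the Broadcom implementation and not a SONiC or SAI API.

```python
from typing import Iterable, List, Set, Tuple


def consolidate_vlans(vlan_ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse VLAN IDs into sorted, inclusive (start, end) ranges."""
    ranges: List[Tuple[int, int]] = []
    for vid in sorted(set(vlan_ids)):
        if ranges and vid == ranges[-1][1] + 1:
            # Contiguous with the previous ID: extend the current range.
            ranges[-1] = (ranges[-1][0], vid)
        else:
            # Gap found: start a new range.
            ranges.append((vid, vid))
    return ranges


def single_shot_delta(programmed: Set[int], desired: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Return (to_add, to_remove); VLANs present in both sets are not touched."""
    return desired - programmed, programmed - desired


if __name__ == "__main__":
    print(consolidate_vlans([10, 11, 12, 20, 22, 21, 100]))
    # prints [(10, 12), (20, 22), (100, 100)]
```

Computing the delta once per update, rather than deleting the old range and re-adding the new one, is what keeps already-programmed VLANs from flapping in this sketch.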

    • BFD - Sumit Agarwal (Broadcom)
    Q & A 

    • Discussed BFD implementation Phase 1 and Phase 2.
    • In BFD Phase 1: BFD is part of the BGP docker.
    • In BFD Phase 2: BFD will be implemented in hardware.
    • Can SONiC users turn it off if they don't want it? Yes, at compile time, but the community suggested not running it by default and providing a CLI to enable it.
    • How does BFD work with warm reboots? 1) Planned warm reboot: users can update the BFD timers upfront. 2) Unplanned warm reboot: the BFD session will time out before BGP times out.
    • Can BFD timeouts be configured/controlled on remote BGP peers? Question from Nikos; needs more discussion.
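
To make the warm reboot answer above concrete, here is a small Python sketch of the timer arithmetic. It uses the asynchronous-mode detection time from RFC 5880 (the remote detect multiplier times the larger of the local required min RX interval and the remote desired min TX interval). The function names, the 30-second restart window, and the timer values are illustrative assumptions, not values from the HLD.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_required_min_rx_ms: int,
                          remote_desired_min_tx_ms: int) -> int:
    """Asynchronous-mode detection time per RFC 5880, in milliseconds."""
    negotiated_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * negotiated_interval_ms


def session_survives(detection_time_ms: int, restart_window_ms: int) -> bool:
    """True if the peer would not declare the session down during the restart."""
    return detection_time_ms > restart_window_ms


if __name__ == "__main__":
    restart_window_ms = 30_000  # assumed control-plane outage during warm reboot

    # Typical aggressive timers: 3 x 300 ms = 900 ms, far shorter than the outage.
    fast = bfd_detection_time_ms(3, 300, 300)
    print(fast, session_survives(fast, restart_window_ms))

    # Timers relaxed upfront for a planned warm reboot: 5 x 10 s = 50 s.
    relaxed = bfd_detection_time_ms(5, 10_000, 10_000)
    print(relaxed, session_survives(relaxed, restart_window_ms))
```

With the aggressive timers the session times out during the restart, which matches the unplanned case in the minutes; relaxing the timers beforehand is the planned case.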
    Announcements 

    • More design reviews are lined up for Aug 2019.
    • Provide feedback on PRs.
    • Watch out for the bi-weekly meeting on design proposals and reviews.

    MoM of today's OCP SONiC call  05/07/2019.

    Topics discussed
    • SONiC 201908 release Planning - 05/07/2019

    Q & A 
    • Need code review support for multi-db performance improvements - MSFT & AVIZ Networks
    • What is the scope of Error handling mechanism work by BRCM  - It covers SAI error surfacing and handling
    • What is the scope of configuration validation? - Open for design; the current scope is to use the syslog mechanism to propagate config errors.
    • What is the VRF feature planned in SONiC? It is VRF-lite support, not MPLS.
    • Do we have a plan for multi-tenancy VPN with the VRF feature? No, that will be handled separately.
    • When is the VRF lite design review - Expected 5/21
    • What is the ETA for dynamic breakout - Xin will work with LNKD
    • For dynamic breakout, is it possible to get ASIC vendor ETA ? Xin will talk to ASIC vendors [an ETA early July would help to test it]
    • Do we have a list of platform APIs? Refer to the PMON APIs.
    • How can companies earn OCP credits? - Check the OCP website for how to get credits for things such as software contributions.
    • Is the sub-port feature the same as sub-interface? Yes.
    • What kind of features run on sub-port? No HLD yet; Jipan will come back with an HLD on this.
    • Can we have a short description of sub-port? Xin will work with Alibaba.
    • When is the SAI proposal on sFlow? Dell is working on the SAI proposal for sFlow and will send it for design review.
    • What does the SONiC side use for sFlow? HSflowD, an open-source package; the licensing needs to be checked [need to explore the licensing part, work with Xin].
    • Build improvements - experimental, BRCM? A design review is needed on the changes; Ben will provide one.
    • What is the Mgmt framework? - The goal is to easily manage the SONiC switch [models, serialization, unified CLI, gNMI].
    • What is BFD for FRR used for? - For BGP failures.
    • Does BFD-FRR require SAI support? No; the current work does not use any SAI BFD APIs, but the next iteration will.
    • Is the official SONiC release supported on ONL? No, SONiC has a tight roadmap for the next 8 months.

    Announcements 
    • OCP events - www.opencompute.org/events/upcoming events - road show  Taiwan, Beijing, India
    • SONiC next meeting 05/21/2019 
    • The SONiC team will use the alternate Tuesdays for workgroup meetings [Test workgroups, MLAG/L2 workgroups, etc.]
    APR release
    • Redis performance - out of the APR release
    • CLI improvement - moved to next release
    • Any ETA for APR release stabilization? - Need to estimate.

     

    MSREDDY P

    Sep 3, 2019, 12:22:39 PM
    to sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
    • MoM of today's OCP SONiC call  9/3/2019.

    Topics discussed
    • BGP Error handling  - BRCM  
    Review (Q & A):
    • Is there any perf impact from disabling this feature? No.
    • Data shows the RIB-in convergence performance degradation is 44%. It should be linear, so why is it 44%? Can it be improved?
    • What is the scope of the QuickTests? Do they cover happy paths alone? Do you have numbers for non-happy-path scenarios?
    • Does QuickTest cover both IPv4 and IPv6? QuickTest supports a mixed scenario of IPv4 and IPv6; it is not yet done for pure IPv6 routes, which will be explored.
    • Do you have any special handling for the default route? No.
    • Does it support any debug commands to check the failed routes? Yes.
    • What is the reconciliation on daemon crashes (ex: BGP) - how are the routes reconciled? Please list the scenarios in the HLD.
    • Can this feature be turned off on demand? If yes, can this affect system stability?
    • PR - https://github.com/Azure/SONiC/pull/424#pullrequestreview-283110975

    Error Handling - BRCM

    Review (Q & A):
    • Overall, two approaches are being considered for the framework: 1) introduce an opaque ID to track add-delete-add error handling scenarios; 2) introduce a sync SAI API in addition to the current async SAI API.
    • The HLD is out for community review. https://github.com/Azure/SONiC/pull/391
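
The first approach (an opaque ID for add-delete-add scenarios) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `OperationTracker` class, its method names, and the idea that an error report echoes the ID back are all hypothetical, and none of it is taken from the HLD or from any SONiC/SAI interface.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Operation:
    opaque_id: int  # unique per request, even when key and action repeat
    key: str        # e.g. a route prefix
    action: str     # "add" or "delete"


class OperationTracker:
    """Hypothetical application-side bookkeeping for in-flight operations."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: Dict[int, Operation] = {}

    def submit(self, key: str, action: str) -> Operation:
        # Tag every request with a fresh opaque ID before sending it down.
        op = Operation(next(self._ids), key, action)
        self._pending[op.opaque_id] = op
        return op

    def on_error(self, opaque_id: int) -> Optional[Operation]:
        # An error report carrying the ID identifies exactly one request.
        return self._pending.pop(opaque_id, None)


if __name__ == "__main__":
    tracker = OperationTracker()
    first_add = tracker.submit("10.0.0.0/24", "add")
    tracker.submit("10.0.0.0/24", "delete")
    second_add = tracker.submit("10.0.0.0/24", "add")

    failed = tracker.on_error(second_add.opaque_id)
    print(failed)  # the second add, not the first, although the key is identical
```

Because the two adds share the same key, the key alone cannot say which one failed; the per-request ID is what removes the ambiguity raised in the 06/18 discussion.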

    Thanks,
    Madhu

      MSREDDY P

      Sep 10, 2019, 12:11:06 PM
      to sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
      • MoM of today's OCP SONiC call  9/10/2019.

      Topics discussed
      • Drop Counters HLD  - MSFT
      Review (Q & A):
      • Does the design preserve the counters on warm reboots? No
      • Can the design report to the user when a drop counter is not supported on the platform? Yes.
      • List the caveats for warm reboot cases. Ex: if the device misbehaves after a warm reboot, do the drop counters distinguish the failure reasons?
      • Do we have default settings for the debug counters on the device? No.
      • Can the design provide any templates for configuring the debug counters?
      • Will the lifecycle (ex: clear) of these counters affect the existing counters? No.
      • Can the design support logical/aggregate debug counters?
      • Are these counters ASIC independent? Which platforms do you cover?
      • Can this integrate with the mgmt framework?

      Thanks,
      -Madhu

      MSREDDY P

      Sep 17, 2019, 12:49:30 PM
      to sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
      • MoM of today's OCP SONiC call  9/17/2019.

      Topics discussed
      • Firmware Utils  - MLNX
      Review (Q & A):
      • Why not leverage the ONIE updater? What is the design rationale behind fwUtils?
      • What is the significance of the chassis? Does SONiC support multiple chassis?
      • Can the design support module-level installations?
      • Does the design support a remote image path? Yes.
      • What are the supported methods to download images? Remote URL over http/https.
      • What about image validation, ex: compatibility between CPLD/BIOS, etc.?
      • Can the user skip/install a specific image version using fwUtils? - You should do it manually [skip the fwUpgrade].
      • Can fwUtils support scheduling of reloads after component updates?

      2019 Oct Release 
      Checkout below for release tracking 

      Thanks,
      -Madhu



      MSREDDY P

      Sep 24, 2019, 12:23:08 PM
      to sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
      • MoM of today's OCP SONiC call  9/24/2019.

      Topics discussed
      • Dynamic Port BreakOut  - LNKD
      • This talk is an extension of the previous discussion.
      Review (Q & A):
      • Can the design incorporate port groups? Offline discussion with Dell, LNKD.
      • Can the design support adding a port persona, ex: FC/FCoE or Ethernet?
      • What is the default admin status of fanned-out ports? Admin status is DOWN by default.
      • How does the design guarantee the sequencing of delete/add configurations? 
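
A rough Python sketch of the fan-out behavior discussed above: a parent port's lanes are split evenly among child ports, and every child starts with admin status down. The `break_out` function, the child naming by lane offset, and the `lanes`/`admin_status` field names are assumptions made for illustration; they are not taken from the HLD.

```python
from typing import Dict, List


def break_out(parent_index: int, lanes: List[int], num_children: int) -> Dict[str, Dict[str, str]]:
    """Split a parent port's lanes evenly into child ports (hypothetical helper)."""
    if num_children <= 0 or len(lanes) % num_children != 0:
        raise ValueError("lanes must divide evenly among the child ports")

    lanes_per_child = len(lanes) // num_children
    ports: Dict[str, Dict[str, str]] = {}
    for i in range(num_children):
        child_lanes = lanes[i * lanes_per_child:(i + 1) * lanes_per_child]
        # Assumed naming: child index is the parent index plus the lane offset.
        name = f"Ethernet{parent_index + i * lanes_per_child}"
        ports[name] = {
            "lanes": ",".join(str(lane) for lane in child_lanes),
            "admin_status": "down",  # fanned-out ports start administratively down
        }
    return ports


if __name__ == "__main__":
    for name, attrs in break_out(0, [0, 1, 2, 3], 2).items():
        print(name, attrs)
```

Starting children in the down state means a breakout by itself does not begin forwarding traffic on the new ports; an explicit admin-up is still required.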

      Thanks,
      -Madhu





      MSREDDY P

      Oct 8, 2019, 1:27:46 PM
      to sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
      • MoM of today's OCP SONiC call  10/08/2019.

      Topics discussed
      • Check out the OCP summit: https://www.opencompute.org/events/past-summits
      • The test sub-group will be back next week [mid-October]
      • SONiC Document work group - news-letter bi-weekly [end of October]
      • 201908 Code PR reviews - target next 2 weeks.
      • 201908 Code complete - by Oct 31st
      • 201908 QA start - Nov 1st 
      Thanks,
      -Madhu




      • MoM of today's OCP SONiC call  9/24/2019.

      Topics discussed
      • MGMT Framework - BRCM & DELL

      Review (Q & A):
      • List examples of where developers/users need transLib hints.
      • W.r.t. the CVL library, do you have any performance numbers, ex: for an add-del-add workflow on config objects such as VLAN? Do you see any performance hit? What are the improvements?

      eantck rara

      Oct 8, 2019, 6:16:14 PM
      to MSREDDY P, sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
      Hello

      Can someone please share the bridge details and today's meeting time?

      ~Thank you
      /K


      MSREDDY P

      Oct 12, 2019, 8:35:40 PM
      to eantck rara, sonicproject, OCP-Net...@ocp-all.groups.io, Michael Schill, Xin Liu (CLOUD)
      Hello, 

      You can join every Tuesday 8-9 AM PST. 

      In addition, you could subscribe to sonicp...@googlegroups.com; you would then receive the invite from Xin at Microsoft.

      Thanks,
      -Madhu
