MoM of today's OCP SONiC call 10/15/2019

Oct 15, 2019, 9:50:24 PM
to sonicproject, Michael Schill, Xin Liu (CLOUD)
  • MoM of today's OCP SONiC call  10/15/2019.

Topics discussed
  • Tech Support export Service 
Review (Q & A):
  • Can custom scripts be added to the tech support service? This is a minimal service for now; they can be added.
  • Should the journal data be part of tech support? It will be included.
  • What level of granularity does the tech support service provide? Minimal service for now; filters and custom plugins (e.g., to export to cloud) can be added.
  • It seems tech support keeps adding repeated data. How can this storage back pressure on the remote server be avoided? Will look into it.
  • Do you have per-process core support? Can the user cherry-pick a few processes instead of all? With the current scheme of things with containers, this does not seem possible right now. BRCM will look into it.
  • Core File Manager
Review (Q & A):
  • Can File Manager do automatic analysis of core dumps? Yes.
  • Will it be possible to export only analysis reports to tech support? Yes.
  • Is the core file uploaded with backtraces? Yes.
  • Does systemd increase the footprint? A little bit.

On Tue, Oct 8, 2019 at 10:27 AM MSREDDY P <> wrote:
  • MoM of today's OCP SONiC call  10/08/2019.

Topics discussed
  • Checkout for OCP summit
  • Test sub group will be back next week [mid of OCT]
  • SONiC Document work group - news-letter bi-weekly [end of October]
  • 201908 Code PR reviews - target next 2 weeks.
  • 201908 Code complete - by Oct 31st
  • 201908 QA start - Nov 1st 

  • MoM of today's OCP SONiC call  9/24/2019.

Topics discussed
  • MGMT Framework - BRCM & DELL

Review (Q & A):
  • List examples of where developers/users need transLib hints.
  • W.r.t. the CVL library, do you have any performance numbers, e.g., for an add-del-add workflow on config objects such as VLAN? Do you see any performance hit? What are the improvements?

On Tue, Sep 24, 2019 at 9:22 AM MSREDDY P <> wrote:
  • MoM of today's OCP SONiC call  9/24/2019.

Topics discussed
  • Dynamic Port BreakOut  - LNKD
  • This talk is an extension of the previous discussion.
Review (Q & A):
  • Can the design incorporate port groups? Offline discussion with Dell, LNKD.
  • Can the design support adding a port persona, e.g., FC/FCoE or Ethernet?
  • What is the default admin status of fanned-out ports? Admin status is DOWN by default.
  • How does the design guarantee the sequencing of delete/add configurations?


On Tue, Sep 17, 2019 at 9:49 AM MSREDDY P <> wrote:
  • MoM of today's OCP SONiC call  9/17/2019.

Topics discussed
  • Firmware Utils  - MLNX
Review (Q & A):
  • Why not leverage the ONIE updater? What is the design rationale behind fwUtils?
  • What is the significance of chassis? Does SONiC support multiple chassis?
  • Can the design support module-level installations?
  • Does the design support a remote image path? Yes.
  • What are the supported methods to download images? Remote URL over http/https.
  • What about image validation, e.g., compatibility between CPLD/BIOS etc.?
  • Can the user skip/install a specific image version using fwUtils? - you should use it manually [skip the fwUpgrade]
  • Can fwUtils support scheduling of reloads after component updates?

2019 Oct Release 
Checkout below for release tracking 


On Tue, Sep 10, 2019 at 9:10 AM MSREDDY P <> wrote:
  • MoM of today's OCP SONiC call  9/10/2019.

Topics discussed
  • Drop Counters HLD  - MSFT
Review (Q & A):
  • Does the design preserve the counters across warm reboots? No.
  • Can the design report to the user if a drop counter is not supported on the platform? Yes.
  • List the caveats with warm reboot cases. E.g., if the device goes wrong after a warm reboot, do the drop counters distinguish the failure reasons?
  • Do we have default settings for the debug counters on the device? No.
  • Can the design provide any templates for configuring the debug counters?
  • Will lifecycle operations (e.g., clear) on these counters affect the existing counters? No.
  • Can the design support logical/aggregate debug counters?
  • Are these counters ASIC independent? Which platforms do you cover?
  • Can this integrate with the mgmt framework?


On Tue, Sep 3, 2019 at 9:22 AM MSREDDY P <> wrote:
  • MoM of today's OCP SONiC call  9/3/2019.

Topics discussed
  • BGP Error handling  - BRCM  
Review (Q & A):
  • Is there any perf impact when this feature is disabled? No.
  • Data shows the RIB-in convergence performance degradation is 44%; it should be linear, so why is it 44%? Can it be improved?
  • What is the scope of the QuickTests? Do they cover only happy paths? Do you have numbers for non-happy-path scenarios?
  • Does QuickTest cover both IPv4 and IPv6? Does QuickTest support a mixed scenario of IPv4 & IPv6? Not yet done for pure IPv6 routes; will be explored.
  • Do you have any special handling for the default route? No.
  • Does it support any debug commands to check the failed routes? Yes.
  • What is the reconciliation on daemon crashes (e.g., BGP), i.e., how are the routes reconciled? Please list the scenarios in the HLD.
  • Can this feature be turned off on demand? If yes, can this affect system stability?
  • PR -

Error Handling - BRCM

Review (Q & A):
  • The overall framework is considering two approaches: 1) introduce an opaque ID to track add-delete-add error handling scenarios; 2) introduce a sync SAI API in addition to the current async SAI API.
  • HLD is out for the community review.
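
To illustrate the first approach above, here is a minimal Python sketch of why an error report keyed only by the table entry is ambiguous in an add-delete-add sequence, and how a per-request opaque ID could disambiguate it. All names here (RouteOp, submit, the ID scheme, the key format) are hypothetical and are not taken from the HLD.

```python
import itertools
from dataclasses import dataclass

# Hypothetical per-application counter used to mint opaque IDs.
_next_id = itertools.count(1)


@dataclass(frozen=True)
class RouteOp:
    op: str         # "add" or "del"
    key: str        # table key, e.g. "ROUTE_TABLE:10.0.0.0/24" (illustrative)
    opaque_id: int  # unique per request, chosen by the application


def submit(op, key):
    """Record a request and tag it with a fresh opaque ID."""
    return RouteOp(op, key, next(_next_id))


def match_by_key(history, failed_key):
    """Key-only matching: every 'add' on the same key is a candidate."""
    return [h for h in history if h.op == "add" and h.key == failed_key]


def match_by_opaque_id(history, failed_id):
    """Opaque-ID matching: at most one request can match."""
    return [h for h in history if h.opaque_id == failed_id]


KEY = "ROUTE_TABLE:10.0.0.0/24"
history = [submit("add", KEY), submit("del", KEY), submit("add", KEY)]

# Suppose the second add is the one that failed.
failed = history[2]
ambiguous = match_by_key(history, failed.key)          # two candidates
exact = match_by_opaque_id(history, failed.opaque_id)  # one candidate
```

With key-only matching the application cannot tell which of the two adds failed; with the opaque ID the report maps back to exactly one request.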


    On Wed, Aug 28, 2019 at 7:19 AM MSREDDY P <> wrote:
    • MoM of today's OCP SONiC call  8/27/2019.

    Topics discussed
    • Dynamic Port BreakOut HLD - LNKD 
    Review (Q & A):
    • Can't SONiC query SAI API to fetch the break out capabilities?
    • Generic question: Why is breakout supported only per interface? Why can't it be per device? Platforms don't allow certain ports due to silicon issues, or the feature is not ready to use the breakout port in this release.
    • Can the breakout feature support a range of ports together?
    • Can ASIC vendors support breakout on range/group of ports?
    • What do platform vendors need to do to support this feature? It seems vendors should provide a platforms.ini file.
    • Can this feature show the list of supported breakout modes to the user?
    • Can breakout feature enforce lanes and aliases to the sonic application?
    • Why can't we define platform files per HWSKU?
    • Can this HLD cover the platform LED feature?
    • How about configuration validation during port breakout? Can this integrate with the MGMT framework CVL lib? Yes.


    • MoM of today's OCP SONiC call  8/20/2019.

    Topics discussed
    • MC-LAG HLD - Nephos 
    Review (Q & A):
    • Can MC-LAG be supported on sub-port interfaces?
    • Update the scope of L2/L3 MC-LAG in the HLD.
    • Can MCLAG support multicast?
    • Do you have scale numbers w.r.t. FDB/ARP/route sync between MC-LAG failures?
    • How can we isolate packet flooding between MCLAG and non-MCLAG ports in the same broadcast domain?
    • Update the HLD with test cases for MC-LAG failover (link/node level) scenarios.


    MoM of today's OCP SONiC call  8/13/2019.

    Topics discussed
    • Sonic management framework - BRCM & DELL
    Review (Q & A)
    • Can the click CLI co-exist with mgmt-framework? Yes.
    • Does the mgmt framework support existing click CLI commands? Yes, click-based CLI commands will be migrated to klish-based CLI.
    • Can the click-based CLI be deprecated? No.
    • Can the mgmt-framework support external AAA servers for authentication? Please add details to the HLD.
    • Add the AAA auth failure workflow to the REST SET workflow.
    • Does the mgmt framework handle end-to-end error handling or a feedback loop? No, out of scope.
    • Why are we pulling the telemetry container into the mgmt container? We don't run multiple gNMI servers in SONiC, and we are requesting the community to rename the sonic-telemetry server and make it part of mgmt-framework.
    • Will the output of the click-based CLI be changed?
    • Does the mgmt-framework support the notion of a startup config?
    • Does the mgmt framework support CLI show commands that reflect the configDB?
    • Can the mgmt-framework support show running config?
    • Feature timelines - the scope is proposing the mgmt framework, and there will be separate feature HLDs coming.

    MoM of today's OCP SONiC call  8/6/2019.

    Topics discussed
    • Sub port interface design - Winda
    Review (Q & A)
    • How is a sub-port interface different from a VLAN interface in SONiC? Ans: A VLAN interface is a bridge port in SONiC.
    • Rename dot1Q table? - Since there is a VLAN interface table, a dot1Q interface table is a little confusing; the community suggested going with a sub-port/interface table.
    • How about a separate sub-interface/port manager for sub-port interfaces?
    • Does the sub-port feature use sonic-cli/direct native calls? It uses Linux iproute2 calls.
    • Do you expect iproute2 upgrades to be needed to support the sub-port feature? No.
    • What is the use case of MTU with a sub-interface?
    • Can sub-port interfaces be supported on port-breakout interfaces?
    • Do you see any issues with the naming convention w.r.t. port breakouts & sub-ports?
    • Is there any limit on sub-port interfaces? Yes, refer to the scalability section [750 per switch].
    • A few questions on sub-port functionality: if a packet enters untagged, how is it routed to a sub-port interface?
    • What is the miss-policy support with sub-port interfaces? Could be dropped - debatable.
    • Define the behavior for untagged and miss-policy packets arriving at a physical port. How does SONiC process these packets?
    • Can physical & sub-port interfaces share the same neighbor table, or different ones?
    • Add a section to the HLD on cross-functional / port properties when the port is a layer 3 / layer 2 port.
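
The minutes only say that the sub-port feature uses Linux iproute2 calls. As a rough illustration, the Python sketch below builds the iproute2 command lines that would create an 802.1Q sub-interface on a parent port. It assumes (this is not confirmed by the minutes) that a sub-port is modelled as a Linux VLAN device named parent.vlan_id; the helper name subport_commands is hypothetical. It also checks the Linux 15-character interface name limit, which is relevant to the naming-convention question raised above.

```python
# Linux interface names are limited to IFNAMSIZ - 1 = 15 characters.
MAX_IFNAME_LEN = 15


def subport_commands(parent, vlan_id, admin_up=False):
    """Return iproute2 argv lists for creating a VLAN sub-interface.

    Assumption: a sub-port maps to a Linux 802.1Q VLAN device named
    "<parent>.<vlan_id>". The commands are only built, not executed.
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    name = f"{parent}.{vlan_id}"
    if len(name) > MAX_IFNAME_LEN:
        raise ValueError(f"interface name too long for Linux: {name}")
    cmds = [[
        "ip", "link", "add", "link", parent,
        "name", name, "type", "vlan", "id", str(vlan_id),
    ]]
    if admin_up:
        cmds.append(["ip", "link", "set", "dev", name, "up"])
    return cmds


for argv in subport_commands("Ethernet0", 10, admin_up=True):
    print(" ".join(argv))
```

Note that a name such as Ethernet128.4094 is 16 characters long and would be rejected by the kernel, so long parent names (for example after a port breakout) may need a shorter naming scheme.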

    • 201908 release - will be delayed to 10/2019
    • Please send out PRs to the SONiC mailing lists.
    • OCP Amsterdam [Europe]- End of Sept.

    MoM of today's OCP SONiC call  07/23/2019.

    Topics discussed
    • Debug framework design spec - BRCM
    Review (Q & A)
    • What is the impact on the current show tech dump?
    • Can the framework support getting the tech dump for a specified time slice/range?
    • Does the framework support any schema for debug event triggers?
    • Where does this framework run? Can the user turn it off?
    • Will the framework export debug data in JSON format?

    MoM of today's OCP SONiC call  07/16/2019.

    Topics discussed
    • Egress Mirror support and ACL action capability check 
    Review (Q & A)
    • Is this feature backward compatible? Yes [sonic-to-sonic]
    • Is there any requirement for egress mirroring to have all packet modifications done in the mirrored copy? No such support.
    • What is the behavior if the maximum number of egress sessions is programmed? - Not a requirement
    • If both ingress and egress mirroring are enabled on the same packet, do we see two mirror copies? Yes, might need a fix around it.
    • Does SONiC have any limit on the number of egress mirror sessions? - depends on the ASIC limit
    • Does this design support truncating the mirrored copy? Is it in the SONiC/SAI spec? Need to check.

    • SONiC Image Build Time Improvements (MLNX)
    Review (Q & A)

    • Does the design use parallel builds? Yes, it makes use of all the CPU threads (12).
    • How much build time improvement can we see if we discount the kernel? - ~1 h (the Linux kernel is built in a separate thread)
    • How is Docker BuildKit different from native docker? - DBK is completely written for docker images and supports isolated users instead of multiple users.

      • 201908 release tracking
      • Repurposing the sub-group meetings to design meetings.

      MoM of today's OCP SONiC call  07/09/2019.

      Topics discussed

      • PDE (Platform Development Environment) /PDDF (Platform Driver Development Framework)- BRCM
      Review (Q&A)
      • Is PDE specific to BRCM chipsets? Not necessarily; whoever supports SAI can use it.
      • What interfaces does PDE provide for ASIC and platform? The PDDF data-driven framework (JSON APIs) & existing driver APIs.
      • Can the framework allow vendor extensions? PDDF supports vendor extensions.
      • How is PDE packaged? PDE can be built along with the full SONiC image & dockers, or as an individual docker.
      • Can custom plugins (e.g., BMC) integrate with PDE? Yes.
      • Can we load PDE onto multiple targets? Possible.

      • PR reviews ownership - checkout the 201908 release tracking page

      MoM of today's OCP SONiC call  06/25/2019.

      Topics discussed

      • VRF design discussion  - Nephos (Jeffrey) 
      Review (Q&A)
      • How is VRF configured in the Linux kernel? As of now, though there is a CLI wrapper, SONiC ultimately uses Linux netlink calls. [Community has some suggestions - Liat may help here with our examples]
      • Questions on the config_db migration script for VRF config migration? Offline discussions would continue/PR feedback.
      • What is the design decision behind creating an empty interface INTERFACE|Ethernet0:{} in config_db? Multiple things: 1) SAI, 2) code complexity behind the resource migration, etc. There is a section in the PR; feedback can be provided.
      • There is a request to add the VRF ID besides the interface name in the next hop. The decision seems to be that we are going with minimal configuration to support the SONiC system design.
      • Can we safely assume the VRF design supports Linux kernel versions later than 4.9? Yes.
      What next?
      • PR discussion could be extended to the next meeting based on the PR feedback. [Jeffrey/Prince]

      MoM of today's OCP SONiC call  06/18/2019.

      Topics discussed

      • Error Handling  - BRCM (Santhosh)
      Review (Q&A)
      We had a great discussion; there was a lot of input from the community and here is some of it. Feel free to add missing comments here.
      • How does the framework support multiple CRUD failures?
      [Ben]: See below
      • Do you provide a knob to switch off the error handling feature? Is a knob necessary?
      [Ben]: No knob is necessary. The error handling proposal is a framework that is available for a) implementation of error reporting in SWSS on a feature-by-feature basis and b) application processing of such errors. Both a) and b) are implementation choices that can be made on a feature-by-feature basis. And if an application does not want to process a supported error, then it can just ignore it.
      • Do applications get out-of-order notifications from the feedback loop? How should they handle that case? E.g., the user does create/delete/create; do you expect the error feedback to come in order?
      [Ben]: The specific comment was that the key/values used to refer to APP_DB (or other) in an ERROR_DB report may not be specific enough to distinguish between different error events. The example given (by Nikos) was a route add-withdraw-add case - since the APP_DB table entry may be the same between the 2 adds, then, if there's an error report, how does the application (FRR in this case) know which of the adds failed? We will come back on this point. 
      • What is the design decision behind a new Error DB? Why can't we merge error attributes into APP DB? 
      [Ben]: We thought about both options, and decided that the ERROR_DB gave a bit more flexibility and avoided changing existing application tables. It was not a clear decision, but we see no reason to move away from it. 
      • What is the mechanism to synchronize route CRUD between APP DB vs new Error DB? 
      [Ben]: See above 
      • Is the new Error DB a mirror of APP DB?
      [Ben]: Not really - but each error table entry points to a corresponding entry in another table (usually APP_DB)
      • The current design mentions an approach to stop propagating failed/error routes to the neighbors. This may not be right as per the RFC; the routes should propagate even though they failed due to some policy. (Nikos)
      [Ben]: This topic went beyond the scope of the framework (#1 above) and into the BGP doc (#2). We will set up a separate offline discussion for this.
      Overall feedback - The feedback loop is necessary to address SAI fatal errors. However, the community requested that the design decouple the feedback loop as much as possible so that applications have the freedom to react to/handle it in their own way.
      [Ben]: That's exactly how it's set up today.
      One option suggested - The framework should be more generic and should accommodate opaque error context for the applications.
      [Ben]: This is a different topic - see above ("The specific comment was that the key/values ....")

      Xin will extend an offline discussion on this topic, stay tuned.

      • SONiC Release 201908 tracking page - Xin can you post the link
      • Action Item for community - Signup for PR reviews

      MoM of today's OCP SONiC call  06/04/2019.

      Topics discussed
      • STP/PVST - Sandeep (BRCM)
      Q & A 
      • Can this STP feature be disabled at compile time? BRCM will explore this (compile-time/run-time options to disable/enable the STP/PVST feature).
      • Warm reboot not supported for PVST? The community requested that more details be added to the design.
      • Multiple questions on the design decision of why STP states are not programmed into the kernel: 1) With the current STP design, the STP states are not populated in the kernel, so ASIC and kernel will be out of sync; what is the downside? 2) Let's say a port/VLAN is not blocking in the kernel but is blocked in the ASIC; what is the behavior of arp/ping/ospf in this scenario? BRCM should document the scenarios.
      • Community requested to document the ASIC and kernel out-of-sync scenarios - AI BRCM
      • There should be no drop if HW says forwarding? Yes.
      • Is there a mechanism to program the states into the kernel? BRCM to explore it.
      • If a trap is configured on a port which is blocked, does the packet come to the CPU? Yes, based on the trap configurations.
      • When a port is blocked in HW, which packets should be sent? - HW shouldn't block L2 packets/LACP exchanges but should drop L3 packets.
      • Can CoPP be programmed to trap to CPU? Yes.

      • HLD on NAT  - Kiran Kella (BRCM)

      Q & A 
      • Does it support payload/embedded headers (ALGs - application level gateways)? Not right now.
      • Discussion continues at the next sub-group meeting.
      • Next sub-group meetings: HLD on NAT, sFlow

      MoM of today's OCP SONiC SUB GROUP call  05/28/2019.

      Topics discussed:
      • Status on MLAG Design discussions - Nephos Team

      Q & A 
      • Does this solution address L3 MLAG alone? Both L3 and L2. It seems the L2 MLAG HLD needs some updates.
      • Does MCLAG support multicast? The Nephos team will update the HLD with all the use cases and missing pieces.
      • When is the next meeting to discuss MCLAG? June 11th
      • The community requested an updated MCLAG HLD from the Nephos team before June 11th.

      Action Items/Announcements
      • Will it be possible to discuss topics other than MCLAG in sub-group calls? Yes. Xin will work on it and adjust the cadence.
      • The community requested that user scenarios be included/updated in HLDs for review.
      • Ben Gale (BRCM) will present a proposal on MCLAG in the next few weeks.
      • Request community to review below MCLAG PR before next sub group meeting (06/11/2019)
      • Here is the PR and design presentation
        1.  MCLAG video -
        2.  MCLAG PR -

      MoM of today's OCP SONiC call  05/21/2019.

      Topics discussed
      • L2 - FDB/MAC enhancements - Anil (Broadcom)

      Q & A 
      • FDB aging per device? Yes.
      • Does FDB aging support per-second granularity? Yes.
      • Can MAC aging be supported per port and VLAN? Anil will add support to the proposal.
      • Design on restricting the warning logs for the VLAN range feature? Broadcom will consider this in the proposal [aggregated log etc.]
      • Does this feature need SAI support from vendors? No new SAI attributes; Broadcom will list the SAI APIs it currently uses for this feature.
      • How are VLAN range updates implemented? The VLAN range is consolidated at config_db and applied down to the hardware in a single shot, with no deletes and adds.
      • Do we have the FDB type in the FDB entry? Yes [static vs dynamic], and it will be displayed in show commands.
      • How are FDB flushes optimized on topology/STP events? Outside of the ASIC; in the case of Broadcom, flushes are quick.
      • How is a system-wide FDB flush done? It should be handled by SAI, by going over all the ports and VLANs; vendor specific.
      • Community ask on MAC aging & MAC move scale numbers? Broadcom will add them to the proposal.
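
The VLAN range answer above (consolidate the range, then apply it in one shot rather than deleting and re-adding everything) can be sketched roughly as follows. This is one possible interpretation, not the Broadcom implementation: the helper names collapse_vlans and vlan_delta are hypothetical, and the sketch only computes the consolidated ranges and the minimal delta, without touching config_db or hardware.

```python
def collapse_vlans(vlan_ids):
    """Collapse VLAN IDs into sorted, inclusive (start, end) ranges."""
    ranges = []
    for vid in sorted(set(vlan_ids)):
        if ranges and vid == ranges[-1][1] + 1:
            # Extend the current contiguous range.
            ranges[-1] = (ranges[-1][0], vid)
        else:
            ranges.append((vid, vid))
    return ranges


def vlan_delta(current, desired):
    """Return (ranges_to_add, ranges_to_remove) between two VLAN sets.

    VLANs present in both sets are left untouched, so an update does not
    delete and re-add VLANs that are unchanged.
    """
    current, desired = set(current), set(desired)
    return collapse_vlans(desired - current), collapse_vlans(current - desired)


print(collapse_vlans([10, 11, 12, 20, 22, 21, 30]))
# [(10, 12), (20, 22), (30, 30)]

to_add, to_remove = vlan_delta(range(10, 21), range(15, 26))
print(to_add, to_remove)
# [(21, 25)] [(10, 14)]
```

VLANs 15 to 20 appear in both sets, so they are neither removed nor re-added.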

      • BFD - Sumit Agarwal (Broadcom)
      Q & A 

      • Discussed BFD implementation Phase 1 & Phase 2.
      • In BFD Phase 1: BFD is part of the BGP docker.
      • In BFD Phase 2: BFD will be implemented in hardware.
      • Can SONiC users turn it off if they don't want it? Yes, at compile time, but the community suggested not running it by default and providing a CLI to enable it.
      • How does BFD work with warm reboots? 1) Planned warm reboot: users can update the BFD timers upfront. 2) Unplanned warm reboot: the BFD session will time out before BGP times out.
      • Can BFD timeouts be configured/controlled on remote BGP peers? Question from Nikos. Needs more discussion.

      • More design reviews are lined up for Aug 2019.
      • Provide feedback on PRs.
      • Watch out for the bi-weekly meeting on design proposals and reviews.

      MoM of today's OCP SONiC call  05/07/2019.

      Topics discussed
      • SONiC 201908 release Planning - 05/07/2019

      Q & A 
      • Need code review support for multi-db performance improvements - MSFT & AVIZ Networks
      • What is the scope of the error handling mechanism work by BRCM? It covers SAI error surfacing and handling.
      • What is the scope of configuration validation? Open for design; the current scope is to use the syslog mechanism to propagate config errors.
      • What VRF feature is planned in SONiC? It is VRF-lite support, not MPLS.
      • Do we have a plan for multi-tenancy VPN with the VRF feature? No, that would be handled separately.
      • When is the VRF-lite design review? Expected 5/21.
      • What is the ETA for dynamic breakout? Xin will work with LNKD.
      • For dynamic breakout, is it possible to get an ASIC vendor ETA? Xin will talk to ASIC vendors [an ETA of early July would help to test it].
      • Do we have a list of platform APIs? Refer to the PMON APIs.
      • How can companies earn OCP credits? Check out the OCP website for how to get credits for things such as software contributions.
      • Is the sub-port feature the same as sub-interface? Yes.
      • What kind of features run on sub-port? No HLD yet; Jipan will come back with an HLD on this.
      • Can we have a small description of sub-port? Xin will work with Alibaba.
      • When is the SAI proposal on sFlow? Dell is working on the SAI proposal for sFlow and will send it for design review.
      • What does the SONiC side use for sFlow? HSflowD; it's an open-source package and the licensing needs to be checked [need to explore the licensing part, work with Xin].
      • Build improvements - experimental, BRCM? A design review is needed on the changes. Ben will provide a design review.
      • What is the Mgmt framework? The goal is to easily manage the SONiC switch [models, serialization, unified CLI, gNMI].
      • What is BFD for FRR used for? For BGP failures.
      • Does BFD-FRR require SAI support? No; the current work does not use any SAI BFD APIs, but they will be used in the next iteration.
      • Is the official SONiC release supported on ONL? No, SONiC has a tight roadmap for the next 8 months.

      • OCP events - events - road show  Taiwan, Beijing, India
      • SONiC next meeting 05/21/2019 
      • The SONiC team will use the alternate Tuesdays for workgroup meetings [Test workgroups & MLAG/L2 workgroups etc.]
      APR release 
      • Redis performance - out of the apr release
      • CLI improvement - moved to next release
      • Any ETA for APR release stabilizations - need to estimate 



      Oct 22, 2019, 1:00:47 PM
      to sonicproject, Michael Schill, Xin Liu (CLOUD)
      • MoM of today's OCP SONiC call  10/22/2019.

      Topics discussed
      • VRRP (Virtual Router Redundancy Protocol)- BRCM
      Review (Q & A):
      • What is preventing support for VRRPv3?
      • How is it different from FRR VRRP support? Did you get a chance to evaluate the FRR VRRP stack?
      • It would be good to list the possible use cases/deployments in which a SONiC user would enable this feature. Can this feature work with data center MLAG-style deployments?
      • How does uplink tracking work? For instance, if there are more than 8 uplink interfaces, how does it affect mastership?
      • How are split-brain scenarios handled?
      • What are the supported VRID ranges?
      Sub group on test framework proposal - starting tomorrow 8-9 AM PST 


      Oct 29, 2019, 12:40:01 PM
      to sonicproject, Michael Schill, Xin Liu (CLOUD)
      • MoM of today's OCP SONiC call  10/29/2019.

      Topics discussed
      • RADIUS - BRCM
      Review (Q & A):
      • Where is the cached MPL (management-privilege-level) stored? It is stored in the protected file /var/run/radius.
      • Can the framework support changing a user from TACACS+ to RADIUS?
      • What RADIUS agent is planned to be used? pam-radius
      • How about user logins across device reboots? Is login expected to fail/succeed? No, as long as the MPL cache is preserved, users can log in.
      • Can the MPL cache be associated with a TTL? No, right now we refresh the session on every user login.
      • Three RADIUS options [many-to-one = Y/N/A] were discussed; which is appropriate for SONiC usage?

      DPKG Caching Framework - BRCM
      • How do you track/calculate the GIT hash for new files and dependencies from the internet?
      • Where is the debug cache stored?
      • Can this framework take more time?
      • How much memory does the debug cache take? ~600MB
      PR is available, discussion will continue next week.
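
As background for the hash question above, here is a minimal Python sketch of a content-derived cache key: the key changes whenever any tracked dependency file changes, so a cached package can be reused only when its inputs are identical. This is an illustration only. The minutes mention a GIT hash; the sketch substitutes SHA-256 from hashlib, and the function name cache_key and the file name rules.mk are hypothetical. Dependencies fetched from the internet are not covered, which is exactly the open question raised in the review.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(paths):
    """Return a hex digest over the names and contents of the given files."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        # Include the file name so renames also invalidate the key.
        digest.update(path.name.encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    dep = Path(tmp) / "rules.mk"  # hypothetical dependency file
    dep.write_text("VERSION = 1\n")
    key_before = cache_key([dep])
    key_same = cache_key([dep])   # unchanged inputs: cache hit
    dep.write_text("VERSION = 2\n")
    key_after = cache_key([dep])  # changed inputs: cache miss

print(key_before == key_same, key_before == key_after)
```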



      Nov 5, 2019, 12:53:40 PM
      to sonicproject, Michael Schill, Xin Liu (CLOUD)
      MoM of today's OCP SONiC call  11/05/2019.

      Topics discussed

      • Ingress Discards Verification - MLNX
      Review (Q & A):
      • Can these discards provide drop reasons?
      • What does the L2 drop counter mean? Is it L2 ANY? - Right now it shows what is available through the SONiC CLI.
      • Can these tests be augmented with debug drop counters?

      • DPKG Caching Framework - BRCM (Will be continued next sub-group meeting)
      Review (Q & A):
      • How do you track/calculate the GIT hash for new files and dependencies from the internet?
      • Where is the debug cache stored?
      • Can this framework take more time?
      • How much memory does the debug cache take? ~600MB

      2019/10 release - PR reviews deadline - mid-November
