Hi Team,
MoM of today's OCP SONiC call 06/18/2019.
Topics discussed
- Error Handling - BRCM (Santhosh)
Review (Q&A)
We had a great discussion with a lot of input from the community; here is some of it. Feel free to add any missing comments.
- How does the framework support multiple CRUD failures?
[Ben]: See below
- Do you provide a knob to switch off the error handling feature? Is a knob necessary?
[Ben]: No knob is necessary. The error handling proposal is a framework that is available for a) implementation of error reporting in SWSS on a feature-by-feature basis and b) application processing of such errors. Both a) and b) are implementation choices that can be made on a feature-by-feature basis. And if an application does not want to process a supported error, then it can just ignore it.
- Do applications get out-of-order notifications from the feedback loop? If so, how should they be handled? Ex: the user does create/delete/create; do you expect the error feedback to come in order?
[Ben]: The specific comment was that the key/values used to refer to APP_DB (or other) in an ERROR_DB report may not be specific enough to distinguish between different error events. The example given (by Nikos) was a route add-withdraw-add case - since the APP_DB table entry may be the same between the 2 adds, then, if there's an error report, how does the application (FRR in this case) know which of the adds failed? We will come back on this point.
- What is the design decision behind a new Error DB? Why can't we merge error attributes into APP DB?
[Ben]: We thought about both options, and decided that the ERROR_DB gave a bit more flexibility and avoided changing existing application tables. It was not a clear decision, but we see no reason to move away from it.
- What is the mechanism to synchronize route CRUD between APP DB vs new Error DB?
[Ben]: See above
- Is the new Error DB a mirror of APP DB?
[Ben]: Not really - but each error table entry points to a corresponding entry in another table (usually APP_DB)
- The current design mentions an approach to stop propagating failed/error routes to the neighbors. This may not be right as per the RFC; the routes should propagate even though they failed due to some policy. (Nikos)
[Ben]: This topic went beyond scope of the framework (#1 above) and into the BGP doc (#2). We will set up a separate offline discussion for this.
Overall feedback - The feedback loop is necessary to address SAI fatal errors. However, the community requested that the design decouple the feedback loop as much as possible so that applications have the freedom to react to and handle errors in their own way.
[Ben]: That's exactly how it's set up today.
One option suggested - the framework should be more generic and should accommodate opaque error context for the applications.
[Ben]: This is a different topic - see above ("The specific comment was that the key/values ....")
Xin will extend an offline discussion on this topic, stay tuned.
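To make the add-withdraw-add ambiguity raised above concrete, here is a minimal Python sketch. It is not the framework's design: the `op_id` token, the `ErrorReport` type, the `TABLE_FULL` code string, and the in-memory dict standing in for the databases are all hypothetical. It only shows why the table key alone cannot tell the two adds apart, and how an opaque per-operation context could.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

# Hypothetical names throughout; plain Python objects stand in for APP_DB/ERROR_DB.
_next_op_id = count(1)


@dataclass(frozen=True)
class ErrorReport:
    key: str                      # table key, e.g. "ROUTE_TABLE:10.0.0.0/24"
    op: str                       # "SET" or "DEL"
    rc: str                       # illustrative error code string
    op_id: Optional[int] = None   # opaque context echoed back to the application


class App:
    """Application that tags each operation so error reports can be matched."""

    def __init__(self):
        self.pending = {}  # op_id -> (key, op)

    def submit(self, key, op):
        op_id = next(_next_op_id)
        self.pending[op_id] = (key, op)
        return op_id

    def match(self, report):
        """Return the op_id of the operation the report refers to, or None."""
        if report.op_id is not None:
            return report.op_id if report.op_id in self.pending else None
        # Without opaque context, only the key and op are available.
        candidates = [i for i, (k, o) in self.pending.items()
                      if k == report.key and o == report.op]
        return candidates[0] if len(candidates) == 1 else None  # ambiguous -> None


app = App()
key = "ROUTE_TABLE:10.0.0.0/24"
first_add = app.submit(key, "SET")
withdraw = app.submit(key, "DEL")
second_add = app.submit(key, "SET")

# Key-only report: two pending SETs share the key, so it cannot be attributed.
ambiguous = app.match(ErrorReport(key, "SET", "TABLE_FULL"))
# Report carrying the opaque op_id: attributed to the second add.
resolved = app.match(ErrorReport(key, "SET", "TABLE_FULL", op_id=second_add))
```

In this sketch the key-only report stays unattributed, while the report that echoes the application's own token resolves to the second add.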
Announcements
- SONiC Release 201908 tracking page - Xin can you post the link
- Action Item for community - Sign up for PR reviews
Thanks & Regards,
-Madhu
Aviz Networks
Some additions below .... thanks
Hi Team,
MoM of today's OCP SONiC call 06/11/2019.
Topics discussed
- sFLOW - Padma Narayana (Dell)
Review (Q&A)
- How to support sFlow rate limiting? One option is via CoPP
- Are there any hsflowd/InMon license implications? Padmanabhan to follow up
- Do we maintain separate repos for sFlow custom configurations? maintain SONiC Repo -
- [Ben]: Not quite - the proposal from Padman was to pull the code in from the 3rd-party repo at build time - same way as teamd is handled today. The question was whether we should instead bring a fork into SONiC so we can more easily make local changes (similar to how FRR is handled today). Padman's view was that the level of changes likely required would not justify that - accepted. Note that we can always apply SONiC patches at build time (again, as per teamd) if we have trouble getting any fixes applied upstream.
- What configuration needs to be exposed for hsflowd/InMon? Sampling rate, interfaces, etc. Please add it to the HLD if it is not already there.
- Where do you use state DB entries? AI: Dell
- How is the warm boot scenario handled? Please have a section in the HLD. AI: Padmanabhan
- Any recommendations on sFlow sampling performance?
[Ben]: sFlow Counter support is moving into Phase 1 (sFlow spec compliance issue)
[Ben]: Asked whether it was possible to change sFlow configuration without restarting hsflowd.
[Ben]: Agreed that per interface enable/disable is needed. However, per-interface sampling rate could come later
[Ben]: gennetlink details don't need to be configurable
[Ben]: sFlow will be a SONiC build option. However, the OrchAgent/syncd code will always be there
[Ben]: Discussion on whether SAI should use the generic psample driver (available in Debian 10) vs. using its own netlink configuration. Padman said that this is a SAI decision. Cautioned that the generic psample driver may have a performance issue around non-zero copies.
[Ben]: All to focus on PR for further review
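The agreement above (per-interface enable/disable now, per-interface sampling rate possibly later) can be sketched as a small configuration resolver. This is only an illustration: the dictionary layout, the field names, the default rate of 4096, and the choice to treat unlisted interfaces as enabled are assumptions, not the schema from the HLD.

```python
from typing import Dict, Optional

# Hypothetical configuration layout; field names are illustrative only.
GLOBAL_CFG = {"admin_state": "enable", "sample_rate": 4096}  # 1-in-4096 packets
INTF_CFG: Dict[str, dict] = {
    "Ethernet0": {"admin_state": "enable"},
    "Ethernet4": {"admin_state": "disable"},
    # A per-interface rate is the possible later extension.
    "Ethernet8": {"admin_state": "enable", "sample_rate": 1024},
}


def effective_sample_rate(intf: str) -> Optional[int]:
    """Return the 1-in-N sampling rate for an interface, or None if not sampled."""
    if GLOBAL_CFG.get("admin_state") != "enable":
        return None
    cfg = INTF_CFG.get(intf, {})
    # Assumption: an interface with no entry inherits the global enable.
    if cfg.get("admin_state", "enable") != "enable":
        return None
    return cfg.get("sample_rate", GLOBAL_CFG["sample_rate"])
```

With this layout, an interface falls back to the global rate unless it is disabled or carries its own rate.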
- HLD on NAT - Kiran Kella (BRCM)
Review (Q&A)
- Does this design support ALGs? Not supported
- How to support rate-limits? [Ben]: Answer was that NAT miss traffic to the CPU is rate limited using the existing CoPP feature. There will be no per-session "NAT hit" rate limit - not a requirement
- Requirement to gracefully handle flow TableFull scenarios
- What is the design to keep kernel and ASIC conntrack NAT entries in sync?
- Do you have NAT scaling numbers? [Ben]: These are in the HLD (Broadcom HW). However the application is not limited. See also the Table full topic (and application awareness of scale)
- BulkAPI/Flex counters support for NAT flows? Both ways can be supported
[Ben]: The concern was over the performance impact of maintaining potentially thousands of counters. Rejected the idea of making this configurable (i.e. define which sessions are counted), but analyze the options to ensure that system performance is not adversely affected.
[Ben]: Concern over only tracking 2-ways of the 3-way TCP handshake - Guohan to take this offline
[Ben]: All to focus on PR for further review
- Next PR discussion - Error handling BRCM
[Ben]: Also 201908 feature status update
Corrected subject line with correct date.
Hi Team,
MoM of today's OCP SONiC call 06/04/2019.
Topics discussed
- STP/PVST - Sandeep (BRCM)
Q & A
- Can this STP feature be disabled at compile time? BRCM will explore this (compile-time/run-time options to disable/enable the STP/PVST feature)
- Warm reboot not supported for PVST? The community requested that more details be added to the design
- Multiple questions on the design decision not to program STP states into the kernel: 1) With the current STP design, STP states are not populated in the kernel, so the ASIC and kernel will be out of sync; what is the downside? 2) If a port/VLAN is not blocking in the kernel but is blocked in the ASIC, what is the behavior of ARP/ping/OSPF in this scenario? BRCM should document the scenarios.
- The community requested documentation of the ASIC and kernel out-of-sync scenarios - AI BRCM
- There should be no drop if HW says forwarding? yes
- Is there a mechanism to program the states into the kernel? BRCM to explore
- If a trap is configured on a port that is blocked, does the packet come to the CPU? Yes, based on the trap configuration.
- When a port is blocked in HW, which packets should still be sent? HW shouldn't block L2 packets/LACP exchanges but should drop L3 packets.
- Can CoPP be programmed to trap to the CPU? Yes
- HLD on NAT - Kiran Kella (BRCM)
Q & A
- Does it support payload/embedded headers (ALGs, application level gateways)? Not right now.
- Discussion to continue at the next sub group meeting.
Announcements
- Next sub group meeting: HLD on NAT, sFlow
Thanks,
Madhu
AvizNetworks
Hi Team,
MoM of today's OCP SONiC SUB GROUP call 05/28/2019.
Topics discussed:
- Status on MLAG Design discussions - Nephos Team
Q & A
- Does this solution address L3 MLAG alone? Both L3 and L2. It seems the L2 MLAG HLD needs some updates.
- Does MCLAG support multicast? Nephos team will update the HLD with all the use-cases and missing pieces.
- When is the next meeting to discuss on MCLAG ? June 11th
- Community requested Nephos team for Updated MCLAG HLD before Jun 11th.
Action Items/Announcements
Thanks,
-Madhu
AvizNetworks
Minor correction:
- Can FDB clear support per port and VLAN ? Anil will add support to the proposal
Thanks,
Anil
Hi Team,
MoM of today's OCP SONiC call 05/21/2019.
Topics discussed
- L2 - FDB/MAC enhancements - Anil (Broadcom)
Q & A
- Is FDB aging per device? yes
- Does FDB aging support per-second granularity? yes
- Can MAC aging support per port and VLAN ? Anil will add support to the proposal
- What is the design for restricting warning logs for the VLAN range feature? Broadcom will consider this in the proposal [Aggregated log etc.]
- Does this feature need SAI support from vendors? No new SAI attributes; Broadcom will list the SAI APIs currently used for this feature.
- How are VLAN range updates implemented? The VLAN range is consolidated at config_db and applied down to the hardware in a single shot, with no deletes and adds.
- Do we have FDB type in the fdb entry ? yes [static vs dynamic] and will be displayed in show commands
- How are FDB flushes on topo/STP events optimized? Outside of the ASIC; in the case of Broadcom, flushes are quick.
- How does a system-wide FDB flush work? It should be handled by SAI, by going over all the ports and VLANs; vendor specific.
- Community ask on MAC aging & MAC move scale numbers? Broadcom will add into the proposal
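As an illustration of the VLAN range consolidation described above, the sketch below collapses a set of VLAN IDs into contiguous ranges so that a change can be applied as one update rather than a series of deletes and adds. The function name and the tuple representation are assumptions; the actual config_db handling is defined in the proposal.

```python
from typing import Iterable, List, Tuple


def consolidate_vlans(vlan_ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse VLAN IDs into sorted, inclusive (start, end) ranges."""
    ranges: List[Tuple[int, int]] = []
    for vid in sorted(set(vlan_ids)):
        if not 1 <= vid <= 4094:
            raise ValueError(f"invalid VLAN ID: {vid}")
        if ranges and vid == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], vid)  # extend the current range
        else:
            ranges.append((vid, vid))          # start a new range
    return ranges
```

For example, the IDs 10, 11, 12, 20, 21, 22 and 100 consolidate to the ranges 10-12, 20-22 and 100-100.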
- BFD - Sumit Agarwal (Broadcom)
Q & A
- Discussed on BFD implementations phase 1 & Phase 2.
- In BFD Phase-1 : BFD is part of BGP docker
- In BFD Phase 2 : BFD will be implemented in hardware.
- Can SONiC users turn it off if they don't want it? Yes, at compile time, but the community suggested not running it by default and providing a CLI to enable it.
- How does BFD work with warm reboots? 1) For a planned warm reboot, users can update the BFD timers upfront. 2) For an unplanned warm reboot, the BFD session will time out before BGP times out.
- Can BFD timeouts be configured/controlled on remote BGP peers? Question from Nikos. Needs more discussion.
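The unplanned warm reboot answer relies on BFD detecting the loss before the BGP hold timer expires. A rough sketch of that comparison follows, using the BFD detection time in asynchronous mode as defined in RFC 5880 (remote detect multiplier times the agreed interval). The function names and the timer values are illustrative assumptions, not figures from the proposal.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_required_min_rx_ms: int,
                          remote_desired_min_tx_ms: int) -> int:
    """Detection time in asynchronous mode (RFC 5880, section 6.8.4)."""
    agreed_interval = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval


def bfd_expires_first(detection_time_ms: int, bgp_hold_time_s: int) -> bool:
    """True if the BFD session times out before the BGP hold timer."""
    return detection_time_ms < bgp_hold_time_s * 1000


# Illustrative values only: multiplier 3, 300 ms intervals in both directions.
detect_ms = bfd_detection_time_ms(3, 300, 300)
```

With these example values the detection time is 900 ms, well below a 180 s BGP hold time, which is consistent with the suggestion to adjust BFD timers ahead of a planned warm reboot.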
Announcements
- More design reviews lined up for Aug 2019.
- Provide feedback on PRs
- Watch out for bi weekly meeting on design proposals and reviews.
Best Regards,
-Madhu
Aviz Networks
Hi Team,
MoM of today's OCP SONiC call 05/07/2019.
Topics discussed
- SONiC 201908 release Planning - 05/07/2019
Q & A
- Need code review support for multi-db performance improvements - MSFT & AVIZ Networks
- What is the scope of Error handling mechanism work by BRCM - It covers SAI error surfacing and handling
- What is the scope of Configuration validations - Open for design; the current scope is to use the syslog mechanism to propagate config errors.
- What is the VRF feature planned in SONiC? It is VRF-lite support, not MPLS.
- Do we have a plan for multi-tenancy VPN with the VRF feature? No, that would be handled separately.
- When is the VRF lite design review - Expected 5/21
- What is the ETA for dynamic breakout - Xin will work with LNKD
- For dynamic breakout, is it possible to get ASIC vendor ETA ? Xin will talk to ASIC vendors [an ETA early July would help to test it]
- Do we have a list of platform APIs ? refer PMON APIs
- How to earn OCP credits for companies - Check the OCP website for how to earn credits, e.g. through software contributions.
- Is the sub-port feature the same as sub-interface? yes
- What kind of features run on sub-port? No HLD yet, Jipan will come back with HLD on this
- Can we have small description on sub-port ? Xin will work with Alibaba
- When is the SAI proposal on sFlow? Dell working on the SAI proposal for sFlow and will send for design review.
- What does the SONiC side use for sFlow? hsflowd; it's an open-source package and the licensing needs to be checked [Need to explore the licensing part, work with Xin]
- Build improvements - experimental BRCM ? design review needed on the changes. Ben will provide a design review
- What is the Mgmt framework? The goal is to easily manage the SONiC switch [models, serialization, unified cli, gnmi]
- What is the BFD for FRR used for - for BGP failures
- Does BFD-FRR require SAI support? No, the current work does not use any SAI BFD APIs; they will be used in the next iteration.
- Does the official SONiC release support ONL? No, SONiC has a tight roadmap for the next 8 months.
Announcements
- OCP events - www.opencompute.org/events/upcoming events - road show Taiwan, Beijing, India
- SONiC next meeting 05/21/2019
- The SONiC team will use the alternate Tuesdays for workgroup meetings [Test workgroups & MLAG/L2 workgroups etc. ]
APR release
- Redis performance - out of the apr release
- CLI improvement - moved to next release
- Any ETA for APR release stabilizations - need to estimate
Best Regards,
-Madhu
Aviz Networks
--
You received this message because you are subscribed to the Google Groups "sonicproject" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sonicp...@googlegroups.com.
To post to this group, send email to sonicp...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sonicproject/CADavDeaRoh2ApMWWpnOXRK0qxRUfGd_ioAk%2Bdpv%3D5afhNEV--A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.