Thanks, Madhu
- MoM of today's OCP SONiC call 02/07/2023
Topics discussed.
SwitchPort mode CLI and VLAN CLI management - xFlow
- How does the HLD address the state transitions between access and trunk mode? Please add a section describing them.
- Describe how the db migrator will address backward compatibility.
- What will be the default VLAN mode; should it be trunk?
- How does the design handle someone modifying config_db.json without the SONiC CLI?
- How does the design handle a failed VLAN configuration? >> It seems that if any VLAN config fails, it stops and returns an error without reverting the config already applied.
- If it is a routed port with an IP address configured, what is the transition process?
- How do you handle the backward compatibility issue with the Yang model?
- If a port is marked as an access port, is it an access port forever?
- Update the HLD with the possible scenarios for supporting access, trunk and routed ports.
- sonic-mgmt tests need to be modified accordingly.
- The HLD should mention moving a port from access to trunk mode.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/912/files
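The concern raised above, that a failed VLAN configuration stops without reverting what was already applied, can be illustrated with an apply-then-rollback pattern. This is only a minimal sketch and not the HLD's design: `apply_with_rollback`, the `(apply, undo)` step pairs, `add_vlan` and the in-memory `vlans` set are hypothetical stand-ins for real CONFIG_DB writes.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical step type: an (apply, undo) pair of callables.
# Nothing here is a SONiC API; a real implementation would write to CONFIG_DB.
Step = Tuple[Callable[[], None], Callable[[], None]]


def apply_with_rollback(steps: List[Step]) -> bool:
    """Apply steps in order; if one fails, undo the steps already applied."""
    undo_stack: List[Callable[[], None]] = []
    for apply, undo in steps:
        try:
            apply()
        except Exception:
            # Revert in reverse order so the most recent change is undone first.
            for revert in reversed(undo_stack):
                revert()
            return False
        undo_stack.append(undo)
    return True


# Illustrative in-memory "config": the set of VLAN IDs on a port.
vlans: Set[int] = set()


def add_vlan(vid: int) -> Step:
    def apply() -> None:
        if not 1 <= vid <= 4094:
            raise ValueError(f"invalid VLAN id {vid}")
        vlans.add(vid)

    def undo() -> None:
        vlans.discard(vid)

    return apply, undo


# The third step fails, so the first two are reverted.
ok = apply_with_rollback([add_vlan(10), add_vlan(20), add_vlan(5000)])
print(ok, sorted(vlans))
```

In this sketch the failed batch leaves no partial configuration behind; whether the real CLI should behave this way is the open question from the review.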
Thanks,
-Madhu
- MoM of today's OCP SONiC call 01/31/2023
Topics discussed.
Resource Monitoring for Generic SAI Extensions - Intel
- How is SAI integrated into the P4 path using the extension table manager?
- Are any high and low watermarks available with extension tables?
- What is a SAI extension table? Is it used for regular paths or only for the P4 program path?
- How is the default watermark currently handled in SONiC?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1243
Misc
- Weekly meetings will use Teams from next week - https://sonic-net.github.io/SONiC/Calendar.html
-Madhu
On Tue, Jan 24, 2023 at 8:56 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 01/24/2023
Topics discussed.
Reset Factory HLD - Nvidia
- How does this design handle RMA scenarios? Ex: when a device is RMA'd or sent back to the vendor and all the data must be deleted for security reasons.
- Why not use the ONIE option for SONiC devices to do an RMA? >> There is no unified experience for customers; for example, ARM platforms don't use ONIE.
- Does the design handle the bgp-frr-config, which is stored in the BGP docker mounted to /etc/sonic?
- How does the design handle the case where the box has two images? How does the factory reset handle it?
- How is a brand new switch different from one reset to factory defaults? >> Can you add the scope and use cases of this design to the HLD?
- Can this HLD handle a warm reboot scenario where syncd has files mounted to sda3?
- How does the user choose the default factory reset using the CLI? >> If no value is entered, the default is used.
- How does the design handle user mgmt during a factory reset?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1231
Thanks,
-Madhu
On Tue, Jan 17, 2023 at 2:50 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 01/17/2023
Topics discussed.
Clock management HLD - Nvidia
- Can the design set the time zones of all the docker services by setting this date and timezone?
- Can this design generate a syslog on date and time changes? Community advised to add syslog support for changing clocks.
- How does the design sync with the BIOS clock?
- Does the HLD take care of image upgrades and config migrations? Ex: during an upgrade, the default value "ETC/UTC" cannot always be assumed, because the system might have been configured with another timezone using timedatectl before the upgrade.
- PR is out for review - https://github.com/sonic-net/SONiC/blob/54dd6e6c2b1db14460dbee44f635a5a5daebcf59/doc/Clock%20commands/clock_managment_hld.md
Update rsyslog HLD - Nvidia
- How does the design handle syslog config for the dockers and the base SONiC image? Does the HLD address it? >> No
- How does the design handle inconsistencies in rsyslog.conf? Ex: on boot, config_db must be consistent with the rsyslog config. How is rsyslog.conf updated (diff/overwrite), or is it built from scratch?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1218/files
General
- 202305 feature submission is closed; check it out here - https://github.com/orgs/sonic-net/projects/8/views/2
- 202211 PRs: for any pending reviews, reach out to @Yanzhao Zhang
Madhu
On Mon, Jan 16, 2023 at 5:04 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 01/10/2023
Topics discussed.
Config Reload Enhancement HLD - Nvidia
- What criteria determine whether a service is critical or not? Provide some guidance to the users.
- Describe how the new config reload design works with Zero Touch Provisioning cases.
- Shouldn't the existing waits for postInitDone be sufficient? How does the Feature table help?
- Do you have performance numbers with this design enhancement?
- Does the design enhancement cover low-end platforms like 1G ARM?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1203
Thanks,
-Madhu
On Wed, Dec 14, 2022 at 8:33 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 12/13/2022
Topics discussed.
User Management HLD - Nvidia
- Why do you need a build flag to enable this feature? Why not use the config DB to control it? >> Suggestion: use a run-time config rather than a build-time flag.
- Suggestion: provide a CLI to manage new roles for user mgmt.
- Suggestion on the Yang model: use a leafref to /ROLE_TABLE/ROLE_TABLE_LIST/name instead of fixing the user roles to admin/monitor.
- Using CONFIG_DB for user mgmt seems to be a security risk; creds can be leaked when collecting techsupport. Please validate it.
- Is the user mgmt config saved to the config DB or written directly to /etc/passwd? >> Written to the config DB.
- Can the "monitor" role name be changed to "operator"? >> What is the rationale for naming the role "monitor"?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1048
Thanks,
-Madhu
On Mon, Dec 12, 2022 at 10:05 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 12/06/2022
Topics discussed.
Generic Hash HLD - Nvidia
- How are ASIC hash capabilities expressed? Does SAI support it today? >> Yes, the SAI query attribute capabilities will be used.
- The HLD is limited to SAI capabilities. If the vendor does not support something, what will the vendor implementation be? >> The vendor must return the API response not_implemented for unsupported parameters or pkt types.
- The HLD assumes a global configuration. How does the user know whether the ASIC applies the hash config to all pkt types? Is there an ASIC mechanism for that?
- There are several items to be hashed out with this HLD. Let's take a phased approach to adopting it:
- Support the current SAI and list the limitations in the HLD.
- Enhance SAI with a proposal for a feedback mechanism.
- Enhance SONiC with the SAI proposal.
- Decision on generic hash global? - Everyone is OK with at least the hash fields being configurable globally.
- How do users know how the hash is being used for different packet types (MPLS, IPinIP or VxLAN)?
- What is the plan for qualifying these features? >> They will be part of the sonic-mgmt repo.
- How does it handle the case where the ASIC supports more or fewer hash fields than the SAI hash fields configured?
- It seems not every ASIC can switch hash fields on or off, and configured fields may not be available in the ASIC. Is there any default behaviour for picking the fields used in the hash calculation?
- PR is out for review - https://github.com/sonic-net/SONiC/blob/1235e84d925308eaf80b926fc802c832e7fb688b/doc/hash/hash-design.md
Thanks,
-Madhu
On Mon, Nov 28, 2022 at 6:20 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 11/22/2022
Topics discussed.
FIB Suppression - Announcements of routes not installed in Hardware - Nvidia
- What is the performance impact of having the suppress-fib-pending feature enabled by default in zebra?
- With suppress-fib-pending always enabled in zebra, what are the performance numbers with suppress-fib-pending enabled/disabled in SONiC?
- Please make sure the "suppress-fib-pending" command is named consistently across the HLD and matches the zebra/FRR stack.
- Measuring performance: a 10K routes test is not sufficient. Keep the typical leaf use case in mind, with at least 100K routes, when running the performance tests.
- Is there any build flag for suppress-fib-pending in the FRR stack? >> No, it's just a configuration.
- The route check script currently does not support VRF routes. The route_check script needs a change to handle VRFs; application routes are collected only from the default VRF.
- PR is out for review - https://github.com/sonic-net/SONiC/blob/bb09d8b6d3ae491b3bf81a8bd178e4093fe3c551/doc/BGP/BGP-supress-fib-pending.md
Thanks,
-Madhu
On Wed, Nov 9, 2022 at 12:19 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 11/08/2022
Topics discussed.
Portable Console Switch HLD - MSFT
- How does console device discovery work? What is the auto-detect feature for console devices?
- What are the guidelines for vendors to conform to for console device support and discovery?
- How does the design support daisy-chaining the console devices? How does the design get all the device information when they are daisy-chained?
- If you daisy-chain two identical console devices, and this model has 8 ports, how are the port names of the 2nd console device aligned with the physical port names?
- Define a Yang model for the configurations and CLIs.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1012
General:
- ONE Summit Nov 15 – 17 @Seattle
- No community meeting 11/15
Thanks,
-Madhu
On Sun, Nov 6, 2022 at 8:42 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 11/01/2022
Topics discussed.
Time based ACL HLD - MSFT
- What use case is this HLD trying to solve? >> The HLD addresses the following security concern: an ACL stays active until someone removes it, and when the mgmt plane is down it sometimes can't be removed from the data plane.
- What timestamp formats does this HLD support? >> Is it epoch? What other formats does it support?
- Community suggestion - it should support multiple time formats as well as user-defined time formats.
- Does the HLD support periodic timestamp-based ACLs? >> It's on the roadmap.
- Why can't the design reuse the existing ACL rule table to implement time based ACLs?
- Community suggestion - All the new config tables must have Yang models defined.
- Are there any SAI dependencies with this feature? >> No new SAI attributes introduced as part of this feature.
- How does the design cope if the device time is not synchronized with the management system?
- Can this design decouple the ACL implementation from time synchronization (NTP)?
- Is there a mechanism to optimize timestamp-based ACL rules instead of walking through all the time-based ACL rules?
- What if the user changes the switch time? What happens to the stale ACL rules, which would make the system inconsistent? How is this handled?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1078
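Several of the questions above (epoch timestamps, walking every rule, and a changed switch clock) can be made concrete with a small sketch. It assumes, purely for illustration, that each rule carries an epoch start and end time; the `TimedAclRule` record, its field names and `active_rules` are hypothetical and not taken from the HLD.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TimedAclRule:
    # Hypothetical record; the field names are illustrative, not from the HLD.
    name: str
    start_epoch: int  # inclusive, seconds since the Unix epoch
    end_epoch: int    # exclusive


def active_rules(rules: List[TimedAclRule], now_epoch: int) -> List[str]:
    """Return the names of rules whose time window contains now_epoch."""
    return [r.name for r in rules if r.start_epoch <= now_epoch < r.end_epoch]


rules = [
    TimedAclRule("maintenance", start_epoch=1000, end_epoch=2000),
    TimedAclRule("audit", start_epoch=1500, end_epoch=3000),
]

print(active_rules(rules, 1600))  # both windows contain t=1600
print(active_rules(rules, 2500))  # only "audit" is still inside its window
print(active_rules(rules, 500))   # clock set back: neither rule is active
```

This naive version walks all rules on every evaluation, which is the cost the optimization question refers to, and its answer depends entirely on the clock value passed in, which is why an unsynchronized or modified device time matters.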
Gentle reminder
- ONE Summit Nov 15 – 17 @Seattle
Thanks,
-Madhu
On Fri, Oct 28, 2022 at 5:22 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 10/25/2022
Topics discussed.
Generic Hash HLD - Nvidia
- Can the design allow separate configuration for the ECMP and LAG features? >> Yes, it is supported.
- Does the design support an ASIC SAI capability check for pkt hash fields? >> Not in the scope of the design.
- Please add a section on the scope of the hashing techniques based on vendor implementations.
- How does the design handle encap/decap use cases (ex: some fields are not available for hashing)? Please provide a section on which fields are relevant for which use case.
- How does the design handle transit and terminating packets?
- What is the behaviour in migration use cases? Ex: if the user doesn't provide any config, what is the behaviour?
- Does the design allow certain pkt fields to be masked out? Ex: load balancing use cases.
- Add a section in the HLD - provide guidance on using what the vendor supports, ex: SAI capabilities.
- Add a section in the HLD - use cases and examples for each hashing technique.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1101
General Updates
- The ONE-SONiC Workshop (Linux Foundation Events) is still open for registration. Please register and submit your topics. It takes place Nov 15 – 17 in Seattle.
- The sonicproject google group is deprecated; soni...@lists.sonicfoundation.dev is the new home of the SONiC community.
- The SONiC 202211 branch will be forked by the end of this month per schedule; please accelerate the PR review process for the contributed features. Please let me know if you need help.
- SONiC 202305 is open for participation; please check "[SONiC] Call for participation for SONiC 202305 release" (sonicfoundation.dev).
Thanks,
-Madhu
- MoM of today's OCP SONiC call 10/11/2022
Topics discussed.
Teamd Warm Restart - Tencent
- How does the design handle teamD's dependencies on other SONiC containers (ex: swss)? >> During warm restart, SWSS will not restart.
- There is a patch for SIGUSR2 already being used by SONiC; shouldn't the HLD use a different signal?
- For instance, if there are any LAG changes during these 3 sec, which module does the reconciliation? >> The existing teamD must be doing it.
- The HLD talks about container management such as bkp_contianer. Which part of the code manages it? >> Can this be considered Kubernetes mgmt?
- The sonic_installer docker-Upgrade used by the current HLD is too generic; it should not be used for container management. Any thoughts?
- The new docker must be brought up before the old docker gets killed, to ensure a make-before-break approach. What are your thoughts on this? Please add it to the HLD.
- Please add a section about the use cases we are trying to solve for warm reboot with teamD fast mode.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1095
General Updates
- Migrating to google groups
- 202305 release planning kicks off next week
- OCP Summit 2022 is next week, Oct 18-20 (in person)
- ONES Nov 15-17 https://events.linuxfoundation.org/one-summit-north-america/features/one-sonic-workshop/
On Tue, Oct 4, 2022 at 9:53 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 10/04/2022
Topics discussed.
PSU_daemon design/Support PSU power threshold checking - Nvidia
- How is this PSU design different from chassis mgmt? Can the chassis mgmt design be leveraged?
- Community suggestion: please compare the chassis PSU designs w.r.t. PSU budgets and capabilities.
- Does the design introduce any new platform APIs? If yes, please add a separate section to describe them.
- What is the difference between the max power threshold and the critical power threshold? How have these values been determined and set up?
- Community suggested using a hysteresis graph to represent the critical and warning PSU thresholds.
- Provide a CLI command for users to override the hysteresis PSU thresholds in the PSU daemon.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1060
General Updates
- 202211 release - 30th November
- 202305 release planning in the middle of this month.
- OCP summit discount code - SONIC20G2X
- REPO maintainer list elected.
- Meeting migration -> GoToMeeting to LF Zoom meetings (a free LF account is a precondition for joining the Zoom meeting)
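The hysteresis suggestion for the PSU thresholds in the minutes above can be sketched as follows. This is a minimal illustration under an assumption not confirmed by the minutes: the alarm is raised when power rises above the critical threshold and is cleared only once power falls below the lower warning threshold. The class name `HysteresisAlarm` and the wattage values are hypothetical.

```python
class HysteresisAlarm:
    """Two-threshold alarm: set above critical, cleared below warning."""

    def __init__(self, warning_w: float, critical_w: float) -> None:
        if warning_w >= critical_w:
            raise ValueError("warning threshold must be below critical threshold")
        self.warning_w = warning_w
        self.critical_w = critical_w
        self.active = False

    def update(self, power_w: float) -> bool:
        """Feed one power reading (watts) and return the alarm state."""
        if not self.active and power_w > self.critical_w:
            self.active = True
        elif self.active and power_w < self.warning_w:
            self.active = False
        # Readings between the two thresholds leave the state unchanged.
        return self.active


alarm = HysteresisAlarm(warning_w=400.0, critical_w=500.0)
readings = [450.0, 510.0, 480.0, 420.0, 390.0, 450.0]
states = [alarm.update(p) for p in readings]
print(states)
```

The gap between the two thresholds is what keeps the alarm from flapping when consumption hovers around a single limit, which is the behaviour a hysteresis graph in the HLD would document.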
Thanks,
-Madhu
On Mon, Oct 3, 2022 at 12:13 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 9/27/2022
Topics discussed.
PINS SONIC HLD for SAI Generic Extensions - Intel
- Are the new P4RT SAI extension objects (new JSON objects) tracked by SAI Redis?
- What is the PINS SAI path, and how is it different from SAI? Please add a section describing it for the community audience.
- How does the design handle the non-PINS SAI path? Please describe it in a section of the HLD.
- The architecture diagram needs a few modifications, especially the orch managers (vlan, route). How much does the design leverage from the existing orchagent?
- The design describes P4RT building a dependency graph. Why is this needed? Shouldn't the existing dependency resolution suffice?
- Can the dependency graph be used for the regular SAI API?
- Why is table dependency mgmt needed for P4RT extensions?
- Please add sections with examples of the P4RT generic extension and the PINS workflow.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1088
-Madhu
On Tue, Sep 20, 2022 at 3:44 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 9/20/2022
Topics discussed.
Port Profile Init HLD - Nvidia
- Since the HLD makes SAI changes, can it handle backward compatibility? >> Yes
- How much performance gain do we get with the bulk API? Do you have performance numbers for create_switch() vs the new bulk ports API from orchagent?
- What is the requirement for SAI not to create switch_ports()? Shouldn't we support both legacy and bulk? Ex: on some platforms, 10/25G ports don't have dynamic port breakout; for that case, static or legacy mode seems more efficient.
- Can the design support a switch-level bulk API capability instead of implicitly assuming it by checking the API for null?
- How are errors propagated with the bulk API?
- Why not define or use bulk SET APIs instead of setting properties individually?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1084/files
Thanks,
-Madhu
Low agenda, cancelled weekly meeting.
Thanks,
Madhu
On Sep 6, 2022, at 8:23 AM, MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 9/06/2022
Topics discussed.
California Law / Change the password on first boot - Nvidia
- Does this HLD mandate password change on every reboot? >> on first boot
- When the password expires, can the new password be the same as the old one? >> Debian checks the old and new hashes; if there is no change, it asks for a password change (hardening rules will not take effect).
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1077
-Madhu
On Tue, Sep 6, 2022 at 7:36 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 8/30/2022
Topics discussed.
HLD change teamD timer expiry - MSFT/Krishna
- How does the design handle a SONiC device connected to a non-SONiC device (given the changed etherType of LACP packets)? >> The expectation is that non-SONiC devices drop/discard the packets.
- Did you evaluate any alternative mechanisms to support non-SONiC devices? >> Will check.
- What is the warm reboot timeout? >> 90 sec
- How does a SONiC admin know which devices are capable of the new teamD patch?
- How does the design work if some of the devices do not support the teamD patch? >> What is the behaviour when the new LACP retry packets are sent?
- Can the design support a user-configurable LACP retry count? >> The current plan is to hard-code it.
- Does the design send these new LACP packets to all the member ports or per LAG? >> They must be sent to each port.
- Does the design consider the ICCPd VLT case, and how does it work? >> It must run the same version of SONiC; in general, if it is a non-SONiC-supported version, it is a no-op.
- Why should we send actor/partner info in these special packets? >> Will see to optimise it.
- What is the trigger for restarting the timer: receiving the first special packet on every LAG member, or on one LAG member? >>
- How can the peer ACK mechanism be optimized?
- How do ASIC vendors treat these special pkts with dest MAC + ether type? >> Depends on the vendor.
- What is the fallback mechanism in case of a special pkt drop or the retry count resetting back to default? >> There is no notice; add this case.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1073/files
Thanks,
-Madhu
On Tue, Aug 23, 2022 at 8:58 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 8/23/2022
Topics discussed.
S3IP Sysfs - Tencent
- How is this design different from the Linux sensors implementation? >> This design will reuse the sysfs paths exposed by Linux.
- This design looks similar to the PDDF framework. Did you get a chance to study it? Community suggestion - add a section discussing the differences between the S3IP sysfs and PDDF frameworks and explaining the additions.
- What is the plan for vendors to adopt or migrate to this design? >> Refer to the porting guide.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1068/files
Thanks,
-Madhu
On Thu, Aug 18, 2022 at 2:29 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 8/16/2022
Topics discussed.
Syslog rate limit design - Nvidia
- Why doesn't this design reuse hostcfgd for the host-side configuration? >> It can be used; hostcfgd is a new feature that was not available when we started. Nvidia will pursue this.
- The community is looking for a daemon-based approach that listens to config_db for all changes dynamically, ex: dhcp_relay, hostname.
- Does this design support rate limit configuration at runtime (without a service restart)?
- Shouldn't we also make it configurable for rsyslog to suppress duplicate messages?
- Can this design handle losing syslog messages when the rsyslog daemon restarts?
- Should there be an option to specify an immediate restart or next boot?
- Does this design depend on the app extension? >> Nvidia confirmed there is no dependency on the app extension; this is how the functionality is exposed with the app extension.
- Nvidia action items
- Examine: should a daemon be added on the container side or not? hostcfgd w.r.t. the app extension design.
- Where should syslog config be handled: host -> all the containers, or at the docker level?
- Evaluate: what is right? Adding a daemon / critical resource mgr to containers to listen to config_db changes?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1049
Thanks,
Madhu
On Thu, Aug 11, 2022 at 1:37 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 8/9/2022
Topics discussed.
Persistent log level design - Nvidia
- What is the performance impact of moving the log level DB to ConfigDB? Did you consider low mem/CPU platforms? >> Yes
- The design mentions that downgrade is not supported; what will the log level be? >> Default
- How does it handle warm reboot scenarios? >> It boots with the default log level.
- PR is out - https://github.com/sonic-net/SONiC/pull/1041/files#diff-fb61ccb3665c94363d1eb8cbc8cc1ba5b711528c6e323f4eb63cea05e65c7fb6
Static Lag support - Celestica
- Why does the static LAG design default to RoundRobin? It must be load balance; teamD hits 100% CPU. >> Madhukar has a fix for the 100% CPU: set static and dynamic to the same metric, load balance.
- There shouldn't be any LAG design change between static and dynamic. >> Static LAG must work with the default algorithm; only the runner must be different.
- teamD shouldn't be aware of LACP runners in a static LAG configuration. >> Yes, will take care of it.
- Does the design expose a static ALG API? >> No
- What is the default mechanism for the port channel? It must be LACP. >> Yes
- Does the static LAG design support incremental config? >> No, the LAG must be deleted and re-added.
- The Yang model should support changing the mode from static to dynamic.
- Is this design platform agnostic? >> Yes
- Is there any SAI support expected? >> No
- Is the Yang model supported? >> Not yet; please review and add it.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1039
Thanks,
-Madhu
On Tue, Aug 2, 2022 at 9:02 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 8/2/2022
Topics discussed.
OpenPoE presentation by Shasta Cloud (Steve Martin / Doron)
- Are these OpenPoE platforms OCP certified? >> Yes, more platforms in the pipeline.
- Does OpenPoE support larger or ToR form factors such as 48 1G PoE switches? >> Yes, OpenPoE supports small/large form factor PoE switches such as RealTek, Alicat, etc.
- Is multi-tenancy built natively into the OpenPoE cloud controller? >> CloudSDK is available.
- Can this OpenPoE SDK be extended to private 5G/private LTE or Wi-Fi? >> It can be extended.
- What are the challenges of running SONiC on OpenPoE switches? >> Memory is the constraint.
- What is the plan to validate SONiC on OpenPoE switches? >> Community suggested hardening a single platform and setting up the processes so that it could easily be extended to other platforms as a reference.
Thanks,
-Madhu
On Tue, Jul 26, 2022 at 2:49 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 7/26/2022
Topics discussed.
Secure Boot & Secure Upgrade - Nvidia
- At what stage does secure boot image verification occur? >> Before installation
- Does secure boot/upgrade cover warm reboot? >> Yes
- Does SONiC support TPM 2.0? >> It is a hardware feature and is not supported with SONiC secure boot.
- Is measured boot supported? >> It is altogether a TPM-related feature and is not supported with secure boot.
- Is image certification management in the scope of this HLD? >> Not in scope.
- How can the feature be disabled? >> It can be done from the BIOS; the hardware must support it.
- Is there a way to check the integrity of the device installed with the image? >> Secure upgrade HLD
- Secure upgrade shouldn't break warm/fast reboot. >> Yes
- Secure Boot PR is out for review - https://github.com/sonic-net/SONiC/pull/1028
- Secure Upgrade PR is out for review - https://github.com/sonic-net/SONiC/pull/1024
General:
- Repo migration is in progress; target 8/9.
Thanks,
-Madhu
On Tue, Jul 19, 2022 at 8:39 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 7/17/2022
Topics discussed.
SRv6_uSID - Sitanshu Shah/Intel
- What is the plan for uSID support with SONiC FRR? >> FRR 8.1 has basic SRv6 support, and uSID will be available in later releases.
- What will be the roadmap for the uSID feature in SONiC? >> This feature will be available only after the Nov release.
- What is the limit for uSID block? >> there is no limit, if the network requires more than 16 bits, it is allowed.
- PR is out for review - https://github.com/sonic-net/SONiC/blob/216ee5d0b97da1a88502afcbc28bf3a4a0f15f01/doc/srv6/SRv6_uSID.md
General.
- Repo migration to LF - Pilot program (90%) - MSFT
- 202205 release - there are a few issues; in progress.
- A slot is available on Aug 9th for HLD review; please plan for it if you have pending reviews.
Thanks,
-Madhu
On Tue, Jul 19, 2022 at 8:02 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 7/12/2022
Topics discussed.
- No Meeting
Thanks,
-Madhu
On Fri, Jul 8, 2022 at 1:56 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 6/28/2022
Topics discussed.
Bulk SONiC Counter Support - Junchao (Nvidia)
- How does it handle the case where not all counters support the bulk API? >> Ex: queue counters support bulk whereas PG stats do not; it falls back to the non-bulk API.
- Is the counter removed from the counter group on error? >> Yes, it is; no change in the existing logic. Is the same behaviour supported with the bulk API? >> Yes
- Please review and add feedback in case ASIC vendors have questions about implementing this HLD.
- PR is out for review - https://github.com/Junchao-Mellanox/SONiC/blob/e55d24126d046eb003fedae5f439cf82eea9f239/doc/bulk_counter/bulk_counter.md
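The fallback described above, where a counter group without bulk support is read through the non-bulk API, can be sketched as follows. This is a hedged illustration only: `read_counters`, the reader callables and the use of `NotImplementedError` to signal missing bulk support are assumptions made for the example, not SAI or SONiC APIs.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical reader signatures; these are not SAI or SONiC APIs.
BulkReader = Callable[[List[str]], Dict[str, int]]
SingleReader = Callable[[str], int]


def read_counters(
    object_ids: List[str],
    single_read: SingleReader,
    bulk_read: Optional[BulkReader] = None,
) -> Dict[str, int]:
    """Prefer one bulk call; fall back to per-object reads if unsupported."""
    if bulk_read is not None:
        try:
            return bulk_read(object_ids)
        except NotImplementedError:
            pass  # this counter group has no bulk support
    return {oid: single_read(oid) for oid in object_ids}


calls: List[str] = []  # records which path was taken, for illustration


def queue_bulk(ids: List[str]) -> Dict[str, int]:
    calls.append("bulk")
    return {oid: 100 for oid in ids}


def pg_bulk(ids: List[str]) -> Dict[str, int]:
    raise NotImplementedError("bulk not supported for PG stats")


def single(oid: str) -> int:
    calls.append(f"single:{oid}")
    return 7


queue_stats = read_counters(["q0", "q1"], single, queue_bulk)
pg_stats = read_counters(["pg0", "pg1"], single, pg_bulk)
print(queue_stats, pg_stats, calls)
```

Here the queue group is served by a single bulk call while the PG group is read one object at a time, mirroring the queue/PG example given in the answer.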
General
- Repo migration to sonic-net (LF) (Pilot) - in progress
Thanks,
-Madhu
On Tue, Jun 28, 2022 at 10:40 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 6/28/2022
Topics discussed.
SONiC gNMI server interface design - GangLy/MSFT
- Why introduce a new container instead of reusing telemetry? The telemetry container is used for config purposes by some applications today; can it be renamed and continue to be used? If a new container is required, what is the backward compatibility/migration plan?
- It looks like admins can write directly to AppDB with the gNMI API without going through an application, which may result in data consistency issues. There are also security concerns: someone could bypass checks and operate on the DB directly. How can this be mitigated?
- Will gNMI expose capability info, such as which tables can be changed and which cannot?
- Why not use OC models instead of SONiC Yang models for set configuration? >> Some SDN applications like DASH/PINS don't have OC model support. The goal is to support various models and interfaces.
- Community suggested we can have multiple containers as long as there is a clear separation of ownership; otherwise, bringing in more containers ends up with them operating on the same data.
- How does the design handle config application errors?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/996/files#diff-32984e8e053ae4121b728f8d0c5b034daf4c4bf82a49d0c4a83d513d4dec2618
-Madhu
On Tue, Jun 21, 2022 at 8:59 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 6/21/2022
Topics discussed.
Packet_I/O Update - Don/Brian
- Why not use the sFlow framework to program the ASIC for pkt_in/Out? What are the challenges with sFlow?
- What use cases is this HLD trying to solve? Please list them in the HLD.
- How does this design support different ASICs? SAI-related extensions must be discussed in the SAI forum.
- Maintain the getNetLink library as a separate repo outside swss-common? The community agreed to it.
- Please list any requirements, such as rate_limit/CoPP or latency issues, in implementing this design.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/978
Note: Found that community recordings have not been posted since Feb; kindly post them and make them available to the team.
Thanks,
-Madhu
On Tue, Jun 14, 2022 at 9:49 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 6/13/2022
Topics discussed.
- 202211 features (~40) discussed, some of them are backlogs from previous releases, watch out for other emails from Yanzhao.
Yanzhao Zhang / Ying Xie - the recent recordings from March are not available here - https://www.opencompute.org/wiki/Networking/SONiC. Is there a new location? Please help with this.
Thanks,
-Madhu
On Mon, Jun 13, 2022 at 12:57 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 6/07/2022
Topics discussed.
Platform Integration Testing (PIT) - Alibaba
- The PDDF system helps ODM vendors implement the platform APIs in the right way, whereas the PIT system helps SONiC users validate/test the platform APIs.
- How does the PIT system handle vendor-specific details such as LED colors? >> The PIT system handles these cases through per-platform configuration.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1005
Yang schema based event stream via gNMI - MSFT
- How does the design handle BGP operations and events? Yang models will be discussed in the Yang sub-workgroup.
- PR is out for review - https://github.com/sonic-net/SONiC/blob/a1a9f2c4bca0ec4c9d0fe8f8b51222278ecdb745/doc/event-alarm-framework/events-producer.md
Release Updates:
- 202205 is out and Yanzhao shared a detailed email; watch out for the release notes.
- 202211 is in planning; features deferred from 202205 will be picked up, and it is open for new feature submissions.
Thanks,
-Madhu
On Mon, Jun 6, 2022 at 1:03 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 5/31/2022
Topics discussed.
Active-Active dual ToR HLD - MSFT (Jing)
- Continued last week's PR review.
- Discussed the gRPC component architecture overview, used to determine link health and active-standby transitions, and its decision table.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1005
Platform_Integration_Testing - Alibaba
- Why do I need PIT (Platform Integration Testing)? How is it different from PDE (Platform Development Environment)?
- How can the community collaborate? Describe clear ownership of who developed what in the PIT architecture puzzle, and let the community know how to contribute.
- PR is out for review - https://github.com/clarklee-guizhao/SONiC/blob/pit/doc/pit/Platform_Integration_Test_high_level_design.md
- PR will be discussed next week as well.
-Madhu
On Tue, May 24, 2022 at 9:35 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 5/24/2022
Topics discussed.
Active-Active dual ToR HLD - MSFT (Jing)
- What convergence/down time is expected for setting up tunnels on server link failure? 10-20 msec >> The tunnels are IP-in-IP and are set up in advance; the calculated 10-20 msec convergence measures the time taken to detect the link failure and re-route traffic. This depends on various scenarios, which will be listed in the test section.
- How is it different from the Y-cable solution? Can you add a section listing the use cases for active-standby vs active-active?
- Can you add a section to describe the gRPC high level design? Is it a gRPC open implementation or does it need to understand any nuances?
- How are these IP-in-IP tunnels established? Is it a full mesh or selective? >> It is selective between two ToRs, a kind of alternative to ICL.
- Why choose a Link Prober? Why not BFD? >> BFD seems to have challenges on the server side.
- Please add a section in the HLD describing the server requirements for the Dual ToR design.
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1005
- Next week will be continued for 10-20 mins discussion on this
- For Active-Standby design please check OCP Summit 2021 session Dual TOR use case for Single NIC servers - YouTube (Slideshow)
Release Update:
- 202205 release not cut yet due to a few issues.
Thanks,
-Madhu
On Wed, May 18, 2022 at 9:11 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 5/17/2022
Topics discussed.
SystemD bootchart Integration - Nvidia (Stepan)
- What is the major use case/motivation for Nvidia to bring in this tool? >> Nvidia identified performance degradation during SONiC boot; boot charts help to find the root cause.
- What are the use cases of this tool? Can this tool be used in production?
- What are the minimum requirements (memory, CPU, disk space) to support SystemD boot charts? >> Disk space required - 128 KB
- How big are the generated SVG files? >> ~10 MB
- Does this feature cover only boot time? Can we analyse runtime behaviour? >> The binary must be included at build time, and the CLI can then be used to run it at runtime.
- How is the image upgrade use case handled w.r.t. boot chart configuration?
- Does this feature measure dockers/microservices? >> Yes
- Does this design publish performance numbers? >> Not yet
- Is this feature controlled by a build-time flag? >> No, it can be added.
- Is there any evaluation report for tool selection?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1001
Syslog Source IP configuration HLD - Nvidia (Nazaril)
- What is the use case of having a source IP configured?
- Why is the existing design, which selects an interface for syslog to connect through, not sufficient?
- PR is out for review - https://github.com/sonic-net/SONiC/pull/1002
General
- 202205 release branch not yet cut - watch out for the communication
- 202211 release planning is in progress.
Thanks,
-Madhu
On Tue, Apr 26, 2022 at 9:48 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 4/26/2022
Topics discussed.
DASH SAI PTF - INTEL
- How is this DASH proposal different from existing test frameworks, e.g. the SpyTest framework? >> This proposal is for data plane testing based on DASH (Disaggregated APIs for SONiC Hosts).
- Where can the DASH proposal be found? Is there a workgroup discussing it?
- What is the scope of the DASH SAI PTF proposal? Is it for functional testing only, or does it cover scale as well?
- How about underlay testing with DASH? Does this proposal consider the underlay? >> Guohan suggested working with Prince on this item.
- Proposal is out for feedback - https://github.com/reshmaintel/DASH/blob/main/doc_SAI-Proposal-SAI-PTF.md
- Do you have a plan to raise a PR to review this proposal?
-Madhu
- MoM of today's OCP SONiC call 4/19/2022
Topics discussed.
Fast-reboot Flow enhancements - NVIDIA
- How is this feature different from the existing warm/fast reboot sequences? What is the performance/downtime improvement?
- How does this design measure control plane downtime? E.g. does it consider ports, LAGs, routes, VRFs, etc.?
- Does this feature work with the existing control plane assist for warm reboot? Reference: https://github.com/Azure/sonic-utilities/blob/3ff68c4e5287ab2f5d23c23176ebd75a4f629bf0/scripts/neighbor_advertiser
- Do you have benchmarks for control/data plane downtime w.r.t. configuration?
- How does the design handle the image upgrade use case w.r.t. schema updates? >> The existing db_migrator takes care of config_db and app_db schema changes on image upgrade.
- What restoration logic has to be handled in orchagent? >> The logic makes sure all the do_tasks are completed, with no items left in the queue.
- Does the new fast fast-reboot design support bulk-API ASICs? >> Yes
- Is bulk-API support in the ASIC mandatory or optional for the fast fast-reboot feature? >> It is not mandatory for the ASIC to support the bulk API.
- PR is out for review - https://github.com/Azure/SONiC/pull/980
Project announcements:
- SONiC moved to LF - Software for Open Networking in the Cloud (SONiC) Moves to the Linux Foundation - Linux Foundation
- Note: There is no change on our 202205 release plan.
Thanks,
-Madhu
MoM of today's OCP SONiC call 3/29/2022
Topics discussed.
DSCP/TC Remapping for Tunnel traffic HLD - MSFT
- Yang models: some table names are no longer used; the Yang model definitions need to be updated in the HLD and committed.
- Is DSCP/TC remapping vendor specific, given that some vendors don't need to remap because the inner header is used for mapping? Yes.
- The HLD should be updated to describe the various vendors' support for DSCP/TC remapping.
- PR is out for review - https://github.com/Azure/SONiC/pull/950
-Madhu
On Wed, Mar 23, 2022 at 7:59 PM MS Reddy <msreddy...@gmail.com> wrote:
MoM of today's OCP SONiC call 3/22/2022
Topics discussed.
SONiC OpenSSL FIPS 140-3 HLD - MSFT
- Is the SymCrypt library FIPS compliant or certified? >> It is certified; MSFT and the security team submitted the validation, but the certification number has not been received yet.
- Do you have performance numbers with the SymCrypt SSL engine?
- Does this feature support a build-time option? Yes
- PR is out for review - https://github.com/Azure/SONiC/pull/955
CONFIG RELOAD ENHANCEMENT- EDGECORE
- What real use cases does this HLD address? Please list them in the HLD.
- How does the existing "cfg reload" design differ from it? How does it benefit the admin?
- SONiC currently doesn't allow two "cfg reload" commands to run in parallel. What is the problem, does this design address it, and how does it benefit admins?
- PR is out for review - https://github.com/Azure/SONiC/pull/964
Thanks,
-Madhu
On Tue, Mar 15, 2022 at 8:36 AM MS Reddy <msreddy...@gmail.com> wrote:
MoM of today's OCP SONiC call 3/15/2022
Topics discussed.
- How much performance gain/time saving does the batched requests API give? 8%
- How does the batch write API behave if a process crashes? If a consumer crashes, Redis queues up the requests and the process should be able to consume them.
- How does the batch API handle priority inversion? E.g. orchagent is single threaded and can be blocked while high-priority events (link up/down, PFC storms, popping routes) need a quick reaction. Wu will look into it.
- Is there a limitation to the size of the batch?
- How does it handle write failures in a batch? Please explain Redis transactions vs how producers and consumers behave in failure scenarios.
- What use cases can this batch API be used for? Please list the use cases in the HLD.
- How did you test this API? Do you have any performance numbers?
- PR is out for review - https://github.com/Azure/SONiC/pull/959/files
- General: Community must focus on 202205 SONiC May release for feature delivery.
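The batch-size and write-failure questions in the batched requests discussion above can be illustrated with a minimal sketch. It is not the API from the HLD: `BatchWriter`, `max_batch` and the `sink` callable (a stand-in for whatever actually writes to Redis) are hypothetical names, and the retry behaviour is an assumption.

```python
from typing import Callable, List, Tuple

Write = Tuple[str, dict]  # (key, field/value pairs)


class BatchWriter:
    """Buffer writes and hand them to `sink` in batches (hypothetical sketch)."""

    def __init__(self, sink: Callable[[List[Write]], None], max_batch: int = 128):
        if max_batch < 1:
            raise ValueError("max_batch must be at least 1")
        self.sink = sink            # assumed stand-in for the real Redis write path
        self.max_batch = max_batch  # upper bound on the number of writes per batch
        self.pending: List[Write] = []

    def set(self, key: str, values: dict) -> None:
        """Queue one write; flush automatically once the batch is full."""
        self.pending.append((key, dict(values)))
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        """Send all pending writes as one batch."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        try:
            self.sink(batch)
        except Exception:
            # Assumption: on failure, keep the batch ahead of newer writes
            # so that ordering is preserved when the caller retries.
            self.pending = batch + self.pending
            raise
```

A caller that needs low latency for urgent updates could call `flush()` explicitly instead of waiting for the batch to fill; how the real design prioritises such updates is one of the open questions listed above.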
Thanks,
-Madhu
On Tue, Mar 8, 2022 at 8:29 AM MS Reddy <msreddy...@gmail.com> wrote:
MoM of today's OCP SONiC call 3/08/2022
Topics discussed.
- Discussed the 202205 release feature list and did housekeeping.
- Identified the owners who will migrate docker images to Bullseye.
- Watch out for the updated xls from Ying/Zhang; contribute any missing docker packages and update the ownership and reviewers list.
Thanks,
-Madhu
On Tue, Mar 1, 2022 at 9:38 PM MS Reddy <msreddy...@gmail.com> wrote:
MoM of today's OCP SONiC call 3/01/2022
Topics discussed.
- How much build time reduction is seen with the new improvements? So far it has been measured for a single file change, where the build takes just 5 min; more tests need to be done. For more details, refer to the HLD below.
- Do the sonic-build-system improvements support ARM builds, both 32 and 64 bit? Yes, they are applicable, but tested only for AMD.
- Can a single user issue more than one build simultaneously? Yes.
- PR is out for review - https://github.com/Azure/SONiC/blob/9bc83902da0ae1a5db01713a5d0a0611fe876897/doc/sonic-build-system/build-enhancements.md
-Madhu
MoM of today's OCP SONiC call 2/22/2022
Topics discussed.
- Is there a CLI to get the current memory usage of a container? Yes, docker stats provides that.
- If the memory condition persists in Monit, is the tech dump generated again and again? No, it is generated once, by a special instruction from Monit.
- Are these memory thresholds user configurable? Yes, and the default is 200 MB of available memory.
- What kind of report is this tech support? Is there a way to provide a summary report?
- Could we support multiple thresholds, e.g. start at 60%, then at 80% collect one more tech support?
- Is a memory leak or memory threshold breach for a container reported by syslog today? Yes, by the Monit process.
- PR is out for review - https://github.com/Azure/SONiC/blob/669409c18d32db90adb92486a1d877c176fb356a/doc/auto_techsupport_and_coredump_mgmt.md
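The multiple-threshold question above (start at 60%, collect again at 80%), combined with the answer that a dump is generated only once while the condition persists, can be sketched as follows. This is a hypothetical illustration, not the Monit or auto-techsupport implementation; the class name, the threshold values and the re-arm behaviour are assumptions.

```python
class ThresholdTrigger:
    """Fire once per threshold while usage stays above it (hypothetical sketch)."""

    def __init__(self, thresholds=(60.0, 80.0)):
        # Thresholds are percentages of the container memory limit (assumed values).
        self.thresholds = sorted(thresholds)
        self.fired = set()

    def update(self, usage_percent):
        """Return the thresholds newly crossed by this usage sample."""
        # Assumption: a threshold re-arms once usage drops back below it.
        self.fired = {t for t in self.fired if usage_percent >= t}
        newly_crossed = [
            t for t in self.thresholds
            if usage_percent >= t and t not in self.fired
        ]
        self.fired.update(newly_crossed)
        return newly_crossed
```

Each value returned by `update()` would correspond to one tech support collection, so a container that sits at 70% for an hour triggers a single dump rather than one per polling cycle.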
Thanks,
-Madhu
MoM of today's OCP SONiC call 1/11/2022
Topics discussed.
- Are these password rules/policies mandated for REST/HTTP users? Yes, this will be added to the design.
- Is password hardening supported for remote users? No, only local users.
- Is pam_cracklib FIPS compliant? Does it use OpenSSL for encryption/decryption?
- Is this code part of the sonic_mgmt repo? Yes
- Is the feature enabled by default? No, it will be included at compile time and can be chosen at run time.
- Can passwords be rotated? Not part of this design; it can be considered based on the use cases.
- Is it possible to provide an informational log to users/applications about password expiry? Will be included.
- How does it handle switch image upgrades w.r.t. password hardening?
- PR out for review- https://github.com/Azure/SONiC/blob/8edc92e2139d1fd2b7a088396877281116717830/doc/passw_hardening/hld_password_hardening.md
-Madhu
On Thu, Dec 2, 2021 at 5:39 PM MS Reddy <msreddy...@gmail.com> wrote:
MoM of today's OCP SONiC call 11/29/2021.
Topics discussed.
- Discussed the 202111 community release fork date (11/30/2021 PST).
- Release planning: what is in and what will be moved to the next release.
- There is an xls from Zhang with more details; features are marked Yellow and Red.
- Features marked Yellow need to be handled by today, with priority.
- Features marked Red will be moved to the next release.
Thanks,
-Madhu
MoM of today's OCP SONiC call 11/02/2021.
Topics discussed.
- SONiC SAI Challenger - SAI Testing by PLVision.
- Slides will be published soon by PLVision.
Thanks,
-Madhu
MoM of today's OCP SONiC call 10/19/2021.
Topics discussed.
Had 202111 release features review, stay tuned for the update.
Thanks,
-Madhu
On Tue, Oct 12, 2021 at 8:30 AM MS Reddy <msreddy...@gmail.com> wrote:
MoM of today's OCP SONiC call 10/12/2021.
Topics discussed.
NVGRE - by Vadym/Nvidia
- Does the design provide capability checks for tunnel resources from vendors (number of NVGRE tunnels supported)? Ans>> No.
- How about vNet routing support on NVGRE tunnels? Not supported; this feature only encapsulates/decapsulates tunnelled packets.
- PR is out for review - https://github.com/Azure/SONiC/pull/869
Note: Today was the last HLD discussion for the 2106 release.
Dynamic policy based Hashing - by Nvidia
- Not ready, will be postponed to the next release.
Thanks,
-Madhu
- MoM of today's OCP SONiC call 10/05/2021.
Topics discussed.
CMIS Diagnostics - by Dante Su
- There is an SFP refactoring effort; how is this design different from it? >> Debated. This solution will coexist with the SFP refactoring effort and will later be merged into it.
- Is there any impact on the current SFPUtil show commands from the new additions? No impact; there will be a new application advertisement. Please refer to the CLI section for review.
- PR is out for review - https://github.com/Azure/SONiC/pull/876
System Ready Enhancements - by Senthil Kumar Guruswamy
- How is this different from the current Monit feature? Ans>> The Monit summary provides platform status such as LEDs; it shows the running status of the container, not the readiness of the application.
- Does this design consider application readiness vs liveness? Readiness means all the dependent modules are up and the application is ready to serve traffic. What about an application that hogs memory or runs out of threads and cannot service requests? Would it be possible to include a liveness capability in the design?
- https://github.com/Azure/SONiC/pull/875/files
Miscellaneous - by Xin
- OCP date - Nov 9th- 10th
- OCP Schedule will be published on OCP website
- What is the mode of OCP workshop - virtual
- SONiC/SAI Workshop - Tech Talk / Contribution / Proposal / Innovations - Let's plan it for immediately after, or the day after, OCP (Nov 9th-10th)
Thanks,
-Madhu
On Tue, Sep 28, 2021 at 9:01 AM MS Reddy <msreddy...@gmail.com> wrote:
Topics discussed.
Host Interface counters - MLNX/Chen
- Can this design support counters for packets dropped due to DDoS attacks? A few options: attach a policer to the drop counter, or get flow counters from policer stats. >> Chen will look into it.
- PR is out for review - https://github.com/Azure/SONiC/pull/858
Guidelines for reference proprietary code - John/Metaswitch
- Is this design proposal applicable only to Metaswitch? >> No
- How about every company wanting to add their own routing stack, what is the recommendation? >> application extension model would be a great fit here.
- Reference to check - https://github.com/vadymhlushko-mlnx/SONiC/blob/fdb2cae32421affba8a3cec3fda0fee40c091708/doc/cli_auto_generation/cli_auto_generation.md
- PR is out for review - https://github.com/Azure/SONiC/pull/860
Thanks,
-Madhu
On Tue, Sep 21, 2021 at 8:58 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 9/21/2021.
Topics discussed.
SONiC TACACS+ HLD - Hua Liu (IPAM)
- Does the design support showing the list of authorised commands? >> No, the list of commands is managed by the TACACS+ server.
- Is audit supported?
- How does the design work on a remote TACACS+ failover, given that the local server doesn't know the list of commands? >> This is an issue; in SONiC, users can log in and run commands using local permissions.
- How do I block commands run through bash (/bin/sh), python, etc.?
- PR is out for review - https://github.com/liuh-80/SONiC/blob/master/doc/aaa/TACACS%2B%20Design.md
MPLS TC_to_TC_map HLD - Alexander (Metaswitch)
- DB schema should be aligned with the Yang model.
- PR is out for review - https://github.com/Azure/SONiC/blob/96a65f0a4d67dc3b0949d5798be51ab10da99c07/doc/qos/mpls_tc_to_tc_map.md
Thanks,
-Madhu
On Tue, Sep 14, 2021 at 6:36 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 9/14/2021.
Topics discussed.
ECMP Overlay BFD support - Prince / MSFT
- Does this feature support BUM traffic? No. Since Vnet routes are based on unicast routes, it supports unicast only.
- Does the feature support control plane BFD / FRR BFD? No, this BFD is offloaded to the ASIC.
- What is the motivation/use case for the endpoint monitor IP being different from the actual endpoint running BFD? Some devices put data plane and control plane on different ports for monitoring purposes.
- Can the design query ASIC BFD capabilities before writing the BFD session? Possibly; this will be added to the HLD.
- Can control plane BFD and hardware-offloaded BFD coexist on the same device? Yes/No - the complexity needs brainstorming. The HLD will be split into two, with one dedicated to BFD to describe all the scenarios.
- What are the default BFD timers used for offload? Will be included in the HLD.
- Is there any global BFD session table for default values? No
- The community suggested moving BFD into a separate HLD, e.g. because coexistence may cause issues; this needs to be discussed in more detail.
- How about an end-user CLI to choose between hardware-offloaded BFD and FRR BFD for a session? Next phase
- Do you have a BFD state DB schema mapped to the RFC BFD schema? Will be included in the HLD.
- How are BFD session events from HW handled or notified?
- Is it possible to remove BFD routes from the ECMP group? Yes, vNetOrch
- PR is out, please leave comments here - https://github.com/Azure/SONiC/pull/861/files
General Comments
- Suggestion: use the HLD PR as the tracking PR for related code PRs. Reference - https://github.com/Azure/SONiC/pull/806
- 202111 feature release Deadline - Oct 1st
- Some features will be delayed to next release - No list yet identified
- Feature owners with an HLD ready: if you want to schedule the review, reach out to Yanzhao Zhang
Thanks,
-Madhu
On Thu, Sep 2, 2021 at 7:25 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 8/31/2021.
Topics discussed.
Show Running Command Enhancement - EdgeCore/MaxChen
- What are the use cases of this feature? Customers are familiar with a Cisco-like CLI; the goal is to make it more convenient to understand the current running command.
- How does the design handle the maintainability of these improvements?
- How does Yang show the running command?
- It looks like manual work; doesn't it duplicate the sonic-mgmt-framework, which is Yang driven/auto generated? >> Not really!
- PR is out, please leave comments here - https://github.com/Azure/SONiC/pull/838
Routed Subinterfaces Enhancement - Preetham/BRCM
- What are the use cases of shorter subinterface naming?
- Why is it required to bring in short names? >> Kernel interface names are limited to 15 characters.
- Where is the mapping from short to long names, and from child to parent, stored?
- Suggestion - for consistency reasons, keep the vlan interface mandate.
- Is this config change only for routed subinterfaces? How are they differentiated?
- There must already be a short name convention in SONiC; does this design take it into account?
- Why can't this short vs long name conversion be hidden in intfMgrd? >> It looks like the changes are widespread.
- PR is out for review - please leave comments here - https://github.com/Azure/SONiC/pull/833
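The 15-character kernel limit mentioned above (Linux `IFNAMSIZ` is 16, including the terminating NUL) is what makes long names such as `Ethernet128.4094` unusable as kernel interface names. A minimal sketch of the check follows; the `Eth`/`Po` prefixes, the function name, and the choice to shorten only when needed are assumptions for illustration, not the scheme defined in the HLD.

```python
IFNAME_MAX = 15  # Linux IFNAMSIZ is 16, including the terminating NUL

# Assumed long-to-short prefix mapping, for illustration only.
SHORT_PREFIXES = {"Ethernet": "Eth", "PortChannel": "Po"}


def subinterface_name(parent: str, vlan_id: int) -> str:
    """Return '<parent>.<vlan_id>', shortening the parent prefix if needed."""
    name = f"{parent}.{vlan_id}"
    if len(name) <= IFNAME_MAX:
        return name
    for long_prefix, short_prefix in SHORT_PREFIXES.items():
        if parent.startswith(long_prefix):
            # Keep the port number and replace only the prefix.
            name = f"{short_prefix}{parent[len(long_prefix):]}.{vlan_id}"
            break
    if len(name) > IFNAME_MAX:
        raise ValueError(f"{name!r} exceeds {IFNAME_MAX} characters")
    return name
```

With this sketch, `subinterface_name("Ethernet0", 10)` fits as is, while `subinterface_name("Ethernet128", 4094)` has to fall back to the short form. Wherever the conversion lives, the short-to-long mapping question raised above still has to be answered, because other components refer to the long parent name.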
Thanks,
-Madhu
- MoM of today's OCP SONiC call 8/24/2021.
Topics discussed.
SAG - Static Anycast Gateway - EdgeCore/MaxChen
- Is the SAG feature enabled by default? >> No, the feature is disabled by default.
- Why is a knob needed for SAG? >> Please list the reasons in the HLD, including any implications for data path handling of SAG vs macvlan interfaces.
- Unless it's absolutely necessary, a global knob is not required. A global knob will introduce a lot of complexity and cases to handle. >> So is a knob for SAG necessary? Wouldn't the gateway and IP address list be sufficient? >> It seems so.
- SagMgrD is not required if we plan to use SVI instead of macvlan interfaces.
- Is there any hardware resource limit on SAG interfaces? Can it be tracked in CRM?
- Please list the complexities in data path routing when SAG is enabled along with SVI/macvlan interfaces.
- The SAG CLI command can be part of the interface command.
- PR is out - https://github.com/Azure/SONiC/pull/837
Show running enhancement - EdgeCore
General comments
1. Feature owners should speed up as the deadline approaches.
2. Test quality is of the highest importance for community features.
-Madhu
- MoM of today's OCP SONiC call 8/17/2021.
Topics discussed.
PINS - P4 Integrated Network Stack - Google/Intel/ONF
- Why doesn't the design leverage the existing Error DB framework for the feedback loop? >> The PINS team is working closely with the Error framework team to address the gaps.
- Is this design different from FlexSAI? With PINS you can model the entire SAI pipeline, which is not the case with FlexSAI.
- What are the advantages of exposing the entire SAI pipeline using PINS? >> Fuzzing and automated testing can be run against the entire exposed pipeline.
- What kind of intelligence does the design provide to applications in terms of network/application/resource errors? >> There is new HLD work in progress.
- What is the plan to support vendor SAI extensions? Can vendor SAI extensions be added without recompiling libSAI? >> Yes, the HLD describes it.
- How does the design handle the missing redis pub/sub response path / notifications?
- How about the PINS migration plan in terms of software upgrades vs ASIC upgrades? >> Please add a section to the HLD.
- Can this design improve packet I/O performance? >> So far the numbers are promising; this will be looked into.
- Can admins run SONiC without P4RT? Yes.
- PR is out for review, please provide comments offline - https://github.com/pins/SONiC/blob/pins-hld/doc/pins/pins_hld.md
Thanks,
-Madhu
- MoM of today's OCP SONiC call 8/10/2021.
Topics discussed.
SONiC_SFP_refactoring HLD - Arista/MSFT
- How does the design support backward compatibility with existing SFP modules?
- What are the guidelines for vendors implementing the common SFP refactor packages? Please list a few examples to help vendors adopt it.
- Can the design provide SFP data as a dict including all the SFP fields?
- PR is out for review please leave comments here - https://github.com/Azure/SONiC/blob/bf657839e521fb71e407df18e566a3e09c7e6958/doc/sfp-refactor/sfp-refactor.md
What next:
- Routed Subinterface Enhancement HLD review by Preetham Singh from BRCM – 30 mins
- PINS Main HLD review by Bhagat Janarthannan and team from Google, Intel, ONF – 30 mins
-Madhu
On Sat, Aug 7, 2021 at 1:19 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 8/03/2021.
Topics discussed.
- Class based Forwarding HLD - MSFT /Tom
- Here is the PR to provide review comments - https://github.com/Azure/SONiC/pull/796
- Q & A
- How does a SONiC user consume this feature? What is the plan? How is the feature enabled?
- Is it not enabled by default in the regular SONiC release? Yes - Tom, please confirm.
- It doesn't provide any CLI and asks users to work directly with app_db tables. What is the guidance? Please list the instructions in an HLD section.
- Shouldn't the design limit the number of FCs? That is platform specific.
- Can there be more DSCP values than FC values? Yes.
- Can the design expose the DSCP/EXP values to applications? Yes
- Is there any plan to introduce click commands / CLI? >> Tom
- A separate table for DSCP_to_FC, and refer to those table names in CLASS_BASED_NEXT_HOP_GROUP_TABLE ?
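Because there can be more DSCP values than forwarding classes (FCs), the DSCP-to-FC relationship discussed above is many-to-one. A minimal sketch of such a lookup follows; the map contents, `DEFAULT_FC` and the function name are made up for illustration and are not taken from the HLD or from any SONiC table.

```python
# Hypothetical DSCP-to-FC map: several DSCP values may share one forwarding class.
DSCP_TO_FC = {0: 0, 8: 1, 10: 1, 12: 1, 46: 5}
DEFAULT_FC = 0  # assumed fallback for DSCP values missing from the map


def forwarding_class(dscp: int) -> int:
    """Return the forwarding class for a DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit field (0-63)")
    return DSCP_TO_FC.get(dscp, DEFAULT_FC)
```

Keeping the map in its own table, as suggested in the last bullet above, would let a next hop group refer to it by name instead of repeating the DSCP values.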
Thanks,
-Madhu
On Sun, Aug 1, 2021 at 5:02 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 7/27/2021.
Topics discussed.
- Tech support dump improvements - Nvidia (Vivek) - Please share the HLD here
- Q & A - How does it handle a device that ends up with continuous coredumps? Is there a way to ship the core files off the device?
- 202111 release plan (HLD & Code PR's reviewers)- will be posted shortly by Yanzhao Zhang
- July 23 - call for Paper OCP / OCP website to submit abstract - Select Networking Track - https://www.opencompute.org/summit/global-summit/call-for-papers
- OCP Updates for call for papers.
-Madhu
On Tue, Jul 13, 2021 at 8:58 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 7/13/2021.
Topics discussed.
- 202111 release plan (HLD & Code PR's reviewers)- will be posted shortly by Yanzhao Zhang
- July 23 - call for Paper OCP / OCP website to submit abstract - Select Networking Track - https://www.opencompute.org/summit/global-summit/call-for-papers
- Paper Selection - Aug
-Madhu
- MoM of today's OCP SONiC call 6/22/2021.
Topics discussed.
- 202106 release status & discussed code & PR status
- Wiki will be posted shortly.
- 202106 release cut - 06/30
Thanks,
-Madhu
On Tue, Jun 15, 2021 at 8:55 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 6/15/2021.
Topics discussed.
CMIS-C-CMIS [Coherent - Common Management Interface] - Chuan Qin/MSFT
- What is the goal of CMIS? >> It is to configure and monitor optics/transceivers.
- Why is a special daemon needed? Shouldn't xcvrd (the transceiver daemon) be used? >> It extends xcvrd.
- How does SONiC consume these interfaces? Is it a REST interface or CLI? Will be discussed.
- What are the plans to integrate with SONiC? More detailed steps are needed w.r.t. CMIS interfaces as well as CMD firmware upgrades.
It would be very helpful if there would be a list of APIs which need to be implemented by vendors and how these are used. We are looking to understand which daemons are using it, CLI, etc.
General updates
202106 release updates
202111 release planning
Few timelines
- 202111 feature contribution submission end by 6/25/2021
- 202111 feature roadmap review in community on 7/6/2021
- 202111 release roadmap finalization 7/15/2021
Thanks,
-Madhu
On Tue, Jun 1, 2021 at 8:42 AM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 6/01/2021.
Topics discussed.
SAI Failure Handling - Shi-su
- Can this design extend the error feedback loop to the CLI? Not in scope - Shi will come back on it.
- Does this design work only with app_db, or can it be extended to other DBs as well? It could, but that is not in scope.
- How does it work in conjunction with multiple redis DBs?
- There is an Error handling HLD out in the community; please make sure this design augments it.
- What is the life cycle of error DB entries? How are the entries consumed and cleaned up?
- PR is out for review - https://github.com/Azure/SONiC/blob/312e885c3c19f3e9506cfd10fcc86dbb8eac0309/doc/SAI_failure_handling/SAI_failure_handling.md
Thanks,
-Madhu
- MoM of today's OCP SONiC call 5/18/2021.
Topics discussed.
Sonic Dump Utility - Vivek from NVDA
- Can this HLD support multi-ASIC DBs? Yes. A VS image is needed to test the utility; please share a multi-ASIC VS image with the NVDA team.
- Can this utility be used for the counters DB? Yes.
- Has a code PR been raised? Not yet.
- How is this utility different from the redis-dump tools?
- HLD PR is out for review - https://github.com/Azure/SONiC/blob/791a6a22d989ec7d3daa8efd3a45a56fdc3fa156/doc/Dump-Utility.md#overview
-Madhu
On Tue, May 11, 2021 at 5:41 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 5/11/2021.
Topics discussed.
SRv6 HLD - Houdi from Alibaba + Intel
- How did you arrive at this requirement, e.g. SRv6 sidList per policy = 4? Is this specific to the Alibaba deployment?
- How deep can transit nodes be? Why is this limited to 3 in the HLD? Again, deployment specific; more headers result in SRH header compression.
- How does the design support tying the policy to routes? Right now it supports tying a policy to prefixes; this can be enhanced.
- The workflow diagram is a little confusing; it needs an update and should be discussed once again. Please focus on ownership of the route data and who makes which modifications, in terms of consumers/producers.
- Alibaba/Intel will share the PR for review.
Thanks,
-Madhu
On Fri, Apr 30, 2021 at 6:52 PM Madhu Pal <mad...@aviznetworks.com> wrote:
Hi Srinadh,
Please find answers inline:
Thanks,
-Madhu
I would like to understand more about the "Don't overload stateDB for events & alarms? It should be advised to use separate redis DB for events & alarms?" comment. Appreciate some responses from the community. I hope I am using the right channel.
Eventd is planning to use stateDB to house the event history table, alarm table and stats tables. They get updated every time an event/alarm is raised.
These tables are of fixed size: the event history table size is customizable, with a maximum of 40k records or a time limit of 30 days, at which point eventd deletes older records. The stats table is of fixed size with a handful of records. The alarm table only contains a record while an alarm is raised; the record is removed when the alarm is cleared.
How does stateDB get overloaded?
Madhu>> State DB stores operational data today; adding events, alarms and stats (frequently written data) to it can easily push State DB beyond its limits. In addition, a software or design flaw (eventD is open source) would make it worse. Unlike APP_DB, State DB is read-oriented, so the extra writes/updates from events/alarms/stats could become a performance issue. By the way, have you had a chance to estimate the State DB load with the new data, and what is the plan to measure State DB performance with the new writes?
Is it because of DB writes? Or is 40k for the history table too much in a DB?
Madhu>> Today each redis instance is set up with multiple redis DBs, e.g. APP_DB, STATE_DB, etc. I am not 100% sure about the REDIS memory limits and how they are set per redis DB or per redis profile. The MSFT team can help on this.
Does using a separate redis DB mean that I need to create a new redis instance and create a DB within that instance? Or create a DB off the existing redis, redis2, redis3 instances?
Madhu>> I'd suggest using a new redis DB similar to State DB, e.g. an event DB, or an altogether new redis instance (redis1, redis2 - ), e.g. https://github.com/Azure/SONiC/blob/master/doc/database/multi_namespace_db_instances.md
On Tuesday, April 27, 2021, 09:15:18 PM PDT, MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 4/27/2021.
Topics discussed.
Event & Alarm Framework HLD - Srinath - Dell
- How does the design handle an alarm storm? Event cache
- Which component does the event caching? events
- What is the plan to upstream fixes into eventD?
- What exactly does eventD do? Why can't this be done with redis DB + lua scripts ?
- PR is out for review - https://github.com/Azure/SONiC/blob/ce60b64ee1560d0e6f9f4f19b4e860292a235bad/doc/event-alarm-framework/event-alarm-framework.md
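The reply earlier in this thread describes the event history table as bounded by a maximum of 40k records or 30 days, with older records deleted. A minimal in-memory sketch of that eviction rule follows. It is not eventd code and does not touch Redis; the class and method names are hypothetical, and evicting on every insert is an assumption.

```python
import time
from collections import deque

MAX_RECORDS = 40_000             # limit quoted in the thread
MAX_AGE_SECONDS = 30 * 24 * 3600  # 30 days, as quoted in the thread


class EventHistory:
    """Bounded event history: evict by count and by age (hypothetical sketch)."""

    def __init__(self, max_records=MAX_RECORDS, max_age=MAX_AGE_SECONDS):
        self.max_records = max_records
        self.max_age = max_age
        self.records = deque()  # (timestamp, event), oldest first

    def add(self, event, now=None):
        """Append an event, then evict anything over the limits."""
        now = time.time() if now is None else now
        self.records.append((now, event))
        self.evict(now)

    def evict(self, now):
        """Drop oldest records while the table is too large or they are too old."""
        while self.records and (
            len(self.records) > self.max_records
            or now - self.records[0][0] > self.max_age
        ):
            self.records.popleft()
```

Under this rule the table never grows past `max_records`, but every insert is still a write, which is the load concern raised in the thread.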
Generalizing config.bcm to support all BRCM platforms - BRCM
- Who will maintain the common config file? BRCM
- How does the design handle ODM config files vs common config file?
- PR is out for review - https://github.com/Azure/SONiC/pull/699
Thanks,
-Madhu
On Tue, Apr 20, 2021 at 6:42 PM MS Reddy <msreddy...@gmail.com> wrote:
- MoM of today's OCP SONiC call 4/20/2021.
Topics discussed.
Event & Alarm Framework HLD - Srinath - Dell
- Does this HLD support the SONiC CLI? No, it is supported by the mgmt-framework.
- Why can't current syslog be utilized or enhanced?
- What is the motivation for choosing eventD?
- How does this HLD integrate with the Thermal design HLD, which has similar eventing?
- How does this design handle event re-ordering? What are the suggestions if events arrive out of order?
- Don't overload stateDB for events & alarms? It should be advised to use separate redis DB for events & alarms?
- How does the design handle the life cycle of an event/alarm? What eviction policies are enforced on the DB?
- Advised to use a dynamic JSON event profile instead of a static map.
- PR is out for review - https://github.com/Azure/SONiC/blob/ce60b64ee1560d0e6f9f4f19b4e860292a235bad/doc/event-alarm-framework/event-alarm-framework.md
- Review will be continued..
Thanks,-Madhu
- MoM of today's OCP SONiC call 4/13/2021.
Topics discussed.
Policy based Hashing - Nvidia
- How does the design calculate hash resources? There is no SAI API to calculate this. Please add a comment in the HLD.
- How are CRM resources handled with PBH?
- When a port is part of a LAG, how are PBH rules applied to the PBH table? The user/orchagent should pass the LAG to the ASIC. Validation should be taken care of in the application.
- Add a data flow sequence diagram explaining the precedence or out-of-order handling of hash vs rule.
- What is the behaviour if the PBH table/resources are full? There is no API; currently a syslog/error is raised to the user and it falls back to no PBH (the hash will not be created).
- Does SONiC track any thresholds? Currently ACL thresholds are tracked, but no thresholds are tracked for scarce resources like ALUs / mirror sessions.
- Hashing will be calculated only on inner frames, based on type: IPv4, IPv6 or VXLAN user defined.
- What fields are expected to be configured for NVGRE? There is a reference example in the HLD.
- Should the PBH data model be Yang compliant? Yes, there is a section below.
- Is there any way to track ASIC hash resources today? No SONiC infra support yet.
- PR is out for review - https://github.com/Azure/SONiC/blob/a3f2bde7f938c3db0b49d8acfe947a1320337bb8/doc/pbh/pbh-design.md
Thanks,Madhu
- MoM of today's OCP SONiC call 4/06/2021.
Topics discussed.
SONiC BUM control support - Mohan S
- How does the design calculate CBR, as there is no support for the user to configure it? >> It is calculated internally by the application - BRCM
- How do you handle unknown multicast and unknown unicast storm control, as there are no separate SAI policers? - Mohan will look into it
- Are there any statistics for drop counters due to storm control for each category? - Not supported
- How does a user stop storm control? - By deleting the storm control config.
- Do you query storm control capability on the ASIC? How does the application know the ASIC has this capability? Would the mgmt framework support this feature? - Mohan will check it
- Is there sonic_yang support for storm control? Yes
- Share code PRs and SONiC Yang with the community for review - Mohan
- PR out for review - https://github.com/Azure/SONiC/pull/441
SONiC Mgmt framework - "show techsupport dump" - Kerry
- Can the design support flexibility in what content is added to the tech support dump? - Yes
- Can the design support downloading the tech support tarball to clients? - Not yet, will be supported
- PR is out for review -