DASH Workgroup Community Update 9/24/2025

Kristina Moore

Sep 30, 2025, 3:36:35 PM
to sonic...@googlegroups.com, sonic-o...@lists.sonicfoundation.dev, Prabhat Aravind, Saikrishna Arcot, Lawrence Lee, Michal Zygmunt, eddie.ruan, guizhao.lh, Yanfeng, Yuezhou, Murthy Vakkalagadda, Arun, Doddapaneni, Krishna, Moopath velayudhan, Mukesh, Narayanan, Swaminathan, Selvarajan, Arunachalam, Srinivasan, Vijay, Sundara Murthy Gurunathan, Thyamagundalu, Sanjay, Veerappan, Senthilnathan, Venkatesh Srinivasan, Marc Meunier, Chid, Harrish SJ, Madhu, Israel Meilik, Jai Kumar, Lisa Nguyen, Mohammad Hanif, Sandeep Balani, Suresh Satapati, Kannan Selvaraj, Joseph White, Phaniraj Vattem, Shawn Dube, Venkatesan Mahalinga, Faisal Khan, Farhan Tariq, Mohammad Qasim Farooqi, Saad Mazhar GMail, Zafir, Zarif Hafeez GMail, Ahmed Guetari, Chris McDonald, Heath Parrott, Joel Moses, John Gruber, Tony Torzillo, Ziv Saar, Ravindran Suresh, jame...@geico.com, Amith, Andy Fingerhut, Erum Frahim, Ghani, Ixim, Kwangsuk, Lin Songnan, Mahendar Byra, Meyappan K Gmail, Nitesh, Piotr P, Ravi, RS4681, Venkat External, Yoyo, Chatterjee, Deb, Cristian Dumitrescu, Dan Peng, Limaye, Namrata, Naren Mididaddi, Paul Kappler, Rao, Radhika, Shan Greer, Shweta Shrivastava, Singhai, Anjali, Stephen Doyle, Subramanian, Maheswari, Dean Lee, Alberto Villarreal, Alex Bortok, Chris Sommers, Manodipto Ghose, Mircea Dan Gheorghe, Nitesh Jha, Swaminathan Balasubramanian, Vinod Kumar, Alexander Cheskis, Mike Woster, Kishore Atreya, Sonny Mei, Brad House, Balachandar Rajarathinam, Christian Kuhtz, John Evans, Rawal, Amol (Nokia - US/Westford), Abdul Rouff, Alan Lo, E Blatt, Eilon Greenstein, Gagan Punathil Ellath, Idan Hac, Liat Grozovik, Marian Pritsak, Matty Kadosh, Nikhil Sandugula, Oleksandr Ivantsiv, Paul Cummins, Shay Schlafman, Venice Hawa, Wei Bai, Yohad Tor, Yuval Degani, Madhu, Jamal Hadi Salim, Andriy Kokhan, Leonid Khedyk, Mykola Zhuravel, Tetyana Zubova, Michael Offel, Philipp Keydel, VolodymyrX Mytnyk, Aditya Sahni, Mahaboob Gani, Pranay Sahay, Sairam Rangaswamy, Satya Valli Rama, Sohan Prabhu 
(TATA CONSULTANCY SERVICES LTD), Syed Mehemood, Richard Wu, arham...@xflowresearch.com, Kanza Latif, Muhammad Ali, rimsh...@xflowresearch.com, Bud Grise, Ezra Y, John C Carney, Ted Weatherford, Vincent L

Hello DASH Open Source Community, thank you for your time on Wednesday 9/24 🙂

Just another reminder, I’m looking to leverage the Linux Foundation lists more to manage communications.  If you could please take the time to enter your info into the list here, I can initiate deletion of the sonic-dash@googlegroups list we used when we began the project 😊 

In Summary this week:

I look forward to seeing everyone at OCP in October 2025; please find the schedules below:

    OCP schedule https://2025ocpglobal.fnvirtual.app/a/schedule/

    Sonic schedule: https://sonicfoundation.dev/event/sonic-workshop-and-sonic-booth-at-ocp-global-summit/

 

We continue to work on the initial sonic-mgmt tests PR19700 for PL NSG, Trusted VNI, Floating NIC, and Return Path ECMP (with @Lawrence Lee from the SONiC team), along with the SDN team for the northbound controller programming interface.   

 

We've reviewed comments and differences in Issue 686 regarding example configurations and behavioral differences between vendors (focusing on fields like Region ID and Trusted VNI), and clarified the alignment of SONiC mgmt tests and SWSS implementation across vendors, with @Prabhat Aravind confirming the current test as the source of truth.

We also discussed the technical challenges of supporting the delete operation for the route rule table with non-zero priority, focusing on controller expectations, GNMI server limitations, and possible alternative approaches using the GET API. Vivek suggested using the GET API to retrieve the underlay IP from SAI, which would remove the dependency on controller and GNMI server changes. Prabhat agreed to consider this approach if the controller update is not feasible.

There are blockers related to the Trixie upgrade, specifically systemd-sonic-generator incompatibility with systemd 257; we will need community help and possibly involvement from Cisco and @Saikrishna Arcot.

And lastly, we still have a contribution up for grabs. It would be great to have a volunteer propose a PR in the dash-sonic-hld (in the SONiC repo here) for commands to show ENI counters and DPU global metrics. Please submit a PR if you are interested!

For Complete Details, please see the “Full DASH Community Notes” near the end of this communication. 

Follow-ups: 

  • Example Configuration for Floating NIC: Follow up and provide an update on the example configuration for Floating NIC discussed last week. (@Prabhat Aravind or @Michal Zygmunt)
  • Review of Cisco HA Testing Contribution: Review the merged Cisco HA testing contribution for heartbeats and provide feedback if necessary. (@Ramesh)
  • Clarification of Inbound/Outbound Lookup Implementation: Send an email listing questions and request all vendors to clarify how inbound/outbound lookup is implemented in their firmware for both normal and Floating modes. (@Michal, @Prabhat)
  • VTAP Feature Documentation Presentation: Coordinate with @Pranjal Shrivastava to present the VTAP feature documentation in the next month. (Kristina)
  • Config Diff Review and Alignment: Review the Config Diff in Issue 686, respond to Prabhat's comments, and ensure alignment between SONiC mgmt and SWSS implementations, re-running tests as needed. (Oleksandr, Mircea, Prabhat)
  • Multiple Inbound Direction Lookup Enhancement: Track and plan the enhancement for programming multiple inbound direction lookup in DASH, with follow-up in Issue 23875 after POC. (Judy (SONiC team), Prabhat)
  • DHCP Unique Identifier Issue Resolution: Prioritize and implement a fix for the DHCP unique identifier issue affecting NVIDIA during upgrade from 2025_05 to 2025_06, by reverting the change that binds the identifier to the MAC address while keeping the vendor-specific change, and update Prabhat on the outcome. (Senthil, Mukesh)
  • Route Rule Table Delete Operation Support: Follow up with @Lawrence Lee and @Michal Zygmunt to determine if the controller can send the underlay IP in the delete operation, and update Vivek by end of the week; if not possible, explore using the GET API as an alternative. (@Prabhat)
  • Systemd/Trixie Upgrade Issue Follow-Up: Follow up with Saikrishna to determine if the systemd/Trixie upgrade issue will be addressed generically for SmartSwitch and multi-ASIC, and coordinate with Murali if additional help is needed. (Prabhat)

 

In Summary (full list below), since the last Community call we have:

13 PRs Completed (+5)

13 in To Do (+/- 0)

8 in Draft (+/- 0)

42 in Progress (+1)

11 Awaiting Review (-6)

 

Just a reminder that we encourage and invite Community members to present to the Community (test runs or progress, new scenarios, etc.); just reply to let me know, or generate a PR in the repo.

The DASH channel link is here to subscribe / access WG content (and click the bell to receive notifications). 

Thank you for your time/contributions. Tomorrow is our 1st Wednesday 'week off'; see you on 10/8/2025.

 

Meeting Title:  SONiC-DASH-Workgroup Community Meeting #161

Attendees (13):

DASH Group to join: https://groups.google.com/g/sonic-dash

Linux Foundation list: https://lists.sonicfoundation.dev/g/SONiC-Dash

 

Abdul Rouff - Nvidia

Mircea Dan Gheorghe - Keysight

Swami Balasubramanian - Keysight

Bud Grise - XSightLabs

Murali Venkateshaiah - Cisco

Veerappan, Senthil - AMD

Farhat Ullah - DreamBigSemi

Oleksandr Ivantsiv - Nvidia

Vivek Reddy Karri - Nvidia

Gagan Punathil Ellath - Nvidia

Prabhat Aravind - MSFT

Kristina Moore - MSFT

Ramesh Raghupathy - Cisco

 

  

 

Full DASH Community Notes 😊

·         DASH Config Diff Review and SONiC Mgmt Test Alignment: discussed the recent DASH config diffs (Issue #686), reviewed comments and differences, and clarified the alignment of SONiC mgmt tests and SWSS implementation across vendors, with Prabhat confirming the current test as the source of truth and planning further consolidation with Lawrence and Michael's teams.

·        Config Diff Review: review of the DASH config diffs, referencing Issue #686 and highlighting 5 key differences. Prabhat provided detailed comments on each diff, clarifying which differences were relevant to the PL MVP 2.0 scenario and which were not, and pointed to sample configurations for further reference.

·        SONiC Mgmt Test Alignment: NVidia asked if the SONiC mgmt test was aligned with expected configuration and behavior. Prabhat confirmed alignment based on Lawrence's input. It was agreed that passing the SONiC mgmt test meant no further immediate action was required, but any changes in SONiC mgmt would need to be reflected in SWSS as well.

·        Cross-Vendor Consistency: discussed the need for a common config template across AMD and NVIDIA devices, with Prabhat noting that Michael Aronovic had additional comments to Lawrence's changes. Prabhat planned to sync with Lawrence to consolidate and upload the latest patch.

·        Test Execution and Image Versioning: discussed the dynamic nature of image versions (2025_05, 2025_06), with Prabhat emphasizing the need to use the latest images due to ongoing bug fixes.

·        Mappings: Mircea noted that certain mappings were required on some platforms but not others, leading to a technical discussion with Senthil about ENI mapping requirements.

·        Action Items and Next Steps: It was agreed that Mircea, Senthil, Michael, and Murali would further validate the mapping differences offline, ensuring that the common SONiC mgmt test is used as the baseline for validation. Prabhat provided links to relevant configs and suggested setting up a separate meeting if further questions arose.

 

·        DHCP Client Identifier Upgrade Issue and Factory Image Compatibility: addressed a critical bug where changing the DHCP client identifier from DUID to MAC address caused IP assignment failures on DPU upgrades from 2025_05 to 2025_06, especially impacting NVidia factory-shipped devices.  Sent to Mukesh for urgent inspection.

·        Bug Description and Impact: Prabhat described the Issue where the DHCP unique identifier for the E0 midplane IP was changed from DUID to the midplane interface MAC address by AMD, causing IP assignment failures for NVIDIA DPUs during upgrades. This was identified as a blocker for the 2025_06 release, with factory-shipped devices unable to receive IPs after upgrade.

·        Workarounds and Limitations: Senthil suggested a workaround involving reducing the DHCP lease time before migration, but Oleksandr and Prabhat noted this was not a long-term fix and would not address devices already shipped with the old image. The group discussed the infeasibility of updating factory images already in production.

·        Platform-Specific Fixes: Oleksandr and Senthil debated whether the fix should be implemented in a platform-specific manner, with the consensus being to revert the MAC address change in the vendor-independent code and keep platform-specific changes as needed. Vivek and Prabhat identified the exact code lines to revert and agreed to test the solution.

·        Assignment and Next Steps: Senthil agreed to have Mukesh address the fix, and Prabhat emphasized the urgency for NVidia.
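
To illustrate why the identifier change blocks address assignment and why a shorter lease only helps as a stopgap, here is a minimal toy model in Python. It assumes the server leases a single midplane address per DPU and tracks that lease by client identifier; the notes do not describe the real server's behavior, and the class name, address, and identifier strings below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyDhcpServer:
    """Toy model (assumption): one address, leased to one client identifier at a time."""
    address: str
    lease_seconds: int
    holder: Optional[str] = None   # client identifier currently holding the lease
    expires_at: float = 0.0

    def request(self, client_id: str, now: float) -> Optional[str]:
        # Free the address once the previous lease has run out.
        if self.holder is not None and now >= self.expires_at:
            self.holder = None
        # Grant or renew only if the address is free or held by the same identifier.
        if self.holder in (None, client_id):
            self.holder = client_id
            self.expires_at = now + self.lease_seconds
            return self.address
        return None  # address is still leased to a different identifier


server = ToyDhcpServer(address="169.254.200.1", lease_seconds=3600)  # placeholder address
old_id = "duid:00:01:00:01:aa:bb"   # hypothetical identifier sent before the upgrade
new_id = "mac:02:00:00:00:00:01"    # hypothetical identifier sent after the upgrade

assert server.request(old_id, now=0) == "169.254.200.1"
# After the upgrade the same DPU presents a different identifier and is refused.
assert server.request(new_id, now=60) is None
# Only once the old lease expires does the new identifier get the address.
assert server.request(new_id, now=3600) == "169.254.200.1"
```

In this model, shortening the lease before migration narrows the window in which the upgraded DPU is refused, but it cannot be applied to devices that already shipped with the old image, which matches the limitation raised in the discussion.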

·        Status and Prioritization of Open Issues and PRs: reviewed the status of open Issues and PRs and discussed the prioritization of fixes for the 2025_06 and 2025_11 releases.

·        Review of Recent PRs and Merges: summarized the PRs merged in the past week, including the addition of local next hop IP and updates to reboot status variables. The group noted which PRs were relevant to their work and which were not.

·        Blockers and Target Versions: discussed two main blockers for NVidia, one being the DHCP client identifier Issue and another pending verification. They agreed to clarify which Issues must be fixed in 2025_06 and which could be deferred to 2025_11 with Prabhat committing to update the team and GitHub Issues accordingly.

·        Enhancements and Pending Issues: reviewed enhancements such as the ability to program multiple inbound direction lookup entries and discussed the status of the route rule table delete operation for non-zero priority. Vivek and Prabhat plan to follow up with Lawrence and Michal for a solution.

·        Issue Tracking and Communication: discussed the use of GitHub labels and tables to track Issues; the group agreed to use the 'smartswitch' label for easier tracking and to communicate updates via GitHub and email.

·        Technical Discussion on Route Rule Table Delete Operation: discussed the technical challenges of supporting the delete operation for the route rule table with non-zero priority, focusing on controller expectations, GNMI server limitations, and possible alternative approaches using the GET API.

·        Controller and GNMI Server Limitations: Vivek explained that the current controller and GNMI server do not provide the underlay IP in the delete operation, which is required by the proposed fix. Prabhat agreed to follow up with Lawrence and Michal to determine if the controller can be updated or if an alternative approach is needed.

·        Alternative Approaches: Vivek suggested using the GET API to retrieve the underlay IP from SAI, which would remove the dependency on the controller and GNMI server changes. Prabhat agreed to consider this approach if the controller update was not feasible.
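
The GET-based alternative can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the SWSS or SAI implementation: `FakeSai`, `delete_route_rule`, the key shape, and the addresses are hypothetical stand-ins, and the only behavior taken from the discussion is that the delete path needs the underlay IP and can read it back when the controller does not send it.

```python
from typing import Dict, Optional, Tuple

RuleKey = Tuple[str, int]  # (hypothetical rule identifier, priority)


class FakeSai:
    """Stand-in for the SAI layer; this is not the real SAI API."""

    def __init__(self) -> None:
        self._rules: Dict[RuleKey, str] = {}  # key -> underlay IP

    def create(self, key: RuleKey, underlay_ip: str) -> None:
        self._rules[key] = underlay_ip

    def get_underlay_ip(self, key: RuleKey) -> Optional[str]:
        return self._rules.get(key)

    def remove(self, key: RuleKey, underlay_ip: str) -> None:
        # Removal is modelled as needing the underlay IP, as the proposed fix does.
        if self._rules.get(key) != underlay_ip:
            raise KeyError(f"no rule {key} with underlay IP {underlay_ip}")
        del self._rules[key]


def delete_route_rule(sai: FakeSai, key: RuleKey,
                      underlay_ip: Optional[str] = None) -> bool:
    """Delete a rule, reading the underlay IP back when the request omits it."""
    if underlay_ip is None:
        underlay_ip = sai.get_underlay_ip(key)  # GET-based fallback
    if underlay_ip is None:
        return False  # nothing is programmed under this key
    sai.remove(key, underlay_ip)
    return True


sai = FakeSai()
sai.create(("rule-1", 10), "10.0.0.5")
assert delete_route_rule(sai, ("rule-1", 10)) is True   # delete without an underlay IP
assert delete_route_rule(sai, ("rule-1", 10)) is False  # rule is already gone
```

The point of the sketch is that the fallback keeps the delete request unchanged on the controller and GNMI side, at the cost of one extra read on the delete path.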

·        Systemd and Trixie Upgrade Issues: discussed blockers related to the Trixie upgrade, specifically systemd-sonic-generator incompatibility with systemd 257, and considered the need for community help and possible involvement from Cisco and Saikrishna.

·        Trixie Upgrade Blockers: Vivek described Issues encountered during the Trixie upgrade, where systemd-sonic-generator fails with systemd 257, blocking both NPU and DPU functionality. The group noted that the fix would be generic and not specific to SmartSwitch.

·        Community and Team Involvement: Prabhat asked if Cisco could help, and Murali agreed to check if their team had the necessary context. Saikrishna was identified as working on the Trixie upgrade, and Prabhat planned to follow up to see if the Issue would be addressed as part of the broader upgrade.

 

Sticky for Links/Reference:

 

 

DASH Groups to join to receive Invites, Meeting Notes, and Comms

DASH: https://groups.google.com/g/sonic-dash    

DASH-Test-Workgroup Group: https://groups.google.com/g/sonic-dash-test-workgroup  

Linux Foundation list: https://lists.sonicfoundation.dev/g/SONiC-Dash

If anyone knows people who may be interested in our community, please have them join these groups to receive comms, etc.

Links to Recording 

Teams:

SONiC-DASH Workgroup Community Meeting-20250924_090323-Meeting Recording.mp4 

DASH Community: https://youtu.be/x1qDrQIWCfQ

YouTube Behavioral Model:
No agenda this week

9/24/2025 DASH Community Call; please request access via the link if you are not able to view/listen

Azure DASH GitHub Repo:                     

https://github.com/sonic-net/DASH

 


Test/Docs folder:

https://github.com/sonic-net/DASH/blob/main/test/docs/dash-test-workflow-saithrift.md

Ideal test workflow is here, converted to .md

SAI Thrift     

SAI Thrift PR

Client server needed for testing

P4

https://opennetworking.org/p4/ and https://p4.org/working-groups/

Open source, domain-specific programming language for network devices, specifying packet processing for data plane devices (switches, routers, NICs, filters, etc.)

PINS

https://opennetworking.org/pins/

 

PNA consortium spec

https://p4.org/p4-spec/docs/PNA-v0.5.0.html

An architecture describing the structure and common capabilities of network interface controller (NIC) devices which process packets transiting one or more interfaces and a host system.

Describes the structure and capabilities of the pipeline, and a user program, which specifies the functionality of the programmable blocks within that pipeline. For more information, see the P4 Language Consortium specifications

IPDK

Infrastructure Programmer Development Kit (ipdk.io) and

https://github.com/ipdk-io/ipdk-io.github.io

IPDK is an open source, vendor agnostic framework of drivers and APIs for infrastructure offload and management which runs on a CPU, IPU, DPU or switch. IPDK runs in Linux and uses a set of well-established tools such as DPDK and P4 to enable network virtualization.

bmv2

https://github.com/p4lang/behavioral-model

The second version of the reference P4 software switch, nicknamed bmv2 (for behavioral model version 2). The software switch is written in C++11. It takes as input a JSON file generated from your P4 program by a P4 compiler and interprets it to implement the packet-processing behavior specified by that P4 program

DPDK

https://www.dpdk.org/

DPDK is the Data Plane Development Kit which consists of libraries to accelerate packet processing workloads running on a wide variety of CPU architectures.

Linux Foundation SmartSwitch

https://lists.sonicfoundation.dev/g/sonic-smartswitch/calendar

 

  

Thank you again for your participation…

Kristina Moore MBA, M.S., CISSP - Azure Core Principal PM / DASH & SmartSwitch
Office: 425-722-7720     Mobile: 425-876-2040     Email:
kri...@microsoft.com
DASH Group to join: https://groups.google.com/g/sonic-dash    
Linux Foundation:  
https://lists.sonicfoundation.dev/g/SONiC-Dash

 
