ECMP Full Software Download


Jennifer Leos

Aug 3, 2024, 3:24:52 PM8/3/24
to akereqhe

Most of the information is available from multiple links; however, Juniper does not always connect the dots with a summary statement. So here is some base information for someone who is just reading the question and has not researched any docs:
An equal-cost multipath (ECMP) set is formed when the routing table contains multiple next-hop addresses for the same destination with equal cost. (Routes of equal cost have the same preference and metric values.) If there is an ECMP set for the active route, Junos OS uses a hash algorithm to choose one of the next-hop addresses in the ECMP set to install in the forwarding table.
You can configure Junos OS so that multiple next-hop entries in an ECMP set are installed in the forwarding table. On Juniper Networks devices, per-flow load balancing can be performed to spread traffic across multiple paths between routing devices.
Per-packet load balancing allows you to spread traffic across multiple equal-cost paths. By default, when a failure occurs in one or more paths, the hashing algorithm recalculates the next hop for all paths, typically resulting in the reordering of all flows. The reordering of all flows when a link fails potentially results in significant traffic loss or a loss of service to servers whose links remain active.
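Installing all the ECMP next hops (rather than just the one chosen by the hash) is done with a forwarding-table export policy. A minimal Junos sketch, with an illustrative policy name (despite the keyword, `load-balance per-packet` results in per-flow hashing on modern PFEs):

```
policy-options {
    policy-statement LB-PER-FLOW {
        then {
            load-balance per-packet;
        }
    }
}
routing-options {
    forwarding-table {
        export LB-PER-FLOW;
    }
}
```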
Answer/Solution
"If a link goes down, ECMP uses fast reroute protection to shift packet forwarding to use operational links, thereby decreasing packet loss. Fast reroute protection updates ECMP sets for the interface without having to wait for the route table update process. When the next route table update occurs, a new ECMP set can be added with fewer links or the route might point to a single next hop. " set forwarding-table ecmp-fast-reroute

Without fast reroute, the ECMP set has to wait until the route is removed from the routing table, which then produces a new ECMP set and allows the hashing algorithm to recalculate the paths. With the set forwarding-table ecmp-fast-reroute option enabled, this latency waiting for the routing table to update the ECMP set is eliminated.
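For reference, the quoted set command corresponds to this spot in the configuration hierarchy (under `[edit routing-options]`):

```
routing-options {
    forwarding-table {
        ecmp-fast-reroute;
    }
}
```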
FYI
Consistent load balancing maintains all active links and instead remaps only those flows affected by one or more link failures. This feature ensures that flows connected to links that remain active continue uninterrupted.
Starting in Junos OS 13.3R3, for MX Series 3D Universal Edge routers with modular port concentrators (MPCs) only, you can configure consistent load balancing, which prevents the reordering of all flows to active paths in an equal-cost multipath (ECMP) group when one or more next-hop paths fail. Only flows for paths that are inactive are redirected to another active next-hop path. Flows mapped to servers that remain active are maintained. This feature applies only to external BGP peers.
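On supported MX/MPC hardware, consistent load balancing is enabled per policy with the `consistent-hash` action. A sketch with an illustrative policy name and prefix (the feature applies to routes learned from external BGP peers):

```
policy-options {
    policy-statement ECMP-CONSISTENT {
        from {
            route-filter 203.0.113.0/24 exact;
        }
        then {
            load-balance consistent-hash;
        }
    }
}
```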

We however face issues with connections to our VPN servers in the DMZ. They are used by remote users to create an RA-VPN tunnel with the VPN servers from the internet. The users have to try at least 4-5 times before they get a successful connection with the VPN servers. We suspect it is because the VPN servers have a public IP published on the internet, which is in an ISP2 public range. The return packet is getting load-balanced too, towards ISP1, which causes asymmetric routing, and ISP2 doesn't like it.

Is there a way to ensure the return packet goes through ISP2 only? We have tried PBF but it doesn't seem to work. We have also enabled the symmetric return option in ECMP, and are confused why it doesn't seem to work.

Do you actually have logs showing return traffic is attempting to route via ISP1 instead of ISP2 with symmetric return enabled? If so, that's all TAC should need to start digging into the issue and making sure you have it configured correctly, that it's being identified as server-to-client return traffic, etc. Usually issues like this arise because the traffic isn't being properly identified as server-to-client like it should be, or because the feature has simply been misconfigured.

Also, just to throw it out there, have you checked the release notes for 9.1 and verified that you aren't hitting any of the ECMP issues addressed in later releases? I know there have been a few ECMP-related fixes in later builds, and 9.1.3 is pretty early in the 9.1 release train.

The configuration for ECMP was all fine, and TAC did take captures, where we did see issues caused by ECMP: it tried to send reply packets through the load balancing. TAC, however, doesn't know why it is happening. Still under investigation.

Hi there. Was there ever a resolution to this? We are seeing this behavior with many applications, especially ones that set cookies. We had to inform some vendors of the 2nd ISP subnet range to make sure that traffic is allowed. Please let me know.

I have a functional OSPF adjacency between a server with two 10Gb NICs and the L3 switch it is connected to, over two /30 networks. A /32 IP address is assigned to the lo interface, which is pingable and passes traffic reliably.

However, in testing the ECMP via tcpdump, I either see traffic pinned to one interface or duplicated across both, depending on the destination. More importantly, on the sibling server, which is configured similarly but with another loopback IP address, traffic is pinned to one NIC.

I'm trying to solve the issue of DRBD being bound to one IP/NIC, with only a single 10Gbps link to sync between systems, while having >10Gbps of potential iSCSI initiator requests. The limiting factor here is the throughput of the peer link.

If you are using the current version of Cumulus Linux, the content on this page may not be up to date. The current version of the documentation is available here. If you are redirected to the main page of the user guide, then this page may have been renamed; please search for it there.

Cumulus Linux supports hardware-based equal cost multipath (ECMP) load sharing. ECMP is enabled by default in Cumulus Linux. Load sharing occurs automatically for all routes with multiple next hops installed. ECMP load sharing supports both IPv4 and IPv6 routes.

To prevent out of order packets, ECMP hashing is done on a per-flow basis; all packets with the same source and destination IP addresses and the same source and destination ports always hash to the same next hop. ECMP hashing does not keep a record of flow states.
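The per-flow behavior described above can be modeled in a few lines. This is only an illustrative sketch (the function name and the use of SHA-256 are my own; real ASICs use proprietary hash functions plus a per-switch seed), but it shows the key property: the same 5-tuple always selects the same next hop, so packets within a flow are not reordered.

```python
import hashlib

def pick_next_hop(src_ip, dst_ip, src_port, dst_port, proto, next_hops):
    # Hash the 5-tuple deterministically and index into the next-hop list.
    # Same flow -> same digest -> same next hop, with no per-flow state kept.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return next_hops[digest % len(next_hops)]

hops = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
a = pick_next_hop("192.0.2.10", "198.51.100.5", 40000, 443, "tcp", hops)
b = pick_next_hop("192.0.2.10", "198.51.100.5", 40000, 443, "tcp", hops)
assert a == b  # deterministic: the flow always takes the same path
```

A different source port (i.e., a different flow) may hash to a different next hop, which is how load spreads across the ECMP set.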

Because the hash is deterministic and always provides the same result for the same input, you can query the hardware and determine the hash result of a given input. This is useful when determining exactly which path a flow takes through a network.

To use cl-ecmpcalc, all fields that are used in the hash must be provided. This includes ingress interface, layer 3 source IP, layer 3 destination IP, layer 4 source port, and layer 4 destination port.

cl-ecmpcalc can only take input interfaces that can be converted to a single physical port in the port tab file, such as the physical switch ports (swp). Virtual interfaces like bridges, bonds, and subinterfaces are not supported.
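An example invocation supplying all five required fields (interface and addresses are illustrative; the exact flag spellings and output format should be checked against `cl-ecmpcalc --help` on your release):

```
cumulus@switch:~$ sudo cl-ecmpcalc -i swp1 -s 10.0.0.1 -d 10.0.0.2 -p 20000 -P 80
```

The tool reports which egress port the hardware would select for that flow.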

In most cases, the modification of hash buckets has no impact on traffic flows as traffic is being forwarded to a single end host. In deployments where multiple end hosts are using the same IP address (anycast), resilient hashing must be used.

It is useful to have a unique hash seed for each switch. This helps avoid hash polarization, a type of network congestion that occurs when multiple data flows try to reach a switch using the same switch ports.

The hash seed is set by the ecmp_hash_seed parameter in the /etc/cumulus/datapath/traffic.conf file. It is an integer with a value from 0 to 4294967295. If you do not specify a value, switchd creates a randomly generated seed instead.
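For example (the seed value itself is arbitrary; just make it different on each switch):

```
# /etc/cumulus/datapath/traffic.conf
# Any integer 0-4294967295. Pick a distinct value per switch to avoid
# hash polarization; omit the line to let switchd generate a random seed.
ecmp_hash_seed = 4112
```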

You can configure the set of fields used to hash upon during ECMP load balancing. For example, if you do not want to use source or destination port numbers in the hash calculation, you can disable the source port and destination port fields.

Symmetric hashing is enabled by default on Mellanox switches. Make sure that the settings for the source IP (hash_config.sip) and destination IP (hash_config.dip) fields match, and that the settings for the source port (hash_config.sport) and destination port (hash_config.dport) fields match; otherwise symmetric hashing is disabled automatically. You can disable symmetric hashing manually in the /etc/cumulus/datapath/traffic.conf file by setting symmetric_hash_enable = FALSE.
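A sketch of the relevant settings in `/etc/cumulus/datapath/traffic.conf`, hashing on IP addresses only (port fields disabled, but kept matched so symmetric hashing stays enabled); key names beyond those quoted above are as I recall them and should be verified against the file shipped with your release:

```
# /etc/cumulus/datapath/traffic.conf
# For symmetric hashing: sip/dip settings must match each other,
# and sport/dport settings must match each other.
hash_config.enable = true
hash_config.sip = true
hash_config.dip = true
hash_config.sport = false
hash_config.dport = false
symmetric_hash_enable = TRUE
```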

In Cumulus Linux, when a next hop fails or is removed from an ECMP pool, the hashing or hash bucket assignment can change. For deployments where there is a need for flows to always use the same next hop, like TCP anycast deployments, this can create session failures.

The Mellanox Spectrum ASIC assigns packets to hash buckets and assigns hash buckets to next hops; it also runs a background thread that monitors the load and may migrate buckets between next hops to rebalance it.

When resilient hashing is configured, a fixed number of buckets are defined. Next hops are then assigned in round robin fashion to each of those buckets. In this example, 12 buckets are created and four next hops are assigned.

Resilient hashing does not prevent possible impact to existing flows when new next hops are added. Due to the fact there are a fixed number of buckets, a new next hop requires reassigning next hops to buckets.
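The bucket mechanics above (12 buckets, four next hops, round-robin assignment, and remapping only the failed hop's buckets) can be sketched as follows. This is a toy model, not the ASIC's actual algorithm:

```python
def assign_buckets(next_hops, num_buckets=12):
    # Round-robin the next hops into a fixed number of hash buckets,
    # as resilient hashing does at ECMP-group creation time.
    return [next_hops[i % len(next_hops)] for i in range(num_buckets)]

def remove_next_hop(buckets, failed, remaining):
    # On failure, only buckets pointing at the failed hop are remapped;
    # flows hashing to the surviving buckets keep their original path.
    return [remaining[i % len(remaining)] if nh == failed else nh
            for i, nh in enumerate(buckets)]

hops = ["nh1", "nh2", "nh3", "nh4"]
buckets = assign_buckets(hops)  # 12 buckets, 4 next hops, round robin
after = remove_next_hop(buckets, "nh2", ["nh1", "nh3", "nh4"])
# Every bucket that did not point at nh2 is unchanged:
assert all(b == a for b, a in zip(buckets, after) if b != "nh2")
```

Adding a next hop, by contrast, forces buckets to be reassigned across the new set, which is why resilient hashing cannot fully protect existing flows in that case.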

Resilient hashing is not enabled by default. When resilient hashing is enabled, 65,536 buckets are created to be shared among all ECMP groups. An ECMP group is a list of unique next hops that are referenced by multiple ECMP routes.

A larger number of ECMP buckets reduces the impact on adding new next hops to an ECMP route. However, the system supports fewer ECMP routes. If the maximum number of ECMP routes have been installed, new ECMP routes log an error and are not installed.
