TRex Guidance for Dimensioning/Performance Hardware Requirements


Matt Callaghan

Jul 10, 2018, 4:52:47 PM
to TRex Traffic Generator
Hi TRex experts!

This is a bit of a "community poll / knowledge query".

I'm currently trying to ascertain how to dimension the hardware necessary to run TRex use cases.
 1) Inventorying current capital to know what can be re-purposed and what cannot
 2) Investigating what hardware systems would need to be purchased to fill the gaps

The documentation at https://trex-tgn.cisco.com/trex/doc/trex_manual.html#_hardware_recommendations contains a lot of information, but it feels a bit insufficient (and/or stale) for accurate planning; I'm left with numerous doubts and questions.

With respect to TRex performance, the manual supplies TWO hardware chassis examples, "low end" and "high end", based on Cisco UCS specifications, each stated to "support up to XX Gbps". But that comes with caveats/exceptions: is the figure unidirectional or bidirectional? Which TRex mode was it spec'd/tested against? What was the load on the system across the varying hardware vectors?

Certainly whether we're pushing for 10G, 40G, 100G, or 400G generator capability will vary the requirements (linearly?).

What are the real hardware requirements?
   * CPU (num sockets, num cores/socket, frequency, vendor/model/generation/arch)
   * Memory (capacity, speed, # DIMM banks + single/dual/quad channel)
   * PCI-e bandwidth (needs to be able to shove the bytes through the PCI-e bus, of course, between CPU/RAM/NIC)
   * NIC bandwidth (simple/obvious, though complex in vendor/model selection + DPDK; that's a different can of worms for another day)
   * Disk (other than raw storage for OS + TRex pkg install, I presume no real disk I/O requirements exist)
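On the PCI-e point above, a quick sanity check is possible without any vendor data: PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding, so usable bandwidth per direction is roughly 8 * 128/130 Gbps per lane (ignoring TLP/DLLP protocol overhead). A rough sketch, with the slot widths and NIC targets as illustrative assumptions:

```python
# Back-of-envelope check: can a given PCIe 3.0 slot carry a NIC's line rate?
# Assumptions (not from the thread): 8 GT/s per lane, 128b/130b encoding;
# TLP/DLLP protocol overhead is ignored, so real usable rates are a bit lower.

def pcie3_usable_gbps(lanes: int) -> float:
    """Approximate usable PCIe 3.0 bandwidth per direction, in Gbps."""
    raw_gtps = 8.0          # 8 GT/s per lane
    encoding = 128 / 130    # 128b/130b line coding
    return raw_gtps * encoding * lanes

for lanes, nic_gbps in [(8, 40), (16, 100)]:
    usable = pcie3_usable_gbps(lanes)
    verdict = "fits" if usable >= nic_gbps else "does NOT fit"
    print(f"gen3 x{lanes}: ~{usable:.1f} Gbps usable -> {verdict} a {nic_gbps}G NIC")
```

So a gen3 x8 slot (~63 Gbps usable) comfortably carries a 40G NIC, and x16 (~126 Gbps) carries 100G, but neither would carry 400G; that alone pushes 400G planning toward PCIe 4.0/5.0 platforms.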

I am currently considering "stateless" mode (https://trex-tgn.cisco.com/trex/doc/trex_manual.html), and I presume the answer varies for stateless vs. stateful vs. advanced-stateful
 (though of course tomorrow we might want to know these same answers for the other modes)

Can things be simplified down to X Mpps (X million packets per second) as the primary "output metric" for the TRex generator? I often see this metric referenced in the docs.
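Mpps and Gbps are mechanically convertible once a frame size is fixed, since every Ethernet frame carries 20 bytes of fixed wire overhead (7B preamble + 1B SFD + 12B inter-frame gap). A small sketch of the conversion, with the example rates and frame sizes chosen purely for illustration:

```python
# Convert a target line rate and frame size into packets per second.
# Each Ethernet frame occupies (frame + 20) bytes on the wire:
# 7B preamble + 1B start-of-frame delimiter + 12B inter-frame gap.

WIRE_OVERHEAD_BYTES = 20

def line_rate_mpps(gbps: float, frame_bytes: int) -> float:
    """Max packets/second (in millions) at a given line rate and frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return gbps * 1e9 / bits_per_frame / 1e6

print(f"{line_rate_mpps(10, 64):.2f} Mpps")     # 64B frames at 10GbE  -> 14.88
print(f"{line_rate_mpps(100, 1518):.2f} Mpps")  # 1518B frames at 100GbE -> 8.13
```

This is why Mpps is the more honest generator metric: a box that sustains 14.88 Mpps saturates 10G at 64-byte frames, yet the same packet rate at 1518-byte frames would be well over 100G of payload.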

Do performance characteristics vary with the variance and complexity of the traffic profile configuration (5-tuple combinations, raw pcaps used, etc.)?

----
Ultimately, I wish we had some sort of guidance on how to size and dimension hardware based on the TRex use case and target traffic generator size.

We could accomplish this by one of:
 1. Guesstimates based on information available in documentation (risk of under- or over-dimensioning) - whoever pays the bill would be displeased if our hardware was 80% idle for a given TRex generator ;/
 2. Community contributions of "I do FOO with TRex, have BAR hardware specs, and at max load my cpu/mem/etc measurements are BLAH" - then others can buy similar hardware based on community sample sets
 3. Scientific modeling of TRex performance characteristics and the factors that impact it - then anyone could purchase, with reasonable confidence, the minimum optimal hardware required for their use case

---
This was a long post! But I hope my intent is clear, and I'm looking forward to input from Cisco staff + community.

~Matt

Maik Pfeil

Jul 11, 2018, 2:06:06 AM
to TRex Traffic Generator
Hi Matt,

we are using several C240s with the latest CPU configuration (to get the most lcores) and both riser-card options. We put up to 6x X710 with 4x10G each (24x10G total) into the gen3 x8 slots, or Mellanox NICs into the gen3 x16 slots, depending on the riser setup.

We build static setups for automated testing and run multiple TRex servers with different configurations on the same chassis.

Depending on the use case we need static core pinning for the CPU/NIC bindings (2-4 lcores per port pair). You should take this into account in your CPU choice.
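For readers who haven't done this before: the per-port-pair core pinning lives in TRex's platform config (typically /etc/trex_cfg.yaml). A minimal sketch of one such setup; the PCI addresses, socket, and thread IDs below are placeholders you'd adjust for your own chassis:

```yaml
# /etc/trex_cfg.yaml (illustrative values, adjust to your hardware)
- version: 2
  port_limit: 2
  interfaces: ["03:00.0", "03:00.1"]   # PCI addresses of the port pair
  platform:
    master_thread_id: 0                # control-plane thread
    latency_thread_id: 1               # latency measurement thread
    dual_if:
      - socket: 0                      # NUMA node the NIC is attached to
        threads: [2, 3, 4, 5]          # lcores pinned to this port pair
```

Keeping the pinned lcores on the same NUMA socket as the NIC avoids cross-socket memory traffic, which is why socket topology matters as much as core count when sizing the CPU.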

We only use stateless mode, running tests for QoS (latency flows), line rate with different frame sizes (aiming at RFC 2544), throughput, or just convergence tests.

I've seen up to 6x10G line rate with 64-byte frames working like a charm. Haven't tested more yet, because there was no need.

We also had good results with older x86 generations and 82599 NICs, but due to limitations in flow statistics we moved fully to the X710.


HTH
Maik