Does TRex ONLY work on NUMA 0?


charl...@gmail.com

Jul 3, 2018, 11:27:57 AM
to TRex Traffic Generator
I have two Mellanox CX556A dual-port 100G NICs plugged into an AMD EPYC based two-socket server.

Each EPYC socket has 4 NUMA nodes, so the server has 8 NUMA nodes in total.

The first NIC is connected to NUMA 0 and the second to NUMA 3 (both NUMA 0 and 3 are from the same socket).

TRex works with the NIC on NUMA 0 but fails with the NIC on NUMA 3. Both NICs are in loopback.

I am using CentOS 7.5
$ uname -a
Linux amd-010236107136.amd.com 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
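To double-check which NUMA node each NIC actually sits on, the kernel exposes it in sysfs; a minimal sketch (the helper name `pci_numa_node` is mine, not part of TRex — the PCI addresses are the ones from the configs below):

```python
import os

def pci_numa_node(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node a PCI device is attached to (-1 means unknown)."""
    path = os.path.join(sysfs_root, pci_addr, "numa_node")
    with open(path) as f:
        return int(f.read().strip())

if __name__ == "__main__":
    # PCI addresses from this thread; adjust to your own system.
    for dev in ("0000:01:00.0", "0000:31:00.0"):
        try:
            print(dev, "-> NUMA node", pci_numa_node(dev))
        except OSError:
            print(dev, "-> not present on this host")
```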

Here is the trex_cfg.yaml for the first NIC (it works)

- port_limit : 2
  version : 2
  # List of interfaces. Change to suit your setup. Use ./dpdk_setup_ports.py -s to see available options
  interfaces : ["01:00.0","01:00.1"]
  port_info :  # Port IPs. Change to suit your needs. In case of loopback, you can leave as is.
      - ip : 198.18.1.1
        default_gw : 198.18.2.1
      - ip : 198.18.2.1
        default_gw : 198.18.1.1

Here is the trex_cfg.yaml for the second NIC (it fails)

- port_limit : 2
  version : 2
  # List of interfaces. Change to suit your setup. Use ./dpdk_setup_ports.py -s to see available options
  interfaces : ["31:00.0","31:00.1"]
  port_info :  # Port IPs. Change to suit your needs. In case of loopback, you can leave as is.
      - ip : 198.18.3.1
        default_gw : 198.18.4.1
      - ip : 198.18.4.1
        default_gw : 198.18.3.1

Here are the logs when it fails

$ sudo ./t-rex-64 -i -c 8
Killing Scapy server... Scapy server is killed
Starting Scapy server.... Scapy server is started
The ports are bound/configured.
Starting TRex v2.35 please wait ...
set driver name net_mlx5
driver capability : TCP_UDP_OFFLOAD
Number of ports found: 2
zmq publisher at: tcp://*:4500
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9adc00: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
PMD: net_mlx5: 0x56403c9b1c80: Drop queue allocation failed: Unknown error -1
wait 1 sec .
port : 0
------------
link : link : Link Up - speed 100000 Mbps - full-duplex
promiscuous : 0
port : 1
------------
link : link : Link Up - speed 100000 Mbps - full-duplex
promiscuous : 0
./t-rex-64: line 72: 4222 Segmentation fault (core dumped) ./_$(basename $0) $INPUT_ARGS $EXTRA_INPUT_ARGS

hanoh haim

Jul 3, 2018, 4:57:59 PM
to charl...@gmail.com, TRex Traffic Generator
Are you sure you are not mixing up NUMA nodes with DRAM banks? A two-socket server should have 2 NUMA nodes.

Could you send the TRex output with -v 7 added to the CLI?

Hanoh

--
You received this message because you are subscribed to the Google Groups "TRex Traffic Generator" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trex-tgn+u...@googlegroups.com.
To post to this group, send email to trex...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/trex-tgn/e54aa131-e776-4fc1-a5cf-7cad38c809c2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Hanoh
Sent from my iPhone

Karl Rister

Jul 3, 2018, 5:03:54 PM
to hhaim...@gmail.com, charl...@gmail.com, trex...@googlegroups.com
The AMD EPYC architecture is not the same as Intel, where each socket is
its own NUMA node. See
https://www.servethehome.com/amd-epyc-7281-dual-socket-linux-benchmarks-and-review/
for a quick synopsis.

--
Karl Rister <kri...@redhat.com>

hanoh haim

Jul 3, 2018, 5:19:31 PM
to Karl Rister, charl...@gmail.com, trex...@googlegroups.com
Thanks Karl, this explains things.
I need to look into the code, but for some reason we didn't expect more than two NUMA nodes; this could explain the issue Charlie is facing with NUMA #3.

For best performance we tune the cores/NIC to use local (NUMA) memory, so the configuration script might need to be adapted/tested for this new configuration.

Would you be able to keep the NIC attached to NUMA #1/2 and use fewer cores, just as a workaround?

Thanks,
Hanoh

charl...@gmail.com

Jul 3, 2018, 7:25:19 PM
to TRex Traffic Generator
Thanks Hanoch for your response.

I will try the second NUMA node when I come back from the 4th of July holiday.

Charlie

hanoh haim

Jul 4, 2018, 4:33:30 AM
to charl...@gmail.com, TRex Traffic Generator
Hi Charlie, 

Reading more about EPYC

https://www.anandtech.com/show/11551/amds-future-in-servers-new-7000-series-cpus-launched-and-epyc-analysis/2

It is pretty impressive. In the case of the Mellanox CX-5 there would be a limit of 8 cores per 100Gb port (for NUMA locality), which might not be enough to drive full line rate, but there are plenty of dies (Zeppelin) to drive more total bandwidth.

The Cisco C125 M5, for example, will expose only 2x PCIe x16, while theoretically it could expose a PCIe x16 per die: 16 slots of 100G for a dual socket, a total of 1.6Tb/sec of traffic.

Once the UCS C125 is available, we will tune the scripts for 16 NUMA nodes instead of the current maximum of 2.


Here is how I would try to fix it:


1) #define RTE_MAX_NUMA_NODES 8  ==>  16

2) This loop should go up to 16 (and ideally auto-identify the maximum NUMA node):

    for socket_id in range(2):
        filename = '/sys/devices/system/node/node%d/hugepages/hugepages-2048kB/nr_hugepages' % socket_id
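Instead of a hardcoded range(2), the set of NUMA nodes could be discovered from sysfs; a sketch under the assumption that /sys/devices/system/node/node* enumerates the online nodes (the helper name is mine, not the actual dpdk_setup_ports.py code):

```python
import glob
import os
import re

def numa_hugepage_files(sysfs_root="/sys/devices/system/node"):
    """List the per-node 2MB hugepage count files, one per online NUMA node."""
    nodes = sorted(
        int(m.group(1))
        for d in glob.glob(os.path.join(sysfs_root, "node[0-9]*"))
        if (m := re.fullmatch(r"node(\d+)", os.path.basename(d)))
    )
    return [
        os.path.join(sysfs_root, "node%d" % n,
                     "hugepages/hugepages-2048kB/nr_hugepages")
        for n in nodes
    ]
```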


3) Example configuration file (this is for a CX-5):

$ more /etc/trex_cfg.yaml
### Config file generated by dpdk_setup_ports.py ###
- port_limit: 2
  version: 2
  interfaces: ['04:00.0', '04:00.1']
  port_info:
      - ip: 1.1.1.1
        default_gw: 2.2.2.2
      - ip: 2.2.2.2
        default_gw: 1.1.1.1
  platform:
      master_thread_id: 0
      latency_thread_id: 14
      dual_if:
        - socket: 0
          threads: [1,2,3,4,5,6,7,8,9,10,11,12,13]



You would need to convert it to be more NUMA-local manually (never tested...).

This is an example for an 8-port CX-5 setup with a maximum of 7 cores per NUMA node (a NUMA node is a "socket" in TRex terms):



  interfaces: ['04:00.0', '04:00.1', '05:00.0', '05:00.1', '06:00.0', '06:00.1', '07:00.0', '07:00.1']
  platform:
      master_thread_id: 0
      latency_thread_id: 15
      dual_if:
        - socket: 0   # NUMA 0 in socket 0; associated with interfaces 0,1: ['04:00.0', '04:00.1']
          threads: [1,2,3,4,5,6,7]
        - socket: 1   # NUMA 1 in socket 0; associated with interfaces 2,3: ['05:00.0', '05:00.1']
          threads: [8,9,10,11,12,13,14]
        - socket: 2   # NUMA 2 in socket 0; associated with ['06:00.0', '06:00.1']
          threads: [16,17,18,19,20,21,22]
        - socket: 3   # NUMA 3 in socket 0; associated with ['07:00.0', '07:00.1']
          threads: [24,25,26,27,28,29,30]


Hope it helps.



thanks

Hanoh




charl...@gmail.com

Jul 5, 2018, 1:18:01 PM
to TRex Traffic Generator
Thanks Hanoh,

Just tested it on NUMA 1 and it worked. As you said, NUMA 0 and 1 work.

Charlie

charl...@gmail.com

Jul 5, 2018, 3:15:51 PM
to TRex Traffic Generator
Hi Hanoh,

My system has 8 NUMA nodes. Is it possible for me to make some changes so TRex will work with all 8?

1) #define RTE_MAX_NUMA_NODES 8 ==> 16

I do not need to change this because my system has 8 NUMA nodes.

2) should be 16 and auto-identify the maximum NUMA

    for socket_id in range(2):
        filename = '/sys/devices/system/node/node%d/hugepages/hugepages-2048kB/nr_hugepages' % socket_id

Can you point me to the file where I can make this change?

Thanks,
Charlie

hanoh haim

Jul 6, 2018, 12:49:10 AM
to charl...@gmail.com, TRex Traffic Generator
For the second change: it is located in dpdk_setup_ports.py. Try enlarging it to 4 first and test NUMA 0,3 again.
The issue with this simple change is that you will now consume 4GB * 4 NUMA nodes, so you will need at least 16GB plus memory for TRex itself.
There should be a better solution.
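The memory math behind that warning can be sketched as follows, assuming the 4GB-per-node figure comes from the usual 2048 huge pages of 2MB each (an assumption on my part, not stated in the thread):

```python
def hugepage_gb(nr_hugepages=2048, page_kb=2048, numa_nodes=4):
    """Total huge-page memory in GB reserved across all NUMA nodes."""
    return nr_hugepages * page_kb * numa_nodes / (1024 * 1024)

print(hugepage_gb())  # 2048 pages * 2MB * 4 nodes = 16.0 GB
```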

Thanks,
Hanoh


charl...@gmail.com

Jul 6, 2018, 11:21:30 AM
to TRex Traffic Generator
Hi Hanoh,

In dpdk_setup_ports.py, I made the following three changes

Line #108 for i in range(0, len(self.interfaces), 4):
Line #191 for i in range(0, len(self.interfaces), 4):
Line #212 for i in range(0, len(self.interfaces), 4):

But I still got the same errors. I guess I must have missed something.

Thanks,
Charlie

Matt Callaghan

Jul 6, 2018, 12:07:29 PM
to TRex Traffic Generator
Has anyone submitted an issue/ticket to track support for >2 NUMA nodes?

hanoh haim

Jul 8, 2018, 5:01:10 AM
to charl...@gmail.com, TRex Traffic Generator
If you have enough memory, change this line:
https://github.com/cisco-system-traffic-generator/trex-core/blob/master/scripts/dpdk_setup_ports.py#L660

to be

    for socket_id in range(4):

or

    for socket_id in (0,3):

for your first example (NUMA 0 and 3).
Again, this is just a workaround for the issue.

thanks,
Hanoh







charl...@gmail.com

Jul 9, 2018, 1:22:37 PM
to TRex Traffic Generator
Hi Hanoh,

After downloading the be_latest version and changing the loop to range(4), NUMA 3 started to work.

-Per port stats table
ports | 0 | 1 | 2 | 3
-----------------------------------------------------------------------------------------
opackets | 0 | 0 | 0 | 0
obytes | 0 | 0 | 0 | 0
ipackets | 0 | 0 | 0 | 0
ibytes | 0 | 0 | 0 | 0
ierrors | 0 | 0 | 0 | 0
oerrors | 1263291820153344 | 1263291819281280 | 1263291857244864 | 1263289731283584
Tx Bw | 0.00 bps | 0.00 bps | 0.00 bps | 0.00 bps

The only strange thing is the "oerrors" counter, but it does not seem to affect anything.

-Per port stats table
ports | 0 | 1 | 2 | 3
-----------------------------------------------------------------------------------------
opackets | 136227845 | 136246338 | 136208024 | 136224018
obytes | 139497324500 | 139516259292 | 139477023716 | 139493404632
ipackets | 136230253 | 136243862 | 136207886 | 136224112
ibytes | 139499787232 | 139513721828 | 139476882404 | 139493498848
ierrors | 0 | 0 | 0 | 0
oerrors | 1261023989340672 | 1261023988468608 | 1261024026432192 | 1261021900470912
Tx Bw | 54.26 Gbps | 54.00 Gbps | 54.05 Gbps | 54.15 Gbps

-Global stats enabled
Cpu Utilization : 25.1 % 123.4 Gb/core
Platform_factor : 1.0
Total-Tx : 216.46 Gbps
Total-Rx : 216.46 Gbps
Total-PPS : 26.42 Mpps
Total-CPS : 0.00 cps

Expected-PPS : 0.00 pps
Expected-CPS : 0.00 cps
Expected-BPS : 0.00 bps

Active-flows : 0 Clients : 0 Socket-util : 0.0000 %
Open-flows : 0 Servers : 0 Socket : 0 Socket/Clients : -nan
Total_queue_full : 40081021
drop-rate : 0.00 bps
current time : 43.0 sec
test duration : 0.0 sec

Thanks,
Charlie
