Sparse CPU List


Alexis DAMBRICOURT

Jan 2, 2023, 11:27:04 AM
to likwid-users
Hi,

It looks like likwid-topology does not behave well when CPUs are offline.

Let's do 'echo 0 > /sys/devices/system/cpu/cpu2/online'

We then get the following result:
Hardware Thread Topology
********************************************************************************
Sockets:                1
Cores per socket:       4
Threads per core:       2
--------------------------------------------------------------------------------
HWThread        Thread          Core            Socket          Available
0               0               0               0               *
1               0               1               0               *
2               4294967295              4294967295              1
3               0               2               0               *
4               1               0               0               *
5               1               1               0               *
6               0               3               0               *
--------------------------------------------------------------------------------
Socket 0:               ( 0 4 1 5 3 6 )
Socket 1:               ( -1 )
--------------------------------------------------------------------------------
********************************************************************************
Cache Topology
********************************************************************************
Level:                  1
Size:                   32 kB
Cache groups:           ( 0 4 ) ( 1 5 ) ( 3 6 ) ( -1 )
...
free(): invalid pointer
Aborted


This patch is a suggestion to fix the issue.

It will give us the proper result:
Hardware Thread Topology
********************************************************************************
Sockets:                1
Cores per socket:       4
Threads per core:       2
--------------------------------------------------------------------------------
HWThread        Thread        Core        Die        Socket        Available
0               0             0           0          0             *                
1               1             0           0          0             *                
2               0             1           0          0             *                
3               1             1           0          0             *                
4               0             2           0          0             *                
5               1             2           0          0             *                
6               0             3           0          0             *                
--------------------------------------------------------------------------------
Socket 0:               ( 0 4 1 5 3 7 6 )
********************************************************************************
Cache Topology
********************************************************************************
Level:                  1
Size:                   32 kB
Cache groups:           ( 0 4 ) ( 1 5 ) ( 3 7 ) ( 6 )



Best,
-- Alexis

Thomas Gruber

Jan 2, 2023, 12:29:09 PM
to likwid-users
Hi Alexis,

Happy New Year. Thanks for the report and the suggested fix.

I'm not sure whether this is really a fix. Yes, it avoids the (unsigned int)-1 for uninitialized HWThreads and the resulting error, but it presents invalid output. According to your command, you took CPU2 offline, yet it is shown as available after your fix. Moreover, using the index i sorts the list differently: before, the HWThreads were listed by their OS ID; after the fix, they are listed by i. This could lead to problems when hardware thread selection strings (e.g., -c N:0-3) are used, because the sorting is now different.

There are multiple issues that come into play here:
- The internally often-used cpuid_topology.numHWThreads is determined in multiple ways. One of them is to read the /sys/devices/system/cpu/online file, which does not cover offline HW threads. It should be changed to /sys/devices/system/cpu/present; then we get the full count (see the sketch after this list).
- The HWLOC backend is initialized with the HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED flag, and according to the docs, this detects the whole system: "However offline PUs and NUMA nodes are still ignored." So even if we have the right number for numHWThreads, HWLOC will not return the data for CPU2. We need a sanitization step afterwards that reads the missing data from the system. Sysfs would be a good candidate since the code is already there, but /sys/devices/system/cpu/cpu2/topology is not present if the CPU is offline, and the CPU is also not shown in /proc/cpuinfo. I currently have no idea where to get the data for offline "CPUs".
- The initialization of the hwthread pool with (unsigned int)-1 causes multiple sockets to be recognized (0 and -1), which creates two socket affinity domains. When the empty cpulist for socket -1 is freed, the error happens. We could guard against socket == (unsigned int)-1 to avoid this, but that is more a workaround than a real fix.
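
As a sketch of the first point (not the LIKWID implementation): /sys/devices/system/cpu/present uses the kernel cpulist format, e.g. "0-7" or "0,2-5,7", so counting all present hardware threads, including offline ones, could look like this:

#include <stdio.h>

int count_present_cpus(void)
{
    FILE *f = fopen("/sys/devices/system/cpu/present", "r");
    if (f == NULL)
        return -1;

    int count = 0, lo, hi, c;
    while (fscanf(f, "%d", &lo) == 1)
    {
        hi = lo;
        c = fgetc(f);              /* ',' ends a single ID, '-' starts a range */
        if (c == '-')
        {
            fscanf(f, "%d", &hi);  /* upper end of a range like "0-7" */
            fgetc(f);              /* consume the ',' or '\n' after the range */
        }
        count += hi - lo + 1;
    }
    fclose(f);
    return count;
}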

I'll think about it. Do you have an idea how to get the data for offline CPUs?

Best,
Thomas

Alexis DAMBRICOURT

Jan 3, 2023, 11:53:57 AM
to likwid-users
Hi Thomas,
Thanks for your comprehensive answer.

I agree with you.

This patch fixes the core ID issue in the HW thread enumeration:

--------------------------------------------------------------------------------
HWThread        Thread        Core        Die        Socket        Available    
0               0             0           0          0             *            
1               1             0           0          0             *            
2               0             1           0          0             *            
3               1             1           0          0             *            
4               0             3           0          0             *            
5               1             3           0          0             *            
6               0             2           0          0             *            
--------------------------------------------------------------------------------
Socket 0:               ( 0 4 1 5 6 3 7 )                                        
-------------------------------------------------------------------------------- 


But then it makes 'Cache groups' wrong:

Cache Topology                                                                  
********************************************************************************
Level:                  1                                                      
Size:                   32 kB                                                  
Cache groups:           ( 0 4 ) ( 1 5 ) ( 6 3 ) ( 7 )                           


With the current implementation, where we list the next N CPU threads according to "local threads = cputopo["cacheLevels"][level]["threads"]", I don't see any way to advertise a 'hole'. It looks like a dead end here...
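
To make the limitation concrete, here is a small C stand-in for that Lua loop (hypothetical names, not the actual frontend code): chunking the sorted socket list into fixed-size groups reproduces exactly the wrong groups above, because core 2 contributes only one online thread while the loop always consumes two.

#include <stdio.h>

int main(void)
{
    int sorted[] = { 0, 4, 1, 5, 6, 3, 7 };  /* socket list with CPU2 offline */
    int n = sizeof(sorted) / sizeof(sorted[0]);
    int threadsPerGroup = 2;  /* cputopo["cacheLevels"][level]["threads"] */

    for (int i = 0; i < n; i += threadsPerGroup)
    {
        printf("( ");
        for (int j = i; j < i + threadsPerGroup && j < n; j++)
            printf("%d ", sorted[j]);
        printf(") ");
    }
    printf("\n");  /* prints ( 0 4 ) ( 1 5 ) ( 6 3 ) ( 7 ) */
    return 0;
}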

To answer your question: 
hwloc/topology-x86.c:look_proc() would be able to get the data for an offline CPU, but it is not called since set_cpubind() first checks whether the CPU ID is online in the kernel. Quite a dead end here again...

Best,
-- Alexis

Thomas Gruber

Jan 4, 2023, 9:32:26 AM
to likwid-users
Hi Alexis,

In your patch, you are using the loop counter as the array index again. This causes the threads to be sorted differently. Some functions use the array index instead of the apicId (the OS-given numbering); likwid-topology is the obvious and simple case (Replace cntr in L153). We would need to check the whole library code, therefore I would refrain from changing the array index.

The HWLOC look_proc() function does not work in our case, even if we mimic it using CPUID ourselves: the application has to run on the CPU to read its data, and if the CPU is offline, you cannot move the application there.

I attached a patch (against the current master branch) that should do the job. I added some guards to the topology code to skip offline CPUs. The likwid-topology code prints '-' for offline CPUs but uses the loop index for printing the ApicID; I'm not sure whether the latter might break something on strangely numbered systems. I have not found a way to get the data for offline CPUs yet. It is quite strange that the Linux kernel changes the core numbering when one SMT thread is taken offline: in the online case, CPU2 has core ID 2 / SMT ID 0 and CPU6 has core ID 2 / SMT ID 1; after taking CPU2 offline, the core ID of CPU6 changes to 3 and its SMT ID to 0. IMHO a bug.
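
For anyone who wants to reproduce the renumbering, here is a small sketch that prints the kernel's core_id for each hardware thread; as noted earlier, the topology directory disappears for offline CPUs, so a failing open is treated as 'offline':

#include <stdio.h>

int main(void)
{
    char path[96];
    for (int cpu = 0; cpu < 8; cpu++)  /* 8 HW threads as in the example */
    {
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/topology/core_id", cpu);
        FILE *f = fopen(path, "r");
        if (f == NULL)                 /* topology dir gone when offline */
        {
            printf("cpu%d: offline\n", cpu);
            continue;
        }
        int core;
        if (fscanf(f, "%d", &core) == 1)
            printf("cpu%d: core_id %d\n", cpu, core);
        fclose(f);
    }
    return 0;
}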

Please try the patch out; then I will commit it to the repo.

Best,
Thomas

--------------------------------------------------------------------------------
CPU name:    Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
CPU type:    Intel Kabylake processor
CPU stepping:    10
********************************************************************************

Hardware Thread Topology
********************************************************************************
Sockets:        1
Cores per socket:    4
Threads per core:    2
--------------------------------------------------------------------------------
HWThread        Thread        Core        Die        Socket        Available
0               0             0           0          0             *                
1               0             1           0          0             *                
2               -             -           -          -                              
3               0             2           0          0             *                
4               1             0           0          0             *                
5               1             1           0          0             *
6               0             3           0          0             *
7               1             2           0          0             *
--------------------------------------------------------------------------------
Socket 0:        ( 0 4 1 5 3 7 6 )
--------------------------------------------------------------------------------
********************************************************************************
Cache Topology
********************************************************************************
Level:            1
Size:            32 kB
Cache groups:        ( 0 4 ) ( 1 5 ) ( 3 7 ) ( 6 )
--------------------------------------------------------------------------------
Level:            2
Size:            256 kB
Cache groups:        ( 0 4 ) ( 1 5 ) ( 3 7 ) ( 6 )
--------------------------------------------------------------------------------
Level:            3
Size:            6 MB
Cache groups:        ( 0 4 1 5 3 7 6 )
--------------------------------------------------------------------------------
********************************************************************************
NUMA Topology
********************************************************************************
NUMA domains:        1
--------------------------------------------------------------------------------
Domain:            0
Processors:        ( 0 1 3 4 5 6 7 )
Distances:        10
Free memory:        154.609 MB
Total memory:        7817.57 MB
--------------------------------------------------------------------------------

likwid-offline-cpu.patch

Alexis DAMBRICOURT

Jan 6, 2023, 5:52:54 AM
to likwid-users
Hi Thomas,

Thank you for the patch. It is indeed much clearer to print '-' for offline CPUs.
That does the job well for the HWThread enumeration, but there is still a bug in the Cache Groups.

If I deactivate CPU6 (the HT of core 2):

Sockets:                1
Cores per socket:       4
Threads per core:       2
--------------------------------------------------------------------------------
HWThread        Thread        Core        Die        Socket        Available
0               0             0           0          0             *            
1               0             1           0          0             *            
2               0             2           0          0             *            
3               0             3           0          0             *
4               1             0           0          0             *
5               1             1           0          0             *            
6               -             -           -          -                          
7               1             3           0          0             *            
--------------------------------------------------------------------------------
Socket 0:               ( 0 4 1 5 2 3 7 )
--------------------------------------------------------------------------------
********************************************************************************
Cache Topology
********************************************************************************
Level:                  1
Size:                   32 kB
Cache groups:           ( 0 4 ) ( 1 5 ) ( 2 3 ) ( 7 )

The output prints ( 2 3 ) as sharing the same L1 cache, which is obviously wrong.

On one side, the bug is in the Lua code, where "local threads = cputopo["cacheLevels"][level]["threads"]" makes the loop always look for the same number of threads per group, whereas group ( 2 ) has only one.

On the other side, in hwloc_init_cacheTopology(), the line

   cachePool[id].threads = likwid_hwloc_record_objs_of_type_below_obj( hwloc_topology, obj, HWLOC_OBJ_PU, NULL, NULL);

guesses from the first core that all cores have the same thread count.
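
For comparison, a minimal sketch with plain hwloc (not the LIKWID wrapper above): counting the PUs inside each cache object's cpuset individually, instead of once for the first one, would give the real per-group thread count, since hwloc's cpusets contain only online PUs:

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(&topo);

    int ncaches = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_L1CACHE);
    for (int i = 0; i < ncaches; i++)
    {
        hwloc_obj_t cache = hwloc_get_obj_by_type(topo, HWLOC_OBJ_L1CACHE, i);
        int pus = hwloc_get_nbobjs_inside_cpuset_by_type(topo, cache->cpuset,
                                                         HWLOC_OBJ_PU);
        printf("L1 cache %d: %d online PU(s)\n", i, pus);
    }
    hwloc_topology_destroy(&topo);
    return 0;
}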

So, if we deactivate CPU1 (the HT of core 0), it will produce this:
HWThread        Thread        Core        Die        Socket        Available
0               0             0           0          0             *            
1               0             1           0          0             *            
2               0             3           0          0             *            
3               0             5           0          0             *            
4               -             -           -          -                          
5               1             2           0          0             *            
6               1             4           0          0             *            
7               1             6           0          0             *            
--------------------------------------------------------------------------------
Socket 0:               ( 0 1 5 2 6 3 7 )
--------------------------------------------------------------------------------

********************************************************************************
Cache Topology
********************************************************************************
Level:                  1
Size:                   32 kB
Cache groups:           ( 0 ) ( 1 ) ( 5 ) ( 2 ) ( 6 ) ( 3 ) ( 7 )


I do not see any easy way to fix this other than building a dedicated cacheTree, in the same way we build the cpuid_topology.topologyTree. Such a tree would then be straightforward to handle on the Lua side.

Best,
-- Alexis

Thomas Gruber

Jan 6, 2023, 9:00:29 AM
to likwid-users
Hi Alexis,

Thanks for checking the outcome more thoroughly. I did the patch on the side while working on new architecture support (Zen4 + SapphireRapids).

I'm not really happy with the topology tree code. I tried to remove it once because it is so error-prone, but there are parts relying on it that cannot be rewritten without the tree. At least I had not found a way at that time.

So you are right: we either need a separate tree for the cache topology, or we try to get the hardware thread IDs directly when reading in the caches (through /sys/devices/system/cpu/cpu*/cache/index*/shared_cpu_list, or through hwloc if it provides this info). This way, we don't have to collect the list(s) from the tree in the Lua part but can simply print them out. We already get the count through hwloc by iterating over all hardware threads attached to the cache.
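
For the sysfs route, a minimal sketch (a hypothetical helper, not the LIKWID code) of reading such a list:

#include <stdio.h>

/* Read the hardware threads sharing cache index 'idx' of 'cpu'.
 * The kernel lists only online threads here, which is what the
 * cache groups need. */
static int read_shared_cpu_list(int cpu, int idx, char *buf, int len)
{
    char path[128];
    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
             cpu, idx);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int ok = (fgets(buf, len, f) != NULL);
    fclose(f);
    return ok ? 0 : -1;
}

int main(void)
{
    char buf[64];
    if (read_shared_cpu_list(0, 0, buf, sizeof(buf)) == 0)
        printf("CPU0 index0 shared_cpu_list: %s", buf);  /* e.g. "0,4" */
    return 0;
}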

I will look into the cacheTopology code but it will take time since my holidays are starting next week and I won't work on LIKWID for around a month.

Best,
Thomas

Alexis DAMBRICOURT

Jan 6, 2023, 12:12:37 PM
to likwid-users
Thanks Thomas,
there is no urgency at all; enjoy your holidays.

Best,
Alexis

Thomas Gruber

Jan 6, 2023, 10:09:10 PM
to likwid-users

Hi Alexis,

I could not go on holiday knowing the issue was not fixed yet.

I added new fields to the CacheLevel struct to store a list of CacheGroups. Each CacheGroup contains the online hardware threads sharing a cache segment. The lookup code is only in the hwloc backend, not in the fallback procfs/sysfs backend. I also added code to publish the data via the Lua interface, and likwid-topology uses the new fields to print the thread groups.
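
Conceptually, the new fields look like this (a sketch; the exact names in the patch may differ):

#include <stdint.h>

typedef struct {
    uint32_t  numThreads;   /* online HW threads sharing this cache segment */
    uint32_t *threadList;   /* their OS processor IDs */
} CacheGroup;

/* ...and added to the existing CacheLevel struct:
 *     uint32_t    numCacheGroups;
 *     CacheGroup *cacheGroups;
 */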

Best,
Thomas
likwid-offline-cpu-and-cache-threads.patch

Alexis DAMBRICOURT

Jan 17, 2023, 4:40:07 AM
to likwid-users
Hi Thomas,

I took some time to test it on different configurations (especially AMD EPYC 7413 and Intel i7-8665U): it works like a charm.

Thanks a lot for your time.

Best,
Alexis

Thomas Gruber

Mar 8, 2023, 11:55:37 AM
to likwid-users
Hi Alexis,

I'm back from my long vacation. I'm glad it works for you. I will add the patch to the repository for future releases.

Best,
Thomas