Get NUMA node/socket from PMem directory


Lawrence Benson

Nov 24, 2020, 1:07:11 PM
to pmem
Hey,

I'm currently working on a small project where I need to find the socket to which a PMem file is "closest". The basic idea is that the user provides a file or directory on a PMem-mounted filesystem, e.g., `/mnt/my-pmem/foo` (I have no control over this), and from that I want to programmatically find out whether the data is located in PMem closer to socket 0 or socket 1, i.e., on /dev/pmem0 or /dev/pmem1. I want to use this information to explicitly pin threads to the NUMA node(s) of that socket later on in the application, so that threads are not spawned all over the place and end up accessing "remote" PMem.
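For reference, the pinning step described above could look roughly like the following sketch. It assumes a Linux system that exposes `/sys/devices/system/node/nodeN/cpulist`; the helper names `parse_cpulist` and `pin_self_to_node` are made up for this illustration and are not part of any library. It avoids libnuma and only uses `sched_setaffinity`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a kernel cpulist string such as
 * "0-17,36-53" into a cpu_set_t. Returns 0 on success, -1 on bad input. */
int parse_cpulist(const char *list, cpu_set_t *set)
{
    assert(list != NULL && set != NULL);
    CPU_ZERO(set);
    const char *p = list;
    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p || first < 0)
            return -1;
        long last = first;
        if (*end == '-') {              /* range "a-b" */
            p = end + 1;
            last = strtol(p, &end, 10);
            if (end == p || last < first)
                return -1;
        }
        if (last >= CPU_SETSIZE)
            return -1;
        for (long cpu = first; cpu <= last; cpu++)
            CPU_SET((int)cpu, set);
        p = (*end == ',') ? end + 1 : end;
    }
    return 0;
}

/* Hypothetical helper: restrict the calling thread to the CPUs of one
 * NUMA node. Assumes the sysfs cpulist file for that node exists. */
int pin_self_to_node(int node)
{
    char path[64];
    char list[4096];
    cpu_set_t set;

    snprintf(path, sizeof(path),
             "/sys/devices/system/node/node%d/cpulist", node);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *line = fgets(list, sizeof(list), f);
    fclose(f);
    if (line == NULL || parse_cpulist(list, &set) != 0 || CPU_COUNT(&set) == 0)
        return -1;
    /* pid 0 means the calling thread */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling `pin_self_to_node(node)` before spawning worker threads should be enough in the simple case, since threads created afterwards inherit the affinity mask of their creator.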

Is there a way to do this in C/C++? I've played around with `get_mempolicy` and `move_pages` so far, but `get_mempolicy` always returns node 3 regardless of the actual location, and `move_pages` gives me "File exists" (EEXIST) statuses for the moves, which is not documented in its man page.

I'd appreciate any advice/pointers.

Kind regards,
Lawrence

ppbb...@gmail.com

Nov 24, 2020, 1:34:15 PM
to pmem
Hi Lawrence,

I'd suggest using ndctl_namespace_get_numa_node (or the region equivalent, whichever is easier) from libndctl (https://github.com/pmem/ndctl). You will need to find the namespace for the file first, though. You can take a look at how this is implemented in PMDK: https://github.com/pmem/pmdk/blob/master/src/libpmem2/region_namespace_ndctl.c
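As a rough illustration of the "find the device behind the file" step, here is a minimal sketch that does not use libndctl at all. It is a swapped-in shortcut, not the PMDK implementation: it calls `stat()` on the path and reads the `numa_node` attribute through sysfs. It assumes that `/sys/dev/block/<major>:<minor>/device/numa_node` exists for the backing pmem block device, which may not hold on every kernel, and it does not handle device DAX (character devices). The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper: build the assumed sysfs attribute path
 * "/sys/dev/block/MAJOR:MINOR/device/numa_node" for a block device. */
int numa_node_sysfs_path(dev_t dev, char *buf, size_t len)
{
    assert(buf != NULL);
    int n = snprintf(buf, len, "/sys/dev/block/%u:%u/device/numa_node",
                     major(dev), minor(dev));
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Read a single integer from a sysfs-style attribute file. */
int read_int_attr(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Hypothetical helper: store the NUMA node of the block device backing
 * `path` in *node. Returns 0 on success, -1 on error. The kernel itself
 * may report -1 in *node when it does not know the node. */
int pmem_path_numa_node(const char *path, int *node)
{
    struct stat st;
    char attr[128];

    if (stat(path, &st) != 0)
        return -1;
    /* st_dev identifies the device that holds the filesystem */
    if (numa_node_sysfs_path(st.st_dev, attr, sizeof(attr)) != 0)
        return -1;
    return read_int_attr(attr, node);
}
```

For example, `pmem_path_numa_node("/mnt/my-pmem/foo", &node)` would be expected to fill in the node of the namespace backing that mount, provided the sysfs layout assumption holds.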

If you want, you can also create a feature request for libpmem2 on PMDK's issue tracker (https://github.com/pmem/pmdk/issues). Something like this would be fairly trivial to implement, at least on Linux.

Piotr

Lawrence Benson

Nov 26, 2020, 8:29:07 AM
to pmem
Hey Piotr,

Thanks for the quick reply and the pointer. I played around with ndctl_namespace_get_numa_node for a bit and ran into the same problem as before with get_mempolicy. It turns out that this simply does not work on one of the servers I tested it on (running CentOS 7, with an unfortunately very old 3.10 kernel). Another server (Ubuntu 18.04, 4.15 kernel) returns the correct value for both options.

However, using ndctl_namespace_get_numa_node is quite verbose, as I needed to duplicate a significant part of the internal libpmem2 code. I'll open a feature request and might submit a PR if this is feasible for someone external to implement. For now, I have fallen back to get_mempolicy for simplicity, but I think this would be a valuable addition to libpmem2.

- Lawrence

ppbb...@gmail.com

Nov 26, 2020, 9:26:54 AM
to pmem
If this reproduces on CentOS/RHEL 7.9, then this is likely a kernel bug that should be reported to Red Hat.

Thanks for working on this. We make no distinction between external and internal contributors in PMDK - you are welcome to create a PR with the changes.

Piotr

Steve Scargall

Nov 26, 2020, 11:31:08 AM
to pmem
Hi Lawrence, 

There is a kernel bug in CentOS 7 where it does not report the NUMA node, and NDCTL is a victim of this. Two related issues have been filed: https://github.com/pmem/ndctl/issues/146 and https://github.com/pmem/ndctl/issues/130. I know it's fixed in CentOS 8.x, but I haven't checked 7.8 or 7.9 recently to see whether the fix was backported.

/Steve
