Issue with DAX Namespace Configuration on "c8220" Machine in Clemson Cluster


Taher Travadi

Feb 18, 2025, 2:52:22 PM
to cloudlab-users
Hello CloudLab Support Team,

I am currently working on the "c8220" machine in the Clemson Cluster, which is running Ubuntu 18 with a custom kernel version 5.1.0-rc. My project requires me to utilize DAX (Direct Access), and I have successfully configured the necessary device drivers during kernel compilation. These drivers include:

NVDIMM (Non-Volatile Memory Device) Support -->
    PMEM: Persistent memory block device support
    BLK: Block data window (aperture) device support
    PFN: Map persistent (device) memory
Processor type and features -->
    Support for non-standard NVDIMMs and ADR protected memory
    Device memory (pmem, etc...) hotplug support
File systems -->
    Direct Access (DAX) support

As a result, I can see the pmem memory regions. However, when attempting to configure devdax namespaces, I encounter the following error:

"Failed to create namespace: Resource temporarily unavailable"

I am unsure of the root cause of this issue, and I would appreciate any insights or suggestions on how to resolve it. Could this be related to kernel settings, system resource constraints, or another factor? I am attaching the log file showing the exact devdax error.
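
For reference, the quoted message text is a standard C library errno description, so it can be mapped back to a symbolic error code. The following is a minimal Python sketch of that lookup, assuming a Linux host; the helper name `errno_names_for` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import errno
import os


def errno_names_for(message):
    """Return the symbolic errno names whose description equals `message`."""
    names = []
    for name in dir(errno):
        value = getattr(errno, name)
        # Skip non-integer attributes such as the `errorcode` dict.
        if not isinstance(value, int):
            continue
        if os.strerror(value) == message:
            names.append(name)
    return sorted(names)


if __name__ == "__main__":
    print(errno_names_for("Resource temporarily unavailable"))
```

On Linux the result should include EAGAIN. That code only says the request could not be satisfied at that moment; it does not by itself identify which resource was busy or unavailable, so the kernel log is still the place to look for the cause.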

Any assistance or guidance would be greatly appreciated.

Thank you for your support.

Best regards,
Taher
kernel-log.txt

David M Johnson

Feb 19, 2025, 12:27:16 PM
to cloudla...@googlegroups.com
Hi Taher. We don't have direct experience with what you're trying to
do, creating a devdax namespace (we have only done pmem, e.g.
https://groups.google.com/d/msgid/cloudlab-users/6dce5fe4-5606-4a8d-a9a7-0127b6d5f880%40flux.utah.edu),
so I can't really say if you're doing something wrong, or if
kernel/userspace is too old (seems unlikely though). Oftentimes when
low-level commands fail like this, you can check the kernel dmesg log.
Relevant lines

Feb 18 12:44:16 localhost kernel: nd_pmem namespace0.0: unable to guarantee persistence of writes
Feb 18 12:44:16 localhost kernel: namespace0.0 initialised, 18874368 pages in 346ms
Feb 18 12:44:16 localhost kernel: pmem0: detected capacity change from 0 to 77309411328
Feb 18 12:44:16 localhost kernel: nd_pmem namespace1.0: unable to guarantee persistence of writes
Feb 18 12:44:16 localhost kernel: Adding 3145724k swap on /dev/sda3. Priority:-2 extents:1 across:3145724k
Feb 18 12:44:16 localhost kernel: namespace1.0 initialised, 27262976 pages in 436ms
Feb 18 12:44:16 localhost kernel: pmem1: detected capacity change from 0 to 111669149696
Feb 18 13:28:18 node0.ttravadi-241983.boosthemem-pg0.clemson.cloudlab.us kernel: EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
Feb 18 13:28:18 node0.ttravadi-241983.boosthemem-pg0.clemson.cloudlab.us kernel: EXT4-fs (pmem0): mounted filesystem with ordered data mode. Opts: dax
Feb 18 13:34:49 node0.ttravadi-241983.boosthemem-pg0.clemson.cloudlab.us kernel: EXT4-fs (pmem1): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
Feb 18 13:34:49 node0.ttravadi-241983.boosthemem-pg0.clemson.cloudlab.us kernel: EXT4-fs (pmem1): mounted filesystem with ordered data mode. Opts: dax

which makes me wonder if you have created fsdax devices, not devdax
devices. Seems odd given the commands you issued, but I just lack
familiarity here.
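
The reading of the log above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LOG` string is abridged from the lines quoted in this message (timestamps and hostnames dropped), a 4 KiB page size is assumed, namespaceN.0 is assumed to back /dev/pmemN, and the function name `summarize` is made up for the example.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages (the x86-64 default)

# Abridged from the kernel log lines quoted in this thread.
LOG = """\
namespace0.0 initialised, 18874368 pages in 346ms
pmem0: detected capacity change from 0 to 77309411328
namespace1.0 initialised, 27262976 pages in 436ms
pmem1: detected capacity change from 0 to 111669149696
EXT4-fs (pmem0): mounted filesystem with ordered data mode. Opts: dax
EXT4-fs (pmem1): mounted filesystem with ordered data mode. Opts: dax
"""


def summarize(log):
    """Summarize each pmem block device mentioned in the log text."""
    # Page counts reported when each namespace was initialised.
    pages = {idx: int(n) for idx, n in
             re.findall(r"namespace(\d+)\.0 initialised, (\d+) pages", log)}
    # Byte capacities reported for each pmem block device.
    capacity = {idx: int(n) for idx, n in
                re.findall(r"pmem(\d+): detected capacity change from 0 to (\d+)", log)}
    # Devices on which ext4 was mounted with the dax option.
    dax_mounted = set(
        re.findall(r"EXT4-fs \(pmem(\d+)\): mounted filesystem.*Opts: dax", log))

    summary = {}
    for idx, size in capacity.items():
        summary["pmem" + idx] = {
            "gib": size / 2**30,
            "pages_match": pages.get(idx, 0) * PAGE_SIZE == size,
            "dax_mounted": idx in dax_mounted,
        }
    return summary


if __name__ == "__main__":
    for dev, info in summarize(LOG).items():
        print(dev, info)
```

Under those assumptions the page counts and byte capacities agree, giving 72 GiB for pmem0 and 104 GiB for pmem1, and both devices carry a DAX-mounted ext4 filesystem. A filesystem can only be mounted on a block device such as /dev/pmemN, whereas a devdax namespace is exposed as a /dev/daxX.Y character device, so the log is at least consistent with the namespaces being used as block devices rather than as devdax.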

> Any assistance or guidance would be greatly appreciated.
>
> Thank you for your support.
>
> Best regards,
> Taher

David

Gary Wong

Feb 21, 2025, 2:53:47 PM
to 'Taher Travadi' via cloudlab-users
On Tue, Feb 18, 2025 at 11:52:22AM -0800, 'Taher Travadi' via cloudlab-users wrote:
> As a result, I can see the pmem memory regions. However, when attempting to
> configure devdax namespaces, I encounter the following error:
>
> "Failed to create namespace: Resource temporarily unavailable"
>
> I am unsure of the root cause of this issue, and I would appreciate any
> insights or suggestions you may have regarding how to resolve it.

I don't believe there are any testbed-related issues here, but possibly
relevant could be kernel command line parameters as described in:

https://groups.google.com/d/msgid/cloudlab-users/6dce5fe4-5606-4a8d-a9a7-0127b6d5f880%40flux.utah.edu

Assuming kernel parameters are correct, I would suspect higher level
(software) configuration problems, but don't have enough knowledge to
be specific. I can only suggest you try the Quick Setup Guide in
https://nvdimm.docs.kernel.org/ and similar.
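
As one way to check the kernel command line, the sketch below parses `memmap=size!start` arguments, which is the kernel's syntax for marking a range of RAM as persistent memory. Whether the linked thread uses exactly these parameters is an assumption here, since its contents are not reproduced in this thread. The function names and the sample command line are hypothetical, and the parser is simplified: it handles decimal sizes with K/M/G/T suffixes only.

```python
import re

# Binary multipliers for the size suffixes handled by this simplified parser.
UNITS = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert a string such as '72G' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.upper())
    if match is None:
        raise ValueError("unsupported size: " + text)
    return int(match.group(1)) * UNITS[match.group(2)]


def pmem_reservations(cmdline):
    """Return (size_bytes, start_bytes) for each memmap=size!start argument."""
    found = []
    for arg in cmdline.split():
        if arg.startswith("memmap=") and "!" in arg:
            size, start = arg[len("memmap="):].split("!", 1)
            found.append((parse_size(size), parse_size(start)))
    return found


if __name__ == "__main__":
    # Hypothetical command line; on a live node, read /proc/cmdline instead.
    sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro memmap=72G!4G"
    for size, start in pmem_reservations(sample):
        print("reserved", size // 2**30, "GiB starting at", start // 2**30, "GiB")
```

If the reservations found this way do not match the region sizes the kernel reports at boot, that would point at the command line rather than at the namespace configuration.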

Thanks for your interest, and good luck with your experiment!

Gary.
--
Gary Wong g...@flux.utah.edu http://www.cs.utah.edu/~gtw/