Race condition when WiredTiger is combined with PMEM and DAX

chloi alverti

Sep 10, 2021, 4:22:33 AM
to wiredtiger-users
Hello to all!

I was wondering if anyone else has tried to run the develop branch of WiredTiger
that enables mmio (https://engineering.mongodb.com/post/getting-storage-engines-ready-for-fast-storage-devices)
on Intel's Optane PMEM storage combined with the DAX interface (no page cache buffering).
With a read/update workload there seems to be a race condition
(possibly involving a Linux kernel mutex), and the process goes into uninterruptible sleep.

Has anyone else experienced this? Could it be caused by my configuration?

Thank you all in advance for all your help!
Best Regards,

Alexandra Fedorova

Oct 20, 2021, 12:43:38 AM
to wiredtiger-users
Hi Chloe, 

I am the author of the blog post you are citing. I am happy to try and help. 

I’ve used Optane PMEM with DAX extensively (though not in the precise configuration that you have tried) and I certainly noticed quite a few non-interruptible sleeps. For example, there is a kernel bug in the DAX code at fs/inode.c:530 that caused me much trouble. To check if you are experiencing the same issue, please try running “dmesg” on your machine: are there any log messages about a kernel bug there? If so, your only resort is a hard reboot (soft reboot won’t help, if you are hitting the same bug I did).
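
For anyone who wants to script the dmesg check described above, here is a minimal sketch in Python. The helper names (find_kernel_bug_lines, read_dmesg) and the list of patterns are assumptions made for illustration; the patterns cover the kernel's usual "kernel BUG at ..." and hung-task messages but are not exhaustive, and reading dmesg may require root on some systems.

```python
import re
import subprocess

# Patterns that commonly mark kernel bug or hung-task reports in the kernel log.
# This list is an assumption for illustration, not an exhaustive set.
BUG_PATTERNS = re.compile(r"kernel BUG at|BUG:|blocked for more than \d+ seconds")


def find_kernel_bug_lines(log_text):
    """Return the lines of log_text that look like kernel bug reports."""
    return [line for line in log_text.splitlines() if BUG_PATTERNS.search(line)]


def read_dmesg():
    """Run dmesg and return its output, or an empty string if it cannot be read."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # dmesg is missing or access is restricted (for example, non-root user).
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in find_kernel_bug_lines(read_dmesg()):
        print(line)
```

If this prints nothing, either the log contains no matching messages or dmesg could not be read, so it is worth running dmesg by hand once to confirm.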
