Hi,
I have found several issues with the persistence syscalls in BeeGFS.
Environment settings:
1 management server, 2 metadata servers, 2 storage servers and 1 client
The client is configured with "tuneRemoteFSync = true"
Issue 1. A file synced with sync is lost after a crash. For example, I issue the following commands from bash:
echo "testing beegfs fsync on file0" > beegfs-client/file0;
strace sync beegfs-client;
strace sync beegfs-client/file0;
After the last command finishes (all commands executed successfully), I crash all 5 servers using "echo b > /proc/sysrq-trigger". I then restart all servers and wait until the whole file system is back to normal. However, I can't list file0 from the client, and I can't find its data on the storage servers' disks.
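In case it helps with reproduction, here is a minimal Python sketch of the sequence above. It assumes that coreutils sync with a file argument calls fsync() on that file, which is my understanding of its default mode. The helper name write_then_sync and the mode 0o644 are hypothetical and only for illustration.

```python
import os


def write_then_sync(dir_path, name, payload):
    """Create dir_path/name with payload, then fsync the directory and the file."""
    file_path = os.path.join(dir_path, name)

    # Rough equivalent of: echo "..." > beegfs-client/file0
    # (mode 0o644 is an assumption, not taken from the report)
    fd = os.open(file_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)  # a short payload is assumed to be written in one call
    finally:
        os.close(fd)

    # Rough equivalent of: sync beegfs-client
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

    # Rough equivalent of: sync beegfs-client/file0
    fd = os.open(file_path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    return file_path
```

Calling write_then_sync("beegfs-client", "file0", b"testing beegfs fsync on file0\n") and then crashing the servers should exercise the same path as the bash commands.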
Issue 2. fdatasync doesn't flush directory entries, which are the data of a directory. Here is the strace of our test script (beegfs-client is our client mount point):
openat(AT_FDCWD, "beegfs-client/file0", O_RDWR|O_CREAT, 0162100) = 3
close(3) = 0
openat(AT_FDCWD, "./beegfs-client", O_RDONLY) = 3
fdatasync(3) = 0
close(3) = 0
As in the previous test, we crash all servers after the last call. After recovering from the crash, there is no file0 under the directory beegfs-client. I ran the same commands and the same crash on a local Ext4 file system, and there file0 is still present after recovery.
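The traced sequence can be sketched in Python as follows, for anyone who wants to repeat it without our script. The helper name create_then_fdatasync_dir is hypothetical, and the mode 0o644 is an assumption (it is not the mode shown in the trace). os.fdatasync is only available on platforms that provide fdatasync(), such as Linux.

```python
import os


def create_then_fdatasync_dir(dir_path, name):
    """Create an empty file in dir_path, then fdatasync the directory itself."""
    file_path = os.path.join(dir_path, name)

    # openat(AT_FDCWD, "beegfs-client/file0", O_RDWR|O_CREAT, ...) followed by close
    fd = os.open(file_path, os.O_RDWR | os.O_CREAT, 0o644)
    os.close(fd)

    # openat(AT_FDCWD, "./beegfs-client", O_RDONLY), fdatasync, close
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fdatasync(dir_fd)
    finally:
        os.close(dir_fd)

    return file_path
```

After create_then_fdatasync_dir("beegfs-client", "file0") returns, the new directory entry is what we expect to survive the crash, as it does on Ext4.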
I'm wondering whether any BeeGFS developers could take a look at these issues. If this is caused by a misconfiguration on my side, please correct me.
Best,
Tao