BeeGFS v8: Issues with file deletion

Tyler

Sep 18, 2025, 2:19:52 PM
to beegfs-user
Running the latest release of beegfs-[meta,storage,mgmtd,client-dkms], version 8.1, on RHEL 8.10.
Users are experiencing issues removing/deleting files and directories under the BeeGFS mount points.
The issues typically occur in git clone directories and build directories (directories where code is compiled).

After the git clone or build completes (successfully or not), users are unable to delete all of the contents inside the folder.
The error on the terminal is "XYZ device resource busy".
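
For reference, a generic check along the lines below can rule out a local process still holding the directory open before suspecting the filesystem itself; the path is the same placeholder as above, and the availability of `lsof`/`fuser` on the client is assumed.

# Check whether any local process still has the directory, or anything under it, open.
lsof +D <PATH_TO_DIRECTORY>
# Show which processes (if any) are accessing the directory itself.
fuser -v <PATH_TO_DIRECTORY>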

I enabled high-verbosity logging on all services and the client, and all I was able to gather are the following entries from beegfs-client-dkms:

FghsOps_revalidateIntent: called. Path: <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_opendir_incremental: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_getAttr: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_readdir_incremental: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_revalidateIntent: called. Path: <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_getAttr: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_revalidateIntent: called. Path: <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_opendir_incremental: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_getAttr: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_readdir_incremental: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_release: called. Path:  <PATH_TO_DIRECTORY> ; EntryID 232131
FghsOps_revalidateIntent: called. Path: <PATH_TO_DIRECTORY> ; EntryID 232131
# This block happens multiple times, then the following:

FghsOps_rmDir: called. Path: <PATH_TO_DIRECTORY> ; EntryID 232131
Remoting (rmdir): RmDirResp error code: entry is in use

Any and all assistance is appreciated; I have had no success finding similar issues.


Dan Healy

Sep 23, 2025, 4:12:30 PM
to fhgfs...@googlegroups.com
Is anyone else experiencing this? I am, and would like to resolve it.

Thanks,

Daniel Healy



Joe McCormick

Sep 24, 2025, 2:29:56 PM
to beegfs-user
Hello,

There was an issue fixed in 8.0.1 (https://github.com/ThinkParQ/beegfs/releases/tag/8.0.1) where deleting a file that had been moved or renamed within the same directory could fail. But there are no known issues around file deletions in 8.1.
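
For anyone unsure whether they are still seeing the pre-8.0.1 behaviour, the affected sequence looks roughly like the sketch below (the path is hypothetical and should point somewhere inside a BeeGFS mount):

# Hypothetical sequence matching the issue fixed in 8.0.1: rename a file
# within the same directory, then delete it.
cd /mnt/beegfs/testdir           # placeholder path inside a BeeGFS mount
touch example.tmp
mv example.tmp example.renamed   # rename within the same directory
rm example.renamed               # this delete could fail before 8.0.1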

I tried to reproduce this on my system by cloning a git repo (specifically our beegfs-go repo) to a directory in BeeGFS, switching between branches, then deleting it. So far I haven't been able to reproduce the issue.
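
The attempt was roughly along these lines; the repository URL and branch handling are assumptions, and any repository with a couple of branches should do:

# Rough sketch of the reproduction attempt; repository URL and branch names
# are assumptions, and /mnt/beegfs is a placeholder mount point.
cd /mnt/beegfs
git clone https://github.com/ThinkParQ/beegfs-go.git
cd beegfs-go
git checkout -b scratch    # create and switch to a throwaway branch
git checkout -             # switch back to the previous branch
cd ..
rm -rf beegfs-go           # this delete succeeded here, i.e. the issue did not reproduce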

(1) Can you sanity check that all servers and clients (particularly the metadata servers) were upgraded to 8.1 and restarted? (A quick way to verify the installed versions is sketched after these questions.)

(2) Are there any specific steps and/or repos you can provide to help recreate the issue?

(3) If I'm following, you are running a command like `rm -rf <PATH_TO_DIRECTORY>` which eventually returns `XYZ device resource busy`. Is it just the top-level `<PATH_TO_DIRECTORY>` directory remaining at this point? Or are there other files/directories left under `<PATH_TO_DIRECTORY>`?

(4) Are you able to later run `rm` to clean up `<PATH_TO_DIRECTORY>` and any remaining files/directories? Or do you still have files/directories you are unable to clean up?

(5) If there are entries remaining can you run `beegfs entry info --retro --verbose --recurse --retro-print-paths <PATH_TO_DIRECTORY>` and provide that output here? If there are lots of entries I'm mainly interested in whichever entry ID(s) are called out in the client logs.

(6) Is `EntryID 232131` the actual entry ID that is printed? Or is that the result of sanitizing/redacting the output? Usually BeeGFS entry IDs are in the form `0-68D424CB-1`, so if there was an ID like `232131` that would be very unusual.
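
Regarding (1), a quick way to verify the installed versions on each RHEL node might look like the sketch below; the package names are the ones from your first message, and the systemd unit names are the defaults, so adjust them if you run multiple instances per host:

# Confirm package versions on every server and client node.
rpm -q beegfs-meta beegfs-storage beegfs-mgmtd beegfs-client-dkms
# Confirm the services were restarted after the upgrade (default unit names;
# adjust if you run multiple instances per host).
systemctl status beegfs-mgmtd beegfs-meta beegfs-storage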

Thanks,
~Joe

Tyler

Sep 25, 2025, 1:58:42 PM
to beegfs-user

A sanity check on the metadata servers shows a large number of "Attributes of file node are wrong" messages.
There are no specific steps I can provide to re-create this issue, but I can provide some info on the BeeGFS deployment.
I am running one server with mgmtd, meta, and storage installed, and another server with storage installed. All of these are running in multi-mode, with two instances of mgmtd and meta, and one storage instance on each host.

When doing the `rm -rf <PATH_TO_DIRECTORY>`, files remain under `<PATH_TO_DIRECTORY>`; these tend to be hidden directories such as `.git`.

The files can only be removed once the meta service for that instance is restarted.

EntryID 232131 is the result of sanitizing/redacting the output.

I had recently updated and rebooted these servers, so at the moment I am unable to get the output of `beegfs entry info --retro --verbose --recurse --retro-print-paths <PATH_TO_DIRECTORY>`.

Joe McCormick

Sep 30, 2025, 10:10:14 AM
to beegfs-user
Thank you for the additional detail. I was not able to reproduce the git-repo case locally; most likely I don't have enough concurrency on my system to hit the bug.

However, our regular stress tests recently flagged an issue where directory deletions could fail with "device or resource busy" errors under concurrent directory access. We have already hardened those code paths for the upcoming 8.2 release, and I checked with the developer who made those changes, who confirmed that this entire class of deletion/EBUSY issues should now be resolved.
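
For anyone who wants to check whether they are hitting the same class of issue, a rough concurrency sketch like the one below pairs a background reader loop with a recursive delete; the path is a placeholder and the race is not guaranteed to trigger:

# Keep re-reading the directory in the background while deleting it.
# <PATH_TO_DIRECTORY> is a placeholder; this is not guaranteed to hit the race.
( while true; do ls -R <PATH_TO_DIRECTORY> > /dev/null 2>&1; done ) &
reader_pid=$!
rm -rf <PATH_TO_DIRECTORY>    # watch for "Device or resource busy"
kill "$reader_pid"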

Once the 8.2 release is available, please give it a try and let us know whether the issue with deletions persists.

~Joe

Dan Healy

Oct 2, 2025, 12:45:58 PM
to fhgfs...@googlegroups.com
Hi Joe,

Thanks for that info. When is v8.2 planned for release? Or is there a way we can test a develop branch?

One thing we forgot to mention: the instances where we see this issue are ones where v7 was deployed and then upgraded to v8. When we install v8 from scratch, without any migrated data, we don't have this issue.

Thanks,

Dan



--
Thanks,

Daniel Healy