Sudden crash on beegfs-meta for v7.4.5


Timothy Yim

Mar 25, 2026, 3:00:06 AM
to beegfs-user
Hello,

  beegfs-meta v7.4.5 suddenly crashed yesterday.

  Is this a bug or an issue in v7.4.5?

  Error in beegfs-meta.log: 
(0) Mar24 17:57:34 Worker55 [PThread.cpp:99] >> Received a SIGSEGV. Trying to shut down...
(1) Mar24 17:57:34 Worker55 [PThread::signalHandler] >> Backtrace:
1: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x47) [0x7103d7]
2: /lib64/libc.so.6(+0x3e730) [0x7f2b25a3e730]
(0) Mar24 17:57:35 Worker55 [PThread.cpp:135] >> Received a SIGABRT. Trying to shut down...
(1) Mar24 17:57:35 Worker55 [PThread::signalHandler] >> Backtrace:
1: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x47) [0x7103d7]
2: /lib64/libc.so.6(+0x3e730) [0x7f2b25a3e730]
3: /lib64/libc.so.6(+0x8ba6c) [0x7f2b25a8ba6c]
4: /lib64/libc.so.6(raise+0x16) [0x7f2b25a3e686]
5: /lib64/libc.so.6(abort+0xd3) [0x7f2b25a28833]
6: /lib64/libstdc++.so.6(+0xa1b21) [0x7f2b25ea1b21]
7: /lib64/libstdc++.so.6(+0xad52c) [0x7f2b25ead52c]
8: /lib64/libstdc++.so.6(+0xad597) [0x7f2b25ead597]
9: /lib64/libstdc++.so.6(+0xad7f9) [0x7f2b25ead7f9]
10: /opt/beegfs/sbin/beegfs-meta() [0x4c7474]
11: /lib64/libc.so.6(+0x3e730) [0x7f2b25a3e730]
(2) Mar24 17:57:37 Main [App (wait for component termination)] >> Still waiting for this component to stop: Worker55

  beegfs-meta service log:
  Mar 24 17:57:35 h08mds1 beegfs-meta[6231]: terminate called after throwing an instance of 'SignalException'
Mar 24 17:57:35 h08mds1 beegfs-meta[6231]:   what():  Segmentation fault
Mar 24 17:57:41 h08mds1 beegfs-meta[6231]: terminate called recursively
Mar 24 17:57:46 h08mds1 systemd-coredump[1326857]: [🡕] Process 6231 (beegfs-meta/Mai) of user 0 dumped core.

                                                   Stack trace of thread 6296:
                                                   #0  0x00007f2b25a8ba6c n/a (n/a + 0x0)
                                                   ELF object binary architecture: AMD x86-64
Mar 24 17:57:47 h08mds1 systemd[1]: beegfs-meta.service: Main process exited, code=dumped, status=6/ABRT
Mar 24 17:57:47 h08mds1 systemd[1]: beegfs-meta.service: Failed with result 'core-dump'.
Mar 24 17:57:47 h08mds1 systemd[1]: beegfs-meta.service: Consumed 2month 2w 5d 20min 15.909s CPU time.

/var/log/messages:
Mar 24 17:57:46 h08mds1 systemd-coredump[1326857]: Process 6231 (beegfs-meta/Mai) of user 0 dumped core.#012#012Stack trace of thread 6296:#012#0  0x00007f2b25a8ba6c n/a (n/a + 0x0)#012ELF object binary architecture: AMD x86-64

Thank you.

Timothy Yim

Mar 26, 2026, 9:46:01 PM
to beegfs-user
It crashed again after 2 days.
(0) Mar26 18:47:19 Worker64 [RWLock::readLock] >> Failed to get lock: Resource temporarily unavailable
(1) Mar26 18:47:19 Worker64 [RWLock::readLock] >> Backtrace:
1: /opt/beegfs/sbin/beegfs-meta(_ZN6RWLock8readLockEv+0x17f) [0x4ffebf]
2: /opt/beegfs/sbin/beegfs-meta(_ZN14MsgHelperClose22closeChunkFileParallelE9NumericIDIj12NumNodeIDTagERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEiR9FileInodeP9EntryInfojPSt6vectorI18DynamicFileAttribsSaISG_EE+0x4b) [0x59804b]
3: /opt/beegfs/sbin/beegfs-meta(_ZN14MsgHelperClose9closeFileE9NumericIDIj12NumNodeIDTagERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP9EntryInfoijPbSD_PSt6vectorI18DynamicFileAttribsSaISF_EEP18MirroredTimestamps+0x98) [0x59c0a8]
4: /opt/beegfs/sbin/beegfs-meta(_ZN14CloseFileMsgEx16closeFilePrimaryERN10NetMessage15ResponseContextE+0x2d4) [0x6cb964]
5: /opt/beegfs/sbin/beegfs-meta(_ZN15MirroredMessageI12CloseFileMsg10FileIDLockE15processIncomingERN10NetMessage15ResponseContextE+0x4ff) [0x6cdb6f]
6: /opt/beegfs/sbin/beegfs-meta(_ZN27IncomingPreprocessedMsgWork7processEPcjS0_j+0x180) [0x77e910]
7: /opt/beegfs/sbin/beegfs-meta(_ZN6Worker8workLoopE13QueueWorkType+0x146) [0x7885f6]
8: /opt/beegfs/sbin/beegfs-meta(_ZN6Worker3runEv+0x58) [0x788cb8]
9: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread9runStaticEPv+0x11c) [0x4fbb5c]
10: /lib64/libc.so.6(+0x89d22) [0x7fc304c89d22]
11: /lib64/libc.so.6(+0x10ed40) [0x7fc304d0ed40]
(0) Mar26 18:47:19 Worker64 [App (component exception handler)] >> The component [Worker64] encountered an unrecoverable error. [SysErr: No such file or directory] Exception message: Resource temporarily unavailable
(2) Mar26 18:47:19 Worker64 [App (component exception handler)] >> Shutting down...
.....
(0) Mar26 18:47:22 Main [InodeDirStore.cpp:147] >> Bug: releaseDir requested, but dir not referenced! dirID: 14-69C50667-B
(1) Mar26 18:47:22 Main [releaseDirUnlocked] >> Backtrace:
1: /opt/beegfs/sbin/beegfs-meta(_ZN13InodeDirStore18releaseDirUnlockedERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x365) [0x54c885]
2: /opt/beegfs/sbin/beegfs-meta(_ZN13InodeDirStore18clearStoreUnlockedEv+0x2f) [0x54e21f]
3: /opt/beegfs/sbin/beegfs-meta(_ZN3AppD2Ev+0x781) [0x4f60b1]
4: /opt/beegfs/sbin/beegfs-meta(_ZN3AppD0Ev+0x9) [0x4f6829]
5: /opt/beegfs/sbin/beegfs-meta(_ZN7Program4mainEiPPc+0x123) [0x4e9963]
6: /lib64/libc.so.6(+0x295d0) [0x7fc304c295d0]
7: /lib64/libc.so.6(__libc_start_main+0x80) [0x7fc304c29680]
8: /opt/beegfs/sbin/beegfs-meta(_start+0x25) [0x4ec1e5]


Can anyone help? Thank you.

Waltar

Mar 27, 2026, 11:53:25 AM
to beegfs-user
Have you already looked into the system log file (e.g. /var/log/messages on Red Hat systems)? It might show the reason why beegfs-meta stops working.

Timothy Yim

Mar 29, 2026, 10:07:12 PM
to beegfs-user
We ran beegfs-fsck twice on 28 Mar.
1st fsck:
1. found 35355 errors for Chunk without an inode pointing to it (orphaned chunk) ... -> Do Nothing
2. found 60 errors: Attributes of file inode are wrong ... -> attributes updated

2nd fsck:
1. found 35355 errors for Chunk without an inode pointing to it (orphaned chunk) ... -> Do Nothing

Error again on 29 Mar:
(0) Mar29 22:12:50 Worker8 [PThread.cpp:99] >> Received a SIGSEGV. Trying to shut down...
(1) Mar29 22:12:50 Worker8 [PThread::signalHandler] >> Backtrace:
1: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x47) [0x7103d7]
2: /lib64/libc.so.6(+0x3e730) [0x7fca1143e730]
3: /opt/beegfs/sbin/beegfs-meta(_ZN14MsgHelperClose22closeChunkFileParallelE9NumericIDIj12NumNodeIDTagERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEiR9FileInodeP9EntryInfojPSt6vectorI18DynamicFileAttribsSaISG_EE+0x6a) [0x59806a]
4: /opt/beegfs/sbin/beegfs-meta(_ZN14MsgHelperClose9closeFileE9NumericIDIj12NumNodeIDTagERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP9EntryInfoijPbSD_PSt6vectorI18DynamicFileAttribsSaISF_EEP18MirroredTimestamps+0x98) [0x59c0a8]
5: /opt/beegfs/sbin/beegfs-meta(_ZN14CloseFileMsgEx16closeFilePrimaryERN10NetMessage15ResponseContextE+0x2d4) [0x6cb964]
6: /opt/beegfs/sbin/beegfs-meta(_ZN15MirroredMessageI12CloseFileMsg10FileIDLockE15processIncomingERN10NetMessage15ResponseContextE+0x4ff) [0x6cdb6f]
7: /opt/beegfs/sbin/beegfs-meta(_ZN27IncomingPreprocessedMsgWork7processEPcjS0_j+0x180) [0x77e910]
8: /opt/beegfs/sbin/beegfs-meta(_ZN6Worker8workLoopE13QueueWorkType+0x146) [0x7885f6]
9: /opt/beegfs/sbin/beegfs-meta(_ZN6Worker3runEv+0x58) [0x788cb8]
10: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread9runStaticEPv+0x11c) [0x4fbb5c]
11: /lib64/libc.so.6(+0x89d22) [0x7fca11489d22]
12: /lib64/libc.so.6(+0x10ed40) [0x7fca1150ed40]
(0) Mar29 22:12:50 Worker8 [App (component exception handler)] >> The component [Worker8] encountered an unrecoverable error. [SysErr: Success] Exception message: Segmentation fault
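Both the Mar 26 and the Mar 29 backtraces pass through MsgHelperClose::closeChunkFileParallel (frame 2 and frame 3 respectively). To compare backtraces like these without reading the mangled names by eye, a minimal Python sketch could look like the following. It only assumes the frame format visible in the logs above; the function names `parse_frames` and `shared_symbols` are made up for this illustration and are not part of BeeGFS.

```python
import re

# Matches one backtrace frame as printed in beegfs-meta.log, for example:
#   8: /opt/beegfs/sbin/beegfs-meta(_ZN6Worker3runEv+0x58) [0x788cb8]
#   2: /lib64/libc.so.6(+0x3e730) [0x7f2b25a3e730]
#   10: /opt/beegfs/sbin/beegfs-meta() [0x4c7474]
FRAME_RE = re.compile(
    r"^\s*(\d+): (\S+?)\(([^()+]*)(?:\+(0x[0-9a-f]+))?\) \[(0x[0-9a-f]+)\]"
)


def parse_frames(log_text):
    """Return (index, binary, symbol, offset, address) for every frame line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            index, binary, symbol, offset, address = match.groups()
            frames.append((int(index), binary, symbol, offset, address))
    return frames


def shared_symbols(trace_a, trace_b):
    """Return the named symbols of trace_a that also occur in trace_b, in order."""
    symbols_b = {frame[2] for frame in parse_frames(trace_b) if frame[2]}
    return [frame[2] for frame in parse_frames(trace_a) if frame[2] in symbols_b]


# Shortened samples copied from the two backtraces in this thread.
trace_mar26 = """\
1: /opt/beegfs/sbin/beegfs-meta(_ZN6RWLock8readLockEv+0x17f) [0x4ffebf]
8: /opt/beegfs/sbin/beegfs-meta(_ZN6Worker3runEv+0x58) [0x788cb8]
10: /lib64/libc.so.6(+0x89d22) [0x7fc304c89d22]
"""
trace_mar29 = """\
1: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x47) [0x7103d7]
9: /opt/beegfs/sbin/beegfs-meta(_ZN6Worker3runEv+0x58) [0x788cb8]
11: /lib64/libc.so.6(+0x89d22) [0x7fca11489d22]
"""

for symbol in shared_symbols(trace_mar26, trace_mar29):
    print(symbol)
```

The symbols are still mangled; piping them through a demangler such as c++filt, if it is installed, makes them readable.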

We also checked the log and found many "Deserialization failed" messages:
(0) Mar28 13:16:47 Worker8 [DiskMetadata (DirInode Deserialization)] >> Deserialization failed: expected DirInode, but got (numeric type): 3
(0) Mar28 13:16:47 Worker8 [Directory (load from xattr file)] >> Unable to deserialize metadata in file: inodes/40/59/6A7-68DDFF4B-B
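To get an overview of which inode files are affected, the "Unable to deserialize metadata in file" lines can be collected with a short script. This is only a sketch: it assumes the message format shown in the two log lines above, and `failed_inode_files` is a hypothetical helper name, not a BeeGFS tool.

```python
import re
from collections import Counter

# Captures the path that follows the message seen in beegfs-meta.log, e.g.
#   ... >> Unable to deserialize metadata in file: inodes/40/59/6A7-68DDFF4B-B
FAILED_FILE_RE = re.compile(r"Unable to deserialize metadata in file: (\S+)")


def failed_inode_files(log_lines):
    """Count how often each inode file is reported as not deserializable."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_FILE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the log excerpt in this thread.
sample = [
    "(0) Mar28 13:16:47 Worker8 [DiskMetadata (DirInode Deserialization)] >> "
    "Deserialization failed: expected DirInode, but got (numeric type): 3",
    "(0) Mar28 13:16:47 Worker8 [Directory (load from xattr file)] >> "
    "Unable to deserialize metadata in file: inodes/40/59/6A7-68DDFF4B-B",
]

for path, count in failed_inode_files(sample).most_common():
    print(count, path)
```

For a real log, pass an open file object instead of the sample list; the location of beegfs-meta.log depends on the installation.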

Any recommendations for our issue? Thank you.

Waltar

Mar 30, 2026, 2:55:00 PM
to beegfs-user
Never ever run an "on-top" beegfs-fsck before checking/scrubbing the underlying file systems first!
It looks like you have a corrupted metadata file system (ext4?), which causes your BeeGFS to crash as a consequence.

Timothy Yim

Mar 30, 2026, 10:13:58 PM
to beegfs-user
We will try fixing the meta mount point. Thank you.