Meta server can't boot (BeeGFS 7.1.5)

Chen Bill

Sep 22, 2020, 6:25:51 AM
to beegfs-user
Hi All,

The metadata server rebooted after a kernel panic of unknown cause, and now the meta service fails to start with the error messages below.


/opt/beegfs/sbin/beegfs-meta cfgFile=/etc/beegfs/beegfs-meta.conf runDaemonized=false
terminate called after throwing an instance of 'SignalException'
  what():  Segmentation fault
terminate called recursively
Aborted (core dumped)

The following error messages also appear in /var/log/beegfs-meta.log:

(0) Sep22 17:40:23 Main [PThread.cpp:99] >> Received a SIGSEGV. Trying to shut down...
(1) Sep22 17:40:23 Main [PThread::signalHandler] >> Backtrace:
1: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x47) [0x75a3e7]
2: /lib64/libc.so.6(+0x36280) [0x7fc12b814280]
3: /opt/beegfs/sbin/beegfs-meta(_ZN18ExceededQuotaStore19updateExceededQuotaEPSt4listIjSaIjEE13QuotaDataType14QuotaLimitType+0x1e) [0x752e6e]
4: /opt/beegfs/sbin/beegfs-meta(_ZN15InternodeSyncer29downloadAllExceededQuotaListsESt10shared_ptrI11StoragePoolE+0x169) [0x4aab89]
5: /opt/beegfs/sbin/beegfs-meta(_ZN15InternodeSyncer29downloadAllExceededQuotaListsERKSt6vectorISt10shared_ptrI11StoragePoolESaIS3_EE+0xb2) [0x4ab562]
6: /opt/beegfs/sbin/beegfs-meta(_ZN3App16downloadMgmtInfoER22TargetConsistencyState+0x1fa) [0x4880ca]
7: /opt/beegfs/sbin/beegfs-meta(_ZN3App9runNormalEv+0x12f) [0x48d34f]
8: /opt/beegfs/sbin/beegfs-meta(_ZN3App3runEv+0x52) [0x48d8d2]
9: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread9runStaticEPv+0xfe) [0x4828fe]
10: /opt/beegfs/sbin/beegfs-meta(_ZN7Program4mainEiPPc+0x49) [0x47fa19]
11: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fc12b8003d5]
12: /opt/beegfs/sbin/beegfs-meta() [0x4821f5]
(0) Sep22 17:40:23 Main [PThread.cpp:135] >> Received a SIGABRT. Trying to shut down...
(1) Sep22 17:40:23 Main [PThread::signalHandler] >> Backtrace:
1: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x47) [0x75a3e7]
2: /lib64/libc.so.6(+0x36280) [0x7fc12b814280]
3: /lib64/libc.so.6(gsignal+0x37) [0x7fc12b814207]
4: /lib64/libc.so.6(abort+0x148) [0x7fc12b8158f8]
5: /lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165) [0x7fc12c33f7d5]
6: /lib64/libstdc++.so.6(+0x5e746) [0x7fc12c33d746]
7: /lib64/libstdc++.so.6(+0x5d6f9) [0x7fc12c33c6f9]
8: /lib64/libstdc++.so.6(__gxx_personality_v0+0x564) [0x7fc12c33d364]
9: /lib64/libgcc_s.so.1(+0xf8a3) [0x7fc12bdd68a3]
10: /lib64/libgcc_s.so.1(_Unwind_RaiseException+0xfb) [0x7fc12bdd6c3b]
11: /lib64/libstdc++.so.6(__cxa_throw+0x66) [0x7fc12c33d986]
12: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x296) [0x75a636]
13: /lib64/libc.so.6(+0x36280) [0x7fc12b814280]
14: /opt/beegfs/sbin/beegfs-meta(_ZN18ExceededQuotaStore19updateExceededQuotaEPSt4listIjSaIjEE13QuotaDataType14QuotaLimitType+0x1e) [0x752e6e]
15: /opt/beegfs/sbin/beegfs-meta(_ZN15InternodeSyncer29downloadAllExceededQuotaListsESt10shared_ptrI11StoragePoolE+0x169) [0x4aab89]
16: /opt/beegfs/sbin/beegfs-meta(_ZN15InternodeSyncer29downloadAllExceededQuotaListsERKSt6vectorISt10shared_ptrI11StoragePoolESaIS3_EE+0xb2) [0x4ab562]
17: /opt/beegfs/sbin/beegfs-meta(_ZN3App16downloadMgmtInfoER22TargetConsistencyState+0x1fa) [0x4880ca]
18: /opt/beegfs/sbin/beegfs-meta(_ZN3App9runNormalEv+0x12f) [0x48d34f]
19: /opt/beegfs/sbin/beegfs-meta(_ZN3App3runEv+0x52) [0x48d8d2]
20: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread9runStaticEPv+0xfe) [0x4828fe]
21: /opt/beegfs/sbin/beegfs-meta(_ZN7Program4mainEiPPc+0x49) [0x47fa19]
22: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fc12b8003d5]
23: /opt/beegfs/sbin/beegfs-meta() [0x4821f5]
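
For readers triaging a similar log, here is a minimal sketch (not part of the original post) that pulls the frames out of a backtrace in this format and picks the first beegfs-meta frame that is not the signal handler. The function names `parse_frames` and `first_application_frame` are hypothetical, and the regular expression assumes the exact line layout shown above. The mangled symbol it returns can then be passed to a demangler such as `c++filt`.

```python
import re

# Matches lines in the layout seen in the log above, e.g.
#   "3: /path/to/binary(symbol+0x1e) [0x752e6e]"
# The symbol and the offset are both optional inside the parentheses.
FRAME_RE = re.compile(
    r"^(?P<index>\d+):\s+(?P<binary>[^\s(]+)"
    r"\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\)\s+"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(log_text):
    """Return one dict per backtrace line; non-matching lines are skipped."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def first_application_frame(frames, binary_suffix="beegfs-meta"):
    """Return the first named frame in the binary that is not the signal handler."""
    for frame in frames:
        in_binary = frame["binary"].endswith(binary_suffix)
        symbol = frame["symbol"]
        if in_binary and symbol and "signalHandler" not in symbol:
            return frame
    return None


# The first three frames of the SIGSEGV backtrace, copied from the log above.
SAMPLE = """\
1: /opt/beegfs/sbin/beegfs-meta(_ZN7PThread13signalHandlerEi+0x47) [0x75a3e7]
2: /lib64/libc.so.6(+0x36280) [0x7fc12b814280]
3: /opt/beegfs/sbin/beegfs-meta(_ZN18ExceededQuotaStore19updateExceededQuotaEPSt4listIjSaIjEE13QuotaDataType14QuotaLimitType+0x1e) [0x752e6e]
"""

frame = first_application_frame(parse_frames(SAMPLE))
print(frame["index"], frame["symbol"], frame["offset"])
```

On this log the selected frame is the `ExceededQuotaStore` call, which is consistent with the fault occurring while the exceeded-quota lists are downloaded during startup.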


Any suggestions would be appreciated!

Regards,
Bill 
 

Sven Breuner

Sep 22, 2020, 7:31:14 AM
to fhgfs...@googlegroups.com, Chen Bill

Hi,

from the error messages, there seems to be a problem during quota initialization. If it is an option for you and if you have quota enabled, you might want to try disabling quota temporarily to see if it helps you get the service back up and running. As this is related to communication with the beegfs-mgmtd, you might also want to check if that service is still running fine or reporting any errors.
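
As a rough illustration of the first check, the sketch below reads a quota setting from configuration text in the `key = value` style of /etc/beegfs/beegfs-meta.conf. It assumes the relevant key is `quotaEnableEnforcement`, which should be verified against the config file shipped with your version. The helper `read_setting`, the sample text, and the host name `mgmt01` are hypothetical.

```python
def read_setting(conf_text, key):
    """Return the value of the first 'key = value' line for key, or None if absent."""
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # ignore comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


# Illustrative excerpt only; not taken from the poster's system.
SAMPLE_CONF = """\
# excerpt in the style of beegfs-meta.conf
sysMgmtdHost                 = mgmt01
quotaEnableEnforcement       = false
"""

print(read_setting(SAMPLE_CONF, "quotaEnableEnforcement"))
```

To check a real file, pass its contents instead, for example `read_setting(open("/etc/beegfs/beegfs-meta.conf").read(), "quotaEnableEnforcement")`.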

All the best
Sven
--
Sven Breuner
Field CTO
Excelero

--
You received this message because you are subscribed to the Google Groups "beegfs-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fhgfs-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/fhgfs-user/b9331791-6e2f-420d-9d63-7a194346baa2n%40googlegroups.com.

Chen Bill

Sep 22, 2020, 11:31:49 AM
to beegfs-user
Thank you Sven,
Quota is not enabled, and mgmtd reports no useful errors, but I eventually solved the problem with the steps below.

1. Remove the mgmtd data and reinitialize the mgmtd service.
2. Add the old meta and storage servers to the new mgmtd (be careful to set storeAllowFirstRunInit = false).
3. Start beegfs-client.
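
The warning in step 2 can be turned into a small pre-flight edit. The sketch below is an assumption-laden illustration, not part of the original procedure: it rewrites configuration text so that `storeAllowFirstRunInit` is `false` before the old meta and storage services are started against the new mgmtd. The helper name `force_setting` and the store path in the sample are hypothetical. It works on text only, so back up the real config file before writing the result to it.

```python
import re


def force_setting(conf_text, key, value):
    """Return conf_text with key set to value; append the line if key is absent."""
    # Match an uncommented "key = ..." line; keep the key and spacing, replace the value.
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, conf_text)
    if count == 0:
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += f"{key} = {value}\n"
    return new_text


# Illustrative excerpt only; the directory is a made-up example path.
SAMPLE_CONF = """\
storeMetaDirectory           = /data/beegfs/meta
storeAllowFirstRunInit       = true
"""

updated = force_setting(SAMPLE_CONF, "storeAllowFirstRunInit", "false")
print(updated)
```

Keeping the original spacing and leaving all other lines untouched makes the change easy to review with a plain diff before restarting the services.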

Cheers,
Bill

Toby Darling

Jul 15, 2021, 3:01:04 AM
to fhgfs...@googlegroups.com
Hi Bill

Nod of appreciation to you - we ran into this problem a week ago and
your solution helped us. Quite scary rerunning the beegfs-setup-*
scripts, but it got us out of the hole.

We're still not sure of the cause, although we were having some network
disruption around the time of the failure.

Many thanks.

Cheers
Toby

--
Toby Darling, Scientific Computing (2N249)
MRC Laboratory of Molecular Biology
https://www.mrc-lmb.cam.ac.uk/scicomp/

Xavi

Aug 2, 2021, 4:37:45 AM
to beegfs-user
Hi,

Are you using buddygroups? We have a similar problem and want to reset the mgmtd, but we have buddygroups configured and doubt whether it will work.

Thanks,
Xavi.

Toby Darling

Aug 2, 2021, 5:22:14 AM
to fhgfs...@googlegroups.com
Hi Xavi

No, we don't use buddygroups.

Cheers
Toby