How to deal with node failures and mirror groups

Stefan Heitmüller

Oct 25, 2017, 1:21:29 PM
to beegfs-user

Hi!

Yesterday I finally had some time to get my head into BeeGFS, and I'm impressed so far (I've been searching for a reliable, fast and scalable FS for quite some time and have tried Ceph, Gluster, ...)!

Before going live with some real hardware, I created a small test setup:

- 2 KVMs for metadata and storage (on Proxmox ZFS SSDs using virtio SCSI)
- 1 LXC as manager (Proxmox ZFS SSD also)
- created one storage target on each KVM
- created a buddy group
- enabled metadata and storage mirroring

So far I've been testing failover (which is the most important thing to us) and basic performance (impressive).

Now imagine one of the nodes in this buddy group is lost forever for whatever reason (fire, crash, ...). How do I get the group healthy again? I can't find any command to modify existing groups. Is it possible to remove a mirror group and set up a new one using a fresh node's storage target, or can I add a new storage target to the existing group as a replacement?

Jan Behrend

Oct 26, 2017, 3:15:43 AM
to fhgfs...@googlegroups.com
Hello Stefan,

On Wed, 2017-10-25 at 10:21 -0700, Stefan Heitmüller wrote:
> Is it possible to remove a mirror group and set up a new one using a
> fresh node's storage target, or can I add a new storage target to the
> existing group as a replacement?

Sure is! Once you have lost a buddy mirror, you can resync by using the
same target ID for the replacement server:

/opt/beegfs/sbin/beegfs-setup-storage -p /storage -i <same ID as lost server here> -m mgmtServer

After the target goes online it'll resync the buddy mirror.
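
A minimal dry-run sketch of the step above, for checking the values before touching the replacement server. It only prints the command and does not run it. The function name `build_setup_cmd` and the example values (`/storage`, `101`, `mgmtServer`) are hypothetical placeholders, and the numeric check assumes target IDs are plain integers.

```shell
#!/usr/bin/env bash
# Dry run: print the beegfs-setup-storage call for a replacement target
# instead of executing it. All argument values are placeholders.

build_setup_cmd() {
    local storage_path="$1"   # directory that will hold the new target
    local target_id="$2"      # ID of the lost target, reused on purpose
    local mgmt_host="$3"      # hostname of the management server

    # Assumption: target IDs are plain non-negative integers.
    case "$target_id" in
        ''|*[!0-9]*)
            echo "target ID must be numeric: '$target_id'" >&2
            return 1
            ;;
    esac

    printf '/opt/beegfs/sbin/beegfs-setup-storage -p %s -i %s -m %s\n' \
        "$storage_path" "$target_id" "$mgmt_host"
}

# Example: show the command for a lost target with the hypothetical ID 101
build_setup_cmd /storage 101 mgmtServer
```

Printing first makes it harder to reuse the wrong ID by accident, since the resync relies on the ID matching the lost target.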

Cheers, Jan

--
MAX-PLANCK-INSTITUT fuer Radioastronomie
Jan Behrend - Rechenzentrum
----------------------------------------
Auf dem Huegel 69, D-53121 Bonn                                  
Tel: +49 (228) 525 359
http://www.mpifr-bonn.mpg.de

Stefan Heitmüller

Oct 27, 2017, 2:59:12 AM
to beegfs-user
Great, and thanks for your answer!

What happens to a mirror group if I remove all of its targets? Does it stay empty and wait for new targets, or does it get destroyed?

kva...@gmail.com

Dec 9, 2017, 5:22:22 AM
to beegfs-user
Hi Jan,

What if beegfs-mgmtd does not allow registration with the same ID?

Log on beegfs-storage:

(3) Dec09 10:05:10 Main [RegDGramLis] >> Listening for UDP datagrams: Port 8003
(1) Dec09 10:05:10 Main [App] >> Waiting for beegfs-mgmtd@beegfs-mgmtd-1:8008...
(2) Dec09 10:05:10 RegDGramLis [Heartbeat incoming] >> New node: beegfs-mgmtd beegfs-mgmtd-1 [ID: 1];
(3) Dec09 10:05:10 Main [NodeConn (acquire stream)] >> Connected: beegfs...@10.29.36.173:8008 (protocol: TCP)
(0) Dec09 10:05:10 Main [App] >> Target ID reservation request was rejected by this mgmt node: beegfs-mgmtd-1 [ID: 1]
(0) Dec09 10:05:10 Main [App] >> Target pre-registration at management node failed

Log on beegfs-mgmtd:

(1) Dec09 10:05:10 Worker4 [RegisterTargetMsg incoming] >> Registration failed for target: 0-5A2BB556-3ED; numID: 1005
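
When reading longer logs like these, it can help to pull out only the most severe lines. The sketch below assumes that the leading `(N)` on each line is the log level and that `(0)` marks errors, which matches the two failure lines in the beegfs-storage log above but is not confirmed in this thread. The function name `filter_errors` is hypothetical.

```shell
#!/usr/bin/env bash
# Print only the level-0 lines of a BeeGFS log read from stdin.
# Assumption: the leading "(N)" is the log level and 0 means error.

filter_errors() {
    grep -E '^\(0\) '
}

# Sample lines copied from the beegfs-storage log in this message
filter_errors <<'EOF'
(1) Dec09 10:05:10 Main [App] >> Waiting for beegfs-mgmtd@beegfs-mgmtd-1:8008...
(0) Dec09 10:05:10 Main [App] >> Target ID reservation request was rejected by this mgmt node: beegfs-mgmtd-1 [ID: 1]
(0) Dec09 10:05:10 Main [App] >> Target pre-registration at management node failed
EOF
```

On the sample this keeps the two rejection lines and drops the informational one.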