BeeGFS Meta unexpectedly full


Marcos Filho

Mar 5, 2024, 5:12:20 PM
to beegfs-user
Hello everyone, I'm facing an unusual issue in my environment: the BeeGFS metadata target filled up rapidly for no apparent reason.

Recently, the metadata partition of our BeeGFS volume became full. The volume is 28 TB and the metadata partition was 196 GB, i.e. approximately 0.7% of the total storage capacity (the recommended size is 0.5% of the total). The metadata partition filled up for no apparent reason while the BeeGFS volume had only 19 TB in use. As a workaround, we increased the partition to 390 GB and the BeeGFS volume started working again.
We understand that a full metadata partition is not normal behavior for the BeeGFS file system. What could have caused this? Below is some information about the volume and the system.

 

# beegfs-df -p /mnt/beegfs/
METADATA SERVERS:
TargetID   Cap. Pool        Total         Free    %      ITotal       IFree    %
========   =========        =====         ====    =      ======       =====    =
       1      normal     391.0GiB     176.8GiB  45%      385.6M      370.7M  96%

 

STORAGE TARGETS:
TargetID   Cap. Pool        Total         Free    %      ITotal       IFree    %
========   =========        =====         ====    =      ======       =====    =
     101      normal   28245.4GiB    9232.7GiB  33%    19388.8M    19362.5M 100%
  

Disk usage (in KiB) of the metadata partition (ZFS filesystem):

# du -a /data/beegfs/meta1/
223671088       /data/beegfs/meta1/

 

# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
pool1  43.7T  30.4T  13.3T        -         -     8%    69%  1.00x    ONLINE  -

 

# zpool status
  pool: pool1
state: ONLINE
  scan: none requested
config:

 

        NAME                                     STATE     READ WRITE CKSUM
        pool1                                    ONLINE       0     0     0
          raidz2-0                               ONLINE       0     0     0


Have any of you experienced something similar? We are actively searching for any issues in the logs, but so far, we haven't found anything unusual.

If any additional information is needed, I'd be happy to share.

Thank you very much, everyone.

Quentin Le Burel

Mar 6, 2024, 1:15:59 PM
to fhgfs...@googlegroups.com
Hi Marcos,

Metadata typically tends to fill up when you have lots of small files and/or directories created by users. Have you checked how many files the users have created?
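A quick way to get a first per-directory count is a small loop over the home directories. A minimal sketch below, run on a throwaway tree for illustration; on the real system you would point the loop at /home/*/ instead:

```shell
# Count entries (files + dirs) under each top-level directory.
# Throwaway tree for illustration; replace "$tmp"/*/ with /home/*/ on the real system.
tmp=$(mktemp -d)
mkdir -p "$tmp/alice" "$tmp/bob"
touch "$tmp/alice/a" "$tmp/alice/b" "$tmp/bob/c"
counts=$(for d in "$tmp"/*/; do
  # find lists the directory itself plus everything inside it
  printf '%s\t%s\n' "$(find "$d" | wc -l)" "$d"
done | sort -nr)
printf '%s\n' "$counts"
rm -rf "$tmp"
```

`du --inodes` (as used later in this thread) gives the same information per directory in one pass.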

Kind regards
Quentin




Marcos Filho

Mar 13, 2024, 5:19:30 PM
to fhgfs...@googlegroups.com
Hello Quentin,
Thank you for your help. I managed to gather some data with the 'du' command; take a look.

[root@root]# du -a /home/ | sort -n -r | head -n 20
18847208085     /home/
5257071517      /home/john
5225000303      /home/john/work
4116153087      /home/john/work/espresso
3391169463      /home/emma
3391168491      /home/emma/inputs
3386042949      /home/john/work/espresso/Louis-UFC
3158024892      /home/emma/inputs/jacob
2942537274      /home/emma/inputs/jacob/ultimos
2940567256      /home/emma/inputs/jacob/ultimos/qe
2427886669      /home/alexander
2405578180      /home/alexander/work_vasp
2327099594      /home/john/work/espresso/Louis-UFC/Hubbard
2324453883      /home/emma/inputs/jacob/ultimos/qe/variant1
2283142201      /home/emma/inputs/jacob/ultimos/qe/variant1/0_Mo
2283102998      /home/emma/inputs/jacob/ultimos/qe/variant1/0_Mo/output
1485241656      /home/john/work/espresso/Louis-UFC/Hubbard/ZnO-1Fe
1485238697      /home/john/work/espresso/Louis-UFC/Hubbard/ZnO-1Fe/out
1437140652      /home/ryan
1160021776      /home/kelly

Personally, I think there are too many files in /home. What do you think?

Quentin Le Burel

Mar 15, 2024, 9:59:02 AM
to fhgfs...@googlegroups.com
Hi Marcos,
I don't know how much space one metadata object takes, but if 18 billion files take ~180 GB of metadata disk space, that means ~10 bytes of metadata per file, which probably doesn't sound unreasonable.
Now whether it's too many files in /home or not enough space in /meta is only a matter of perspective: clearly, if your users need that many millions of files, then the 0.5% rule of thumb provided by BeeGFS for metadata partition sizing is not sufficient.
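For planning, a rough back-of-envelope can help. The ~1 KiB-per-entry figure below is an illustrative assumption, not an official BeeGFS number; the actual per-entry cost depends on the underlying filesystem:

```shell
# Rough metadata sizing: assume ~1 KiB of metadata per file/directory
# (an illustrative ballpark, not an official figure).
files=10000000                          # expected number of files + dirs
need_gib=$(( files / 1024 / 1024 ))     # 1 KiB each -> GiB needed
echo "~${need_gib} GiB of metadata for ${files} entries"
```

If the result is anywhere near your partition size, the 0.5%-of-capacity rule is the wrong sizing basis for your workload; size by file count instead.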

Regards

Quentin





Marcos Filho

Mar 15, 2024, 10:00:34 AM
to beegfs-user

Additionally, we collected other inode data and the numbers are different.

[root@root]# du --inodes /home/ | sort -n -r | head -n 20
10367962        /home/
2023962 /home/brandon
1316452 /home/ralf
866824  /home/jhon
860063  /home/gabe
828109  /home/brandon/anaconda3
776053  /home/jhon/Packages
766603  /home/ronny
747037  /home/mcosta
684069  /home/ryan/work_vasp
672098  /home/emma
652001  /home/brandon/TI
624306  /home/ronny/opt
610344  /home/lois
577379  /home/lois/.conda
472624  /home/brandon/TI/na3bi
465415  /home/brandon/TI/na3bi/siesta
437874  /home/ronny/Packages
432441  /home/jhon/Packages/anaconda3
425418  /home/frank


We will continue investigating.

Thank you so much!

Quentin Le Burel

Mar 15, 2024, 10:09:33 AM
to fhgfs...@googlegroups.com
Oh sorry, I misread your first du -a: it was showing 18 TB used in /home, not 18 billion files...
The second one below says you have 10 million files/directories under /home; maybe that's already too much for your metadata partition...

Regards

Quentin

John Hearns

Mar 15, 2024, 11:27:24 AM
to fhgfs...@googlegroups.com
If your users must keep many small files for future use or for archival purposes, tell them to make a zip or tar file for each directory, then delete the small files.
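One way to do this, sketched on a throwaway tree (on the real system the tar would run inside the users' directories, and you would only delete after verifying the archive):

```shell
# Collapse a directory of many small files into a single archive
# (one inode instead of thousands). Throwaway tree for illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/proj1"
touch "$tmp/proj1/f1" "$tmp/proj1/f2" "$tmp/proj1/f3"
# Archive the whole directory into one file
( cd "$tmp" && tar czf proj1.tar.gz proj1 )
# Verify the archive is readable, and only then remove the originals
tar tzf "$tmp/proj1.tar.gz" > /dev/null && rm -rf "$tmp/proj1"
```

This trades many metadata entries for one, at the cost of having to extract the archive to access individual files again.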

Waltar

Mar 15, 2024, 12:30:52 PM
to beegfs-user
Hello Marcos,
10 million files = 10 million inodes in /home (= BeeGFS) would take less than 1 GB of metadata on XFS,
while you need 223 GB with ZFS, which is caused by the recordsize used.
Why use ZFS when it's slower for metadata and data and, as you see, much less space-efficient?
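To confirm where the overhead comes from, it may be worth inspecting the dataset that backs the metadata directory (the dataset name below is hypothetical; substitute whichever dataset holds /data/beegfs/meta1):

```shell
# Hypothetical dataset name -- check recordsize and logical vs. physical usage
zfs get recordsize,used,logicalused,compressratio pool1/meta1
```

A large gap between `logicalused` and `used` on a dataset full of tiny metadata files would point at per-record overhead.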

Marcos Filho

Mar 15, 2024, 3:06:51 PM
to beegfs-user
Thank you very much, Quentin and John Hearns, for the help.

Waltar, it was necessary to use ZFS because the pool already existed and there were no available partitions to use as BeeGFS metadata. However, I understand that this does not justify 10 million files occupying 215 GB of metadata, even on a ZFS file system.

Waltar

Mar 16, 2024, 1:25:10 PM
to beegfs-user
Hello Marcos,
you could even build a hybrid RAID with XFS, no partitions at all, and put BeeGFS metadata and data into one filesystem:
e.g. take 3 SATA SSDs or NVMe drives and build a RAID1 set (by HW controller or mdadm) so you can survive up to 2 failed disks,
plus a bunch of e.g. 24 HDDs as a 21+2 +1 spare RAID6 in a HW controller.
Build an mdadm linear RAID with the SSD raidset as the first device and the RAID6 set as the second.
Make an XFS filesystem on that hybrid linear RAID (no GPT label or any partition needed) and mount it with the inode32 option.
After each server boot, be sure to tune the fs cache via sysctl, e.g. "echo 30 > /proc/sys/vm/vfs_cache_pressure".
When you test the filesystem with e.g. mdtest and 0-byte files, "iostat -xm 1" will show that all files go into the SSD RAID1 alone;
when you push data onto it, you will see it going to the RAID6 device (after that, read the data back to /dev/null), with some peaks in the SSD RAID1 for the file inode allocations.
Fill the filesystem with lots of data, then reboot the server or flush the cache manually with "echo 3 > /proc/sys/vm/drop_caches".
Run find, du or a recursive ls on that XFS and iostat will show everything coming from the SSD RAID1; read data (cat files, tar directories, use ior or elbencho) and you will see it coming from the RAID6.
Be sure to use big SSDs/NVMe (2, 4 or 8 TB): >=1 TB allows up to 1,000,000,000 inodes, and with 3 TB you get space for 3,000,000,000 inodes.
After that, just create the 3 BeeGFS directories for metadata, storage and mgmt inside the one XFS.
It is always good to have 1 storage and 1 metadata service on every BeeGFS server, regardless of which filesystem(s) are used underneath.
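A sketch of those steps as commands. All device names are hypothetical (adapt them to your controller and disks), and these commands are destructive, so treat this as an outline rather than something to paste:

```shell
# 3 SSDs/NVMe in RAID1 (survives 2 failed disks) -- hypothetical device names
mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1
# /dev/sda here stands for the HW-controller RAID6 virtual disk (21+2 +1 spare)
# Linear concatenation: SSD set first, RAID6 second
mdadm --create /dev/md1 --level=linear --raid-devices=2 /dev/md0 /dev/sda
mkfs.xfs /dev/md1                        # no partition table needed
mount -o inode32 /dev/md1 /data/beegfs   # inode32 pins inodes to the first (SSD) region
echo 30 > /proc/sys/vm/vfs_cache_pressure
mkdir -p /data/beegfs/meta /data/beegfs/storage /data/beegfs/mgmt
```

The inode32 mount option is what keeps all inode allocations in the low part of the device, i.e. on the SSD RAID1 at the front of the linear array.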