Metadata Directory index full


Emir Imamagic

Aug 11, 2016, 3:32:11 AM
to fhgfs...@googlegroups.com
Hi,

we just deployed a new BeeGFS instance using 2015.03.r17, with ext4 used
for metadata. On this FS we will have a few directories containing over
10M files each (without using subdirectories).

During the data migration onto the BeeGFS we hit the following issue:
syslog:
EXT4-fs warning (device sdb): ext4_dx_add_entry:2018: Directory index full!
beegfs-meta.log:
(0) Aug11 08:38:59 Worker16 [DirEntry (store initial dirEntry)] >>
Creating the dentry-by-name file failed: Path:
dentries/40/7C/658-57A46B36-1/2016-07-11_05-22_6091.rar SysErr: No space
left on device
(0) Aug11 08:38:59 Worker16 [make meta dir-entry] >> Failed to create:
name: 2016-07-11_05-22_6091.rar entryID: 22F6-57AC1D56-2 in path:
dentries/40/7C/658-57A46B36-1

The error occurred in a directory with slightly over 5M files.

Based on the information online:
https://access.redhat.com/solutions/29894
https://patchwork.ozlabs.org/patch/436179/
I disabled dir_index. The block size on the FS is already 4 KB, so there
is nothing to be done on that front.
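
In case it helps someone else, roughly the steps I used to disable
dir_index (the device name is as in the syslog above; the mount point is
just an example):

  # unmount the metadata FS before changing features
  umount /mnt/beegfs-meta
  # clear the dir_index feature flag; lookups fall back to linear scans
  tune2fs -O ^dir_index /dev/sdb
  # let e2fsck verify the filesystem after the feature change
  e2fsck -f /dev/sdb
  mount /mnt/beegfs-meta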

Here is more info on the metadata fs:
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode filetype
needs_recovery extent 64bit flex_bg sparse_super large_file huge_file
uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1000084800
Block count: 221182000
Reserved block count: 0
Free blocks: 95941792
Free inodes: 1000051752
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Reserved GDT blocks: 1024
Blocks per group: 7240
Fragments per group: 7240
Inodes per group: 32736
Inode blocks per group: 4092
Flex block group size: 16
Filesystem created: Wed Aug 3 18:44:54 2016
Last mount time: Wed Aug 3 20:47:18 2016
Last write time: Thu Aug 11 08:39:02 2016
Mount count: 2
Maximum mount count: -1
Last checked: Wed Aug 3 18:44:54 2016
Check interval: 0 (<none>)
Lifetime writes: 481 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 512
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 41c3ee8f-d20f-4af6-90fe-db07f82d0c86
Journal backup: inode blocks

After disabling dir_index the metadata performance hit was pretty bad.
Data migration speed dropped from 150 MB/s to 30 MB/s and we see an
increase in the number of queued requests.

It seems to me that the only way to increase the size of the hash table
is to further increase the block size or change the FS to XFS. Changing
the FS might not help with such large directories.

Is there a general recommendation for this situation?

Thanks in advance
emir

Emir Imamagic

Aug 11, 2016, 4:44:20 AM
to fhgfs...@googlegroups.com
ehm, this was rather incorrect, given that 4k is the maximum block size I can get on x86. I guess the only way is to go with XFS or live with a ~5M file limit per directory?

I did not find any XFS tuning advice for metadata; is the default as good as it gets?

--
em...@iphone.hr

Sven Breuner

Sep 7, 2016, 1:11:45 PM
to fhgfs...@googlegroups.com, Emir Imamagic
Hi Emir,

Emir Imamagic wrote on 11.08.2016 10:44:
> ehm, this was rather incorrect, given that 4k is the maximum block size I can get on x86. I guess the only way is to go with XFS or live with a ~5M file limit per directory?
>
> I did not find any XFS tuning advice for metadata; is the default as good as it gets?

indeed, what you encountered is a really unfortunate thing with ext4, because
there actually is no clearly defined limit. Luckily, being in the order of
millions, like you said, the limit is generally high enough that it has no
relevance for most people. But since the limit is based on hashes of the
filenames, it can happen in a very large directory that creating a file
named "myfile1" fails (because that part of the hash table is full) while
creating the next file, named "myfile2", still succeeds.
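
To make that concrete, a purely illustrative sketch (the directory and
file names here are made up, not from a real trace):

  # in a directory that has already hit "Directory index full":
  touch bigdir/myfile1   # may fail with ENOSPC: its hash bucket is full
  touch bigdir/myfile2   # may still succeed: it hashes to another bucket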

Like you said, xfs handles such very large directories better. The
relevant tuning options here are more or less the generic ones: formatting
with 512 byte inodes to allow inlining of the BeeGFS metadata and mounting
with noatime,nodiratime(,nobarrier).
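
For example, a minimal sketch of such a setup (device and mount point
here are placeholders, not your actual ones):

  # format with 512-byte inodes so the BeeGFS metadata can be inlined
  mkfs.xfs -i size=512 /dev/sdb
  # mount without atime updates (and optionally without write barriers)
  mount -o noatime,nodiratime,nobarrier /dev/sdb /mnt/beegfs-meta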

Best regards,
Sven