Hello All,
I have users working with a mix of large and small files for bioinformatics work stored on a BeeGFS system. I've found that the number of small files has a larger influence on performance and data movement (especially backups).
Is there a correlation between the number of chunk files and the number of actual files a user has stored? I was looking to limit chunk files and to work with users who tend to save millions of small output files, encouraging them to combine results into larger files instead. Or, since chunk files are distributed across the cluster, is it possible that a user ends up with many chunk files simply because large files are split across targets? My system is set up with a chunksize of 512K and a RAID0 stripe pattern with 4 desired storage targets.
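
My rough mental model (please correct me if I have this wrong) is that each file gets one chunk file per storage target that actually holds data, capped by the stripe pattern's numtargets. The small Python sketch below only illustrates that assumption with my chunksize and numtargets; the function name is just for illustration, not anything BeeGFS provides:

    import math

    def chunk_files_per_file(size_bytes, chunksize=512 * 1024, numtargets=4):
        # Assumed model: one chunk file per target that holds data,
        # capped at the stripe pattern's numtargets.
        if size_bytes <= chunksize:
            return 1
        return min(math.ceil(size_bytes / chunksize), numtargets)

    print(chunk_files_per_file(10 * 1024))       # small output file -> 1 chunk file
    print(chunk_files_per_file(100 * 1024**2))   # 100 MiB file -> 4 chunk files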
For example, I have a user with a storage size of only 2.1TiB but 616944811 chunk files.
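
Under that same assumption, here is my back-of-envelope estimate for that user (just a sketch, numbers rounded):

    chunk_files = 616_944_811
    stored_bytes = 2.1 * 2**40        # 2.1 TiB
    numtargets = 4

    # Lower bound on actual files: every file striped across all 4 targets.
    # Upper bound: every file smaller than one chunk (1 chunk file each).
    min_files = chunk_files // numtargets   # ~154 million files
    max_files = chunk_files                 # ~617 million files

    print(stored_bytes / min_files / 1024)  # ~14.6 KiB average file size
    print(stored_bytes / max_files / 1024)  # ~3.7 KiB average file size

Either way that suggests an enormous number of very small files, which would line up with what I'm seeing during backups.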
Regards,
Robert E. Anderson
University of NH / Research Computing Center / Data Center Operations