Migrate large amount of small files

Ckcheng

Oct 10, 2009, 11:46:52 AM
to freebsd-performance
Hi all,
Currently, I have a directory with over 5M small files (1~32K each). Now
I want to transfer this directory to another machine, and I have found
that it is an extremely slow and painful process. I tried the following
methods:

1. rsync
2. tar via ssh
3. tar via nc
(all take hours and hours to finish)

None of them gives me a reasonable migration time, so I'm here
asking for help. Any suggestions are very welcome. Thank you.

Btw, here is brief information about my servers (both machines are the same):

OS: FreeBSD 6.4-Stable 64Bit
CPU: 2 x Xeon L5420 2.50GHz
RAM: 2 x 2G ECC DDR2-667 (fully buffered)
DISK: Seagate Barracuda ES 16MB (SATA 300)
Network: 1Gbps (Broadcom BCM5708)
Filesystem: UFS2

Regards,

Francisco Reyes

Oct 11, 2009, 4:50:10 PM
to net....@m2k.com.tw, freebsd-performance
Ckcheng writes:

> 1. rsync
> 2. tar ..

If this is a migration, I find that the best option is likely to tar to the
local machine, copy the archive over, restore it, and then run rsync.

In my experience copying lots of small files is going to take a long time,
no matter which method you use.

Of all the combinations I have tried in the past, tar first and then rsync
worked best for me.
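
The tar-first, rsync-second sequence described above could look roughly like the following. This is a minimal sketch, not the exact commands used by anyone in this thread: the function names (`pack_tree`, `unpack_tree`, `catch_up`), the host `newhost`, and all paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names, host name, and paths are assumptions.

# pack_tree SRC_DIR ARCHIVE
# Write the whole tree into one tar file, so the network copy is a
# single large sequential transfer instead of millions of small ones.
pack_tree() {
    tar -cf "$2" -C "$1" .
}

# unpack_tree ARCHIVE DEST_DIR
# Restore the archive on the receiving machine.
unpack_tree() {
    mkdir -p "$2" && tar -xf "$1" -C "$2"
}

# catch_up SRC_DIR DEST
# Final rsync pass to pick up files that changed after the archive
# was made. DEST may be a local path or host:path.
catch_up() {
    rsync -a "$1"/ "$2"/
}

# Example run (hypothetical host and paths):
#   pack_tree /data/files /scratch/files.tar
#   scp /scratch/files.tar newhost:/scratch/
#   ssh newhost 'mkdir -p /data/files && tar -xf /scratch/files.tar -C /data/files'
#   catch_up /data/files newhost:/data/files
```

The archive should live outside the source directory so that tar does not try to include it in itself.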

Mike Tancsa

Oct 11, 2009, 5:07:08 PM
to net....@m2k.com.tw, freebsd-performance
At 11:46 AM 10/10/2009, Ckcheng wrote:
>Hi all,
>Currently, I have a directory with over 5M small files (1~32K). Now,
>I want to transfer this directory to another machine and found
>that it's extremely slow and painful process. I tried the following
>method:

It might help to mount all file systems with -o noatime, and to bump
vfs.ufs.dirhash_maxmem up to 4x the default size.

---Mike
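
As a rough illustration of the two tunings suggested above, the following sketch reads the current vfs.ufs.dirhash_maxmem value, sets it to four times that, and remounts one file system with noatime. The function name `tune_ufs` and the mount point `/data` are hypothetical, and the sketch assumes the sysctl still holds its default value when it is run.

```shell
#!/bin/sh
# Sketch only: tune_ufs and the mount point are assumptions.
# Must be run as root on FreeBSD.

# tune_ufs MOUNT_POINT
tune_ufs() {
    # Read the current dirhash memory limit in bytes
    # (assumed here to still be the default).
    cur=$(sysctl -n vfs.ufs.dirhash_maxmem) || return 1
    # Raise the limit to 4x that value.
    sysctl vfs.ufs.dirhash_maxmem=$((cur * 4))
    # Update the already-mounted file system to stop recording access times.
    mount -u -o noatime "$1"
}

# Example (hypothetical mount point):
#   tune_ufs /data
```

A sysctl set this way does not survive a reboot; for a one-off migration that is probably acceptable.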

--------------------------------------------------------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mi...@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike

Jin Guojun

Oct 11, 2009, 6:52:17 PM
to net....@m2k.com.tw, freebsd-performance, Mike Tancsa
The major factor affecting performance is the number of files per directory.
If I have the right impression, there are 5M files in just one directory. If this is true, then that is the problem.

The best way to avoid the performance penalty is to sort the files into differently named directories.
Each directory should contain about 1K files or fewer. A carefully tuned file system may still perform well with a directory containing a few thousand files, but no more.

If the files are created automatically by some state machine and you do not have a good sorting algorithm for directory naming, use year-month-week for automatic directory naming.
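
To illustrate the idea of capping each directory at about 1K files, here is a minimal sketch that moves the files of one flat directory into numbered subdirectories. It uses a simple counter for the directory names rather than the year-month-week scheme, because extracting file dates portably in shell depends on the platform's `stat` or `date`. The function name `shard_dir` and the `d0`, `d1`, ... naming are hypothetical.

```shell
#!/bin/sh
# Sketch only: shard_dir and the dN directory names are assumptions.
# File names containing newlines are not handled.

# shard_dir DIR [FILES_PER_SUBDIR]
shard_dir() {
    src=$1
    per=${2:-1000}
    # Snapshot the file list first, so the directory is not being
    # scanned while its entries are moved.
    list=$(mktemp) || return 1
    find "$src" -maxdepth 1 -type f > "$list"
    n=0
    while IFS= read -r f; do
        # Files 0..per-1 go to d0, the next batch to d1, and so on.
        bucket="$src/d$((n / per))"
        [ -d "$bucket" ] || mkdir "$bucket"
        mv "$f" "$bucket/"
        n=$((n + 1))
    done < "$list"
    rm -f "$list"
}

# Example (hypothetical path):
#   shard_dir /data/files 1000
```

With 5M files and 1000 files per subdirectory, this would leave about 5000 subdirectories in the top-level directory.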
