Faster tree walk

551 views
Skip to first unread message

Pierre Neidhardt

unread,
Oct 21, 2015, 9:43:44 AM10/21/15
to golang-nuts
Hi!

I have a small performance challenge for the go nuts out there!

Goal: store in memory (e.g. slice) all regular files from a folder recursively,
do not follow symlinks. Printing the result should output something similar to
`find $1 -type f`. Storage and output need not be sorted. Bonus if
non-concurrent.

I noticed a significant (approx. 2x) performance difference on Linux amd64
between my quick-and-dirty C implementation:

https://gist.github.com/Ambrevar/8a9f3cb2a22c2b67b86b

and my (naive?) Go implementation:

http://play.golang.org/p/TPrXUXBMYs

Benchmarks are done on hot cache and sane hardware (SSD).

I tried a few different approaches:

* Unwrapping the abstraction by using mostly syscalls did help a
little (~10-20% better), but still far from optimal. Besides, I lose
portability. (Well, the C version is not the most portable either.)

* github.com/kr/fs: nice, but still too slow.

* github.com/mirtchovski/walk: not much faster.

* github.com/MichaelTJones/walk is reaching C performance when GOMAXPROCS >= 2, but still twice as slow when GOMAXPROCS==1.

* I did not spot any obvious bottleneck from profiling.

Any clue how to make this faster?

Cheers!


--
Pierre Neidhardt

Vitamin C deficiency is apauling.

Manlio Perillo

unread,
Oct 21, 2015, 10:26:32 AM10/21/15
to golang-nuts
Il giorno mercoledì 21 ottobre 2015 15:43:44 UTC+2, Pierre Neidhardt ha scritto:
Hi!

I have a small performance challenge for the go nuts out there!

Goal: store in memory (e.g. slice) all regular files from a folder recursively,
do not follow symlinks. Printing the result should output something similar to
`find $1 -type f`. Storage and output need not be sorted. Bonus if
non-concurrent.


Readdir implementation in golang stdlib is *very* inefficient.
The implementation on Linux use the efficient getdents64 syscall, but the file stat [1] informations are *discarded*, and computed again using
lstat (one syscall for each file).

On Windows the implementation is more efficient.

The solution is to reimplement Readdir, returning something like a DirEntInfo interface, containing only the minimal data you need for directory traversal.





Regards  Manlio

andrey mirtchovski

unread,
Oct 21, 2015, 11:45:17 AM10/21/15
to Pierre Neidhardt, golang-nuts
> * github.com/mirtchovski/walk: not much faster.

Just a note: speed is not the primary goal of this code, but it will
allow you to navigate and discover files and directories with paths
deeper than PATH_MAX for that respective OS. Still, I'm happy that it
performs not unreasonably :)

andrey

Pierre Neidhardt

unread,
Oct 26, 2015, 11:47:10 AM10/26/15
to golang-nuts
Thanks for the hint! I wasn't aware of this Linux / *BSD specific optimization. Indeed, storing the file type in the dirent will save a syscall to lstat.
I've updated both my C and my Go implementations according to this:

Result: the Go implementation is now 10% close!
As a side note, I've been using GNU's malloc function family for memory allocation, and the C implementation uses approx. 20 times as much memory.
I assume the 10% goes in a performance / memory trade-off. But I won't test any further, performance is good enough for me now.

Thank you very much!

Manlio Perillo

unread,
Oct 26, 2015, 12:25:10 PM10/26/15
to golang-nuts
Il giorno lunedì 26 ottobre 2015 16:47:10 UTC+1, Pierre Neidhardt ha scritto:
Thanks for the hint! I wasn't aware of this Linux / *BSD specific optimization. Indeed, storing the file type in the dirent will save a syscall to lstat.
I've updated both my C and my Go implementations according to this:

Result: the Go implementation is now 10% close!

Note that, according to Linux documentation, the length of the file name is know, so there should be no need to use the clen function.
I don't know why the Go implementation explicitly search for string terminator.


  char d_name[]; /* Filename (null-terminated) */ /* length is actually (d_reclen - 2 - offsetof(struct linux_dirent, d_name)) */


Regards   Manlio

Pierre Neidhardt

unread,
Oct 26, 2015, 2:26:28 PM10/26/15
to Manlio Perillo, golang-nuts
The filename length is known for the original getdents() function, not for the
modern getdents64() which Go uses. The latter function is for modern (larger)
file systems. Read on the man page for more details.

As a result, you need to use `clen` to compute the exact length of the string.
Note that `reclen - offsetof(d_name) - 8` will give you the first multiple of 8
below strlen(d_name). As such it is possible to short-circuit clen.

See my last playground: http://play.golang.org/p/Q3s3J7JXNk

No big optimization here, this would only be faster with a lot of long
filenames. This makes the code a bit harder to read for not much gain, so I'm
not sure if this is worth comitting upstream.

--
Pierre Neidhardt

The idle man does not know what it is to enjoy rest.
Reply all
Reply to author
Forward
0 new messages