I have one process on each client box that does all the writing
to the NFS server. The files are occasionally read, but most of
the I/O is writing. So, a few questions...
1) What parameters can we tune to maximize performance? Most web
pages I've seen care more about large files, but most of ours are
about 3-20K in size.
2) If we change the filesystem on the NFS server to ReiserFS, will
our performance improve, considering the kind of load we get?
3) (the one that mostly concerns me, the developer) The concern seems
to be the overhead caused by so many files making a new connection
each time. Is there some way I can have my one process keep a
connection open to the NFS server to reduce this overhead? Or am I
totally misunderstanding how NFS works? Is there anything I can do
within my program that copies files to the NFS server to improve
performance? Would it be better to scrap NFS altogether and have a
daemon on each box moving files? Somehow it doesn't seem like it.
Is there a web resource that describes NFS details in a way
that is useful for coders?
Thanks for any suggestions.
Brian Bebeau
Mycom Group, Inc.
>3) (the one that mostly concerns me, the developer) The concern seems
>to be the overhead caused by so many files making a new connection
>each time. Is there some way I can have my one process keep a
>connection open to the NFS server to reduce this overhead? Or am I
>totally misunderstanding how NFS works? Is there anything I can do
>within my program that copies files to the NFS server to improve
>performance? Would it be better to scrap NFS altogether and have a
>daemon on each box moving files? Somehow it doesn't seem like it.
Typically clients keep connections open; but what is hurting you most
likely is the latency for each of the file creates. With large files,
it's easy to stream writes (have many outstanding writes and still
send data); but with small files you need to wait for a response from
the server in order to obtain the filehandle for the new file.
An FS which is quicker in creating files could help.
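
One way a copying program might act on this, sketched in Python under assumptions: since each small-file create waits on a server round trip, several copies can be kept in flight at once so that the waits overlap. The name `copy_many`, the default of 8 workers, and the use of threads are illustrative choices, not something from this thread; whether it helps depends on the client and server.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_many(sources, dest_dir, workers=8):
    """Copy each path in `sources` into `dest_dir`, several at a time.

    Returns a list of (source, error) pairs for the copies that failed.
    """
    dest = Path(dest_dir)
    sources = [Path(s) for s in sources]

    def copy_one(src):
        # Each call blocks on the create/write round trips for one file;
        # running several in threads lets those waits overlap.
        try:
            shutil.copyfile(src, dest / src.name)
            return None
        except OSError as exc:
            return (src, exc)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(copy_one, sources))
    return [r for r in results if r is not None]
```

A call such as `copy_many(paths, "/mnt/nfs/incoming")` (a hypothetical mount point) returns an empty list when every copy succeeded.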
Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Use noatime. I'm a Sun/AIX guy, so I'm not sure if Linux has a caching FS;
on Solaris we use cachefsd, which really helps. Hope this helps.
There are tons of NFS tuning tips on the net; that's what I could think of
off the top of my head.
--
Unix Systems Engineer
The City of New York
Dept. of Information Technology
http://www.nyc.gov/doitt
rbrown[(@)]doitt.nyc.gov
http://www.rodrickbrown.com
Client nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 1992898   8% 3192494  13% 7611159  31% 22174     0% 0         0%
read         write        create       mkdir        symlink      mknod
1117      0% 5266596  21% 3192091  13% 247       0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
22        0% 53        0% 0         0% 0         0% 489       0% 0         0%
fsstat       fsinfo       pathconf     commit
14477     0% 14477     0% 0         0% 3197249  13%
From what I was able to find on the Internet, some of these are way
too high. I removed the chmod() from my code, so hopefully the setattr
calls will go down. The lookup calls at 31% concern me though. What
are the lookup calls doing? Will setting the "no_subtree_check" help
here? I presume it'll be somewhat high because I'm creating a lot of
files, but it seems it can be better. I also noticed most problems come
when I'm writing to directories with a huge number of files (~100000).
Is the lookup taking so much time because it has to read all those
directory entries to create a new one? Would it improve things if I
changed the directory structure so no one dir would have that many
files? This would increase the number of directories though.
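
To illustrate the kind of directory change being asked about, here is a Python sketch that spreads files over hashed subdirectories so that no single directory grows to ~100000 entries. The one-level layout, the 256 buckets, the CRC32 hash, and the names `sharded_path` and `write_sharded` are all assumptions made for the example.

```python
import os
import zlib


def sharded_path(root, filename, buckets=256):
    """Return root/<bucket>/filename, with the bucket derived from the name."""
    bucket = zlib.crc32(filename.encode("utf-8")) % buckets
    # Two hex digits name the bucket directory, e.g. "00" through "ff".
    return os.path.join(root, f"{bucket:02x}", filename)


def write_sharded(root, filename, data):
    """Write `data` (bytes) to the sharded location; return the path used."""
    path = sharded_path(root, filename)
    # Creates the bucket directory on first use; later calls find it present.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```

With 256 buckets, 100000 files would average roughly 390 per directory. Any reader of the files would need the same function to locate them.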
Also, the retrans number is awfully high. It's typically about 13% for
any of the clients. What would help cut down the number of retransmits?
Additional help from Casper or anybody is greatly appreciated.
> An FS which is quicker in creating files could help.
>
When we move the NFS server to a new box, we'll probably switch
to ReiserFS, since it's supposed to handle lots of small files
better.
--
Brian Bebeau
Mycom Group, Inc.
> Will setting the "no_subtree_check" help here?
I would advise setting it anyway.
I got data corruption when accessing files using mmap() through NFS
without this flag.
--
DINH V. Hoa,
"monde de merde" -- Erwan David
>Client nfs v3:
>null         getattr      setattr      lookup       access       readlink
>0         0% 1992898   8% 3192494  13% 7611159  31% 22174     0% 0         0%
>read         write        create       mkdir        symlink      mknod
>1117      0% 5266596  21% 3192091  13% 247       0% 0         0% 0         0%
>remove       rmdir        rename       link         readdir      readdirplus
>22        0% 53        0% 0         0% 0         0% 489       0% 0         0%
>fsstat       fsinfo       pathconf     commit
>14477     0% 14477     0% 0         0% 3197249  13%
This indeed looks like create-bound traffic: about 1.65 writes for each
create. Removing the chmod, as you point out, should shave a bit off your
latency, I'd think. You should look at the traffic to see exactly what is
happening; are you checking whether the file exists before
you create it?
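
The per-create ratios can be read straight off the quoted counters. A small Python sketch of that arithmetic, using only the numbers from the nfsstat output above (the dictionary and variable names are just for illustration):

```python
# Call counts copied from the client nfsstat output quoted in this thread.
# Operations reported as zero are omitted.
calls = {
    "getattr": 1992898, "setattr": 3192494, "lookup": 7611159,
    "access": 22174, "read": 1117, "write": 5266596,
    "create": 3192091, "mkdir": 247, "remove": 22, "rmdir": 53,
    "readdir": 489, "fsstat": 14477, "fsinfo": 14477, "commit": 3197249,
}

total = sum(calls.values())
creates = calls["create"]

for op in ("lookup", "write", "setattr", "commit"):
    per_create = calls[op] / creates
    share = 100.0 * calls[op] / total
    print(f"{op:8s} {per_create:5.2f} per create, {share:4.1f}% of all calls")
```

This works out to roughly 2.4 lookups and 1.65 writes per create, with setattr and commit each very close to one per create.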
> From what I was able to find on the Internet, some of these are way
>too high. I removed the chmod() from my code, so hopefully the setattr
>calls will go down. The lookup calls at 31% concern me though. What
>are the lookup calls doing? Will setting the "no_subtree_check" help
>here? I presume it'll be somewhat high because I'm creating a lot of
>files, but it seems it can be better. I also noticed most problems come
>when I'm writing to directories with a huge number of files (~100000).
>Is the lookup taking so much time because it has to read all those
>directory entries to create a new one? Would it improve things if I
>changed the directory structure so no one dir would have that many
>files? This would increase the number of directories though.
Large directories may themselves be a performance problem; but the
server will figure out whether a file already exists and the client
doesn't scan the directory.
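
In line with letting the server decide whether the file exists, a hedged Python sketch: instead of testing for existence, creating the file, and then calling chmod(), a program can request an exclusive create with the wanted mode in a single open call. The helper name `create_new` and the 0o644 mode are illustrative assumptions; how many RPCs this actually saves depends on the client, and exclusive-create behaviour over NFS depends on the protocol version and implementation.

```python
import os


def create_new(path, data, mode=0o644):
    """Create `path` containing `data` (bytes); return False if it already exists."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    try:
        # The requested mode is passed at create time (subject to the umask),
        # so no separate chmod() call is needed afterwards.
        fd = os.open(path, flags, mode)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return True
```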
>Also, the retrans number is awfully high. It's typically about 13% for
>any of the clients. What would help cut down the number of retransmits?
>Additional help from Casper or anybody is greatly appreciated.
Retransmits could be cut down by setting the timeouts higher.
(This may point to creates taking a long time, as the percentage is nearly
the same as the percentage of creates.)