NFS on 10G interfaces still painfully slow

Gerrit Kühn

Aug 2, 2016, 4:55:27 AM

Hi all,

I already reported this issue here a year ago and unfortunately was not
able to fix it back then. Now I had another run at it, using two recent
10.3-machines with a direct 10G link. I still see nfs is painfully
slow (around 20-80MB/s). I tried both nfsv3 and nfsv4, with almost the same
results. Everything I tried so far (mtu size, wcommitsize, readahead...)
only makes things worse or at least not much better.
Moving data in different ways (scp, ggate) is much faster, so plain
network speed should not be an issue.

Is there anyone around here who can confirm that nfs can go faster over
10G links?
Any hints for further tuning/debugging are greatly appreciated.


cu
Gerrit
_______________________________________________
freeb...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net...@freebsd.org"

Borja Marcos

Aug 2, 2016, 5:00:51 AM

> On 02 Aug 2016, at 10:49, Gerrit Kühn <gerrit...@aei.mpg.de> wrote:
>
> Is there anyone around here who can confirm that nfs can go faster over
> 10G links?
> Any hints for further tuning/debugging are greatly appreciated.

Can you show us ifconfig output, please?


Borja.

Alan Somers

Aug 2, 2016, 10:45:29 AM

On Tue, Aug 2, 2016 at 2:49 AM, Gerrit Kühn <gerrit...@aei.mpg.de> wrote:
> Hi all,
>
> I already reported this issue here a year ago and unfortunately was not
> able to fix it back then. Now I had another run at it, using two recent
> 10.3-machines with a direct 10G link. I still see nfs is painfully
> slow (around 20-80MB/s). I tried both nfsv3 and nfsv4, with almost the same
> results. Everything I tried so far (mtu size, wcommitsize, readahead...)
> only makes things worse or at least not much better.
> Moving data in different ways (scp, ggate) is much faster, so plain
> network speed should not be an issue.
>
> Is there anyone around here who can confirm that nfs can go faster over
> 10G links?
> Any hints for further tuning/debugging are greatly appreciated.
>
>
> cu
> Gerrit


I can get 1GB/s over NFS on a 10G link, so it's not always slow.
There's probably something about your setup that's slowing it down.
What is your NFS client? If Linux, make sure that you're using the
"async" mount option instead of "sync". What benchmark are you using
to measure that speed? Did you remember to start lockd and statd? If
you post your /etc/exports and the client's /etc/fstab, that might
reveal something.

Rick Macklem

Aug 2, 2016, 7:14:28 PM

Alan Somers wrote:

>On Tue, Aug 2, 2016 at 2:49 AM, Gerrit Kühn <gerrit...@aei.mpg.de> wrote:
>> Hi all,
>>
>> I already reported this issue here a year ago and unfortunately was not
>> able to fix it back then. Now I had another run at it, using two recent
>> 10.3-machines with a direct 10G link. I still see nfs is painfully
>> slow (around 20-80MB/s). I tried both nfsv3 and nfsv4, with almost the same
>> results. Everything I tried so far (mtu size, wcommitsize, readahead...)
>> only makes things worse or at least not much better.
>> Moving data in different ways (scp, ggate) is much faster, so plain
>> network speed should not be an issue.
>>
>> Is there anyone around here who can confirm that nfs can go faster over
>> 10G links?
>> Any hints for further tuning/debugging are greatly appreciated.
>>
I can't help much, but a couple of things you can try:
- Disable TSO
- Turn off/reduce interrupt moderation on the net interface. (NFS perf.
depends on response time and anything that delays interrupt servicing
will slow it down.)
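
For the first suggestion, a minimal sketch of what disabling TSO could look
like on FreeBSD. The interface name ix0 is an assumption (the thread never
names the NIC), and the run helper exists only for illustration: it prints
each command and executes it only when RUN=1, since the real command needs
root. Interrupt moderation is left out because its knob depends on the
driver.

```sh
#!/bin/sh
# Interface name is an assumption; the thread does not name the NIC.
IFACE="${IFACE:-ix0}"

# Print a command; execute it only when RUN=1 (requires root on FreeBSD).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Disable TCP segmentation offload on one interface.
# The change does not persist across reboots.
disable_tso() {
    run ifconfig "$1" -tso
}

disable_tso "$IFACE"
```

Running plain ifconfig on the interface afterwards should show whether the
TSO flags are gone from its options line.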

Good luck with it, rick


Gerrit Kühn

Aug 3, 2016, 2:00:17 AM

On Tue, 2 Aug 2016 08:45:07 -0600 Alan Somers <aso...@freebsd.org> wrote
about Re: NFS on 10G interfaces still painfully slow:

> > Is there anyone around here who can confirm that nfs can go faster
> > over 10G links?
> > Any hints for further tuning/debugging are greatly appreciated.

AS> I can get 1GB/s over NFS on a 10G link, so it's not always slow.
AS> There's probably something about your setup that's slowing it down.
AS> What is your NFS client?

This time, FreeBSD 10.3 on both client and server (to make debugging
easier).

AS> What benchmark are you using to measure that speed?

Right now only very simple things like using dd from /dev/zero or copying
large files. In my experience it is useless to go for more sophisticated
benchmarks, if these simple things already don't work as expected.

AS> Did you remember to start lockd and statd?

Yes.

AS> If you post your /etc/exports and the client's /etc/fstab, that might
AS> reveal something.

exports on the server side:

V4: /mt-rear -sec=sys 192.168.1.11
/mt-right 192.168.1.11 -maproot=root
/mt-rear -maproot=root 192.168.1.11
/mt-left 192.168.1.11 -maproot=root


fstab on the client does not tell you anything, I still use commandline
mounts during testing. This is what nfsstat -m will tell (V4 is not
mounted right now):

tom:/mt-rear on /net/mt-rear
nfsv3,tcp,resvport,hard,cto,lockd,rdirplus,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=4,wcommitsize=50000000,timeout=120,retrans=2
tom:/mt-right on /net/mt-right
nfsv3,tcp,resvport,hard,cto,lockd,rdirplus,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2
tom:/mt-left on /net/mt-left
nfsv3,tcp,resvport,hard,cto,lockd,rdirplus,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2


This is what, e.g., dd gives:

root@crest:~ # dd if=/dev/zero of=/net/mt-rear/Z bs=1024k count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 1.403620 secs (747051194 bytes/sec)

root@crest:~ # dd if=/dev/zero of=/net/mt-right/Z bs=1024k count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 1.380546 secs (759537249 bytes/sec)


And yes (before that question pops up :-), I'm using zfs on the server
side, but I disabled syncing for testing purposes.
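
For reference, a sketch of how the ZFS sync setting mentioned here can be
inspected and toggled. The dataset name tank/mt-rear is hypothetical (only
the mount point /mt-rear appears in this thread), and the run helper just
prints the commands unless RUN=1. With sync=disabled the server may
acknowledge writes before they reach stable storage, so a crash can lose
recently written data; that makes it a test setting, with sync=standard as
the default to return to.

```sh
#!/bin/sh
# Dataset name is an assumption; only the mount point /mt-rear is known.
DATASET="${DATASET:-tank/mt-rear}"

# Print a command; execute it only when RUN=1 (requires root and ZFS).
run() { echo "+ $*"; if [ "${RUN:-0}" = "1" ]; then "$@"; fi; }

# Show the current value (standard, always, or disabled).
show_sync()    { run zfs get sync "$1"; }
# Treat synchronous write requests as asynchronous (testing only).
disable_sync() { run zfs set sync=disabled "$1"; }
# Go back to the default behaviour.
restore_sync() { run zfs set sync=standard "$1"; }

show_sync "$DATASET"
disable_sync "$DATASET"
restore_sync "$DATASET"
```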


cu
Gerrit

Gerrit Kühn

Aug 3, 2016, 2:55:23 AM

On Wed, 3 Aug 2016 07:59:54 +0200 Gerrit Kühn <gerrit...@aei.mpg.de>
wrote about Re: NFS on 10G interfaces still painfully slow:

GK> 1048576000 bytes transferred in 1.403620 secs (747051194 bytes/sec)

GK> 1048576000 bytes transferred in 1.380546 secs (759537249 bytes/sec)


Argh! ;-)
Obviously, I cannot read and miscounted the number of digits. I should
have read the seconds instead.
750MB/s is still not the maximum, but totally fine for my purposes here.
Reverting step-by-step all the things I tried yesterday, it looks like
disabling the sync on the zfs volume gave the biggest performance boost,
and I simply overlooked the extra digit when it went from 25MB/s or so to
750MB/s. Sorry for the noise...
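
To avoid counting digits by eye, the bytes/sec figure that dd prints can be
converted with a short awk call. This is only a convenience sketch: the
function name to_mbs is made up for illustration, MB means 10^6 bytes and
Gbit means 10^9 bits.

```sh
#!/bin/sh
# Convert a bytes/sec figure (as printed by dd) to MB/s and Gbit/s.
# to_mbs is a hypothetical helper name; MB = 10^6 bytes, Gbit = 10^9 bits.
to_mbs() {
    awk -v bps="$1" 'BEGIN {
        printf "%.1f MB/s (%.2f Gbit/s)\n", bps / 1e6, bps * 8 / 1e9
    }'
}

to_mbs 747051194   # first dd run quoted above
to_mbs 759537249   # second dd run quoted above
```

For the two runs above this comes out at roughly 750 MB/s, or about
6 Gbit/s of the 10 Gbit/s link.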

Valeri Galtsev

Aug 3, 2016, 10:21:43 AM

On Wed, August 3, 2016 12:59 am, Gerrit Kühn wrote:
> On Tue, 2 Aug 2016 08:45:07 -0600 Alan Somers <aso...@freebsd.org> wrote
> about Re: NFS on 10G interfaces still painfully slow:
>
>> > Is there anyone around here who can confirm that nfs can go faster
>> > over 10G links?
>> > Any hints for further tuning/debugging are greatly appreciated.
>
> AS> I can get 1GB/s over NFS on a 10G link, so it's not always slow.
> AS> There's probably something about your setup that's slowing it down.
> AS> What is your NFS client?
>
> This time, FreeBSD 10.3 on both client and server (to make debugging
> easier).
>
> AS> What benchmark are you using to measure that speed?
>
> Right now only very simple things like using dd from /dev/zero or copying
> large files.

When I had trouble with dd (data throughput too slow, never mind on what
medium), the cause was not specifying bs=[big number] in the dd command. It
turned out that without it dd sends data down the pipe in very small chunks
and won't send the next chunk until the previous one has been acknowledged.
Imagine a hard drive that writes 4 kB at a time being sent 1 byte at a
time, or a RAID expecting to write 8 x 64 kB stripes in parallel being sent
1 byte at a time. In your case this [big number] should ideally be equal to
or slightly smaller than the jumbo packet size of your network connection.

Just speculating (I remembered a really trivial thing _I_ hit myself ;-)
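
The block-size effect can be tried without any network. A minimal sketch,
assuming only a writable temp file: both functions write the same 1 MiB,
once in 512-byte blocks (dd's default when bs is not given) and once as a
single large block. TARGET is a placeholder; pointing it at a file on the
NFS mount and watching the rate dd reports is what would show the
difference there. Over NFS the mount's wsize probably matters at least as
much as the MTU, so this is an illustration rather than a tuning recipe.

```sh
#!/bin/sh
# Write the same 1 MiB twice: many small blocks vs. one large block.
# TARGET is a placeholder; set it to a file on the NFS mount to compare there.
TARGET="${TARGET:-$(mktemp)}"

# 2048 blocks of 512 bytes (512 is dd's default block size).
write_small() { dd if=/dev/zero of="$1" bs=512 count=2048; }
# One block of 1 MiB.
write_large() { dd if=/dev/zero of="$1" bs=1024k count=1; }

write_small "$TARGET"
write_large "$TARGET"
rm -f "$TARGET"
```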

Thanks.
Valeri


++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++