Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silently de-supported nfsv3 UDP mounts

Hauke Fath

Jul 4, 2023, 12:50:03 PM
Package: src:linux
Version: 5.10.70-1
Severity: important

Dear Maintainer,

*** Reporter, please consider answering these questions, where appropriate ***

* What led up to the situation?

Configuring this machine to mount nfs shares

* What exactly did you do (or not do) that was effective (or
ineffective)?

I set up the automounter with the standard mount options of our Arch
clients:

(auto.master)
/misc /etc/auto.misc -nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard

When the mount failed, I repeated it manually.

* What was the outcome of this action?

The mount failed:

# mount -t nfs -vvv -o nfsvers=3,proto=udp,retrans=5,rsize=16384,wsize=16384,rw,hard <fileserver>:/u/pkgsrc /mnt
mount.nfs: timeout set for Tue Jul 4 18:23:55 2023
mount.nfs: trying text-based options 'nfsvers=3,proto=udp,retrans=5,rsize=16384,wsize=16384,hard,addr=<redacted>'
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: trying 130.83.197.22 prog 100003 vers 3 prot UDP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 130.83.197.22 prog 100005 vers 3 prot UDP port 701
mount.nfs: mount(2): Invalid argument
mount.nfs: an incorrect mount option was specified
#

* What outcome did you expect instead?

A successful nfs share mount.

Instead, <https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1964093> pointed me to

# grep "NFS.*UDP" /boot/config-5.10.0-9-amd64
CONFIG_NFS_DISABLE_UDP_SUPPORT=y
#

which disables UDP support for nfsv3 in the kernel.
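
A minimal sketch of the same check as a reusable shell function, in case it has to run on several clients. The function name `nfs_udp_disabled` is made up for illustration; the config path follows the `/boot/config-<release>` layout seen in the grep above.

```shell
#!/bin/sh
# nfs_udp_disabled CONFIG_FILE  (hypothetical helper, not part of any package)
# Succeeds (exit status 0) if the given kernel config compiles out NFS over UDP.
nfs_udp_disabled() {
    grep -q '^CONFIG_NFS_DISABLE_UDP_SUPPORT=y$' "$1"
}

# Example: check the config of the running kernel.
# if nfs_udp_disabled "/boot/config-$(uname -r)"; then
#     echo "this kernel cannot mount NFS over UDP"
# fi
```

Given the grep output above, `nfs_udp_disabled /boot/config-5.10.0-9-amd64` should succeed on the machine from this report.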

This upstream decision
<https://www.spinics.net/lists/linux-nfs/msg74889.html> is more than
debatable -- we have been running nfsv3 over UDP for ~20 years here
without ever seeing the data corruption that was claimed as
motivation.

We run nfs through a router (several client subnets accessing servers
in an internal server subnet), and found nfs over udp a lot more
robust in the face of router reboots.


*** End of the template - remove these template lines ***


-- Package-specific info:
** Version:
Linux version 5.10.0-9-amd64 (debian...@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.70-1 (2021-09-30)

** Command line:
root=UUID=67153f14-9542-47ac-9643-c3be9ed3e33c ro initrd=/install/initrd.gz quiet

** Not tainted

** Kernel log:
[ 9.405659] systemd[1]: Queued start job for default target Graphical Interface.
[ 9.406611] random: systemd: uninitialized urandom read (16 bytes read)
[ 9.409768] systemd[1]: Created slice system-getty.slice.
[ 9.409969] random: systemd: uninitialized urandom read (16 bytes read)
[ 9.410579] systemd[1]: Created slice system-modprobe.slice.
[ 9.411370] systemd[1]: Created slice system-serial\x2dgetty.slice.
[ 9.412046] systemd[1]: Created slice system-systemd\x2dfsck.slice.
[ 9.412681] systemd[1]: Created slice User and Session Slice.
[ 9.412965] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[ 9.413483] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[ 9.413723] systemd[1]: Reached target User and Group Name Lookups.
[ 9.413828] systemd[1]: Reached target Slices.
[ 9.413924] systemd[1]: Reached target Swap.
[ 9.414209] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[ 9.414533] systemd[1]: Listening on LVM2 poll daemon socket.
[ 9.432622] systemd[1]: Listening on RPCbind Server Activation Socket.
[ 9.434023] systemd[1]: Listening on Syslog Socket.
[ 9.434436] systemd[1]: Listening on fsck to fsckd communication Socket.
[ 9.434676] systemd[1]: Listening on initctl Compatibility Named Pipe.
[ 9.435329] systemd[1]: Listening on Journal Audit Socket.
[ 9.435744] systemd[1]: Listening on Journal Socket (/dev/log).
[ 9.436278] systemd[1]: Listening on Journal Socket.
[ 9.437130] systemd[1]: Listening on udev Control Socket.
[ 9.437505] systemd[1]: Listening on udev Kernel Socket.
[ 9.438066] systemd[1]: Condition check resulted in Huge Pages File System being skipped.
[ 9.441114] systemd[1]: Mounting POSIX Message Queue File System...
[ 9.444148] systemd[1]: Mounting RPC Pipe File System...
[ 9.447699] systemd[1]: Mounting Kernel Debug File System...
[ 9.451315] systemd[1]: Mounting Kernel Trace File System...
[ 9.451885] systemd[1]: Condition check resulted in Kernel Module supporting RPCSEC_GSS being skipped.
[ 9.452326] systemd[1]: Finished Availability of block devices.
[ 9.455957] systemd[1]: Starting Set the console keyboard layout...
[ 9.459351] systemd[1]: Starting Create list of static device nodes for the current kernel...
[ 9.462868] systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
[ 9.466545] systemd[1]: Starting Load Kernel Module configfs...
[ 9.470181] systemd[1]: Starting Load Kernel Module drm...
[ 9.473529] systemd[1]: Starting Load Kernel Module fuse...
[ 9.485932] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
[ 9.488823] systemd[1]: Starting File System Check on Root Device...
[ 9.496264] systemd[1]: Starting Journal Service...
[ 9.500754] systemd[1]: Starting Load Kernel Modules...
[ 9.502155] fuse: init (API version 7.32)
[ 9.504365] systemd[1]: Starting Coldplug All udev Devices...
[ 9.510817] systemd[1]: Mounted POSIX Message Queue File System.
[ 9.511255] systemd[1]: Mounted Kernel Debug File System.
[ 9.511668] systemd[1]: Mounted Kernel Trace File System.
[ 9.512976] systemd[1]: Finished Create list of static device nodes for the current kernel.
[ 9.514353] systemd[1]: modp...@configfs.service: Succeeded.
[ 9.515187] systemd[1]: Finished Load Kernel Module configfs.
[ 9.516356] systemd[1]: modp...@fuse.service: Succeeded.
[ 9.517146] systemd[1]: Finished Load Kernel Module fuse.
[ 9.519304] random: crng init done
[ 9.519307] random: 7 urandom warning(s) missed due to ratelimiting
[ 9.522365] systemd[1]: Mounting FUSE Control File System...
[ 9.526318] systemd[1]: Mounting Kernel Configuration File System...
[ 9.530128] systemd[1]: Started File System Check Daemon to report status.
[ 9.534347] systemd[1]: Mounted FUSE Control File System.
[ 9.534709] systemd[1]: Condition check resulted in VMware vmblock fuse mount being skipped.
[ 9.539153] systemd[1]: Mounted Kernel Configuration File System.
[ 9.544558] RPC: Registered named UNIX socket transport module.
[ 9.544560] RPC: Registered udp transport module.
[ 9.544561] RPC: Registered tcp transport module.
[ 9.544563] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 9.547352] systemd[1]: Mounted RPC Pipe File System.
[ 9.548196] systemd[1]: modp...@drm.service: Succeeded.
[ 9.548840] systemd[1]: Finished Load Kernel Module drm.
[ 9.559388] systemd[1]: Finished File System Check on Root Device.
[ 9.560922] lp: driver loaded but no devices found
[ 9.565527] systemd[1]: Starting Remount Root and Kernel File Systems...
[ 9.570755] ppdev: user-space parallel port driver
[ 9.574973] systemd[1]: Finished Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
[ 9.595317] EXT4-fs (xvdb): re-mounted. Opts: errors=remount-ro
[ 9.598147] systemd[1]: Started Journal Service.
[ 9.622547] systemd-journald[240]: Received client request to flush runtime journal.
[ 10.017534] input: PC Speaker as /devices/platform/pcspkr/input/input0
[ 10.163923] EXT4-fs (xvda): mounting ext2 file system using the ext4 subsystem
[ 10.171102] EXT4-fs (xvda): mounted filesystem without journal. Opts: (null)
[ 10.171112] ext2 filesystem being mounted at /boot supports timestamps until 2038 (0x7fffffff)
[ 10.261124] cryptd: max_cpu_qlen set to 1000
[ 10.264953] audit: type=1400 audit(1688149096.407:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=341 comm="apparmor_parser"
[ 10.266065] audit: type=1400 audit(1688149096.407:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=339 comm="apparmor_parser"
[ 10.266071] audit: type=1400 audit(1688149096.407:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=339 comm="apparmor_parser"
[ 10.266218] audit: type=1400 audit(1688149096.407:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=338 comm="apparmor_parser"
[ 10.269978] audit: type=1400 audit(1688149096.411:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cups-browsed" pid=343 comm="apparmor_parser"
[ 10.270188] audit: type=1400 audit(1688149096.411:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/ntpd" pid=342 comm="apparmor_parser"
[ 10.275077] audit: type=1400 audit(1688149096.415:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=347 comm="apparmor_parser"
[ 10.285085] audit: type=1400 audit(1688149096.423:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=351 comm="apparmor_parser"
[ 10.285091] audit: type=1400 audit(1688149096.423:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=348 comm="apparmor_parser"
[ 10.474672] AVX version of gcm_enc/dec engaged.
[ 10.474675] AES CTR mode by8 optimization enabled
[ 1831.088175] kauditd_printk_skb: 16 callbacks suppressed
[ 1831.088178] audit: type=1400 audit(1688150917.210:27): apparmor="DENIED" operation="capable" profile="/usr/sbin/cupsd" pid=22069 comm="cupsd" capability=12 capname="net_admin"
[333037.457667] FS-Cache: Loaded
[333037.523512] FS-Cache: Netfs 'nfs' registered for caching
[333037.533417] Key type dns_resolver registered
[333037.828391] NFS: Registering the id_resolver key type
[333037.828403] Key type id_resolver registered
[333037.828404] Key type id_legacy registered
[333502.223203] nfs: Unknown parameter 'grpid'
[333557.980988] nfs: Deprecated parameter 'intr'

** Model information

** Loaded modules:
nfsv3
nfs_acl
rpcsec_gss_krb5
auth_rpcgss
nfsv4
dns_resolver
nfs
lockd
grace
nfs_ssc
fscache
xt_multiport
binfmt_misc
ipt_REJECT
nf_reject_ipv4
nft_compat
nft_counter
nf_tables
libcrc32c
nfnetlink
intel_rapl_msr
intel_rapl_common
rfkill
ghash_clmulni_intel
aesni_intel
libaes
crypto_simd
cryptd
glue_helper
evdev
pcspkr
vmwgfx
ttm
drm_kms_helper
cec
hwmon_vid
parport_pc
ppdev
lp
parport
drm
sunrpc
fuse
configfs
ip_tables
x_tables
autofs4
ext4
crc16
mbcache
jbd2
crc32c_generic
xen_netfront
xen_blkfront
crct10dif_pclmul
crct10dif_common
crc32_pclmul
crc32c_intel

** Network interface configuration:
*** /etc/network/interfaces:

source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

** Network status:
*** IP interfaces and addresses:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether <redacted> brd ff:ff:ff:ff:ff:ff
inet <redacted>/26 brd <redacted> scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::<redacted>/64 scope link
valid_lft forever preferred_lft forever

*** Device statistics:
Inter-| Receive | Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
lo: 5445095080 2995747 0 0 0 0 0 0 5445095080 2995747 0 0 0 0 0 0
eth0: 364690443 1032426 0 0 0 0 0 0 421652187 719954 0 0 0 0 0 0

*** Protocol statistics:
Ip:
Forwarding: 2
2631910 total packets received
32 with invalid addresses
0 forwarded
0 incoming packets discarded
2389554 incoming packets delivered
2222123 requests sent out
6 dropped because of missing route
Icmp:
83617 ICMP messages received
0 input ICMP message failed
ICMP input histogram:
destination unreachable: 18
echo requests: 83599
86174 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 2575
echo replies: 83599
IcmpMsg:
InType3: 18
InType8: 83599
OutType0: 83599
OutType3: 2575
Tcp:
24849 active connection openings
39735 passive connection openings
802 failed connection attempts
442 connection resets received
52 connections established
2909498 segments received
2930804 segments sent out
4338 segments retransmitted
191 bad segments received
983 resets sent
InCsumErrors: 191
Udp:
8933 packets received
0 packets to unknown port received
0 packet receive errors
8942 packets sent
0 receive buffer errors
0 send buffer errors
IgnoredMulti: 2657
UdpLite:
TcpExt:
402 resets received for embryonic SYN_RECV sockets
101 packets pruned from receive queue because of socket buffer overrun
25755 TCP sockets finished time wait in fast timer
334 packetes rejected in established connections because of timestamp
10519 delayed acks sent
14 delayed acks further delayed because of locked socket
Quick ack mode was activated 423 times
835877 packet headers predicted
838864 acknowledgments not containing data payload received
457775 predicted acknowledgments
TCPSackRecovery: 72
Detected reordering 4 times using SACK
TCPDSACKUndo: 11
37 congestion windows recovered without slow start after partial ack
TCPLostRetransmit: 1478
TCPSackFailures: 1
985 fast retransmits
TCPTimeouts: 2523
TCPLossProbes: 1132
TCPLossProbeRecovery: 214
TCPSackRecoveryFail: 9
TCPBacklogCoalesce: 20634
TCPDSACKOldSent: 428
TCPDSACKOfoSent: 180
TCPDSACKRecv: 382
TCPDSACKOfoRecv: 1
139 connections reset due to unexpected data
337 connections reset due to early user close
216 connections aborted due to timeout
TCPDSACKIgnoredNoUndo: 173
TCPSackShifted: 785
TCPSackMerged: 599
TCPSackShiftFallback: 147
TCPRcvCoalesce: 302369
TCPOFOQueue: 2018
TCPOFOMerge: 16
TCPAutoCorking: 4442
TCPFromZeroWindowAdv: 26
TCPToZeroWindowAdv: 26
TCPWantZeroWindowAdv: 137
TCPSynRetrans: 578
TCPOrigDataSent: 1627901
TCPHystartTrainDetect: 1161
TCPHystartTrainCwnd: 27560
TCPHystartDelayDetect: 5
TCPHystartDelayCwnd: 563
TCPACKSkippedPAWS: 325
TCPACKSkippedSeq: 75
TCPWinProbe: 1
TCPKeepAlive: 215
TCPDelivered: 1651369
TCPAckCompressed: 879
TcpTimeoutRehash: 1884
TcpDuplicateDataRehash: 8
TCPDSACKRecvSegs: 296
TCPDSACKIgnoredDubious: 87
IpExt:
InMcastPkts: 739
OutMcastPkts: 64
InBcastPkts: 2643
InOctets: 2668751805
OutOctets: 2719203196
InMcastOctets: 206771
OutMcastOctets: 24307
InBcastOctets: 563739
InNoECTPkts: 2630355
InECT1Pkts: 27
InECT0Pkts: 1894


** PCI devices:

** USB devices:
not available


-- System Information:
Debian Release: 11.7
APT prefers oldstable-updates
APT policy: (500, 'oldstable-updates'), (500, 'oldstable-security'), (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-9-amd64 (SMP w/8 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages linux-image-5.10.0-9-amd64 depends on:
ii initramfs-tools [linux-initramfs-tool] 0.140
ii kmod 28-1
ii linux-base 4.6

Versions of packages linux-image-5.10.0-9-amd64 recommends:
ii apparmor 2.13.6-10
ii firmware-linux-free 20200122-1

Versions of packages linux-image-5.10.0-9-amd64 suggests:
pn debian-kernel-handbook <none>
ii grub-pc 2.06-3~deb11u5
pn linux-doc-5.10 <none>

Versions of packages linux-image-5.10.0-9-amd64 is related to:
ii firmware-amd-graphics 20210315-3
pn firmware-atheros <none>
pn firmware-bnx2 <none>
pn firmware-bnx2x <none>
pn firmware-brcm80211 <none>
pn firmware-cavium <none>
pn firmware-intel-sound <none>
pn firmware-intelwimax <none>
pn firmware-ipw2x00 <none>
pn firmware-ivtv <none>
pn firmware-iwlwifi <none>
pn firmware-libertas <none>
ii firmware-linux-nonfree 20210315-3
ii firmware-misc-nonfree 20210315-3
pn firmware-myricom <none>
pn firmware-netxen <none>
pn firmware-qlogic <none>
ii firmware-realtek 20210315-3
pn firmware-samsung <none>
pn firmware-siano <none>
pn firmware-ti-connectivity <none>
pn xen-hypervisor <none>

-- no debconf information

Debian Bug Tracking System

Jul 5, 2023, 4:30:03 AM
Processing control commands:

> severity -1 normal
Bug #1040343 [src:linux] linux-image-5.10.0-9-amd64: Kernel silently de-supported nfsv3 UDP mounts
Severity set to 'normal' from 'important'

--
1040343: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040343
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems

Bastian Blank

Jul 5, 2023, 4:30:04 AM
Control: severity -1 normal

Hi

On Tue, Jul 04, 2023 at 06:35:44PM +0200, Hauke Fath wrote:
> /misc /etc/auto.misc -nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard

And if you set it to TCP (the default) or better directly switch to
NFSv4?

> This upstream decision
> <https://www.spinics.net/lists/linux-nfs/msg74889.html> is more than
> debatable -- we have been running nfsv3 over UDP for ~20 years here
> without ever seeing the data corruption that was claimed as
> motivation.

That you have to take up with upstream, not us. Also this talks about
a problem with fragment reassembly, not data corruption itself. But
usually I would trust upstream to know more about it than yourself.

Bastian

--
Another dream that failed. There's nothing sadder.
-- Kirk, "This side of Paradise", stardate 3417.3

Debian Bug Tracking System

Jul 5, 2023, 8:41:47 AM
Your message dated Wed, 05 Jul 2023 14:34:18 +0200
with message-id <de84d9d2f598f3804d13579...@decadent.org.uk>
and subject line Re: Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silently de-supported nfsv3 UDP mounts
has caused the Debian Bug report #1040343,
regarding linux-image-5.10.0-9-amd64: Kernel silently de-supported nfsv3 UDP mounts
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)

Hauke Fath

Jul 5, 2023, 9:40:04 AM
On 7/5/23 10:20, Bastian Blank wrote:
> On Tue, Jul 04, 2023 at 06:35:44PM +0200, Hauke Fath wrote:
>> /misc /etc/auto.misc -nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard
> And if you set it to TCP (the default) or better directly switch to
> NFSv4?

nfsv3 over tcp works, but is suboptimal, as described - when the router
goes down, the tcp mounts will hang, and the machine will have to be
rebooted.

We do not use nfsv4 here.

>> This upstream decision
>> <https://www.spinics.net/lists/linux-nfs/msg74889.html> is more than
>> debatable -- we have been running nfsv3 over UDP for ~20 years here
>> without ever seeing the data corruption that was claimed as
>> motivation.
> That you have to take up with upstream, not us.

Note we are not talking about code changes here. This is about a kernel
configuration option, which is very much at the discretion of a
distribution (as the Arch example has shown).

> Also this talks about
> a problem with fragment reassembly, not data corruption itself. But
> usually I would trust upstream to know more about it than yourself.

At this point, we have 20 years of experience with running nfsv3 over
udp for ~40 clients that mount user homes over nfs.

On 7/5/23 14:34, Ben Hutchings wrote:
> This was an upstream change in Linux 5.6 that we won't override. NFS-
> over-TCP has been well supported on Linux, and better performing, for a
> long time.

This request is not about defaulting, or even preferring, udp over tcp.

It is simply about having the option, for interoperability as well as
for situations (and they do exist, despite the blanket statement), where
nfsv3 over udp provides more robust service -- without having to deploy
a self-compiled kernel.

Please re-consider, and re-open the ticket.

Cheerio,
Hauke

--
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-21344

Ben Hutchings

Jul 16, 2023, 2:20:04 PM
On Wed, 2023-07-05 at 15:18 +0200, Hauke Fath wrote:
> On 7/5/23 10:20, Bastian Blank wrote:
> > On Tue, Jul 04, 2023 at 06:35:44PM +0200, Hauke Fath wrote:
> > > /misc /etc/auto.misc -nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard
> > And if you set it to TCP (the default) or better directly switch to
> > NFSv4?
>
> nfsv3 over tcp works, but is suboptimal, as described - when the router
> goes down, the tcp mounts will hang, and the machine will have to be
> rebooted.
[...]

Does this mean you are using the "soft" mount option? Without that, I
would expect access to the mount to hang until the network connection
is restored, regardless of whether the TCP or UDP transport is used.

The default retry and timeout behaviour *is* different between
transports, though. See the "timeo" and "retrans" options in nfs(5).
You may wish to override the defaults in your environment.
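
As a rough sketch of what such an override could look like, the helper below builds an option string for NFSv3 over TCP with explicit `timeo` and `retrans` values. Per nfs(5), `timeo` is given in tenths of a second. The helper name `nfs_tcp_opts` and the values 100 and 5 are illustrative assumptions, not tested recommendations; the remaining options are copied from the automounter line quoted above.

```shell
#!/bin/sh
# nfs_tcp_opts TIMEO RETRANS  (hypothetical helper)
# Prints NFSv3-over-TCP mount options with explicit retry settings.
# TIMEO is in tenths of a second; RETRANS is the number of retries
# before the client takes further recovery action (see nfs(5)).
nfs_tcp_opts() {
    printf 'nfsvers=3,proto=tcp,timeo=%s,retrans=%s,rsize=16384,wsize=16384,rw,hard\n' "$1" "$2"
}

# Example (needs root; server and export as in the original report):
# mount -t nfs -o "$(nfs_tcp_opts 100 5)" <fileserver>:/u/pkgsrc /mnt
```

Suitable values depend on how long the network is expected to stay down, so they would have to be found by experiment.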

Ben.

--
Ben Hutchings
Theory and practice are closer in theory than in practice - John Levine


Hauke Fath

Jul 17, 2023, 3:20:04 AM
On Sun, 16 Jul 2023 20:14:20 +0200, Ben Hutchings wrote:
>>
>> nfsv3 over tcp works, but is suboptimal, as described - when the router
>> goes down, the tcp mounts will hang, and the machine will have to be
>> rebooted.
> [...]
>
> Does this mean you are using the "soft" mount option? Without that, I
> would expect access to the mount to hang until the network connection
> is restored, regardless of whether the TCP or UDP transport is used.

No, we use hard mounts.

But the router's packet filter will have lost state after a reboot,
and reject packets from tcp connections that the clients assume to
exist. This is not a problem with udp, because it is connectionless.

Ben Hutchings

Jul 17, 2023, 2:40:04 PM
On Mon, 2023-07-17 at 09:05 +0200, Hauke Fath wrote:
> On Sun, 16 Jul 2023 20:14:20 +0200, Ben Hutchings wrote:
> > >
> > > nfsv3 over tcp works, but is suboptimal, as described - when the router
> > > goes down, the tcp mounts will hang, and the machine will have to be
> > > rebooted.
> > [...]
> >
> > Does this mean you are using the "soft" mount option? Without that, I
> > would expect access to the mount to hang until the network connection
> > is restored, regardless of whether the TCP or UDP transport is used.
>
> No, we use hard mounts.
>
> But the router's packet filter will have lost state after a reboot,
> and reject packets from tcp connections that the clients assume to
> exist. This is not a problem with udp, because it is connectionless.

Ah, I see. You didn't mention that there was dynamic NAT involved
before.

If an NFS server is rebooted abruptly (so it doesn't properly close TCP
connections), once it's back up it will respond to any requests from
clients with a TCP RST, and they should reconnect.

If a NAT router between client and server is rebooted, I think that
something similar should happen, but the router would need to send the
TCP RST instead.

Is your router configured to send a TCP RST when receiving a packet for
an unknown connection, or does it just drop those packets? (In
iptables this is the difference between REJECT and DROP policies.)
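
For illustration, here is a sketch of the two policies on a Linux router using iptables with conntrack. It assumes a Linux router, which may not match the setup in this report. The `run` wrapper, the `APPLY` variable and the function names are hypothetical, and by default the script only prints the commands rather than executing them.

```shell
#!/bin/sh
# Dry run by default: print each command; set APPLY=1 to execute (needs root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$@"; fi
}

# REJECT: answer non-SYN TCP packets that conntrack sees as starting a new
# flow (i.e. no known connection) with a TCP RST, so the sender learns
# quickly that its connection is gone and can reconnect.
reject_unknown_tcp() {
    run iptables -A FORWARD -p tcp ! --syn -m conntrack --ctstate NEW \
        -j REJECT --reject-with tcp-reset
}

# DROP: silently discard the same packets; the sender only notices
# through its own retransmission timeouts.
drop_unknown_tcp() {
    run iptables -A FORWARD -p tcp ! --syn -m conntrack --ctstate NEW -j DROP
}
```

Calling `reject_unknown_tcp` without `APPLY=1` just prints the rule, which makes it easy to review before applying it.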

Ben.

--
Ben Hutchings
Never attribute to conspiracy what can adequately be explained
by stupidity.


Hauke Fath

Jul 18, 2023, 6:40:04 AM
On 7/17/23 20:29, Ben Hutchings wrote:
>> But the router's packet filter will have lost state after a reboot,
>> and reject packets from tcp connections that the clients assume to
>> exist. This is not a problem with udp, because it is connectionless.
>
> Ah, I see. You didn't mention that there was dynamic NAT involved
> before.

Because it isn't. What is involved is a stateful packet filter (FreeBSD
pf). I said

| We run nfs through a router (several client subnets accessing servers
| in an internal server subnet), and found nfs over udp a lot more
| robust in the face of router reboots.

> If an NFS server is rebooted abruptly (so it doesn't properly close TCP
> connections), once it's back up it will respond to any requests from
> clients with a TCP RST, and they should reconnect.

Understood, and not relevant here.

> If a NAT router between client and server is rebooted, I think that
> something similar should happen, but the router would need to send the
> TCP RST instead.

After a router reboot, the stateful packet filter will have lost
information on active tcp connections, and (rightfully) reject packets
for what the nfs clients (rightfully) see as an existing connection.

> Is your router configured to send a TCP RST when receiving a packet for
> an unknown connection, or does it just drop those packets? (In
> iptables this is the difference between REJECT and DROP policies.)

The router defaults to returning RST.

Anyway: I am not asking for a udp default here, but simply for Debian
to keep providing the _option_, and leave the decision to me, the admin.