
Bug#763192: [LXC] [nfsd] kernel crash when running nfs-kernel-server in one LXC Container


LACROIX Jean Marc

Sep 28, 2014, 11:50:02 AM
Package: nfs-kernel-server
Version: 1:1.2.6-4
Severity: serious

Hi dear maintainer,

I have a problem with nfs-kernel-server (Debian Wheezy) when it is installed
in an LXC container (Debian Wheezy) on a Debian Jessie host running a very
recent kernel (3.16-2-amd64).

The configuration is hosted on an IBM eServer x3400 (2 CPUs, 4 cores).

- config 1: host configuration
--------------------------------

admlocal@srv-alex:~$ uname -a
Linux srv-alex 3.16-2-amd64 #1 SMP Debian 3.16.3-2 (2014-09-20) x86_64
GNU/Linux

admlocal@srv-alex:~$ cat /etc/debian_version
jessie/sid

admlocal@srv-alex:~$ dpkg -l |grep libc6
ii libc6:amd64 2.19-11
amd64 GNU C Library: Shared libraries
ii libc6-dev:amd64 2.19-11
amd64 GNU C Library: Development Libraries and Header Files
ii libcompfaceg1 1:1.5.2-5
amd64 Compress/decompress images for mailheaders, libc6 runtime


Please note that Jessie was up to date at the time of writing this email.

admlocal@srv-alex:~$ cat /etc/apt/sources.list |grep -v "#"

deb http://ftp.fr.debian.org/debian/ jessie main contrib non-free
deb-src http://ftp.fr.debian.org/debian/ jessie main contrib non-free

root@srv-alex:~# lsmod |grep nfs
nfsv3 37551 1
nfs 187961 2 nfsv3
fscache 45542 1 nfs
nfsd 263053 9
auth_rpcgss 51240 1 nfsd
nfs_acl 12511 2 nfsd,nfsv3
lockd 83417 3 nfs,nfsd,nfsv3
sunrpc 237445 30 nfs,nfsd,auth_rpcgss,lockd,nfsv3,nfs_acl

- config 2: container configuration
-----------------------------------

Using LXC, I installed an amd64 container based on the stable Debian
distribution (Wheezy 7.6) in order to be sure of having stable user-space
daemon versions.

root@vm-wheezy-x86-amd64-3:~# dpkg -l |grep libc6
ii libc6:amd64 2.13-38+deb7u4 amd64
Embedded GNU C Library: Shared libraries
ii libc6-dbg:amd64 2.13-38+deb7u4 amd64
Embedded GNU C Library: detached debugging symbols
ii libc6-dev:amd64 2.13-38+deb7u4 amd64
Embedded GNU C Library: Development Libraries and Header Files
ii libcompfaceg1 1:1.5.2-5 amd64
Compress/decompress images for mailheaders, libc6 runtime

root@vm-wheezy-x86-amd64-3:~# dpkg -l |grep nfs
ii libnfsidmap2:amd64 0.25-4 amd64
NFS idmapping library
ii nfs-common 1:1.2.6-4 amd64
NFS support files common to client and server
ii nfs-kernel-server 1:1.2.6-4 amd64
support for NFS kernel server

root@vm-wheezy-x86-amd64-3:~# cat /etc/exports |grep -v '#'
/tmp *(rw,sync,no_subtree_check)


The internal mount points in the container are as follows:

root@vm-wheezy-x86-amd64-3:~# df -h
Filesystem Size Used Avail Use%
Mounted on
rootfs 190M 38M 142M 22% /
/dev/mapper/vg_wheezy_x86_amd64_3-lv_rootfs 190M 38M 142M 22% /
/dev/mapper/vg_raid_0-lv_tmp_wheezy_x86_amd64_3 4.8G 11M 4.6G 1% /tmp
/dev/mapper/vg_wheezy_x86_amd64_3-lv_var 575M 176M 370M 33% /var
/dev/mapper/vg_wheezy_x86_amd64_3-lv_usr 3.4G 2.7G 528M 84% /usr
tmpfs 599M 56K 599M 1% /run
tmpfs 5.0M 0 5.0M 0%
/run/lock
tmpfs 1.2G 0 1.2G 0%
/run/shm
root@vm-wheezy-x86-amd64-3:~# /etc/init.d/nfs-kernel-server restart
[ ok ] Stopping NFS kernel daemon: mountd nfsd.
[ ok ] Unexporting directories for NFS kernel daemon....
[ ok ] Exporting directories for NFS kernel daemon....
[ ok ] Starting NFS kernel daemon: nfsd mountd.

As it is not possible for my container to insert the nfsd kernel module,
it is loaded via /etc/modules on the host during the boot process.
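A minimal pre-start check on the host can be sketched as follows; this is illustrative only (the helper name and the sample file are not part of the original setup), but the field layout matches `/proc/modules` and the `lsmod` output shown earlier:

```shell
#!/bin/sh
# Sketch: check whether a kernel module appears in a modules listing.
# The field layout matches /proc/modules (lsmod output without its header).
module_loaded() {
    # $1 = module name, $2 = file in /proc/modules format
    grep -q "^$1 " "$2"
}

# Example against a saved listing like the lsmod output in this report:
cat > /tmp/modules.sample <<'EOF'
nfsd 263053 9
lockd 83417 3
sunrpc 237445 30
EOF

module_loaded nfsd /tmp/modules.sample && echo "nfsd present"
module_loaded nfsv4 /tmp/modules.sample || echo "nfsv4 absent"
```

On the real host the second argument would simply be `/proc/modules`, checked before `lxc-start` is run.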

The container starts and the last daemon (nfsd) runs, but dmesg shows the
following messages:

[ 160.421748] ------------[ cut here ]------------
[ 160.421777] WARNING: CPU: 5 PID: 4638 at
/build/linux-P15SNz/linux-3.16.3/fs/nfsd/nfs4recover.c:1195
nfsd4_umh_cltrack_init+0x3a/0x40 [nfsd]()
[ 160.421779] NFSD: attempt to initialize umh client tracking in a
container!
[ 160.421900] Modules linked in: veth nfsv3 nfs fscache binfmt_misc
bridge 8021q garp stp mrp llc iTCO_wdt iTCO_vendor_support ppdev joydev
raid0 radeon ttm coretemp drm_kms_helper psmouse drm kvm_intel evdev
i5000_edac pcspkr i2c_algo_bit edac_coe serio_raw kvm parport_pc parport
shpchp i2c_i801 i2c_core i5k_amb lpc_ich mfd_core rng_core processor
thermal_sys button nfsd auth_rpcgss oid_registry nfs_acl lockd sunrpc
loop autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod hid_generic
usbhi hid sg sd_mod crc_t10dif crct10dif_generic ses enclosure
crct10dif_common ehci_pci uhci_hcd ehci_hcd tg3 ptp aacraid usbcore
pps_core libphy usb_common scsi_mod
[ 160.421953] CPU: 5 PID: 4638 Comm: rpc.nfsd Tainted: G I
3.16-2-amd64 #1 Debian 3.16.3-2
[ 160.421955] Hardware name: IBM IBM eServer x3400-[7976ABG]-/M97IP,
BIOS IBM BIOS Version 1.62-[SPE162AUS-1.62]- 11/09/2007
[ 160.421958] 0000000000000009 ffffffff81506188 ffff8801b6f87d98
ffffffff81065707
[ 160.421961] ffff8800b9df7600 ffff8801b6f87de8 ffff8800b9df7600
0000000000000008
[ 160.421963] 0000000000000000 ffffffff8106576c ffffffffa02e27a8
0000000000000018
[ 160.421967] Call Trace:
[ 160.421975] [<ffffffff81506188>] ? dump_stack+0x41/0x51
[ 160.421980] [<ffffffff81065707>] ? warn_slowpath_common+0x77/0x90
[ 160.421983] [<ffffffff8106576c>] ? warn_slowpath_fmt+0x4c/0x50
[ 160.421991] [<ffffffffa02dd1aa>] ? nfsd4_umh_cltrack_init+0x3a/0x40
[nfsd]
[ 160.421998] [<ffffffffa02de461>] ?
nfsd4_client_tracking_init+0x81/0x130 [nfsd]
[ 160.422006] [<ffffffffa02d8a62>] ? nfs4_state_start_net+0x2a2/0x340
[nfsd]
[ 160.422013] [<ffffffffa02b3b20>] ? nfsd_svc+0x1d0/0x330 [nfsd]
[ 160.422019] [<ffffffffa02b4600>] ? write_pool_threads+0x260/0x260 [nfsd]
[ 160.422025] [<ffffffffa02b468a>] ? write_threads+0x8a/0xf0 [nfsd]
[ 160.422031] [<ffffffff8113ecca>] ? __get_free_pages+0xa/0x50
[ 160.422035] [<ffffffff811ca5e0>] ? simple_transaction_get+0xa0/0xc0
[ 160.422041] [<ffffffffa02b4093>] ?
nfsctl_transaction_write+0x43/0x70 [nfsd]
[ 160.422045] [<ffffffff811a52f2>] ? vfs_write+0xb2/0x1f0
[ 160.422048] [<ffffffff811a5e32>] ? SyS_write+0x42/0xa0
[ 160.422052] [<ffffffff8150c26d>] ?
system_call_fast_compare_end+0x10/0x15
[ 160.422086] WARNING: CPU: 5 PID: 4638 at
/build/linux-P15SNz/linux-3.16.3/fs/nfsd/nfs4recover.c:530
nfsd4_legacy_tracking_init+0x1aa/0x240 [nfsd]()
[ 160.422087] NFSD: attempt to initialize legacy client tracking in a
container!

After killing all processes, stopping the container, and removing the kernel
module, I restarted the same scenario:

root@srv-alex:~# rmmod nfsd
root@srv-alex:~# rmmod nfsd
rmmod: ERROR: Module nfsd is not currently loaded
root@srv-alex:~# lsmod |grep nfs
nfsv3 37551 1
nfs 187961 2 nfsv3
fscache 45542 1 nfs
nfs_acl 12511 1 nfsv3
lockd 83417 2 nfs,nfsv3
sunrpc 237445 18 nfs,auth_rpcgss,lockd,nfsv3,nfs_acl

root@srv-alex:~# modprobe nfsd nfs4_disable_idmapping=0

Then I restart the container:
lxc-start -f /etc/lxc/auto/vm-wheezy-x86-amd64-3 -n vm-wheezy-x86-amd64-3

The LXC container config is:
root@srv-alex:~# cat /etc/lxc/auto/vm-wheezy-x86-amd64-3 |grep -v "#"

---------------------- container config --------------

lxc.arch = amd64
lxc.utsname = vm-wheezy-x86-amd64-3
lxc.start.auto = 1

lxc.tty = 4
lxc.pts = 1024
lxc.rootfs = /var/lib/lxc/vm-wheezy-x86-amd64-3/rootfs
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 10:235 rwm
lxc.cgroup.devices.allow = c 254:0 rwm
lxc.mount.entry = proc
/var/lib/lxc/vm-wheezy-x86-amd64-3/rootfs/proc proc
nodev,noexec,nosuid 0 0
lxc.mount.entry = devpts
/var/lib/lxc/vm-wheezy-x86-amd64-3/rootfs/dev/pts devpts defaults 0 0
lxc.mount.entry = sysfs
/var/lib/lxc/vm-wheezy-x86-amd64-3/rootfs/sys sysfs defaults 0 0

lxc.cgroup.cpuset.cpus = 1-7

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br-admin
lxc.network.name = eth0-admin
lxc.network.hwaddr = 02:00:00:02:01:00
lxc.network.veth.pair = e0-wham64adm

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br-services
lxc.network.name = eth1-services
lxc.network.hwaddr = 02:00:00:02:01:01
lxc.network.veth.pair = e1-wham64srv

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br-users
lxc.network.name = eth2-users
lxc.network.hwaddr = 02:00:00:02:01:02
lxc.network.veth.pair = e2-wham64usr
---------------------- container config END --------------


... and the problem occurs with the same error.

Of course, since there is a fatal error in the kernel, it is not possible to
restart the container, due to the temporary name used when creating the
network interfaces (a problem unrelated to NFS, I hope!), and the server must
be restarted... or all nfsd processes must be killed with signal -9.

The container NFS config is:

root@srv-alex:~# cat
/var/lib/lxc/vm-wheezy-x86-amd64-3/rootfs/etc/default/nfs-common |grep
-v '#'
-------------------------- start config ----------------------
NEED_STATD=yes

STATDOPTS="--port 32766 --outgoing-port 32765"

NEED_IDMAPD=no

NEED_GSSD=no
-------------------------- end config ----------------------
root@srv-alex:~# cat
/var/lib/lxc/vm-wheezy-x86-amd64-3/rootfs/etc/default/nfs-kernel-server
|grep -v '#'
-------------------------- start config ----------------------
RPCNFSDCOUNT=8

RPCNFSDPRIORITY=0

RPCMOUNTDOPTS='--manage-gids --port 32767 --num-threads=6
--no-nfs-version 4'

NEED_SVCGSSD=no

RPCSVCGSSDOPTS=

-------------------------- end config ----------------------

The log when booting the container is:
-------------------------- start log ----------------------
root@srv-alex:~# lxc-start -f /etc/lxc/auto/vm-wheezy-x86-amd64-3 -n
vm-wheezy-x86-amd64-3
INIT: version 2.88 booting
Using makefile-style concurrent boot in runlevel S.
Setting the system clock.
hwclock: Cannot access the Hardware Clock via any known method.
hwclock: Use the --debug option to see the details of our search for an
access method.
Unable to set System Clock to: Sun Sep 28 17:31:07 CEST 2014 ... (warning).
Activating swap...done.
Cleaning up temporary files... /tmp /lib/init/rw.
Mount point '/dev/console' does not exist. Skipping mount. ... (warning).
Mount point '/dev/tty1' does not exist. Skipping mount. ... (warning).
Mount point '/dev/tty2' does not exist. Skipping mount. ... (warning).
Mount point '/dev/tty3' does not exist. Skipping mount. ... (warning).
Mount point '/dev/tty4' does not exist. Skipping mount. ... (warning).
Mount point '/dev/ptmx' does not exist. Skipping mount. ... (warning).
Activating lvm and md swap...done.
Checking file systems...fsck from util-linux 2.20.1
done.
Mounting local filesystems...done.
/etc/init.d/mountall.sh: 59: kill: Illegal number: 3 1
Activating swapfile swap...done.
Cleaning up temporary files....
Setting kernel variables ...done.
Configuring network interfaces...Internet Systems Consortium DHCP Client
4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
Listening on LPF/eth0-admin/02:00:00:02:01:00
Sending on LPF/eth0-admin/02:00:00:02:01:00
Sending on Socket/fallback
DHCPDISCOVER on eth0-admin to 255.255.255.255 port 67 interval 5
DHCPREQUEST on eth0-admin to 255.255.255.255 port 67
DHCPOFFER from 192.168.9.8
DHCPACK from 192.168.9.8
bound to 192.168.9.29 -- renewal in 25 seconds.
if-up.d/mountnfs[eth0-admin]: waiting for interface eth1-services before
doing NFS mounts ... (warning).
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eth1-services/02:00:00:02:01:01
Sending on LPF/eth1-services/02:00:00:02:01:01
Sending on Socket/fallback
DHCPDISCOVER on eth1-services to 255.255.255.255 port 67 interval 8
DHCPREQUEST on eth1-services to 255.255.255.255 port 67
DHCPOFFER from 192.168.6.2
DHCPACK from 192.168.6.2

Debian GNU/Linux 7 vm-wheezy-x86-amd64-3 console

vm-wheezy-x86-amd64-3 login: root
Password:
Last login: Sun Sep 28 17:15:01 CEST 2014 on console
Linux vm-wheezy-x86-amd64-3 3.16-2-amd64 #1 SMP Debian 3.16.3-2
(2014-09-20) x86_64


-------------------------- end log ----------------------


I also tried setting RPCNFSDCOUNT=1, and the result after the error is:
----------------------------start trace -------------------------------
[ 2996.630588] ------------[ cut here ]------------
[ 2996.630608] WARNING: CPU: 7 PID: 11074 at
/build/linux-P15SNz/linux-3.16.3/fs/nfsd/nfs4recover.c:1195
nfsd4_umh_cltrack_init+0x3a/0x40 [nfsd]()
[ 2996.630609] NFSD: attempt to initialize umh client tracking in a
container!
[ 2996.630729] Modules linked in: nfsd veth nfsv3 nfs fscache
binfmt_misc bridge 8021q garp stp mrp llc iTCO_wdt iTCO_vendor_support
ppdev joydev raid0 radettm coretemp drm_kms_helper psmouse drm kvm_intel
evdev i5000_edac pcspkr i2c_algo_bit edac_core serio_raw kvm parport_pc
parport shpchp i2c_i801 i2c_core amb lpc_ich mfd_core rng_core processor
thermal_sys button auth_rpcgss oid_registry nfs_acl lockd sunrpc loop
autofs4 ext4 crc16 mbcache jbd2 dm_mod raidod hid_generic usbhid hid sg
sd_mod crc_t10dif crct10dif_generic ses enclosure crct10dif_common
ehci_pci uhci_hcd ehci_hcd tg3 ptp aacraid usbcore pps_cophy usb_common
scsi_mod [last unloaded: nfsd]
[ 2996.630790] CPU: 7 PID: 11074 Comm: rpc.nfsd Tainted: G W I
3.16-2-amd64 #1 Debian 3.16.3-2
[ 2996.630792] Hardware name: IBM IBM eServer x3400-[7976ABG]-/M97IP,
BIOS IBM BIOS Version 1.62-[SPE162AUS-1.62]- 11/09/2007
[ 2996.630794] 0000000000000009 ffffffff81506188 ffff8801b6813d98
ffffffff81065707
[ 2996.630797] ffff8801b5d45e00 ffff8801b6813de8 ffff8801b5d45e00
0000000000000001
[ 2996.630800] 0000000000000000 ffffffff8106576c ffffffffa02e27a8
0000000000000018
[ 2996.630803] Call Trace:
[ 2996.630811] [<ffffffff81506188>] ? dump_stack+0x41/0x51
[ 2996.630816] [<ffffffff81065707>] ? warn_slowpath_common+0x77/0x90
[ 2996.630819] [<ffffffff8106576c>] ? warn_slowpath_fmt+0x4c/0x50
[ 2996.630826] [<ffffffffa02dd1aa>] ? nfsd4_umh_cltrack_init+0x3a/0x40
[nfsd]
[ 2996.630832] [<ffffffffa02de461>] ?
nfsd4_client_tracking_init+0x81/0x130 [nfsd]
[ 2996.630839] [<ffffffffa02d8a62>] ? nfs4_state_start_net+0x2a2/0x340
[nfsd]
[ 2996.630844] [<ffffffffa02b3b20>] ? nfsd_svc+0x1d0/0x330 [nfsd]
[ 2996.630850] [<ffffffffa02b4600>] ? write_pool_threads+0x260/0x260 [nfsd]
[ 2996.630855] [<ffffffffa02b468a>] ? write_threads+0x8a/0xf0 [nfsd]
[ 2996.630860] [<ffffffff8113ecca>] ? __get_free_pages+0xa/0x50
[ 2996.630864] [<ffffffff811ca5e0>] ? simple_transaction_get+0xa0/0xc0
[ 2996.630869] [<ffffffffa02b4093>] ?
nfsctl_transaction_write+0x43/0x70 [nfsd]
[ 2996.630873] [<ffffffff811a52f2>] ? vfs_write+0xb2/0x1f0
[ 2996.630876] [<ffffffff811a5e32>] ? SyS_write+0x42/0xa0
[ 2996.630880] [<ffffffff8150c26d>] ?
system_call_fast_compare_end+0x10/0x15
[ 2996.630882] ---[ end trace 61dda43e27c71f62 ]---
[ 2996.630887] ------------[ cut here ]------------

[ 2996.630894] WARNING: CPU: 7 PID: 11074 at
/build/linux-P15SNz/linux-3.16.3/fs/nfsd/nfs4recover.c:530
nfsd4_legacy_tracking_init+0x1aa/0x240 [nfsd]()
[ 2996.630895] NFSD: attempt to initialize legacy client tracking in a
container!
[ 2996.630999] Modules linked in: nfsd veth nfsv3 nfs fscache
binfmt_misc bridge 8021q garp stp mrp llc iTCO_wdt iTCO_vendor_support
ppdev joydev raid0 ttm coretemp drm_kms_helper psmouse drm kvm_intel
evdev i5000_edac pcspkr i2c_algo_bit edac_core serio_raw kvm parport_pc
parport shpchp i2c_i801 i2c_coamb lpc_ich mfd_core rng_core processor
thermal_sys button auth_rpcgss oid_registry nfs_acl lockd sunrpc loop
autofs4 ext4 crc16 mbcache jbd2 dm_mod raiod hid_generic usbhid hid sg
sd_mod crc_t10dif crct10dif_generic ses enclosure crct10dif_common
ehci_pci uhci_hcd ehci_hcd tg3 ptp aacraid usbcore pps_cphy usb_common
scsi_mod [last unloaded: nfsd]
[ 2996.631043] CPU: 7 PID: 11074 Comm: rpc.nfsd Tainted: G W I
3.16-2-amd64 #1 Debian 3.16.3-2
[ 2996.631045] Hardware name: IBM IBM eServer x3400-[7976ABG]-/M97IP,
BIOS IBM BIOS Version 1.62-[SPE162AUS-1.62]- 11/09/2007
[ 2996.631046] 0000000000000009 ffffffff81506188 ffff8801b6813d80
ffffffff81065707
[ 2996.631049] ffff8801b5d45e00 ffff8801b6813dd0 0000000000004000
0000000000000001
[ 2996.631052] 0000000000000000 ffffffff8106576c ffffffffa02e2a28
ffff880100000018
[ 2996.631055] Call Trace:
[ 2996.631058] [<ffffffff81506188>] ? dump_stack+0x41/0x51
[ 2996.631061] [<ffffffff81065707>] ? warn_slowpath_common+0x77/0x90
[ 2996.631064] [<ffffffff8106576c>] ? warn_slowpath_fmt+0x4c/0x50
[ 2996.631068] [<ffffffff812b4258>] ? lockref_put_or_lock+0x48/0x80
[ 2996.631074] [<ffffffffa02de2ca>] ?
nfsd4_legacy_tracking_init+0x1aa/0x240 [nfsd]
[ 2996.631080] [<ffffffffa02de431>] ?
nfsd4_client_tracking_init+0x51/0x130 [nfsd]
[ 2996.631086] [<ffffffffa02d8a62>] ? nfs4_state_start_net+0x2a2/0x340
[nfsd]
[ 2996.631091] [<ffffffffa02b3b20>] ? nfsd_svc+0x1d0/0x330 [nfsd]
[ 2996.631097] [<ffffffffa02b4600>] ? write_pool_threads+0x260/0x260 [nfsd]
[ 2996.631102] [<ffffffffa02b468a>] ? write_threads+0x8a/0xf0 [nfsd]
[ 2996.631105] [<ffffffff8113ecca>] ? __get_free_pages+0xa/0x50
[ 2996.631107] [<ffffffff811ca5e0>] ? simple_transaction_get+0xa0/0xc0
[ 2996.631112] [<ffffffffa02b4093>] ?
nfsctl_transaction_write+0x43/0x70 [nfsd]
[ 2996.631116] [<ffffffff811a52f2>] ? vfs_write+0xb2/0x1f0
[ 2996.631118] [<ffffffff811a5e32>] ? SyS_write+0x42/0xa0
[ 2996.631121] [<ffffffff8150c26d>] ?
system_call_fast_compare_end+0x10/0x15
[ 2996.631123] ---[ end trace 61dda43e27c71f63 ]---
[ 2996.631125] NFSD: Unable to initialize client recovery tracking! (-22)
[ 2996.631127] NFSD: starting 90-second grace period (net ffff8801b6bfa0c0)
----------------------end trace -----------------------------

One last piece of information: the container host (Debian Jessie) also runs
an NFS client daemon, so I suspect perhaps a problem in sysfs or in the
namespace code?

best regards

--
--------------------------------------
-- Jean-Marc LACROIX --
-- mailto : jeanmarc...@free.fr --
---------------------------------------


--
To UNSUBSCRIBE, email to debian-bugs-...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org

Debian Bug Tracking System

Apr 23, 2021, 2:30:03 PM
Your message dated Fri, 23 Apr 2021 20:23:40 +0200
with message-id <YIMQrIiO...@eldamar.lan>
and subject line Re: Bug#763192: [LXC] [nfsd] kernel crash when running nfs-kernel-server in one LXC Container
has caused the Debian Bug report #763192,
regarding NFSv4 server recovery not supported in container
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


--
763192: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763192
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems

Jean-Marc LACROIX (jeanmarc.lacroix@free.fr)

Apr 23, 2021, 3:30:03 PM


On 23/04/2021 at 20:27, the Debian Bug Tracking System wrote:
> This is an automatic notification regarding your Bug report
> which was filed against the src:linux package:
>
> #763192: NFSv4 server recovery not supported in container
>
> It has been closed by Salvatore Bonaccorso <car...@debian.org>.
>
> Their explanation is attached below along with your original report.
> If this explanation is unsatisfactory and you have not received a
> better one in a separate message then please contact Salvatore Bonaccorso <car...@debian.org> by
> replying to this email.
>
>
Hi,

Please find here some good news about this issue. It is now possible to
run an NFS server in an LXC container.

One of my current configurations, which has been running for 2 years on
Debian Buster on the armhf and amd64 architectures, is as follows.

Step 1: hypervisor configuration (target = hc1-260)
----------------------------------------------------
This is an armhf octo-core ODROID-HC1 board:
ansible@hc1-260:~$ uname -a
Linux hc1-260 5.10.0-0.bpo.5-armmp-lpae #1 SMP Debian 5.10.24-1~bpo10+1
(2021-03-29) armv7l GNU/Linux
ansible@hc1-260:~$ cat /etc/debian_version
10.9
ansible@hc1-260:~$ cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
CPU6 CPU7
57: 0 0 0 0 0 0
0 0 COMBINER 187 Edge mct_comp_irq
58: 48423896 0 0 0 0 0
0 0 GICv2 152 Level mct_tick0
59: 0 42386303 0 0 0 0
0 0 GICv2 153 Level mct_tick1
........

Because in my configuration the LXC container cannot insmod the dedicated NFS
modules, it is mandatory to insert them on the hypervisor.
This is done in the /etc/modules file:

ansible@hc1-260:~$ sudo cat /etc/modules |grep -v "#" |grep -v ^$
iptable_filter
autofs4
8021q
tun
nfsv4
nfsd


On the hypervisor, the LXC container running the NFS server is OK:

ansible@hc1-260:~$ sudo lxc-ls -f |grep nfs
vm-nfs-260 RUNNING 1 grp_lxc_start_on_boot
192.168.22.136, 192.168.24.136, 192.168.25.136

Here is the configuration of the LXC container:

ansible@hc1-260:~$ sudo cat /etc/lxc/auto/vm-nfs-260 |grep -v '#'
|grep -v ^$
lxc.arch = armv7l
lxc.uts.name = vm-nfs-260
lxc.start.auto = 1
lxc.start.order = 80
lxc.start.delay = 0
lxc.group = grp_lxc_start_on_boot
lxc.init.cmd = /sbin/init
lxc.init.uid = 0
lxc.init.gid = 0
lxc.ephemeral = 0

lxc.console.buffer.size = 102400
lxc.console.size = 102400
lxc.log.level = DEBUG
lxc.log.file = /var/log/lxc/vm-nfs-260.log
lxc.tty.max = 4
lxc.pty.max = 10
lxc.signal.halt = SIGPWR
lxc.signal.reboot = SIGINT
lxc.signal.stop = SIGKILL
lxc.cgroup.memory.limit_in_bytes = 313M

lxc.cgroup.cpuset.cpus = 4
lxc.cgroup.cpu.shares = 1024
lxc.cgroup.devices.deny = a
lxc.autodev = 1
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:0 rwm
lxc.cgroup.devices.allow = c 136:1 rwm
lxc.cgroup.devices.allow = c 136:2 rwm
lxc.cgroup.devices.allow = c 136:3 rwm
lxc.cgroup.devices.allow = c 136:4 rwm
lxc.cgroup.devices.allow = c 136:5 rwm
lxc.cgroup.devices.allow = c 136:6 rwm
lxc.cgroup.devices.allow = c 136:7 rwm
lxc.cgroup.devices.allow = c 136:8 rwm
lxc.cgroup.devices.allow = c 136:9 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 4:64 rwm
lxc.cgroup.devices.allow = c 4:65 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm
lxc.cgroup.devices.allow = c 4:2 rwm
lxc.cgroup.devices.allow = c 4:3 rwm
lxc.cgroup.devices.allow = c 4:4 rwm
lxc.cgroup.devices.allow = c 4:5 rwm
lxc.cgroup.devices.allow = c 4:6 rwm
lxc.rootfs.mount = /var/lib/lxc/vm-nfs-260/rootfs
lxc.rootfs.path =
/dev/mapper/vg_vm_nfs_260-lv_rootfs
lxc.rootfs.options = defaults,noatime,nodiratime
lxc.mount.entry = proc
/var/lib/lxc/vm-nfs-260/rootfs/proc proc nodev,noexec,nosuid 0 0
lxc.mount.entry = devpts
/var/lib/lxc/vm-nfs-260/rootfs/dev/pts devpts defaults 0 0
lxc.mount.entry = sysfs
/var/lib/lxc/vm-nfs-260/rootfs/sys sysfs d
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_usr /var/lib/lxc/vm-nfs-ime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_var /var/lib/lxc/vm-nfs-ime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_tmp /var/lib/lxc/vm-nfs-ime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_home /var/lib/lxc/vm-nfsatime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_var_log /var/lib/lxc/vm-lts,noatime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_var_lib /var/lib/lxc/vm-lts,noatime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_var_cache
/var/lib/lxc/vefaults,noatime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_var_lib_apt /var/lib/lxc4
defaults,noatime,nodiratime
lxc.mount.entry =
/dev/mapper/vg_vm_nfs_260-lv_nfs_home /var/lib/lxc/vm,noatime,nodiratime
lxc.net.0.type = veth
lxc.net.0.flags = up
lxc.net.0.link = br-admi
lxc.net.0.name = et-admi
lxc.net.0.hwaddr = 02:00:10:80:08:25
lxc.net.0.veth.pair = e-nfs-adm
lxc.net.1.type = veth
lxc.net.1.flags = up
lxc.net.1.link = br-user
lxc.net.1.name = et-user
lxc.net.1.hwaddr = 02:00:10:80:08:24
lxc.net.1.veth.pair = e-nfs-usr
lxc.net.2.type = veth
lxc.net.2.flags = up
lxc.net.2.link = br-wifi
lxc.net.2.name = et-wifi
lxc.net.2.hwaddr = 02:00:10:80:08:27
lxc.net.2.veth.pair = e-nfs-wifi
lxc.net.3.type = veth
lxc.net.3.flags = up
lxc.net.3.link = br-serv
lxc.net.3.name = et-serv
lxc.net.3.hwaddr = 02:00:10:80:08:22
lxc.net.3.veth.pair = e-nfs-srv
lxc.net.4.type = veth
lxc.net.4.flags = up
lxc.net.4.link = br-fact
lxc.net.4.name = et-fact
lxc.net.4.hwaddr = 02:00:10:80:08:31
lxc.net.4.veth.pair = e-nfs-fact
lxc.apparmor.allow_incomplete = 1
lxc.apparmor.profile = unconfined
ansible@hc1-260:~$

Of course, on the hypervisor, no NFS daemon is running, because no NFS
packages are installed:
ansible@hc1-260:~$ dpkg -l |grep nfs

and there are no special mount points:

ansible@hc1-260:~$ df
Filesystem 1K-blocks Used Available Use% Mounted on
udev 961608 0 961608 0% /dev
tmpfs 196624 1456 195168 1% /run
/dev/mmcblk0p1 3028752 1217568 1637616 43% /
tmpfs 5120 0 5120 0% /run/lock
tmpfs 393240 40 393200 1% /dev/shm
cgroup 983112 0 983112 0% /sys/fs/cgroup


On the hypervisor, one dedicated disk is available, but only for the NFS
server container; everything is done across LVM on an SSD disk:

ansible@hc1-260:~$ sudo lvs |grep nfs
lv_home vg_vm_nfs_260 -wi-ao---- 124.00m

lv_nfs_home vg_vm_nfs_260 -wi-ao---- 58.00g

lv_rootfs vg_vm_nfs_260 -wi-ao---- 124.00m

lv_tmp vg_vm_nfs_260 -wi-ao---- 144.00m

lv_usr vg_vm_nfs_260 -wi-ao---- 688.00m

lv_var vg_vm_nfs_260 -wi-ao---- 112.00m

lv_var_cache vg_vm_nfs_260 -wi-ao---- 384.00m

lv_var_lib vg_vm_nfs_260 -wi-ao---- 132.00m

lv_var_lib_apt vg_vm_nfs_260 -wi-ao---- 732.00m

lv_var_log vg_vm_nfs_260 -wi-ao---- 200.00m


The previous LVM partitions are of course mounted into the LXC NFS container
according to the previous configuration.


Step 2: NFS server configuration (target = vm-nfs-260)
------------------------------------------------------

ansible@vm-nfs-260:~$ df
Filesystem 1K-blocks Used Available
Use% Mounted on
/dev/mapper/vg_vm_nfs_260-lv_rootfs 118867 4906 105074 5% /
none 492 0 492
0% /dev
/dev/mapper/vg_vm_nfs_260-lv_usr 677032 371040 256680
60% /usr
/dev/mapper/vg_vm_nfs_260-lv_var 106967 2335 96605
3% /var
/dev/mapper/vg_vm_nfs_260-lv_tmp 138697 1550 126826
2% /tmp
/dev/mapper/vg_vm_nfs_260-lv_home 118867 1769 108211
2% /home
/dev/mapper/vg_vm_nfs_260-lv_var_log 194235 79682 100217
45% /var/log
/dev/mapper/vg_vm_nfs_260-lv_var_lib 126786 11643 105682
10% /var/lib
/dev/mapper/vg_vm_nfs_260-lv_var_cache 372607 68377 280474
20% /var/cache
/dev/mapper/vg_vm_nfs_260-lv_var_lib_apt 721392 155928 513000
24% /var/lib/apt
/dev/mapper/vg_vm_nfs_260-lv_nfs_home 59600812 18687568 37855992
34% /srv
tmpfs 196624 56 196568
1% /run
tmpfs 5120 0 5120
0% /run/lock
tmpfs 393240 40 393200
1% /dev/shm
ansible@vm-nfs-260:~$
ansible@vm-nfs-260:~$ cat /etc/debian_version
10.9

ansible@vm-nfs-260:~$ pstree -anp
init,1
|-rpcbind,662 -w -h vm-nfs-260-service
|-rpc.statd,671 --state-directory-path /var/lib/nfs --port 32766
--outgoing-port 32765 --name vm-nfs-260-service
|-rpc.idmapd,680
|-rpc.mountd,742 --state-directory-path /var/lib/nfs --manage-gids
--port 32767 --num-threads=6
| |-rpc.mountd,745 --state-directory-path /var/lib/nfs
--manage-gids --port 32767 --num-threads=6
| |-rpc.mountd,746 --state-directory-path /var/lib/nfs
--manage-gids --port 32767 --num-threads=6
| |-rpc.mountd,747 --state-directory-path /var/lib/nfs
--manage-gids --port 32767 --num-threads=6
| |-rpc.mountd,748 --state-directory-path /var/lib/nfs
--manage-gids --port 32767 --num-threads=6
| |-rpc.mountd,749 --state-directory-path /var/lib/nfs
--manage-gids --port 32767 --num-threads=6
| `-rpc.mountd,750 --state-directory-path /var/lib/nfs
--manage-gids --port 32767 --num-threads=6
|-syslog-ng,764
| `-syslog-ng,765 -p /var/run/syslog-ng.pid --no-caps
|-cron,788
|-monit,804 -c /etc/monit/monitrc
| |-{monit},9259
| |-{monit},9260
| `-(verify_rpc_stat,10086)
|-getty,808 115200 console
`-sshd,32294
`-sshd,10079
`-sshd,10081
`-bash,10082
`-pstree,10107 -anp
ansible@vm-nfs-260:~$

Only ssh, monit, and syslog are running, with a sysvinit init
(no systemd!).

ansible@vm-nfs-260:~$ ip route ls
default via 192.168.24.254 dev et-user
192.168.22.0/24 dev et-serv proto kernel scope link src 192.168.22.136
192.168.24.0/24 dev et-user proto kernel scope link src 192.168.24.136
192.168.25.0/24 dev et-admi proto kernel scope link src 192.168.25.136
ansible@vm-nfs-260:~$


Of course, all configuration files are 100% compatible with Debian Buster.

ansible@vm-nfs-260:~$ cat /etc/default/nfs-kernel-server |grep -v "#"
|grep -v ^$
RPCNFSDCOUNT=8
RPCNFSDPRIORITY=0
RPCMOUNTDOPTS="--state-directory-path /var/lib/nfs --manage-gids --port
32767 --num-threads=6"
NEED_SVCGSSD="no"
ansible@vm-nfs-260:~$

ansible@vm-nfs-260:~$ cat /etc/exports |grep -v "#" |grep -v ^$

/srv/nfs/home
localhost(rw,secure_locks,insecure,no_subtree_check,no_all_squash,async,no_root_squash)

/srv/nfs/home
192.168.22.0/24(rw,secure_locks,insecure,no_subtree_check,no_all_squash,async,root_squash)
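These export entries can be sanity-checked with a small script; this is a sketch only, using a sample file that mirrors the two entries above (the awk logic assumes the one-client-per-line exports(5) layout shown here):

```shell
#!/bin/sh
# Build a small sample in exports(5) format, mirroring the entries above.
cat > /tmp/exports.sample <<'EOF'
/srv/nfs/home localhost(rw,secure_locks,insecure,no_subtree_check,no_all_squash,async,no_root_squash)
/srv/nfs/home 192.168.22.0/24(rw,secure_locks,insecure,no_subtree_check,no_all_squash,async,root_squash)
EOF

# Print each client spec (host part without the option list) for the path.
awk '$1 == "/srv/nfs/home" { sub(/\(.*/, "", $2); print $2 }' /tmp/exports.sample
```

On the running server itself, `exportfs -v` shows the same information from the kernel's point of view after the exports have been applied.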


Step 3: NFS client side
-----------------------

For the client side, I am running the NFS client on the amd64, armhf, and
arm64 architectures, either on a real physical target or in an LXC
container.

For example, the following configuration is done on an LXC arm64 Bullseye
target:
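For reference, a client mount like the one visible in the df output can be expressed as an /etc/fstab entry; this is a sketch, with the filesystem type and mount options as assumptions (only the server path and mount point come from this message):

```
# Hypothetical /etc/fstab entry on the client (fstype/options are assumptions):
vm-nfs-260-service.sub-dns-lapiteau.TLD.jml:/srv/nfs/home/jean-marc  /nfs-home/jean-marc  nfs  rw,hard  0  0
```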

jean-marc@vm-bullseye-arm64-280:~$ df
Filesystem
1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_vm_bullseye_arm64_280-lv_rootfs
118867 10847 99133 10% /
none
492 0 492 0% /dev
udev
1806524 0 1806524 0% /dev/dri
/dev/mapper/vg_vm_bullseye_arm64_280-lv_usr
5981956 4439736 1218636 79% /usr
/dev/mapper/vg_vm_bullseye_arm64_280-lv_var
206112 5011 186151 3% /var
/dev/mapper/vg_vm_bullseye_arm64_280-lv_tmp
138697 1554 126822 2% /tmp
/dev/mapper/vg_vm_bullseye_arm64_280-lv_home
118867 3256 106724 3% /home
/dev/mapper/vg_vm_bullseye_arm64_280-lv_var_log
194235 66707 113192 38% /var/log
/dev/mapper/vg_vm_bullseye_arm64_280-lv_var_lib
126786 48031 69294 41% /var/lib
/dev/mapper/vg_vm_bullseye_arm64_280-lv_var_cache
991512 196272 727656 22% /var/cache
/dev/mapper/vg_vm_bullseye_arm64_280-lv_var_lib_apt
721392 279148 389780 42% /var/lib/apt
tmpfs
390880 736 390144 1% /run
tmpfs
5120 0 5120 0% /run/lock
tmpfs
781760 16 781744 1% /dev/shm
tmpfs
390880 0 390880 0% /run/user/10000
vm-nfs-260-service.sub-dns-lapiteau.TLD.jml:/srv/nfs/home/jean-marc
59600896 18687616 37856000 34% /nfs-home/jean-marc
jean-marc@vm-bullseye-arm64-280:~$ uname -a
Linux vm-bullseye-arm64-280 5.10.0-0.bpo.5-arm64 #1 SMP Debian
5.10.24-1~bpo10+1 (2021-03-29) aarch64 GNU/Linux
jean-marc@vm-bullseye-arm64-280:~$ cat /etc/debian_version
bullseye/sid
jean-marc@vm-bullseye-arm64-280:~$


best regards
----------------------------------------
-- Jean-Marc LACROIX (06 82 29 98 66) --

Salvatore Bonaccorso

Apr 24, 2021, 2:40:03 AM
Hi Jean-Marc,

On Fri, Apr 23, 2021 at 09:23:35PM +0200, Jean-Marc LACROIX (jeanmarc...@free.fr) wrote:
>
>
> Le 23/04/2021 à 20:27, Debian Bug Tracking System a écrit :
> > This is an automatic notification regarding your Bug report
> > which was filed against the src:linux package:
> >
> > #763192: NFSv4 server recovery not supported in container
> >
> > It has been closed by Salvatore Bonaccorso <car...@debian.org>.
> >
> > Their explanation is attached below along with your original report.
> > If this explanation is unsatisfactory and you have not received a
> > better one in a separate message then please contact Salvatore Bonaccorso <car...@debian.org> by
> > replying to this email.
> >
> >
> Hi,
>
> Please find here some good news about this issue. It is now possible to run
> NFS server into one LXC container.

Many thanks for sharing this!

Regards,
Salvatore