Another force unmount failure

Emmanuel Dreyfus

Jul 16, 2015, 12:54:12 PM

Hello

I discovered another scenario where force unmount could not work: an
unresponsive PUFFS filesystem. The filesystem got out of order during an
operation where the filesystem root vnode is locked. As a result, trying
to unmount goes this way

sys_unmount -> namei -> namei_tryemulroot -> lookup_once -> VFS_ROOT ->
puffs_vfsop_root -> vn_lock -> VOP_LOCK and we lose.

I can kill the userland process, but the mount remains, and any attempt
to touch it sinks in tstile.

Any idea how that can be fixed?

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
ma...@netbsd.org


David Holland

Jul 16, 2015, 1:20:33 PM

On Thu, Jul 16, 2015 at 06:57:30PM +0200, Emmanuel Dreyfus wrote:
> Hello
>
> I discovered another scenario where force unmount could not work: an
> unresponsive PUFFS filesystem. The filesystem got out of order during an
> operation where the filesystem root vnode is locked. As a result, trying
> to unmount goes this way
>
> sys_unmount -> namei -> namei_tryemulroot -> lookup_once -> VFS_ROOT ->
> puffs_vfsop_root -> vn_lock -> VOP_LOCK and we lose.
>
> I can kill the userland process, but the mount remains, and any attempt
> to touch it sinks in tstile.
>
> Any idea how that can be fixed?

Either make vnode locks interruptible, or debug puffs.

I favor the former, but lost the argument a few years back. Others
(including e.g. yamt) said no.

--
David A. Holland
dhol...@netbsd.org

Emmanuel Dreyfus

Jul 16, 2015, 1:31:00 PM

David Holland <dholla...@netbsd.org> wrote:

> Either make vnode locks interruptible, or debug puffs.
>
> I favor the former, but lost the argument a few years back. Others
> (including e.g. yamt) said no.

Even if they are not interruptible, I can call VOP_LOCK with LK_NOWAIT and
loop on it until I decide it is time to give up. But once I have failed to
acquire the filesystem root vnode lock, how can I forcibly unmount it?
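
For illustration, the bounded retry described above could look roughly like
the following. This is a userland sketch, not kernel code: struct toy_lock,
toy_trylock(), toy_unlock() and try_lock_bounded() are hypothetical names. A
C11 atomic_flag stands in for the vnode lock, and toy_trylock() stands in for
VOP_LOCK with LK_NOWAIT. It sleeps between attempts rather than spinning.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <time.h>

/* Toy lock for illustration only; it stands in for a vnode lock. */
struct toy_lock {
	atomic_flag held;
};

/* Non-blocking attempt, analogous in spirit to a LK_NOWAIT request. */
static int
toy_trylock(struct toy_lock *lk)
{
	/* test_and_set returns the previous value: true means already held. */
	return atomic_flag_test_and_set(&lk->held) ? EBUSY : 0;
}

static void
toy_unlock(struct toy_lock *lk)
{
	atomic_flag_clear(&lk->held);
}

/*
 * Try up to max_tries times, sleeping sleep_ns nanoseconds (must be
 * below one second) after each failed attempt. Returns 0 with the lock
 * held, or EBUSY once the caller-chosen budget is exhausted.
 */
static int
try_lock_bounded(struct toy_lock *lk, int max_tries, long sleep_ns)
{
	struct timespec pause = { 0, sleep_ns };

	for (int i = 0; i < max_tries; i++) {
		if (toy_trylock(lk) == 0)
			return 0;
		nanosleep(&pause, NULL);
	}
	return EBUSY;
}
```

Giving up this way only tells the caller that the lock is stuck. It does not
release the lock, so the question of how to unmount afterwards remains open.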

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
ma...@netbsd.org

Emmanuel Dreyfus

Jul 17, 2015, 7:41:17 AM

Emmanuel Dreyfus <ma...@netbsd.org> wrote:

> I discovered another scenario where force unmount could not work: an
> unresponsive PUFFS filesystem. The filesystem got out of order during an
> operation where the filesystem root vnode is locked.

I tried LOCKDEBUG to get some hints, and now I get a nice
panic (see below).

gdb suggests this happens in LOCKDEBUG_BARRIER(NULL, 0) in
mi_userret().

Can anyone knowledgeable tell me what this is, so that I get some idea
of what needs to be fixed? I have the feeling it detects that I leave the
kernel while a lock has not been released. Is that the problem?

The panic:

Reader / writer lock error: lockdebug_barrier: sleep lock held

lock address : 0x00000000c2965dc0 type : sleep/adaptive
initialized : 0x00000000c0400b1b
shared holds : 0 exclusive: 1
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0x00000000c2a1bd20 last held: 0x00000000c2a1bd20
last locked* : 0x00000000c018b217 unlocked : 0x00000000c018b2dd
owner/count : 0x00000000c2a1bd20 flags : 0x0000000000000004

Turnstile chain at 0xc049ef00.
=> No active turnstile for this lock.

panic: LOCKDEBUG: Reader / writer lock error: lockdebug_barrier: sleep lock held
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c012fcd4 cs 9 eflags 282 cr2 b9d1997f ilevel 0 esp da8b3ecc
curlwp 0xc2a1bd20 pid 4216 lid 1 lowest kstack 0xda8b22c0
Stopped in pid 4216.1 (ln) at netbsd:breakpoint+0x4: popl %ebp
breakpoint(c04803c9,c04e7fe0,c04803cb,da8b3ef8,0,c048048d,da8b3eec,c28c0280,0,c048048d) at netbsd:breakpoint+0x4
vpanic(c04803cb,da8b3ef8,207,c048048d,0,c28c0280,da8b3f2c,c035847e,c04803cb,c04751d2) at netbsd:vpanic+0x117
panic(c04803cb,c04751d2,c045cacc,c048048d,c2a1bd20,ffffff9c,17ffc77,c045cacc,bf7ffc8d,400) at netbsd:panic+0x18
lockdebug_abort1(c048048d,1,da8b3f7c,da8b3fa0,c02ebc23,da8b3f54,9,106,bf7ffc77,bf7ffc8d) at netbsd:lockdebug_abort1+0xce
syscall() at netbsd:syscall+0xea
--- syscall (number 0) ---
bb69d1d7:
ds da8b0011
es c0480011 copyright+0x1b131
fs 31
gs da8b0011
edi da8b3ef8
esi c04803cb copyright+0x1b4eb
ebp da8b3e9c
ebx 104
edx 0
ecx 0
eax 1
eip c012fcd4 breakpoint+0x4
cs 9
eflags 282
esp da8b3e9c
ss 11
netbsd:breakpoint+0x4: popl %ebp

(gdb) list *(syscall+0xea)
0xc037d4ca is in syscall (../../../../arch/x86/x86/syscall.c:185).
180 }
181 }
182
183 SYSCALL_TIME_SYS_EXIT(l);
184 userret(l);
185 }
186
187 void
188 syscall_intern(struct proc *p)
189 {

On i386, userret() is just a call to mi_userret(), which ends with
LOCKDEBUG_BARRIER(NULL, 0);

If LOCKDEBUG is enabled, LOCKDEBUG_BARRIER is defined as
#define LOCKDEBUG_BARRIER(lock, slp) lockdebug_barrier(lock, slp)
And:

void
lockdebug_barrier(volatile void *spinlock, int slplocks)
{
	struct lwp *l = curlwp;
	lockdebug_t *ld;
	int s;

	if (panicstr != NULL || ld_panic)
		return;

	s = splhigh();
	if ((l->l_pflag & LP_INTR) == 0) {
		TAILQ_FOREACH(ld, &curcpu()->ci_data.cpu_ld_locks, ld_chain) {
			if (ld->ld_lock == spinlock) {
				continue;
			}
			__cpu_simple_lock(&ld->ld_spinlock);
			lockdebug_abort1(ld, s, __func__,
			    "spin lock held", true);
			return;
		}
	}
	if (slplocks) {
		splx(s);
		return;
	}
	if ((ld = TAILQ_FIRST(&l->l_ld_locks)) != NULL) {
		__cpu_simple_lock(&ld->ld_spinlock);
		lockdebug_abort1(ld, s, __func__, "sleep lock held", true);
		return;
	}
	splx(s);
	if (l->l_shlocks != 0) {
		panic("lockdebug_barrier: holding %d shared locks",
		    l->l_shlocks);
	}
}

Taylor R Campbell

Jul 17, 2015, 12:00:29 PM

Date: Fri, 17 Jul 2015 13:44:28 +0200
From: ma...@netbsd.org (Emmanuel Dreyfus)

I tried LOCKDEBUG to get some hints, and now I get a nice
panic (see below).

gdb suggests this happens in LOCKDEBUG_BARRIER(NULL, 0) in
mi_userret().

Can anyone knowledgeable tell me what this is, so that I get some idea
of what needs to be fixed? I have the feeling it detects that I leave the
kernel while a lock has not been released. Is that the problem?

Correct. The part you want to look at is:

Reader / writer lock error: lockdebug_barrier: sleep lock held

lock address : 0x00000000c2965dc0 type : sleep/adaptive
initialized : 0x00000000c0400b1b
shared holds : 0 exclusive: 1
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0x00000000c2a1bd20 last held: 0x00000000c2a1bd20
last locked* : 0x00000000c018b217 unlocked : 0x00000000c018b2dd
               ^^^^^^^^^^^^^^^^^^
owner/count : 0x00000000c2a1bd20 flags : 0x0000000000000004

`Last locked' tells you the return address of the call to rw_enter
that last acquired the lock. (The other addresses may be useful for
other lockdebug panics but aren't likely to be of much use here.)
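
As a rough illustration of how a debug layer can remember where a lock was
last taken, here is a small userland sketch. It assumes GCC or Clang for
__builtin_return_address(); struct dbg_lock, dbg_lock_enter(),
dbg_lock_exit() and dbg_barrier() are hypothetical names and not the actual
lockdebug implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical debug record, loosely modelled on the panic output. */
struct dbg_lock {
	int   held;		/* 1 while held exclusively */
	void *last_locked;	/* return address of the acquiring call */
	void *last_unlocked;	/* return address of the releasing call */
};

/* noinline, so that the recorded address lies in the caller. */
__attribute__((noinline)) static void
dbg_lock_enter(struct dbg_lock *lk)
{
	lk->held = 1;
	lk->last_locked = __builtin_return_address(0);
}

__attribute__((noinline)) static void
dbg_lock_exit(struct dbg_lock *lk)
{
	lk->held = 0;
	lk->last_unlocked = __builtin_return_address(0);
}

/*
 * Check meant for a "return to userland" point: returns NULL if the
 * lock is free, otherwise the address recorded when it was taken.
 */
static void *
dbg_barrier(const struct dbg_lock *lk)
{
	return lk->held ? lk->last_locked : NULL;
}
```

An address obtained this way can be handed to gdb's list *ADDRESS to find
the source line just after the acquiring call.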

Emmanuel Dreyfus

Jul 17, 2015, 12:34:13 PM

Taylor R Campbell <campbell+net...@mumble.net> wrote:

> `Last locked' tells you the return address of the call to rw_enter
> that last acquired the lock. (The other addresses may be useful for
> other lockdebug panics but aren't likely to be of much use here.)

Here is the code. The function cannot exit without vp->v_interlock being
unlocked. What does that mean?

Could vdead_check() go into vwait() -> cv_wait(), and while we were
sleeping, another thread exited the kernel and triggered the debug
check? But the check seems to be performed within a given thread.

(gdb) list *0xc018b217
0xc018b217 is in genfs_lock
(../../../../miscfs/genfs/genfs_vnops.c:385).
380 return error;
381 }
382
383 fstrans_start(mp, FSTRANS_SHARED);
384 rw_enter(&vp->v_lock, op);
385 mutex_enter(vp->v_interlock);
386 error = vdead_check(vp, VDEAD_NOWAIT);
387 if (error) {
388 rw_exit(&vp->v_lock);
389 fstrans_done(mp);
390 error = vdead_check(vp, 0);
391 KASSERT(error == ENOENT);
392 }
393 mutex_exit(vp->v_interlock);
394 return error;
395 }

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
ma...@netbsd.org

Taylor R Campbell

Jul 17, 2015, 12:43:53 PM

Date: Fri, 17 Jul 2015 18:37:30 +0200
From: ma...@netbsd.org (Emmanuel Dreyfus)

Taylor R Campbell <campbell+net...@mumble.net> wrote:

> `Last locked' tells you the return address of the call to rw_enter
> that last acquired the lock. (The other addresses may be useful for
> other lockdebug panics but aren't likely to be of much use here.)

Here is the code. The function cannot exit without vp->v_interlock being
unlocked. What does that mean?

(gdb) list *0xc018b217
0xc018b217 is in genfs_lock
(../../../../miscfs/genfs/genfs_vnops.c:385).
380 return error;
381 }
382
383 fstrans_start(mp, FSTRANS_SHARED);
384 rw_enter(&vp->v_lock, op);
385 mutex_enter(vp->v_interlock);

What's locked is not v_interlock, but v_lock. The address c018b217,
or line 385, is the return address of the rw_enter call, i.e. what
happens afterward.

Since this is in genfs_lock, what happened here is almost certainly
that you locked a vnode but didn't unlock it. Unfortunately,
lockdebug does not provide a full stack trace, so you'll have to guess
where the relevant vn_lock was.

(Perhaps we ought to put extra lockdebug crud in vn_lock and a new
vn_unlock in order to track these things down more usefully.)
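
For readers unfamiliar with this kind of bug, a leaked lock typically comes
from an error path that returns early without unlocking. The following is a
userland sketch with made-up names (struct toy_vnode, toy_vnode_lock(),
toy_vnode_unlock(), op_leaky(), op_fixed()); it is not taken from puffs or
genfs and only illustrates the pattern and the usual single-exit fix.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for a vnode and its lock; all names are hypothetical. */
struct toy_vnode {
	int locked;
	int broken;	/* nonzero makes the operation fail */
};

static void
toy_vnode_lock(struct toy_vnode *vp)
{
	assert(!vp->locked);
	vp->locked = 1;
}

static void
toy_vnode_unlock(struct toy_vnode *vp)
{
	assert(vp->locked);
	vp->locked = 0;
}

/* Buggy: the early return on the error path leaves vp locked. */
static int
op_leaky(struct toy_vnode *vp)
{
	toy_vnode_lock(vp);
	if (vp->broken)
		return EIO;	/* lock leaked here */
	toy_vnode_unlock(vp);
	return 0;
}

/* Fixed: every path leaves through the single unlock at "out". */
static int
op_fixed(struct toy_vnode *vp)
{
	int error = 0;

	toy_vnode_lock(vp);
	if (vp->broken) {
		error = EIO;
		goto out;
	}
	/* ... the real work would happen here ... */
out:
	toy_vnode_unlock(vp);
	return error;
}
```

With the leaky variant, the next attempt to take the same lock blocks
forever, which matches the tstile symptom described at the start of the
thread.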

David Holland

Jul 17, 2015, 12:44:36 PM

On Fri, Jul 17, 2015 at 06:37:30PM +0200, Emmanuel Dreyfus wrote:
> > `Last locked' tells you the return address of the call to rw_enter
> > that last acquired the lock. (The other addresses may be useful for
> > other lockdebug panics but aren't likely to be of much use here.)
>
> Here is the code. The function cannot exit without vp->v_interlock being
> unlocked. What does that mean?
>

> [...]

>
> (gdb) list *0xc018b217
> 0xc018b217 is in genfs_lock

That you've leaked a vnode lock.

--
David A. Holland
dhol...@netbsd.org

David Holland

Jul 17, 2015, 12:45:36 PM

On Fri, Jul 17, 2015 at 04:43:36PM +0000, Taylor R Campbell wrote:
> (Perhaps we ought to put extra lockdebug crud in vn_lock and a new
> vn_unlock in order to track these things down more usefully.)

Yes please :-)

(vn_unlock should exist for symmetry; I've been meaning to institute
it for so long that I'd just about forgotten)

--
David A. Holland
dhol...@netbsd.org

Emmanuel Dreyfus

Jul 17, 2015, 3:14:21 PM

Taylor R Campbell <campbell+net...@mumble.net> wrote:

> Since this is in genfs_lock, what happened here is almost certainly
> that you locked a vnode but didn't unlock it. Unfortunately,
> lockdebug does not provide a full stack trace, so you'll have to guess
> where the relevant vn_lock was.

Is there any chance to at least get the syscall number?

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
ma...@netbsd.org

Taylor R Campbell

Jul 17, 2015, 3:32:30 PM

Date: Fri, 17 Jul 2015 21:17:43 +0200
From: ma...@netbsd.org (Emmanuel Dreyfus)

Taylor R Campbell <campbell+net...@mumble.net> wrote:

> Since this is in genfs_lock, what happened here is almost certainly
> that you locked a vnode but didn't unlock it. Unfortunately,
> lockdebug does not provide a full stack trace, so you'll have to guess
> where the relevant vn_lock was.

Is there any chance to at least get the syscall number?

In theory, yes -- it is put on the stack when control enters the
syscall vector, and the `SPL NOT LOWERED AFTER EXIT' error gets at it,
in, e.g., arch/amd64/amd64/locore.S. But I'm not sure there's an easy
way to get at it otherwise. From a core dump, you could perhaps
grovel through the stack to find it, if you were dedicated enough.

As an alternative, I could imagine putting another LOCKDEBUG_BARRIER
inside sy_invoke, and modifying LOCKDEBUG_BARRIER to take a formatted
string which, in this case, would be used to show the syscall number.
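
A minimal userland sketch of that idea follows, assuming a barrier that
accepts a printf-style context string. barrier_check() and its parameters
are hypothetical; the real LOCKDEBUG_BARRIER takes no such argument, and a
kernel would panic rather than return an error.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical barrier with caller-supplied context. Returns 0 if no
 * locks are held; otherwise formats a diagnostic into buf and returns
 * -1. nheld stands in for the per-thread count of held sleep locks.
 */
static int
barrier_check(int nheld, char *buf, size_t buflen, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (nheld == 0)
		return 0;

	n = snprintf(buf, buflen, "barrier: %d sleep lock(s) held: ", nheld);
	if (n < 0 || (size_t)n >= buflen)
		return -1;	/* prefix did not fit; still a failure */

	/* Append the caller's context, e.g. the syscall number. */
	va_start(ap, fmt);
	vsnprintf(buf + n, buflen - (size_t)n, fmt, ap);
	va_end(ap);
	return -1;
}
```

A call such as barrier_check(nheld, msg, sizeof(msg), "syscall %d", code)
placed at the syscall dispatch point would then name the offending syscall
in the diagnostic; nheld, msg and code are placeholders here.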

Emmanuel Dreyfus

Jul 17, 2015, 10:56:18 PM

David Holland <dholla...@netbsd.org> wrote:

> That you've leaked a vnode lock.

I did not leak anything: this is netbsd-7 PUFFS without any add-on from
me :-/

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
ma...@netbsd.org

David Holland

Jul 18, 2015, 2:28:09 PM

On Sat, Jul 18, 2015 at 04:59:36AM +0200, Emmanuel Dreyfus wrote:
> David Holland <dholla...@netbsd.org> wrote:
>
> > That you've leaked a vnode lock.
>
> I did not leak anything: this is netbsd-7 PUFFS without any add-on from
> me :-/

Must have been pooka then :-) but that's what happened.

--
David A. Holland
dhol...@netbsd.org

David Holland

Jul 18, 2015, 2:35:22 PM

On Thu, Jul 16, 2015 at 07:34:20PM +0200, Emmanuel Dreyfus wrote:
> David Holland <dholla...@netbsd.org> wrote:
>
> > Either make vnode locks interruptible, or debug puffs.
> >
> > I favor the former, but lost the argument a few years back. Others
> > (including e.g. yamt) said no.
>
> Even if they are not interruptible, I can call VOP_LOCK with LK_NOWAIT and
> loop on it until I decide it is time to give up.

Busy-looping like that on a vnode lock is Wrong(TM).

> But once I have failed to acquire the
> filesystem root vnode lock, how can I forcibly unmount it?

You probably can't. It would be nice to have recovery logic for that
level of problem, but it's not likely to happen anytime soon.

--
David A. Holland
dhol...@netbsd.org

Emmanuel Dreyfus

Jul 18, 2015, 6:38:14 PM

David Holland <dholla...@netbsd.org> wrote:

> > I did not leak anything: this is netbsd-7 PUFFS without any add-on from
> > me :-/
>
> Must have been pooka then :-) but that's what happened.

I finally sorted it out: the tree was checked out with -rnetbsd-7, but at
some point src/sys/fs/puffs was cleared and checked out from -current,
resulting in a hybrid kernel. The difference is small, just a few lines, but
of course a cvs diff would not show it, since it silently compared the tree
to netbsd-7 and that directory to HEAD.

Hence in the end there is no bug to fix; it was a developer PEBKAC. But
LOCKDEBUG_BARRIER did flag a real problem.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
ma...@netbsd.org

Don Lee

Jul 18, 2015, 9:31:16 PM

On Jul 18, 2015, at 1:35 PM, David Holland <dholla...@netbsd.org> wrote:

>>>
>>> Either make vnode locks interruptible, or debug puffs.
>>>
>>> I favor the former, but lost the argument a few years back. Others
>>> (including e.g. yamt) said no.
>>
>> Even if they are not interruptible, I can call VOP_LOCK with LK_NOWAIT and
>> loop on it until I decide it is time to give up.
>
> Busy-looping like that on a vnode lock is Wrong(TM).

FWIW, I would much rather get a panic than a lockup.

-dgl-

Emmanuel Dreyfus

Jul 18, 2015, 10:22:54 PM

Don Lee <Mac...@c.icompute.com> wrote:

> FWIW, I would much rather get a panic than a lockup.

You get that with LOCKDEBUG enabled.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
ma...@netbsd.org