To subscribe or unsubscribe via the World Wide Web, visit
http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
or, via email, send a message with subject or body 'help' to
freebsd-sc...@freebsd.org
You can reach the person managing the list at
freebsd-s...@freebsd.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of freebsd-scsi digest..."
Today's Topics:
1. Re: How is supposed to be protected the units list? (Attilio Rao)
2. Re: How is supposed to be protected the units list? (Attilio Rao)
3. Re: How is supposed to be protected the units list?
(Matthew Jacob)
4. Re: How is supposed to be protected the units list?
(Matthew Jacob)
5. Re: How is supposed to be protected the units list? (Ed Maste)
6. Re: How is supposed to be protected the units list? (Attilio Rao)
7. Re: How is supposed to be protected the units list?
(Matthew Jacob)
8. Re: How is supposed to be protected the units list? (Attilio Rao)
9. Re: How is supposed to be protected the units list?
(Matthew Jacob)
10. Re: How is supposed to be protected the units list? (Attilio Rao)
11. Re: How is supposed to be protected the units list?
(Matthew Jacob)
12. Re: How is supposed to be protected the units list? (Attilio Rao)
13. Re: How is supposed to be protected the units list?
(Matthew Jacob)
14. Re: How is supposed to be protected the units list?
(Matthew Jacob)
15. Re: How is supposed to be protected the units list? (Attilio Rao)
16. Re: How is supposed to be protected the units list?
(Matthew Jacob)
17. Re: How is supposed to be protected the units list?
(Matthew Jacob)
18. Re: How is supposed to be protected the units list?
(Alexander Motin)
----------------------------------------------------------------------
Message: 1
Date: Wed, 3 Mar 2010 22:34:36 +0100
From: Attilio Rao <att...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: Matthew Jacob <m...@feral.com>
Cc: freebs...@freebsd.org
Message-ID:
<3bbf2fe11003031334g459...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
2010/3/2 Matthew Jacob <m...@feral.com>:
> I will admit to not looking at this stuff closely. But I'll also test with
> this today and give back an opinion.
> I would really like to hear Scott, Ken, Alexander or Justin express an
> opinion on this.
So I stress-tested the patch for several hours (6-7) with a
stress-test that could reproduce the bug for us, on a debugging
kernel, and it didn't panic'ed or showed LORs, deadlock, etc.
If someone could offer time for reviews or futher examinations it
would be very much appreciated.
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein
------------------------------
Message: 2
Date: Wed, 3 Mar 2010 22:48:31 +0100
From: Attilio Rao <att...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: Ed Maste <ema...@freebsd.org>
Cc: freebs...@freebsd.org, Matthew Jacob <m...@feral.com>
Message-ID:
<3bbf2fe11003031348q4c1...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
2010/3/3 Ed Maste <ema...@freebsd.org>:
> On Wed, Mar 03, 2010 at 10:34:36PM +0100, Attilio Rao wrote:
>
>> So I stress-tested the patch for several hours (6-7) with a
>> stress-test that could reproduce the bug for us, on a debugging
>> kernel, and it didn't panic'ed or showed LORs, deadlock, etc.
>>
>> If someone could offer time for reviews or futher examinations it
>> would be very much appreciated.
>
> I reviewed the patch and am happy with it. My only comment is to
> consider using macros for the lock/unlock; it seems to be a pretty
> common idiom.
>
> #define FOO_LOCK_INIT() \
> mtx_init(&foo_mtx, "foo lock", NULL, MTX_DEF)
> #define FOO_LOCK_ASSERT() mtx_assert(&foo_mtx, MA_OWNED)
> #define FOO_LOCK() mtx_lock(&foo_mtx)
> #define FOO_UNLOCK() mtx_unlock(&foo_mtx)
Well, using functions will allow us to not exort the "foo_mtx" symbol,
something I want to avoid.
Such idiot is mostly better when the mutex is part of the ABI.
Attilio
--
Peace can only be achieved by understanding - A. Einstein
------------------------------
Message: 3
Date: Wed, 03 Mar 2010 13:49:54 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: Attilio Rao <att...@freebsd.org>
Cc: freebs...@freebsd.org
Message-ID: <4B8ED982...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
On 03/03/2010 01:34 PM, Attilio Rao wrote:
> 2010/3/2 Matthew Jacob<m...@feral.com>:
>
>> I will admit to not looking at this stuff closely. But I'll also test with
>> this today and give back an opinion.
>> I would really like to hear Scott, Ken, Alexander or Justin express an
>> opinion on this.
>>
> So I stress-tested the patch for several hours (6-7) with a
> stress-test that could reproduce the bug for us, on a debugging
> kernel, and it didn't panic'ed or showed LORs, deadlock, etc.
>
> If someone could offer time for reviews or futher examinations it
> would be very much appreciated.
>
> Thanks,
> Attilio
>
>
>
I didn't get a chance to look at it more- work intervened.
Have you tested with FC or SAS with drives arriving/departing a lot?
------------------------------
Message: 4
Date: Wed, 03 Mar 2010 13:55:52 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: freebs...@freebsd.org
Message-ID: <4B8EDAE8...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
On static review, the only code that makes me nervous are
ata_shutdown/da_shutdown.
Those are the only places where you hold that lock across an uncertain
interval.
------------------------------
Message: 5
Date: Wed, 3 Mar 2010 16:44:24 -0500
From: Ed Maste <ema...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: Attilio Rao <att...@freebsd.org>
Cc: freebs...@freebsd.org, Matthew Jacob <m...@feral.com>
Message-ID: <20100303214...@sandvine.com>
Content-Type: text/plain; charset=us-ascii
On Wed, Mar 03, 2010 at 10:34:36PM +0100, Attilio Rao wrote:
> So I stress-tested the patch for several hours (6-7) with a
> stress-test that could reproduce the bug for us, on a debugging
> kernel, and it didn't panic'ed or showed LORs, deadlock, etc.
>
> If someone could offer time for reviews or futher examinations it
> would be very much appreciated.
I reviewed the patch and am happy with it. My only comment is to
consider using macros for the lock/unlock; it seems to be a pretty
common idiom.
#define FOO_LOCK_INIT() \
mtx_init(&foo_mtx, "foo lock", NULL, MTX_DEF)
#define FOO_LOCK_ASSERT() mtx_assert(&foo_mtx, MA_OWNED)
#define FOO_LOCK() mtx_lock(&foo_mtx)
#define FOO_UNLOCK() mtx_unlock(&foo_mtx)
Regards,
Ed
------------------------------
Message: 6
Date: Wed, 3 Mar 2010 22:57:21 +0100
From: Attilio Rao <att...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: m...@feral.com
Cc: freebs...@freebsd.org
Message-ID:
<3bbf2fe11003031357o518...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
2010/3/3 Matthew Jacob <m...@feral.com>:
> On static review, the only code that makes me nervous are
> ata_shutdown/da_shutdown.
> Those are the only places where you hold that lock across an uncertain
> interval.
Please note that a def mutex is already held (the cam_periph_lock),
so, unless LORs I'm not thinking about, I don't expect too much
surprises for that codepath.
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein
------------------------------
Message: 7
Date: Wed, 03 Mar 2010 15:30:48 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: freebs...@freebsd.org
Message-ID: <4B8EF128...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
On 3/3/2010 1:57 PM, Attilio Rao wrote:
> 2010/3/3 Matthew Jacob<m...@feral.com>:
>
>> On static review, the only code that makes me nervous are
>> ata_shutdown/da_shutdown.
>> Those are the only places where you hold that lock across an uncertain
>> interval.
>>
> Please note that a def mutex is already held (the cam_periph_lock),
> so, unless LORs I'm not thinking about, I don't expect too much
> surprises for that codepath.
>
> Thanks,
> Attilio
>
>
>
The only potential operational difference was on reattach (power SAN
disks on, they attach. Power them off. Wait for 60 seconds. The
deattach. Power them on again). I've run this test many times before and
haven't seen this.
da7: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
(da7:isp0:0:6:0): removing device entry
panic: Bad link elm 0xffffff0018fbbc00 next->prev != elm
cpuid = 0
KDB: enter: panic
[ thread pid 0 tid 100044 ]
Stopped at kdb_enter+0x3d: movq $0,0x6b02d0(%rip)
db> bt
Tracing pid 0 tid 100044 td 0xffffff0002fbe000
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
camperiphfree() at camperiphfree+0x1c2
cam_periph_release_locked() at cam_periph_release_locked+0x48
cam_periph_release() at cam_periph_release+0x53
dasysctlinit() at dasysctlinit+0x153
taskqueue_run() at taskqueue_run+0x91
taskqueue_thread_loop() at taskqueue_thread_loop+0x3f
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8000188d30, rbp = 0 ---
------------------------------
Message: 8
Date: Thu, 4 Mar 2010 00:32:37 +0100
From: Attilio Rao <att...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: Matthew Jacob <m...@feral.com>
Cc: freebs...@freebsd.org
Message-ID:
<3bbf2fe11003031532u220...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
2010/3/4 Matthew Jacob <m...@feral.com>:
> On 3/3/2010 1:57 PM, Attilio Rao wrote:
>>
>> 2010/3/3 Matthew Jacob<m...@feral.com>:
>>
>>>
>>> On static review, the only code that makes me nervous are
>>> ata_shutdown/da_shutdown.
>>> Those are the only places where you hold that lock across an uncertain
>>> interval.
>>>
>>
>> Please note that a def mutex is already held (the cam_periph_lock),
>> so, unless LORs I'm not thinking about, I don't expect too much
>> surprises for that codepath.
>>
>> Thanks,
>> Attilio
>>
>>
>>
>
> The only potential operational difference was on reattach (power SAN disks
> on, they attach. Power them off. Wait for 60 seconds. The deattach. Power
> them on again). I've run this test many times before and haven't seen this.
>
> da7: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
> (da7:isp0:0:6:0): removing device entry
> panic: Bad link elm 0xffffff0018fbbc00 next->prev != elm
> cpuid = 0
> KDB: enter: panic
> [ thread pid 0 tid 100044 ]
> Stopped at kdb_enter+0x3d: movq $0,0x6b02d0(%rip)
> db> bt
> Tracing pid 0 tid 100044 td 0xffffff0002fbe000
> kdb_enter() at kdb_enter+0x3d
> panic() at panic+0x17b
> camperiphfree() at camperiphfree+0x1c2
> cam_periph_release_locked() at cam_periph_release_locked+0x48
> cam_periph_release() at cam_periph_release+0x53
> dasysctlinit() at dasysctlinit+0x153
> taskqueue_run() at taskqueue_run+0x91
> taskqueue_thread_loop() at taskqueue_thread_loop+0x3f
> fork_exit() at fork_exit+0x12a
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff8000188d30, rbp = 0 ---
Is this with the patch or without?
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein
------------------------------
Message: 9
Date: Wed, 03 Mar 2010 15:39:34 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: freebs...@freebsd.org
Message-ID: <4B8EF33...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
With
---
> Is this with the patch or without?
>
> Thanks,
> Attilio
>
>
>
------------------------------
Message: 10
Date: Thu, 4 Mar 2010 00:47:50 +0100
From: Attilio Rao <att...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: Matthew Jacob <m...@feral.com>
Cc: freebs...@freebsd.org
Message-ID:
<3bbf2fe11003031547kd5...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
2010/3/4 Matthew Jacob <m...@feral.com>:
> With
>
I assume you did use INVARIANTS, did you also include WITNESS?
Also, do you have a coredump for it?
Is this panic reproducible easilly?
Attilio
--
Peace can only be achieved by understanding - A. Einstein
------------------------------
Message: 11
Date: Wed, 03 Mar 2010 16:06:40 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: Attilio Rao <att...@freebsd.org>
Cc: freebs...@freebsd.org
Message-ID: <4B8EF990...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
On 3/3/2010 3:47 PM, Attilio Rao wrote:
> 2010/3/4 Matthew Jacob<m...@feral.com>:
>
>> With
>>
>>
> I assume you did use INVARIANTS, did you also include WITNESS?
>
witness, yes, INVARIANTS yes, it was a GENERIC kernel less ahc so it
runs on my box
> Also, do you have a coredump for it?
>
Ah, not, sorry, was impatient. I need to port my core dump fixes to
speed things up. It takes 20 minutes on this box to get a coredump.
> Is this panic reproducible easilly?
>
Dunno. I'll try again. This time I'll let the core dump run through.
------------------------------
Message: 12
Date: Thu, 4 Mar 2010 01:07:50 +0100
From: Attilio Rao <att...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: Matthew Jacob <m...@feral.com>
Cc: freebs...@freebsd.org
Message-ID:
<3bbf2fe11003031607wa3...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
2010/3/4 Matthew Jacob <m...@feral.com>:
> On 3/3/2010 3:47 PM, Attilio Rao wrote:
>>
>> 2010/3/4 Matthew Jacob<m...@feral.com>:
>>
>>>
>>> With
>>>
>>>
>>
>> I assume you did use INVARIANTS, did you also include WITNESS?
>>
>
> witness, yes, INVARIANTS yes, it was a GENERIC kernel less ahc so it runs on
> my box
>
>> Also, do you have a coredump for it?
>>
>
> Ah, not, sorry, was impatient. I need to port my core dump fixes to speed
> things up. It takes 20 minutes on this box to get a coredump.
>
>> Is this panic reproducible easilly?
>>
>
> Dunno. I'll try again. This time I'll let the core dump run through.
Otherwise you can check out if the panic happens exactly on the units
list or it is somewhere else like xpt_remove_periph(), etc.
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein
------------------------------
Message: 13
Date: Wed, 03 Mar 2010 16:14:15 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: freebs...@freebsd.org
Message-ID: <4B8EFB5...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Happened again in the same place. Collecting the core now.
>> Dunno. I'll try again. This time I'll let the core dump run through.
>>
> Otherwise you can check out if the panic happens exactly on the units
> list or it is somewhere else like xpt_remove_periph(), etc.
>
> Thanks,
> Attilio
>
>
>
------------------------------
Message: 14
Date: Wed, 03 Mar 2010 17:47:35 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: freebs...@freebsd.org
Message-ID: <4B8F1137...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
On 3/3/2010 1:57 PM, Attilio Rao wrote:
So sad, the core file produced is not usable. Que lastima.
------------------------------
Message: 15
Date: Thu, 4 Mar 2010 02:52:30 +0100
From: Attilio Rao <att...@freebsd.org>
Subject: Re: How is supposed to be protected the units list?
To: Matthew Jacob <m...@feral.com>
Cc: freebs...@freebsd.org
Message-ID:
<3bbf2fe11003031752l5f9...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
2010/3/4 Matthew Jacob <m...@feral.com>:
> On 3/3/2010 1:57 PM, Attilio Rao wrote:
>
> So sad, the core file produced is not usable. Que lastima.
BTW, I'm having hard times thinking this patch introduces the problem,
I think it is more likely it just changes the times of some events
happening or, more likely, the problem should also happen without the
patch.
I wanted to check, however, which list is affected by this.
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein
------------------------------
Message: 16
Date: Wed, 03 Mar 2010 17:54:31 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: freebs...@freebsd.org
Message-ID: <4B8F12D7...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
I'd only say that is is possible that the patch *uncovers* an existing
problem.
>> So sad, the core file produced is not usable. Que lastima.
>>
> BTW, I'm having hard times thinking this patch introduces the problem,
> I think it is more likely it just changes the times of some events
> happening or, more likely, the problem should also happen without the
> patch.
> I wanted to check, however, which list is affected by this.
>
> Thanks,
> Attilio
>
>
>
------------------------------
Message: 17
Date: Wed, 03 Mar 2010 18:01:25 -0800
From: Matthew Jacob <m...@feral.com>
Subject: Re: How is supposed to be protected the units list?
To: freebs...@freebsd.org
Message-ID: <4B8F147...@feral.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
On 3/3/2010 5:54 PM, Matthew Jacob wrote:
>
> I'd only say that is is possible that the patch *uncovers* an existing
> problem.
>
It's HEAD. Barring a negative comment from Scott, Alexander or Justin,
I'd say put it in.
------------------------------
Message: 18
Date: Thu, 04 Mar 2010 11:20:17 +0200
From: Alexander Motin <m...@FreeBSD.org>
Subject: Re: How is supposed to be protected the units list?
To: Attilio Rao <att...@freebsd.org>
Cc: freebs...@freebsd.org, "Justin T. Gibbs" <gi...@freebsd.org>,
Matthew Jacob <m...@feral.com>
Message-ID: <4B8F7B51...@FreeBSD.org>
Content-Type: text/plain; charset=UTF-8
Hi.
Attilio Rao wrote:
> 2010/3/1 Attilio Rao <att...@freebsd.org>:
>> I have a question that I've been unable to reply reading the code.
>> Someone could point me to documentation explaining how the unit tailq
>> (within a struct periph_driver) is supposed to be locked?
>> I'm not sure how it is assured consistency of accesses to the list and
>> more important how is ensured that the periphs composing it doesn't go
>> away as I don't see any reference bump for objects inserted there.
I am not sure that existing implementation is complete, but at least in
some places it is protected by xpt_lock_buses(), which is wrapper for
mtx_lock(&xsoftc.xpt_topo_lock). camperiphfree() called under the lock,
same as many (all?) traverses.
> I don't think the lists are protected at all so I made this simple
> patch taking advantage by a global lock:
> http://www.freebsd.org/~attilio/Sandvine/pdrv/pdrv_lock.diff
>
> The patch is simple enough but I just test-compiled it (will need some
> time to run in a debugging kernel, hope to do tonight) and maybe you
> can already give your opinions here.
It is not obvious to me, how your new lock expected to interoperate with
existing xpt_topo_lock. I have feeling that it duplicates one in some
places. If you believe that second lock is needed there, then probably
first one should be partially removed. But then we should be careful to
not create LORs between them, as they are not natively nested.
Changes in cam_xpt.c break splbreaknum meaning of temporary dropping
xpt_topo_lock. I have doubts whether it is required now at all now, but
leaving dead code it definitely not good.
--
Alexander Motin
------------------------------
End of freebsd-scsi Digest, Vol 358, Issue 4
********************************************