Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

SMP/preempt bug in matt-nb5-mips64

4 views
Skip to first unread message

Manuel Bouyer

unread,
Sep 5, 2010, 5:35:41 PM9/5/10
to
Hello,
I found a problem in matt-nb5-mips64's mips/mips/spl.S, regarding
curcpu() and preemption. In both _splraise and _splsw_splhigh,
curcpu() is loaded from L_CPU(MIPS_CURLWP) in a register early,
especially before disabling interrupt. If the current IPL is 0,
the current thread can be preempted and rescheduled on another CPU,
and the new SPL is written back to the wrong cpu_info.
From there, bad things happens (what I've seen is an infinite loop
from the interrupt handler on the victim CPU, because _splsw_splhigh
thinks we're already at splhigh and do nothing, when interrupts are
really enabled).
The attached patch seems to fix it for me: it's enough to reload
curcpu() before writing back the new IPL, as for the above senario to
happen the old IPL of both CPUs has to be 0.

I suspect there's a similar issue with the use of L_CPU(MIPS_CURLWP)
in stub_lock.S, but I've not looked in details, as right now I'm
running with LOCKDEBUG and this code isn't used.

--
Manuel Bouyer <bou...@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--

mipsspl.diff

Manuel Bouyer

unread,
Sep 22, 2010, 8:39:57 AM9/22/10
to
On Sun, Sep 05, 2010 at 11:35:41PM +0200, Manuel Bouyer wrote:
> Hello,
> I found a problem in matt-nb5-mips64's mips/mips/spl.S, regarding
> curcpu() and preemption. In both _splraise and _splsw_splhigh,
> curcpu() is loaded from L_CPU(MIPS_CURLWP) in a register early,
> especially before disabling interrupt. If the current IPL is 0,
> the current thread can be preempted and rescheduled on another CPU,
> and the new SPL is written back to the wrong cpu_info.
> >From there, bad things happens (what I've seen is an infinite loop
> from the interrupt handler on the victim CPU, because _splsw_splhigh
> thinks we're already at splhigh and do nothing, when interrupts are
> really enabled).
> The attached patch seems to fix it for me: it's enough to reload
> curcpu() before writing back the new IPL, as for the above senario to
> happen the old IPL of both CPUs has to be 0.

Well, unfortunably this is not enough.
If we get preempted just after loading L_CPU(MIPS_CURLWP) and before
checking the spl, we may be moved to another CPU and run at a later
time. When we get back running we check the spl of the previous CPU
which may not be 0 any more.
I couldn't find a better way than disabling the interrupts before
checking the spl (well, we could disable preemption but I suspect
it cost more than just disabling interrupts). The attached patch now seems to
DTRT for me.

mipsspl.diff
0 new messages