0x100 means that HRTIMER_SOFTIRQ is pending when the CPU tries to disable
the tick.
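For reference, a minimal user-space sketch of how the mask maps to a softirq number. The enum order is assumed from include/linux/interrupt.h in kernels of this era, and softirq_mask() and first_pending_softirq() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>

/*
 * Softirq numbering assumed from include/linux/interrupt.h in kernels of
 * this era; verify against the tree you are actually debugging.
 */
enum {
	HI_SOFTIRQ = 0,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	BLOCK_SOFTIRQ,
	BLOCK_IOPOLL_SOFTIRQ,
	TASKLET_SOFTIRQ,
	SCHED_SOFTIRQ,
	HRTIMER_SOFTIRQ,	/* 8, so its mask is 1 << 8 == 0x100 */
	RCU_SOFTIRQ,
	NR_SOFTIRQS
};

/* Hypothetical helper: mask bit for one softirq number. */
static unsigned int softirq_mask(unsigned int nr)
{
	assert(nr < NR_SOFTIRQS);
	return 1u << nr;
}

/* Hypothetical helper: lowest pending softirq number, or -1 if none. */
static int first_pending_softirq(unsigned int pending)
{
	int nr;

	for (nr = 0; nr < NR_SOFTIRQS; nr++)
		if (pending & (1u << nr))
			return nr;
	return -1;
}
```

With that numbering, first_pending_softirq(0x100) yields HRTIMER_SOFTIRQ.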
In perf_event.c we have a call to __hrtimer_start_range_ns() in
perf_swevent_start_hrtimer() where its wakeup parameter is zero.
__hrtimer_start_range_ns() in turn will call hrtimer_enqueue_reprogram()
which will call __raise_softirq_irqoff(HRTIMER_SOFTIRQ) (since wakeup is
zero).
That means that just the HRTIMER_SOFTIRQ bit gets set in the softirq
pending field, but wakeup_softirqd() doesn't get called.
As far as I could see this function gets called from process context with
a spinlock held, and hence we don't have any guarantee that this pending
softirq gets executed before the idle task gets scheduled and tries to
disable the tick.
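To make the difference concrete, here is a small user-space model of the two raise variants. It is only a sketch: struct cpu_model and the model_* functions are made up for illustration, and the behaviour they mimic (set the pending bit only, versus set the bit and wake ksoftirqd when not in interrupt context) is my reading of __raise_softirq_irqoff() and raise_softirq_irqoff().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state, reduced to what matters for this thread. */
struct cpu_model {
	unsigned int pending;	/* softirq pending mask */
	bool in_interrupt;	/* true if irq exit would run softirqs anyway */
	bool ksoftirqd_woken;	/* true once a wakeup has been issued */
};

/* Models __raise_softirq_irqoff(): only sets the pending bit. */
static void model_raise_nowake(struct cpu_model *cpu, unsigned int nr)
{
	assert(nr < 32);
	cpu->pending |= 1u << nr;
}

/* Models raise_softirq_irqoff(): sets the bit, wakes ksoftirqd if needed. */
static void model_raise(struct cpu_model *cpu, unsigned int nr)
{
	model_raise_nowake(cpu, nr);
	if (!cpu->in_interrupt)
		cpu->ksoftirqd_woken = true;
}

/* Hypothetical check: would the idle path find an unhandled softirq? */
static bool model_idle_sees_stale_softirq(const struct cpu_model *cpu)
{
	return cpu->pending != 0 && !cpu->ksoftirqd_woken;
}
```

In this model, calling model_raise_nowake() from process context leaves the 0x100 bit set with nobody scheduled to handle it, which is the situation the idle task runs into.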
The easiest fix would be to set wakeup to one (see patch below), but I guess
there is a reason why it's zero. Anybody?
---
kernel/perf_event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index eac7e33..958b3e0 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4942,7 +4942,7 @@ static void perf_swevent_start_hrtimer(struct perf_event *event)
 		}
 		__hrtimer_start_range_ns(&hwc->hrtimer,
 				ns_to_ktime(period), 0,
-				HRTIMER_MODE_REL_PINNED, 0);
+				HRTIMER_MODE_REL_PINNED, 1);
 	}
 }
--
Ah. With this patch "perf record" might deadlock.
That's the reason why wakeup is zero. Tough luck.
We can start that hrtimer from within the scheduler function while
holding the rq->lock, doing a wakeup from there is not sane.
The best solution would be to fix the hrtimer_start*() interface,
something Thomas and I have wanted to do for ages but because we've
procrastinated is now a much larger job than it was :/
The whole HRTIMER_SOFTIRQ thing should die.. but for that to happen its
only use-case today must first go.
The problem is trying to start a timer with already elapsed time.
Preferably hrtimer_start*() would simply return -ETIME and let the
caller sort it, sadly the current behaviour is to 'fix' it for the
caller by enqueueing the timer onto the softirq list and raising the
softirq.
I guess we could make hrtimer_start*(.wakeup=false) return the -ETIME
thing and audit those few use-cases.
That would be sensible anyway.
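A rough user-space sketch of what such an interface could look like from the caller's side. Nothing here is existing kernel API: struct model_timer, model_hrtimer_start() and model_caller_start() are hypothetical names, and the only thing taken from the discussion is the idea of returning -ETIME for an already elapsed expiry instead of queueing the timer for the softirq.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timer state for the model. */
struct model_timer {
	int64_t expires_ns;	/* absolute expiry time */
	bool enqueued;		/* true if armed for a future expiry */
	int fired;		/* how often the caller handled expiry itself */
};

/*
 * Hypothetical start function: refuses an already elapsed expiry with
 * -ETIME instead of deferring it to a softirq.
 */
static int model_hrtimer_start(struct model_timer *t, int64_t now_ns,
			       int64_t expires_ns)
{
	assert(t);
	if (expires_ns <= now_ns)
		return -ETIME;

	t->expires_ns = expires_ns;
	t->enqueued = true;
	return 0;
}

/* Hypothetical caller: sorts out the elapsed case on its own. */
static void model_caller_start(struct model_timer *t, int64_t now_ns,
			       int64_t expires_ns)
{
	if (model_hrtimer_start(t, now_ns, expires_ns) == -ETIME)
		t->fired++;	/* run the expiry work directly, no softirq */
}
```

The point of the sketch is that the elapsed case never touches the softirq pending mask, so each caller would have to be audited for what it does on -ETIME.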