clocksource mutex deadlock, cat current

Andreas Mohr

unread,

Feb 6, 2010, 9:30:02 AM2/6/10

to

Hello all,

I did

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource

[oh yes indeed, that's the full output]

This is on Dell Inspiron 8000 P3/700 UP (after some suspend/resume cycles).

Haven't checked yet whether this occurs on fresh boot, too.

clocksource / event device setup is (from memory) PIT (TSC disabled due
to C2 stall) and acpi_pm (lapic is _enabled_, and CPU support only,
not BIOS).

Thanks,

Andreas Mohr

echo t > /proc/sysrq-trigger shows (full output on request):

cat D 00000000 0 19361 2959 0x00000000
de215ef8 00000046 00000000 00000000 c13b0294 dd45c380 dd45c5fc df9c6ef0
de215f28 de215f08 c13b0260 00000202 de215f28 c124fcae dd45c380 c13b0280
c13b0280 c13b0280 dd45c380 de215f08 ffffffff d9de1000 c13b0338 df9c6ef0
Call Trace:
[<c124fcae>] mutex_lock_nested+0x157/0x23d
[<c104873e>] sysfs_show_current_clocksources+0x17/0x41
[<c1048727>] ? sysfs_show_current_clocksources+0x0/0x41
[<c1196a75>] sysdev_show+0x1d/0x21
[<c10dea91>] sysfs_read_file+0x9b/0x135
[<c10de9f6>] ? sysfs_read_file+0x0/0x135
[<c109fc1b>] vfs_read+0x8c/0x13d
[<c109fd6f>] sys_read+0x40/0x65
[<c1002890>] sysenter_do_call+0x12/0x36
Sched Debug Version: v0.09, 2.6.33-rc6 #3
now at 43142470.181620 msecs
.jiffies : 4283441
.sysctl_sched_latency : 5.000000
.sysctl_sched_min_granularity : 1.000000
.sysctl_sched_wakeup_granularity : 1.000000
.sysctl_sched_child_runs_first : 0.000000
.sysctl_sched_features : 7917179
.sysctl_sched_tunable_scaling : 1 (logaritmic)

cpu#0, 698.566 MHz
.nr_running : 2
.load : 2048
.nr_switches : 17910608
.nr_load_updates : 4313441
.nr_uninterruptible : 1
.next_balance : 0.000000
.curr->pid : 9119
.clock : 43142407.895281
.cpu_load[0] : 1024
.cpu_load[1] : 1033
.cpu_load[2] : 706
.cpu_load[3] : 465
.cpu_load[4] : 406
.yld_count : 8871
.sched_switch : 0
.sched_count : 18023143
.sched_goidle : 4265782
.ttwu_count : 0
.ttwu_local : 0
.bkl_count : 0

cfs_rq[0]:
.exec_clock : 11345719.395963
.MIN_vruntime : 232465648.024326
.min_vruntime : 232465649.026399
.max_vruntime : 232465648.024326
.spread : 0.000000
.spread0 : 0.000000
.nr_running : 2
.load : 2048
.nr_spread_over : 492453

rt_rq[0]:
.rt_nr_running : 0
.rt_throttled : 0
.rt_time : 0.000000
.rt_runtime : 950.000000

runnable tasks:
task PID tree-key switches prio exec-runtime
sum-exec sum-sleep
----------------------------------------------------------------------------------------------------------
x-www-browser 3345 232465648.024326 1053319 120 232465648.024326
2047447.788077 39329053.727630
R bash 9119 232465646.526399 514 120 232465646.526399
250.811110 33621456.410620

INFO: lockdep is turned off.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Thomas Gleixner

unread,

Feb 6, 2010, 9:50:03 AM2/6/10

to

On Sat, 6 Feb 2010, Andreas Mohr wrote:
> echo t > /proc/sysrq-trigger shows (full output on request):

At least the state of "kwatchdog" would be interesting.

> cat D 00000000 0 19361 2959 0x00000000
> de215ef8 00000046 00000000 00000000 c13b0294 dd45c380 dd45c5fc df9c6ef0
> de215f28 de215f08 c13b0260 00000202 de215f28 c124fcae dd45c380 c13b0280
> c13b0280 c13b0280 dd45c380 de215f08 ffffffff d9de1000 c13b0338 df9c6ef0
> Call Trace:
> [<c124fcae>] mutex_lock_nested+0x157/0x23d
> [<c104873e>] sysfs_show_current_clocksources+0x17/0x41
>

> INFO: lockdep is turned off.

Why is lockdep turned off ?

Thanks,

tglx

Andreas Mohr

unread,

Feb 7, 2010, 2:30:01 PM2/7/10

to

Hi,

On Sat, Feb 06, 2010 at 03:46:50PM +0100, Thomas Gleixner wrote:
> On Sat, 6 Feb 2010, Andreas Mohr wrote:
> > echo t > /proc/sysrq-trigger shows (full output on request):
>
> At least the state of "kwatchdog" would be interesting.

Hrmm, scrolled out of dmesg buffer :(
(will now increase buffer size)

> > cat D 00000000 0 19361 2959 0x00000000
> > de215ef8 00000046 00000000 00000000 c13b0294 dd45c380 dd45c5fc df9c6ef0
> > de215f28 de215f08 c13b0260 00000202 de215f28 c124fcae dd45c380 c13b0280
> > c13b0280 c13b0280 dd45c380 de215f08 ffffffff d9de1000 c13b0338 df9c6ef0
> > Call Trace:
> > [<c124fcae>] mutex_lock_nested+0x157/0x23d
> > [<c104873e>] sysfs_show_current_clocksources+0x17/0x41
> >
> > INFO: lockdep is turned off.
>
> Why is lockdep turned off ?

Good question.

# grep -i lockdep /boot/config-2.6.33-rc6
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKDEP=y
# CONFIG_DEBUG_LOCKDEP is not set

BTW, things get even weirder:

On bootup, I get lockups during

Switching to clocksource acpi_pm

very frequently, but _only_ in case of connected AC adapter.
If I don't connect AC adapter, it always boots.
(I tried about 8 times, with 5 times AC and 3 times without,
and consistent behaviour)

The current boot was with AC connected, and this time it did manage to
boot, but with a slight hang at the
Switching to clocksource acpi_pm
line. And this time NMI watchdog detected things going haywire:

PM: Adding info for No Bus:input1
input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input1
ACPI: Power Button [PBTN]
driver: 'PNP0C0C:00': driver_bound: bound to device 'button'
bus: 'acpi': really_probe: bound device PNP0C0C:00 to driver button
bus: 'acpi': driver_probe_device: matched device PNP0C0E:00 with driver button
bus: 'acpi': really_probe: probing driver button with device PNP0C0E:00
device: 'input2': device_add
PM: Adding info for No Bus:input2
input: Sleep Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input2
ACPI: Sleep Button [SBTN]
driver: 'PNP0C0E:00': driver_bound: bound to device 'button'
bus: 'acpi': really_probe: bound device PNP0C0E:00 to driver button
bus: 'acpi': add driver fan
bus: 'acpi': add driver processor
bus: 'acpi': driver_probe_device: matched device LNXCPU:00 with driver processor
bus: 'acpi': really_probe: probing driver processor with device LNXCPU:00
Marking TSC unstable due to TSC halts in idle
device: 'cooling_device0': device_add
PM: Adding info for No Bus:cooling_device0
Switching to clocksource acpi_pm
BUG: NMI Watchdog detected LOCKUP on CPU0, ip c1046694, registers:
Modules linked in:

Pid: 266, comm: kwatchdog Not tainted 2.6.33-rc6 #3 Inspiron 8000 /Inspiron 8000
EIP: 0060:[<c1046694>] EFLAGS: 00000082 CPU: 0
EIP is at timekeeping_forward_now+0xf0/0x11d
EAX: 342100fe EBX: f3dcdfc8 ECX: 63826cfb EDX: ffffffff
ESI: 003ffd7b EDI: 00000000 EBP: dfa31f50 ESP: dfa31f28
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process kwatchdog (pid: 266, ti=dfa31000 task=dfa4e0a0 task.ti=dfa31000)
Stack:
005b9e7e 00000000 6184c590 00000fff 002cb338 4b6f0da7 00000000 c13d8524
<0> c13d8524 dfa31f48 dfa31f60 c10466d2 c13d8524 c13d8524 dfa31f6c c104725e
<0> c13b02c0 dfa31f84 c1048376 c13285ec c1328650 c13b03a4 c13d8528 dfa31fa4
Call Trace:
[<c10466d2>] ? change_clocksource+0x11/0x3e
[<c104725e>] ? timekeeping_notify+0x20/0x2e
[<c1048376>] ? clocksource_select+0x9a/0xa3
[<c10484ba>] ? clocksource_watchdog_kthread+0x13b/0x163
[<c104837f>] ? clocksource_watchdog_kthread+0x0/0x163
[<c103fa6e>] ? kthread+0x65/0x6a
[<c103fa09>] ? kthread+0x0/0x6a
[<c1002dba>] ? kernel_thread_helper+0x6/0x10
Code: 65 e8 8d 14 17 89 c6 89 d7 31 c0 0f ad d6 d3 ef f6 c1 20 0f 45 f7 0f 45 f8 a1 10 21 48 c1 89 f1 89 fb 99 01 c1 11 d3 31 c0 eb 0a <81> c1 00 36 65 c4 83 d3 ff 40 83 fb 00 77 f1 81 f9 ff c9 9a 3b
---[ end trace a7919e7f17c0a725 ]---
driver: 'LNXCPU:00': driver_bound: bound to device 'processor'
bus: 'acpi': really_probe: bound device LNXCPU:00 to driver processor
bus: 'acpi': add driver container
bus: 'acpi': add driver thermal

Kernel options I used are:

Kernel command line: BOOT_IMAGE=L2633rc6 root=802 ramdisk=0 video=vesa:mtrr ywrap devfs=nomount apm=off root=/dev/sda2 lapic apic=debug nmi_watchdog=2 usbcore.autosuspend=1

Since clocksource_mutex code invocation seems pretty symmetric, I cannot imagine any problem here,
except for mutex implementation issues, recursive clocksource code calls and NMI / interrupt issues.

Could it be that some system code (ACPI, SMI, ...?) calls clocksource stuff?
_Or_ there's some corruption going on, in the area near the clocksource_mutex storage.

I _think_ older kernels had some slight delays during clocksource switch, too, but I cannot remember any
outright hangs during bootup.

Hrmm, OTOH I changed quite some options after the last build:

$ diff -uN config-2.6.33-rc4 config-2.6.33-rc6|grep ^+
+++ config-2.6.33-rc6 2010-01-31 20:06:59.000000000 +0100
+# Linux kernel version: 2.6.33-rc6
+# Sun Jan 31 20:05:19 2010
+# CONFIG_COMPAT_BRK is not set
+# CONFIG_COMPAT_VDSO is not set
+CONFIG_PARPORT=m
+CONFIG_PARPORT_PC=m
+CONFIG_TRACE_IRQFLAGS=y
+CONFIG_LATENCYTOP=y
+CONFIG_HAVE_FTRACE_NMI_ENTER=y
+CONFIG_TRACER_MAX_TRACE=y
+CONFIG_FTRACE_NMI_ENTER=y
+CONFIG_FUNCTION_TRACER=y
+CONFIG_IRQSOFF_TRACER=y
+CONFIG_SCHED_TRACER=y
+CONFIG_FTRACE_SYSCALLS=y
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_DYNAMIC_FTRACE=y
+# CONFIG_FUNCTION_PROFILER is not set
+CONFIG_FTRACE_MCOUNT_RECORD=y
+CONFIG_4KSTACKS=y

Umm, CONFIG_FTRACE_NMI_ENTER, anyone?
That sounds like the most invasive candidate at least.

Anyway, behaviour does seem broken given that clocksource_mutex use is fully symmetric
and shouldn't cause issues on its own.
If I had a guess, then I'd say some of these options above do some mutex-related things
and corrupt their surroundings. But who knows...

BTW, it's not an lapic thing: I had hangs with and without enabling it.

I'm afraid I'll have to back out one half of the options above and retry...

Any ideas?

Thanks,

Andreas Mohr

unread,

Feb 8, 2010, 12:40:02 AM2/8/10

to

Hi,

On Sun, Feb 07, 2010 at 08:19:49PM +0100, Andreas Mohr wrote:
> Umm, CONFIG_FTRACE_NMI_ENTER, anyone?
> That sounds like the most invasive candidate at least.

Nope, that wasn't it.
(I removed both CONFIG_DYNAMIC_FTRACE - which implicitly removes
CONFIG_FTRACE_NMI_ENTER - and CONFIG_FTRACE_SYSCALLS)

Next theory:

On this -rc7 upgrade, I had another NMI watchdog trigger on bootup:

BUG: NMI Watchdog detected LOCKUP on CPU0, ip c1045170, registers:
Modules linked in:

Pid: 266, comm: kwatchdog Not tainted 2.6.33-rc7 #1 Inspiron 8000
/Inspiron 8000
EIP: 0060:[<c1045170>] EFLAGS: 00000082 CPU: 0
EIP is at timekeeping_forward_now+0x116/0x139
EAX: 00000000 EBX: efd7f032 ECX: fb5d3b74 EDX: 45643ff3
ESI: 8e7480ca EDI: ffffffff EBP: df8cdf3c ESP: df8cdf18

DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068

Process kwatchdog (pid: 266, ti=df8cd000 task=df9bc1c0 task.ti=df8cd000)
Stack:
005ba19f 00000000 00000000 0000231e 2ab321bc 0000231e c13b6010 c13b6010
<0> c13b6014 df8cdf4c c10451a4 c1392128 c13b6010 df8cdf58 c1045d16
c13b6010
<0> df8cdf70 c1046dc6 c130ee48 c130eeac c139212c c139212c df8cdf84
c1046e36
Call Trace:
[<c10451a4>] ? change_clocksource+0x11/0x3e
[<c1045d16>] ? timekeeping_notify+0x24/0x31
[<c1046dc6>] ? clocksource_select+0x9e/0xa7
[<c1046e36>] ? __clocksource_change_rating+0x67/0x6c
[<c1046f1c>] ? clocksource_watchdog_kthread+0xe1/0x104
[<c1046e3b>] ? clocksource_watchdog_kthread+0x0/0x104
[<c103e50c>] ? kthread+0x63/0x68
[<c103e4a9>] ? kthread+0x0/0x68
[<c1002cba>] ? kernel_thread_helper+0x6/0x10
Code: ea f6 c1 20 0f 45 c2 0f 45 d7 89 c1 89 f7 8b 45 e4 89 d3 c1 ff 1f
01 f1 11 fb 31 d2 eb 0a 81 c1 00 36 65 c4 83 d
3 ff 42 83 fb 00 <77> f1 81 f9 ff c9 9a 3b 77 e9 89 45 e4 8d 04 02 a3 0c
21 46 c1

---[ end trace a7919e7f17c0a725 ]---

And then a cat current_clocksource managed to hang again.
(NOTE that the - now complete! - SysRq-T list does NOT show any backtraces
of kwatchdog any more, only many other processes)
Could it be that the (rather disruptive) NMI watchdog confuses the current state at
change_clocksource and causes that stuff to get left with
clocksource_mutex remaining taken?

Then my userspace cat current_clocksource hits the leftover mutex and
has nowhere to go...

And, could it perhaps be that the NMI watchdog gets confused by
simple timekeeping inconsistencies _during_ clocksource change?
(in that case we'd simply need to make sure NMI watchdog remains
satisfied with current conditions during clocksource rochade)

And the lockdep thingy is suboptimal: a developer colleague told me that
"INFO: lockdep is turned off." simply gets done after the first backtrace
in order to avoid subsequent spews.
But at that point in time some people (such as him) _would_ have liked to trace
some further issues.

So at least it should be enhanced to "INFO: lockdep is turned off (or defused!)."
or some such to clearly indicate that something is rotten,
and one should think of perhaps increasing one-time use to up to 3 times or so.

Thomas Gleixner

unread,

Feb 8, 2010, 4:20:03 AM2/8/10

to

On Mon, 8 Feb 2010, Andreas Mohr wrote:
>
> And then a cat current_clocksource managed to hang again.

Well, that's not surprising at all. If one task is stuck on clocksource_mutex,
then the next one will be stuck as well.

> (NOTE that the - now complete! - SysRq-T list does NOT show any backtraces
> of kwatchdog any more, only many other processes)
> Could it be that the (rather disruptive) NMI watchdog confuses the current state at
> change_clocksource and causes that stuff to get left with
> clocksource_mutex remaining taken?

Nope, the NMI watchdog is not involved. It merily tells us that the
task is stuck.

tglx

Andreas Mohr

unread,

Feb 8, 2010, 4:40:02 AM2/8/10

to

On Mon, Feb 08, 2010 at 10:13:00AM +0100, Thomas Gleixner wrote:
> On Mon, 8 Feb 2010, Andreas Mohr wrote:
> >
> > And then a cat current_clocksource managed to hang again.
>
> Well, that's not surprising at all. If one task is stuck on clocksource_mutex,
> then the next one will be stuck as well.

I believe here you are pointing at the initial bootup acpi_pm lockup which NMI
watchdog detected. And not some thought that I somehow simply executed
cat current_clocksource twice, given my wording which might erroneously
hint at that.

So you'd think that we have a clocksource_mutex problem even before
the initial bootup switch to acpi_pm?

However I don't see how this could be the case, given that in some instances
boot does continue after acpi_pm selection, albeit after a delay.

> > (NOTE that the - now complete! - SysRq-T list does NOT show any backtraces
> > of kwatchdog any more, only many other processes)
> > Could it be that the (rather disruptive) NMI watchdog confuses the current state at
> > change_clocksource and causes that stuff to get left with
> > clocksource_mutex remaining taken?
>
> Nope, the NMI watchdog is not involved. It merily tells us that the
> task is stuck.

OK.
And after that message debug_locks is zeroed and kwatchdog is gone
from the process list (probably during debug_locks change).

I still can't make much reason of this behaviour.
If we have a problem during acpi_pm selection on boot, then by all
accounts it should get stuck completely (plus yielding watchdog's lockup
message), not continue booting after some weird delay.
OK, this particular delay phenomenon could still be explained
by a pretty severe contention,
but then after successfully having gotten the mutex it certainly
shouldn't happen that then even after bootup the mutex remains blocked
(as witnessed by the cat current_clocksource issue).
After all all mutex use is fully symmetric, no surprises IMHO - unless
NMI / SMI or so are involved.

I'll explain what I think might be happening:
bootup switches to acpi_pm, timekeeping gets borked, NMI watchdog complains
due to timekeeping issues, brutally yanks the waiting acpi_pm switchover
(thereby NOT releasing clocksource_mutex),
and the result is that I have cat current_clocksource stuck in my userspace.

Andreas Mohr

Thomas Gleixner

unread,

Feb 8, 2010, 5:10:02 AM2/8/10

to

On Mon, 8 Feb 2010, Andreas Mohr wrote:
> > Nope, the NMI watchdog is not involved. It merily tells us that the
> > task is stuck.
>
> OK.
> And after that message debug_locks is zeroed and kwatchdog is gone
> from the process list (probably during debug_locks change).

Oh, no. kwatchdog is a run once thread. It always exits after work is
done, but I'm pretty confused about the NMI watchdog output.

EIP: 0060:[<c1045170>] EFLAGS: 00000082 CPU: 0
EIP is at timekeeping_forward_now+0x116/0x139

I don't see what might make the machine stuck here. Can you decode the
source line with "addr2line -e vmlinux c1045170" please ?

> I'll explain what I think might be happening:
> bootup switches to acpi_pm, timekeeping gets borked, NMI watchdog complains
> due to timekeeping issues, brutally yanks the waiting acpi_pm switchover
> (thereby NOT releasing clocksource_mutex),

No, the NMI watchdog does not yank anything. It just reports.

Thanks,

tglx

Andreas Mohr

unread,

Feb 8, 2010, 2:50:03 PM2/8/10

to

Hi,

On Mon, Feb 08, 2010 at 11:06:58AM +0100, Thomas Gleixner wrote:
> EIP: 0060:[<c1045170>] EFLAGS: 00000082 CPU: 0
> EIP is at timekeeping_forward_now+0x116/0x139
>
> I don't see what might make the machine stuck here. Can you decode the
> source line with "addr2line -e vmlinux c1045170" please ?

And the winner is:
/usr/src/linux-2.6.33-rc7/include/linux/math64.h:91

static __always_inline u32
__iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
{
u32 ret = 0;

while (dividend >= divisor) {
/* The following asm() prevents the compiler from
optimising this loop into a modulo operation. */
asm("" : "+rm"(dividend));

dividend -= divisor;
ret++;
}

*remainder = dividend;

return ret;
}

while ......

Do I see a divisor == 0 here?? ;)

Andreas Mohr

Thomas Gleixner

unread,

Feb 8, 2010, 3:30:02 PM2/8/10

to

On Mon, 8 Feb 2010, Andreas Mohr wrote:

> Hi,
>
> On Mon, Feb 08, 2010 at 11:06:58AM +0100, Thomas Gleixner wrote:
> > EIP: 0060:[<c1045170>] EFLAGS: 00000082 CPU: 0
> > EIP is at timekeeping_forward_now+0x116/0x139
> >
> > I don't see what might make the machine stuck here. Can you decode the
> > source line with "addr2line -e vmlinux c1045170" please ?
>
> And the winner is:
> /usr/src/linux-2.6.33-rc7/include/linux/math64.h:91
>
> static __always_inline u32
> __iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
> {
> u32 ret = 0;
>
> while (dividend >= divisor) {
> /* The following asm() prevents the compiler from
> optimising this loop into a modulo operation. */
> asm("" : "+rm"(dividend));
>
> dividend -= divisor;
> ret++;
> }
>
> *remainder = dividend;
>
> return ret;
> }
>
>
> while ......
>
> Do I see a divisor == 0 here?? ;)

The only function which is calling __iter_div_u64_rem() from
timekeeping_forward_now() is timespec_add_ns() which calls it with a
constant divisor:

static __always_inline void timespec_add_ns(struct timespec *a, u64 ns)
{
a->tv_sec += __iter_div_u64_rem(a->tv_nsec + ns, NSEC_PER_SEC, &ns);
a->tv_nsec = ns;
}

There goes the theory :)

Which compiler version are you using ?

Can you please provide the disassembly of kernel/time/timekeeping.o ?

Thanks,

tglx

Andreas Mohr

unread,

Feb 8, 2010, 3:40:02 PM2/8/10

to

Since I'm having trouble pinpointing the reason for the issue
(after all divisor is not 0 but likely NSEC_PER_SEC,
as likely called by time.h timespec_add_ns() --> __iter_div_u64_rem()),
here comes timekeeping_forward_now in its entire glory (objdump -D vmlinux):

c104505a <timekeeping_forward_now>:
c104505a: 55 push %ebp
c104505b: 89 e5 mov %esp,%ebp
c104505d: 57 push %edi
c104505e: 56 push %esi
c104505f: 53 push %ebx
c1045060: 83 ec 18 sub $0x18,%esp
c1045063: e8 5c dc fb ff call c1002cc4 <mcount>
c1045068: 8b 1d d8 20 46 c1 mov 0xc14620d8,%ebx
c104506e: 89 d8 mov %ebx,%eax
c1045070: ff 53 10 call *0x10(%ebx)
c1045073: 8b 4b 1c mov 0x1c(%ebx),%ecx
c1045076: 89 c6 mov %eax,%esi
c1045078: 89 d7 mov %edx,%edi
c104507a: 2b 73 40 sub 0x40(%ebx),%esi
c104507d: 1b 7b 44 sbb 0x44(%ebx),%edi
c1045080: 21 f1 and %esi,%ecx
c1045082: 89 4d ec mov %ecx,-0x14(%ebp)
c1045085: 8b 73 20 mov 0x20(%ebx),%esi
c1045088: 89 43 40 mov %eax,0x40(%ebx)
c104508b: 89 53 44 mov %edx,0x44(%ebx)
c104508e: a1 08 21 46 c1 mov 0xc1462108,%eax
c1045093: 21 fe and %edi,%esi
c1045095: 8b 3d c0 20 46 c1 mov 0xc14620c0,%edi
c104509b: 89 45 e4 mov %eax,-0x1c(%ebp)
c104509e: 8b 4d e4 mov -0x1c(%ebp),%ecx
c10450a1: 8b 45 ec mov -0x14(%ebp),%eax
c10450a4: f7 65 e4 mull -0x1c(%ebp)
c10450a7: 0f af ce imul %esi,%ecx
c10450aa: 89 45 e4 mov %eax,-0x1c(%ebp)
c10450ad: 8b 45 e4 mov -0x1c(%ebp),%eax
c10450b0: 01 ca add %ecx,%edx
c10450b2: 8b 0d dc 20 46 c1 mov 0xc14620dc,%ecx
c10450b8: 89 55 e8 mov %edx,-0x18(%ebp)
c10450bb: 8b 55 e8 mov -0x18(%ebp),%edx
c10450be: 0f ad d0 shrd %cl,%edx,%eax
c10450c1: d3 ea shr %cl,%edx
c10450c3: f6 c1 20 test $0x20,%cl
c10450c6: 74 04 je c10450cc <timekeeping_forward_now+0x72>
c10450c8: 89 d0 mov %edx,%eax
c10450ca: 31 d2 xor %edx,%edx
c10450cc: 89 45 dc mov %eax,-0x24(%ebp)
c10450cf: a1 c4 20 46 c1 mov 0xc14620c4,%eax
c10450d4: 89 55 e0 mov %edx,-0x20(%ebp)
c10450d7: 99 cltd
c10450d8: 01 45 dc add %eax,-0x24(%ebp)
c10450db: 11 55 e0 adc %edx,-0x20(%ebp)
c10450de: 31 c9 xor %ecx,%ecx
c10450e0: 8b 45 dc mov -0x24(%ebp),%eax
c10450e3: 8b 55 e0 mov -0x20(%ebp),%edx
c10450e6: eb 09 jmp c10450f1 <timekeeping_forward_now+0x97>
c10450e8: 05 00 36 65 c4 add $0xc4653600,%eax
c10450ed: 83 d2 ff adc $0xffffffff,%edx
c10450f0: 41 inc %ecx
c10450f1: 83 fa 00 cmp $0x0,%edx
c10450f4: 77 f2 ja c10450e8 <timekeeping_forward_now+0x8e>
c10450f6: 3d ff c9 9a 3b cmp $0x3b9ac9ff,%eax
c10450fb: 77 eb ja c10450e8 <timekeeping_forward_now+0x8e>
c10450fd: 8b 15 0c 21 46 c1 mov 0xc146210c,%edx
c1045103: 8d 3c 39 lea (%ecx,%edi,1),%edi
c1045106: 89 3d c0 20 46 c1 mov %edi,0xc14620c0
c104510c: 31 ff xor %edi,%edi
c104510e: a3 c4 20 46 c1 mov %eax,0xc14620c4
c1045113: 89 55 e4 mov %edx,-0x1c(%ebp)
c1045116: 8b 43 24 mov 0x24(%ebx),%eax
c1045119: 31 d2 xor %edx,%edx
c104511b: 89 55 e0 mov %edx,-0x20(%ebp)
c104511e: 0f af f0 imul %eax,%esi
c1045121: 89 45 dc mov %eax,-0x24(%ebp)
c1045124: 8b 45 ec mov -0x14(%ebp),%eax
c1045127: f7 65 dc mull -0x24(%ebp)
c104512a: 01 f2 add %esi,%edx
c104512c: 8b 35 10 21 46 c1 mov 0xc1462110,%esi
c1045132: 89 45 ec mov %eax,-0x14(%ebp)
c1045135: 8b 45 ec mov -0x14(%ebp),%eax
c1045138: 89 55 f0 mov %edx,-0x10(%ebp)
c104513b: 8b 4b 28 mov 0x28(%ebx),%ecx
c104513e: 8b 55 f0 mov -0x10(%ebp),%edx
c1045141: 0f ad d0 shrd %cl,%edx,%eax
c1045144: d3 ea shr %cl,%edx
c1045146: f6 c1 20 test $0x20,%cl
c1045149: 0f 45 c2 cmovne %edx,%eax
c104514c: 0f 45 d7 cmovne %edi,%edx
c104514f: 89 c1 mov %eax,%ecx
c1045151: 89 f7 mov %esi,%edi
c1045153: 8b 45 e4 mov -0x1c(%ebp),%eax
c1045156: 89 d3 mov %edx,%ebx
c1045158: c1 ff 1f sar $0x1f,%edi
c104515b: 01 f1 add %esi,%ecx
c104515d: 11 fb adc %edi,%ebx
c104515f: 31 d2 xor %edx,%edx
c1045161: eb 0a jmp c104516d <timekeeping_forward_now+0x113>
c1045163: 81 c1 00 36 65 c4 add $0xc4653600,%ecx
c1045169: 83 d3 ff adc $0xffffffff,%ebx
c104516c: 42 inc %edx
c104516d: 83 fb 00 cmp $0x0,%ebx
c1045170: 77 f1 ja c1045163 <timekeeping_forward_now+0x109>
c1045172: 81 f9 ff c9 9a 3b cmp $0x3b9ac9ff,%ecx
c1045178: 77 e9 ja c1045163 <timekeeping_forward_now+0x109>
c104517a: 89 45 e4 mov %eax,-0x1c(%ebp)
c104517d: 8d 04 02 lea (%edx,%eax,1),%eax
c1045180: a3 0c 21 46 c1 mov %eax,0xc146210c
c1045185: 89 0d 10 21 46 c1 mov %ecx,0xc1462110
c104518b: 83 c4 18 add $0x18,%esp
c104518e: 5b pop %ebx
c104518f: 5e pop %esi
c1045190: 5f pop %edi
c1045191: 5d pop %ebp
c1045192: c3 ret

Note that I got a gcc 4.4 (4.4.2) upgrade, but the problems also occurred
at least one build earlier (i.e. with 4.3.4 as standard compiler),
as also confirmed by the dpkg installation log and the kernel package timestamp.

Plus I'm still wondering how - if it got stuck during boot with an
excessively tough division and NMI watchdog triggered -
then later in userspace clocksource_mutex was still stuck.

And I need to point at nmi_watchdog=2 again, with the CPU being lapic
capable but the BIOS not so much, AFAIR.

Thanks for the very responsive help!

Thomas Gleixner

unread,

Feb 8, 2010, 3:50:01 PM2/8/10

to

On Mon, 8 Feb 2010, Thomas Gleixner wrote:
> On Mon, 8 Feb 2010, Andreas Mohr wrote:
> Which compiler version are you using ?
>
> Can you please provide the disassembly of kernel/time/timekeeping.o ?

Is that NMI watchdog hit fully reproducible ?

Andreas Mohr

unread,

Feb 8, 2010, 3:50:01 PM2/8/10

to

On Mon, Feb 08, 2010 at 09:28:39PM +0100, Thomas Gleixner wrote:
> There goes the theory :)
>
> Which compiler version are you using ?
>
> Can you please provide the disassembly of kernel/time/timekeeping.o ?

Heh, lots and lots of ""session crosstalk"" ;)

http://lisas.de/~andi/kernel/timekeeping_objdump.txt.gz

Andreas Mohr

unread,

Feb 8, 2010, 4:00:03 PM2/8/10

to

Hi,

On Mon, Feb 08, 2010 at 09:40:24PM +0100, Thomas Gleixner wrote:
> On Mon, 8 Feb 2010, Thomas Gleixner wrote:
> > On Mon, 8 Feb 2010, Andreas Mohr wrote:
> > Which compiler version are you using ?
> >
> > Can you please provide the disassembly of kernel/time/timekeeping.o ?
>
> Is that NMI watchdog hit fully reproducible ?

Looks like it:
- another bootup also had lockup message
- all /var/log/dmesg* have lockup message, oldest is:
2010-02-07 20:00 dmesg.4.gz

Linux version 2.6.33-rc6 (root@note) (gcc version 4.3.4 (Debian 4.3.4-6)) #3 Sun Jan 31 23:47:51 CET 2010
.
.
.

PM: Adding info for No Bus:cooling_device0
Switching to clocksource acpi_pm

BUG: NMI Watchdog detected LOCKUP on CPU0, ip c1046694, registers:
Modules linked in:

Pid: 266, comm: kwatchdog Not tainted 2.6.33-rc6 #3 Inspiron 8000
/Inspiron 8000

EIP: 0060:[<c1046694>] EFLAGS: 00000082 CPU: 0
EIP is at timekeeping_forward_now+0xf0/0x11d
EAX: 342100fe EBX: f3dcdfc8 ECX: 63826cfb EDX: ffffffff
ESI: 003ffd7b EDI: 00000000 EBP: dfa31f50 ESP: dfa31f28

DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068

Process kwatchdog (pid: 266, ti=dfa31000 task=dfa4e0a0 task.ti=dfa31000)
Stack:
005b9e7e 00000000 6184c590 00000fff 002cb338 4b6f0da7 00000000 c13d8524
<0> c13d8524 dfa31f48 dfa31f60 c10466d2 c13d8524 c13d8524 dfa31f6c
c104725e

IOW, it's NOT gcc 4.4 and it's pretty darn reproducible.

Andreas Mohr

unread,

Feb 8, 2010, 4:10:03 PM2/8/10

to

On Mon, Feb 08, 2010 at 09:51:05PM +0100, Andreas Mohr wrote:
> Looks like it:
> - another bootup also had lockup message
> - all /var/log/dmesg* have lockup message, oldest is:
> 2010-02-07 20:00 dmesg.4.gz
>
> Linux version 2.6.33-rc6 (root@note) (gcc version 4.3.4 (Debian 4.3.4-6)) #3 Sun Jan 31 23:47:51 CET 2010

-rc4 and 2.6.32.3 don't show lockup message, instant bootup without any
visible delay.

Don't tell me I'm now supposed to try -rc5 and also rebuild -rc6 ;)
If so, do tell early so that I have lots of time to get it built...

Thomas Gleixner

unread,

Feb 8, 2010, 4:20:01 PM2/8/10

to

On Mon, 8 Feb 2010, Andreas Mohr wrote:

> On Mon, Feb 08, 2010 at 09:51:05PM +0100, Andreas Mohr wrote:
> > Looks like it:
> > - another bootup also had lockup message
> > - all /var/log/dmesg* have lockup message, oldest is:
> > 2010-02-07 20:00 dmesg.4.gz
> >
> > Linux version 2.6.33-rc6 (root@note) (gcc version 4.3.4 (Debian 4.3.4-6)) #3 Sun Jan 31 23:47:51 CET 2010
>
> -rc4 and 2.6.32.3 don't show lockup message, instant bootup without any
> visible delay.
>
> Don't tell me I'm now supposed to try -rc5 and also rebuild -rc6 ;)
> If so, do tell early so that I have lots of time to get it built...

Well, we better know, after which point that problem manifested
itself. A bisect would be optimal.

Thanks,

tglx

Thomas Gleixner

unread,

Feb 9, 2010, 5:10:01 AM2/9/10

to

On Mon, 8 Feb 2010, Andreas Mohr wrote:

> On Mon, Feb 08, 2010 at 09:51:05PM +0100, Andreas Mohr wrote:
> > Looks like it:
> > - another bootup also had lockup message
> > - all /var/log/dmesg* have lockup message, oldest is:
> > 2010-02-07 20:00 dmesg.4.gz
> >
> > Linux version 2.6.33-rc6 (root@note) (gcc version 4.3.4 (Debian 4.3.4-6)) #3 Sun Jan 31 23:47:51 CET 2010
>
> -rc4 and 2.6.32.3 don't show lockup message, instant bootup without any
> visible delay.

Just verified, that there is no timekeeping related patch after
2.6.33-rc4 except the kgdb lockup prevention patch, which is
not touching any code which is relevant at boot time.

So bisecting between rc4 and now is probably the best thing to do.

Thanks,

tglx

Andreas Mohr

unread,

Feb 9, 2010, 5:20:01 PM2/9/10

to

On Tue, Feb 09, 2010 at 11:02:02AM +0100, Thomas Gleixner wrote:
> Just verified, that there is no timekeeping related patch after
> 2.6.33-rc4 except the kgdb lockup prevention patch, which is
> not touching any code which is relevant at boot time.
>
> So bisecting between rc4 and now is probably the best thing to do.

Despairing is the best thing to do methinks.

-rc5 gcc 4.4 new .config: acpi_pm lockup
-rc4 gcc 4.4 new .config: reboot (directly after unpacking kernel)
-rc4 gcc 4.4 old (trusted) .config: reboot!! (dito)
-rc4 gcc 4.3 old (trusted) .config: reboot!!! (dito)

I'm currently zeroing in on the libgcc1 upgrade that I did on
2010-02-04 (which also made gcc-4.4 the default system compiler).
I don't have any friggin' idea what else could have caused this.

gcc 4.3 has been configured using CC=gcc-4.3 HOSTCC=gcc-4.3
, but as said, this also failed.

Andreas Mohr

unread,

Feb 14, 2010, 1:30:02 PM2/14/10

to

Hi,

On Tue, Feb 09, 2010 at 11:02:02AM +0100, Thomas Gleixner wrote:

> On Mon, 8 Feb 2010, Andreas Mohr wrote:
>
> > On Mon, Feb 08, 2010 at 09:51:05PM +0100, Andreas Mohr wrote:
> > > Looks like it:
> > > - another bootup also had lockup message
> > > - all /var/log/dmesg* have lockup message, oldest is:
> > > 2010-02-07 20:00 dmesg.4.gz
> > >
> > > Linux version 2.6.33-rc6 (root@note) (gcc version 4.3.4 (Debian 4.3.4-6)) #3 Sun Jan 31 23:47:51 CET 2010
> >
> > -rc4 and 2.6.32.3 don't show lockup message, instant bootup without any
> > visible delay.
>
> Just verified, that there is no timekeeping related patch after
> 2.6.33-rc4 except the kgdb lockup prevention patch, which is
> not touching any code which is relevant at boot time.
>
> So bisecting between rc4 and now is probably the best thing to do.

Frankly I'm now through so many different attempts
(cold-boot test, 4.3 compile, switch both lines to -O0 in Makefile,
use unmodified .config, restart with cleanly extracted tree,
revert several low-level compiler libraries, check for other
relevant package upgrades - it's not a LILO upgrade or
parameter problem either -, ...) without any improvement
that I'm about to give up (remember: 1+ hours per rebuild).
I simply don't have any idea what else could have broken it.
The only thing remaining is either some low-level compiler support library
that I didn't downgrade or actual hardware damage.

The one thing that might be the cause for my reboot-on-boot issues
is http://marc.info/?l=linux-kernel&m=126611098225127&w=2
(blank boot on Dell machines due to video mode issues - maybe this
increases chances of a reboot on other Dells such as mine).
Will do some verification via vga=ask or so.

At the moment I'm very tempted to simple deafly ignore the issue
until some or another enlightenment comes my way or future upgrades
of gcc/kernel happen to fix it.

Note that after reinstalling old custom kernel packages it does boot properly
(all recent build tests done using -rc4, and with old package reinstalled
-rc4 then did boot).

The lockup warning by NMI watchdog does seem to be more persistent
than my recent reboot issue, however,
but I might have easily missed that in earlier kernel packages.
----> we're dearly missing a suitable user-space kernel warning
notification layer that goes all the way to the desktop.
There's a netlink userspace thingy or so but it's just not
well-established.

Andreas Mohr

unread,

Feb 15, 2010, 1:30:02 AM2/15/10

to

On Sun, Feb 14, 2010 at 07:27:58PM +0100, Andreas Mohr wrote:
> Note that after reinstalling old custom kernel packages it does boot properly
> (all recent build tests done using -rc4, and with old package reinstalled
> -rc4 then did boot).

I don't know what happened, but -rc8 does boot on this system,
compiled using 4.4.3-2, _with_ acpi_pm lockup issues gone!
(and cat current_clocksource does work)

I've done (at least...) the following:
- make-kpkg kernel_image kernel_headers
reduced to kernel_image only
(a very likely candidate, since the kernel_headers part had been added
pretty much right before all that awful trouble with acpi_pm and reboots started)
- disabled reportedly overly troublesome lilo.conf large-memory option
(rerunning lilo did NOT manage to fix old -rc4 image boot, though)
- upgraded a couple unrelated packages

I'll try to get it nailed down some more.

Note that even with a rebooting kernel, vga=ask did manage to show
the menu successfully, and _after_ selection it did reboot, IMMEDIATELY.

BTW,
# ls -l /boot/vmlinu*
-rwxr-xr-x 1 root disk 1158634 2006-02-20 07:50 /boot/vmlinux
-rw-r--r-- 1 root disk 1181225 2006-01-18 10:51 /boot/vmlinuz
-rw-r--r-- 1 root root 1214128 2007-02-28 23:05 /boot/vmlinuz-2.6.20-ck1
-rw-r--r-- 1 root root 1703984 2010-01-07 23:38 /boot/vmlinuz-2.6.32.3
-rw-r--r-- 1 root root 1715232 2010-02-12 01:44 /boot/vmlinuz-2.6.33-rc4
-rw-r--r-- 1 root root 1761616 2010-02-09 00:44 /boot/vmlinuz-2.6.33-rc5
-rw-r--r-- 1 root root 1820704 2010-01-31 23:58 /boot/vmlinuz-2.6.33-rc6
-rw-r--r-- 1 root root 1762240 2010-02-07 23:15 /boot/vmlinuz-2.6.33-rc7
-rw-r--r-- 1 root root 1716080 2010-02-15 00:51 /boot/vmlinuz-2.6.33-rc8

One might be tempted to point at a possible boot loader image size issue,
but then working -rc8 is even a bit bigger than non-booting -rc4.

clocksource mutex deadlock, cat current_clocksource (2.6.33-rc6)

Andreas Mohr

Thomas Gleixner

Andreas Mohr

Andreas Mohr

Thomas Gleixner

Andreas Mohr

Thomas Gleixner

Andreas Mohr

Thomas Gleixner

Andreas Mohr

Thomas Gleixner

Andreas Mohr

Andreas Mohr

Andreas Mohr

Thomas Gleixner

Thomas Gleixner

Andreas Mohr

Andreas Mohr

Andreas Mohr