Re: 2.6.33-rc6 crashes on resume

Rafael J. Wysocki

Feb 8, 2010, 6:10:03 PM
On Tuesday 09 February 2010, Bill Davidsen wrote:
> Pretty simple to reproduce, boot, suspend, press shift
>
> Acer Aspire 1681, Celeron, 2.6.33-rc6. Trace and config attached.

I guess 2.6.32 didn't crash?

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Bill Davidsen

Feb 8, 2010, 6:10:02 PM
resume_bug-2.trace
LAST_CONFIG

Rafael J. Wysocki

Feb 8, 2010, 6:20:03 PM
On Tuesday 09 February 2010, Bill Davidsen wrote:
> Pretty simple to reproduce, boot, suspend, press shift
>
> Acer Aspire 1681, Celeron, 2.6.33-rc6. Trace and config attached.

This looks suspicious:

Feb 8 17:32:47 roadwarrior3 kernel: ata2.00: ACPI cmd 03/42:00:00:00:a0:ef (CFA REQUEST EXTENDED ERROR) rejected by device (Stat=0x51 Err=0x04)
Feb 8 17:32:47 roadwarrior3 kernel: sd 0:0:0:0: [sda] Starting disk
Feb 8 17:32:47 roadwarrior3 kernel: ata2.00: ACPI cmd 00/0c:00:00:00:a0:ef (NOP) rejected by device (Stat=0x51 Err=0x04)
Feb 8 17:32:47 roadwarrior3 kernel: ata2.00: configured for UDMA/33
Feb 8 17:32:47 roadwarrior3 kernel: firewire_core: rediscovered device fw0
Feb 8 17:32:47 roadwarrior3 kernel: ata1.00: ACPI cmd 03/45:00:00:00:a0:ef (CFA REQUEST EXTENDED ERROR) rejected by device (Stat=0x51 Err=0x04)
Feb 8 17:32:47 roadwarrior3 kernel: ata1.00: ACPI cmd 00/0c:00:00:00:a0:ef (NOP) rejected by device (Stat=0x51 Err=0x04)
Feb 8 17:32:47 roadwarrior3 kernel: ata1.00: configured for UDMA/100

And the crash actually happens _after_ resume:

Feb 8 17:32:47 roadwarrior3 kernel: PM: resume of devices complete after 1977.587 msecs
Feb 8 17:32:47 roadwarrior3 kernel: Restarting tasks ... done.
Feb 8 17:32:47 roadwarrior3 kernel: hub 2-0:1.0: over-current change on port 1
Feb 8 17:32:47 roadwarrior3 kernel: hub 2-0:1.0: over-current change on port 2
Feb 8 17:32:47 roadwarrior3 kernel: BUG: unable to handle kernel paging request at f72fd5a0
Feb 8 17:32:47 roadwarrior3 kernel: IP: [<c0552c43>] ext4_xattr_get+0xa0/0x232
Feb 8 17:32:47 roadwarrior3 kernel: *pde = 00007067 *pte = 00000000
Feb 8 17:32:47 roadwarrior3 kernel: Oops: 0000 [#1] SMP
Feb 8 17:32:47 roadwarrior3 kernel: last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Feb 8 17:32:47 roadwarrior3 kernel: Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss aes_generic lib80211_crypt_ccmp fuse sunrpc cpufreq_ondemand acpi_cpufreq ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 ext2 uinput snd_intel8x0 snd_intel8x0m snd_ac97_codec ac97_bus snd_seq snd_seq_device ipw2200 snd_pcm libipw sbs cfg80211 b44 snd_timer ssb sbshc snd rfkill serio_raw tifm_7xx1 iTCO_wdt mii tifm_core lib80211 soundcore joydev iTCO_vendor_support i2c_i801 snd_page_alloc dm_multipath firewire_ohci firewire_core yenta_socket crc_itu_t rsrc_nonstatic video output radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Feb 8 17:32:47 roadwarrior3 kernel:
Feb 8 17:32:47 roadwarrior3 kernel: Pid: 2396, comm: atd Not tainted 2.6.33-rc6 #1 Aspire 1680 /Aspire 1680
Feb 8 17:32:47 roadwarrior3 kernel: EIP: 0060:[<c0552c43>] EFLAGS: 00010286 CPU: 0
Feb 8 17:32:47 roadwarrior3 kernel: EIP is at ext4_xattr_get+0xa0/0x232
Feb 8 17:32:47 roadwarrior3 kernel: EAX: c1d82400 EBX: f72fd5a0 ECX: f72fd5a0 EDX: f72fd600
Feb 8 17:32:47 roadwarrior3 kernel: ESI: 00000000 EDI: f6a6c2f8 EBP: f5c51e08 ESP: f5c51dc8
Feb 8 17:32:47 roadwarrior3 kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Feb 8 17:32:47 roadwarrior3 kernel: Process atd (pid: 2396, ti=f5c50000 task=f39a7230 task.ti=f5c50000)
Feb 8 17:32:47 roadwarrior3 kernel: Stack:
Feb 8 17:32:47 roadwarrior3 kernel: f6a6c2f8 c2be22a0 00000163 00000007 c08b2bc1 00000006 f6a6c2c8 f6a6c270
Feb 8 17:32:47 roadwarrior3 kernel: <0> f72fd59c f69ea900 00000500 00000008 f72fd5a0 c08b2bc1 c1838bb0 f5c51e5c
Feb 8 17:32:47 roadwarrior3 kernel: <0> f5c51e24 c0553bb0 f5c51e5c 00000014 c08b2bc1 c097807c c08b2bb8 f5c51e48
Feb 8 17:32:47 roadwarrior3 kernel: Call Trace:
Feb 8 17:32:47 roadwarrior3 kernel: [<c0553bb0>] ? ext4_xattr_security_get+0x39/0x47
Feb 8 17:32:47 roadwarrior3 kernel: [<c04e496e>] ? generic_getxattr+0x66/0x6a
Feb 8 17:32:47 roadwarrior3 kernel: [<c04e4908>] ? generic_getxattr+0x0/0x6a
Feb 8 17:32:47 roadwarrior3 kernel: [<c057376f>] ? get_vfs_caps_from_disk+0x45/0xcb
Feb 8 17:32:47 roadwarrior3 kernel: [<c057e984>] ? selinux_capable+0xb1/0xf6
Feb 8 17:32:47 roadwarrior3 kernel: [<c0573881>] ? cap_bprm_set_creds+0x8c/0x2f9
Feb 8 17:32:47 roadwarrior3 kernel: [<c057e673>] ? selinux_bprm_set_creds+0x31/0x20d
Feb 8 17:32:47 roadwarrior3 kernel: [<c04b795a>] ? __vma_link+0x5d/0x61
Feb 8 17:32:47 roadwarrior3 kernel: [<c04b79b6>] ? vma_link+0x58/0x77
Feb 8 17:32:47 roadwarrior3 kernel: [<c04b7a8c>] ? insert_vm_struct+0xb7/0xca
Feb 8 17:32:47 roadwarrior3 kernel: [<c0573ccb>] ? security_bprm_set_creds+0x11/0x13
Feb 8 17:32:47 roadwarrior3 kernel: [<c04d46d2>] ? prepare_binprm+0xbd/0xf4
Feb 8 17:32:47 roadwarrior3 kernel: [<c04d37b6>] ? count+0x34/0x6f
Feb 8 17:32:47 roadwarrior3 kernel: [<c04d4ca8>] ? do_execve+0x132/0x270
Feb 8 17:32:47 roadwarrior3 kernel: [<c04091f0>] ? sys_execve+0x30/0x58
Feb 8 17:32:47 roadwarrior3 kernel: [<c040346e>] ? ptregs_execve+0x12/0x18
Feb 8 17:32:47 roadwarrior3 kernel: [<c040339f>] ? sysenter_do_call+0x12/0x28
Feb 8 17:32:47 roadwarrior3 kernel: Code: d9 89 5d f0 8b 87 a4 00 00 00 8b 80 84 01 00 00 03 50 64 eb 16 0f b6 01 83 c0 13 83 e0 fc 01 c1 39 d1 72 07 be fb ff ff ff eb 34 <83> 39 00 75 e5 e9 26 01 00 00 8b 45 f0 83 7d 08 00 8b 58 08 74
Feb 8 17:32:47 roadwarrior3 kernel: EIP: [<c0552c43>] ext4_xattr_get+0xa0/0x232 SS:ESP 0068:f5c51dc8
Feb 8 17:32:47 roadwarrior3 kernel: CR2: 00000000f72fd5a0
Feb 8 17:32:47 roadwarrior3 kernel: ---[ end trace b43953454f125bbd ]---

Any chance you could use gdb to identify the code at ext4_xattr_get+0xa0?
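
For readers following along: a common way to answer this is to load a vmlinux built with debug info into gdb and ask it to list the source at the symbol+offset reported in the oops. The Python sketch below only extracts that symbol and offset from the IP line above and formats the gdb command. The helper names (parse_oops_symbol, gdb_list_command) are made up for illustration, and it assumes a vmlinux with debug info that matches the crashed kernel is available.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" form used in kernel oops output.
OOPS_SYMBOL = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_oops_symbol(line):
    """Return (symbol, offset, size) from an oops line, or None if absent."""
    match = OOPS_SYMBOL.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


def gdb_list_command(symbol, offset):
    """Format the gdb command that lists source at symbol+offset."""
    return "list *({}+{:#x})".format(symbol, offset)


# The IP line from the trace in this message.
line = "IP: [<c0552c43>] ext4_xattr_get+0xa0/0x232"
symbol, offset, size = parse_oops_symbol(line)
print(gdb_list_command(symbol, offset))  # list *(ext4_xattr_get+0xa0)
```

Starting "gdb vmlinux" and entering the printed command should show the source line near the faulting instruction, provided the binary matches the kernel that oopsed.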

Tejun Heo

Feb 8, 2010, 10:30:01 PM
On 02/09/2010 08:16 AM, Rafael J. Wysocki wrote:
> On Tuesday 09 February 2010, Bill Davidsen wrote:
>> Pretty simple to reproduce, boot, suspend, press shift
>>
>> Acer Aspire 1681, Celeron, 2.6.33-rc6. Trace and config attached.
>
> This looks suspicious:
>
> Feb 8 17:32:47 roadwarrior3 kernel: ata2.00: ACPI cmd 03/42:00:00:00:a0:ef (CFA REQUEST EXTENDED ERROR) rejected by device (Stat=0x51 Err=0x04)
> Feb 8 17:32:47 roadwarrior3 kernel: sd 0:0:0:0: [sda] Starting disk
> Feb 8 17:32:47 roadwarrior3 kernel: ata2.00: ACPI cmd 00/0c:00:00:00:a0:ef (NOP) rejected by device (Stat=0x51 Err=0x04)
> Feb 8 17:32:47 roadwarrior3 kernel: ata2.00: configured for UDMA/33
> Feb 8 17:32:47 roadwarrior3 kernel: firewire_core: rediscovered device fw0
> Feb 8 17:32:47 roadwarrior3 kernel: ata1.00: ACPI cmd 03/45:00:00:00:a0:ef (CFA REQUEST EXTENDED ERROR) rejected by device (Stat=0x51 Err=0x04)
> Feb 8 17:32:47 roadwarrior3 kernel: ata1.00: ACPI cmd 00/0c:00:00:00:a0:ef (NOP) rejected by device (Stat=0x51 Err=0x04)
> Feb 8 17:32:47 roadwarrior3 kernel: ata1.00: configured for UDMA/100

These aren't harmful in themselves. It just means that the device
failed commands BIOS requested via ACPI. Just in case, does
"libata.noacpi=1" make any difference to the oops?
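
As a rough illustration of why these lines read as "command rejected" rather than as a failing disk, the sketch below decodes the Stat and Err values from the log using the classic ATA register bit names. The tables and the decode_bits helper are illustrative only and are not taken from libata; some of these bits were renamed or repurposed in later ATA revisions.

```python
# Classic ATA status register bits (assumption: traditional names;
# later ATA revisions redefine some of them).
ATA_STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}

# Classic ATA error register bits.
ATA_ERROR_BITS = {
    0x80: "ICRC",
    0x40: "UNC",
    0x20: "MC",
    0x10: "IDNF",
    0x08: "MCR",
    0x04: "ABRT",
    0x02: "TK0NF",
    0x01: "AMNF",
}


def decode_bits(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]


# Values from the "rejected by device (Stat=0x51 Err=0x04)" lines.
print(decode_bits(0x51, ATA_STATUS_BITS))  # ['DRDY', 'DSC', 'ERR']
print(decode_bits(0x04, ATA_ERROR_BITS))   # ['ABRT']
```

ERR in the status register together with ABRT in the error register is how a drive reports that it aborted a command it does not accept, which fits the explanation above.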

Thanks.

--
tejun

Bill Davidsen

Feb 9, 2010, 10:10:02 AM
Tejun Heo wrote:
> On 02/09/2010 08:16 AM, Rafael J. Wysocki wrote:
>
>> On Tuesday 09 February 2010, Bill Davidsen wrote:
>>
>>> Pretty simple to reproduce, boot, suspend, press shift
>>>
>>> Acer Aspire 1681, Celeron, 2.6.33-rc6. Trace and config attached.
>>>
>> This looks suspicious:
>>
>> Feb 8 17:32:47 roadwarrior3 kernel: ata2.00: ACPI cmd 03/42:00:00:00:a0:ef (CFA REQUEST EXTENDED ERROR) rejected by device (Stat=0x51 Err=0x04)
>> Feb 8 17:32:47 roadwarrior3 kernel: sd 0:0:0:0: [sda] Starting disk
>>
[snip]

> These aren't harmful in themselves. It just means that the device
> failed commands BIOS requested via ACPI. Just in case, does
> "libata.noacpi=1" make any difference to the oops?
>

I tried that, with a most interesting result. The kernel still crashes, but
then starts removing modules at random, re-crashing and declaring itself
tainted, and finally gets up (for some definition of up) enough to issue
shutdown. It then crashes its way to a halt. I have attached the first
BUG here, but the whole log, from boot to shutdown, is too large to inflict
on people, so I put it up for download on the theory that it is unlikely
to be helpful or interesting, just big.

At this point I'm going to build 2.6.33-rc{latest} and try again. I
hoped this could be useful quickly, so I'll use the latest in the hope
that it shows the problem fixed.

The whole log is at
http://www.tmr.com/~davidsen/Private/noacpi_crash.trace.bz2

--
Bill Davidsen <davi...@tmr.com>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein

noacpi_crash_1st.trace

Bill Davidsen

Feb 9, 2010, 6:20:01 PM
Rafael J. Wysocki wrote:
> On Tuesday 09 February 2010, Bill Davidsen wrote:
>> Pretty simple to reproduce, boot, suspend, press shift
>>
>> Acer Aspire 1681, Celeron, 2.6.33-rc6. Trace and config attached.
>
> I guess 2.6.32 didn't crash?
>
Wasn't testing. 2.6.32.5 didn't crash; later Fedora kernels kill the touchpad.
I can dig up the old bugzilla if you can't find it.

--
Bill Davidsen <davi...@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

Rafael J. Wysocki

Feb 9, 2010, 7:50:03 PM
On Wednesday 10 February 2010, Bill Davidsen wrote:
> Rafael J. Wysocki wrote:
> > On Tuesday 09 February 2010, Bill Davidsen wrote:
> >> Pretty simple to reproduce, boot, suspend, press shift
> >>
> >> Acer Aspire 1681, Celeron, 2.6.33-rc6. Trace and config attached.
> >
> > I guess 2.6.32 didn't crash?
> >
> Wasn't testing. 2.6.32.5 didn't crash; later Fedora kernels kill the touchpad.
> I can dig up the old bugzilla if you can't find it.

Well, so it seems something during the 2.6.33 merge window broke things for
you.

Unfortunately there have been quite a few resume regressions of all sorts in
this cycle, so bisecting that might be difficult. That said, bisection would
probably be the fastest way of debugging this issue.
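
To give a feel for the cost, here is a minimal sketch of the binary search that bisection performs. It assumes a linear history in which every commit after the first bad one is also bad. Real kernel history is not linear, and other regressions in the range can force skipped steps, so treat the numbers as a rough estimate. The commit count of 10000 is a hypothetical figure, not the real size of the 2.6.32 to 2.6.33-rc6 range.

```python
import math


def bisect_steps(num_commits):
    """Approximate number of test builds needed to isolate one bad commit."""
    return math.ceil(math.log2(num_commits)) if num_commits > 1 else 0


def first_bad(commits, is_bad):
    """Binary-search for the first bad commit.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and every commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is after mid
    return commits[lo], tests


# Hypothetical range of 10000 commits with the regression at index 6789.
commits = list(range(10000))
culprit, tests = first_bad(commits, lambda c: c >= 6789)
print(culprit, tests, bisect_steps(len(commits)))
```

Each test here stands for one build, boot and suspend/resume cycle, which is why even a large range is usually tractable with git bisect.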

Rafael

Maciej Rutecki

Feb 11, 2010, 3:00:02 PM
On Tuesday, 9 February 2010 at 00:03:11, Bill Davidsen wrote:
> Pretty simple to reproduce, boot, suspend, press shift
>
> Acer Aspire 1681, Celeron, 2.6.33-rc6. Trace and config attached.
>

I created a Bugzilla entry at http://bugzilla.kernel.org/show_bug.cgi?id=15277
for your bug report. Please add your address to the CC list there. Thanks!

Regards
--
Maciej Rutecki
http://www.maciek.unixy.pl
