2.6.33-rc5: Reported regressions from 2.6.32

Rafael J. Wysocki

Jan 24, 2010, 5:10:03 PM
This message contains a list of some regressions from 2.6.32, for which there
are no fixes in the mainline I know of. If any of them have been fixed already,
please let me know.

If you know of any other unresolved regressions from 2.6.32, please let me know
as well and I'll add them to the list. Also, please let me know if any of the
entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

Date          Total  Pending  Unresolved
----------------------------------------
2010-01-24       75       29          23
2010-01-10       55       33          21
2009-12-29       36       34          27


Unresolved regressions
----------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15139
Subject : e1000: transmit queue 0 timed out
Submitter : Alexander Beregalov <a.ber...@gmail.com>
Date : 2010-01-23 15:37 (2 days old)
References : http://marc.info/?l=linux-netdev&m=126426149306083&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15138
Subject : evdev regression on macbook
Submitter : Guillaume Chazarain <gui...@gmail.com>
Date : 2010-01-23 18:53 (2 days old)
References : http://marc.info/?l=linux-kernel&m=126427286219235&w=4
Handled-By : Dmitry Torokhov <dmitry....@gmail.com>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15133
Subject : Wake on LAN doesn't work in sky2
Submitter : Tino Keitel <tino....@tikei.de>
Date : 2010-01-15 9:10 (10 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=166a0fd4c788ec7f10ca8194ec6d526afa12db75
References : http://marc.info/?l=linux-kernel&m=126354704815848&w=4
Handled-By : Stephen Hemminger <shemm...@vyatta.com>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15132
Subject : OOPS's with large initramfs
Submitter : Nigel Kukard <nku...@lbsd.net>
Date : 2010-01-16 11:12 (9 days old)
References : http://marc.info/?l=linux-kernel&m=126364100321603&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15129
Subject : [drm:i915_gem_execbuffer] *ERROR* i915_gem_do_execbuffer returns -512
Submitter : Miles Lane <miles...@gmail.com>
Date : 2010-01-14 23:18 (11 days old)
References : http://lkml.org/lkml/2010/1/14/570
Handled-By : Chris Wilson <ch...@chris-wilson.co.uk>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15126
Subject : REGRESSION for RT2561/RT61 in 2.6.33
Submitter : Alan Stern <st...@rowland.harvard.edu>
Date : 2010-01-11 14:54 (14 days old)
References : http://marc.info/?l=linux-kernel&m=126322167427159&w=4
Handled-By : Johannes Berg <joha...@sipsolutions.net>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15125
Subject : hung task - jbd2/dm-1-8 (during raid rebuild)
Submitter : Michael Breuer <mbr...@majjas.com>
Date : 2010-01-10 21:47 (15 days old)
References : http://marc.info/?l=linux-kernel&m=126316012025978&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15124
Subject : PCI host bridge windows ignored (works with pci=use_crs)
Submitter : Jeff Garrett <je...@jgarrett.org>
Date : 2010-01-13 5:37 (12 days old)
References : http://marc.info/?l=linux-kernel&m=126336296600307&w=4
Handled-By : Yinghai Lu <yin...@kernel.org>
             Bjorn Helgaas <bjorn....@hp.com>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15076
Subject : System panic under load with clockevents_program_event
Submitter : okias <d.o...@gmail.com>
Date : 2010-01-17 13:03 (8 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15043
Subject : Display goes off with i915.powersave=1
Submitter : Soeren Sonnenburg <so...@debian.org>
Date : 2010-01-10 20:09 (15 days old)
References : http://marc.info/?l=linux-kernel&m=126315457519505&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15038
Subject : drm/ksm: fbdev blanking regression
Submitter : Johan Hovold <jho...@gmail.com>
Date : 2010-01-06 17:00 (19 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=731b5a15a3b1474a41c2ca29b4c32b0f21bc852e
References : http://marc.info/?l=linux-kernel&m=126279726418748&w=4
Handled-By : James Simmons <jsim...@infradead.org>


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15036
Subject : soft lockup in dmesg after suspend/resume
Submitter : ykzhao <yakui...@intel.com>
Date : 2010-01-04 5:36 (21 days old)
References : http://marc.info/?l=linux-kernel&m=126258356202722&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15032
Subject : Oops in uart_resume_port() on resume
Submitter : Zdenek Kabelac <zdenek....@gmail.com>
Date : 2010-01-04 15:47 (21 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba15ab0e8de0d4439a91342ad52d55ca9e313f3d
References : http://marc.info/?l=linux-kernel&m=126262008815689&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15025
Subject : Oops in ext4 driver
Submitter : Steinar H. Gunderson <sgund...@bigfoot.com>
Date : 2010-01-10 13:09 (15 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15017
Subject : kexec regression, radeon/kms irq related (bisected)
Submitter : Markus Trippelsdorf <mar...@trippelsdorf.de>
Date : 2010-01-09 18:49 (16 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d8f60cfc93452d0554f6a701aa8e3236cbee4636


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15000
Subject : Thinkpad dock button no longer works
Submitter : Paul Martin <p...@debian.org>
Date : 2010-01-07 02:11 (18 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14999
Subject : possible circular locking dependency detected in rfkill at suspend
Submitter : Christian Casteyde <casteyde....@free.fr>
Date : 2010-01-06 21:52 (19 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14950
Subject : tbench regression with 2.6.33-rc1
Submitter : Lin Ming <ming....@intel.com>
Date : 2009-12-25 11:11 (31 days old)
References : http://marc.info/?l=linux-kernel&m=126174044213172&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14946
Subject : All kernels after 2.6.32-git10 show only 1 CPU
Submitter : Sid Boyce <sbo...@blueyonder.co.uk>
Date : 2009-12-23 16:55 (33 days old)
References : http://marc.info/?l=linux-kernel&m=126158734326801&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14937
Subject : WARNING: at kernel/lockdep.c:2830
Submitter : Grant Wilson <grant....@zen.co.uk>
Date : 2009-12-27 13:35 (29 days old)
References : http://marc.info/?l=linux-kernel&m=126192220404829&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14924
Subject : Weird hard hangs when rendering 'some' web-sites in Firefox
Submitter : David <da...@unsolicited.net>
Date : 2009-12-21 21:53 (35 days old)
References : http://marc.info/?l=linux-kernel&m=126143375823340&w=4


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14859
Subject : System timer firing too much without cause
Submitter : Shawn Starr <shawn...@rogers.com>
Date : 2009-12-21 19:16 (35 days old)


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14792
Subject : Misdetection of the TV output
Submitter : Santi <sa...@agolina.net>
Date : 2009-12-12 13:28 (44 days old)


Regressions with patches
------------------------

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15137
Subject : NULL pointer dereference in vlan_skb_recv
Submitter : Bruno Prémont <bon...@linux-vserver.org>
Date : 2010-01-23 15:56 (2 days old)
References : http://marc.info/?l=linux-kernel&m=126426286507497&w=4
Handled-By : Eric Dumazet <eric.d...@gmail.com>
Patch : http://patchwork.kernel.org/patch/74999/
        http://patchwork.kernel.org/patch/75002/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15131
Subject : [OOPS] radeon kms
Submitter : John Kacur <jka...@redhat.com>
Date : 2010-01-15 15:45 (10 days old)
References : http://lkml.org/lkml/2010/1/15/129
Handled-By : Jerome Glisse <gli...@freedesktop.org>
Patch : http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=patch;h=30d2d9a54d48e4fefede0389ded1b6fc2d44a522


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15128
Subject : Boot regression on AMD
Submitter : Gene Heskett <gene.h...@verizon.net>
Date : 2010-01-13 20:21 (12 days old)
References : http://marc.info/?l=linux-kernel&m=126341413213017&w=4
Handled-By : Andreas Herrmann <andreas....@amd.com>
Patch : http://patchwork.kernel.org/patch/74883/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15039
Subject : leds_alix2: can't allocate I/O for GPIO
Submitter : Arnd Hannemann <hann...@nets.rwth-aachen.de>
Date : 2010-01-07 10:26 (18 days old)
References : http://marc.info/?l=linux-kernel&m=126286001106257&w=4
Handled-By : Daniel Mack <dan...@caiaq.de>
Patch : http://patchwork.kernel.org/patch/72006/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14949
Subject : drm_vm.c:drm_mmap: possible circular locking dependency detected
Submitter : Borislav Petkov <petk...@googlemail.com>
Date : 2009-12-26 9:45 (30 days old)
References : http://marc.info/?l=linux-kernel&m=126182073616279&w=4
Handled-By : Eric W. Biederman <ebie...@aristanetworks.com>
Patch : http://patchwork.kernel.org/patch/70461/


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14791
Subject : Something has been broken in the network stack this week
Submitter : Delete This Account <speedybo...@hotmail.com>
Date : 2009-12-12 13:06 (44 days old)
Handled-By : Ben Hutchings <b...@decadent.org.uk>
Patch : http://patchwork.kernel.org/patch/72073/


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.32,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=14885

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Rafael J. Wysocki

Jan 24, 2010, 5:20:01 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Rafael J. Wysocki

Jan 24, 2010, 5:20:01 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15017
Subject : kexec regression, radeon/kms irq related (bisected)
Submitter : Markus Trippelsdorf <mar...@trippelsdorf.de>
Date : 2010-01-09 18:49 (16 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d8f60cfc93452d0554f6a701aa8e3236cbee4636

Rafael J. Wysocki

Jan 24, 2010, 5:20:01 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14946
Subject : All kernels after 2.6.32-git10 show only 1 CPU
Submitter : Sid Boyce <sbo...@blueyonder.co.uk>
Date : 2009-12-23 16:55 (33 days old)
References : http://marc.info/?l=linux-kernel&m=126158734326801&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:20:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15125
Subject : hung task - jbd2/dm-1-8 (during raid rebuild)
Submitter : Michael Breuer <mbr...@majjas.com>
Date : 2010-01-10 21:47 (15 days old)
References : http://marc.info/?l=linux-kernel&m=126316012025978&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:20:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15138
Subject : evdev regression on macbook
Submitter : Guillaume Chazarain <gui...@gmail.com>
Date : 2010-01-23 18:53 (2 days old)
References : http://marc.info/?l=linux-kernel&m=126427286219235&w=4
Handled-By : Dmitry Torokhov <dmitry....@gmail.com>

Rafael J. Wysocki

Jan 24, 2010, 5:20:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14949
Subject : drm_vm.c:drm_mmap: possible circular locking dependency detected
Submitter : Borislav Petkov <petk...@googlemail.com>
Date : 2009-12-26 9:45 (30 days old)
References : http://marc.info/?l=linux-kernel&m=126182073616279&w=4
Handled-By : Eric W. Biederman <ebie...@aristanetworks.com>
Patch : http://patchwork.kernel.org/patch/70461/

Rafael J. Wysocki

Jan 24, 2010, 5:20:01 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14792
Subject : Misdetection of the TV output
Submitter : Santi <sa...@agolina.net>
Date : 2009-12-12 13:28 (44 days old)

Rafael J. Wysocki

Jan 24, 2010, 5:20:01 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15076
Subject : System panic under load with clockevents_program_event
Submitter : okias <d.o...@gmail.com>
Date : 2010-01-17 13:03 (8 days old)

Rafael J. Wysocki

Jan 24, 2010, 5:20:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15038
Subject : drm/ksm: fbdev blanking regression
Submitter : Johan Hovold <jho...@gmail.com>
Date : 2010-01-06 17:00 (19 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=731b5a15a3b1474a41c2ca29b4c32b0f21bc852e
References : http://marc.info/?l=linux-kernel&m=126279726418748&w=4
Handled-By : James Simmons <jsim...@infradead.org>

Rafael J. Wysocki

Jan 24, 2010, 5:20:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15139
Subject : e1000: transmit queue 0 timed out
Submitter : Alexander Beregalov <a.ber...@gmail.com>
Date : 2010-01-23 15:37 (2 days old)
References : http://marc.info/?l=linux-netdev&m=126426149306083&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:20:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14999
Subject : possible circular locking dependency detected in rfkill at suspend
Submitter : Christian Casteyde <casteyde....@free.fr>
Date : 2010-01-06 21:52 (19 days old)

Rafael J. Wysocki

Jan 24, 2010, 5:20:01 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15137
Subject : NULL pointer dereference in vlan_skb_recv
Submitter : Bruno Prémont <bon...@linux-vserver.org>
Date : 2010-01-23 15:56 (2 days old)
References : http://marc.info/?l=linux-kernel&m=126426286507497&w=4
Handled-By : Eric Dumazet <eric.d...@gmail.com>
Patch : http://patchwork.kernel.org/patch/74999/
        http://patchwork.kernel.org/patch/75002/

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15025
Subject : Oops in ext4 driver
Submitter : Steinar H. Gunderson <sgund...@bigfoot.com>
Date : 2010-01-10 13:09 (15 days old)

Rafael J. Wysocki

Jan 24, 2010, 5:30:03 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15133
Subject : Wake on LAN doesn't work in sky2
Submitter : Tino Keitel <tino....@tikei.de>
Date : 2010-01-15 9:10 (10 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=166a0fd4c788ec7f10ca8194ec6d526afa12db75
References : http://marc.info/?l=linux-kernel&m=126354704815848&w=4
Handled-By : Stephen Hemminger <shemm...@vyatta.com>

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15129
Subject : [drm:i915_gem_execbuffer] *ERROR* i915_gem_do_execbuffer returns -512
Submitter : Miles Lane <miles...@gmail.com>
Date : 2010-01-14 23:18 (11 days old)
References : http://lkml.org/lkml/2010/1/14/570
Handled-By : Chris Wilson <ch...@chris-wilson.co.uk>

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14950
Subject : tbench regression with 2.6.33-rc1
Submitter : Lin Ming <ming....@intel.com>
Date : 2009-12-25 11:11 (31 days old)
References : http://marc.info/?l=linux-kernel&m=126174044213172&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15128
Subject : Boot regression on AMD
Submitter : Gene Heskett <gene.h...@verizon.net>
Date : 2010-01-13 20:21 (12 days old)
References : http://marc.info/?l=linux-kernel&m=126341413213017&w=4
Handled-By : Andreas Herrmann <andreas....@amd.com>
Patch : http://patchwork.kernel.org/patch/74883/

Rafael J. Wysocki

Jan 24, 2010, 5:30:03 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15132
Subject : OOPS's with large initramfs
Submitter : Nigel Kukard <nku...@lbsd.net>
Date : 2010-01-16 11:12 (9 days old)
References : http://marc.info/?l=linux-kernel&m=126364100321603&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:30:04 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15039
Subject : leds_alix2: can't allocate I/O for GPIO
Submitter : Arnd Hannemann <hann...@nets.rwth-aachen.de>
Date : 2010-01-07 10:26 (18 days old)
References : http://marc.info/?l=linux-kernel&m=126286001106257&w=4
Handled-By : Daniel Mack <dan...@caiaq.de>
Patch : http://patchwork.kernel.org/patch/72006/

Rafael J. Wysocki

Jan 24, 2010, 5:30:04 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14859
Subject : System timer firing too much without cause
Submitter : Shawn Starr <shawn...@rogers.com>
Date : 2009-12-21 19:16 (35 days old)

Rafael J. Wysocki

Jan 24, 2010, 5:30:03 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15126
Subject : REGRESSION for RT2561/RT61 in 2.6.33
Submitter : Alan Stern <st...@rowland.harvard.edu>
Date : 2010-01-11 14:54 (14 days old)
References : http://marc.info/?l=linux-kernel&m=126322167427159&w=4
Handled-By : Johannes Berg <joha...@sipsolutions.net>

Rafael J. Wysocki

Jan 24, 2010, 5:30:03 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14937
Subject : WARNING: at kernel/lockdep.c:2830
Submitter : Grant Wilson <grant....@zen.co.uk>
Date : 2009-12-27 13:35 (29 days old)
References : http://marc.info/?l=linux-kernel&m=126192220404829&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15124
Subject : PCI host bridge windows ignored (works with pci=use_crs)
Submitter : Jeff Garrett <je...@jgarrett.org>
Date : 2010-01-13 5:37 (12 days old)
References : http://marc.info/?l=linux-kernel&m=126336296600307&w=4
Handled-By : Yinghai Lu <yin...@kernel.org>
             Bjorn Helgaas <bjorn....@hp.com>

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15036
Subject : soft lockup in dmesg after suspend/resume
Submitter : ykzhao <yakui...@intel.com>
Date : 2010-01-04 5:36 (21 days old)
References : http://marc.info/?l=linux-kernel&m=126258356202722&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:30:02 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15043
Subject : Display goes off with i915.powersave=1
Submitter : Soeren Sonnenburg <so...@debian.org>
Date : 2010-01-10 20:09 (15 days old)
References : http://marc.info/?l=linux-kernel&m=126315457519505&w=4

Rafael J. Wysocki

Jan 24, 2010, 5:30:03 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15000
Subject : Thinkpad dock button no longer works
Submitter : Paul Martin <p...@debian.org>
Date : 2010-01-07 02:11 (18 days old)

Rafael J. Wysocki

Jan 24, 2010, 5:30:03 PM
This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.32. Please verify if it still should be listed and let me know
(either way).


Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14924
Subject : Weird hard hangs when rendering 'some' web-sites in Firefox
Submitter : David <da...@unsolicited.net>
Date : 2009-12-21 21:53 (35 days old)
References : http://marc.info/?l=linux-kernel&m=126143375823340&w=4

Alan Stern

Jan 24, 2010, 5:40:03 PM
On Sun, 24 Jan 2010, Rafael J. Wysocki wrote:

> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15126
> Subject : REGRESSION for RT2561/RT61 in 2.6.33
> Submitter : Alan Stern <st...@rowland.harvard.edu>
> Date : 2010-01-11 14:54 (14 days old)
> References : http://marc.info/?l=linux-kernel&m=126322167427159&w=4
> Handled-By : Johannes Berg <joha...@sipsolutions.net>

This bug entry can be removed from the list. It turned out not to be
a bug at all, just a kernel config error I made when updating to
2.6.33-rc1.

Alan Stern

Rafael J. Wysocki

Jan 24, 2010, 5:40:04 PM
On Sunday 24 January 2010, Alan Stern wrote:
> On Sun, 24 Jan 2010, Rafael J. Wysocki wrote:
>
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15126
> > Subject : REGRESSION for RT2561/RT61 in 2.6.33
> > Submitter : Alan Stern <st...@rowland.harvard.edu>
> > Date : 2010-01-11 14:54 (14 days old)
> > References : http://marc.info/?l=linux-kernel&m=126322167427159&w=4
> > Handled-By : Johannes Berg <joha...@sipsolutions.net>
>
> This bug entry can be removed from the list. It turned out not to be
> a bug at all, just a kernel config error I made when updating to
> 2.6.33-rc1.

Thanks, closed as "invalid".

Rafael

Shawn Starr

Jan 24, 2010, 5:50:03 PM
On Sunday 24 January 2010 17:04:33 Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14859
> Subject : System timer firing too much without cause
> Submitter : Shawn Starr <shawn...@rogers.com>
> Date : 2009-12-21 19:16 (35 days old)

The problem continues with -rc5. I really cannot use dynamic ticks at all; they
have to be disabled.

I should probably mention this CPU info:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
stepping : 10
cpu MHz : 800.000
cache size : 6144 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm ida
tpr_shadow vnmi flexpriority
bogomips : 5053.40
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

Johan Hovold

Jan 24, 2010, 5:50:04 PM
On Sun, Jan 24, 2010 at 11:04:36PM +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15038
> Subject : drm/ksm: fbdev blanking regression
> Submitter : Johan Hovold <jho...@gmail.com>
> Date : 2010-01-06 17:00 (19 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=731b5a15a3b1474a41c2ca29b4c32b0f21bc852e
> References : http://marc.info/?l=linux-kernel&m=126279726418748&w=4
> Handled-By : James Simmons <jsim...@infradead.org>

Issue remains in rc5.

/Johan

Michael Breuer

Jan 24, 2010, 5:50:01 PM
On 1/24/2010 5:04 PM, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15125
> Subject : hung task - jbd2/dm-1-8 (during raid rebuild)
> Submitter : Michael Breuer <mbr...@majjas.com>
> Date : 2010-01-10 21:47 (15 days old)
> References : http://marc.info/?l=linux-kernel&m=126316012025978&w=4
>
>
Not an easy one to recreate. Should probably remain listed for now.

Rafael J. Wysocki

Jan 24, 2010, 6:10:01 PM

Thanks for the update.

Rafael

Rafael J. Wysocki

Jan 24, 2010, 6:10:02 PM
On Sunday 24 January 2010, Steinar H. Gunderson wrote:

> On Sun, Jan 24, 2010 at 11:04:35PM +0100, Rafael J. Wysocki wrote:
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
>
> I'm not using 2.6.33 anymore since this bug is a showstopper to me (it's on a
> production system), so I'm unable to check if it's fixed or not.

Well, in that case I'll have to close it as 'unreproducible', because no one
else seems to be able to reproduce it.

Rafael

Steinar H. Gunderson

unread,
Jan 24, 2010, 6:10:02 PM1/24/10
to
On Sun, Jan 24, 2010 at 11:04:35PM +0100, Rafael J. Wysocki wrote:
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).

I'm not using 2.6.33 anymore since this bug is a showstopper to me (it's on a
production system), so I'm unable to check if it's fixed or not.

/* Steinar */
--
Homepage: http://www.sesse.net/

Nigel Kukard

unread,
Jan 24, 2010, 6:10:02 PM1/24/10
to
Verified. I have tested this as far back as 2.6.30 with the same problem
with "very" large initramfs's.

> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15132
> Subject : OOPS's with large initramfs
> Submitter : Nigel Kukard <nku...@lbsd.net>
> Date : 2010-01-16 11:12 (9 days old)
> References : http://marc.info/?l=linux-kernel&m=126364100321603&w=4
>
>
>


--
Regards
Nigel Kukard, PhD CompSc
Linux Based Systems Design (Pty) Ltd

Support: 086 747 7600 (premium 24/7/365)
Fax: 086 601 7884

Quote: The best language to use is the language that was designed for
what you want to use it for.

*** The attachment to my email signature.asc is a digital PGP
signature, if your mail client supports digital signatures it will
allow you to verify I am the sender of this email and that it has not
been tampered with along the way ***



Rafael J. Wysocki

unread,
Jan 24, 2010, 6:20:02 PM1/24/10
to
On Sunday 24 January 2010, Michael Breuer wrote:
> On 1/24/2010 5:04 PM, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15125
> > Subject : hung task - jbd2/dm-1-8 (during raid rebuild)
> > Submitter : Michael Breuer<mbr...@majjas.com>
> > Date : 2010-01-10 21:47 (15 days old)
> > References : http://marc.info/?l=linux-kernel&m=126316012025978&w=4
> >
> >
> Not an easy one to recreate. Should probably remain listed for now.

Thanks for the update.

Rafael

Rafael J. Wysocki

unread,
Jan 24, 2010, 6:20:01 PM1/24/10
to
On Sunday 24 January 2010, Nigel Kukard wrote:
> Verified. I have tested this as far back as 2.6.30 with the same problem
> with "very" large initramfs's.

So this is not a recent regression, but a bug that has been there for a long
time. Dropping from the list.

Thanks,
Rafael

Rafael J. Wysocki

unread,
Jan 24, 2010, 6:20:02 PM1/24/10
to
On Sunday 24 January 2010, Johan Hovold wrote:
> On Sun, Jan 24, 2010 at 11:04:36PM +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15038
> > Subject : drm/ksm: fbdev blanking regression
> > Submitter : Johan Hovold <jho...@gmail.com>
> > Date : 2010-01-06 17:00 (19 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=731b5a15a3b1474a41c2ca29b4c32b0f21bc852e
> > References : http://marc.info/?l=linux-kernel&m=126279726418748&w=4
> > Handled-By : James Simmons <jsim...@infradead.org>
>
> Issue remains in rc5.

Thanks for the update.

OK, we know what commit broke things, we don't seem to know how to fix it,
so perhaps it's time to revert that commit?

Rafael

Sid Boyce

unread,
Jan 24, 2010, 8:50:01 PM1/24/10
to
On 24/01/10 22:04, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14946
> Subject : All kernels after 2.6.32-git10 show only 1 CPU
> Submitter : Sid Boyce <sbo...@blueyonder.co.uk>
> Date : 2009-12-23 16:55 (33 days old)
> References : http://marc.info/?l=linux-kernel&m=126158734326801&w=4
>
>
>

Definitely fixed in 2.6.33-rc4, thanks.
Regards
Sid.
--
Sid Boyce ... Hamradio License G3VBV, Licensed Private Pilot
Emeritus IBM/Amdahl Mainframes and Sun/Fujitsu Servers Tech Support
Specialist, Cricket Coach
Microsoft Windows Free Zone - Linux used for all Computing Tasks

Alex Deucher

unread,
Jan 24, 2010, 10:30:02 PM1/24/10
to
On Sun, Jan 24, 2010 at 5:04 PM, Rafael J. Wysocki <r...@sisk.pl> wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15017
> Subject : kexec regression, radeon/kms irq related (bisected)
> Submitter : Markus Trippelsdorf <mar...@trippelsdorf.de>
> Date : 2010-01-09 18:49 (16 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d8f60cfc93452d0554f6a701aa8e3236cbee4636
>
>
>

That bug has patches which fix the issue attached and queued for 2.6.33.

Alex

Dave Airlie

unread,
Jan 25, 2010, 1:40:01 AM1/25/10
to
On Mon, 2010-01-25 at 00:13 +0100, Rafael J. Wysocki wrote:
> On Sunday 24 January 2010, Johan Hovold wrote:
> > On Sun, Jan 24, 2010 at 11:04:36PM +0100, Rafael J. Wysocki wrote:
> > > This message has been generated automatically as a part of a report
> > > of recent regressions.
> > >
> > > The following bug entry is on the current list of known regressions
> > > from 2.6.32. Please verify if it still should be listed and let me know
> > > (either way).
> > >
> > >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15038
> > > Subject : drm/ksm: fbdev blanking regression
> > > Submitter : Johan Hovold <jho...@gmail.com>
> > > Date : 2010-01-06 17:00 (19 days old)
> > > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=731b5a15a3b1474a41c2ca29b4c32b0f21bc852e
> > > References : http://marc.info/?l=linux-kernel&m=126279726418748&w=4
> > > Handled-By : James Simmons <jsim...@infradead.org>
> >
> > Issue remains in rc5.
>
> Thanks for the update.
>
> OK, we know what commit broke things, we don't seem to know how to fix it,
> so perhaps it's time to revert that commit?

Just sent revert of the broken bit to Linus.

Dave.

Dave Airlie

unread,
Jan 25, 2010, 1:40:01 AM1/25/10
to
On Sun, 2010-01-24 at 22:26 -0500, Alex Deucher wrote:
> On Sun, Jan 24, 2010 at 5:04 PM, Rafael J. Wysocki <r...@sisk.pl> wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15017
> > Subject : kexec regression, radeon/kms irq related (bisected)
> > Submitter : Markus Trippelsdorf <mar...@trippelsdorf.de>
> > Date : 2010-01-09 18:49 (16 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d8f60cfc93452d0554f6a701aa8e3236cbee4636
> >
> >
> >
>
> That bug has patches which fix the issue attached and queued for 2.6.33.

Just sent to Linus.

Dave.

Borislav Petkov

unread,
Jan 25, 2010, 3:30:01 AM1/25/10
to
On Sun, Jan 24, 2010 at 11:04:33PM +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).

Yep, this one is fixed by the patch below. Thanks.

>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14949
> Subject : drm_vm.c:drm_mmap: possible circular locking dependency detected
> Submitter : Borislav Petkov <petk...@googlemail.com>
> Date : 2009-12-26 9:45 (30 days old)
> References : http://marc.info/?l=linux-kernel&m=126182073616279&w=4
> Handled-By : Eric W. Biederman <ebie...@aristanetworks.com>
> Patch : http://patchwork.kernel.org/patch/70461/

--
Regards/Gruss,
Boris.

Thomas Gleixner

unread,
Jan 25, 2010, 4:00:02 AM1/25/10
to
Switched to email. Please reply to all instead of using the bugzilla
interface.

> --- Comment #4 from okias <d.o...@gmail.com> 2010-01-22 10:17:25 ---
> and it's regression. Now I work on 2.6.32.3 and no problem.

That's a really weird one. The system is 50 min up and running and out
of the blue it crashes in clockevents_program_event(). This function
has been called a couple of thousand times before that point.

The only way to crash there is when *dev is pointing into nirvana. dev
comes from

int tick_program_event(ktime_t expires, int force)
{
	struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev;

according to the callchain. At this point nothing fiddles with
tick_cpu_device.evtdev, so I suspect some really nasty memory
corruption going on.

okias, can you please disable highmem support and verify whether the
problem persists ?

Thanks,

tglx

Thomas Gleixner

unread,
Jan 25, 2010, 5:40:03 AM1/25/10
to
On Sun, 24 Jan 2010, Shawn Starr wrote:

> On Sunday 24 January 2010 17:04:33 Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).

Why is this on the regression list at all ? The report says that this
is happening with 33-rcX, but there is no comparison to the behaviour
of 32 or earlier kernels on that machine. Instead we have a comparison
of apples and oranges:

> As a comparison my quad core box has no such issue: (Running 2.6.32-rc7)
> x86_64
> 0: 42 4 1 1 IO-APIC-edge timer
>
> my Lenovo ThinkPad W500 (latest BIOS 3.11) laptop shows the system timer
> flooding the bus (Running 2.6.33-rc1) x86_64
> 0: 66775 70429 IO-APIC-edge timer <-- keeps rising, rapidly

So we look at a quad core desktop machine which probably has no deeper
power states and therefore does not use the broadcast timer and compare
it to a laptop which has deeper power states and needs to use the
broadcast timer, which of course increases the number of IRQ0
events. What a surprise.

Can we please remove this from the regression list unless Shawn
confirms that 32 or earlier kernels do not show that behaviour on the
laptop?

> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14859
> > Subject : System timer firing too much without cause
> > Submitter : Shawn Starr <shawn...@rogers.com>
> > Date : 2009-12-21 19:16 (35 days old)
>
> Continues with -rc5, I really cannot use Dynamic ticks at all, it has to be
> disabled.

Shawn, why can't you use dynamic ticks ? In the bugzilla I just see
that you worry about the IRQ0 interrupts (which are correct and
necessary when the system is in nohz mode) and the extra rescheduling
interrupts. How is the system misbehaving ?

Thanks,

tglx

okias

unread,
Jan 25, 2010, 6:40:03 AM1/25/10
to
Okay, I'll try without highmem soon.

2010/1/25, Thomas Gleixner <tg...@linutronix.de>:


> Switched to email. Please reply to all instead of using the bugzilla
> interface.
>
>> --- Comment #4 from okias <d.o...@gmail.com> 2010-01-22 10:17:25 ---
>> and it's regression. Now I work on 2.6.32.3 and no problem.
>
> That's a really weird one. The system is 50 min up and running and out
> of the blue it crashes in clockevents_program_event(). This function
> has been called a couple of thousand times before that point.
>
> The only way to crash there is when *dev is pointing into nirvana. dev
> comes from
>
> int tick_program_event(ktime_t expires, int force)
> {
> struct clock_event_device *dev =
> __get_cpu_var(tick_cpu_device).evtdev;
>
> according to the callchain. At this point nothing fiddles with
> tick_cpu_device.evtdev, so I suspect some really nasty memory
> corruption going on.
>
> okias, can you please disable highmem support and verify whether the
> problem persists ?
>
> Thanks,
>
> tglx
>


--
Jabber/XMPP: ok...@isgeek.info
SIP VoIP: sip:17474...@proxy01.sipphone.com

Américo Wang

unread,
Jan 25, 2010, 8:50:01 AM1/25/10
to
On Sun, Jan 24, 2010 at 11:04:41PM +0100, Rafael J. Wysocki wrote:
>This message has been generated automatically as a part of a report
>of recent regressions.
>
>The following bug entry is on the current list of known regressions
>from 2.6.32. Please verify if it still should be listed and let me know
>(either way).
>
>
>Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15137
>Subject : NULL pointer dereference in vlan_skb_recv
>Submitter : Bruno Prémont <bon...@linux-vserver.org>
>Date : 2010-01-23 15:56 (2 days old)
>References : http://marc.info/?l=linux-kernel&m=126426286507497&w=4
>Handled-By : Eric Dumazet <eric.d...@gmail.com>
>Patch : http://patchwork.kernel.org/patch/74999/
> http://patchwork.kernel.org/patch/75002/
>

This one can be closed, patch from Eric is already applied by David Miller.


--
Live like a child, think like the god.

Américo Wang

unread,
Jan 25, 2010, 9:00:01 AM1/25/10
to
On Sun, Jan 24, 2010 at 11:04:33PM +0100, Rafael J. Wysocki wrote:
>This message has been generated automatically as a part of a report
>of recent regressions.
>
>The following bug entry is on the current list of known regressions
>from 2.6.32. Please verify if it still should be listed and let me know
>(either way).
>
>
>Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14924
>Subject : Weird hard hangs when rendering 'some' web-sites in Firefox
>Submitter : David <da...@unsolicited.net>
>Date : 2009-12-21 21:53 (35 days old)
>References : http://marc.info/?l=linux-kernel&m=126143375823340&w=4

Hmm, you have CONFIG_DETECT_SOFTLOCKUP=y, I have no idea what happened,
doing a bisect would be appreciated.

Thanks.

--
Live like a child, think like the god.

Shawn Starr

unread,
Jan 25, 2010, 12:00:02 PM1/25/10
to

Well, this all stems from trying to use Radeon KMS with IRQs on. Doing so I
see system stalls and this is quite noticeable however, I am able to show this
same stall on the quad core with the same GPU.

Right now, it is unclear to me if there is a underlying irq issue or a bug in
the radeon driver code that is showing these stalls. Since the radeon folks -
at the moment - do not think it is a coding problem in their driver

My impression was using dynamic ticks meant ticks were on demand and not
continuous. On the quad core box, with dynamic ticks on, the broadcasts are
not increasing IRQ 0 events this only happens on the laptop.

Thanks,
Shawn.

Thomas Gleixner

unread,
Jan 25, 2010, 12:30:01 PM1/25/10
to
On Mon, 25 Jan 2010, Shawn Starr wrote:
> On Monday 25 January 2010 05:35:50 Thomas Gleixner wrote:
> > Shawn, why can't you use dynamic ticks ? In the bugzilla I just see
> > that you worry about the IRQ0 interrupts (which are correct and
> > necessary when the system is in nohz mode) and the extra rescheduling
> > interrupts. How is the system misbehaving ?
> >

> Well, this all stems from trying to use Radeon KMS with IRQs
> on. Doing so I see system stalls and this is quite noticeable
> however, I am able to show this same stall on the quad core with the
> same GPU. Right now, it is unclear to me if there is a underlying
> irq issue or a bug in the radeon driver code that is showing these
> stalls. Since the radeon folks - at the moment - do not think it is
> a coding problem in their driver

Does the stall go away, when you disable dynticks ?

> My impression was using dynamic ticks meant ticks were on demand and

Dynamic ticks provide a continuous tick as long as the machine is
busy. When a core becomes idle, we program the timer to go off at the
next scheduled timer event, if the event is further away than the next
tick. When the core goes out of idle (due to the timer or some other
event) we restart the tick.

So you see fewer timer interrupts (IRQ0 + Local timer interrupts).
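
As a rough illustration of the rule described above (not the kernel's actual code), the decision can be sketched in Python. The function name, its parameters, and the time units are hypothetical, and the real logic has further conditions that are left out here.

```python
def next_timer_interrupt(now, tick_period, next_event, cpu_idle):
    """Return the time at which the CPU's timer should fire next.

    Hypothetical model: all values share one arbitrary time unit.
    """
    next_tick = now + tick_period
    # Only an idle CPU may skip ticks, and only when the next scheduled
    # timer event lies beyond the next periodic tick.
    if cpu_idle and next_event > next_tick:
        return next_event
    # Busy CPU, or an event due before the next tick: keep ticking.
    return next_tick


# Busy CPU: the periodic tick is kept.
print(next_timer_interrupt(now=0, tick_period=4, next_event=100, cpu_idle=False))
# Idle CPU with a far-away event: sleep until that event.
print(next_timer_interrupt(now=0, tick_period=4, next_event=100, cpu_idle=True))
```

Under this simplified model an idle core takes one interrupt where a ticking core would take many, which is why the totals drop.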

> not continuous. On the quad core box, with dynamic ticks on, the
> broadcasts are not increasing IRQ 0 events this only happens on the
> laptop.

Right, that is expected as I explained already. Your desktop does not
use deeper power states. Check /proc/acpi/processor/CPU0/power on both
machines to see the difference. You _cannot_ compare a desktop and a
laptop machine and deduce a regression.

The broadcast mechanism is necessary because the local APIC timer
stops in deeper power states. That's a hardware problem. So if the
core goes into a deeper power state then we arm the broadcast timer
which fires on IRQ0 to wake us up. It is a single timer which is used
by all cores in a system to work around this hardware stupidity. It's
named broadcast because it broadcasts the event to the other cores
when necessary. Your desktop does not use deeper power states,
therefore it does not use the broadcast timer either.

So the timer IRQ0 increasing is neither a Linux BUG nor a regression.
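
The broadcast mechanism described above can be pictured with a small Python sketch. It is only a model under stated assumptions: the function and variable names are made up for illustration, and it ignores how the kernel actually tracks per-CPU state.

```python
def broadcast_expiry(next_wakeup, deep_idle_cpus):
    """Return the time to arm the shared broadcast timer for, or None.

    next_wakeup: dict mapping CPU id -> next timer event for that CPU.
    deep_idle_cpus: set of CPU ids whose local APIC timer is stopped.
    """
    pending = [next_wakeup[cpu] for cpu in deep_idle_cpus]
    return min(pending) if pending else None


wakeups = {0: 120, 1: 75}
# Desktop-like case: no core enters a deep power state, so the
# broadcast timer is not armed at all.
print(broadcast_expiry(wakeups, set()))
# Laptop-like case: both cores are in a deep state, so the single
# shared timer is armed for the earliest wakeup.
print(broadcast_expiry(wakeups, {0, 1}))
```

In this model a machine that never enters deep power states never arms the shared timer, which matches the low IRQ0 count on the desktop and the rising count on the laptop.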

Shawn Starr

unread,
Jan 25, 2010, 12:40:03 PM1/25/10
to
On Monday 25 January 2010 12:20:38 Thomas Gleixner wrote:
> On Mon, 25 Jan 2010, Shawn Starr wrote:
> > On Monday 25 January 2010 05:35:50 Thomas Gleixner wrote:
> > > Shawn, why can't you use dynamic ticks ? In the bugzilla I just see
> > > that you worry about the IRQ0 interrupts (which are correct and
> > > necessary when the system is in nohz mode) and the extra rescheduling
> > > interrupts. How is the system misbehaving ?
> >
> > Well, this all stems from trying to use Radeon KMS with IRQs
> > on. Doing so I see system stalls and this is quite noticeable
> > however, I am able to show this same stall on the quad core with the
> > same GPU. Right now, it is unclear to me if there is a underlying
> > irq issue or a bug in the radeon driver code that is showing these
> > stalls. Since the radeon folks - at the moment - do not think it is
> > a coding problem in their driver
>
> Does the stall go away, when you disable dynticks ?
>

It does not, no.

> > My impression was using dynamic ticks meant ticks were on demand and
>
> Dynamic ticks provide a continuous tick as long as the machine is
> busy. When a core becomes idle, we program the timer to go off at the
> next scheduled timer event, if the event is further away than the next
> tick. When the core goes out of idle (due to the timer or some other
> event) we restart the tick.
>
> So you see fewer timer interrupts (IRQ0 + Local timer interrupts).

With dynamic ticks on or off, LOC increments rapidly, but I assume that is
normal behaviour.

So if none of this really is a kernel issue, I defer it to the radeon folks to
comment further.

Please remove from regression list, I'll close the original bug.

David

unread,
Jan 25, 2010, 2:30:01 PM1/25/10
to
Américo Wang wrote:
> On Sun, Jan 24, 2010 at 11:04:33PM +0100, Rafael J. Wysocki wrote:
>
>> This message has been generated automatically as a part of a report
>> of recent regressions.
>>
>> The following bug entry is on the current list of known regressions
>> from 2.6.32. Please verify if it still should be listed and let me know
>> (either way).
>>
>>
>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14924
>> Subject : Weird hard hangs when rendering 'some' web-sites in Firefox
>> Submitter : David <da...@unsolicited.net>
>> Date : 2009-12-21 21:53 (35 days old)
>> References : http://marc.info/?l=linux-kernel&m=126143375823340&w=4
>>
>
> Hmm, you have CONFIG_DETECT_SOFTLOCKUP=y, I have no idea what happened,
> doing a bisect would be appreciated.
>
> Thanks.
>
>

I no longer have the offending hardware, but I think that the issue was
probably corrected by:

cafe6609d6dc0a6a278f9fdbb59ce4d761a35ddd -
drm/radeon/kms: Schedule host path read cache flush through the ring V2

as the offending ATI graphics was indeed R300.

Cheers
David

Rafael J. Wysocki

unread,
Jan 25, 2010, 4:00:02 PM1/25/10
to

OK, closing it right now.

Rafael

Rafael J. Wysocki

unread,
Jan 25, 2010, 4:10:02 PM1/25/10
to
On Monday 25 January 2010, Américo Wang wrote:
> On Sun, Jan 24, 2010 at 11:04:41PM +0100, Rafael J. Wysocki wrote:
> >This message has been generated automatically as a part of a report
> >of recent regressions.
> >
> >The following bug entry is on the current list of known regressions
> >from 2.6.32. Please verify if it still should be listed and let me know
> >(either way).
> >
> >
> >Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15137
> >Subject : NULL pointer dereference in vlan_skb_recv
> >Submitter : Bruno Prémont <bon...@linux-vserver.org>
> >Date : 2010-01-23 15:56 (2 days old)
> >References : http://marc.info/?l=linux-kernel&m=126426286507497&w=4
> >Handled-By : Eric Dumazet <eric.d...@gmail.com>
> >Patch : http://patchwork.kernel.org/patch/74999/
> > http://patchwork.kernel.org/patch/75002/
> >
>
> This one can be closed, patch from Eric is already applied by David Miller.

Is it in the Linus' tree already?

Rafael

Rafael J. Wysocki

unread,
Jan 25, 2010, 4:10:02 PM1/25/10
to
On Monday 25 January 2010, Dave Airlie wrote:
> On Mon, 2010-01-25 at 00:13 +0100, Rafael J. Wysocki wrote:
> > On Sunday 24 January 2010, Johan Hovold wrote:
> > > On Sun, Jan 24, 2010 at 11:04:36PM +0100, Rafael J. Wysocki wrote:
> > > > This message has been generated automatically as a part of a report
> > > > of recent regressions.
> > > >
> > > > The following bug entry is on the current list of known regressions
> > > > from 2.6.32. Please verify if it still should be listed and let me know
> > > > (either way).
> > > >
> > > >
> > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15038
> > > > Subject : drm/ksm: fbdev blanking regression
> > > > Submitter : Johan Hovold <jho...@gmail.com>
> > > > Date : 2010-01-06 17:00 (19 days old)
> > > > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=731b5a15a3b1474a41c2ca29b4c32b0f21bc852e
> > > > References : http://marc.info/?l=linux-kernel&m=126279726418748&w=4
> > > > Handled-By : James Simmons <jsim...@infradead.org>
> > >
> > > Issue remains in rc5.
> >
> > Thanks for the update.
> >
> > OK, we know what commit broke things, we don't seem to know how to fix it,
> > so perhaps it's time to revert that commit?
>
> Just sent revert of the broken bit to Linus.

Thanks!

Rafael

Rafael J. Wysocki

unread,
Jan 25, 2010, 4:10:02 PM1/25/10
to
On Monday 25 January 2010, David wrote:
> Américo Wang wrote:
> > On Sun, Jan 24, 2010 at 11:04:33PM +0100, Rafael J. Wysocki wrote:
> >
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >>
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.32. Please verify if it still should be listed and let me know
> >> (either way).
> >>
> >>
> >> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14924
> >> Subject : Weird hard hangs when rendering 'some' web-sites in Firefox
> >> Submitter : David <da...@unsolicited.net>
> >> Date : 2009-12-21 21:53 (35 days old)
> >> References : http://marc.info/?l=linux-kernel&m=126143375823340&w=4
> >>
> >
> > Hmm, you have CONFIG_DETECT_SOFTLOCKUP=y, I have no idea what happened,
> > doing a bisect would be appreciated.
> >
> > Thanks.
> >
> >
>
> I no longer have the offending hardware, but I think that the issue was
> probably corrected by:
>
> cafe6609d6dc0a6a278f9fdbb59ce4d761a35ddd -
> drm/radeon/kms: Schedule host path read cache flush through the ring V2
>
> as the offending ATI graphics was indeed R300.

Well, let's assume that's really the case. Closed.

Rafael

Rafael J. Wysocki

unread,
Jan 25, 2010, 4:10:03 PM1/25/10
to
On Monday 25 January 2010, Sid Boyce wrote:
> On 24/01/10 22:04, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14946
> > Subject : All kernels after 2.6.32-git10 show only 1 CPU
> > Submitter : Sid Boyce <sbo...@blueyonder.co.uk>
> > Date : 2009-12-23 16:55 (33 days old)
> > References : http://marc.info/?l=linux-kernel&m=126158734326801&w=4
> >
> >
> >
>
> Definitely fixed in 2.6.33-rc4, thanks.

Thanks, closed.

Rafael

David Miller

unread,
Jan 25, 2010, 4:40:03 PM1/25/10
to
From: "Rafael J. Wysocki" <r...@sisk.pl>
Date: Mon, 25 Jan 2010 22:07:52 +0100

> On Monday 25 January 2010, Américo Wang wrote:
>> On Sun, Jan 24, 2010 at 11:04:41PM +0100, Rafael J. Wysocki wrote:
>> >This message has been generated automatically as a part of a report
>> >of recent regressions.
>> >
>> >The following bug entry is on the current list of known regressions
>> >from 2.6.32. Please verify if it still should be listed and let me know
>> >(either way).
>> >
>> >
>> >Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15137
>> >Subject : NULL pointer dereference in vlan_skb_recv
>> >Submitter : Bruno Prémont <bon...@linux-vserver.org>
>> >Date : 2010-01-23 15:56 (2 days old)
>> >References : http://marc.info/?l=linux-kernel&m=126426286507497&w=4
>> >Handled-By : Eric Dumazet <eric.d...@gmail.com>
>> >Patch : http://patchwork.kernel.org/patch/74999/
>> > http://patchwork.kernel.org/patch/75002/
>> >
>>
>> This one can be closed, patch from Eric is already applied by David Miller.
>
> Is it in the Linus' tree already?

No, but it will be there soon, I'll push it to him today.
:-)

Rafael J. Wysocki

unread,
Jan 25, 2010, 5:00:02 PM1/25/10
to
On Monday 25 January 2010, David Miller wrote:
> From: "Rafael J. Wysocki" <r...@sisk.pl>
> Date: Mon, 25 Jan 2010 22:07:52 +0100
>
> > On Monday 25 January 2010, Américo Wang wrote:
> >> On Sun, Jan 24, 2010 at 11:04:41PM +0100, Rafael J. Wysocki wrote:
> >> >This message has been generated automatically as a part of a report
> >> >of recent regressions.
> >> >
> >> >The following bug entry is on the current list of known regressions
> >> >from 2.6.32. Please verify if it still should be listed and let me know
> >> >(either way).
> >> >
> >> >
> >> >Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15137
> >> >Subject : NULL pointer dereference in vlan_skb_recv
> >> >Submitter : Bruno Prémont <bon...@linux-vserver.org>
> >> >Date : 2010-01-23 15:56 (2 days old)
> >> >References : http://marc.info/?l=linux-kernel&m=126426286507497&w=4
> >> >Handled-By : Eric Dumazet <eric.d...@gmail.com>
> >> >Patch : http://patchwork.kernel.org/patch/74999/
> >> > http://patchwork.kernel.org/patch/75002/
> >> >
> >>
> >> This one can be closed, patch from Eric is already applied by David Miller.
> >
> > Is it in the Linus' tree already?
>
> No, but it will be there soon, I'll push it to him today.
> :-)

Thanks!

Américo Wang

unread,
Jan 25, 2010, 10:10:01 PM1/25/10
to
On Tue, Jan 26, 2010 at 3:26 AM, David <da...@unsolicited.net> wrote:
> Américo Wang wrote:
>> On Sun, Jan 24, 2010 at 11:04:33PM +0100, Rafael J. Wysocki wrote:
>>
>>> This message has been generated automatically as a part of a report
>>> of recent regressions.
>>>
>>> The following bug entry is on the current list of known regressions
>>> from 2.6.32. Please verify if it still should be listed and let me know
>>> (either way).
>>>
>>>
>>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14924
>>> Subject : Weird hard hangs when rendering 'some' web-sites in Firefox
>>> Submitter : David <da...@unsolicited.net>
>>> Date : 2009-12-21 21:53 (35 days old)
>>> References : http://marc.info/?l=linux-kernel&m=126143375823340&w=4
>>>
>>
>> Hmm, you have CONFIG_DETECT_SOFTLOCKUP=y, I have no idea what happened,
>> doing a bisect would be appreciated.
>>
>> Thanks.
>>
>>
>
> I no longer have the offending hardware, but I think that the issue was
> probably corrected by:
>
>    cafe6609d6dc0a6a278f9fdbb59ce4d761a35ddd -
> drm/radeon/kms: Schedule host path read cache flush through the ring V2
>
> as the offending ATI graphics was indeed R300.
>

Ok, thanks!

Jeff Garrett

unread,
Jan 26, 2010, 2:30:01 AM1/26/10
to
On Sun, Jan 24, 2010 at 11:04:38PM +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15124
> Subject : PCI host bridge windows ignored (works with pci=use_crs)
> Submitter : Jeff Garrett <je...@jgarrett.org>
> Date : 2010-01-13 5:37 (12 days old)
> References : http://marc.info/?l=linux-kernel&m=126336296600307&w=4
> Handled-By : Yinghai Lu <yin...@kernel.org>
> Bjorn Helgaas <bjorn....@hp.com>

This regression should still be listed. No patch to test yet.

-Jeff Garrett

Rafael J. Wysocki

unread,
Jan 26, 2010, 7:50:02 AM1/26/10
to
On Tuesday 26 January 2010, Jeff Garrett wrote:
> On Sun, Jan 24, 2010 at 11:04:38PM +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15124
> > Subject : PCI host bridge windows ignored (works with pci=use_crs)
> > Submitter : Jeff Garrett <je...@jgarrett.org>
> > Date : 2010-01-13 5:37 (12 days old)
> > References : http://marc.info/?l=linux-kernel&m=126336296600307&w=4
> > Handled-By : Yinghai Lu <yin...@kernel.org>
> > Bjorn Helgaas <bjorn....@hp.com>
>
> This regression should still be listed. No patch to test yet.

Thanks for the update.

IIRC, we already know how to fix this ...

Rafael

okias

unread,
Jan 26, 2010, 9:10:01 AM1/26/10
to
Latest git without HIGHMEM looks good. If no problems occur, I'll
try a kernel with HIGHMEM and see what changes...

2010/1/25, okias <d.o...@gmail.com>:

Bjorn Helgaas

unread,
Jan 26, 2010, 12:40:01 PM1/26/10
to
On Tuesday 26 January 2010 05:48:59 am Rafael J. Wysocki wrote:
> On Tuesday 26 January 2010, Jeff Garrett wrote:
> > On Sun, Jan 24, 2010 at 11:04:38PM +0100, Rafael J. Wysocki wrote:
> > > The following bug entry is on the current list of known regressions
> > > from 2.6.32. Please verify if it still should be listed and let me know
> > > (either way).
> > >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15124
> > > Subject : PCI host bridge windows ignored (works with pci=use_crs)
> > > Submitter : Jeff Garrett <je...@jgarrett.org>
> > > Date : 2010-01-13 5:37 (12 days old)
> > > References : http://marc.info/?l=linux-kernel&m=126336296600307&w=4
> > > Handled-By : Yinghai Lu <yin...@kernel.org>
> > > Bjorn Helgaas <bjorn....@hp.com>
> >
> > This regression should still be listed. No patch to test yet.
> ...

> IIRC, we already know how to fix this ...

As far as I know, we do NOT know how to fix this.

This regression occurred when we added intel_bus.c because it's not
yet smart enough to determine the correct host bridge apertures.
Here's what it thinks the bridge aperture is and the Radeon BAR:

IOH bus: 00 index 1 mmio: [e0000000, fdffffff]
pci 0000:04:00.0: reg 10: [mem 0xd0000000-0xdfffffff 64bit pref]

The IOH aperture is obviously not big enough to cover the Radeon BAR.
But the host bridge _CRS tells us this:

pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xdfffffff]
pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfed8ffff]

which IS big enough, and we know the bridge is in fact forwarding the
[mem 0xd0000000-0xdfffffff 64bit pref] region, because the Radeon works
when Jeff boots with "pci=use_crs".
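
To make the mismatch concrete, here is a minimal sketch in plain userspace C (not kernel code) that checks whether the Radeon BAR fits inside each of the windows quoted above. The `struct range` type and the `range_contains()` helper are hypothetical names introduced only for illustration; the addresses are the ones from the log lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range; a hypothetical helper type, not a kernel struct. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* True if 'inner' lies entirely within 'outer'. */
static bool range_contains(struct range outer, struct range inner)
{
	assert(outer.start <= outer.end && inner.start <= inner.end);
	return outer.start <= inner.start && inner.end <= outer.end;
}

/* Addresses copied from the log lines quoted above. */
static const struct range ioh_mmio   = { 0xe0000000, 0xfdffffff }; /* intel_bus.c view */
static const struct range crs_low    = { 0xc0000000, 0xdfffffff }; /* first _CRS window */
static const struct range radeon_bar = { 0xd0000000, 0xdfffffff }; /* reg 10 */
```

With these values `range_contains(ioh_mmio, radeon_bar)` is false and `range_contains(crs_low, radeon_bar)` is true, which is the arithmetic behind "not big enough" versus "IS big enough".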

I'm quite concerned about this for .33 because I don't think Jeff's
configuration (Dell desktop with Intel x58 and large graphics device)
is unusual.

The benefit of intel_bus.c is on machines with multiple IOHs, where we
need to figure out which address ranges go to which IOHs so we can
program downstream devices correctly. But even there, _CRS should give
us the information we need, so "pci=use_crs" should make these machines
work.

I think we should remove intel_bus.c before .33. It's breaking boxes
and we don't know how to fix it. Even if we do find out how to fix it,
I think we should move toward using _CRS instead, because that's what
Windows uses and it's an easy way for the firmware to tell us about
platform quirks.

Bjorn

Rafael J. Wysocki

Jan 26, 2010, 1:10:02 PM

Perhaps it would be sufficient to make pci=use_crs the default and leave the
option to use intel_bus.c for whoever needs that?

Rafael

Linus Torvalds

Jan 26, 2010, 1:20:02 PM

On Tue, 26 Jan 2010, Bjorn Helgaas wrote:
>
> which IS big enough, and we know the bridge is in fact forwarding the
> [mem 0xd0000000-0xdfffffff 64bit pref] region, because the Radeon works
> when Jeff boots with "pci=use_crs".

I bet it's a subtractive decode thing. Sure, it could be just another
undocumented range register (does anybody have the datasheet for that
thing?) but Intel tends to often have subtractive decode.

That system in question has three PCI express root ports, but two of them
have IO and memory disabled according to the lspci info. So maybe it's as
simple as that "I/O Hub PCI Express Root Port 7" just catching anything
that nobody else does, and the single IOH host chip doing the same?
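
For readers unfamiliar with the term, the following is a rough userspace C model of the idea, assuming the hypothesis above is right (the thread does not confirm it). Positively decoded windows claim their own addresses, and anything unclaimed falls through to a subtractive decoder. The `struct window` type, `route()` function and `SUBTRACTIVE_PORT` marker are hypothetical names, not kernel or hardware interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One positively decoded window owned by a port; hypothetical model type. */
struct window {
	uint64_t start;
	uint64_t end;	/* inclusive */
	int port;
};

#define SUBTRACTIVE_PORT (-1)	/* stands for "whoever catches the leftovers" */

/*
 * Return the port that positively claims 'addr', or SUBTRACTIVE_PORT when
 * no window matches and the access falls through to the subtractive decoder.
 */
static int route(const struct window *w, size_t n, uint64_t addr)
{
	size_t i;

	assert(w != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (addr >= w[i].start && addr <= w[i].end)
			return w[i].port;
	return SUBTRACTIVE_PORT;
}
```

In this model, an access to 0xd0000000 with only the [e0000000, fdffffff] window registered is not claimed by any positive window, so it would reach the subtractive decoder even though no range register advertises it.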

> I think we should remove intel_bus.c before .33. It's breaking boxes
> and we don't know how to fix it. Even if we do find out how to fix it,
> I think we should move toward using _CRS instead, because that's what
> Windows uses and it's an easy way for the firmware to tell us about
> platform quirks.

I suspect that for 33 it is indeed best to just revert. But somebody is
bound to have information on how the actual hardware works. Yinghai?

Linus

Jesse Barnes

Jan 26, 2010, 1:20:02 PM
On Tue, 26 Jan 2010 19:02:13 +0100
"Rafael J. Wysocki" <r...@sisk.pl> wrote:
> > I'm quite concerned about this for .33 because I don't think Jeff's
> > configuration (Dell desktop with Intel x58 and large graphics device)
> > is unusual.
> >
> > The benefit of intel_bus.c is on machines with multiple IOHs, where we
> > need to figure out which address ranges go to which IOHs so we can
> > program downstream devices correctly. But even there, _CRS should give
> > us the information we need, so "pci=use_crs" should make these machines
> > work.
> >
> > I think we should remove intel_bus.c before .33. It's breaking boxes
> > and we don't know how to fix it. Even if we do find out how to fix it,
> > I think we should move toward using _CRS instead, because that's what
> > Windows uses and it's an easy way for the firmware to tell us about
> > platform quirks.
>
> Perhaps it would be sufficient to make pci=use_crs the default and leave the
> option to use intel_bus.c for whoever needs that?

We can't make use_crs the default w/o some more _CRS handling fixes
(some firmwares have large lists we need to handle).

We can disable intel_bus.c though. Yinghai, I'm inclined against the
intel_bus.c approach at this point. It seems unlikely we'll ever keep
it up to date with new bridges, since its approach differs so much from
how things are done in the Windows world, where the firmware provides
a list of resources. We'll always be playing catch up, and will
probably be behind the firmware most of the time since the docs with
the necessary info likely won't be public most of the time.

For 2.6.33 I'd like a minimal fix though, can you disable it for all
but the multi-IOH case perhaps?

--
Jesse Barnes, Intel Open Source Technology Center

Linus Torvalds

Jan 26, 2010, 1:30:01 PM

On Tue, 26 Jan 2010, Rafael J. Wysocki wrote:
>
> Perhaps it would be sufficient to make pci=use_crs the default and leave the
> option to use intel_bus.c for whoever needs that?

Well, 'use_crs' broke other machines. See:

http://lkml.org/lkml/2009/6/23/715

but maybe that is all fixed..

Linus

Yinghai Lu

Jan 26, 2010, 1:30:02 PM

OK, we have one patch that enables that only in the multi-IOH case.

YH

Yinghai Lu

Jan 26, 2010, 1:30:02 PM
On 01/26/2010 10:16 AM, Linus Torvalds wrote:
>
>
> On Tue, 26 Jan 2010, Bjorn Helgaas wrote:
>>
>> which IS big enough, and we know the bridge is in fact forwarding the
>> [mem 0xd0000000-0xdfffffff 64bit pref] region, because the Radeon works
>> when Jeff boots with "pci=use_crs".
>
> I bet it's a subtractive decode thing. Sure, it could be just another
> undocumented range register (does anybody have the datasheet for that
> thing?) but Intel tends to often have subtractive decode.
>
> That system in question has three PCI express root ports, but two of them
> have IO and memory disabled according to the lspci info. So maybe it's as
> simple as that "I/O Hub PCI Express Root Port 7" just catching anything
> that nobody else does, and the single IOH host chip doing the same?
>
>> I think we should remove intel_bus.c before .33. It's breaking boxes
>> and we don't know how to fix it. Even if we do find out how to fix it,
>> I think we should move toward using _CRS instead, because that's what
>> Windows uses and it's an easy way for the firmware to tell us about
>> platform quirks.
>
> I suspect that for 33 it is indeed best to just revert. But somebody is
> bound to have information on how the actual hardware works. Yinghai?

I have asked Intel whether there is any bit that enables the routing.
There is no information about it in their documentation.

Yinghai

Jesse Barnes

Jan 26, 2010, 1:40:02 PM
On Tue, 26 Jan 2010 10:21:29 -0800
Yinghai Lu <yin...@kernel.org> wrote:

> On 01/26/2010 10:16 AM, Linus Torvalds wrote:
> >
> >
> > On Tue, 26 Jan 2010, Bjorn Helgaas wrote:
> >>
> >> which IS big enough, and we know the bridge is in fact forwarding the
> >> [mem 0xd0000000-0xdfffffff 64bit pref] region, because the Radeon works
> >> when Jeff boots with "pci=use_crs".
> >
> > I bet it's a subtractive decode thing. Sure, it could be just another
> > undocumented range register (does anybody have the datasheet for that
> > thing?) but Intel tends to often have subtractive decode.
> >
> > That system in question has three PCI express root ports, but two of them
> > have IO and memory disabled according to the lspci info. So maybe it's as
> > simple as that "I/O Hub PCI Express Root Port 7" just catching anything
> > that nobody else does, and the single IOH host chip doing the same?
> >
> >> I think we should remove intel_bus.c before .33. It's breaking boxes
> >> and we don't know how to fix it. Even if we do find out how to fix it,
> >> I think we should move toward using _CRS instead, because that's what
> >> Windows uses and it's an easy way for the firmware to tell us about
> >> platform quirks.
> >
> > I suspect that for 33 it is indeed best to just revert. But somebody is
> > bound to have information on how the actual hardware works. Yinghai?
>
> I have asked Intel whether there is any bit that enables the routing.
> There is no information about it in their documentation.

I could probably dig something up in our confidential database, but this
is the main problem with intel_bus.c. It'll always be behind what _CRS
provides. Sure _CRS may be wrong sometimes, but it'll always work well
enough to bring Windows up, so we ought not to ignore it.

The underlying problems with our _CRS support still aren't fixed
though, so switching that on for 2.6.33 isn't an option.

--
Jesse Barnes, Intel Open Source Technology Center

Yinghai Lu

Jan 26, 2010, 6:00:02 PM
On 01/26/2010 10:17 AM, Jesse Barnes wrote:

>
> For 2.6.33 I'd like a minimal fix though, can you disable it for all
> but the multi-IOH case perhaps?
>

please check,

[PATCH] x86/pci: don't use ioh resource if only have one ioh

Some systems may use resources outside the IOH resources when only one IOH is present.

It could be that the BIOS has the wrong IOH resources and has not enabled them.

Signed-off-by: Yinghai Lu <yin...@kernel.org>

---
arch/x86/pci/intel_bus.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 86 insertions(+)

Index: linux-2.6/arch/x86/pci/intel_bus.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/intel_bus.c
+++ linux-2.6/arch/x86/pci/intel_bus.c
@@ -7,9 +7,11 @@
 #include <linux/pci.h>
 #include <linux/init.h>
 #include <asm/pci_x86.h>
+#include <asm/pci-direct.h>

 #include "bus_numa.h"

+static int nr_ioh;
 static inline void print_ioh_resources(struct pci_root_info *info)
 {
 	int res_num;
@@ -49,6 +51,9 @@ static void __devinit pci_root_bus_res(s
 	u64 mmioh_base, mmioh_end;
 	int bus_base, bus_end;

+	if (nr_ioh < 2)
+		return;
+
 	/* some sys doesn't get mmconf enabled */
 	if (dev->cfg_size < 0x120)
 		return;
@@ -92,3 +97,84 @@ static void __devinit pci_root_bus_res(s

 /* intel IOH */
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, pci_root_bus_res);
+
+static void __init count_ioh(int num, int slot, int func)
+{
+	nr_ioh++;
+}
+
+struct pci_check_probe {
+	u32 vendor;
+	u32 device;
+	void (*f)(int num, int slot, int func);
+};
+
+static struct pci_check_probe early_qrk[] __initdata = {
+	{ PCI_VENDOR_ID_INTEL, 0x342e, count_ioh },
+	{}
+};
+
+static void __init early_check_pci_dev(int num, int slot, int func)
+{
+	u16 vendor;
+	u16 device;
+	int i;
+
+	vendor = read_pci_config_16(num, slot, func, PCI_VENDOR_ID);
+	device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
+
+	for (i = 0; early_qrk[i].f != NULL; i++) {
+		if (((early_qrk[i].vendor == PCI_ANY_ID) ||
+			(early_qrk[i].vendor == vendor)) &&
+			((early_qrk[i].device == PCI_ANY_ID) ||
+			(early_qrk[i].device == device)))
+			early_qrk[i].f(num, slot, func);
+	}
+}
+
+static void __init early_check_pci_devs(void)
+{
+	unsigned bus, slot, func;
+
+	if (!early_pci_allowed())
+		return;
+
+	for (bus = 0; bus < 256; bus++) {
+		for (slot = 0; slot < 32; slot++) {
+			for (func = 0; func < 8; func++) {
+				u32 class;
+				u8 type;
+
+				class = read_pci_config(bus, slot, func,
+							PCI_CLASS_REVISION);
+				if (class == 0xffffffff)
+					continue;
+
+				early_check_pci_dev(bus, slot, func);
+
+				if (func == 0) {
+					type = read_pci_config_byte(bus, slot,
+								    func,
+								    PCI_HEADER_TYPE);
+					if (!(type & 0x80))
+						break;
+				}
+			}
+		}
+	}
+}
+
+static int __init intel_postcore_init(void)
+{
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+		return 0;
+
+	early_check_pci_devs();
+
+	if (nr_ioh)
+		printk(KERN_DEBUG "pci: found %d IOH\n", nr_ioh);
+
+	return 0;
+}
+postcore_initcall(intel_postcore_init);
+

Bjorn Helgaas

Jan 27, 2010, 11:50:01 AM
On Tuesday 26 January 2010 03:57:31 pm Yinghai Lu wrote:
> [PATCH] x86/pci: don't use ioh resource if only have one ioh
>
> Some systems may use resources outside the IOH resources when only one IOH is present.
>
> It could be that the BIOS has the wrong IOH resources and has not enabled them.

The subtractive decode theory makes sense and would explain what's
happening, but I don't like this patch.

If we assume that this really is a subtractive decode issue, this
patch approaches it the wrong way. We need to know whether a
particular host bridge is configured for subtractive decode. This
patch tests whether we have more than one host bridge, which is quite
a different question.

Imagine these system configurations:

1) a single host bridge with subtractive decode
2) a single host bridge with only positive decode
3) multiple host bridges with subtractive decode enabled on one
4) multiple host bridges with only positive decode

This patch will break if we encounter configs 2 or 3. In config 2,
this patch assumes the bridge performs subtractive decode, so we
think the bridge forwards more address space than it actually does.
If we try to use that address space, the device will never see the
accesses. In config 3, this patch assumes there's no subtractive
decode, so we would see Jeff's problem all over again.

For configs 3 and 4, there might be a single host bridge in domain 0,
with the others in different domains. This patch would find only one
host bridge (the one in domain 0), so we would wrongly assume that ALL
the host bridges use subtractive decode, which is obviously a disaster.
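
The four cases can be tabulated in a few lines of userspace C. This is only a sketch of the argument above, assuming the patch's `nr_ioh < 2` test amounts to "treat the bridge as subtractive". The struct, the table, and the value 2 standing for "multiple" are hypothetical illustration choices.

```c
#include <assert.h>
#include <stdbool.h>

/* The four hypothetical configurations listed above. */
struct config {
	int nr_host_bridges;
	bool subtractive;	/* does some bridge really use subtractive decode? */
};

static const struct config configs[4] = {
	{ 1, true  },	/* 1) single bridge, subtractive decode */
	{ 1, false },	/* 2) single bridge, positive decode only */
	{ 2, true  },	/* 3) multiple bridges, subtractive on one */
	{ 2, false },	/* 4) multiple bridges, positive decode only */
};

/* What the patch effectively assumes: fewer than two IOHs means subtractive. */
static bool patch_assumes_subtractive(int nr_ioh)
{
	assert(nr_ioh >= 0);
	return nr_ioh < 2;
}

/* True when the patch's assumption disagrees with the actual configuration. */
static bool patch_is_wrong(const struct config *c)
{
	return patch_assumes_subtractive(c->nr_host_bridges) != c->subtractive;
}
```

Evaluating `patch_is_wrong()` over the table flags exactly configs 2 and 3, because the bridge count and the decode mode are independent properties.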

Bjorn

Jesse Barnes

Jan 27, 2010, 12:00:06 PM
On Wed, 27 Jan 2010 09:45:15 -0700
Bjorn Helgaas <bjorn....@hp.com> wrote:

> On Tuesday 26 January 2010 03:57:31 pm Yinghai Lu wrote:
> > [PATCH] x86/pci: don't use ioh resource if only have one ioh
> >
> > Some systems may use resources outside the IOH resources when only one IOH is present.
> >
> > It could be that the BIOS has the wrong IOH resources and has not enabled them.
>
> The subtractive decode theory makes sense and would explain what's
> happening, but I don't like this patch.
>
> If we assume that this really is a subtractive decode issue, this
> patch approaches it the wrong way. We need to know whether a
> particular host bridge is configured for subtractive decode. This
> patch tests whether we have more than one host bridge, which is quite
> a different question.
>
> Imagine these system configurations:
>
> 1) a single host bridge with subtractive decode
> 2) a single host bridge with only positive decode
> 3) multiple host bridges with subtractive decode enabled on one
> 4) multiple host bridges with only positive decode
>
> This patch will break if we encounter configs 2 or 3. In config 2,
> this patch assumes the bridge performs subtractive decode, so we
> think the bridge forwards more address space than it actually does.
> If we try to use that address space, the device will never see the
> accesses. In config 3, this patch assumes there's no subtractive
> decode, so we would see Jeff's problem all over again.

Right, but OTOH:
- multiple IOH has already been tested with the intel_bus.c code
- we want to move to using _CRS data in these cases instead

So do you have any objection to applying this patch for 2.6.33 and then
moving away from intel_bus.c in .34 (assuming we can get _CRS working
well on the same machines where intel_bus.c was needed)?

--
Jesse Barnes, Intel Open Source Technology Center

Jesse Barnes

Jan 27, 2010, 1:00:02 PM
On Sun, 24 Jan 2010 23:04:39 +0100 (CET)
"Rafael J. Wysocki" <r...@sisk.pl> wrote:

> This message has been generated automatically as a part of a report
> of recent regressions.
>

> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15129
> Subject : [drm:i915_gem_execbuffer] *ERROR* i915_gem_do_execbuffer returns -512
> Submitter : Miles Lane <miles...@gmail.com>
> Date : 2010-01-14 23:18 (11 days old)
> References : http://lkml.org/lkml/2010/1/14/570
> Handled-By : Chris Wilson <ch...@chris-wilson.co.uk>
>

I think this message has been rightly killed. Getting -ERESTARTSYS from
this function is perfectly normal, so we shouldn't bother printing a
message about it.
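
For context, the -512 in the subject line is -ERESTARTSYS, a kernel-internal code meaning the call was interrupted by a signal and should be restarted. A minimal sketch of the filtering idea in userspace C follows. `worth_logging()` is a hypothetical helper and not the actual i915 code, and the constant is redefined locally because userspace <errno.h> does not provide it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Kernel-internal restart code from include/linux/errno.h. It is redefined
 * here because userspace <errno.h> does not expose it.
 */
#define ERESTARTSYS 512

/* Hypothetical helper: should a return code be reported as an error? */
static bool worth_logging(int ret)
{
	assert(ret <= 0);	/* kernel convention: 0 or a negative errno */
	return ret != 0 && ret != -ERESTARTSYS;
}
```

Under this rule a return value of -512 is silently passed back to the caller, while a genuine failure such as -5 (-EIO) would still be reported.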

Jesse Barnes

Jan 27, 2010, 1:00:02 PM
On Sun, 24 Jan 2010 23:04:37 +0100 (CET)
"Rafael J. Wysocki" <r...@sisk.pl> wrote:

> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32. Please verify if it still should be listed and let me know
> (either way).
>
>

> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15043
> Subject : Display goes off with i915.powersave=1
> Submitter : Soeren Sonnenburg <so...@debian.org>
> Date : 2010-01-10 20:09 (15 days old)
> References : http://marc.info/?l=linux-kernel&m=126315457519505&w=4

If this isn't fixed yet, I hope David's patch fixes it. See "[Bug
#14897] i915: Commit 0e442c60 causes flickering".

Bjorn Helgaas

Jan 27, 2010, 3:50:04 PM

Without intel_bus.c, we essentially assume config 1 all the time.
If we keep intel_bus.c and this patch for .33, things should work
for configs 1 and 4. Adding support for config 4 is good.

The bad part is that for config 4, intel_bus.c covers up any defects
in the _CRS or the Linux code that interprets it. The reason Yinghai
added intel_bus.c in the first place was to work around a defect in
this area[1]. Keeping it will make it harder to fix the underlying
issue that keeps us from turning on _CRS for that box.

Bjorn

[1] http://lkml.org/lkml/2009/10/6/371

Jesse Barnes

Jan 27, 2010, 4:00:02 PM
On Wed, 27 Jan 2010 12:50:12 -0800 (PST)
Linus Torvalds <torv...@linux-foundation.org> wrote:

>
>
> On Wed, 27 Jan 2010, Bjorn Helgaas wrote:
> >
> > Without intel_bus.c, we essentially assume config 1 all the time.
> > If we keep intel_bus.c and this patch for .33, things should work
> > for configs 1 and 4. Adding support for config 4 is good.
>

> Quite frankly, is there any major downside to just disabling/removing
> intel_bus.c for 2.6.33? If we're not planning on having it in the long run
> anyway - or even if we are, but we can't be really happy about the state
> of it as it would be in 2.6.33, not using it at all seems to be the
> smaller headache.
>
> The machines that it helps are also the machines where you can fix things
> up with 'use_crs', no? And they are pretty rare, and they didn't use to
> work without that use_crs in 2.6.32 either, so it's not even a regression.
>
> Am I missing something?

No that's the plan. intel_bus.c was a good effort, but it's just too
different from what Windows does, and it'll always be behind. We'll
disable it for 2.6.33 and try again to move to _CRS in 2.6.34 (but
fixing the problem with large numbers of _CRS resources this time).

--
Jesse Barnes, Intel Open Source Technology Center

Linus Torvalds

Jan 27, 2010, 4:00:02 PM

On Wed, 27 Jan 2010, Bjorn Helgaas wrote:
>

> Without intel_bus.c, we essentially assume config 1 all the time.
> If we keep intel_bus.c and this patch for .33, things should work
> for configs 1 and 4. Adding support for config 4 is good.

Quite frankly, is there any major downside to just disabling/removing
intel_bus.c for 2.6.33? If we're not planning on having it in the long run
anyway - or even if we are, but we can't be really happy about the state
of it as it would be in 2.6.33, not using it at all seems to be the
smaller headache.

The machines that it helps are also the machines where you can fix things
up with 'use_crs', no? And they are pretty rare, and they didn't use to
work without that use_crs in 2.6.32 either, so it's not even a regression.

Am I missing something?

Linus

Jesse Barnes

Jan 27, 2010, 4:10:01 PM
On Wed, 27 Jan 2010 12:59:05 -0800
Jesse Barnes <jba...@virtuousgeek.org> wrote:

> On Wed, 27 Jan 2010 12:50:12 -0800 (PST)
> Linus Torvalds <torv...@linux-foundation.org> wrote:
>
> >
> >
> > On Wed, 27 Jan 2010, Bjorn Helgaas wrote:
> > >
> > > Without intel_bus.c, we essentially assume config 1 all the time.
> > > If we keep intel_bus.c and this patch for .33, things should work
> > > for configs 1 and 4. Adding support for config 4 is good.
> >
> > Quite frankly, is there any major downside to just disabling/removing
> > intel_bus.c for 2.6.33? If we're not planning on having it in the long run
> > anyway - or even if we are, but we can't be really happy about the state
> > of it as it would be in 2.6.33, not using it at all seems to be the
> > smaller headache.
> >
> > The machines that it helps are also the machines where you can fix things
> > > up with 'use_crs', no? And they are pretty rare, and they didn't use to
> > > work without that use_crs in 2.6.32 either, so it's not even a regression.
> >
> > Am I missing something?
>
> No that's the plan. intel_bus.c was a good effort, but it's just too
> different from what Windows does, and it'll always be behind. We'll
> disable it for 2.6.33 and try again to move to _CRS in 2.6.34 (but
> fixing the problem with large numbers of _CRS resources this time).

Should say "disable it for 2.6.33 for all but multi-IOH configs", which
seem to be fairly rare anyway, and were what intel_bus.c was designed
to accommodate. On the one machine that motivated it, use_crs was
broken (though it likely isn't now), so it seems the safest route.

Bjorn Helgaas

Jan 27, 2010, 4:10:02 PM
On Wednesday 27 January 2010 01:50:12 pm Linus Torvalds wrote:
>
> On Wed, 27 Jan 2010, Bjorn Helgaas wrote:
> >
> > Without intel_bus.c, we essentially assume config 1 all the time.
> > If we keep intel_bus.c and this patch for .33, things should work
> > for configs 1 and 4. Adding support for config 4 is good.
>
> Quite frankly, is there any major downside to just disabling/removing
> intel_bus.c for 2.6.33? If we're not planning on having it in the long run
> anyway - or even if we are, but we can't be really happy about the state
> of it as it would be in 2.6.33, not using it at all seems to be the
> smaller headache.
>
> The machines that it helps are also the machines where you can fix things
> up with 'use_crs', no? And they are pretty rare, and they didn't use to
> work without that use_crs in 2.6.32 either, so it's not even a regression.
>
> Am I missing something?

Only that when we added intel_bus.c, Yinghai reported that the reason
was because a machine had a broken _CRS, so "pci=use_crs" wouldn't help.

At the time, Windows hadn't been brought up on that box. My
speculation is that by now, they've done that bringup and probably
fixed the _CRS issue, so it might work now.

If that's the case, we could drop intel_bus.c from .33 and just use
"pci=use_crs" on those boxes until we can figure out how to turn it
on automatically.

Bjorn

Rafael J. Wysocki

Jan 27, 2010, 4:10:02 PM
On Wednesday 27 January 2010, Jesse Barnes wrote:
> On Sun, 24 Jan 2010 23:04:39 +0100 (CET)
> "Rafael J. Wysocki" <r...@sisk.pl> wrote:
>
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15129
> > Subject : [drm:i915_gem_execbuffer] *ERROR* i915_gem_do_execbuffer returns -512
> > Submitter : Miles Lane <miles...@gmail.com>
> > Date : 2010-01-14 23:18 (11 days old)
> > References : http://lkml.org/lkml/2010/1/14/570
> > Handled-By : Chris Wilson <ch...@chris-wilson.co.uk>
> >
>
> I think this message has been rightly killed. Getting -ERESTARTSYS from
> this function is perfectly normal, so we shouldn't bother printing a
> message about it.

OK, closing this one.

Rafael

Yinghai Lu

Jan 27, 2010, 6:40:02 PM
On 01/27/2010 01:03 PM, Bjorn Helgaas wrote:
> On Wednesday 27 January 2010 01:50:12 pm Linus Torvalds wrote:
>>
>> On Wed, 27 Jan 2010, Bjorn Helgaas wrote:
>>>
>>> Without intel_bus.c, we essentially assume config 1 all the time.
>>> If we keep intel_bus.c and this patch for .33, things should work
>>> for configs 1 and 4. Adding support for config 4 is good.
>>
>> Quite frankly, is there any major downside to just disabling/removing
>> intel_bus.c for 2.6.33? If we're not planning on having it in the long run
>> anyway - or even if we are, but we can't be really happy about the state
>> of it as it would be in 2.6.33, not using it at all seems to be the
>> smaller headache.
>>
>> The machines that it helps are also the machines where you can fix things
>> up with 'use_crs', no? And they are pretty rare, and they didn't use to
>> work without that use_crs in 2.6.32 either, so it's not even a regression.
>>
>> Am I missing something?
>
> Only that when we added intel_bus.c, Yinghai reported that the reason
> was because a machine had a broken _CRS, so "pci=use_crs" wouldn't help.
>
> At the time, Windows hadn't been brought up on that box. My
> speculation is that by now, they've done that bringup and probably
> fixed the _CRS issue, so it might work now.
>
> If that's the case, we could drop intel_bus.c from .33 and just use
> "pci=use_crs" on those boxes until we can figure out how to turn it
> on automatically.

The BIOS has already fixed that problem, but:
1. How do we turn on pci=use_crs for that box automatically?
   What about our other kinds of boxes?
2. What about when ACPI is disabled?

Let's apply that patch first, and wait for Intel to give us information about which bit is used to enable the routing setup.

YH

Jesse Barnes

Jan 27, 2010, 8:40:02 PM
On Tue, 26 Jan 2010 14:57:31 -0800
Yinghai Lu <yin...@kernel.org> wrote:

> On 01/26/2010 10:17 AM, Jesse Barnes wrote:
>
> >
> > For 2.6.33 I'd like a minimal fix though, can you disable it for all
> > but the multi-IOH case perhaps?
> >
> please check,
>
> [PATCH] x86/pci: don't use ioh resource if only have one ioh
>
> Some systems may use resources outside the IOH resources when only one IOH is present.
>
> It could be that the BIOS has the wrong IOH resources and has not enabled them.
>
> Signed-off-by: Yinghai Lu <yin...@kernel.org>

I applied this one to my for-linus branch. Jeff can you confirm it
works for you? I'd like to push it to Linus tomorrow.

Thanks,


--
Jesse Barnes, Intel Open Source Technology Center

Yinghai Lu

Jan 27, 2010, 8:40:02 PM

I will try to produce a patch to handle subtractive decode for the legacy IOH, i.e. the one with ESI.

The structure could be something like amd_bus.c. It needs to run early, but after pci_arch_init so that mmconf is available.

YH

Linus Torvalds

Jan 27, 2010, 9:00:02 PM

On Tue, 26 Jan 2010, Yinghai Lu wrote:
>
> [PATCH] x86/pci: don't use ioh resource if only have one ioh

Please, no.

This patch is too ugly to live.

And it's totally unacceptable to probe every single possible PCI device
for something like this.
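
To put a rough number on that cost, here is a userspace C sketch that mirrors the loop shape of early_check_pci_devs() from the patch and counts config reads when nothing is present. `mock_read_class()` is a hypothetical stand-in for read_pci_config(), and an all-empty system is an assumed worst case for the class/revision read, not a measurement.

```c
#include <assert.h>
#include <stdint.h>

static unsigned long nr_reads;	/* config reads issued by the scan */

/*
 * Hypothetical stand-in for read_pci_config(): pretends no device is
 * present anywhere, so every read returns all ones.
 */
static uint32_t mock_read_class(unsigned bus, unsigned slot, unsigned func)
{
	(void)bus;
	(void)slot;
	(void)func;
	nr_reads++;
	return 0xffffffff;
}

/* Same loop shape as early_check_pci_devs() in the patch. */
static unsigned long scan_empty_system(void)
{
	unsigned bus, slot, func;

	nr_reads = 0;
	for (bus = 0; bus < 256; bus++)
		for (slot = 0; slot < 32; slot++)
			for (func = 0; func < 8; func++) {
				if (mock_read_class(bus, slot, func) == 0xffffffff)
					continue;	/* absent: try the next function */
				/* device handling and the multi-function check would go here */
			}

	/* Upper bound: 256 buses x 32 slots x 8 functions. */
	assert(nr_reads <= 256UL * 32 * 8);
	return nr_reads;
}
```

Because an absent function only triggers `continue`, every empty slot still costs eight reads, so the scan issues 256 * 32 * 8 = 65536 reads on an empty system just to count devices with one specific ID.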

If we don't know enough about the hardware workings of those Intel bridges
to know when they are active and how they decode things, then please let's
just disable intel_bus.c entirely.

There's no excuse for hacky tests like this.

Linus

Jesse Barnes

Jan 27, 2010, 10:30:02 PM
On Wed, 27 Jan 2010 17:50:17 -0800 (PST)
Linus Torvalds <torv...@linux-foundation.org> wrote:

>
>
> On Tue, 26 Jan 2010, Yinghai Lu wrote:
> >
> > [PATCH] x86/pci: don't use ioh resource if only have one ioh
>
> Please, no.
>
> This patch is too ugly to live.
>
> And it's totally unacceptable to probe every single possible PCI device
> for something like this.
>
> If we don't know enough about the hardware workings of those Intel bridges
> to know when they are active and how they decode things, then please let's
> just disable intel_bus.c entirely.
>
> There's no excuse for hacky tests like this.

Ok, we'll just kill it entirely then. I'll send a patch tomorrow
unless Yinghai beats me to it.

--
Jesse Barnes, Intel Open Source Technology Center

Jeff Garrett

Jan 27, 2010, 11:10:02 PM
On Wed, Jan 27, 2010 at 07:24:09PM -0800, Jesse Barnes wrote:
> On Wed, 27 Jan 2010 17:50:17 -0800 (PST)
> Linus Torvalds <torv...@linux-foundation.org> wrote:
> > On Tue, 26 Jan 2010, Yinghai Lu wrote:
> > >
> > > [PATCH] x86/pci: don't use ioh resource if only have one ioh
> >
> > Please, no.
> >
> > This patch is too ugly to live.
> >
> > And it's totally unacceptable to probe every single possible PCI device
> > for something like this.
> >
> > If we don't know enough about the hardware workings of those Intel bridges
> > to know when they are active and how they decode things, then please let's
> > just disable intel_bus.c entirely.
> >
> > There's no excuse for hacky tests like this.
>
> Ok, we'll just kill it entirely then. I'll send a patch tomorrow
> unless Yinghai beats me to it.

What about something like this (works for me, without pci=use_crs)?

---
Remove intel_bus.c Intel-specific PCI/IOH logic

Signed-off-by: Jeff Garrett <je...@jgarrett.org>
---
arch/x86/pci/Makefile | 2 +-
arch/x86/pci/intel_bus.c | 94 ----------------------------------------------
2 files changed, 1 insertions(+), 95 deletions(-)

diff --git a/arch/x86/pci/Makefile b/arch/x86/pci/Makefile
index 564b008..39fba37 100644
--- a/arch/x86/pci/Makefile
+++ b/arch/x86/pci/Makefile
@@ -15,7 +15,7 @@ obj-$(CONFIG_X86_NUMAQ) += numaq_32.o

obj-y += common.o early.o
obj-y += amd_bus.o
-obj-$(CONFIG_X86_64) += bus_numa.o intel_bus.o
+obj-$(CONFIG_X86_64) += bus_numa.o

ifeq ($(CONFIG_PCI_DEBUG),y)
EXTRA_CFLAGS += -DDEBUG
diff --git a/arch/x86/pci/intel_bus.c b/arch/x86/pci/intel_bus.c
deleted file mode 100644
index f81a2fa..0000000
--- a/arch/x86/pci/intel_bus.c
+++ /dev/null
@@ -1,94 +0,0 @@
-/*
- * to read io range from IOH pci conf, need to do it after mmconfig is there
- */
-
-#include <linux/delay.h>
-#include <linux/dmi.h>
-#include <linux/pci.h>
-#include <linux/init.h>
-#include <asm/pci_x86.h>
-
-#include "bus_numa.h"
-
-static inline void print_ioh_resources(struct pci_root_info *info)
-{
-	int res_num;
-	int busnum;
-	int i;
-
-	printk(KERN_DEBUG "IOH bus: [%02x, %02x]\n",
-			info->bus_min, info->bus_max);
-	res_num = info->res_num;
-	busnum = info->bus_min;
-	for (i = 0; i < res_num; i++) {
-		struct resource *res;
-
-		res = &info->res[i];
-		printk(KERN_DEBUG "IOH bus: %02x index %x %s: [%llx, %llx]\n",
-			busnum, i,
-			(res->flags & IORESOURCE_IO) ? "io port" :
-							"mmio",
-			res->start, res->end);
-	}
-}
-
-#define IOH_LIO			0x108
-#define IOH_LMMIOL		0x10c
-#define IOH_LMMIOH		0x110
-#define IOH_LMMIOH_BASEU	0x114
-#define IOH_LMMIOH_LIMITU	0x118
-#define IOH_LCFGBUS		0x11c
-
-static void __devinit pci_root_bus_res(struct pci_dev *dev)
-{
-	u16 word;
-	u32 dword;
-	struct pci_root_info *info;
-	u16 io_base, io_end;
-	u32 mmiol_base, mmiol_end;
-	u64 mmioh_base, mmioh_end;
-	int bus_base, bus_end;
-
-	/* some sys doesn't get mmconf enabled */
-	if (dev->cfg_size < 0x120)
-		return;
-
-	if (pci_root_num >= PCI_ROOT_NR) {
-		printk(KERN_DEBUG "intel_bus.c: PCI_ROOT_NR is too small\n");
-		return;
-	}
-
-	info = &pci_root_info[pci_root_num];
-	pci_root_num++;
-
-	pci_read_config_word(dev, IOH_LCFGBUS, &word);
-	bus_base = (word & 0xff);
-	bus_end = (word & 0xff00) >> 8;
-	sprintf(info->name, "PCI Bus #%02x", bus_base);
-	info->bus_min = bus_base;
-	info->bus_max = bus_end;
-
-	pci_read_config_word(dev, IOH_LIO, &word);
-	io_base = (word & 0xf0) << (12 - 4);
-	io_end = (word & 0xf000) | 0xfff;
-	update_res(info, io_base, io_end, IORESOURCE_IO, 0);
-
-	pci_read_config_dword(dev, IOH_LMMIOL, &dword);
-	mmiol_base = (dword & 0xff00) << (24 - 8);
-	mmiol_end = (dword & 0xff000000) | 0xffffff;
-	update_res(info, mmiol_base, mmiol_end, IORESOURCE_MEM, 0);
-
-	pci_read_config_dword(dev, IOH_LMMIOH, &dword);
-	mmioh_base = ((u64)(dword & 0xfc00)) << (26 - 10);
-	mmioh_end = ((u64)(dword & 0xfc000000) | 0x3ffffff);
-	pci_read_config_dword(dev, IOH_LMMIOH_BASEU, &dword);
-	mmioh_base |= ((u64)(dword & 0x7ffff)) << 32;
-	pci_read_config_dword(dev, IOH_LMMIOH_LIMITU, &dword);
-	mmioh_end |= ((u64)(dword & 0x7ffff)) << 32;
-	update_res(info, mmioh_base, mmioh_end, IORESOURCE_MEM, 0);
-
-	print_ioh_resources(info);
-}
-
-/* intel IOH */
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, pci_root_bus_res);
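
As a reading aid for the removed pci_root_bus_res() above, the register decoding can be reproduced outside the kernel. This is only a sketch: the `struct window` type and the `decode_*` helper names are made up for illustration, and the bit layouts are inferred from the masks and shifts in the deleted code, not from an Intel datasheet.

```c
#include <stdint.h>

/* Hypothetical helper type: an inclusive [base, end] range. */
struct window {
	uint64_t base;
	uint64_t end;
};

/* LCFGBUS (0x11c): low byte = first bus number, high byte = last bus number. */
static struct window decode_lcfgbus(uint16_t word)
{
	struct window w;

	w.base = word & 0xff;
	w.end = (word & 0xff00) >> 8;
	return w;
}

/* LIO (0x108): bits 7:4 hold base[15:12], bits 15:12 hold limit[15:12]. */
static struct window decode_lio(uint16_t word)
{
	struct window w;

	w.base = (uint16_t)((word & 0xf0) << (12 - 4));
	w.end = (uint16_t)((word & 0xf000) | 0xfff);
	return w;
}

/* LMMIOL (0x10c): bits 15:8 hold base[31:24], bits 31:24 hold limit[31:24]. */
static struct window decode_lmmiol(uint32_t dword)
{
	struct window w;

	w.base = (uint32_t)((dword & 0xff00u) << (24 - 8));
	w.end = (dword & 0xff000000u) | 0xffffffu;
	return w;
}

/*
 * LMMIOH (0x110) plus the upper halves at 0x114 and 0x118:
 * bits 15:10 hold base[31:26], bits 31:26 hold limit[31:26],
 * and the low 19 bits of each upper register supply bits 50:32.
 */
static struct window decode_lmmioh(uint32_t lo, uint32_t baseu, uint32_t limitu)
{
	struct window w;

	w.base = (uint64_t)(lo & 0xfc00u) << (26 - 10);
	w.end = (uint64_t)(lo & 0xfc000000u) | 0x3ffffffu;
	w.base |= (uint64_t)(baseu & 0x7ffffu) << 32;
	w.end |= (uint64_t)(limitu & 0x7ffffu) << 32;
	return w;
}
```

For example, under these assumptions an LIO value of 0xf010 decodes to the I/O port window [0x1000, 0xffff], and an LMMIOL value of 0xfb00d000 decodes to [0xd0000000, 0xfbffffff].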

Jeff Garrett

Jan 27, 2010, 11:10:02 PM
On Wed, Jan 27, 2010 at 05:35:50PM -0800, Jesse Barnes wrote:
> On Tue, 26 Jan 2010 14:57:31 -0800
> Yinghai Lu <yin...@kernel.org> wrote:
>
> > On 01/26/2010 10:17 AM, Jesse Barnes wrote:
> >
> > >
> > > For 2.6.33 I'd like a minimal fix though, can you disable it for all
> > > but the multi-IOH case perhaps?
> > >
> > please check,
> >
> > [PATCH] x86/pci: don't use ioh resource if only have one ioh
> >
> > Some systems could use resources outside the IOH resource ranges when only one IOH is present.
> >
> > It could be that the BIOS has wrong IOH resources and does not enable them.
> >
> > Signed-off-by: Yinghai Lu <yin...@kernel.org>
>
> I applied this one to my for-linus branch. Jeff can you confirm it
> works for you? I'd like to push it to Linus tomorrow.
>
> Thanks,

FWIW, works...

Bjorn Helgaas

Jan 27, 2010, 11:40:01 PM

Yes, we need a way to turn on "pci=use_crs" automatically. My first
thought is to turn it on for all BIOSes with dates of 2010 or later, and
in addition, have a whitelist of the pre-2010 machines that require it.
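
A minimal sketch of the policy described above, written as plain user-space C rather than kernel code. The function name `should_use_crs()`, the whitelist array, and its single placeholder entry are all hypothetical; nothing here is taken from an actual patch in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical whitelist of pre-2010 machines that need _CRS.
 * The entry is a placeholder, not a real product name.
 */
static const char *const use_crs_whitelist[] = {
	"Example Vendor Model X",
};

/*
 * Decide whether "pci=use_crs" behaviour should be enabled automatically:
 * always for BIOS dates of 2010 or later, otherwise only for whitelisted
 * machines. "product" may be NULL when no product string is available.
 */
static bool should_use_crs(int bios_year, const char *product)
{
	size_t i;

	if (bios_year >= 2010)
		return true;

	if (!product)
		return false;

	for (i = 0; i < sizeof(use_crs_whitelist) / sizeof(use_crs_whitelist[0]); i++) {
		if (strcmp(product, use_crs_whitelist[i]) == 0)
			return true;
	}

	return false;
}
```

The date check comes first so that the whitelist only has to cover older machines, which keeps the list from growing as new hardware ships.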

> 2. how about when acpi is disabled?

When ACPI is disabled, I think we just have to accept that we lose some
functionality. I don't see the need for alternate ways to accomplish
everything that ACPI does. It's becoming less and less useful to
disable ACPI; I think it's only interesting as a debugging tool, and
even then it's a sledgehammer.

Bjorn

Soeren Sonnenburg

Jan 28, 2010, 12:10:01 AM
On Wed, 2010-01-27 at 09:57 -0800, Jesse Barnes wrote:
> On Sun, 24 Jan 2010 23:04:37 +0100 (CET)
> "Rafael J. Wysocki" <r...@sisk.pl> wrote:
>
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.32. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15043
> > Subject : Display goes off with i915.powersave=1
> > Submitter : Soeren Sonnenburg <so...@debian.org>
> > Date : 2010-01-10 20:09 (15 days old)
> > References : http://marc.info/?l=linux-kernel&m=126315457519505&w=4
>
> If this isn't fixed yet, I hope David's patch fixes it. See "[Bug
> #14897] i915: Commit 0e442c60 causes flickering".

that sounds indeed like it, because sometimes I observe a flickering
before darkness :) I will give the new kernel a try though I don't have
time to do it in the next few days :/

Soeren
--
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962


Yinghai Lu

Jan 28, 2010, 1:00:01 AM

Some systems can have an interrupt storm when ACPI is enabled,
and ACPI has to be disabled on them.

YH

Rafael J. Wysocki

Jan 28, 2010, 5:50:02 AM

Blacklist them?

Rafael

Bjorn Helgaas

Jan 28, 2010, 11:20:01 AM

We should fix that problem rather than just covering it up by
disabling ACPI. Can you provide any details?

I think it's crazy to add code to work around Problem B that only
occurs because we disabled ACPI to work around Problem A. We should
just fix Problem A instead.

Bjorn

Jesse Barnes

Jan 28, 2010, 11:30:03 AM
On Wed, 27 Jan 2010 22:02:26 -0600
je...@jgarrett.org (Jeff Garrett) wrote:

> On Wed, Jan 27, 2010 at 07:24:09PM -0800, Jesse Barnes wrote:
> > On Wed, 27 Jan 2010 17:50:17 -0800 (PST)
> > Linus Torvalds <torv...@linux-foundation.org> wrote:
> > > On Tue, 26 Jan 2010, Yinghai Lu wrote:
> > > >
> > > > [PATCH] x86/pci: don't use ioh resource if only have one ioh
> > >
> > > Please, no.
> > >
> > > This patch is too ugly to live.
> > >
> > > And it's totally unacceptable to probe every single possible PCI device
> > > for something like this.
> > >
> > > If we don't know enough about the hardware workings of those Intel bridges
> > > to know when they are active and how they decode things, then please let's
> > > just disable intel_bus.c entirely.
> > >
> > > There's no excuse for hacky tests like this.
> >
> > Ok, we'll just kill it entirely then. I'll send a patch tomorrow
> > unless Yinghai beats me to it.
>
> What about something like this (works for me, without pci=use_crs)?
>
> ---
> Remove intel_bus.c Intel-specific PCI/IOH logic
>
> Signed-off-by: Jeff Garrett <je...@jgarrett.org>

Yeah, looks good. I'll push to Linus today.

Thanks,
Jesse

--
Jesse Barnes, Intel Open Source Technology Center

Yinghai Lu

Jan 28, 2010, 1:20:01 PM
On 01/28/2010 08:24 AM, Jesse Barnes wrote:
> On Wed, 27 Jan 2010 22:02:26 -0600
> je...@jgarrett.org (Jeff Garrett) wrote:
>
>> On Wed, Jan 27, 2010 at 07:24:09PM -0800, Jesse Barnes wrote:
>>> On Wed, 27 Jan 2010 17:50:17 -0800 (PST)
>>> Linus Torvalds <torv...@linux-foundation.org> wrote:
>>>> On Tue, 26 Jan 2010, Yinghai Lu wrote:
>>>>>
>>>>> [PATCH] x86/pci: don't use ioh resource if only have one ioh
>>>>
>>>> Please, no.
>>>>
>>>> This patch is too ugly to live.
>>>>
>>>> And it's totally unacceptable to probe every single possible PCI device
>>>> for something like this.
>>>>
>>>> If we don't know enough about the hardware workings of those Intel bridges
>>>> to know when they are active and how they decode things, then please let's
>>>> just disable intel_bus.c entirely.
>>>>
>>>> There's no excuse for hacky tests like this.
>>>
>>> Ok, we'll just kill it entirely then. I'll send a patch tomorrow
>>> unless Yinghai beats me to it.
>>
>> What about something like this (works for me, without pci=use_crs)?
>>
>> ---
>> Remove intel_bus.c Intel-specific PCI/IOH logic
>>
>> Signed-off-by: Jeff Garrett <je...@jgarrett.org>
>
> Yeah, looks good. I'll push to Linus today.
>

Please don't. I will send you another patch that keeps the printout so we can cross-check it against _CRS.

YH

Yinghai Lu

Jan 28, 2010, 1:30:01 PM
On 01/28/2010 08:09 AM, Bjorn Helgaas wrote:
> On Wednesday 27 January 2010 10:53:51 pm Yinghai Lu wrote:
>> On 01/27/2010 08:26 PM, Bjorn Helgaas wrote:
>>> On Wed, 2010-01-27 at 15:34 -0800, Yinghai Lu wrote:
>
>>>> 2. how about when acpi is disabled?
>>>
>>> When ACPI is disabled, I think we just have to accept that we lose some
>>> functionality. I don't see the need for alternate ways to accomplish
>>> everything that ACPI does. It's becoming less and less useful to
>>> disable ACPI; I think it's only interesting as a debugging tool, and
>>> even then it's a sledgehammer.
>>
>> Some systems can have an interrupt storm when ACPI is enabled,
>> and ACPI has to be disabled on them.
>
> We should fix that problem rather than just covering it up by
> disabling ACPI. Can you provide any details?
That is not covering up a problem. ACPI just causes too many problems.

Some systems use ACPI hotplug support, with ACPI AML code monitoring the hotplug status instead of the hardware,
and after one or two days they get an interrupt storm on the SCI/ACPI interrupt, aka IRQ 9.


>
> I think it's crazy to add code to work around Problem B that only
> occurs because we disabled ACPI to work around Problem A. We should
> just fix Problem A instead.

That is not the point. Should we fix the BIOS, the hardware, or the OS?

Have you checked how many systems have broken ACPI?
On some systems the ACPI code even clears PCI BARs as soon as ACPI is first enabled.

YH
