2.6.12-rc6-mm1

Andrew Morton

Jun 7, 2005, 7:40:14 AM

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/

- Added v9fs

- Various random fixes

- Probably a similar number of breakages

Changes since 2.6.12-rc5-mm2:

-fix-ide-scsi-eh-locking.patch
-ext3-fix-log_do_checkpoint-assertion-failure.patch
-ext3-fix-list-scanning-in-__cleanup_transaction.patch
-namei-fixes-01-19.patch
-namei-fixes-02-19.patch
-namei-fixes-03-19.patch
-namei-fixes-04-19.patch
-namei-fixes-05-19.patch
-namei-fixes-06-19.patch
-namei-fixes-07-19.patch
-namei-fixes-08-19.patch
-namei-fixes-09-19.patch
-namei-fixes-10-19.patch
-namei-fixes-11-19.patch
-namei-fixes-12-19.patch
-namei-fixes-13-19.patch
-namei-fixes-14-19.patch
-namei-fixes-15-19.patch
-namei-fixes-16-19.patch
-namei-fixes-17-19.patch
-namei-fixes-18-19.patch
-namei-fixes-19-19.patch
-ipmi-class_simple-fixes.patch
-gregkh-i2c-i2c-ali1563.patch
-git-ocfs-fix-for-shemminger-tcp-stuff.patch
-gregkh-pci-pci-hotplug-shpchp-_HPP-fix.patch
-gregkh-pci-pci-hotplug-shpchp-PERR-fix.patch
-gregkh-pci-pci-amd74xx-ids.patch
-gregkh-pci-pci-cpci-update.patch
-gregkh-usb-usb-sl811-hcd-fixes.patch
-gregkh-usb-usb-sl811_cs.patch
-gregkh-usb-usb-ftdi_sio-new-id.patch
-gregkh-usb-usb-serial-generic-init-fix.patch
-gregkh-usb-usb-ub_multi_lun.patch
-gregkh-usb-usb-remove_pwc_changelog.patch
-gregkh-usb-usb-add-new-wacom-device-to-usb-hid-core-list.patch
-gregkh-usb-usb-urb_documentation.patch
-gregkh-usb-usb-earthmate-hid-blacklist.patch
-gregkh-usb-usb-storage-trumpion.patch
-gregkh-usb-usb-modalias-shrink.patch
-gregkh-usb-usb-cp2101-flow-control.patch
-gregkh-usb-usb-usbatm-reduce-log-spam.patch
-gregkh-usb-usb-usbatm-avoid-oops-on-bind-failure.patch
-gregkh-usb-usb-usbatm-1-fix.patch
-usb-option-card-driver.patch
-usb-wacom-tablet-driver.patch
-atm-nicstar-remove-a-bunch-of-pointless-casts-of-null.patch
-fix-atm-build-with-o=.patch
-drivers-net-hamradio-baycom_eppc-cleanups.patch
-ppc32-apple-device-tree-bug-fix.patch
-ppc32-ppc64-cleanup-proc-device-tree.patch
-ppc64-cleanup-spr-definitions.patch
-ppc64-cleanup-iseries-runlight-support.patch
-ppc64-remove-decr_overclock.patch
-ppc64-fix-a-device-tree-bug-on-apples.patch
-i386-collect-host-bridge-resources.patch
-x86_64-collect-host-bridge-resources.patch
-allow-ev_abs-to-work-in-uinputc.patch
-serial-update-nec-vr4100-series-serial-support.patch

Merged

+ppc32-add-linux-compilerh-to-asm-sigcontexth.patch
+include-linux-configh-before-testing-config_acpi.patch
+uml-make-the-emulated-iomem-driver-work-on-26.patch
+uml-compile-fixes-for-gcc-4.patch
+uml-fix-strace-f.patch
+uml-clean-up-error-path.patch
+uml-link-tt-mode-against-nptl.patch
+send_ipi_mask_sequence-warning-fix.patch
+ppc32-add-405ep-cpu_spec-entry.patch
+input-disable-scroll-feature-on-at-keyboards.patch

Planned for 2.6.12

+x86_64-task_size-fixes-for-compatibility-mode-processes.patch

x86_64 critical fixes (needs work)

+ia64-disable-preempt.patch

Disable CONFIG_PREEMPT on ia64 (it has a problem with floating-point
save/restore)

+fix-up-macro-abuse-in-drivers-acpi-sleep-procc.patch

ACPI cleanup

+git-arm.patch
+git-arm-smp.patch

ARM git trees

-git-cpufreq.patch

Empty

+fix-warning-in-powernow-k8c.patch

Fix a cpufreq warning

+gregkh-driver-ipmi-class_simple-fixes.patch
+gregkh-driver-sysfs-permissions-01.patch
+gregkh-driver-sysfs-permissions-02.patch
+gregkh-driver-sysfs-permissions-03.patch
+gregkh-driver-dont-loose-devices-on-suspend-failure.patch

New driver core patches

-bk-drm.patch
-bk-drm-via.patch

DRM is moving to git

-update-drm-ioctl-compatibility-to-new-world-order.patch

The code which this patches isn't there any more (it will come back)

+git-drm-initmap.patch
+git-drm-via.patch

Some DRM git trees

+gregkh-i2c-i2c-Kconfig-update.patch
+gregkh-i2c-i2c-pcf8574-cleanup.patch
+gregkh-i2c-i2c-adm9240-docs.patch
+gregkh-i2c-i2c-device-attr-lm90.patch
+gregkh-i2c-i2c-device-attr-lm83.patch
+gregkh-i2c-i2c-device-attr-lm63.patch
+gregkh-i2c-i2c-device-attr-it87.patch
+gregkh-i2c-hwmon-01.patch
+gregkh-i2c-hwmon-02.patch
+gregkh-i2c-hwmon-03.patch

i2c tree updates

+i2c-chips-need-hwmon.patch
+gregkh-i2c-hwmon-02-sparc64-fix.patch

Fix a few things in the i2c tree

+sonypi-make-sure-that-input_work-is-not-running-when-unloading.patch

sonypi fix

-git-libata-adma.patch
-git-libata-ahci-msi.patch
-git-libata-bridge-detect.patch
-git-libata-chs-support.patch
-git-libata-docs.patch
-git-libata-svw.patch
-git-libata-promise-sata-pata.patch
-git-libata-pdc2027x.patch

Dropped the libata tree - it changes all the time and I can't work out wtf
is going on.

-git-netdev-r8169.patch

Too many rejects from this one.

+fix-recursive-ipw2200-dependencies.patch
+drivers-net-chelsio-cxgb2-use-the-dma_3264bit_mask-constants.patch
+drivers-net-wireless-ipw2100-use-the-dma_32bit_mask-constant.patch
+drivers-net-wireless-ipw2200-use-the-dma_32bit_mask-constant.patch
+fix-tulip-suspend-resume.patch

Net driver fixes

+scalable-tcp-cleaned.patch

"scalable TCP"

+git-serial.patch

Serial subsystem tree

+gregkh-pci-pci-fix-routing-in-parent-bridge.patch
+gregkh-pci-pci-dma-bursting-advice.patch
+gregkh-pci-pci-collect-host-bridge-resources-01.patch
+gregkh-pci-pci-collect-host-bridge-resources-02.patch

PCI subsystem tree updates

+gregkh-pci-pci-dma-bursting-advice-fix.patch

Fix it

-git-scsi-rc-fixes.patch

This is empty

+gregkh-usb-usb-usbatm-reduce-log-spam.patch
+gregkh-usb-usb-usbatm-avoid-oops-on-bind-failure.patch
+gregkh-usb-usb-usbatm-fix-gcc-2.95.x.patch
+gregkh-usb-usb-usbatm-kcalloc.patch
+gregkh-usb-usb-uhci-detect-invalid-ports.patch
+gregkh-usb-usb-export-getput_intf.patch
+gregkh-usb-usb-cdc-acm-reference-count-fix.patch
+gregkh-usb-usb-ehci-fix-page-pointer-allocate.patch
+gregkh-usb-usb-wireless-definitions.patch
+gregkh-usb-usb-usblp-race-fix.patch
+gregkh-usb-usb-stv680-creative-mini.patch
+gregkh-usb-usb-atiremote-sysfs-links.patch
+gregkh-usb-usb-gotemp.patch

USB tree updates

+sparsemem-memory-model-fix-4.patch
+sparsemem-memory-model-fix-5.patch

Fix sparsemem-memory-model.patch even more

+sparsemem-hotplug-base-fix.patch

Fix sparsemem-hotplug-base.patch

-vm-merge_lru_pages.patch
-vm-page-cache-reclaim-core.patch
-vm-page-cache-reclaim-core-tidy.patch
-vm-reclaim_page_cache_node-syscall.patch
-vm-reclaim_page_cache_node-syscall-x86.patch
-vm-automatic-reclaim-through-mempolicy.patch
+vm-add-may_swap-flag-to-scan_control.patch
+vm-early-zone-reclaim.patch
+vm-early-zone-reclaim-tidy.patch
+vm-add-__gfp_noreclaim.patch
+vm-rate-limit-early-reclaim.patch

These patches were updated

+node-local-per-cpu-pages-tidy-2-fix.patch

Fix node-local-per-cpu-pages.patch some more.

+avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch

Fix a patch clash

+__mod_page_state-pass-unsigned-long-instead-of-unsigned.patch
+__read_page_state-pass-unsigned-long-instead-of-unsigned.patch

Warning fixes

+add-oom-debug.patch

Additional debug output when the box goes oom.

+periodically-drain-non-local-pagesets.patch
+periodically-drain-non-local-pagesets-fix.patch

Shrink the per-cpu-pages caches occasionally

+ia64-uncached-alloc.patch
+sn2-xpc-build-patches.patch

Special allocator for uncached pages

+shmem-restore-superblock-info.patch
+mbind-fix-verify_pages-pte_page.patch
+mbind-check_range-use-standard-ptwalk.patch
+dup_mmap-update-comment-on-new-vma.patch
+bad_page-clear-reclaim-and-slab.patch
+rme96xx-fix-pagereserved-range.patch
+get_user_pages-kill-get_page_map.patch
+do_wp_page-cannot-share-file-page.patch
+can_share_swap_page-use-page_mapcount.patch
+msync-check-pte-dirty-earlier.patch

Various mm fixes

+sunzilog-warning-fixes.patch
+ppp-handle-misaligned-accesses.patch

Net fixes

+ppc32-removed-dependency-on-config_cpm2-for-building.patch
+ppc32-converted-mpc10x-bridge-to-use-platform.patch
+cpm_uart-route-scc2-pins-for-the-stx-gp3-board.patch

ppc32 updates

+ppc64-iseries-remove-iseries_proch.patch
+ppc64-iseries-header-file-white-space-cleanups.patch
+ppc64-iseries-more-header-file-white-space-cleanups.patch
+ppc64-iseries-obvious-code-simplifications.patch
+ppc64-iseries-remove-lpardatah.patch
+ppc64-iseries-eliminate-some-unused-inline-functions.patch
+ppc64-iseries-remove-hvcallcfgh.patch
+ppc64-iseries-cleanup-itlpqueueh.patch
+ppc64-iseries-tidy-up-some-includes-and-hvcallh.patch
+ppc64-iseries-misc-header-cleanups.patch
+update-ppc64-defconfig.patch
+ppc64-iseries-remove-iseries_pci_resetc.patch
+ppc64-iseries-iommuh-cleanups.patch
+ppc64-iseries-iseries_vpdinfoc-cleanups.patch
+ppc64-iseries-iseries_pcih-cleanups.patch
+ppc64-iseries-remove-ioretry-from-iseries_device_node.patch
+ppc64-iseries-remove-some-more-members-of.patch

ppc64 updates

+x86-x86_64-pcibus_to_node-fix.patch

Fix x86-x86_64-pcibus_to_node.patch

+mempool-bounce-buffer-restriction.patch

Limit the amount of memory which can be used for bounce buffers

+arm-irqs_disabled-type-fix.patch

ARM warning fix

+variable-overflow-after-hundreds-round-of-hotplug-cpu.patch

CPU hotplug fix

+x86_64-change-init-sections-for-cpu-hotplug-support.patch
+x86_64-change-init-sections-for-cpu-hotplug-support-fix.patch
+x86_64-cpu-hotplug-support.patch
+x86_64-cpu-hotplug-sibling-map-cleanup.patch
+x86_64-dont-use-broadcast-shortcut-to-make-it-cpu-hotplug-safe.patch
+x86_64-provide-ability-to-choose-using-shortcuts-for-ipi-in-flat-mode.patch

CPU hotplug for x86_64

+m32r-support-m3a-2170mappi-iii-platform-fix.patch
+m32r-support-m3a-2170mappi-iii-platform-fix-2.patch
+m32r-update-setup_xxxxxc.patch
+m32r-update-m32r_cfc-to-support-mappi-iii-fix.patch
+m32r-cleanup-arch-m32r-mm-extablec.patch
+m32r-remove-include-asm-m32r-m32102perih.patch
+m32r-update-defconfig-files.patch
+m32r-use-asm-generic-div64h.patch

m32r fixes and updates

+s390-cio-max-channels-checks.patch
+s390-cio-documentation.patch
+s390-ifdefs-in-compat_ioctls.patch
+s390-kernel-stack-overflow-panic.patch
+s390-cmm-sender-parameter-visibility.patch
+s390-memory-detection-32gb.patch
+s390-pending-interrupt-after-ipl-from-reader.patch

s/390 updates

+ecryptfs-export-user-key-type.patch

Export a symbol

+x86_64-specific-function-return-probes.patch
+kprobes-ia64-cleanup-2.patch
+kprobes-ia64-cmp-ctype-unc-support.patch
+kprobes-ia64-safe-register-kprobe.patch
+kprobes-temporary-disarming-of-reentrant-probe-for-x86_64-fix.patch
+allow-a-jprobe-to-coexist-with-muliple-kprobes.patch

kprobes updates

+cs4236-irq-handling-fix.patch

OSS driver fix

+block-add-unlocked_ioctl-support-for-block-devices.patch

Support lock_kernel-less ioctls on blockdevs

+pcdp-handle-tables-that-dont-supply-baud-rate.patch

serial driver update

+stop-arch-i386-kernel-vsyscall-noteo-being-rebuilt-every-time.patch

kbuild fix

+remove-f_error-field-from-struct-file.patch

cleanup

+autofs4-avoid-panic-on-bind-mount-of-autofs-owned-directory.patch
+autofs4-post-expire-race-fix.patch
+autofs4-bad-lookup-fix.patch
+autofs4-subversion-bump-to-identify-these-changes.patch

autofs4 updates

+rapidio-support-core-base.patch
+rapidio-support-core-includes.patch
+rapidio-support-core-enum.patch
+rapidio-support-ppc32.patch
+rapidio-support-net-driver.patch

RapidIO driver

+dlm-lockspaces-callbacks-directory-dlm-consistent-ifdefs.patch
+dlm-lockspaces-callbacks-directory-fix-2-dlm-dont-repeat-include.patch
+dlm-lockspaces-callbacks-directory-fix-3.patch
+dlm-lockspaces-callbacks-directory-dlm-dont-free-lvb-twice.patch
+dlm-communication-dlm-dont-add-duplicate-node-addresses.patch
+dlm-recovery-dlm-timer-cant-be-global.patch
+dlm-recovery-dlm-clear-recovery-flags.patch
+dlm-device-interface-dlm-uncomment-unregister_lockspace.patch
+dlm-device-interface-dlm-newline-in-printks.patch
+dlm-debug-fs-dlm-consistent-ifdefs.patch

Various fixes and updates to the DLM driver

+tuner-corec-improvments-and-ymec-tvision-tvf8533mf.patch

v4l update

+oprofile-report-anonymous-region-samples.patch

oprofile feature

+lockd-flush-signals-on-shutdown.patch
+nfs4-hold-filp-while-reading-or-writing.patch
+nfsd4-fix-probe_callback.patch
+nfsd4-nfs4_check_open_reclaim-cleanup.patch
+nfsd4-create-separate-laundromat-workqueue.patch
+nfsd4-simplify-lease-changing.patch
+nfsd4-delegation-recovery.patch
+nfsd4-rename-nfs4_state_init.patch
+nfsd4-clean-up-state-initialization.patch
+nfsd4-remove-nfs4_reclaim_init.patch
+nfsd4-idmap-initialization.patch
+nfsd4-setclientid-simplification.patch
+nfsd4-reboot-hash.patch
+nfsd4-add-find_unconf_by_str-functions-to-simplify-setclientid.patch
+nfsd4-grace-period-end.patch
+nfsd4-make-needlessly-global-code-static.patch
+nfsd4-fix-uncomfirmed-list.patch
+nfsd4-fix-setclientid_confirm-cases.patch
+nfsd4-fix-setclientid_confirm-error-return.patch
+nfsd4-setclientid_confirm-gotoectomy.patch
+nfsd4-setclientid_confirm-comments.patch
+nfsd4-miscellaneous-setclientid_confirm-cleanup.patch
+nfsd4-rename-state-list-fields.patch
+nfsd4-allow-multiple-lockowners.patch
+nfsd4-remove-cb_parsed.patch
+nfsd4-initialize-recovery-directory.patch
+nfsd4-reboot-recovery.patch
+nfsd4-reboot-dirname.patch

nfsd updates

+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags.patch
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags-tidy.patch
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags-fix.patch

isofs feature work

+numa-aware-slab-allocator-v5.patch

The NUMA-aware slab allocator is back. Needs ifdef-reduction work.

-periodically-scan-redzone-entries-and-slab-control-structures.patch
-slab-leak-detector.patch
-slab-leak-detector-warning-fixes.patch

It broke these.

+numa-aware-slab-allocator-v3-__bad_size-fix.patch

Fix it.

+sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch

CPU scheduler fix

+v4l-add-support-for-pixelview-ultra-pro.patch
+dvico-fusionhdtv3-gold-t-documentation-fix.patch

v4l updates

+kexec-code-cleanup.patch

Make all the kexec patches resemble CodingStyle.

+v9fs-documentation-makefiles-configuration.patch
+v9fs-documentation-makefiles-configuration-fix.patch
+v9fs-vfs-file-dentry-and-directory-operations.patch
+v9fs-vfs-file-dentry-and-directory-operations-fix.patch
+v9fs-vfs-inode-operations.patch
+v9fs-vfs-superblock-operations-and-glue.patch
+v9fs-9p-protocol-implementation.patch
+v9fs-transport-modules.patch
+v9fs-debug-and-support-routines.patch
+v9fs-debug-and-support-routines-fix.patch

The plan9 networked filesystem

+framebuffer-driver-for-arc-lcd-board.patch
+framebuffer-driver-for-arc-lcd-board-tidy.patch
+framebuffer-driver-for-arc-lcd-board-update.patch
+new-pci-id-for-chipsfb.patch

fbdev updates

+modules-add-version-and-srcversion-to-sysfs-fix.patch
+modules-add-version-and-srcversion-to-sysfs-fix-2.patch

Fix modules-add-version-and-srcversion-to-sysfs.patch

+fuse-device-functions-fuse-serious-information-leak-fix.patch

FUSE fix

+remove-redundant-info-from-submittingpatches.patch

Documentation update

-unexport-slab_reclaim_pages.patch

Drop this due to some reject.

number of patches in -mm: 1397
number of changesets in external trees: 53
number of patches in -mm only: 1395
total patches: 1448

All 1397 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/patch-list

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Wolfgang Wander

Jun 7, 2005, 11:00:30 AM

Wolfgang Wander wrote:
> Andrew Morton wrote:
>
>> +avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch
>>
>
>
> As a heads-up.
>
> This one breaks the fragmentation reduction patch in 32 bit emulation mode.
> Our test case shows the standard 17 fragmented regions in
> /proc/self/maps (as in
> the 2.6 standard kernel) vs the 2 regions in 2.6.12-rc5-mm2 (and before).
>
> Somehow the new way of detecting 32 bit emulation mode seems to fail here.
>
> I'll try to figure out a fix.
>

Here is one possibility:

Since rc6 the difference between TASK_UNMAPPED_64 and TASK_UNMAPPED_32 is gone
and both are now merged into TASK_UNMAPPED_BASE. Therefore we can no longer
check our local base against TASK_UNMAPPED_BASE to see if we are running in 32bit
emulation mode. The appended patch uses other (hopefully the right) means.

Tested on x86_64 in 32 and 64 mode (64 bit fragments as desired, 32 bit
collapses as desired).

Signed-off-by: Wolfgang Wander <w...@rentec.com>

avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes-fix.patch

Brice Goglin

Jun 7, 2005, 11:00:29 AM

Andrew Morton wrote:

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
>
> - Added v9fs
>
> - Various random fixes
>
> - Probably a similar number of breakages

Hi Andrew,

I didn't see any breakage. But I get these two lines during boot:
yenta 0000:02:03.1: no resource of type 100 available, trying to continue...
yenta 0000:02:03.1: no resource of type 100 available, trying to continue...

Anyway, my PCMCIA slots seem to still work.

Brice

Wolfgang Wander

Jun 7, 2005, 10:30:14 AM

Andrew Morton wrote:

> +avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch

As a heads-up.

This one breaks the fragmentation reduction patch in 32 bit emulation mode.
Our test case shows the standard 17 fragmented regions in /proc/self/maps (as in
the 2.6 standard kernel) vs the 2 regions in 2.6.12-rc5-mm2 (and before).
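One way to put a number on this kind of fragmentation is to count the holes between consecutive mappings in /proc/self/maps. The test case itself is not shown in this thread, so the sketch below is only an assumption about how such a count could be taken; `count_gaps` is a hypothetical helper that works on text in the maps format ("start-end perms ...").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count holes between consecutive mappings in text formatted like
 * /proc/self/maps. Lines that do not start with "start-end" in hex
 * are skipped. Hypothetical helper, for illustration only.
 */
static int count_gaps(const char *maps_text)
{
	unsigned long start, end, prev_end = 0;
	int have_prev = 0, gaps = 0;
	const char *line = maps_text;

	while (line && *line) {
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			/* A hole exists if this mapping starts past the previous end. */
			if (have_prev && start > prev_end)
				gaps++;
			prev_end = end;
			have_prev = 1;
		}
		/* Advance to the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return gaps;
}
```

Reading /proc/self/maps into a buffer and passing it to `count_gaps()` before and after a run of mmap/munmap calls would give a rough before/after comparison.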

Somehow the new way of detecting 32 bit emulation mode seems to fail here.

I'll try to figure out a fix.

Wolfgang

Adrian Bunk

Jun 7, 2005, 5:10:30 PM

On Tue, Jun 07, 2005 at 04:29:31AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.12-rc5-mm2:
>...

> +rapidio-support-core-base.patch
> +rapidio-support-core-includes.patch
> +rapidio-support-core-enum.patch
> +rapidio-support-ppc32.patch
> +rapidio-support-net-driver.patch
>
> RapidIO driver

>...

That we do now have both drivers/rio/ and drivers/char/rio/ and that
they are for completely different things is confusing.

What about drivers/rapidio/ ?

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Matt Porter

Jun 7, 2005, 5:50:24 PM

On Tue, Jun 07, 2005 at 10:59:06PM +0200, Adrian Bunk wrote:
> On Tue, Jun 07, 2005 at 04:29:31AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.12-rc5-mm2:
> >...
> > +rapidio-support-core-base.patch
> > +rapidio-support-core-includes.patch
> > +rapidio-support-core-enum.patch
> > +rapidio-support-ppc32.patch
> > +rapidio-support-net-driver.patch
> >
> > RapidIO driver
> >...
>
> That we do now have both drivers/rio/ and drivers/char/rio/ and that
> they are for completely different things is confusing.
>
> What about drivers/rapidio/ ?

Fine with me. I'll roll it into my next update.

-Matt

Francois Romieu

Jun 7, 2005, 7:20:11 PM

Andrew Morton <ak...@osdl.org> :
[...]

> -git-netdev-r8169.patch
>
> Too many rejects from this one.

How did you generate git-netdev-r8169.patch ?

Jeff's 'upstream-2.6.13' includes all the pending r8169 changes and
nothing will be merged before 2.6.12 is out. Imho you can safely
ignore any r8169 change until 2.6.12 appears.

--
Ueimor

Søren Lott

Jun 7, 2005, 10:10:07 PM

On Tuesday 07 June 2005 08:29, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/

[snip]

> +gregkh-i2c-i2c-Kconfig-update.patch
> +gregkh-i2c-i2c-pcf8574-cleanup.patch
> +gregkh-i2c-i2c-adm9240-docs.patch
> +gregkh-i2c-i2c-device-attr-lm90.patch
> +gregkh-i2c-i2c-device-attr-lm83.patch
> +gregkh-i2c-i2c-device-attr-lm63.patch
> +gregkh-i2c-i2c-device-attr-it87.patch
> +gregkh-i2c-hwmon-01.patch
> +gregkh-i2c-hwmon-02.patch
> +gregkh-i2c-hwmon-03.patch
>
> i2c tree updates
>
> +i2c-chips-need-hwmon.patch
> +gregkh-i2c-hwmon-02-sparc64-fix.patch
>
> Fix a few things in the i2c tree

[snip]

after those changes i don't get entries in /sys for my W83627THF chip.
(p4c800-D, i875,ICH5)

relevant config parts:

CONFIG_HWMON=y
CONFIG_I2C=y
CONFIG_I2C_ISA=y
CONFIG_I2C_SENSOR=y
CONFIG_SENSORS_W83627HF=y

thanks.

-SL

Jean Delvare

Jun 8, 2005, 2:00:29 AM

Hi Soren,

> [snip]
>
> > +gregkh-i2c-i2c-Kconfig-update.patch
> > +gregkh-i2c-i2c-pcf8574-cleanup.patch
> > +gregkh-i2c-i2c-adm9240-docs.patch
> > +gregkh-i2c-i2c-device-attr-lm90.patch
> > +gregkh-i2c-i2c-device-attr-lm83.patch
> > +gregkh-i2c-i2c-device-attr-lm63.patch
> > +gregkh-i2c-i2c-device-attr-it87.patch
> > +gregkh-i2c-hwmon-01.patch
> > +gregkh-i2c-hwmon-02.patch
> > +gregkh-i2c-hwmon-03.patch
> >
> > i2c tree updates
> >
> > +i2c-chips-need-hwmon.patch
> > +gregkh-i2c-hwmon-02-sparc64-fix.patch
> >
> > Fix a few things in the i2c tree
>
> [snip]
>
> after those changes i don't get entries in /sys for my W83627THF chip.
>
> (p4c800-D, i875,ICH5)
>
> relevant config parts:
>
> CONFIG_HWMON=y
> CONFIG_I2C=y
> CONFIG_I2C_ISA=y
> CONFIG_I2C_SENSOR=y
> CONFIG_SENSORS_W83627HF=y

Which kernel are you upgrading from?

Is CONFIG_PNPACPI set? If it is, try without it.

If it doesn't work, please try reverting (in reverse order):
gregkh-i2c-hwmon-01.patch
gregkh-i2c-hwmon-02.patch
gregkh-i2c-hwmon-03.patch
i2c-chips-need-hwmon.patch
gregkh-i2c-hwmon-02-sparc64-fix.patch
and see how it goes.

Thanks,
--
Jean Delvare

Søren Lott

Jun 8, 2005, 3:10:12 AM

On Wednesday 08 June 2005 02:53, Jean Delvare wrote:
> Hi Soren,
Hi,

> Which kernel are you upgrading from?

from 2.6.12-rc5-mm2

> Is CONFIG_PNPACPI set? If it is, try without it.

nope, don't even have CONFIG_PNP set.

> If it doesn't work, please try reverting (in reverse order):
> gregkh-i2c-hwmon-01.patch
> gregkh-i2c-hwmon-02.patch
> gregkh-i2c-hwmon-03.patch
> i2c-chips-need-hwmon.patch
> gregkh-i2c-hwmon-02-sparc64-fix.patch
> and see how it goes.

yeap, reverting these did the trick, all i2c entries in sysfs are back. :)

> Thanks,

thanks a lot.
cheers.

-SL

Andy Whitcroft

Jun 8, 2005, 10:30:07 AM

We've been seeing an early boot hang on IBM x-series (at least on an
x440) with -rc6-mm1. Finally got hold of a box to go search for this
and it seems that backing out the three patches below fixes it.

515 dmi-move-acpi-boot-quirk.patch
516 dmi-move-acpi-sleep-quirk.patch
517 dmi-remove-central-blacklist.patch

I am pretty sure it is actually the first one (that's where my bisection
search pointed) but I had to drop the other two to back it out. Anyhow,
2.6.12-rc6-mm1 boots on an x440 with these backed out.
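For anyone wanting to repeat this kind of search over the -mm series, a bisection can be sketched as below. This is only an illustration under stated assumptions: the series applies in order, a tree with no patches applied boots, the full series hangs, and once the hang appears it stays. `first_bad_patch`, `is_bad_fn` and `hangs_from_515` are hypothetical names, and the predicate stands in for "apply the first n patches, build, and boot".

```c
#include <assert.h>

/* Hypothetical predicate: nonzero if a tree with the first n patches
 * of the series applied fails to boot. */
typedef int (*is_bad_fn)(int n_applied);

/*
 * Return the 1-based position of the first patch that makes the tree
 * bad. Assumes is_bad(0) is false, is_bad(total) is true, and that
 * badness does not go away once introduced.
 */
static int first_bad_patch(int total, is_bad_fn is_bad)
{
	int good = 0, bad = total;

	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (is_bad(mid))
			bad = mid;	/* culprit is at or before mid */
		else
			good = mid;	/* culprit is after mid */
	}
	return bad;
}

/* Stand-in predicate for illustration: pretend patch 515 adds the hang. */
static int hangs_from_515(int n_applied)
{
	return n_applied >= 515;
}
```

With the 1397 patches in this release, `first_bad_patch(1397, hangs_from_515)` converges in about eleven build-and-boot cycles.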

Cheers.

-apw

Andrew James Wade

Jun 8, 2005, 10:50:13 AM

2.6.12-rc5-mm1 didn't crash.

kernel BUG at include/linux/list.h:166!
invalid operand: 0000 [#1]
PREEMPT
CPU: 0
EIP: 0060:[<c0319cd4>] Not tainted VLI
EFLAGS: 00010a83 (2.6.12-rc6-mm1)
EIP is at i2c_detach_client+0xb4/0x110
eax: dfc0bcc0 ebx: c15fc26c ecx: c15fc264 edx: c04378d0
esi: c15fc14c edi: c0437720 ebp: 00000000 esp: dff81f10
ds: 007b es: 007b ss: 0068
Process swapper (pid: 1, threadinfo=dff80000 task=c14dca00)
Stack: dfff6110 dfc0bdb4 00000286 00000286 c15fc26c c15fc14c c15fc160 ffffffed
c031d512 c15fc160 c03edac1 c15fc26c 00000000 0000002d 00000001 0000002d
c0437720 00000000 c0437c5c 00000001 00000000 c031b100 00000000 00000000
Call Trace:
[<c031d512>] asb100_detect+0x442/0x520
[<c031b100>] i2c_detect+0x240/0x380
[<c031d0d0>] asb100_detect+0x0/0x520
[<c0319889>] i2c_add_driver+0x89/0xc0
[<c047e7eb>] do_initcalls+0x2b/0xc0
[<c015a915>] kern_mount+0x15/0x19
[<c01002b0>] init+0x0/0x110
[<c01002df>] init+0x2f/0x110
[<c0100f28>] kernel_thread_helper+0x0/0x18
[<c0100f2d>] kernel_thread_helper+0x5/0x18
Code: 89 40 04 89 f0 e8 8d 31 fa ff 89 f0 e8 16 34 fa ff ff 47 2c 0f 8e 25 11
00 00 89 d8 e8 56 53 09 00 89 e8 83 c4 10 5b 5e 5f 5d c3 <0f> 0b a6 00 44
56 3c c0 eb 91 0f 0b a5 00 44 56 3c c0 e9 79 ff
<0>Kernel panic - not syncing: Attempted to kill init!

.config attached.


Jean Delvare

Jun 8, 2005, 1:00:32 PM

Hi Andrew,

> 2.6.12-rc5-mm1 didn't crash.
>
> kernel BUG at include/linux/list.h:166!
> invalid operand: 0000 [#1]
> PREEMPT
> CPU: 0
> EIP: 0060:[<c0319cd4>] Not tainted VLI
> EFLAGS: 00010a83 (2.6.12-rc6-mm1)
> EIP is at i2c_detach_client+0xb4/0x110
> eax: dfc0bcc0 ebx: c15fc26c ecx: c15fc264 edx: c04378d0
> esi: c15fc14c edi: c0437720 ebp: 00000000 esp: dff81f10
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 1, threadinfo=dff80000 task=c14dca00)
> Stack: dfff6110 dfc0bdb4 00000286 00000286 c15fc26c c15fc14c c15fc160
> ffffffed
> c031d512 c15fc160 c03edac1 c15fc26c 00000000 0000002d 00000001
> 0000002d c0437720 00000000 c0437c5c 00000001 00000000 c031b100
> 00000000 00000000
> Call Trace:
> [<c031d512>] asb100_detect+0x442/0x520
> [<c031b100>] i2c_detect+0x240/0x380
> [<c031d0d0>] asb100_detect+0x0/0x520
> [<c0319889>] i2c_add_driver+0x89/0xc0

I suspect you didn't "make oldconfig" before compiling 2.6.12-rc6-mm1.
You should have CONFIG_HWMON=y in .config, and I don't see it. Note that
I can't explain why it results in the BUG right above, but it must be
related.
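Checking a .config for a built-in option is a simple line match. The sketch below assumes the usual "NAME=y" line format; `config_is_builtin` is a hypothetical helper, and it deliberately treats "=m" and "# NAME is not set" as not built in.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if config_text contains a line that is exactly "<name>=y",
 * else 0. Hypothetical helper, for illustration only.
 */
static int config_is_builtin(const char *config_text, const char *name)
{
	size_t len = strlen(name);
	const char *line = config_text;

	while (line && *line) {
		/* Match the name, then "=y", then end of line or end of text. */
		if (strncmp(line, name, len) == 0 &&
		    strncmp(line + len, "=y", 2) == 0 &&
		    (line[len + 2] == '\n' || line[len + 2] == '\0'))
			return 1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return 0;
}
```

Requiring an exact match on the full option name keeps CONFIG_I2C from matching a line such as CONFIG_I2C_ISA=y.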

If "make oldconfig" doesn't help, try reverting:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/broken-out/gregkh-i2c-hwmon-03.patch

Thanks,
--
Jean Delvare

Andrew Morton

Jun 8, 2005, 4:10:14 PM

Andy Whitcroft <a...@shadowen.org> wrote:
>
> We've been seeing an early boot hang on IBM x-series (at least on an
> x440) with -rc6-mm1. Finally got hold of a box to go search for this
> and it seems that backing out the three patches below fixes it.
>
> 515 dmi-move-acpi-boot-quirk.patch
> 516 dmi-move-acpi-sleep-quirk.patch
> 517 dmi-remove-central-blacklist.patch

Thanks for taking the time to do that - it helps enormously.

The patches aren't terribly important - I'll drop them if nobody sees the
problem. It might be an incorrect __init/__initdata/etc marking. But that
wouldn't cause an "early" boot hang...

Andrew Morton

Jun 8, 2005, 5:40:11 PM

Andrew James Wade <ajw...@cpe00095b3131a0-cm0011ae8cd564.cpe.net.cable.rogers.com> wrote:
>
> 2.6.12-rc5-mm1 didn't crash.
>
> kernel BUG at include/linux/list.h:166!
> invalid operand: 0000 [#1]
> PREEMPT
> CPU: 0
> EIP: 0060:[<c0319cd4>] Not tainted VLI
> EFLAGS: 00010a83 (2.6.12-rc6-mm1)
> EIP is at i2c_detach_client+0xb4/0x110
> eax: dfc0bcc0 ebx: c15fc26c ecx: c15fc264 edx: c04378d0
> esi: c15fc14c edi: c0437720 ebp: 00000000 esp: dff81f10
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 1, threadinfo=dff80000 task=c14dca00)
> Stack: dfff6110 dfc0bdb4 00000286 00000286 c15fc26c c15fc14c c15fc160 ffffffed
> c031d512 c15fc160 c03edac1 c15fc26c 00000000 0000002d 00000001 0000002d
> c0437720 00000000 c0437c5c 00000001 00000000 c031b100 00000000 00000000
> Call Trace:
> [<c031d512>] asb100_detect+0x442/0x520

Were there no interesting printks before this BUG hit?

It's due to the kernel running list_del() on a list_head which isn't on a list.

Seems there is an error-path bug in that driver, but I don't think the fix
will fix it. Please test?
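The failure mode can be shown in userspace with a cut-down list. This is a sketch only: `struct list_head` here is a simplified stand-in for the kernel type, `checked_list_del` is a hypothetical helper, and it returns an error where the kernel check reported at include/linux/list.h:166 BUG()s instead.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Hypothetical checked delete: returns 0 on success, -1 if the
 * neighbours do not point back at entry, i.e. entry is not on a list.
 */
static int checked_list_del(struct list_head *entry)
{
	if (entry->prev == NULL || entry->next == NULL)
		return -1;
	if (entry->prev->next != entry || entry->next->prev != entry)
		return -1;
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	/* Mark the entry as off-list so a second delete is caught. */
	entry->next = NULL;
	entry->prev = NULL;
	return 0;
}
```

Deleting the same entry twice, or deleting one whose attach never completed, trips the check on the second call.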

From: Andrew Morton <ak...@osdl.org>

Fix error backing-out code in asb100.c

Cc: Greg KH <gr...@kroah.com>
Signed-off-by: Andrew Morton <ak...@osdl.org>
---

drivers/i2c/chips/asb100.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff -puN drivers/i2c/chips/asb100.c~asb100-fix drivers/i2c/chips/asb100.c
--- 25/drivers/i2c/chips/asb100.c~asb100-fix 2005-06-08 14:23:52.000000000 -0700
+++ 25-akpm/drivers/i2c/chips/asb100.c 2005-06-08 14:24:13.000000000 -0700
@@ -690,18 +690,20 @@ static int asb100_detect_subclients(stru
 	if ((err = i2c_attach_client(data->lm75[0]))) {
 		dev_err(&new_client->dev, "subclient %d registration "
 			"at address 0x%x failed.\n", i, data->lm75[0]->addr);
-		goto ERROR_SC_2;
+		goto ERROR_SC_3;
 	}

 	if ((err = i2c_attach_client(data->lm75[1]))) {
 		dev_err(&new_client->dev, "subclient %d registration "
 			"at address 0x%x failed.\n", i, data->lm75[1]->addr);
-		goto ERROR_SC_3;
+		goto ERROR_SC_4;
 	}

 	return 0;

 /* Undo inits in case of errors */
+ERROR_SC_4:
+	i2c_detach_client(data->lm75[1]);
 ERROR_SC_3:
 	i2c_detach_client(data->lm75[0]);
 ERROR_SC_2:
_

Andrew James Wade

Jun 8, 2005, 7:10:07 PM

On June 8, 2005 05:26 pm, Andrew Morton wrote:
> Were there no interesting printks before this BUG hit?

Nope :-(

> It's due to the kernel running list_del() on a list_head which isn't on a list.
>
> Seems there is an error-path bug in that driver, but I don't think the fix
> will fix it. Please test?

Will do. But I don't think that's it. I've been adding printks to determine the
execution path and it goes through the ERROR3 path in asb100_detect(), which means
AFAICT that the error path in asb100_detect_subclients() isn't taken:

ERROR3:
	i2c_detach_client(data->lm75[0]);
	kfree(data->lm75[1]);
	kfree(data->lm75[0]);
ERROR2:
	i2c_detach_client(new_client); // <--- BUG() in here.
ERROR1:
	kfree(data);
ERROR0:
	return err;

But the ERROR2 path does work despite the location of the bug. If I apply:

--- 2.6.12-rc6-mm1/drivers/i2c/chips/asb100.c 2005-06-08 17:46:02.123864000 -0400
+++ linux/drivers/i2c/chips/asb100.c 2005-06-08 17:59:21.461819500 -0400
@@ -811,6 +811,7 @@ static int asb100_detect(struct i2c_adap
 	if ((err = i2c_attach_client(new_client)))
 		goto ERROR1;

+	goto ERROR2;
 	/* Attach secondary lm75 clients */
 	if ((err = asb100_detect_subclients(adapter, address, kind,
 			new_client)))
@@ -874,7 +875,6 @@ static int asb100_detach_client(struct i
 {
 	int err;

-	hwmon_device_unregister(client->class_dev);

 	if ((err = i2c_detach_client(client))) {
 		dev_err(&client->dev, "client deregistration failed; "

No bug(). But the ERROR3 path doesn't:
--- 2.6.12-rc6-mm1/drivers/i2c/chips/asb100.c 2005-06-08 17:46:02.123864000 -0400
+++ linux/drivers/i2c/chips/asb100.c 2005-06-08 17:58:15.749712750 -0400
@@ -815,6 +815,7 @@ static int asb100_detect(struct i2c_adap
 	if ((err = asb100_detect_subclients(adapter, address, kind,
 			new_client)))
 		goto ERROR2;
+	goto ERROR3;

 	/* Initialize the chip */
 	asb100_init_client(new_client);
@@ -874,7 +875,6 @@ static int asb100_detach_client(struct i
 {
 	int err;

-	hwmon_device_unregister(client->class_dev);

 	if ((err = i2c_detach_client(client))) {
 		dev_err(&client->dev, "client deregistration failed; "

causes a BUG(). I've yet to track the problem down further. Unfortunately
I have no more time today, I'll play with it again tomorrow.
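The error paths discussed here use the goto-unwind ladder common in kernel code. A userspace sketch of the convention follows. It is not the asb100 code: `acquire`, `release`, `setup` and the step numbering are hypothetical, and `fail_at` only exists to simulate a failure at a chosen step.

```c
#include <assert.h>

/* Hypothetical resource tracker: bit i set means step i is held. */
static unsigned int held;

static int acquire(int step, int fail_at)
{
	if (step == fail_at)
		return -1;	/* simulated failure, nothing acquired */
	held |= 1u << step;
	return 0;
}

static void release(int step)
{
	held &= ~(1u << step);
}

/*
 * Three-step setup with a goto unwind ladder. A failure at step N
 * jumps to the label that releases steps N-1 down to 0, so nothing
 * that was never acquired gets released.
 */
static int setup(int fail_at)
{
	int err;

	if ((err = acquire(0, fail_at)))
		goto err0;
	if ((err = acquire(1, fail_at)))
		goto err1;
	if ((err = acquire(2, fail_at)))
		goto err2;
	return 0;

err2:
	release(1);
err1:
	release(0);
err0:
	return err;
}
```

A ladder is consistent if the label reached after a failure at step N undoes exactly the steps before N; undoing step N itself is the sort of mistake that shows up as a delete on something that was never added.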

Regards,
Andrew

Martin J. Bligh

Jun 8, 2005, 7:20:44 PM

--On Wednesday, June 08, 2005 13:01:17 -0700 Andrew Morton <ak...@osdl.org> wrote:

> Andy Whitcroft <a...@shadowen.org> wrote:
>>
>> We've been seeing an early boot hang on IBM x-series (at least on an
>> x440) with -rc6-mm1. Finally got hold of a box to go search for this
>> and it seems that backing out the three patches below fixes it.
>>
>> 515 dmi-move-acpi-boot-quirk.patch
>> 516 dmi-move-acpi-sleep-quirk.patch
>> 517 dmi-remove-central-blacklist.patch
>
> Thanks for taking the time to do that - it helps enormously.
>
> The patches aren't terribly important - I'll drop them if nobody sees the
> problem. It might be an incorrect __init/__initdata/etc marking. But that
> wouldn't cause an "early" boot hang...

That does indeed make it boot. However ... once it's booted it seems
to hit another problem, a hang condition ;-( I suspect it's unrelated.
The box is still up and responsive, but cp spins.

I'm still chasing the other boot/hang double problem (amd64), so can't
really look at this right now, but if anyone has any bright ideas they
want me to try, or wants more info, let me know (machine is still hung
in that state).

Some snippets:

ps -ef:

root 10980 10979 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/run test kernbench 32 5 -m 2
root 11060 10980 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/getsysinfo before /usr/local/autobench/logs/k
root 11219 11060 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/archive_dir /proc/scsi /usr/local/autobench/l
root 11221 11219 99 09:02 ? 04:13:26 cp -r /proc/scsi/aic7xxx /proc/scsi/device_info /proc/scsi/scsi /usr/local/autobench

alt+sysrq+t

getsysinfo S CB5260CC 0 11060 10980 11219 (NOTLB)
d5fc1f40 00000082 fffffe00 cb5260cc 00000000 c011259b 2691b900 003d08e4
 080fa558 00000001 d5fc1f38 c04715c0 c0473080 bfcb43b8 d740e000 cb526020
 00000001 cb526020 00000007 d5fc1fbc 0008b824 26cec200 003d08e4 c02fc928
Call Trace:
 [<c011259b>] do_page_fault+0x193/0x60f
 [<c011d584>] do_wait+0x2a4/0x358
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c011d6c6>] sys_wait4+0x26/0x38
 [<c011d6ee>] sys_waitpid+0x16/0x1a
 [<c0102a19>] syscall_call+0x7/0xb
archive_dir S CBB810CC 0 11219 11060 11221 (NOTLB)
d7793f40 00000082 fffffe00 cbb810cc 00000000 c011259b 28b70a00 003d08e4
 080fa158 00000001 d7793f38 c04715c0 c0473080 bfc51a68 c040e000 cbb81020
 00000001 cbb81020 00000007 d7793fbc 00000000 28b70a00 003d08e4 c02fc928
Call Trace:
 [<c011259b>] do_page_fault+0x193/0x60f
 [<c011d584>] do_wait+0x2a4/0x358
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c011d6c6>] sys_wait4+0x26/0x38
 [<c011d6ee>] sys_waitpid+0x16/0x1a
 [<c0102a19>] syscall_call+0x7/0xb
cp R running 0 11221 11219 (NOTLB)
sleep S D77A1F68 0 11906 1409 (NOTLB)
d77a1f58 00000086 0039a67c d77a1f68 bfade9d8 272d8698 b605a700 003d16b7
 d5c1e804 d6ecdbac d77a1f50 c04715c0 c0473080 d77a1fbc d6ecd814 d76d3020
 00000282 c0121f31 0039a67c c107d0e0 00000000 b605a700 003d16b7 d77a1f68
Call Trace:
 [<c0121f31>] lock_timer_base+0x19/0x3c
 [<c02ef4db>] schedule_timeout+0x7b/0x9c
 [<c0122904>] process_timeout+0x0/0xc
 [<c01229fb>] sys_nanosleep+0xdb/0x158
 [<c0102a19>] syscall_call+0x7/0xb
BUG: soft lockup detected on CPU#0!

Pid: 0, comm: swapper
EIP: 0060:[<c02efcd9>] CPU: 0
EIP is at _spin_unlock_irqrestore+0x5/0x8
 EFLAGS: 00000292 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: c03b9b84 EBX: c03b9ad4 ECX: 0a000000 EDX: 00000292
ESI: 00000074 EDI: c040ffa4 EBP: d5c16000 DS: 007b ES: 007b
CR0: 8005003b CR2: 080f9008 CR3: 16dd0300 CR4: 000006b0
 [<c020e729>] __handle_sysrq+0x121/0x128
 [<c020e74f>] handle_sysrq+0x1f/0x24
 [<c021dda4>] receive_chars+0x16c/0x270
 [<c021e0a2>] serial8250_interrupt+0x66/0xe4
 [<c01320f0>] handle_IRQ_event+0x28/0x58
 [<c0132203>] __do_IRQ+0xe3/0x134
 [<c0104b4b>] do_IRQ+0x1b/0x28
 [<c01033d6>] common_interrupt+0x1a/0x20
 [<c0100bb0>] default_idle+0x0/0x2c
 [<c0100bd3>] default_idle+0x23/0x2c
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c0410899>] start_kernel+0x19d/0x1a0

alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
doesn't seem to interact with the other NMI code well)

Command> break
SysRq : Show Regs

Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 0
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: c040e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7e3f5a0 CR3: 16dd0300 CR4: 000006b0
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c0410899>] start_kernel+0x19d/0x1a0
^M^@ Uhhuh. NMI received for unknown reason 00 on CPU 1.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 16.
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 3.
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 17.
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 16
^M^@EIP is at default_idle+0x23/0x2c
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 2.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 18.
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 19.
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7420000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@CR0: 8005003b CR2: 00000000 CR3: 17771800 CR4: 000006b0
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 6.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 20.
^M^@ start_secondary+0x13d/0x140
^M^@Dazed and confused, but trying to continue
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 18
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 10.
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7426000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f25d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 29.
^M^@ cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 2
^M^@ EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7400000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7edeb00 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 23.
^M^@ start_secondary+0x13d/0x140
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 7.
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 3
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7402000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@Do you have a strange power saving mode enabled?
^M^@CR0: 8005003b CR2: b7f95438 CR3: 17771800 CR4: 000006b0
^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 4.
^M^@ cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 17
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 5.
^M^@Dazed and confused, but trying to continue
^M^@EIP is at default_idle+0x23/0x2c
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 14.
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7424000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 9.
^M^@ cpu_idle+0x7b/0x8c
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 25.
^M^@ [<c010e79d>] start_secondary+0x13d/0x140Dazed and confused, but trying to continue
^M
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 19
^M^@ EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7428000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f30d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 13.
^M^@ start_secondary+0x13d/0x140
^M^@ Do you have a strange power saving mode enabled?
^M^@----------- IPI show regs -----------Uhhuh. NMI received for unknown reason 00 on CPU 8.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 11.
^M^@Dazed and confused, but trying to continue
^M^@Dazed and confused, but trying to continue
^M^@Dazed and confused, but trying to continue
^M
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 20
^M^@Dazed and confused, but trying to continue
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 22.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 26.
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ESI: d742a000 EDI: c0470300 EBP: c0470300Uhhuh. NMI received for unknown reason 00 on CPU 30.
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 21.
^M^@ [<c0100ca3>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ cpu_idle+0x7b/0x8c
^M^@Dazed and confused, but trying to continue
^M^@ [<c010e79d>]Do you have a strange power saving mode enabled?
^M^@ start_secondary+0x13d/0x140
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 27.
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 24.
^M^@ ----------- IPI show regs -----------
^M^@Pid: 11221, comm: cp
^M^@EIP: 0060:[<c02efbdc>] CPU: 5
^M^@Do you have a strange power saving mode enabled?
^M^@EIP is at _spin_lock_irqsave+0x14/0x20
^M^@ EFLAGS: 00000286 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@Dazed and confused, but trying to continue
^M^@EAX: 00000286 EBX: d6ce4800 ECX: c03cabe0 EDX: c049ba84
^M^@ESI: ffffffea EDI: d55f8000 EBP: d55f8000 DS: 007b ES: 007b
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@CR0: 80050033 CR2: bfc7d2fc CR3: 16dd02e0 CR4: 000006b0
^M^@ [<c0270377>]Uhhuh. NMI received for unknown reason 00 on CPU 31.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 12.
^M^@ ahc_linux_proc_info+0x27/0x212
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ [<c0149052>]Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ page_add_anon_rmap+0x62/0x68
^M^@ [<c0144358>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 15.
^M^@Uhhuh. NMI received for unknown reason 00 on CPU 28.
^M^@Dazed and confused, but trying to continue
^M^@ do_anonymous_page+0x1f0/0x21c
^M^@ [<c0144370>]Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@ do_anonymous_page+0x208/0x21c
^M^@Dazed and confused, but trying to continue
^M^@ [<c01443d9>]Do you have a strange power saving mode enabled?
^M^@Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@Do you have a strange power saving mode enabled?
^M^@ do_no_page+0x55/0x3e8
^M^@ [<c01372b5>] prep_new_page+0x49/0x50
^M^@ [<c0137973>] buffered_rmqueue+0x16f/0x1d0
^M^@ [<c0137e1b>] __alloc_pages+0x3bb/0x3c8
^M^@ [<c0257cdb>] proc_scsi_read+0x2b/0x44
^M^@ [<c0182f28>] proc_file_read+0xec/0x200
^M^@ [<c0152ff9>] vfs_read+0x91/0x12c
^M^@ [<c01532e4>] sys_read+0x40/0x6c
^M^@ [<c0102a19>] syscall_call+0x7/0xb
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 7
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d740c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 4
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7404000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 080f9c48 CR3: 17771320 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efcae>] CPU: 30
^M^@EIP is at _spin_lock+0xa/0x10
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: c1050aa0 EBX: c1050aa0 ECX: d7463ea8 EDX: 00000003
^M^@ESI: c10d9620 EDI: c10d9fe0 EBP: d7463eb0 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7eea900 CR3: 00474000 CR4: 000006b0
^M^@ [<c011583b>] load_balance+0xcf/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efb6a>] CPU: 15
^M^@EIP is at _spin_trylock+0x6/0x14
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
^M^@ESI: c10875a0 EDI: c1087f60 EBP: d741fe84 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
^M^@ [<c01157e4>] load_balance+0x78/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 21
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d742c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 14
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d741c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7e64070 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 27
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d745c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f66d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 8
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d740e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 080f133c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 25
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7436000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f74d9c CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c02efb6a>] CPU: 29
^M^@EIP is at _spin_trylock+0x6/0x14
^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000001 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0
^M^@ESI: c10d3ea0 EDI: c10d4860 EBP: d7461e84 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0114fda>] double_lock_balance+0x12/0x48
^M^@ [<c01157e4>] load_balance+0x78/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 31
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7464000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 24
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7434000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 10
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7412000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7ea6920 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 26
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7438000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c01154a3>] CPU: 13
^M^@EIP is at find_busiest_group+0x103/0x2f8
^M^@ EFLAGS: 00000086 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000005 EBX: 00000005 ECX: c1050aa0 EDX: 00000000
^M^@ESI: c04813ac EDI: 00000200 EBP: d741be7c DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7e7e070 CR3: 00474000 CR4: 000006b0
^M^@ [<c01157a2>] load_balance+0x36/0x170
^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104
^M^@ [<c0115d77>] scheduler_tick+0x97/0x318
^M^@ [<c01225b3>] update_process_times+0xef/0x100
^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 28
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d745e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c011e897>] CPU: 12
^M^@EIP is at __do_softirq+0x47/0x100
^M^@ EFLAGS: 00000006 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: c0470380 EBX: c0476020 ECX: 00000030 EDX: c1075ce0
^M^@ESI: 00000002 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f54000 CR3: 00474000 CR4: 000006b0
^M^@ [<c011e97f>] do_softirq+0x2f/0x34
^M^@ [<c011ea24>] irq_exit+0x34/0x38
^M^@ [<c010f601>] smp_apic_timer_interrupt+0xdd/0xe4
^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24
^M^@ [<c0100bb0>] default_idle+0x0/0x2c
^M^@ [<c0100bd3>] default_idle+0x23/0x2c
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 9
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7410000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f1d900 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 11
^M^@ EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7414000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 23
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7432000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 6
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7408000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 22
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: d7430000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ Dazed and confused, but trying to continue
^M^@Do you have a strange power saving mode enabled?
^M^@----------- IPI show regs -----------
^M^@Pid: 0, comm: swapper
^M^@EIP: 0060:[<c0100bd3>] CPU: 1
^M^@EIP is at default_idle+0x23/0x2c
^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
^M^@ESI: c13fc000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
^M^@CR0: 8005003b CR2: b7ee1d9c CR3: 17771640 CR4: 000006b0
^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c
^M^@ [<c010e79d>] start_secondary+0x13d/0x140
^M^@ ^M

Andrew Morton

Jun 8, 2005, 7:30:16 PM

"Martin J. Bligh" <mbl...@mbligh.org> wrote:
>
> alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
> doesn't seem to interact with the other NMI code well)

What patch?

Andrew Morton

Jun 8, 2005, 7:40:10 PM

Andrew James Wade <ajw...@cpe00095b3131a0-cm0011ae8cd564.cpe.net.cable.rogers.com> wrote:
>

> On June 8, 2005 05:26 pm, Andrew Morton wrote:
> > Were there no interesting printks before this BUG hit?
> Nope :-(
>
> > It's due to the kernel running list_del() on a list_head which isn't on a list.
> >
> > Seems there is an error-path bug in that driver, but I don't think the fix
> > will fix it. Please test?
> Will do. But I don't think that's it. I've been adding printks to determine the
> execution path and it goes through the ERROR3 path in asb100_detect(), which means
> AFAICT that the error path in asb100_detect_subclients() isn't taken:
>
> ERROR3:
> i2c_detach_client(data->lm75[0]);
> kfree(data->lm75[1]);
> kfree(data->lm75[0]);
> ERROR2:
> i2c_detach_client(new_client); // <--- BUG() in here.
> ERROR1:
> kfree(data);
> ERROR0:
> return err;

hm, the tree I have here doesn't do that. What kernel do you have there?

I suggest you work against
http://www.zip.com.au/~akpm/linux/patches/stuff/x.bz2 which is a patch
against 2.6.12-rc6 containing everybody's latest everything.
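
For readers following the list_del() remark quoted above, here is a minimal userspace sketch of that failure mode. It is not the kernel's code: the struct mirrors the kernel's struct list_head, but list_init(), list_entry_is_linked() and list_del_checked() are hypothetical helpers written for this illustration, and NULL stands in for the kernel's poison values. Where a debugging kernel would hit BUG(), the sketch simply returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Make head an empty circular list (hypothetical helper). */
void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* An entry is on a list only if both neighbours point back at it. */
bool list_entry_is_linked(const struct list_head *entry)
{
    return entry->next != NULL && entry->prev != NULL &&
           entry->next->prev == entry && entry->prev->next == entry;
}

/*
 * Unlink entry. Returns false, rather than corrupting memory or
 * calling BUG(), when the entry is not currently on a list.
 */
bool list_del_checked(struct list_head *entry)
{
    if (!list_entry_is_linked(entry))
        return false;
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = NULL;   /* simplified poisoning */
    entry->prev = NULL;
    return true;
}
```

Deleting a node that was never added, or deleting the same node twice, makes list_del_checked() return false; that is the situation an error path gets into when it detaches a client that was not attached.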

Martin J. Bligh

Jun 8, 2005, 7:40:10 PM

--On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <ak...@osdl.org> wrote:

> "Martin J. Bligh" <mbl...@mbligh.org> wrote:
>>
>> alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
>> doesn't seem to interact with the other NMI code well)
>
> What patch?

Sorry.

nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch

It does seem to work. But probably needs some cleanup for the NMI
errors.

Mark M. Hoffman

Jun 8, 2005, 11:50:04 PM

Hi Soren, et al.:

> On Wednesday 08 June 2005 02:53, Jean Delvare wrote:
> > If it doesn't work, please try reverting (in reverse order):
> > gregkh-i2c-hwmon-01.patch
> > gregkh-i2c-hwmon-02.patch
> > gregkh-i2c-hwmon-03.patch
> > i2c-chips-need-hwmon.patch
> > gregkh-i2c-hwmon-02-sparc64-fix.patch
> > and see how it goes.

* Søren Lott <sor...@gmail.com> [2005-06-08 04:08:04 -0300]:

> yeap, reverting these did the trick, all i2c entries in sysfs are back. :)

My bad. Although I will redo the hwmon patches soon anyway, here is a
patch that you can apply (after reapplying the above) that should get
you working again. BTW: I tested it on h/w almost identical to yours,
this time with the same relevant config options, against 2.6.12-rc5-mm1.
This applies to -rc6-mm1.

---------------

This patch fixes an init order bug between hwmon and i2c/chips,
without which many sensors drivers will not initialize properly
(in non-modular systems).

Signed-off-by: Mark M. Hoffman <mhof...@lightlink.com>

Index: linux-2.6.12-rc6-mm1/drivers/Makefile
===================================================================
--- linux-2.6.12-rc6-mm1.orig/drivers/Makefile
+++ linux-2.6.12-rc6-mm1/drivers/Makefile
@@ -53,8 +53,11 @@ obj-$(CONFIG_USB_GADGET) += usb/gadget/
obj-$(CONFIG_GAMEPORT) += input/gameport/
obj-$(CONFIG_INPUT) += input/
obj-$(CONFIG_I2O) += message/
-obj-$(CONFIG_I2C) += i2c/
+
+# most of i2c/chips depends on hwmon/
obj-$(CONFIG_HWMON) += hwmon/
+obj-$(CONFIG_I2C) += i2c/
+
obj-$(CONFIG_W1) += w1/
obj-$(CONFIG_PHONE) += telephony/
obj-$(CONFIG_MD) += md/

--
Mark M. Hoffman
mhof...@lightlink.com
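
A rough userspace illustration of the init order problem this patch addresses. It assumes that built-in initcalls of the same level run in link order, and that link order follows the order of the obj-y lines in the Makefile. Every name below (hwmon_class_ready, hwmon_init, chip_driver_init, run_initcalls) is hypothetical and only models the dependency; none of it is taken from the hwmon or i2c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*initcall_t)(void);

/* Models "the hwmon class has been registered" (hypothetical flag). */
bool hwmon_class_ready = false;

/* Provider: must run before any chip driver registers with it. */
int hwmon_init(void)
{
    hwmon_class_ready = true;
    return 0;
}

/* Consumer: fails if the provider has not been initialised yet. */
int chip_driver_init(void)
{
    return hwmon_class_ready ? 0 : -1;
}

/*
 * Run the calls in array order, the way built-in initcalls of one
 * level run in link order. Returns how many of them failed.
 */
int run_initcalls(const initcall_t *calls, size_t n)
{
    int failures = 0;

    hwmon_class_ready = false;   /* fresh "boot" */
    for (size_t i = 0; i < n; i++)
        if (calls[i]() != 0)
            failures++;
    return failures;
}
```

With the array ordered { chip_driver_init, hwmon_init }, as when i2c/ is listed before hwmon/, run_initcalls() reports one failure; with the two entries swapped it reports none. In a modular build the dependency is resolved at module load time instead, which would be consistent with the bug showing up only in non-modular systems.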

Andrey Panin

Jun 9, 2005, 12:30:14 AM

On June 8, 2005 at 03:22:57 +0100, Andy Whitcroft wrote:
> We've been seeing an early boot hang on IBM x-series (at least on an
> x440) with -rc6-mm1. Finally got hold of a box to go search for this
> and it seems that backing out the three patches below fixes it.
>
> 515 dmi-move-acpi-boot-quirk.patch
> 516 dmi-move-acpi-sleep-quirk.patch
> 517 dmi-remove-central-blacklist.patch
>
> I am pretty sure it is actually the first one (that's where my bisection
> search pointed) but I had to drop the other two to back it out. Anyhow,
> 2.6.12-rc6-mm1 boots on an x440 with these backed out.

Yeah, probably brown paper bag time... Please try the attached patch.

--
Andrey Panin | Linux and UNIX system administrator
pa...@donpac.ru | PGP key: wwwkeys.pgp.net

patch-stupid-dmi-bug

Kirill Korotaev

Jun 9, 2005, 3:20:11 AM

> --On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <ak...@osdl.org> wrote:
>
>
>>"Martin J. Bligh" <mbl...@mbligh.org> wrote:
>>
>>>alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
>>> doesn't seem to interact with the other NMI code well)
>>
>>What patch?
>
>
> Sorry.
>
> nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>
> It does seem to work. But probably needs some cleanup for the NMI
> errors.

If you let me know where the problem comes from, I can fix it and clean
it up.

Kirill

Jean Delvare

Jun 9, 2005, 4:00:22 AM

Hi Andrew, Andrew, all,

[Adding Mark M. Hoffman to the loop, as the author and recent modifier of
the asb100 driver.]

> From: Andrew Morton <ak...@osdl.org>
>
> Fix error backing-out code in asb100.c
>
> Cc: Greg KH <gr...@kroah.com>
> Signed-off-by: Andrew Morton <ak...@osdl.org>

> (...)
> --- 25/drivers/i2c/chips/asb100.c~asb100-fix
> +++ 25-akpm/drivers/i2c/chips/asb100.c

> @@ -690,18 +690,20 @@ static int asb100_detect_subclients(stru
> if ((err = i2c_attach_client(data->lm75[0]))) {
> dev_err(&new_client->dev, "subclient %d registration "
> "at address 0x%x failed.\n", i, data->lm75[0]->addr);
> - goto ERROR_SC_2;
> + goto ERROR_SC_3;
> }
>
> if ((err = i2c_attach_client(data->lm75[1]))) {
> dev_err(&new_client->dev, "subclient %d registration "
> "at address 0x%x failed.\n", i, data->lm75[1]->addr);
> - goto ERROR_SC_3;
> + goto ERROR_SC_4;
> }
>
> return 0;
>
> /* Undo inits in case of errors */
> +ERROR_SC_4:
> + i2c_detach_client(data->lm75[1]);
> ERROR_SC_3:
> i2c_detach_client(data->lm75[0]);
> ERROR_SC_2:

This patch looks broken to me, please discard it. You do not want to call
i2c_detach_client when the corresponding i2c_attach_client failed. The
original code was fine in that respect. I don't think there is any
problem in the asb100_detect_subclients() function.

I do however think that there is a problem in the asb100_detect()
function, where a call to i2c_detach_client() is missing:

ERROR3:
i2c_detach_client(data->lm75[1]); <-- HERE
i2c_detach_client(data->lm75[0]);
kfree(data->lm75[1]);
kfree(data->lm75[0]);

If we take that error path, it means that both subclients have been
successfully attached, thus need proper detach.
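
The unwinding rule described above (undo only what succeeded, and undo all of it) can be sketched in user space. This is a minimal illustration, not the asb100 code: `struct client`, `attach_client()`, `detach_client()`, `detect()` and the `fail_attach` knob are hypothetical stand-ins, the two functions from the thread are flattened into one, and the final failing step merely stands in for whatever fails after both subclients are attached.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an i2c client; not the real kernel type. */
struct client {
    bool attached;
    bool fail_attach;   /* knob for the sketch: force attach to fail */
};

static int attach_client(struct client *c)
{
    if (c->fail_attach)
        return -1;      /* a failed attach leaves nothing to undo */
    c->attached = true;
    return 0;
}

static void detach_client(struct client *c)
{
    /* Detaching a client whose attach failed is the mistake discussed above. */
    assert(c->attached);
    c->attached = false;
}

/*
 * Attach the main client and two subclients, then run a last step that
 * may fail. Each label undoes only the steps that completed earlier.
 */
static int detect(struct client *main_c, struct client *sub0,
                  struct client *sub1, bool final_step_fails)
{
    int err;

    if ((err = attach_client(main_c)))
        goto error0;
    if ((err = attach_client(sub0)))
        goto error2;            /* sub0 never attached: skip its detach */
    if ((err = attach_client(sub1)))
        goto error3_partial;    /* only sub0 needs undoing */
    if (final_step_fails) {
        err = -2;
        goto error3;            /* both subclients are attached here */
    }
    return 0;

error3:
    detach_client(sub1);        /* the call that was missing */
error3_partial:
    detach_client(sub0);
error2:
    detach_client(main_c);
error0:
    return err;
}
```

Calling `detect()` with `final_step_fails` set walks the full ladder and leaves all three clients detached; dropping the `detach_client(sub1)` line would leave `sub1` marked attached on that path.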

The reason why the bug triggered on Andrew (James Wade) is probably that
hwmon_device_register() failed, due to an order problem in a Makefile.
See http://lkml.org/lkml/2005/6/8/338, which has an explanation and a
patch fixing it (I think).

This still doesn't explain why the error path triggers the BUG(), and
although applying the aforementioned patch will probably get the driver
working, I'd really like to understand what's going on there.

Thanks,
--
Jean Delvare

Jean Delvare

Jun 9, 2005, 4:10:07 AM

Hi Andrew,

> > Will do. But I don't think that's it. I've been adding printks to
> > determine the execution path and it goes through the ERROR3 path in
> > asb100_detect(), which means AFACT that the error path in
> > asb100_detect_subclients() isn't taken:
> >
> > ERROR3:
> > i2c_detach_client(data->lm75[0]);
> > kfree(data->lm75[1]);
> > kfree(data->lm75[0]);
> > ERROR2:
> > i2c_detach_client(new_client); // <--- BUG() in here.
> > ERROR1:
> > kfree(data);
> > ERROR0:
> > return err;
>
> hm, the tree I have here doesn't do that. What kernel do you have there?

I suspect that the bug will only show when the i2c-core and asb100
drivers (and the relevant i2c bus driver) are built into the kernel.
(See my previous post.)

Thanks,
--
Jean Delvare

Andrew James Wade

Jun 9, 2005, 7:10:11 AM

On June 9, 2005 03:47 am, Jean Delvare wrote:
> The reason why the bug triggered on Andrew (James Wade) is probably that
> hwmon_device_register() failed, due to an order problem in a Makefile.
> See http://lkml.org/lkml/2005/6/8/338, which has an explanation and a
> patch fixing it (I think).

Yup, the kernel now boots.

> This still doesn't explain why the error path triggers the BUG(), and
> although applying the aforementioned patch will probably get the driver
> working, I'd really like to understand what's going on there.

Ok, I'll keep playing around with the kernel to see what I can find out.

(and I'll take a look at
http://www.zip.com.au/~akpm/linux/patches/stuff/x.bz2 as Andrew Morton
suggested)

Thanks,
Andrew

Andy Whitcroft

Jun 9, 2005, 9:30:21 AM

Andrey Panin wrote:

> Yeah, probably brown paper bag time... Please try the attached patch.

Ok. I can confirm that linux-2.6.12-rc6-mm1 + just this fix boots fine
and works. And yes I said works? I can't understand why backing the
others out left us with the odd spin hang and this combination doesn't.
I've managed to run 4 sets of boot and kernbench (10 runs) without a hang.

/me feels there is something else ugly in here we don't want but
unrelated to this patch.

Andrew James Wade

Jun 9, 2005, 9:40:05 AM

Mystery solved.

ERROR3:
i2c_detach_client(data->lm75[1]); <-- HERE
i2c_detach_client(data->lm75[0]);
kfree(data->lm75[1]);
kfree(data->lm75[0]);

The missing i2c_detach_client call meant that data->lm75[1] was still on
the list of i2c devices when it was freed. This was corrupting the list.
The ERROR3 path now works on my kernel.
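
Why freeing a still-linked entry corrupts the list can be seen from a small user-space model. This is only a sketch under assumptions: `struct node` and the helpers below are illustrative names loosely modelled on the kernel's circular doubly linked lists, not the i2c core's actual data structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list; illustrative, not kernel API. */
struct node {
    struct node *prev, *next;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail_node(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_unlink(struct node *n)
{
    /* Both neighbours must still point at n, or the list is already bad. */
    assert(n->prev->next == n && n->next->prev == n);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static size_t list_len(const struct node *head)
{
    size_t len = 0;

    for (const struct node *p = head->next; p != head; p = p->next)
        len++;
    return len;
}

/*
 * Correct teardown: unlink first (the "detach"), then free. Calling
 * free() alone would leave the neighbours pointing into freed memory,
 * so the next walk or insertion would touch garbage.
 */
static void client_destroy(struct node *n)
{
    list_unlink(n);
    free(n);
}
```

After `client_destroy()` the remaining entries still form a closed ring, which is the property the missing detach call broke.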

Thanks for your help.
Andrew

Martin J. Bligh

Jun 9, 2005, 9:50:11 AM

--Kirill Korotaev <d...@sw.ru> wrote (on Thursday, June 09, 2005 11:17:43 +0400):

>> --On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <ak...@osdl.org> wrote:
>>
>>
>>> "Martin J. Bligh" <mbl...@mbligh.org> wrote:
>>>
>>>> alt+sysrq+p does wierd stuff (is that new patch in your tree Andrew?
>>>> doesn't seem to inter-react with the other NMI code well)
>>>
>>> What patch?
>>
>>
>> Sorry.
>>
>> nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>>
>> It does seem to work. But probably needs some cleanup for the NMI
>> errors.
> If you give me to know where the problem come from I can fix it and make a cleanup.

It gets a lot of the "dazed and confused" errors. Possibly you just need
to disable that part of the handler?


Jean Delvare

Jun 9, 2005, 12:00:16 PM

Hi Andrew,

> Mystery solved.
>
> ERROR3:
> i2c_detach_client(data->lm75[1]); <-- HERE
> i2c_detach_client(data->lm75[0]);
> kfree(data->lm75[1]);
> kfree(data->lm75[0]);
>
> The missing i2c_detach_client call meant that data->lm75[1] was still
> on the list of i2c devices when it was freed. This was corrupting the
> list. The ERROR3 path now works on my kernel.

Oh my, I had it right under my nose and didn't see it ;) Thanks for the
clarification.

Greg, please apply the following patch on top of the hwmon patches until
Mark submits an updated version of the whole thing.

----------------------------------

Fix a broken error path in the asb100 driver.

Signed-off-by: Jean Delvare <kh...@linux-fr.org>

--- linux-2.6.12-rc6/drivers/i2c/chips/asb100.c.orig Wed Jun 8 09:47:53 2005
+++ linux-2.6.12-rc6/drivers/i2c/chips/asb100.c Thu Jun 9 11:58:34 2005
@@ -859,6 +859,7 @@
return 0;

ERROR3:
+ i2c_detach_client(data->lm75[1]);
i2c_detach_client(data->lm75[0]);
kfree(data->lm75[1]);
kfree(data->lm75[0]);

--
Jean Delvare

Greg KH

Jun 10, 2005, 2:10:04 AM

Hm, what tree is this against? Am I missing some in-between patch here?

thanks,

greg k-h

Jean Delvare

Jun 10, 2005, 3:30:14 AM

Hi Greg,

> > --- linux-2.6.12-rc6/drivers/i2c/chips/asb100.c.orig
> > +++ linux-2.6.12-rc6/drivers/i2c/chips/asb100.c

> > @@ -859,6 +859,7 @@
> > return 0;
> >
> > ERROR3:
> > + i2c_detach_client(data->lm75[1]);
> > i2c_detach_client(data->lm75[0]);
> > kfree(data->lm75[1]);
> > kfree(data->lm75[0]);
>
> Hm, what tree is this against? Am I missing some inbetween patch here?

2.6.12-rc6-mm1, but that was a fix to Mark's hwmon patches, which you
just backed out from your tree - so this fix is no longer needed (and
should unsurprisingly fail to apply).

Thanks,
--
Jean Delvare

Kirill Korotaev

Jun 10, 2005, 8:20:09 AM

>>>>>alt+sysrq+p does wierd stuff (is that new patch in your tree Andrew?
>>>>>doesn't seem to inter-react with the other NMI code well)
>>>>
>>>>What patch?
>>>
>>>
>>>Sorry.
>>>
>>>nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>>>
>>>It does seem to work. But probably needs some cleanup for the NMI
>>>errors.
>>
>>If you give me to know where the problem come from I can fix it and make a cleanup.
>
>
> It gets a lot of the "dazed and confused" errors. Possibly you just need
> to disable that part of the handler?

Can you try this cleanup patch?
This fixes the problem for me, though I do not like the way it does so
very much...

Kirill

altsysrq-p-cleanup

Benoit Boissinot

Jun 11, 2005, 8:00:12 AM

On 6/7/05, Andrew Morton <ak...@osdl.org> wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
>
> - Added v9fs
>
> - Various random fixes
>
> - Probably a similar number of breakages
>
I just had the following Oopses:

Unable to handle kernel paging request at virtual address 901a1960
printing eip:
c0139251
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: radeon drm tun snd_seq snd_pcm_oss snd_mixer_oss
snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ipt_multiport
ipt_state ipt_limit ipt_MASQUERADE ipt_mark iptable_mangle ipt_MARK
ipt_REJECT iptable_filter iptable_nat ip_tables ip_conntrack_irc
ip_conntrack_ftp ip_conntrack skge 8139too mii usbcore ide_cd cdrom
CPU: 0
EIP: 0060:[<c0139251>] Not tainted VLI
EFLAGS: 00010086 (2.6.12-rc6-mm1-arakou)
EIP is at find_lock_page+0x21/0xb0
eax: 901a195c ebx: 901a195c ecx: d8a3b094 edx: 00000003
esi: 00109380 edi: c18e4b08 ebp: d822cb10 esp: d822cb00

ds: 007b es: 007b ss: 0068

Process emerge (pid: 31977, threadinfo=d822c000 task=cbb9d040)
Stack: c18e4b04 c1218060 00000000 00000050 d822cb34 c013930e 00000050 00109380
c18e4b04 c0333d04 00109380 c18e4a00 00001000 d822cb50 c0157986 d822cb50
00109380 00109380 00001000 c18e4a00 d822cb70 c0157af5 00001000 d822cb70
Call Trace:
[<c0103d17>] show_stack+0x97/0xd0
[<c0103ec5>] show_registers+0x155/0x1f0
[<c01040c1>] die+0xc1/0x140
[<c01157ec>] do_page_fault+0x23c/0x6b5
[<c010395f>] error_code+0x4f/0x54
[<c013930e>] find_or_create_page+0x2e/0xd0
[<c0157986>] grow_dev_page+0x26/0x110
[<c0157af5>] __getblk_slow+0x85/0x130
[<c0157e8b>] __getblk+0x3b/0x50
[<c01a788b>] search_by_key+0x9b/0xf40
[<c0195095>] reiserfs_read_locked_inode+0x65/0x110
[<c01951e9>] reiserfs_iget+0x79/0xa0
[<c0190330>] reiserfs_lookup+0xd0/0x130
[<c0161f80>] real_lookup+0xb0/0xd0
[<c01622be>] do_lookup+0x7e/0x90
[<c0162a06>] __link_path_walk+0x736/0xd50
[<c016306a>] link_path_walk+0x4a/0x110
[<c01633b4>] path_lookup+0x74/0x120
[<c01635ee>] __user_walk+0x2e/0x50
[<c015e240>] vfs_stat+0x20/0x50
[<c015e834>] sys_stat64+0x14/0x30
[<c0102e0f>] sysenter_past_esp+0x54/0x75
Code: c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 57 56 89 d6 53 83 ec 04
89 45 f0 fa 8d 78 04 89 f2 89 f8 e8 35 04 0b 00 85 c0 89 c3 74 56 <ff>
40 04 0f ba 28 00 19 c0 85 c0 74 49 fb 0f ba 2b 00 19 c0 85

<1>Unable to handle kernel paging request at virtual address 71ef2710
printing eip:
c0157140
*pde = 00000000
Oops: 0000 [#2]
Modules linked in: radeon drm tun snd_seq snd_pcm_oss snd_mixer_oss
snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ipt_multiport
ipt_state ipt_limit ipt_MASQUERADE ipt_mark iptable_mangle ipt_MARK
ipt_REJECT iptable_filter iptable_nat ip_tables ip_conntrack_irc
ip_conntrack_ftp ip_conntrack skge 8139too mii usbcore ide_cd cdrom
CPU: 0
EIP: 0060:[<c0157140>] Not tainted VLI
EFLAGS: 00010a16 (2.6.12-rc6-mm1-arakou)
EIP is at __find_get_block_slow+0x90/0x140
eax: 00000000 ebx: 71ef26fc ecx: cb96f0e7 edx: 00000001
esi: c1309f20 edi: 000f9df5 ebp: e35d5b98 esp: e35d5b74

ds: 007b es: 007b ss: 0068

Process vim (pid: 32081, threadinfo=e35d5000 task=c2cc55b0)
Stack: df7fb6bc f6de8a4c f7cf12fc f6de8cec 00000002 c18e4584 dcb43d7c c18e4520
000f9df5 e35d5bac c0157e1c 00001000 000f9df5 c18e4520 e35d5bc0 c0157e6c
00003e94 0000003e 000f9df5 e35d5ce0 c01a788b 0000001e 0000001f e35d5bf0
Call Trace:
[<c0103d17>] show_stack+0x97/0xd0
[<c0103ec5>] show_registers+0x155/0x1f0
[<c01040c1>] die+0xc1/0x140
[<c01157ec>] do_page_fault+0x23c/0x6b5
[<c010395f>] error_code+0x4f/0x54
[<c0157e1c>] __find_get_block+0x6c/0xa0
[<c0157e6c>] __getblk+0x1c/0x50
[<c01a788b>] search_by_key+0x9b/0xf40
[<c018fc2c>] search_by_entry_key+0x1c/0x1f0
[<c01901e0>] reiserfs_find_entry+0x90/0x110
[<c01902d2>] reiserfs_lookup+0x72/0x130
[<c0161f80>] real_lookup+0xb0/0xd0
[<c01622be>] do_lookup+0x7e/0x90
[<c0162a06>] __link_path_walk+0x736/0xd50
[<c016306a>] link_path_walk+0x4a/0x110
[<c01633b4>] path_lookup+0x74/0x120
[<c0163a09>] open_namei+0x79/0x5f0
[<c0154c29>] filp_open+0x29/0x50
[<c0154fac>] sys_open+0x3c/0xc0
[<c0102e0f>] sysenter_past_esp+0x54/0x75
Code: 89 f0 e8 34 b8 fe ff 89 d8 83 c4 18 5b 5e 5f c9 c3 8b 06 f6 c4
08 0f 84 a4 00 00 00 8b 5e 0c ba 01 00 00 00 89 d9 90 8d 74 26 00 <3b>
7b 14 74 7b 8b 03 8b 5b 04 a8 10 b8 00 00 00 00 0f 44 d0 39

Bad page state at free_hot_cold_page (in process 'firefox-bin', page c1309360)
flags:0x40000000 mapping:00000000 mapcount:-1 count:0
Backtrace:
[<c0103d67>] dump_stack+0x17/0x20
[<c013cb52>] bad_page+0x72/0xb0
[<c013d2da>] free_hot_cold_page+0x4a/0xe0
[<c013da81>] __pagevec_free+0x31/0x40
[<c0142a9d>] release_pages+0x9d/0x150
[<c0142b68>] __pagevec_release+0x18/0x30
[<c01430bb>] truncate_inode_pages_range+0x13b/0x300
[<c014329a>] truncate_inode_pages+0x1a/0x20
[<c016d8e2>] generic_delete_inode+0xb2/0xd0
[<c016da1f>] generic_drop_inode+0xf/0x20
[<c016da92>] iput+0x62/0x90
[<c016494f>] sys_unlink+0xdf/0x110
[<c0102e0f>] sysenter_past_esp+0x54/0x75
Trying to fix it up, but a reboot is needed

regards,

Benoit Boissinot

Andrew Morton

Jun 18, 2005, 6:50:09 PM

Richard Purdie <rpu...@rpsys.net> wrote:
>
> On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > +git-arm-smp.patch
> >
> > ARM git trees
>
> The arm pxa255 based Zaurus won't resume from a suspend with the patches
> from the above tree applied. The suspend looks normal and gets at least
> as far as pxa_pm_enter(). After that, the device appears to be dead and
> needs a battery removal to reset. I'm unsure if it actually suspends and
> is failing to resume or is crashing in the latter suspend stages.
>
> Is there some documentation on what the above patch is aiming to do
> anywhere?

Did you apply just that patch, or are you talking about the whole -mm lineup?

If the latter, please test with only git-arm-smp.patch.

Richard Purdie

Jun 18, 2005, 6:50:06 PM

On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> +git-arm-smp.patch
>
> ARM git trees

The arm pxa255 based Zaurus won't resume from a suspend with the patches
from the above tree applied. The suspend looks normal and gets at least
as far as pxa_pm_enter(). After that, the device appears to be dead and
needs a battery removal to reset. I'm unsure if it actually suspends and
is failing to resume or is crashing in the latter suspend stages.

Is there some documentation on what the above patch is aiming to do
anywhere?

Richard

Richard Purdie

Jun 18, 2005, 7:00:16 PM

On Sat, 2005-06-18 at 15:44 -0700, Andrew Morton wrote:
> Richard Purdie <rpu...@rpsys.net> wrote:
> >
> > On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > > +git-arm-smp.patch
> > >
> > > ARM git trees
> >
> > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > from the above tree applied. The suspend looks normal and gets at least
> > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > needs a battery removal to reset. I'm unsure if it actually suspends and
> > is failing to resume or is crashing in the latter suspend stages.
> >
> > Is there some documentation on what the above patch is aiming to do
> > anywhere?
>
> Did you apply just that patch, or are you talking about the whole -mm lineup?
>
> If the latter, please test with only git-arm-smp.patch.

Sorry, I wasn't clear. I had problems with the -mm lineup and tracked it
down to the above patch. With the above patch removed, -mm works fine.

(I know there's a number of changes to the arm pxa suspend/resume code
in git-arm.patch but they're definitely not causing the problem.)

Richard

Richard Purdie

Jun 18, 2005, 7:20:06 PM

On Sat, 2005-06-18 at 23:57 +0100, Richard Purdie wrote:
> On Sat, 2005-06-18 at 15:44 -0700, Andrew Morton wrote:
> > > > +git-arm-smp.patch
> > > > ARM git trees
> > >
> > > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > > from the above tree applied. The suspend looks normal and gets at least
> > > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > > needs a battery removal to reset. I'm unsure if it actually suspends and
> > > is failing to resume or is crashing in the latter suspend stages.
> > >
> > > Is there some documentation on what the above patch is aiming to do
> > > anywhere?
> >
> > Did you apply just that patch, or are you talking about the whole -mm lineup?
> >
> > If the latter, please test with only git-arm-smp.patch.
>
> Sorry, I wasn't clear. I had problems with the -mm lineup and tracked it
> down to the above patch. With the above patch removed, -mm works fine.
>
> (I know there's a number of changes to the arm pxa suspend/resume code
> in git-arm.patch but they're definitely not causing the problem.)

I meant to add that git-arm-smp.patch breaks suspend/resume, even
applied in isolation against 2.6.12-rc6.

Russell King

Jun 18, 2005, 7:30:13 PM

On Sat, Jun 18, 2005 at 11:39:18PM +0100, Richard Purdie wrote:
> On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > +git-arm-smp.patch
> >
> > ARM git trees
>
> The arm pxa255 based Zaurus won't resume from a suspend with the patches
> from the above tree applied. The suspend looks normal and gets at least
> as far as pxa_pm_enter(). After that, the device appears to be dead and
> needs a battery removal to reset. I'm unsure if it actually suspends and
> is failing to resume or is crashing in the latter suspend stages.

<grumble>Well, it's a bit late for this since (a) stuff has rapidly
moved on at rmk towers since 2.6.12 was released this morning, and
(b) I've just asked Linus to pull this.</grumble>

Thinking about what's probably happening, I suspect all the ARM suspend
and resume code needs to be reworked to save more state. I'll try to
cook up a patch tomorrow to fix it, but I'll need you to provide
feedback.

Please note that you may see other ARM breakage over the next month
or so - I'm going to be concentrating on merging ARM SMP support,
and whatever bashing other people like yourself can give the kernel
will help ensure that problems are picked up quickly.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

Richard Purdie

Jun 18, 2005, 9:30:09 PM

On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> On Sat, Jun 18, 2005 at 11:39:18PM +0100, Richard Purdie wrote:
> > On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > > +git-arm-smp.patch
> > >
> > > ARM git trees
> >
> > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > from the above tree applied. The suspend looks normal and gets at least
> > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > needs a battery removal to reset. I'm unsure if it actually suspends and
> > is failing to resume or is crashing in the latter suspend stages.
>
> <grumble>Well, its a bit late for this since (a) stuff has rapidly
> moved on at rmk towers since 2.6.12 was released this morning, and
> (b) I've just asked Linus to pull this.</grumble>

Please don't underestimate the time it takes to wade through all the
patches in the -mm tree, find the one causing the breakage, investigate
the patch and report it to the person concerned. I'm doing the Zaurus
work in my spare time and don't get paid for it. Just reflashing and
booting a new kernel probably takes ~15mins on the Zaurus. The
copy/clearpage problem took a complete weekend to track down (as it was
showing up randomly) and then needed further evenings to debug your
patch which is a large chunk of my free time. The Checked-By: line
didn't quite give the full picture.

I realise it's taken me a while to find enough time to test/debug this
kernel but at least you now know there's a problem...

> Thinking about what's probably happening, I suspect all the ARM suspend
> and resume code needs to be reworked to save more state. I'll try to
> cook up a patch tomorrow to fix it, but I'll need you to provide
> feedback.

Ok, thanks. I'm happy to test any fixes/patches.

> Please note that you may see other ARM breakage over the next month
> or so - I'm going to be concentrating on merging ARM SMP support,
> and whatever bashing other people like yourself can give the kernel
> will help ensure that problems are picked up quickly.

In order to assist with that, can you publish these patches somewhere?
That way, I can apply them against a known good Zaurus kernel tree and
know straight away if they break anything (diff/patch format would be
preferable as my Zaurus trees are all patch based).

On a positive note, something in the later 2.6.12-rc kernels has made a
massive difference to the speed on the Zaurus - I suspect the removal of
the preempt locks on copy/clearpage. It boots up ~1.5x faster and the
speed gain will make a lot of people very happy :)

Richard

Russell King

Jun 19, 2005, 5:10:14 AM

On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > Thinking about what's probably happening, I suspect all the ARM suspend
> > and resume code needs to be reworked to save more state. I'll try to
> > cook up a patch tomorrow to fix it, but I'll need you to provide
> > feedback.
>
> Ok, thanks. I'm happy to test any fixes/patches.

This should resolve the problem - we now rely on the stack pointer for
each CPU mode to remain constant throughout the running time of the
kernel, which includes across suspend/resume cycles.

--- a/arch/arm/mach-pxa/sleep.S
+++ b/arch/arm/mach-pxa/sleep.S
@@ -38,6 +38,16 @@ ENTRY(pxa_cpu_suspend)
#endif
stmfd sp!, {r2 - r12, lr} @ save registers on stack

+ @ preserve IRQ, abort and undefined mode stack pointers
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov r4, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov r5, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov r6, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+ stmfd sp!, {r4 - r6}
+
@ get coprocessor registers
mrc p14, 0, r3, c6, c0, 0 @ clock configuration, for turbo mode
mrc p15, 0, r4, c15, c1, 0 @ CP access reg
@@ -229,6 +239,17 @@ resume_after_mmu:
#ifdef CONFIG_XSCALE_CACHE_ERRATA
bl cpu_xscale_proc_init
#endif
+
+ @ restore IRQ, abort and undefined mode stack pointers
+ ldmfd sp!, {r4 - r6}
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov sp, r4
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov sp, r5
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov sp, r6
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+
ldmfd sp!, {r2, r3}
#ifndef CONFIG_IWMMXT
mar acc0, r2, r3
--- a/arch/arm/mach-sa1100/sleep.S
+++ b/arch/arm/mach-sa1100/sleep.S
@@ -37,6 +37,16 @@ ENTRY(sa1100_cpu_suspend)

stmfd sp!, {r4 - r12, lr} @ save registers on stack

+ @ preserve IRQ, abort and undefined mode stack pointers
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov r4, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov r5, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov r6, sp
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+ stmfd sp!, {r4 - r6}
+
@ get coprocessor registers
mrc p15, 0, r4, c3, c0, 0 @ domain ID
mrc p15, 0, r5, c2, c0, 0 @ translation table base addr
@@ -210,6 +220,17 @@ sleep_save_sp:
.text
resume_after_mmu:
mcr p15, 0, r1, c15, c1, 2 @ enable clock switching
+
+ @ restore IRQ, abort and undefined mode stack pointers
+ ldmfd sp!, {r4 - r6}
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE
+ mov sp, r4
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE
+ mov sp, r5
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE
+ mov sp, r6
+ msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
+
ldmfd sp!, {r4 - r12, pc} @ return to caller

> > Please note that you may see other ARM breakage over the next month
> > or so - I'm going to be concentrating on merging ARM SMP support,
> > and whatever bashing other people like yourself can give the kernel
> > will help ensure that problems are picked up quickly.
>
> In order to assist with that, can you publish these patches somewhere?
> That way, I can apply them against a known good Zaurus kernel tree and
> know straight away if they break anything (diff/patch format would be
> preferable as my Zaurus trees are all patch based).

I'll see what I can do, but I'm going to be working fairly rapidly on
merging this, so expect roughly a patch each day. Hopefully though,
the later patches will only affect the Integrator platform.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

Russell King

Jun 19, 2005, 5:20:07 AM

On Sun, Jun 19, 2005 at 10:02:26AM +0100, Russell King wrote:
> On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> > On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > > Thinking about what's probably happening, I suspect all the ARM suspend
> > > and resume code needs to be reworked to save more state. I'll try to
> > > cook up a patch tomorrow to fix it, but I'll need you to provide
> > > feedback.
> >
> > Ok, thanks. I'm happy to test any fixes/patches.
>
> This should resolve the problem - we now rely on the stack pointer for
> each CPU mode to remain constant throughout the running time of the
> kernel, which includes across suspend/resume cycles.

Actually, this patch is probably an all-round better solution.

--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -328,7 +328,7 @@ static void __init setup_processor(void)
* cpu_init dumps the cache information, initialises SMP specific
* information, and sets up the per-CPU stacks.
*/
-void __init cpu_init(void)
+void cpu_init(void)
{
unsigned int cpu = smp_processor_id();
struct stack *stk = &stacks[cpu];
--- a/arch/arm/mach-pxa/pm.c
+++ b/arch/arm/mach-pxa/pm.c
@@ -133,6 +133,8 @@ static int pxa_pm_enter(suspend_state_t
/* *** go zzz *** */
pxa_cpu_pm_enter(state);

+ cpu_init();
+
/* after sleeping, validate the checksum */
checksum = 0;
for (i = 0; i < SLEEP_SAVE_SIZE - 1; i++)
--- a/arch/arm/mach-sa1100/pm.c
+++ b/arch/arm/mach-sa1100/pm.c
@@ -88,6 +88,8 @@ static int sa11x0_pm_enter(suspend_state
/* go zzz */
sa1100_cpu_suspend();

+ cpu_init();
+
/*
* Ensure not to come back here if it wasn't intended
*/
--- a/include/asm-arm/system.h
+++ b/include/asm-arm/system.h
@@ -104,6 +104,7 @@ extern void show_pte(struct mm_struct *m
extern void __show_regs(struct pt_regs *);

extern int cpu_architecture(void);
+extern void cpu_init(void);

#define set_cr(x) \
__asm__ __volatile__( \
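
The difference between the two patches in this thread (save and restore the per-mode stack pointers versus simply re-running cpu_init() after wake-up) can be modelled in plain C. This is a rough user-space sketch under assumptions: `struct cpu`, `model_cpu_init()`, `model_power_cycle()` and the other names are hypothetical, and the model simply assumes that the banked stack pointers do not survive sleep, which is what the fixes above imply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum mode { MODE_IRQ, MODE_ABT, MODE_UND, MODE_COUNT };

/* Hypothetical model of a core with one banked SP per exception mode. */
struct cpu {
    uintptr_t banked_sp[MODE_COUNT];
};

/* Small per-mode exception stacks, loosely mirroring the kernel's layout. */
static uint32_t stacks[MODE_COUNT][3];

/* Analogue of cpu_init(): point each banked SP at its stack.
 * The result depends only on fixed addresses, so calling it again is safe. */
static void model_cpu_init(struct cpu *cpu)
{
    assert(cpu != NULL);
    for (int m = 0; m < MODE_COUNT; m++)
        cpu->banked_sp[m] = (uintptr_t)&stacks[m][0];
}

/* Assumption of the model: sleep wipes the banked registers. */
static void model_power_cycle(struct cpu *cpu)
{
    memset(cpu->banked_sp, 0, sizeof cpu->banked_sp);
}

/* First approach: copy the SPs to memory before sleep, restore after. */
static void suspend_save_restore(struct cpu *cpu)
{
    uintptr_t saved[MODE_COUNT];

    memcpy(saved, cpu->banked_sp, sizeof saved);
    model_power_cycle(cpu);
    memcpy(cpu->banked_sp, saved, sizeof saved);
}

/* Second approach: recompute the SPs after wake-up. */
static void suspend_reinit(struct cpu *cpu)
{
    model_power_cycle(cpu);
    model_cpu_init(cpu);
}

/* Returns 1 if every banked SP points at its expected stack. */
static int stacks_valid(const struct cpu *cpu)
{
    for (int m = 0; m < MODE_COUNT; m++)
        if (cpu->banked_sp[m] != (uintptr_t)&stacks[m][0])
            return 0;
    return 1;
}
```

Both paths end with valid stack pointers in this model; the re-init variant needs no per-platform save area, which may be why it was described as the better all-round solution.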

Richard Purdie

Jun 19, 2005, 1:20:08 PM

On Sun, 2005-06-19 at 10:11 +0100, Russell King wrote:
> On Sun, Jun 19, 2005 at 10:02:26AM +0100, Russell King wrote:
> > On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> > > On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > > > Thinking about what's probably happening, I suspect all the ARM suspend
> > > > and resume code needs to be reworked to save more state. I'll try to
> > > > cook up a patch tomorrow to fix it, but I'll need you to provide
> > > > feedback.
> > >
> > > Ok, thanks. I'm happy to test any fixes/patches.
> >
> > This should resolve the problem - we now rely on the stack pointer for
> > each CPU mode to remain constant throughout the running time of the
> > kernel, which includes across suspend/resume cycles.
>
> Actually, this patch is probably an all-round better solution.

This patch (the simpler of the two using cpu_init()) allows the pxa to
suspend/resume happily with the git-arm-smp.patch applied.

Richard

Russell King

Jun 19, 2005, 1:50:10 PM

On Sun, Jun 19, 2005 at 06:12:38PM +0100, Richard Purdie wrote:
> This patch (the simpler of the two using cpu_init()) allows the pxa to
> suspend/resume happily with the git-arm-smp.patch applied.

Good. Fix committed.

Next batched smp patch can be found at www.home.arm.../~rmk/nightly
which I'm currently planning to go to Linus tonight.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

Richard Purdie

Jun 19, 2005, 2:30:13 PM

On Sun, 2005-06-19 at 18:39 +0100, Russell King wrote:
> Good. Fix committed.

Thanks.

> Next batched smp patch can be found at www.home.arm.../~rmk/nightly
> which I'm currently planning to go to Linus tonight.

I applied smp-20050619.patch to 2.6.12-rc6-mm1 + the last fix and the
Zaurus seems perfectly happy with it. Let me know as and when you have
further releases that need testing (a message to linux-arm-kernel might
be the best way to announce them?).

Richard

Russell King

Jun 19, 2005, 3:00:18 PM

On Sun, Jun 19, 2005 at 07:25:59PM +0100, Richard Purdie wrote:
> On Sun, 2005-06-19 at 18:39 +0100, Russell King wrote:
> > Next batched smp patch can be found at www.home.arm.../~rmk/nightly
> > which I'm currently planning to go to Linus tonight.
>
> I applied smp-20050619.patch to 2.6.12-rc6-mm1 + the last fix and the
> Zaurus seems perfectly happy with it. Let me know as and when you have
> further releases that need testing (a message to linux-arm-kernel might
> be the best way to announce them?).

Thanks for testing. Most of the other patches are platform specific
so this may not be required. However, if there are other changes to
non-platform-specific code, I'll try to point them out a couple of days
before they get merged.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

Dominik Karall

Jun 21, 2005, 10:00:17 AM

On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/

After looking in my dmesg output today, I saw the following error with
2.6.12-rc6-mm1; maybe it's useful to you. I don't know exactly when it
happens, because I haven't used mono recently. I just did an emerge mono on my
gentoo system, so maybe that triggered the failure.

note: mono[26736] exited with preempt_count 1
scheduling while atomic: mono/0x10000001/26736

Call Trace:<ffffffff803e13ea>{schedule+122} <ffffffff8013197b>{vprintk+635}
<ffffffff803e2738>{cond_resched+56} <ffffffff80164de3>{unmap_vmas+1587}
<ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31}
<ffffffff80133466>{do_exit+438}
<ffffffff8013bf25>{__dequeue_signal+501}
<ffffffff801340c8>{do_group_exit+280}
<ffffffff8013e147>{get_signal_to_deliver+1575}
<ffffffff8010de92>{do_signal+162}
<ffffffff8012d1e0>{default_wake_function+0}
<ffffffff8010e8e1>{sys_rt_sigreturn+577}
<ffffffff8010eb3f>{sysret_signal+28}
<ffffffff8010ee27>{ptregscall_common+103}

cheers,
dominik

Alexey Dobriyan

Jun 24, 2005, 5:50:04 PM

On Tuesday 21 June 2005 17:20, Dominik Karall wrote:
> After looking in my dmesg output today, I saw following error with
> 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly
> happens, cause I never used mono last time, I just did an emerge mono on my
> gentoo system, maybe this forced the failure.
>
> note: mono[26736] exited with preempt_count 1
> scheduling while atomic: mono/0x10000001/26736

I've filed a bug at kernel bugzilla, so your report won't be lost.
See http://bugme.osdl.org/show_bug.cgi?id=4794

You can register at http://bugme.osdl.org/createaccount.cgi and add yourself
to CC list.

Andrew Morton

Jul 29, 2005, 1:00:10 AM

A couple of people reported this, but all seems to have gone quiet. Is it
fixed in later -mm's? Is 2.6.13-rc4 running OK?

Thanks.

Dominik Karall

Jul 29, 2005, 9:50:12 AM

On Friday 29 July 2005 06:54, Andrew Morton wrote:
> Dominik Karall <dominik...@gmx.net> wrote:
> > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/

> >
> > After looking in my dmesg output today, I saw following error with
> > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly
> > happens, cause I never used mono last time, I just did an emerge mono on
> > my gentoo system, maybe this forced the failure.
> >
> > note: mono[26736] exited with preempt_count 1
> > scheduling while atomic: mono/0x10000001/26736
> >
> > Call Trace:<ffffffff803e13ea>{schedule+122}
> > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56}
> > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128}
> > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438}
> > <ffffffff8013bf25>{__dequeue_signal+501}
> > <ffffffff801340c8>{do_group_exit+280}
> > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > <ffffffff8010de92>{do_signal+162}
> > <ffffffff8012d1e0>{default_wake_function+0}
> > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > <ffffffff8010eb3f>{sysret_signal+28}
> > <ffffffff8010ee27>{ptregscall_common+103}
>
> A couple of people reported this, but all seems to have gone quiet. Is it
> fixed in later -mm's? Is 2.6.13-rc4 running OK?
>
> Thanks.

hi andrew!

I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an emerge mono
right now to test it, and I got this one:
Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1
Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffffe43b50 error 4
Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4
Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffff905f80 error 4

DEBUG_KERNEL, DEBUG_PREEMPT and DEBUG_SPINLOCK are enabled, but I didn't get
any more info about the bug. Did I forget any debug option?

greets,
dominik

Andrew Morton

Jul 29, 2005, 2:30:15 PM

Gee, I don't know how to find this one. Do you know if the problem is
specific to -mm?

Dominik Karall

Jul 29, 2005, 5:30:24 PM

On Friday 29 July 2005 20:22, Andrew Morton wrote:
> Dominik Karall <dominik...@gmx.net> wrote:
> > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > Dominik Karall <dominik...@gmx.net> wrote:
> > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/

Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.

So it seems to be -mm related. Do you suspect any patch which could cause the
error?

dominik

Dominik Karall

Jul 29, 2005, 5:40:15 PM

On Friday 29 July 2005 23:27, Andrew Morton wrote:
> Dominik Karall <dominik...@gmx.net> wrote:
> > On Friday 29 July 2005 20:22, Andrew Morton wrote:
> > > Dominik Karall <dominik...@gmx.net> wrote:
> > > > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > > > Dominik Karall <dominik...@gmx.net> wrote:
> > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
>
> Great, thanks for that.
>
> > So it seems to be -mm related. Do you suspect any patch which could cause
> > the error?
>
> I wouldn't know, sorry. Possibly the scheduler patches, possibly an
> x86_64-specific patch. Is the problem repeatable? If so, a binary search
> would only take ten build-n-boots ;)

Yes, it is repeatable. I tested on the latest -mm about 4 times. OK, I will try
to find the right patch tomorrow; 10 build-n-boots would take until morning ;)

btw, as the error occurred in 2.6.12-rc6-mm1 too, it must be an old patch which
hasn't been merged into Linus' tree yet... hope there aren't a lot of them :)

Andrew Morton

Jul 29, 2005, 5:40:15 PM

Great, thanks for that.

> So it seems to be -mm related. Do you suspect any patch which could cause the
> error?

I wouldn't know, sorry. Possibly the scheduler patches, possibly an
x86_64-specific patch. Is the problem repeatable? If so, a binary search
would only take ten build-n-boots ;)
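
The "ten build-n-boots" estimate follows from bisecting an ordered patch series: each build halves the range that can still contain the offending patch. The following is a minimal sketch of that search, not any actual -mm tooling. It assumes a series of about a thousand patches (the thread does not state the count), and the names `bisect_first_bad`, `is_bad_fn` and `bug_present` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: nonzero if a kernel built with the first k
 * patches of the series applied shows the bug. */
typedef int (*is_bad_fn)(size_t k, void *ctx);

/*
 * Find the smallest k in (0, n] such that applying patches 1..k is bad.
 * Assumes that 0 patches is good, all n patches is bad, and the bug
 * stays present once the offending patch has been applied.
 * *steps receives the number of build-and-boot cycles used.
 */
static size_t bisect_first_bad(size_t n, is_bad_fn is_bad, void *ctx,
                               unsigned int *steps)
{
    size_t good = 0, bad = n;

    *steps = 0;
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        (*steps)++;
        if (is_bad(mid, ctx))
            bad = mid;      /* offending patch is at or before mid */
        else
            good = mid;     /* offending patch is after mid */
    }
    return bad;
}

/* Stand-in for "build, boot, run the reproducer": the bug appears as
 * soon as the patch whose index is stored in *ctx has been applied. */
static int bug_present(size_t k, void *ctx)
{
    return k >= *(size_t *)ctx;
}
```

With n = 1000 the loop needs at most ten iterations, since 2^10 = 1024. The search only works if the bug reproduces reliably on every bad build, which is why the question about repeatability comes first.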

Andrew Morton

Aug 4, 2005, 3:50:08 PM

Any progress on this? It kinda means that the whole of the -mm lineup is
stuck until we can identify the offending patch. We have a couple of weeks
in which to do this, but if you can identify the bad patch it'd help
enormously, thanks.


Dominik Karall

Aug 4, 2005, 6:50:13 PM

On Friday 05 August 2005 00:28, Andrew Morton wrote:

> Andrew Morton <ak...@osdl.org> wrote:
> > Dominik Karall <dominik...@gmx.net> wrote:
> > > On Friday 29 July 2005 23:27, Andrew Morton wrote:
> > > > Dominik Karall <dominik...@gmx.net> wrote:
> > > > > On Friday 29 July 2005 20:22, Andrew Morton wrote:
> > > > > > Dominik Karall <dominik...@gmx.net> wrote:
> > > > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote:
> > > > > > > > Dominik Karall <dominik...@gmx.net> wrote:
> > > > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> > > > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/

> OK, Bartosz Taudul tells me that he's occasionally seeing this on stock
> 2.6.12 (thanks!). So there's not a lot of point in doing the -mm bisection
> search.
>
> I think Ingo was planning on coming up with some infrastructure which would
> allow us to debug this further.

I'm sorry that I couldn't do the tests earlier, but I had no time this week. I
did some tests now and noticed that the bug only occurs when KDE is
running... weird.
I'm going to continue testing tomorrow after work, in exactly 12 hours ;)

I will let you know if I have any news!

dominik

Ingo Molnar

Aug 5, 2005, 7:00:21 AM

* Andrew Morton <ak...@osdl.org> wrote:

> I think Ingo was planning on coming up with some infrastructure which
> would allow us to debug this further.

yeah. I've done this today and have split it out of the -RT tree, see
the patch below. After some exposure in -mm i'd like this feature to go
upstream too.

the patch is against recent Linus trees, 2.6.13-rc4 or later should all
work. Dominik, could you try it and send us the new kernel logs whenever
you happen to hit that warning message again? (Please also enable
CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)

Ingo

------
this patch implements the "non-preemptible section trace" feature, which
prints out a "critical section nesting" trace after stackdumps:

Call Trace:
[<c0103db1>] show_stack+0x7a/0x90
[<c0103f36>] show_registers+0x156/0x1ce
[<c010412e>] die+0xe8/0x172
[<c010422e>] do_trap+0x76/0xa3
[<c01044fe>] do_invalid_op+0xa3/0xad
[<c01039ef>] error_code+0x4f/0x54
[<c0120be9>] test+0x8/0xa
[<c0120c41>] sys_gettimeofday+0x56/0x74
[<c0102eeb>] sysenter_past_esp+0x54/0x75
---------------------------
| preempt count: 00000004 ]
| 4 levels deep critical section nesting:
-----------------------------------------
.. [<c0120bbe>] .... test3+0xd/0xf
.....[<c0120bc8>] .. ( <= test2+0x8/0x21)
.. [<c0120bbe>] .... test3+0xd/0xf
.....[<c0120bcd>] .. ( <= test2+0xd/0x21)
.. [<c0120bd7>] .... test2+0x17/0x21
.....[<c0120be9>] .. ( <= test+0x8/0xa)
.. [<c010407f>] .... die+0x39/0x172
.....[<c010422e>] .. ( <= do_trap+0x76/0xa3)

this feature is implemented via a low-overhead mechanism by keeping
the caller and caller-parent addresses for each preempt_disable()
call site, and printing them upon crashes. Note that every other
locking API is thus traced too, such as spinlocks, rwlocks, per-cpu
variables, etc. This feature is especially useful in identifying
leaked preemption counts, as the missing count is displayed as an
extra entry in the stack.
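
The bookkeeping described above can be modelled in user space as follows. This is only a sketch of the idea, not the code from the patch: call sites are recorded as strings instead of return addresses, and `task_model`, `model_preempt_disable`, `model_preempt_enable` and `model_leak_site` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TRACE 25    /* same depth as MAX_PREEMPT_TRACE in the patch */

/* User-space stand-in for the per-task state the patch adds. */
struct task_model {
    unsigned int count;                 /* models the preempt count */
    const char *off_caller[MAX_TRACE];  /* one slot per nesting level */
};

/* Record the call site at the current nesting depth, then go deeper. */
static void model_preempt_disable(struct task_model *t, const char *site)
{
    unsigned int idx = t->count;

    t->count++;
    if (idx < MAX_TRACE)
        t->off_caller[idx] = site;
}

/* Leaving a section only drops the depth; stale slots get overwritten
 * by the next section entered at that depth. */
static void model_preempt_enable(struct task_model *t)
{
    assert(t->count > 0);
    t->count--;
}

/* If the count is still nonzero at exit, the slots below the count
 * name the sections that were entered and never left. */
static const char *model_leak_site(const struct task_model *t)
{
    if (t->count == 0 || t->count > MAX_TRACE)
        return NULL;
    return t->off_caller[t->count - 1];
}
```

The cost per disable is one array store, which is why the overhead stays constant. One limitation of this model is that sections released out of order can leave the wrong site in the remaining slot, so the reported entry is a strong hint rather than proof.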

the feature is active when CONFIG_DEBUG_PREEMPT is enabled.

i've also cleaned up preemption-count debugging by moving the debug
functions out of sched.c into lib/preempt.c.

also, i have added preemption-counter-imbalance checks to the hardirq
and softirq processing codepaths. The behavior of preemption-counter
checks is now uniform: a warning is printed with all info we have at
that point, and the preemption counter is then restored to the old
value.
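
The uniform check described above follows one pattern in the hardirq, softirq and timer paths: sample the count, run the handler, compare, report, restore. A minimal user-space sketch of that pattern is shown here. `model_count` stands in for preempt_count(), and `run_checked` and the two handlers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

static unsigned int model_count;    /* stands in for preempt_count() */

typedef void (*handler_fn)(void);

/* Run a handler and verify that it leaves the count as it found it.
 * Returns 1 if an imbalance was detected (and repaired), else 0. */
static int run_checked(handler_fn handler)
{
    unsigned int in_count = model_count, out_count;

    handler();
    out_count = model_count;
    if (in_count != out_count) {
        fprintf(stderr, "BUG: preempt-count imbalance: in=%08x, out=%08x!\n",
                in_count, out_count);
        /* All useful info has been printed; restore the old value so
         * that later code is not affected by the leak. */
        model_count = in_count;
        return 1;
    }
    return 0;
}

/* A well-behaved handler: every increment has a matching decrement. */
static void balanced_handler(void)
{
    model_count++;
    model_count--;
}

/* A buggy handler that forgets to drop the count again. */
static void leaky_handler(void)
{
    model_count++;
}
```

Restoring the count after reporting keeps the system running, at the price of hiding any follow-on damage the leaked section may have caused.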

on x86 i have changed the 4KSTACKS feature to inherit the low bits of
the preemption-count across hardirq/softirq-context switching, so that
the preemption trace entries of interrupts do not overwrite process
level preemption trace entries.
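
The reason for inheriting the low bits is that the trace slot index is derived from the low bits of the count. With 4KSTACKS the interrupt stack has its own count, so without the offset an interrupt would start filling slots at index 0 again. The sketch below assumes the usual layout of that era, with the preemption depth in the low 8 bits (mask 0xff) and the hardirq field starting at bit 16. These constants and the function names `irq_ctx_enter` and `irq_ctx_leave` are assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PREEMPT_MASK   0x000000ffu /* assumed: depth in low 8 bits */
#define MODEL_HARDIRQ_OFFSET 0x00010000u /* assumed: hardirq field at bit 16 */

/* On switching to the interrupt stack, carry the interrupted context's
 * preemption depth over. *offset remembers what was added. */
static uint32_t irq_ctx_enter(uint32_t irq_count, uint32_t task_count,
                              uint32_t *offset)
{
    *offset = task_count & MODEL_PREEMPT_MASK;
    return irq_count + *offset;
}

/* On the way back, remove exactly what was added on entry. */
static uint32_t irq_ctx_leave(uint32_t irq_count, uint32_t offset)
{
    return irq_count - offset;
}

/* The slot that the next preempt-disable in this context would use. */
static uint32_t next_trace_slot(uint32_t count)
{
    return count & MODEL_PREEMPT_MASK;
}
```

If the interrupted task was three levels deep, the first section entered by the interrupt handler lands in slot 3, leaving slots 0 to 2 untouched.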

boot-tested on x86. Should work on all architectures, but only x86 and
x86_64 have been updated to print the trace-stack out at stackdump time.

This feature was part of the PREEMPT_RT tree for some time and was very
useful in debugging preempt-counter leaks and deadlock/lockup
situations.

Signed-off-by: Ingo Molnar <mi...@elte.hu>

Index: linux/arch/i386/kernel/irq.c
===================================================================
--- linux.orig/arch/i386/kernel/irq.c
+++ linux/arch/i386/kernel/irq.c
@@ -55,6 +55,9 @@ fastcall unsigned int do_IRQ(struct pt_r
{
/* high bits used in ret_from_ code */
int irq = regs->orig_eax & 0xff;
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 count = preempt_count() & PREEMPT_MASK;
+#endif
#ifdef CONFIG_4KSTACKS
union irq_ctx *curctx, *irqctx;
u32 *isp;
@@ -95,6 +98,14 @@ fastcall unsigned int do_IRQ(struct pt_r
irqctx->tinfo.task = curctx->tinfo.task;
irqctx->tinfo.previous_esp = current_stack_pointer;

+ /*
+ * Keep the preemption-count offset, so that the
+ * process-level preemption-trace entries do not
+ * get overwritten by the hardirq context:
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count += count;
+#endif
asm volatile(
" xchgl %%ebx,%%esp \n"
" call __do_IRQ \n"
@@ -103,6 +114,9 @@ fastcall unsigned int do_IRQ(struct pt_r
: "0" (irq), "1" (regs), "2" (isp)
: "memory", "cc", "ecx"
);
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count -= count;
+#endif
} else
#endif
__do_IRQ(irq, regs);
@@ -165,6 +179,9 @@ extern asmlinkage void __do_softirq(void

asmlinkage void do_softirq(void)
{
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 count = preempt_count() & PREEMPT_MASK;
+#endif
unsigned long flags;
struct thread_info *curctx;
union irq_ctx *irqctx;
@@ -181,6 +198,14 @@ asmlinkage void do_softirq(void)
irqctx->tinfo.task = curctx->task;
irqctx->tinfo.previous_esp = current_stack_pointer;

+ /*
+ * Keep the preemption-count offset, so that the
+ * process-level preemption-trace entries do not
+ * get overwritten by the softirq context:
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count += count;
+#endif
/* build the stack frame on the softirq stack */
isp = (u32*) ((char*)irqctx + sizeof(*irqctx));

@@ -192,6 +217,9 @@ asmlinkage void do_softirq(void)
: "0"(isp)
: "memory", "cc", "edx", "ecx", "eax"
);
+#ifdef CONFIG_DEBUG_PREEMPT
+ irqctx->tinfo.preempt_count -= count;
+#endif
}

local_irq_restore(flags);
Index: linux/arch/i386/kernel/traps.c
===================================================================
--- linux.orig/arch/i386/kernel/traps.c
+++ linux/arch/i386/kernel/traps.c
@@ -164,6 +164,7 @@ void show_trace(struct task_struct *task
break;
printk(" =======================\n");
}
+ print_preempt_trace(task, preempt_count());
}

void show_stack(struct task_struct *task, unsigned long *esp)
Index: linux/arch/x86_64/kernel/traps.c
===================================================================
--- linux.orig/arch/x86_64/kernel/traps.c
+++ linux/arch/x86_64/kernel/traps.c
@@ -221,6 +221,7 @@ void show_trace(unsigned long *stack)
HANDLE_STACK (((long) stack & (THREAD_SIZE-1)) != 0);
#undef HANDLE_STACK
printk("\n");
+ print_traces(task);
}

void show_stack(struct task_struct *tsk, unsigned long * rsp)
Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -592,6 +592,14 @@ extern int groups_search(struct group_in
#define GROUP_AT(gi, i) \
((gi)->blocks[(i)/NGROUPS_PER_BLOCK][(i)%NGROUPS_PER_BLOCK])

+#ifdef CONFIG_DEBUG_PREEMPT
+# define MAX_PREEMPT_TRACE 25
+extern void print_preempt_trace(struct task_struct *task, u32 count);
+#else
+static inline void print_preempt_trace(struct task_struct *task, u32 count)
+{
+}
+#endif

struct audit_context; /* See audit.c */
struct mempolicy;
@@ -770,6 +778,11 @@ struct task_struct {
int cpuset_mems_generation;
#endif
atomic_t fs_excl; /* holding fs exclusive resources */
+
+#ifdef CONFIG_DEBUG_PREEMPT
+ void *preempt_off_caller[MAX_PREEMPT_TRACE];
+ void *preempt_off_parent[MAX_PREEMPT_TRACE];
+#endif
};

static inline pid_t process_group(struct task_struct *tsk)
Index: linux/kernel/exit.c
===================================================================
--- linux.orig/kernel/exit.c
+++ linux/kernel/exit.c
@@ -821,10 +821,11 @@ fastcall NORET_TYPE void do_exit(long co
tsk->it_prof_expires = cputime_zero;
tsk->it_sched_expires = 0;

- if (unlikely(in_atomic()))
- printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
- current->comm, current->pid,
- preempt_count());
+ if (unlikely(in_atomic())) {
+ printk(KERN_ERR "BUG: %s[%d] exited with nonzero preempt_count %d!\n",
+ tsk->comm, tsk->pid, preempt_count());
+ print_preempt_trace(tsk, preempt_count());
+ }

acct_update_integrals(tsk);
update_mem_hiwater(tsk);
Index: linux/kernel/irq/handle.c
===================================================================
--- linux.orig/kernel/irq/handle.c
+++ linux/kernel/irq/handle.c
@@ -85,7 +85,24 @@ fastcall int handle_IRQ_event(unsigned i
local_irq_enable();

do {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count(), out_count;
+#endif
ret = action->handler(irq, action->dev_id, regs);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: irq %d [%s] preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ irq, action->name, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * We already printed all the useful info,
+ * fix up the preemption count now:
+ */
+ preempt_count() = in_count;
+ }
+#endif
if (ret == IRQ_HANDLED)
status |= action->flags;
retval |= ret;
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -47,6 +47,7 @@
#include <linux/syscalls.h>
#include <linux/times.h>
#include <linux/acct.h>
+#include <linux/kallsyms.h>
#include <asm/tlb.h>

#include <asm/unistd.h>
@@ -2707,38 +2708,6 @@ static inline int dependent_sleeper(int
}
#endif

-#if defined(CONFIG_PREEMPT) && defined(CONFIG_DEBUG_PREEMPT)
-
-void fastcall add_preempt_count(int val)
-{
- /*
- * Underflow?
- */
- BUG_ON((preempt_count() < 0));
- preempt_count() += val;
- /*
- * Spinlock count overflowing soon?
- */
- BUG_ON((preempt_count() & PREEMPT_MASK) >= PREEMPT_MASK-10);
-}
-EXPORT_SYMBOL(add_preempt_count);
-
-void fastcall sub_preempt_count(int val)
-{
- /*
- * Underflow?
- */
- BUG_ON(val > preempt_count());
- /*
- * Is the spinlock portion underflowing?
- */
- BUG_ON((val < PREEMPT_MASK) && !(preempt_count() & PREEMPT_MASK));
- preempt_count() -= val;
-}
-EXPORT_SYMBOL(sub_preempt_count);
-
-#endif
-
/*
* schedule() is the main scheduler function.
*/
Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -92,7 +92,23 @@ restart:

do {
if (pending & 1) {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count();
+#endif
h->action(h);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: softirq %d preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ h - softirq_vec, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * Fix up the bad preemption count:
+ */
+ preempt_count() = in_count;
+ }
+#endif
rcu_bh_qsctr_inc(cpu);
}
h++;
Index: linux/kernel/timer.c
===================================================================
--- linux.orig/kernel/timer.c
+++ linux/kernel/timer.c
@@ -33,6 +33,7 @@
#include <linux/posix-timers.h>
#include <linux/cpu.h>
#include <linux/syscalls.h>
+#include <linux/kallsyms.h>

#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -480,6 +481,7 @@ static inline void __run_timers(tvec_bas
while (!list_empty(head)) {
void (*fn)(unsigned long);
unsigned long data;
+ int in_count, out_count;

timer = list_entry(head->next,struct timer_list,entry);
fn = timer->function;
@@ -488,17 +490,20 @@ static inline void __run_timers(tvec_bas
set_running_timer(base, timer);
detach_timer(timer, 1);
spin_unlock_irq(&base->t_base.lock);
- {
- int preempt_count = preempt_count();
- fn(data);
- if (preempt_count != preempt_count()) {
- printk(KERN_WARNING "huh, entered %p "
- "with preempt_count %08x, exited"
- " with %08x?\n",
- fn, preempt_count,
- preempt_count());
- BUG();
- }
+
+ in_count = preempt_count();
+ fn(data);
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ print_symbol(KERN_ERR "BUG: %s", (long)fn);
+ printk(KERN_ERR "(%p) preempt-count imbalance: "
+ "in=%08x, out=%08x!",
+ fn, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * Fix up the bad preemption count:
+ */
+ preempt_count() = in_count;
}
spin_lock_irq(&base->t_base.lock);
}
Index: linux/lib/Makefile
===================================================================
--- linux.orig/lib/Makefile
+++ linux/lib/Makefile
@@ -20,7 +20,7 @@ lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) +=
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_GENERIC_FIND_NEXT_BIT) += find_next_bit.o
obj-$(CONFIG_LOCK_KERNEL) += kernel_lock.o
-obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
+obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o preempt.o

ifneq ($(CONFIG_HAVE_DEC_LOCK),y)
lib-y += dec_and_lock.o
Index: linux/lib/preempt.c
===================================================================
--- /dev/null
+++ linux/lib/preempt.c
@@ -0,0 +1,101 @@
+/*
+ * lib/preempt.c
+ *
+ * DEBUG_PREEMPT variant of add_preempt_count() and sub_preempt_count().
+ * Preemption tracing.
+ *
+ * (C) 2005 Ingo Molnar, Red Hat
+ */
+#include <linux/module.h>
+#include <linux/hardirq.h>
+#include <linux/kallsyms.h>
+
+/*
+ * Add a value to the preemption count, and check for overflows,
+ * underflows and maintain a small stack of callers that gets
+ * printed upon crashes.
+ */
+void fastcall add_preempt_count(int val)
+{
+ unsigned int count = preempt_count(), idx = count & PREEMPT_MASK;
+
+ /*
+ * Underflow?
+ */
+ BUG_ON(count < 0);
+
+ preempt_count() += val;
+
+ /*
+ * Spinlock count overflowing soon?
+ */
+ BUG_ON(idx >= PREEMPT_MASK-10);
+
+ /*
+ * Maintain the per-task preemption-nesting stack (which
+ * will be printed upon crashes). It's a low-overhead thing,
+ * constant overhead per preempt-disable.
+ */
+ if (idx < MAX_PREEMPT_TRACE) {
+ void *caller = __builtin_return_address(0), *parent = NULL;
+
+#ifdef CONFIG_FRAME_POINTER
+ parent = __builtin_return_address(1);
+ if (in_lock_functions(parent)) {
+ parent = __builtin_return_address(2);
+ if (in_lock_functions(parent))
+ parent = __builtin_return_address(3);
+ }
+#endif
+ current->preempt_off_caller[idx] = caller;
+ current->preempt_off_parent[idx] = parent;
+ }
+}
+EXPORT_SYMBOL(add_preempt_count);
+
+void fastcall sub_preempt_count(int val)
+{
+ unsigned int count = preempt_count();
+
+ /*
+ * Underflow?
+ */
+ BUG_ON(val > count);
+ /*
+ * Is the spinlock portion underflowing?
+ */
+ BUG_ON((val < PREEMPT_MASK) && !(count & PREEMPT_MASK));
+
+ preempt_count() -= val;
+}
+EXPORT_SYMBOL(sub_preempt_count);
+
+void print_preempt_trace(struct task_struct *task, u32 count)
+{
+ unsigned int i, idx = count & PREEMPT_MASK;
+
+ preempt_disable();
+
+ printk("---------------------------\n");
+ printk("| preempt count: %08x ]\n", count);
+ if (count) {
+ printk("| %d level deep critical section nesting:\n", idx);
+ printk("----------------------------------------\n");
+ } else
+ printk("---------------------------\n");
+ for (i = 0; i < idx; i++) {
+ printk(".. [<%p>] .... ", task->preempt_off_caller[i]);
+ print_symbol("%s\n", (long)task->preempt_off_caller[i]);
+ printk(".....[<%p>] .. ( <= ",
+ task->preempt_off_parent[i]);
+ print_symbol("%s)\n", (long)task->preempt_off_parent[i]);
+ if (i == MAX_PREEMPT_TRACE-1) {
+ printk("[rest truncated, reached MAX_PREEMPT_TRACE]\n");
+ break;
+ }
+ }
+ printk("\n");
+
+ preempt_enable();
+}
+

Dominik Karall

Aug 5, 2005, 7:50:06 AM

On Friday 05 August 2005 12:48, Ingo Molnar wrote:
> * Andrew Morton <ak...@osdl.org> wrote:
> > I think Ingo was planning on coming up with some infrastructure which
> > would allow us to debug this further.
>
> yeah. I've done this today and have split it out of the -RT tree, see
> the patch below. After some exposure in -mm i'd like this feature to go
> upstream too.
>
> the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> work. Dominik, could you try it and send us the new kernel logs whenever
> you happen to hit that warning message again? (Please also enable
> CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)

I tried to compile the patch on top of 2.6.13-rc4-mm1. It applied with a few
offsets, but it looked OK.
Here is the error I got when I compiled it:

CC arch/x86_64/kernel/traps.o
arch/x86_64/kernel/traps.c: In function `show_trace':
arch/x86_64/kernel/traps.c:228: warning: implicit declaration of function `print_traces'
arch/x86_64/kernel/traps.c:228: error: `task' undeclared (first use in this function)
arch/x86_64/kernel/traps.c:228: error: (Each undeclared identifier is reported only once
arch/x86_64/kernel/traps.c:228: error: for each function it appears in.)
make[1]: *** [arch/x86_64/kernel/traps.o] Error 1

I took a look at the traps.c file, but couldn't find a solution, as there is
neither a print_traces function nor a task variable in that function.

dominik

Dominik Karall

Aug 5, 2005, 10:30:16 AM

On Friday 05 August 2005 12:48, Ingo Molnar wrote:

> * Andrew Morton <ak...@osdl.org> wrote:
> > I think Ingo was planning on coming up with some infrastructure which
> > would allow us to debug this further.
>
> yeah. I've done this today and have split it out of the -RT tree, see
> the patch below. After some exposure in -mm i'd like this feature to go
> upstream too.
>
> the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> work. Dominik, could you try it and send us the new kernel logs whenever
> you happen to hit that warning message again? (Please also enable
> CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)

Here's the preempt trace output from mono. To compile preempt-trace.patch I
removed the traps.c hunk and added a u32 definition for out_count in handle.c.
After those changes, the kernel compiled fine.

Now here's the output. Let me know whether it is OK, or whether it reveals
where the bug is located.

BUG: mono[10011] exited with nonzero preempt_count 1!
---------------------------
| preempt count: 00000001 ]
| 1 level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
.....[<0000000000000000>] .. ( <= 0x0)

If there is anything I should test, let me know!

dominik

Ingo Molnar

Aug 5, 2005, 11:20:11 AM

* Dominik Karall <dominik...@gmx.net> wrote:

> > yeah. I've done this today and have split it out of the -RT tree, see
> > the patch below. After some exposure in -mm i'd like this feature to go
> > upstream too.
> >
> > the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> > work. Dominik, could you try it and send us the new kernel logs whenever
> > you happen to hit that warning message again? (Please also enable
> > CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as possible.)
>
> I tried to compile the patch on top of 2.6.13-rc4-mm1, it applied with a few
> offsets, but it looked ok.
> Here is the error I get when I compiled it:

ok, does the additional patch below fix things for you?

Ingo

------

- fix the x64 build

- get the preempt_count from the right task on x86 (it's usually
'current', but not always.)

- fix compiler warning in kernel/softirq.c on 64-bit platforms

Signed-off-by: Ingo Molnar <mi...@elte.hu>

Index: linux-preempt-trace/arch/i386/kernel/traps.c
===================================================================
--- linux-preempt-trace.orig/arch/i386/kernel/traps.c
+++ linux-preempt-trace/arch/i386/kernel/traps.c
@@ -164,7 +164,7 @@ void show_trace(struct task_struct *task
break;
printk(" =======================\n");
}
- print_preempt_trace(task, preempt_count());
+ print_preempt_trace(task, task->thread_info->preempt_count);
}

void show_stack(struct task_struct *task, unsigned long *esp)

Index: linux-preempt-trace/arch/x86_64/kernel/process.c
===================================================================
--- linux-preempt-trace.orig/arch/x86_64/kernel/process.c
+++ linux-preempt-trace/arch/x86_64/kernel/process.c
@@ -311,7 +311,7 @@ void __show_regs(struct pt_regs * regs)
void show_regs(struct pt_regs *regs)
{
__show_regs(regs);
- show_trace(&regs->rsp);
+ show_trace(current, &regs->rsp);
}

/*
Index: linux-preempt-trace/arch/x86_64/kernel/traps.c
===================================================================
--- linux-preempt-trace.orig/arch/x86_64/kernel/traps.c
+++ linux-preempt-trace/arch/x86_64/kernel/traps.c
@@ -29,6 +29,7 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/nmi.h>
+#include <linux/sched.h>

#include <asm/system.h>
#include <asm/uaccess.h>
@@ -156,7 +157,7 @@ static unsigned long *in_exception_stack
* severe exception (double fault, nmi, stack fault, debug, mce) hardware stack
*/

-void show_trace(unsigned long *stack)
+void show_trace(struct task_struct *task, unsigned long *stack)
{
unsigned long addr;
const unsigned cpu = safe_smp_processor_id();
@@ -221,7 +222,7 @@ void show_trace(unsigned long *stack)
HANDLE_STACK (((long) stack & (THREAD_SIZE-1)) != 0);
#undef HANDLE_STACK
printk("\n");
- print_traces(task);
+ print_preempt_trace(task, task->thread_info->preempt_count);
}

void show_stack(struct task_struct *tsk, unsigned long * rsp)
@@ -258,7 +259,7 @@ void show_stack(struct task_struct *tsk,
printk("%016lx ", *stack++);
touch_nmi_watchdog();
}
- show_trace((unsigned long *)rsp);
+ show_trace(tsk, (unsigned long *)rsp);
}

/*
@@ -267,7 +268,7 @@ void show_stack(struct task_struct *tsk,
void dump_stack(void)
{
unsigned long dummy;
- show_trace(&dummy);
+ show_trace(current, &dummy);
}

EXPORT_SYMBOL(dump_stack);
Index: linux-preempt-trace/include/asm-x86_64/proto.h
===================================================================
--- linux-preempt-trace.orig/include/asm-x86_64/proto.h
+++ linux-preempt-trace/include/asm-x86_64/proto.h
@@ -66,7 +66,7 @@ extern unsigned long end_pfn_map;

extern cpumask_t cpu_initialized;

-extern void show_trace(unsigned long * rsp);
+extern void show_trace(struct task_struct *task, unsigned long *rsp);
extern void show_registers(struct pt_regs *regs);

extern void exception_table_check(void);
Index: linux-preempt-trace/kernel/softirq.c
===================================================================
--- linux-preempt-trace.orig/kernel/softirq.c
+++ linux-preempt-trace/kernel/softirq.c
@@ -99,7 +99,7 @@ restart:
#ifdef CONFIG_DEBUG_PREEMPT
out_count = preempt_count();
if (in_count != out_count) {
- printk(KERN_ERR "BUG: softirq %d preempt-count "
+ printk(KERN_ERR "BUG: softirq %ld preempt-count "
"imbalance: in=%08x, out=%08x!\n",
h - softirq_vec, in_count, out_count);
print_preempt_trace(current, out_count);

Ingo Molnar

Aug 5, 2005, 11:30:19 AM

* Dominik Karall <dominik...@gmx.net> wrote:

> BUG: mono[10011] exited with nonzero preempt_count 1!
> ---------------------------
> | preempt count: 00000001 ]
> | 1 level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
> .....[<0000000000000000>] .. ( <= 0x0)
>
> If there is anything I should test, let me know!

please enable CONFIG_FRAME_POINTER!

we now know that it's a spin_lock reference that got leaked, but we don't
(yet) know the parent.
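
The "<= 0x0" parent in the trace is what the patch records when frame pointers are unavailable: the immediate caller can always be read, but walking one frame further up is only attempted under CONFIG_FRAME_POINTER. The sketch below illustrates that split in user space with the GCC/Clang builtin `__builtin_return_address`. The `MODEL_FRAME_POINTER` macro, `struct trace_entry` and `record_site` are hypothetical names, and the block is a simplified model rather than the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct trace_entry {
    void *caller;   /* address the recording function returns to */
    void *parent;   /* one frame further up, if it can be walked */
};

/* noinline so that the function has a real return address of its own. */
__attribute__((noinline))
static struct trace_entry record_site(void)
{
    struct trace_entry e = { __builtin_return_address(0), NULL };

#ifdef MODEL_FRAME_POINTER
    /* Levels above 0 need a walkable frame chain; without frame
     * pointers the result is unreliable, so it is skipped by default. */
    e.parent = __builtin_return_address(1);
#endif
    return e;
}
```

Built without `MODEL_FRAME_POINTER`, the entry has a valid caller and a NULL parent, which corresponds to the `_spin_lock` line followed by `( <= 0x0)` in the report above.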

Ingo

Dominik Karall

Aug 5, 2005, 2:00:16 PM

On Friday 05 August 2005 17:22, Ingo Molnar wrote:
> * Dominik Karall <dominik...@gmx.net> wrote:
> > BUG: mono[10011] exited with nonzero preempt_count 1!
> > ---------------------------
> >
> > | preempt count: 00000001 ]
> > | 1 level deep critical section nesting:
> >
> > ----------------------------------------
> > .. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
> > .....[<0000000000000000>] .. ( <= 0x0)
> >
> > If there is anything I should test, let me know!
>
> please enable CONFIG_FRAME_POINTERS!
>
> we now know that it's a spin_lock reference that got leaked, but we dont
> (yet) know the parent.

I'm sorry, but I think I can't enable CONFIG_FRAME_POINTER.
Depends on: DEBUG_KERNEL && (X86 && !X86_64 || CRIS || M68K || M68KNOMMU ||
FRV || UML)

Seems to be disabled for x86_64.

dominik

Andrew Morton

Aug 5, 2005, 2:20:07 PM

Ingo Molnar <mi...@elte.hu> wrote:
>
>
> * Dominik Karall <dominik...@gmx.net> wrote:
>
> > BUG: mono[10011] exited with nonzero preempt_count 1!
> > ---------------------------
> > | preempt count: 00000001 ]
> > | 1 level deep critical section nesting:
> > ----------------------------------------
> > .. [<ffffffff803f791e>] .... _spin_lock+0xe/0x70
> > .....[<0000000000000000>] .. ( <= 0x0)
> >
> > If there is anything I should test, let me know!

Thanks, Dominik.

> please enable CONFIG_FRAME_POINTERS!

Seems a bit tricky. Wouldn't it be best if enabling CONFIG_DEBUG_PREEMPT
autoselected CONFIG_KALLSYMS_ALL, CONFIG_FRAME_POINTER and whatever else
we need?

Dominik Karall

Aug 5, 2005, 2:20:11 PM

On Friday 05 August 2005 17:13, Ingo Molnar wrote:
> * Dominik Karall <dominik...@gmx.net> wrote:
> > > yeah. I've done this today and have split it out of the -RT tree, see
> > > the patch below. After some exposure in -mm i'd like this feature to go
> > > upstream too.
> > >
> > > the patch is against recent Linus trees, 2.6.13-rc4 or later should all
> > > work. Dominik, could you try it and send us the new kernel logs
> > > whenever you happen to hit that warning message again? (Please also
> > > enable CONFIG_KALLSYMS_ALL, so that we get as much symbolic data as
> > > possible.)
> >
> > I tried to compile the patch on top of 2.6.13-rc4-mm1, it applied with a
> > few offsets, but it looked ok.
> > Here is the error I get when I compiled it:
>
> ok, does the additional patch below fix things for you?

Yes, only out_count wasn't defined in softirq.c, here's the patch to fix it.
The first patch in traps.c failed on rc4-mm1, but it doesn't matter, as
sched.h seems to be already included there. I think it is even included in
-rc4 too.

dominik

-----

--- linux/kernel/softirq.c.orig 2005-08-05 20:00:28.000000000 +0200
+++ linux/kernel/softirq.c 2005-08-05 20:02:40.000000000 +0200
@@ -93,7 +93,7 @@ restart:
do {
if (pending & 1) {
#ifdef CONFIG_DEBUG_PREEMPT
- u32 in_count = preempt_count();
+ u32 in_count = preempt_count(), out_count;
#endif
h->action(h);
#ifdef CONFIG_DEBUG_PREEMPT

Hugh Dickins

Aug 5, 2005, 2:50:09 PM

On Fri, 5 Aug 2005, Dominik Karall wrote:
> On Friday 05 August 2005 17:22, Ingo Molnar wrote:
> >
> > please enable CONFIG_FRAME_POINTERS!
>
> I'm sorry, but I think I can't enable CONFIG_FRAME_POINTERS.
> Depends on: DEBUG_KERNEL && (X86 && !X86_64 || CRIS || M68K || M68KNOMMU ||
> FRV || UML)
>
> Seems to be disabled for x86_64.

It is disabled for x86_64, but not for any very good reason (beyond
reducing the test matrix). I work with CONFIG_FRAME_POINTER on x86_64
with no trouble (just add in the patch below, make oldconfig, choose
frame pointers and rebuild). But I can't guarantee it'll actually
reveal the info Ingo and all are longing to see.

Hugh

--- 2.6.13-rc5/lib/Kconfig.debug 2005-06-17 20:48:29.000000000 +0100
+++ linux/lib/Kconfig.debug 2005-07-29 18:40:28.000000000 +0100
@@ -151,7 +151,7 @@ config DEBUG_FS

config FRAME_POINTER
bool "Compile the kernel with frame pointers"
- depends on DEBUG_KERNEL && ((X86 && !X86_64) || CRIS || M68K || M68KNOMMU || FRV || UML)
+ depends on DEBUG_KERNEL && (X86 || CRIS || M68K || M68KNOMMU || FRV || UML)
default y if DEBUG_INFO && UML
help
If you say Y here the resulting kernel image will be slightly larger

Dominik Karall

Aug 5, 2005, 3:30:10 PM

On Friday 05 August 2005 20:46, Hugh Dickins wrote:
> On Fri, 5 Aug 2005, Dominik Karall wrote:
> > On Friday 05 August 2005 17:22, Ingo Molnar wrote:
> > > please enable CONFIG_FRAME_POINTERS!
> >
> > I'm sorry, but I think I can't enable CONFIG_FRAME_POINTERS.
> > Depends on: DEBUG_KERNEL && (X86 && !X86_64 || CRIS || M68K || M68KNOMMU
> > || FRV || UML)
> >
> > Seems to be disabled for x86_64.
>
> It is disabled for x86_64, but not for any very good reason (beyond
> reducing the test matrix). I work with CONFIG_FRAME_POINTER on x86_64
> with no trouble: just add in the patch below, make oldconfig, choose
> frame pointers and rebuild. But I can't guarantee it'll actually
> reveal the info Ingo and all are longing to see.

With FRAME_POINTERS enabled:

BUG: mono[3193] exited with nonzero preempt_count 1!

---------------------------
| preempt count: 00000001 ]
| 1 level deep critical section nesting:
----------------------------------------

.. [<ffffffff80400a46>] .... _spin_lock+0x16/0x80
.....[<ffffffff801ed30c>] .. ( <= sys_semtimedop+0x28c/0x7c0)

hth, let me know!

dominik

Ingo Molnar

Aug 5, 2005, 4:20:10 PM

* Dominik Karall <dominik...@gmx.net> wrote:

> With FRAME_POINTERS enabled:
>
> BUG: mono[3193] exited with nonzero preempt_count 1!
> ---------------------------
> | preempt count: 00000001 ]
> | 1 level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffff80400a46>] .... _spin_lock+0x16/0x80
> .....[<ffffffff801ed30c>] .. ( <= sys_semtimedop+0x28c/0x7c0)

thanks. It seems semundo->lock somehow leaked. One possibility would be
semundo->refcnt going from 2 to 1 while another thread has the lock
held: unlock_semundo() would then see refcnt == 1 and skip the
spin_unlock(). I don't see what prevents this scenario from happening.
To test this theory, could you apply the patch below, which makes
semundo locking unconditional instead of dependent on the refcount?
Does it fix the bug?

Ingo

ipc/sem.c | 10 +++-------
1 files changed, 3 insertions(+), 7 deletions(-)

Index: linux-preempt-trace/ipc/sem.c
===================================================================
--- linux-preempt-trace.orig/ipc/sem.c
+++ linux-preempt-trace/ipc/sem.c
@@ -895,7 +895,7 @@ static inline void lock_semundo(void)
struct sem_undo_list *undo_list;

undo_list = current->sysvsem.undo_list;
- if ((undo_list != NULL) && (atomic_read(&undo_list->refcnt) != 1))
+ if (undo_list)
spin_lock(&undo_list->lock);
}

@@ -915,7 +915,7 @@ static inline void unlock_semundo(void)
struct sem_undo_list *undo_list;

undo_list = current->sysvsem.undo_list;
- if ((undo_list != NULL) && (atomic_read(&undo_list->refcnt) != 1))
+ if (undo_list)
spin_unlock(&undo_list->lock);
}

@@ -943,9 +943,7 @@ static inline int get_undo_list(struct s
if (undo_list == NULL)
return -ENOMEM;
memset(undo_list, 0, size);
- /* don't initialize unodhd->lock here. It's done
- * in copy_semundo() instead.
- */
+ spin_lock_init(&undo_list->lock);
atomic_set(&undo_list->refcnt, 1);
current->sysvsem.undo_list = undo_list;
}
@@ -1231,8 +1229,6 @@ int copy_semundo(unsigned long clone_fla
error = get_undo_list(&undo_list);
if (error)
return error;
- if (atomic_read(&undo_list->refcnt) == 1)
- spin_lock_init(&undo_list->lock);
atomic_inc(&undo_list->refcnt);
tsk->sysvsem.undo_list = undo_list;
} else

Ingo Molnar

Aug 5, 2005, 4:20:11 PM

* Andrew Morton <ak...@osdl.org> wrote:

> > please enable CONFIG_FRAME_POINTERS!
>
> Seems a bit tricky. Wouldn't it be best if enabling
> CONFIG_DEBUG_PREEMPT autoselected CONFIG_KALLSYMS_ALL,
> CONFIG_FRAME_POINTER and whatever else we need?

ok, agreed:

-----
when DEBUG_PREEMPT is enabled, select FRAME_POINTER, KALLSYMS and
KALLSYMS_ALL as well, to make the debug output more useful.

Signed-off-by: Ingo Molnar <mi...@elte.hu>

lib/Kconfig.debug | 3 +++
1 files changed, 3 insertions(+)

Index: linux-preempt-trace/lib/Kconfig.debug
===================================================================
--- linux-preempt-trace.orig/lib/Kconfig.debug
+++ linux-preempt-trace/lib/Kconfig.debug
@@ -70,6 +70,9 @@ config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
default y
+ select FRAME_POINTER
+ select KALLSYMS
+ select KALLSYMS_ALL
help
If you say Y here then the kernel will use a debug variant of the
commonly used smp_processor_id() function and will print warnings

Ingo Molnar

Aug 5, 2005, 4:30:16 PM

here's a full patch again of all things preempt-trace (excludes the sysv
semaphores change):

--------

boot-tested on x86. Should work on all architectures.

Signed-off-by: Ingo Molnar <mi...@elte.hu>

Index: linux-preempt-trace/arch/i386/kernel/irq.c
===================================================================
--- linux-preempt-trace.orig/arch/i386/kernel/irq.c
+++ linux-preempt-trace/arch/i386/kernel/irq.c

Index: linux-preempt-trace/arch/i386/kernel/traps.c
===================================================================
--- linux-preempt-trace.orig/arch/i386/kernel/traps.c
+++ linux-preempt-trace/arch/i386/kernel/traps.c

@@ -164,6 +164,7 @@ void show_trace(struct task_struct *task
break;
printk(" =======================\n");
}
+ print_preempt_trace(task, task->thread_info->preempt_count);
}

void show_stack(struct task_struct *task, unsigned long *esp)

@@ -221,6 +222,7 @@ void show_trace(unsigned long *stack)

HANDLE_STACK (((long) stack & (THREAD_SIZE-1)) != 0);
#undef HANDLE_STACK
printk("\n");
+ print_preempt_trace(task, task->thread_info->preempt_count);
}

void show_stack(struct task_struct *tsk, unsigned long * rsp)

@@ -257,7 +259,7 @@ void show_stack(struct task_struct *tsk,

printk("%016lx ", *stack++);
touch_nmi_watchdog();
}
- show_trace((unsigned long *)rsp);
+ show_trace(tsk, (unsigned long *)rsp);
}

/*

@@ -266,7 +268,7 @@ void show_stack(struct task_struct *tsk,

void dump_stack(void)
{
unsigned long dummy;
- show_trace(&dummy);
+ show_trace(current, &dummy);
}

EXPORT_SYMBOL(dump_stack);
Index: linux-preempt-trace/include/asm-x86_64/proto.h
===================================================================
--- linux-preempt-trace.orig/include/asm-x86_64/proto.h
+++ linux-preempt-trace/include/asm-x86_64/proto.h
@@ -66,7 +66,7 @@ extern unsigned long end_pfn_map;

extern cpumask_t cpu_initialized;

-extern void show_trace(unsigned long * rsp);
+extern void show_trace(struct task_struct *task, unsigned long *rsp);
extern void show_registers(struct pt_regs *regs);

extern void exception_table_check(void);

Index: linux-preempt-trace/include/linux/sched.h
===================================================================
--- linux-preempt-trace.orig/include/linux/sched.h
+++ linux-preempt-trace/include/linux/sched.h

@@ -592,6 +592,14 @@ extern int groups_search(struct group_in
#define GROUP_AT(gi, i) \
((gi)->blocks[(i)/NGROUPS_PER_BLOCK][(i)%NGROUPS_PER_BLOCK])

+#ifdef CONFIG_DEBUG_PREEMPT
+# define MAX_PREEMPT_TRACE 25
+extern void print_preempt_trace(struct task_struct *task, u32 count);
+#else
+static inline void print_preempt_trace(struct task_struct *task, u32 count)
+{
+}
+#endif

struct audit_context; /* See audit.c */
struct mempolicy;
@@ -770,6 +778,11 @@ struct task_struct {
int cpuset_mems_generation;
#endif
atomic_t fs_excl; /* holding fs exclusive resources */
+
+#ifdef CONFIG_DEBUG_PREEMPT
+ void *preempt_off_caller[MAX_PREEMPT_TRACE];
+ void *preempt_off_parent[MAX_PREEMPT_TRACE];
+#endif
};

static inline pid_t process_group(struct task_struct *tsk)

Index: linux-preempt-trace/kernel/exit.c
===================================================================
--- linux-preempt-trace.orig/kernel/exit.c
+++ linux-preempt-trace/kernel/exit.c

@@ -821,10 +821,11 @@ fastcall NORET_TYPE void do_exit(long co
tsk->it_prof_expires = cputime_zero;
tsk->it_sched_expires = 0;

- if (unlikely(in_atomic()))
- printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
- current->comm, current->pid,
- preempt_count());
+ if (unlikely(in_atomic())) {
+ printk(KERN_ERR "BUG: %s[%d] exited with nonzero preempt_count %d!\n",
+ tsk->comm, tsk->pid, preempt_count());
+ print_preempt_trace(tsk, preempt_count());
+ }

acct_update_integrals(tsk);
update_mem_hiwater(tsk);

Index: linux-preempt-trace/kernel/irq/handle.c
===================================================================
--- linux-preempt-trace.orig/kernel/irq/handle.c
+++ linux-preempt-trace/kernel/irq/handle.c

@@ -85,7 +85,24 @@ fastcall int handle_IRQ_event(unsigned i
local_irq_enable();

do {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count(), out_count;
+#endif
ret = action->handler(irq, action->dev_id, regs);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: irq %d [%s] preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ irq, action->name, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * We already printed all the useful info,
+ * fix up the preemption count now:
+ */
+ preempt_count() = in_count;
+ }
+#endif
if (ret == IRQ_HANDLED)
status |= action->flags;
retval |= ret;

Index: linux-preempt-trace/kernel/sched.c
===================================================================
--- linux-preempt-trace.orig/kernel/sched.c
+++ linux-preempt-trace/kernel/sched.c

Index: linux-preempt-trace/kernel/softirq.c
===================================================================
--- linux-preempt-trace.orig/kernel/softirq.c
+++ linux-preempt-trace/kernel/softirq.c

@@ -92,7 +92,23 @@ restart:

do {
if (pending & 1) {
+#ifdef CONFIG_DEBUG_PREEMPT
+ u32 in_count = preempt_count(), out_count;
+#endif
h->action(h);
+#ifdef CONFIG_DEBUG_PREEMPT
+ out_count = preempt_count();
+ if (in_count != out_count) {
+ printk(KERN_ERR "BUG: softirq %ld preempt-count "
+ "imbalance: in=%08x, out=%08x!\n",
+ h - softirq_vec, in_count, out_count);
+ print_preempt_trace(current, out_count);
+ /*
+ * Fix up the bad preemption count:
+ */
+ preempt_count() = in_count;
+ }
+#endif
rcu_bh_qsctr_inc(cpu);
}
h++;

Index: linux-preempt-trace/kernel/timer.c
===================================================================
--- linux-preempt-trace.orig/kernel/timer.c
+++ linux-preempt-trace/kernel/timer.c

@@ -914,6 +919,10 @@ static void run_timer_softirq(struct sof

if (time_after_eq(jiffies, base->timer_jiffies))
__run_timers(base);
+ if (panic_timeout == 2) {
+ panic_timeout = 0;
+ preempt_disable();
+ }
}

/*
@@ -922,6 +931,10 @@ static void run_timer_softirq(struct sof
void run_local_timers(void)
{
raise_softirq(TIMER_SOFTIRQ);
+ if (panic_timeout == 1) {
+ panic_timeout = 0;
+ preempt_disable();
+ }
}

/*

Index: linux-preempt-trace/lib/Kconfig.debug
===================================================================
--- linux-preempt-trace.orig/lib/Kconfig.debug
+++ linux-preempt-trace/lib/Kconfig.debug
@@ -70,6 +70,9 @@ config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
default y
+ select FRAME_POINTER
+ select KALLSYMS
+ select KALLSYMS_ALL
help
If you say Y here then the kernel will use a debug variant of the
commonly used smp_processor_id() function and will print warnings

Index: linux-preempt-trace/lib/Makefile
===================================================================
--- linux-preempt-trace.orig/lib/Makefile
+++ linux-preempt-trace/lib/Makefile

@@ -20,7 +20,7 @@ lib-$(CONFIG_RWSEM_GENERIC_SPINLOCK) +=
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_GENERIC_FIND_NEXT_BIT) += find_next_bit.o
obj-$(CONFIG_LOCK_KERNEL) += kernel_lock.o
-obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
+obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o preempt.o

ifneq ($(CONFIG_HAVE_DEC_LOCK),y)
lib-y += dec_and_lock.o

Index: linux-preempt-trace/lib/preempt.c
===================================================================
--- /dev/null
+++ linux-preempt-trace/lib/preempt.c

Dominik Karall

Aug 5, 2005, 4:50:11 PM

On Friday 05 August 2005 22:04, Ingo Molnar wrote:
> * Dominik Karall <dominik...@gmx.net> wrote:
> > With FRAME_POINTERS enabled:
> >
> > BUG: mono[3193] exited with nonzero preempt_count 1!
> > ---------------------------
> >
> > | preempt count: 00000001 ]
> > | 1 level deep critical section nesting:
> >
> > ----------------------------------------
> > .. [<ffffffff80400a46>] .... _spin_lock+0x16/0x80
> > .....[<ffffffff801ed30c>] .. ( <= sys_semtimedop+0x28c/0x7c0)
>
> thanks. It seems semundo->lock somehow leaked. One possibility would be
> semundo->refcnt going from 2 to 1 while another thread has the lock
> held: unlock_semundo() would then see refcnt == 1 and skip the
> spin_unlock(). I don't see what prevents this scenario from happening.
> To test this theory, could you apply the patch below, which makes
> semundo locking unconditional instead of dependent on the refcount?
> Does it fix the bug?

yeah! it works, great job! :)

dominik