2.6.13-rc3-mm1

Andrew Morton

Jul 15, 2005, 4:40:16 AM

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/

(http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
kernel.org syncs up)

- Added the CKRM patches. This is just here for people to look at at this
stage.

Changes since 2.6.13-rc2-mm2:

git-drm.patch
git-audit.patch
git-input.patch
git-kbuild.patch
git-libata-adma-mwi.patch
git-libata-chs-support.patch
git-libata-passthru.patch
git-libata-promise-sata-pata.patch
git-netdev-chelsio.patch
git-netdev-e100.patch
git-netdev-smc91x-eeprom.patch
git-netdev-ieee80211-wifi.patch
git-ntfs.patch
git-ocfs2.patch
git-scsi-block.patch
git-scsi-misc.patch
git-scsi-misc-drivers-scsi-chc-remove-devfs-stuff.patch

Subsystem trees

-name_to_dev_t-warning-fix.patch
-aacraid-swapped-kmalloc-args.patch
-quiet-ide-cd-warning.patch
-device-mapper-multipath-barriers-not-supported.patch
-device-mapper-multipath-flush-workqueue-when-destroying.patch
-device-mapper-multipath-avoid-possible-suspension-deadlock.patch
-device-mapper-multipath-fix-pg-initialisation-races.patch
-device-mapper-fix-dm_swap_table-error-cases.patch
-device-mapper-snapshots-handle-origin-extension.patch
-lower-vm_dontcopy-total_vm.patch
-__wait_on_freeing_inode-fix.patch
-vfs-bugfix-two-read_inode-calles-without.patch
-sparc64-read_mostly-build-fix.patch
-x86_64-section-linkage-fix.patch
-pcmcia-fix-pcmcia-cs-compilation.patch
-yenta-fix-parent-resource-determination.patch
-pcmcia-documentation-update.patch
-yenta-same-resources-in-same-structs.patch
-yenta-allocate-resource-fixes.patch
-acpi-20050408-2.6.13-rc1.patch
-fix-recursive-ipw2200-dependencies.patch
-drivers-net-wireless-ipw2100-use-the-dma_32bit_mask-constant.patch
-drivers-net-wireless-ipw2200-use-the-dma_32bit_mask-constant.patch
-ipw2100-assume-recent-kernel.patch
-ipw2100-kill-dead-macros.patch
-ipw2100-small-cleanups.patch
-ipw2100-remove-commented-out-code.patch
-wireless-device-attr-fixes.patch
-wireless-device-attr-fixes-2.patch
-ipw2100-old-gcc-fix.patch
-ipvs-add-and-reorder-bh-locks-after-moving-to-keventd.patch
-drivers-net-wireless-ipw2200c-remove-division-by-zero.patch
-ppc64-kill-bitfields-in-ppc64-hash-code.patch
-alpha-pgprot_uncached-comment.patch
-uml-remove-user_constantsh-on-clean.patch
-uml-tlb-flushing-fix.patch
-xtensa-remove-old-syscalls-2-2.patch
-xtensa-use-ssleep-instead-of-schedule_timeout.patch
-xip-empty_zero_page-build-fix.patch
-reset-real_timer-target-on-exec-leader-change.patch
-reset-real_timer-target-on-exec-leader-change-coding-style-fixes.patch
-fix-ext3-options-parsing.patch
-fix-ext2-mount-options-parting.patch
-cdev-cdev_put-oops.patch
-tlb-warning-fix.patch
-nfs-procfs-sysctl-interfaces-for-lockd-do-not-work-on-x86_64.patch
-ibm_asm-kconfig-corrections.patch
-tb0219-add-pci-irq-initialization.patch
-documentation-kernel-parameterstxt-fix-a-typo.patch
-kexec-ppc-fix-for-ksysfs-crash_notes.patch
-irda-users-listssourceforgenet-is-subscribers-only.patch
-hardirq-uses-preempt.patch
-fix-soft-lockup-due-to-ntfs-vfs-part-and-explanation.patch
-inotify-45.patch
-dvb-lgdt3302-qam256-initialization-fix.patch
-dvb-lgdt3302-qam256-initialization-fix-fix.patch
-v4l-bttv-input.patch
-v4l-bttv-update.patch
-v4l-cx88-update.patch
-v4l-documentation.patch
-v4l-saa7134-hybrid-dvb.patch
-v4l-i2c-bt832.patch
-v4l-i2c-infrared-remote-control.patch
-v4l-i2c-miscelaneous.patch
-v4l-i2c-tuner.patch
-v4l-drivers-media-video-kconfig.patch
-v4l-mxb-fix-to-correct-tuner-ioctl.patch
-v4l-saa7134-update.patch
-v4l-tuner-3026-replace-obsolete-ioctl.patch
-v4l-tv-eeprom.patch
-kernel-auditc-fix-sparse-warnings-__nocast-type.patch
-scripts-kernel-doc-dont-use-uninitialized-srctree.patch
-net-kconfig-two-atm-related-spelling-fixes.patch

Merged

+vtc-build-fix.patch

Compile fix

+fix-raid0s-attempt-to-divide-by-64bit-numbers.patch

RAID0 fix

+v4l-bug-fixes-for-tuner-cx88-and-tea5767.patch

v4l fix

+xip-empty_zero_page-build-fix.patch
+mm-fix-execute-in-place.patch

Fix the new execute-in-place code for various architectures.

+visws-reexport-pm_power_off.patch

visws build fix

+inotify-documentation-update.patch

inotify docs.

+deprecate-register_serial-and-unregister_serial.patch

Emit nasty build warnings

+rocketc-fix-ldisc-ref-count-handling.patch

Fix the rocket driver

+md-raid1-clear-bitmap-when-fullsync-completes.patch

RAID1 fix

+uart_handle_sysrq_char-warning-fix.patch

Fix a warning

-update-filesystems-for-new-delete_inode-behavior.patch

Drop this - it's in git-ocfs2.patch anyway

+update-filesystems-for-new-delete_inode-behavior-fix.patch

This needs to be, too.

+acpi-fix-table-discovery-from-efi-for-x86.patch

ACPI fix

-gregkh-i2c-i2c-via686a-cleanups.patch
-gregkh-i2c-i2c-tps6501x-cleanups.patch
-gregkh-i2c-i2c-string-strip.patch
-gregkh-i2c-i2c-max6875-may-do-bad-things.patch
-gregkh-i2c-i2c-max6875-documentation.patch
-gregkh-i2c-i2c-max6875-Kconfig.patch
-gregkh-i2c-i2c-m41t00-kfree-fix.patch
-gregkh-i2c-i2c-idr-core.patch
-gregkh-i2c-i2c-drop-bogus-eeprom-comment.patch
-gregkh-i2c-i2c-docs-01.patch
-gregkh-i2c-i2c-docs-02.patch
-gregkh-i2c-i2c-dev-doc-update.patch
-gregkh-i2c-i2c-atxp1-build-fix.patch
-gregkh-i2c-w1-bigendian-crc-fix.patch

The i2c tree is all merged up

+input-synaptics-dynabook.diff.patch

Changes in the input subsystem tree

+apple-usb-touchpad-driver.patch

USB driver

+git-netdev-ieee80211-wifi.patch

Fix up Jeff's stuff

+remove-pci_bridge_ctl_vga-handling-from-setup-busc.patch

PCI fix

+qla-remove-anonymous-union.patch
+qla2xxx-Kconfig-dependency-fix.patch
+fc4-warning-fix.patch

Fix stuff in git-scsi-misc.patch

-gregkh-usb-usb-bMaxPacketSize0-sysfs.patch
-gregkh-usb-usb-storage-unusual-ids-01.patch
-gregkh-usb-usb-khubd-use-kthread.patch
-gregkh-usb-usb-ftdi_sio-device_id-clutter-reduction.patch
-gregkh-usb-usb-ftdi_sio-remove-TIOCMBIS.patch
-gregkh-usb-usb-ftdi_sio-fix-compiler-warnings.patch
-gregkh-usb-usb-atm-01.patch
-gregkh-usb-usb-atm-02.patch
-gregkh-usb-usb-atm-03.patch
-gregkh-usb-usb-sis-makefile-fix.patch
-gregkh-usb-usb-usbmon-print-control-packets.patch
-gregkh-usb-usb-isp116x-hcd-cleanup.patch
-gregkh-usb-usb-kmalloc-flag-cleanup.patch
-gregkh-usb-usb-net2280-warning-fix.patch
-gregkh-usb-usb-keyspan-remote.patch
-gregkh-usb-usb-coverity-desc-bitmap-overrun-fix.patch
-gregkh-usb-usb-ld-hid-blacklist.patch
-gregkh-usb-usb-sn9c10x-update.patch
-gregkh-usb-usb-gadget-ether-fix-01.patch
-gregkh-usb-usb-gadget-ether-fix-02.patch
-gregkh-usb-usb-ohci-udc-tweaks.patch
-gregkh-usb-usb-ohci-omap-pm-updates.patch
-gregkh-usb-usb-ohci-merge-fix.patch
-gregkh-usb-usb-cdc-descriptor-add.patch
-gregkh-usb-usb-export-getput_intf.patch
-gregkh-usb-usb-cdc-acm-reference-count-fix.patch
-gregkh-usb-usb-ldusb.patch

The USB subsystem tree is mostly merged up

+option-card-driver-update-maintainer-entry-fixes.patch

USB driver coding style tweaks

+i6300esb-pci_match_device-fix.patch

Fix bug in bk-watchdog.patch

+proc-pid-numa_maps-to-show-on-which-nodes-pages-reside.patch
+proc-pid-numa_maps-to-show-on-which-nodes-pages-reside-tidy.patch

Add /proc/pid/numa_maps

+smaps-print-more-fields.patch

Print more stuff in /proc/pid/smaps

+s2io-fix-a-compiler-warning-in-a-printk.patch

s2io fix

+tmpfs-enable-atomic-inode-security.patch
+remove-security_inode_post_create-mkdir-symlink-mknod.patch
+remove-the-inode_post_link-and-inode_post_rename-lsm-hooks.patch

Fixes and updates to the LSM work in -mm.

+ppc-ppc64-use-kconfighz.patch
+ppc32-update-defconfigs.patch
+ppc32-add-proper-prototype-for-cpm2_reset.patch
+ppc32-make-the-uarts-on-mpc824x-individual-platform-devices.patch
+ppc64-update-defconfigs.patch
+ppc64-hide-config_adb.patch
+ppc64-genrtc-build-fix.patch

ppc32/ppc64 updates

+hpet-use-read_timer_tsc-only-when-cpu-has-tsc.patch

hpet driver fix

+suspend-update-documentation.patch
+swsusp-fix-printks-and-cleanups.patch
+swsusp-fix-remaining-u32-vs-pm_message_t-confusion.patch
+swsusp-switch-pm_message_t-to-struct.patch
+swsusp-switch-pm_message_t-to-struct-pmac_zilog-fix.patch
+swsusp-switch-pm_message_t-to-struct-ppc32-fixes.patch
+fix-pm_message_t-stuff-in-mm-tree-netdev.patch

Third attempt at finishing off the pm_message_t conversion and switching it
to be a struct. Seems to be OK now.
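
As an illustration of why a struct helps here, the following is a sketch
only: the member name `event` and every EXAMPLE_/example_ name are
assumptions for the sake of the example, not taken from the patches. With a
one-member struct, a bare integer can no longer be passed where a
pm_message_t is expected, so any remaining u32 confusion shows up as a
compile error instead of going unnoticed.

```c
/* Assumed shape after the conversion: a struct wrapping a single int. */
typedef struct pm_message {
    int event;
} pm_message_t;

#define EXAMPLE_EVENT_SUSPEND 2   /* hypothetical value, for illustration */

/* Hypothetical driver suspend hook taking the struct type. */
static int example_suspend(pm_message_t state)
{
    return state.event == EXAMPLE_EVENT_SUSPEND ? 0 : -1;
}

static int example_call_suspend(void)
{
    pm_message_t msg = { .event = EXAMPLE_EVENT_SUSPEND };

    /* example_suspend(3) would not compile: an int is not a pm_message_t. */
    return example_suspend(msg);
}
```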

+aio-add-enosys-into-sys_io_cancel.patch

AIO return value fix

+tpm-support-for-infineon-tpm.patch
+ppc64-tpm_infineon-build-fix.patch

TPM driver updates

+mb_cache_shrink-frees-unexpected-caches.patch

mbcache fix

+inotify-speedup.patch

inotify speed tweak

+kprobes-prevent-possible-race-conditions-generic.patch
+kprobes-prevent-possible-race-conditions-generic-fixes.patch
+kprobes-prevent-possible-race-conditions-i386-changes.patch
+kprobes-prevent-possible-race-conditions-x86_64-changes.patch
+kprobes-prevent-possible-race-conditions-ppc64-changes.patch
+kprobes-prevent-possible-race-conditions-ia64-changes.patch
+kprobes-prevent-possible-race-conditions-ia64-changes-fixes.patch
+kprobes-prevent-possible-race-conditions-sparc64-changes.patch
+kprobes-ia64-fix-race-when-break-hits-and-kprobe-not-found.patch

kprobes work

-pivot_root-circular-reference-fix.patch
+pivot_root-circular-reference-fix-2.patch

Fix this problem in a new way

+ckrm-core-ckrm-event-callbacks.patch
+ckrm-processor-delay-accounting.patch
+ckrm-processor-delay-accounting-warning-fixes.patch
+ckrm-core-infrastructure.patch
+ckrm-resource-control-file-system-rcfs.patch
+ckrm-classtype-definitions-for-task-class.patch
+ckrm-classtype-definitions-for-socket-class.patch
+ckrm-numtasks-controller.patch
+ckrm-documentation.patch
+ckrm-add-missing-read_unlock.patch
+ckrm-move-callbacks-from-listenaq-to-socketclass.patch
+ckrm-change-ipaddr_port-syntax.patch
+ckrm-check-to-see-if-my-guarantee-is-set-to-dontcare.patch
+ckrm-minor-cosmetic-cleanups-in-numtasks-controller.patch
+ckrm-undo-removal-of-check-in-numtasks_put_ref_local.patch
+ckrm-rule-based-classification-engine-stub-rcfs-support.patch
+ckrm-rule-based-classification-engine-basic-rcfs-support.patch
+ckrm-rule-based-classification-engine-bitvector-support-for-classification-info.patch
+ckrm-rule-based-classification-engine-full-ce.patch
+ckrm-rule-based-classification-engine-more-advanced-classification-engine.patch
+ckrm-clean-up-typo-in-printk-message.patch
+ckrm-fix-for-compiler-warnings.patch
+ckrm-fix-share-calculation.patch
+ckrm-fix-edge-cases-with-empty-lists-and-rule-deletion.patch
+ckrm-add-numtasks-controller-config-file-write-support.patch
+ckrm-add-fork-rate-control-to-the-numtasks-controller.patch
+ckrm-classification-engines-rbce-and-crbce-are-mutually-exclusive.patch
+ckrm-make-get_class-global.patch
+ckrm-cleanups-to-ckrm-initialization.patch
+ckrm-replace-target-file-interface-with-a-writable-members-file.patch
+ckrm-use-sizeof-instead-of-define-for-the-array-size-in-taskclass.patch
+ckrm-fix-a-bug-in-the-use-of-classtype.patch
+ckrm-include-taskdelaysh-in-crbceh.patch
+ckrm-send-timestamps-to-userspace-in-msecs-instead-of-jiffies.patch
+ckrm-fix-compile-warnings-and-delete-dead-code.patch
+ckrm-fix-a-null-dereference-bug.patch
+ckrm-classification-engine-configuration-support-cleanup.patch
+ckrm-use-sizeof-instead-of-define-for-the-array-size-in-rbce.patch
+ckrm-delete-target-file-from-tc_magicc.patch

Class-based kernel resource management

-nfs-fix-client-hang-due-to-race-condition.patch
+nfs-split-nfsi-flags-into-two-fields.patch
+nfs-use-atomic-bitops-to-manipulate-flags-in-nfsi-flags.patch
+nfs-introduce-the-use-of-inode-i_lock-to-protect-fields-in-nfsi.patch

Fix this NFS race more cleanly

+spinlock-consolidation-s390-fix.patch

+numa-aware-slab-allocator-v5-fix.patch

Fix numa-aware-slab-allocator-v5.patch

+fix-pm_message_t-stuff-in-mm-tree-perfctr.patch

Update perfctr for the new pm_message_t regime

+fix-page-becoming-writable-in-do_wp_page.patch
+fix-page-becoming-writable-vm_page_prot.patch
+fix-page-becoming-writable-in-do_file_page.patch

Fix add-page-becoming-writable-notification.patch (part of cachefs)

+v9fs-documentation-makefiles-configuration-resend-take-2.patch
+v9fs-vfs-file-dentry-and-directory-operations-resend-take-2.patch
+v9fs-vfs-inode-operations-resend-take-2.patch
+v9fs-vfs-superblock-operations-and-glue-resend-take-2.patch
+v9fs-9p-protocol-implementation-resend-take-2.patch
+v9fs-transport-modules-resend-take-2.patch
+v9fs-debug-and-support-routines-resend-take-2.patch
+v9fs-clean-up-vfs_inode-and-setattr-functions-2.patch

v9fs updates

+fbmon-horizontal-frequency-rounding-fix.patch
+fbmem-use-unregister_chrdev-on-unload.patch
+radeonfb-clean-up-edid-sysfs-attribute.patch
+fbdev-colormap-fixes.patch

fbdev updates

+device-mapper-fix-deadlocks-in-core-prep-fix.patch

Fix device-mapper-fix-deadlocks-in-core-prep.patch

+drivers-scsi-aic7xxx-possible-cleanups.patch
+mm-swap_state-fix-nocast-type-warnings.patch
+spelling-fixes-for-documentation.patch
+lib-radix-tree-fix-nocast-type-warnings.patch
+dmapool-fix-nocast-type-warnings.patch
+telephony-ixj-use-msleep-instead-of-schedule_timeout.patch
+i386-smpboot-use-msleep-instead-of-schedule_timeout.patch

Little fixes and cleanups

+remove-linux-versionh-includes.patch
+remove-linux-versionh-from-drivers-net.patch
+remove-linux-versionh-from-drivers-scsi.patch
+move-kernel_version-from-linux-versionh-to-linux-utsnameh.patch
+move-kernel_version-from-linux-versionh-to-linux-utsnameh-fix.patch
+move-kernel_version-from-linux-versionh-to-linux-utsnameh-fix-2.patch
+move-kernel_version-from-linux-versionh-to-linux-utsnameh-fix-3.patch
+move-kernel_version-from-linux-versionh-to-linux-utsnameh-fix-4.patch
+move-kernel_version-from-linux-versionh-to-linux-utsnameh-fix-5.patch
+remove-linux-versionh-include-for-mm.patch
+remove-linux-versionh-from-net-ieee80211.patch
+remove-linux-versionh-from-drivers-scsi-for-mm.patch
+remove-linux-versionh-from-drivers-net-for-mm.patch

Futz with header files, waste much time.

number of patches in -mm: 591
number of changesets in external trees: 9
number of patches in -mm only: 590
total patches: 599

All 591 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/patch-list

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Russell King

Jul 15, 2005, 5:00:15 AM

On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> +uart_handle_sysrq_char-warning-fix.patch
>
> Fix a warning

Andrew, this requires a little more fixing than your simple patch.
Several drivers omit 'regs' from the receive handler when sysrq is
not enabled. Hence, this simple fix on its own will cause compile
failures.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

Andrew Morton

Jul 15, 2005, 5:10:09 AM

Russell King <rmk+...@arm.linux.org.uk> wrote:
>
> On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> > +uart_handle_sysrq_char-warning-fix.patch
> >
> > Fix a warning
>
> Andrew, this requires a little more fixing than your simple patch.
> Several drivers omit 'regs' from the receive handler when sysrq is
> not enabled. Hence, this simple fix on its own will cause compile
> failures.

Me no understand. It replaces a three-arg macro with a three-arg static
inline?

Russell King

Jul 15, 2005, 5:10:10 AM

On Fri, Jul 15, 2005 at 01:56:29AM -0700, Andrew Morton wrote:
> Russell King <rmk+...@arm.linux.org.uk> wrote:
> >
> > On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> > > +uart_handle_sysrq_char-warning-fix.patch
> > >
> > > Fix a warning
> >
> > Andrew, this requires a little more fixing than your simple patch.
> > Several drivers omit 'regs' from the receive handler when sysrq is
> > not enabled. Hence, this simple fix on its own will cause compile
> > failures.
>
> Me no understand. It replaces a three-arg macro with a three-arg static
> inline?

Some serial drivers drop 'regs' from the parent function when sysrq is
disabled. 'regs' is only passed for sysrq support.

(Yes, it's disgusting, but I thought at the time I had the choice of
being lynched by the "as efficient as possible" mob, or...)

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

Andrew Morton

Jul 15, 2005, 5:20:08 AM

Russell King <rmk+...@arm.linux.org.uk> wrote:
>
> On Fri, Jul 15, 2005 at 01:56:29AM -0700, Andrew Morton wrote:
> > Russell King <rmk+...@arm.linux.org.uk> wrote:
> > >
> > > On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> > > > +uart_handle_sysrq_char-warning-fix.patch
> > > >
> > > > Fix a warning
> > >
> > > Andrew, this requires a little more fixing than your simple patch.
> > > Several drivers omit 'regs' from the receive handler when sysrq is
> > > not enabled. Hence, this simple fix on its own will cause compile
> > > failures.
> >
> > Me no understand. It replaces a three-arg macro with a three-arg static
> > inline?
>
> Some serial drivers drop 'regs' from the parent function when sysrq is
> disabled. 'regs' is only passed for sysrq support.
>

Me still no understand.

+static inline int uart_handle_sysrq_char(struct uart_port *port,
+ unsigned int ch, struct pt_regs *regs)
+{
+ return 0;
+}

That function doesn't touch *regs, and all callers pass in either
a pt_regs* or NULL??
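
A minimal sketch of the difference Russell seems to be describing, assuming
a driver whose receive handler has no 'regs' parameter when sysrq is
disabled. All example_ names are hypothetical. A macro discards its
arguments before the compiler sees them, so an undeclared 'regs' is
harmless; a static inline is a real call, so every argument must be a valid
expression.

```c
struct pt_regs;                           /* opaque, never dereferenced here */
struct uart_port { unsigned int rx; };

/* Old style: the preprocessor drops all three arguments. */
#define example_sysrq_macro(port, ch, regs) (0)

/* New style: a real function, so each argument has to compile. */
static inline int example_sysrq_inline(struct uart_port *port,
                                       unsigned int ch, struct pt_regs *regs)
{
    (void)port; (void)ch; (void)regs;
    return 0;
}

/* Hypothetical handler built without sysrq: note there is no 'regs'. */
static int example_rx_no_regs(struct uart_port *port, unsigned int ch)
{
    (void)ch;
    port->rx++;
    if (example_sysrq_macro(port, ch, regs))   /* fine: 'regs' vanishes */
        return 1;
    /* example_sysrq_inline(port, ch, regs) here would fail to compile,
     * because 'regs' is not declared in this function. */
    return 0;
}

/* Handler that does receive 'regs' works with either form. */
static int example_rx_with_regs(struct uart_port *port, unsigned int ch,
                                struct pt_regs *regs)
{
    port->rx++;
    return example_sysrq_inline(port, ch, regs);
}
```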

Matthias Urlichs

Jul 15, 2005, 5:30:14 AM

Hi, Andrew Morton wrote:

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
Also GITtable, as soon as the mirrors' work is done:

http://www.kernel.org/git/?p=linux/kernel/git/smurf/v2.6.13-rc3-mm1.git;a=summary

... since people asked:
- trees from GIT are properly parent-linked.
- yes, I can import other people's patch series.
- the whole import runs in a couple of minutes.
In fact, I keep suspecting that it must be skipping some important
step or other. ;-) Kudos to Linus, and of course everybody else who's
involved with git, for yet another tool done *right*.

--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | sm...@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -
A classic is something that everybody wants to have read
and nobody wants to read.
-- Mark Twain

Grant Coady

Jul 15, 2005, 6:30:15 AM

On Fri, 15 Jul 2005 01:36:53 -0700, Andrew Morton <ak...@osdl.org> wrote:

>
>ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/

Did _not_ break Yenta + CardBus on Toshiba ToPIC100:
http://scatter.mine.nu/test/linux-2.6/tosh/dmesg-2.6.13-rc3-mm1a.gz

--Grant.

Adrian Bunk

Jul 15, 2005, 6:40:23 AM

On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:

>...
> Changes since 2.6.13-rc2-mm2:
>...
> git-scsi-misc.patch
>...
> Subsystem trees
>...

--- linux-2.6.13-rc3/drivers/scsi/qla2xxx/Makefile 2005-06-17 16:04:01.000000000 -0700
+++ devel/drivers/scsi/qla2xxx/Makefile 2005-07-15 00:46:18.000000000 -0700
@@ -1,4 +1,6 @@
EXTRA_CFLAGS += -DUNIQUE_FW_NAME
+CONFIG_SCSI_QLA24XX=m
+EXTRA_CFLAGS += -DCONFIG_SCSI_QLA24XX -DCONFIG_SCSI_QLA24XX_MODULE

qla2xxx-y := qla_os.o qla_init.o qla_mbx.o qla_iocb.o qla_isr.o qla_gs.o \
qla_dbg.o qla_sup.o qla_rscn.o qla_attr.o
@@ -14,3 +16,4 @@ obj-$(CONFIG_SCSI_QLA22XX) += qla2xxx.o
obj-$(CONFIG_SCSI_QLA2300) += qla2xxx.o qla2300.o
obj-$(CONFIG_SCSI_QLA2322) += qla2xxx.o qla2322.o
obj-$(CONFIG_SCSI_QLA6312) += qla2xxx.o qla6312.o
+obj-$(CONFIG_SCSI_QLA24XX) += qla2xxx.o

I don't know what exactly you want to achieve, but this is so horribly
wrong.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Andrew Morton

Jul 15, 2005, 6:50:09 AM

Grant Coady <x0...@dodo.com.au> wrote:
>
> On Fri, 15 Jul 2005 01:36:53 -0700, Andrew Morton <ak...@osdl.org> wrote:
>
> >
> >ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
> Did _not_ break Yenta + CardBus on Toshiba ToPIC100:
> http://scatter.mine.nu/test/linux-2.6/tosh/dmesg-2.6.13-rc3-mm1a.gz

What does this mean?

Andrew Vasquez

Jul 15, 2005, 10:50:10 AM

On Fri, 15 Jul 2005, Adrian Bunk wrote:

> On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.13-rc2-mm2:
> >...
> > git-scsi-misc.patch
> >...
> > Subsystem trees
> >...
>

...

> +obj-$(CONFIG_SCSI_QLA24XX) += qla2xxx.o
>
>
> I don't know what exactly you want to achieve, but this is so horribly
> wrong.

Yes, quite. How about the following to correct the intention.

Add correct Kconfig option for ISP24xx support.

Signed-off-by: Andrew Vasquez <andrew....@qlogic.com>
---

diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
--- a/drivers/scsi/qla2xxx/Kconfig
+++ b/drivers/scsi/qla2xxx/Kconfig
@@ -39,3 +39,11 @@ config SCSI_QLA6312
---help---
This driver supports the QLogic 63xx (ISP6312 and ISP6322) host
adapter family.
+
+config SCSI_QLA24XX
+ tristate "QLogic ISP24xx host adapter family support"
+ depends on SCSI_QLA2XXX
+ select SCSI_FC_ATTRS
+ ---help---
+ This driver supports the QLogic 24xx (ISP2422 and ISP2432) host
+ adapter family.
diff --git a/drivers/scsi/qla2xxx/Makefile b/drivers/scsi/qla2xxx/Makefile
--- a/drivers/scsi/qla2xxx/Makefile
+++ b/drivers/scsi/qla2xxx/Makefile
@@ -1,6 +1,4 @@
EXTRA_CFLAGS += -DUNIQUE_FW_NAME
-CONFIG_SCSI_QLA24XX=m
-EXTRA_CFLAGS += -DCONFIG_SCSI_QLA24XX -DCONFIG_SCSI_QLA24XX_MODULE

qla2xxx-y := qla_os.o qla_init.o qla_mbx.o qla_iocb.o qla_isr.o qla_gs.o \
qla_dbg.o qla_sup.o qla_rscn.o qla_attr.o
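
A rough sketch of how driver code would typically see the option once it
comes from Kconfig rather than from hand-written -D flags in the Makefile.
The #define at the top only stands in for the generated configuration
header, and the EXAMPLE_/example_ names are hypothetical. For a tristate
set to m, kbuild defines the _MODULE variant of the symbol.

```c
/* Stand-in for the generated config header when SCSI_QLA24XX=m is
 * selected; in a real tree this does not live in the driver sources. */
#define CONFIG_SCSI_QLA24XX_MODULE 1

/* A tristate option is visible as CONFIG_FOO (=y) or CONFIG_FOO_MODULE (=m). */
#if defined(CONFIG_SCSI_QLA24XX) || defined(CONFIG_SCSI_QLA24XX_MODULE)
#define EXAMPLE_HAVE_ISP24XX 1
#else
#define EXAMPLE_HAVE_ISP24XX 0
#endif

/* Hypothetical helper reporting whether ISP24xx support was configured. */
static int example_isp24xx_supported(void)
{
    return EXAMPLE_HAVE_ISP24XX;
}
```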

Christoph Hellwig

Jul 15, 2005, 11:10:14 AM

On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
>

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
> (http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
> kernel.org syncs up)
>
>
> - Added the CKRM patches. This is just here for people to look at at this
> stage.

Andrew, do we really need to add every piece of crap lying on the street
to -mm? It's far away from mainline enough already without adding obviously
unmergeable stuff like this.

Joel Becker

Jul 15, 2005, 1:20:12 PM

On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:

> +update-filesystems-for-new-delete_inode-behavior-fix.patch
>
> This needs to be, too.

Applied.

Joel

--

"There are only two ways to live your life. One is as though nothing
is a miracle. The other is as though everything is a miracle."
- Albert Einstein

Joel Becker
Senior Member of Technical Staff
Oracle
E-mail: joel....@oracle.com
Phone: (650) 506-8127

Matthias Urlichs

Jul 15, 2005, 1:50:08 PM

Hi, Matthias Urlichs wrote:

> Also GITtable, as soon as the mirrors' work is done:
>
> http://www.kernel.org/git/?p=linux/kernel/git/smurf/v2.6.13-rc3-mm1.git;a=summary

Moved to

http://www.kernel.org/git/?p=linux/kernel/git/smurf/linux-trees.git;a=summary

--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | sm...@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -

A marriage is always made up of two people who are prepared to swear that
only the other one snores.
-- Terry Pratchett (The Fifth Elephant)

Andrew Morton

Jul 15, 2005, 4:30:14 PM

Christoph Hellwig <h...@infradead.org> wrote:
>
> On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> >
> > (http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
> > kernel.org syncs up)
> >
> >
> > - Added the CKRM patches. This is just here for people to look at at this
> > stage.
>
> Andrew, do we really need to add every piece of crap lying on the street
> to -mm? It's far away from mainline enough already without adding obviously
> unmergeable stuff like this.

My gut reaction to ckrm is the same as yours. But there's been a lot of
work put into this and if we're to flatly reject the feature then the
developers are owed a much better reason than "eww yuk".

Otherwise, if there are certain specific problems in the code then it's
best that they be pointed out now rather than later on.

What, in your opinion, makes it "obviously unmergeable"?

J.A. Magallon

Jul 15, 2005, 6:20:04 PM

On 07.15, Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
> (http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
> kernel.org syncs up)
>

These are fixes that I still have in my small patchset, collected from the list.
I'm just posting them FWIW (they don't hurt, but I'm no longer sure whether they
are needed). Patches come in replies to this mail...

--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandriva Linux release 2006.0 (Cooker) for i586
Linux 2.6.12-jam9 (gcc 4.0.1 (4.0.1-0.2mdk for Mandriva Linux release 2006.0))

J.A. Magallon

Jul 15, 2005, 6:30:13 PM

On 07.16, J.A. Magallon wrote:
>
> On 07.15, Andrew Morton wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> >

This time I did not break anything... and they shut up gcc4 ;)

--- linux-2.6.12-jam1/scripts/mod/sumversion.c.orig 2005-06-21 23:44:30.000000000 +0200
+++ linux-2.6.12-jam1/scripts/mod/sumversion.c 2005-06-21 23:47:09.000000000 +0200
@@ -252,9 +252,9 @@
}

/* FIXME: Handle .s files differently (eg. # starts comments) --RR */
-static int parse_file(const signed char *fname, struct md4_ctx *md)
+static int parse_file(const char *fname, struct md4_ctx *md)
{
- signed char *file;
+ char *file;
unsigned long i, len;

file = grab_file(fname, &len);
@@ -332,7 +332,7 @@
Sum all files in the same dir or subdirs.
*/
while ((line = get_next_line(&pos, file, flen)) != NULL) {
- signed char* p = line;
+ char* p = line;
if (strncmp(line, "deps_", sizeof("deps_")-1) == 0) {
check_files = 1;
continue;
@@ -458,7 +458,7 @@
close(fd);
}

-static int strip_rcs_crap(signed char *version)
+static int strip_rcs_crap(char *version)
{
unsigned int len, full_len;

--- linux-2.6.12-jam1/scripts/lxdialog/inputbox.c.orig 2005-06-21 23:40:27.000000000 +0200
+++ linux-2.6.12-jam1/scripts/lxdialog/inputbox.c 2005-06-21 23:42:39.000000000 +0200
@@ -21,7 +21,7 @@

#include "dialog.h"

-unsigned char dialog_input_result[MAX_LEN + 1];
+char dialog_input_result[MAX_LEN + 1];

/*
* Print the termination buttons
@@ -48,7 +48,7 @@
{
int i, x, y, box_y, box_x, box_width;
int input_x = 0, scroll = 0, key = 0, button = -1;
- unsigned char *instr = dialog_input_result;
+ char *instr = dialog_input_result;
WINDOW *dialog;

/* center dialog box on screen */
--- linux-2.6.12-jam1/scripts/lxdialog/dialog.h.orig 2005-06-21 23:42:55.000000000 +0200
+++ linux-2.6.12-jam1/scripts/lxdialog/dialog.h 2005-06-21 23:43:19.000000000 +0200
@@ -163,7 +163,7 @@
int dialog_checklist (const char *title, const char *prompt, int height,
int width, int list_height, int item_no,
const char * const * items, int flag);
-extern unsigned char dialog_input_result[];
+extern char dialog_input_result[];
int dialog_inputbox (const char *title, const char *prompt, int height,
int width, const char *init);

--- linux-2.6.12-jam1/scripts/conmakehash.c.orig 2005-06-22 00:16:58.000000000 +0200
+++ linux-2.6.12-jam1/scripts/conmakehash.c 2005-06-22 00:17:21.000000000 +0200
@@ -33,7 +33,7 @@

int getunicode(char **p0)
{
- unsigned char *p = *p0;
+ char *p = *p0;

while (*p == ' ' || *p == '\t')
p++;
--- linux-2.6.12-jam7/scripts/kallsyms.c.orig 2005-07-06 00:16:39.000000000 +0200
+++ linux-2.6.12-jam7/scripts/kallsyms.c 2005-07-06 00:42:24.000000000 +0200
@@ -166,9 +166,9 @@
* move then they may get dropped in pass 2, which breaks the
* kallsyms rules.
*/
- if ((s->addr == _etext && strcmp(s->sym + offset, "_etext")) ||
- (s->addr == _einittext && strcmp(s->sym + offset, "_einittext")) ||
- (s->addr == _eextratext && strcmp(s->sym + offset, "_eextratext")))
+ if ((s->addr == _etext && strcmp((char*)s->sym + offset, "_etext")) ||
+ (s->addr == _einittext && strcmp((char*)s->sym + offset, "_einittext")) ||
+ (s->addr == _eextratext && strcmp((char*)s->sym + offset, "_eextratext")))
return 0;
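
For context, a small sketch of the kind of mismatch the hunks above
address, assuming the warnings in question are gcc 4's pointer signedness
warnings. The function name is hypothetical. strcmp() takes const char *,
so a buffer declared as unsigned char * either needs an explicit cast, as
in the kallsyms.c hunk, or a plain char declaration, as in the other hunks.

```c
#include <string.h>

/* Symbol name stored as unsigned bytes, similar to s->sym above. */
static int example_is_etext(const unsigned char *sym)
{
    /* The cast makes the unsigned char * to char * conversion explicit. */
    return strcmp((const char *)sym, "_etext") == 0;
}
```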

J.A. Magallon

Jul 15, 2005, 6:30:15 PM

On 07.16, J.A. Magallon wrote:
>

> On 07.15, Andrew Morton wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> >

--- linux-2.6.12/fs/smbfs/request.c~ 2005-07-07 14:41:11.000000000 -0400
+++ linux-2.6.12/fs/smbfs/request.c 2005-07-07 14:41:22.000000000 -0400
@@ -348,6 +348,7 @@ int smb_add_request(struct smb_request *
smb_rput(req);
}
smb_unlock_server(server);
+ return -EINTR;
}

if (!timeleft) {

Yoichi Yuasa

Jul 15, 2005, 7:00:11 PM

Hi Andrew

I got the following error.

make ARCH=mips oldconfig
scripts/kconfig/conf -o arch/mips/Kconfig
drivers/video/Kconfig:7:warning: type of 'FB' redefined from 'boolean' to 'tristate'

file drivers/char/speakup/Kconfig already scanned?
make[1]: *** [oldconfig] Error 1
make: *** [oldconfig] Error 2

gregkh-driver-speakup-core.patch

arch/arm/Kconfig | 1
arch/mips/Kconfig | 2
arch/sparc64/Kconfig | 2

It is not necessary to change these three files.
Please remove these changes.

Yoichi

Yoichi Yuasa

Jul 15, 2005, 7:10:10 PM

Hi again,

On Sat, 16 Jul 2005 07:52:42 +0900
Yoichi Yuasa <yu...@hh.iij4u.or.jp> wrote:

> Hi Andrew
>
> I got the following error.
>
> make ARCH=mips oldconfig
> scripts/kconfig/conf -o arch/mips/Kconfig
> drivers/video/Kconfig:7:warning: type of 'FB' redefined from 'boolean' to 'tristate'
>
> file drivers/char/speakup/Kconfig already scanned?
> make[1]: *** [oldconfig] Error 1
> make: *** [oldconfig] Error 2
>
>
> gregkh-driver-speakup-core.patch
>
> arch/arm/Kconfig | 1
> arch/mips/Kconfig | 2
> arch/sparc64/Kconfig | 2
>
> It is not necessary to change these three files.
> Please remove these changes.

Sorry, I was mistaken.
The change is unnecessary only for mips.
Please remove the mips Kconfig change.

Andrew Morton

Jul 15, 2005, 7:40:08 PM

Yoichi Yuasa <yu...@hh.iij4u.or.jp> wrote:
>
> Hi Andrew
>
> I got the following error.
>
> make ARCH=mips oldconfig
> scripts/kconfig/conf -o arch/mips/Kconfig
> drivers/video/Kconfig:7:warning: type of 'FB' redefined from 'boolean' to 'tristate'
>
> file drivers/char/speakup/Kconfig already scanned?
> make[1]: *** [oldconfig] Error 1
> make: *** [oldconfig] Error 2
>

Well arch/mips/Kconfig is defining CONFIG_FB as bool and
drivers/video/Kconfig was changed a while ago to define it as tristate. I
assume this failure also happens in Linus's current tree.

It seems odd that mips is privately duplicating the generic code's
definition. Maybe that needs to be taken out of there.

I'll cc the fbdev guys - could someone please come up with a fix? It's a
showstopper for the MIPS architecture.

Yoichi Yuasa

Jul 15, 2005, 9:20:07 PM

Hi,

On Fri, 15 Jul 2005 16:23:49 -0700
Andrew Morton <ak...@osdl.org> wrote:

> Yoichi Yuasa <yu...@hh.iij4u.or.jp> wrote:
> >
> > Hi Andrew
> >
> > I got the following error.
> >
> > make ARCH=mips oldconfig
> > scripts/kconfig/conf -o arch/mips/Kconfig
> > drivers/video/Kconfig:7:warning: type of 'FB' redefined from 'boolean' to 'tristate'
> >
> > file drivers/char/speakup/Kconfig already scanned?
> > make[1]: *** [oldconfig] Error 1
> > make: *** [oldconfig] Error 2
> >
>
> Well arch/mips/Kconfig is defining CONFIG_FB as bool and
> drivers/video/Kconfig was changed a while ago to define it as tristate. I
> assume this failure also happens in linus's current tree.
>
> It seems odd that mips is privately duplicating the generic code's
> definition. Maybe that needs to be taken out of there.

Yes, it can be removed.

> I'll cc the fbdev guys - could someone please come up with a fix? It's a
> showstopper for the MIPS architecture.

Yoichi

Signed-off-by: Yoichi Yuasa <yu...@hh.iij4u.or.jp>

diff -urN -X dontdiff mm1-orig/arch/mips/Kconfig mm1/arch/mips/Kconfig
--- mm1-orig/arch/mips/Kconfig 2005-07-15 21:44:53.000000000 +0900
+++ mm1/arch/mips/Kconfig 2005-07-16 10:01:29.000000000 +0900
@@ -1090,41 +1090,6 @@
depends on MACH_JAZZ || SNI_RM200_PCI || SGI_IP22 || SGI_IP32
default y

-config FB
- bool
- depends on MIPS_MAGNUM_4000 || OLIVETTI_M700
- default y
- ---help---
- The frame buffer device provides an abstraction for the graphics
- hardware. It represents the frame buffer of some video hardware and
- allows application software to access the graphics hardware through
- a well-defined interface, so the software doesn't need to know
- anything about the low-level (hardware register) stuff.
-
- Frame buffer devices work identically across the different
- architectures supported by Linux and make the implementation of
- application programs easier and more portable; at this point, an X
- server exists which uses the frame buffer device exclusively.
- On several non-X86 architectures, the frame buffer device is the
- only way to use the graphics hardware.
-
- The device is accessed through special device nodes, usually located
- in the /dev directory, i.e. /dev/fb*.
-
- You need an utility program called fbset to make full use of frame
- buffer devices. Please read <file:Documentation/fb/framebuffer.txt>
- and the Framebuffer-HOWTO at <http://www.tldp.org/docs.html#howto>
- for more information.
-
- Say Y here and to the driver for your graphics board below if you
- are compiling a kernel for a non-x86 architecture.
-
- If you are compiling for the x86 architecture, you can say Y if you
- want to play with it, but it is not essential. Please note that
- running graphical applications that directly touch the hardware
- (e.g. an accelerated X server) and that are not frame buffer
- device-aware may cause unexpected results. If unsure, say N.
-
config HAVE_STD_PC_SERIAL_PORT
bool

diff -urN -X dontdiff mm1-orig/drivers/video/Kconfig mm1/drivers/video/Kconfig
--- mm1-orig/drivers/video/Kconfig 2005-07-13 13:46:46.000000000 +0900
+++ mm1/drivers/video/Kconfig 2005-07-16 09:56:59.000000000 +0900
@@ -1399,8 +1399,8 @@
Say Y here to enable kernel support for the on-board framebuffer.

config FB_G364
- bool
- depends on MIPS_MAGNUM_4000 || OLIVETTI_M700
+ bool "G364 frame buffer support"
+ depends on (FB = y) && (MIPS_MAGNUM_4000 || OLIVETTI_M700)
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA
select FB_CFB_IMAGEBLIT

Sam Ravnborg

Jul 16, 2005, 4:30:10 AM

On Fri, Jul 15, 2005 at 10:14:43PM +0000, J.A. Magallon wrote:
>
> On 07.16, J.A. Magallon wrote:
> >
> > On 07.15, Andrew Morton wrote:
> > >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> > >
>
> This time I did not break anything... and they shut up gcc4 ;)

Thanks.
Can you please resend with a proper changelog and Signed-off-by?
The diff should preferably be done on top of the latest -linus.
Also, this patch seems relatively small compared to the others floating
around to cure signedness warnings in scripts/.
Does this really fix all of them, or only a subset of the warnings?

I do not have gcc4 present, but maybe that's easy - running gentoo?

> --- linux-2.6.12-jam7/scripts/kallsyms.c.orig 2005-07-06 00:16:39.000000000 +0200
> +++ linux-2.6.12-jam7/scripts/kallsyms.c 2005-07-06 00:42:24.000000000 +0200
> @@ -166,9 +166,9 @@
> * move then they may get dropped in pass 2, which breaks the
> * kallsyms rules.
> */
> - if ((s->addr == _etext && strcmp(s->sym + offset, "_etext")) ||
> - (s->addr == _einittext && strcmp(s->sym + offset, "_einittext")) ||
> - (s->addr == _eextratext && strcmp(s->sym + offset, "_eextratext")))
> + if ((s->addr == _etext && strcmp((char*)s->sym + offset, "_etext")) ||
> + (s->addr == _einittext && strcmp((char*)s->sym + offset, "_einittext")) ||
> + (s->addr == _eextratext && strcmp((char*)s->sym + offset, "_eextratext")))
> return 0;
> }

Can we have a local variable so we do not have all the casts in the if
condition?

Sam
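
Sam's suggestion of a local variable can be sketched as follows. This is only a minimal illustration, not the actual scripts/kallsyms.c code: `struct sym_entry` is a cut-down, hypothetical stand-in, the function name `symbol_valid_sketch` is made up, and `offset` and the end-of-text address are passed as parameters here instead of being globals. Only one of the three comparisons is shown.

```c
#include <string.h>

/* Hypothetical, cut-down stand-in for the symbol entry in scripts/kallsyms.c. */
struct sym_entry {
	unsigned long long addr;
	unsigned char *sym;	/* unsigned char: the source of the gcc 4 warnings */
};

/*
 * Returns 0 when the symbol sits at the end-of-text address but is not the
 * end marker itself, and 1 otherwise.  The cast is done once, into a local
 * variable, instead of being repeated inside every strcmp() call.
 */
static int symbol_valid_sketch(const struct sym_entry *s, int offset,
			       unsigned long long etext)
{
	const char *name = (const char *)s->sym + offset;

	if (s->addr == etext && strcmp(name, "_etext"))
		return 0;
	return 1;
}
```

The remaining comparisons (`_einittext`, `_eextratext`) would reuse the same `name` variable, so the condition stays free of casts.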

Jindrich Makovicka

Jul 16, 2005, 1:40:06 PM

Andrew Vasquez wrote:
> Yes, quite. How about the following to correct the intention.
>
>
>
> Add correct Kconfig option for ISP24xx support.
>
> Signed-off-by: Andrew Vasquez <andrew....@qlogic.com>
> ---
>
> diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
> --- a/drivers/scsi/qla2xxx/Kconfig
> +++ b/drivers/scsi/qla2xxx/Kconfig
> @@ -39,3 +39,11 @@ config SCSI_QLA6312
> ---help---
> This driver supports the QLogic 63xx (ISP6312 and ISP6322) host
> adapter family.
> +
> +config SCSI_QLA24XX
> + tristate "QLogic ISP24xx host adapter family support"
> + depends on SCSI_QLA2XXX
> + select SCSI_FC_ATTRS

There should also be a "select FW_LOADER", as the driver uses request_firmware and
release_firmware.

> + ---help---
> + This driver supports the QLogic 24xx (ISP2422 and ISP2432) host
> + adapter family.

--
Jindrich Makovicka

Rafael J. Wysocki

Jul 16, 2005, 5:40:07 PM

On Friday, 15 of July 2005 10:36, Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
> (http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
> kernel.org syncs up)

There seems to be a regression wrt 2.6.13-rc3 which causes my box (Asus L5D,
Athlon 64 + nForce3) to hang solid during resume from disk on battery power.

First, 2.6.13-rc3-mm1 is affected by the problems described at:
http://bugzilla.kernel.org/show_bug.cgi?id=4416
http://bugzilla.kernel.org/show_bug.cgi?id=4665
These problems go away after applying the two attached patches. Then, the
box resumes on AC power but hangs solid during resume on battery power.
The problem is 100% reproducible and I think it's related to ACPI.

Greets,
Rafael

--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"

2.6.13-rc3-mm1-irq_router-suspend.patch

2.6.13-rc3-ec-burst-mode-revert.patch

Andrew Morton

Jul 16, 2005, 5:50:06 PM

"Rafael J. Wysocki" <r...@sisk.pl> wrote:
>
> On Friday, 15 of July 2005 10:36, Andrew Morton wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> >
> > (http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
> > kernel.org syncs up)
>
> There seems to be a regression wrt 2.6.13-rc3 which causes my box (Asus L5D,
> Athlon 64 + nForce3) to hang solid during resume from disk on battery power.
>
> First, 2.6.13-rc3-mm1 is affected by the problems described at:
> http://bugzilla.kernel.org/show_bug.cgi?id=4416
> http://bugzilla.kernel.org/show_bug.cgi?id=4665
> These problems go away after applying the two attached patches. Then, the
> box resumes on AC power but hangs solid during resume on battery power.
> The problem is 100% reproducible and I think it's related to ACPI.

That recent acpi merge seems to have damaged a number of people...

Are you able to test Linus's latest -git snapshot? See if there's a
difference between -linus and -mm behaviour?

Laurent Riffard

Jul 16, 2005, 6:20:07 PM

On 15.07.2005 10:36, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/

Hello,

I just got this oops :

Unable to handle kernel NULL pointer dereference at virtual address 00000104
printing eip:
c016c7c4
*pde = 00000000
Oops: 0000 [#1]
last sysfs file:
Modules linked in: isofs pktcdvd autofs4 lp parport_pc parport snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_ens1371 gameport snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore af_packet floppy ne2k_pci 8390 ide_cd cdrom ohci1394 ieee1394 loop aes_i586 dm_crypt nls_iso8859_1 nls_cp850 vfat fat reiser4 zlib_deflate zlib_inflate reiserfs pcspkr via_agp agpgart dm_mod joydev usbhid uhci_hcd usbcore video hotkey configfs
CPU: 0
EIP: 0060:[dnotify_parent+19/67] Not tainted VLI
EIP: 0060:[<c016c7c4>] Not tainted VLI
EFLAGS: 00010202 (2.6.13-rc3-mm1)
EIP is at dnotify_parent+0x13/0x43
eax: c5d13f70 ebx: cffddfd8 ecx: 00000000 edx: 00000002
esi: c6f22504 edi: c5d13f70 ebp: c712ff0c esp: c712ff08
ds: 007b es: 007b ss: 0068
Process gnome-settings- (pid: 5078, threadinfo=c712e000 task=c708d050)
Jul 16 23:36:30 antares gconfd (laurent-4942): Sortie
Stack: 00000002 c712ff7c c0147dcf 00000001 080afdf0 00000000 0000000c c712ff30
c6e62b60 00000001 080afdfc 00000000 c712ff5c c015a7d5 c712ff40 c712ff40
cffdddd8 c02217e7 00000008 00000000 c63afc60 c712ff64 c015a810 c712ff84
Call Trace:
[show_stack+118/126] show_stack+0x76/0x7e
[<c0103861>] show_stack+0x76/0x7e
[show_registers+234/338] show_registers+0xea/0x152
[<c010396a>] show_registers+0xea/0x152
[die+194/316] die+0xc2/0x13c
[<c0103b0e>] die+0xc2/0x13c
[do_page_fault+916/1344] do_page_fault+0x394/0x540
[<c026f801>] do_page_fault+0x394/0x540
[error_code+79/84] error_code+0x4f/0x54
[<c0103583>] error_code+0x4f/0x54
[do_readv_writev+514/572] do_readv_writev+0x202/0x23c
[<c0147dcf>] do_readv_writev+0x202/0x23c
[vfs_writev+64/71] vfs_writev+0x40/0x47
[<c0147e8d>] vfs_writev+0x40/0x47
[sys_writev+59/147] sys_writev+0x3b/0x93
[<c0147f62>] sys_writev+0x3b/0x93
[sysenter_past_esp+84/117] sysenter_past_esp+0x54/0x75
[<c0102a7f>] sysenter_past_esp+0x54/0x75
Code: 85 db 75 be 83 7d ec 00 74 07 89 f8 e8 ba fd ff ff 58 5a 5b 5e 5f c9 c3 83 3d 94 a1 2b c0 00 55 89 e5 53 74 30 8b 58 0c 8b 4b 08 <85> 91 04 01 00 00 74 22 85 db 74 10 8b 03 85 c0 75 08 0f 0b 27
<1>Unable to handle kernel NULL pointer dereference at virtual address 00000099
printing eip:
c015a51b
*pde = 00000000
Oops: 0000 [#2]
last sysfs file:
Modules linked in: isofs pktcdvd autofs4 lp parport_pc parport snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_ens1371 gameport snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore af_packet floppy ne2k_pci 8390 ide_cd cdrom ohci1394 ieee1394 loop aes_i586 dm_crypt nls_iso8859_1 nls_cp850 vfat fat reiser4 zlib_deflate zlib_inflate reiserfs pcspkr via_agp agpgart dm_mod joydev usbhid uhci_hcd usbcore video hotkey configfs
CPU: 0
EIP: 0060:[dcache_shrinker_add+16/52] Not tainted VLI
EIP: 0060:[<c015a51b>] Not tainted VLI
EFLAGS: 00010202 (2.6.13-rc3-mm1)
EIP is at dcache_shrinker_add+0x10/0x34
eax: 00000001 ebx: c712fdb0 ecx: c5d13f70 edx: cffddfd8
esi: cffddfd8 edi: c6e62b60 ebp: c712fda8 esp: c712fda4
ds: 007b es: 007b ss: 0068
Process gnome-settings- (pid: 5078, threadinfo=c712e000 task=c708d050)
Stack: c5d13f70 c712fdcc c015a776 c712fdb8 c026b5cc cffddfd8 c02217e7 00000008
00000000 c6e62b60 c712fdd4 c015a810 c712fdf4 c0148546 c6f22504 cffe04e0
c5d13f70 c6e62b60 c131f040 00000000 c712fe00 c0148422 c6e62b60 c712fe14
Call Trace:
[show_stack+118/126] show_stack+0x76/0x7e
[<c0103861>] show_stack+0x76/0x7e
[show_registers+234/338] show_registers+0xea/0x152
[<c010396a>] show_registers+0xea/0x152
[die+194/316] die+0xc2/0x13c
[<c0103b0e>] die+0xc2/0x13c
[do_page_fault+916/1344] do_page_fault+0x394/0x540
[<c026f801>] do_page_fault+0x394/0x540
[error_code+79/84] error_code+0x4f/0x54
[<c0103583>] error_code+0x4f/0x54
[dput_recursive+326/466] dput_recursive+0x146/0x1d2
[<c015a776>] dput_recursive+0x146/0x1d2
[dput+14/16] dput+0xe/0x10
[<c015a810>] dput+0xe/0x10
[__fput+287/354] __fput+0x11f/0x162
[<c0148546>] __fput+0x11f/0x162
[fput+46/51] fput+0x2e/0x33
[<c0148422>] fput+0x2e/0x33
[filp_close+78/88] filp_close+0x4e/0x58
[<c01470fd>] filp_close+0x4e/0x58
[put_files_struct+132/183] put_files_struct+0x84/0xb7
[<c0117d52>] put_files_struct+0x84/0xb7
[do_exit+376/844] do_exit+0x178/0x34c
[<c011885d>] do_exit+0x178/0x34c
[do_divide_error+0/153] do_divide_error+0x0/0x99
[<c0103b88>] do_divide_error+0x0/0x99
[do_page_fault+916/1344] do_page_fault+0x394/0x540
[<c026f801>] do_page_fault+0x394/0x540
[error_code+79/84] error_code+0x4f/0x54
[<c0103583>] error_code+0x4f/0x54
[do_readv_writev+514/572] do_readv_writev+0x202/0x23c
[<c0147dcf>] do_readv_writev+0x202/0x23c
[vfs_writev+64/71] vfs_writev+0x40/0x47
[<c0147e8d>] vfs_writev+0x40/0x47
[sys_writev+59/147] sys_writev+0x3b/0x93
[<c0147f62>] sys_writev+0x3b/0x93
[sysenter_past_esp+84/117] sysenter_past_esp+0x54/0x75
[<c0102a7f>] sysenter_past_esp+0x54/0x75
Code: 8b 50 10 85 d2 74 04 89 d8 ff d2 8d 43 4c ba bc a4 15 c0 e8 7c 90 fc ff 5b c9 c3 55 39 ca 89 e5 53 89 c3 74 22 8b 42 44 89 53 08 <8b> 90 98 00 00 00 8d 88 98 00 00 00 89 5a 04 89 13 89 4b 04 89
<1>Fixing recursive fault but reboot is needed!

It happened only once. I noticed it because I wasn't able to suspend to disk: the gnome-settings- process (as named in the oops) was stuck in D state.

I don't know if this is specific to 2.6.13-rc3-mm1.
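
A note on reading the two fault addresses above. In each Code: line the marked instruction appears to use a base register plus a 32-bit displacement (this is my reading of the opcode bytes, not something stated in the report), and the faulting address is simply the sum of the two. A minimal sketch of that arithmetic, with a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Effective address of a base-plus-displacement memory operand on i386.
 * The helper name is made up for illustration.
 */
static uint32_t fault_address(uint32_t base_reg, uint32_t displacement)
{
	return base_reg + displacement;
}
```

With the values from the register dumps, the first oops gives ecx = 0 plus displacement 0x104, i.e. address 0x104, and the second gives eax = 1 plus displacement 0x98, i.e. address 0x99. Both match the reported "virtual address" lines, which is consistent with a member access through a NULL or near-NULL pointer.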

.config is attached below.

Feel free to ask for more information.
~~
laurent

signature.asc

config-2.6.13-rc3-mm1

Joseph Fannin

Jul 16, 2005, 9:40:06 PM

On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/

> +suspend-update-documentation.patch
> +swsusp-fix-printks-and-cleanups.patch
> +swsusp-fix-remaining-u32-vs-pm_message_t-confusion.patch
> +swsusp-switch-pm_message_t-to-struct.patch
> +swsusp-switch-pm_message_t-to-struct-pmac_zilog-fix.patch
> +swsusp-switch-pm_message_t-to-struct-ppc32-fixes.patch
> +fix-pm_message_t-stuff-in-mm-tree-netdev.patch

I'm getting this (on ppc32, though I don't think it matters):

CC drivers/video/chipsfb.o
drivers/video/chipsfb.c: In function `chipsfb_pci_suspend':
drivers/video/chipsfb.c:465: error: invalid operands to binary ==
drivers/video/chipsfb.c:467: error: invalid operands to binary !=
make[3]: *** [drivers/video/chipsfb.o] Error 1
make[2]: *** [drivers/video] Error 2
make[1]: *** [drivers] Error 2
make[1]: Leaving directory `/usr/src/linux-ctesiphon/linux-2.6.13-rc3-mm1'
make: *** [stamp-build] Error 2

The above-quoted patches seem to be the culprit, but my feeble
attempts at making a patch didn't work out.
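
The two compiler errors are what one would expect when a type that used to be a plain integer becomes a struct: C has no `==` or `!=` for struct operands. The sketch below assumes the new type wraps a single integer member; the type name, the member name `event`, and the constants are all hypothetical, and this is not the actual chipsfb fix.

```c
/* Assumed shape of the new type: a struct wrapping what used to be an integer. */
typedef struct pm_message_sketch {
	int event;
} pm_message_sketch_t;

/* Hypothetical event codes, for illustration only. */
enum { SKETCH_EVENT_ON = 0, SKETCH_EVENT_SUSPEND = 2 };

/*
 * "state == SKETCH_EVENT_SUSPEND" would no longer compile once state is a
 * struct ("invalid operands to binary =="); comparing the member does.
 */
static int is_suspend_request(pm_message_sketch_t state)
{
	return state.event == SKETCH_EVENT_SUSPEND;
}
```

A driver converted along these lines would compare the member wherever it previously compared the whole value.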

While I'm complaining:

> Q: Why we cannot suspend to a swap file?

> A: Because accessing swap file needs the filesystem mounted, and
> filesystem might do something wrong (like replaying the journal)
> during mount. [Probably could be solved by modifying every filesystem
> to support some kind of "really read-only!" option. Patches welcome.]

I seem to recall that swsusp2 can do this.

I don't hold out much hope that suspend will ever work on my
laptop, with its i815 video chipset, at least not from X (and then
there's no point). The i81x and the linux video architecture just
don't get along, even if I do away with i810fb and DRM support.

But I can't help but notice that every linux-suspend HOWTO tells
you to patch in swsusp2 as a first step -- the consensus seems to be
that if you want clean and conservative code, use swsusp1; if you want
suspending to *work*, use swsusp2. How many people are actually able
to make use of swsusp1? Is anyone testing it besides Mr. Machek?

This is a case in point; every time I partition a system for
Linux, I have to consider whether or not I'm ever going to want swsusp
to work on that box. The performance penalty for swap files went
away in 2.6, so this is sort of a regression.

I know I'm not going to be writing any of those patches, but it'd
sure be nice if Linux got around to having usable suspend support
without being beholden to whatever patches Mr. Cunningham gets
around to putting out.

--
Joseph Fannin
j...@rivenstone.net

/* So there I am, in the middle of my `netfilter-is-wonderful'
talk in Sydney, and someone asks `What happens if you try
to enlarge a 64k packet here?'. I think I said something
eloquent like `fuck'. - RR */

Adrian Bunk

Jul 16, 2005, 10:50:04 PM

[ The subject was adapted to linux-kernel spam filters... ]

On Fri, Jul 15, 2005 at 07:40:37AM -0700, Andrew Vasquez wrote:
> On Fri, 15 Jul 2005, Adrian Bunk wrote:
>
> > On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.13-rc2-mm2:
> > >...
> > > git-scsi-misc.patch
> > >...
> > > Subsystem trees
> > >...
> >
> ...
> > +obj-$(CONFIG_SCSI_QLA24XX) += qla2xxx.o
> >
> >
> > I don't know what exactly you want to achieve, but this is so horribly
> > wrong.
>
>
> Yes, quite. How about the following to correct the intention.

>...

It looks good (except that you used spaces instead of a tab in the
"select" line, but that's only a minor nitpick).

Below is another fix for a different issue that was already present.

cu
Adrian

<-- snip -->

SCSI_QLA2XXX is automatically enabled for (SCSI && PCI).
It therefore mustn't select SCSI_FC_ATTRS, since it otherwise
unconditionally enables SCSI_FC_ATTRS for all users with
(SCSI && PCI) enabled, even when they don't need any support for
QLogic hardware.

This patch also makes a cosmetic change so that the "default" line looks
more like other kernel code.

Signed-off-by: Adrian Bunk <bu...@stusta.de>

--- linux-2.6.13-rc3-mm1-full/drivers/scsi/qla2xxx/Kconfig.old 2005-07-15 22:05:19.000000000 +0200
+++ linux-2.6.13-rc3-mm1-full/drivers/scsi/qla2xxx/Kconfig 2005-07-15 22:07:42.000000000 +0200
@@ -1,8 +1,7 @@
config SCSI_QLA2XXX
tristate
- default (SCSI && PCI)
depends on SCSI && PCI
- select SCSI_FC_ATTRS
+ default y

config SCSI_QLA21XX
tristate "QLogic ISP2100 host adapter family support"

Lee Revell

Jul 16, 2005, 11:20:06 PM

On Sun, 2005-07-17 at 04:38 +0200, Adrian Bunk wrote:
> SCSI_QLA2XXX is automatically enabled for (SCSI && PCI).

This has bugged me for a while. Why does this one SCSI driver default
to Y in the first place?

Lee

randy_dunlap

Jul 17, 2005, 12:10:04 AM

On Sat, 16 Jul 2005 23:11:26 -0400 Lee Revell wrote:

> On Sun, 2005-07-17 at 04:38 +0200, Adrian Bunk wrote:
> > SCSI_QLA2XXX is automatically enabled for (SCSI && PCI).
>
> This has bugged me for a while. Why does this one SCSI driver default
> to Y in the first place?

It's not a driver, it's a subdirectory.

---
~Randy

Lee Revell

Jul 17, 2005, 12:30:09 AM

On Sat, 2005-07-16 at 21:04 -0700, randy_dunlap wrote:
> On Sat, 16 Jul 2005 23:11:26 -0400 Lee Revell wrote:
>
> > On Sun, 2005-07-17 at 04:38 +0200, Adrian Bunk wrote:
> > > SCSI_QLA2XXX is automatically enabled for (SCSI && PCI).
> >
> > This has bugged me for a while. Why does this one SCSI driver default
> > to Y in the first place?
>
> It's not a driver, it's a subdirectory.

Ah, ok. Thanks.

Lee

Paul Jackson

Jul 17, 2005, 11:30:47 AM

Andrew, replying to Christoph, about CKRM:
> What, in your opinion, makes it "obviously unmergeable"?

Thanks to some earlier discussions on the relation of CKRM with
cpusets, I've spent some time looking at CKRM. I'm not Christoph,
but perhaps my notes will be of some use in this matter.

CKRM is big, it's difficult for us mere mortals to understand, and it
has attracted only limited review - inadequate review in proportion
to its size and impact. I tried, and failed, sometime last year to
explain some of what I found difficult to grasp of CKRM to the folks
doing it. See further an email thread entitled:

Classes: 1) what are they, 2) what is their name?
http://sourceforge.net/mailarchive/forum.php?thread_id=5328162&forum_id=35191

on the ckrm...@lists.sourceforge.net email list between Aug 14 and
Aug 27, 2004

As to its size, CKRM is in a 2.6.5 variant of SuSE that I happen to be
building just now for other reasons. The source files that have 'ckrm'
in the pathname, _not_ counting Doc files, total 13044 lines of text.
The CONFIG_CKRM* config options add 144 Kbytes to the kernel text.

The CKRM patches in 2.6.13-rc3-mm1 are similar in size. These patch
files total 14367 lines of text.

It is somewhat intrusive in the areas it controls, such as some large
ifdef's in kernel/sched.c.

The sched hooks may well impact the cost of maintaining the sched code,
which is always a hotbed of Linux kernel development. However others
who work in that area will have to speak to that concern.

I tried just now to read through the ckrm hooks in fork, to see
what sort of impact they might have on scalability on large systems.
But I gave up after a couple layers of indirection. I saw several
atomic counters and a couple of spinlocks that I suspect (not at all
sure) lay on the fork main code path. I'd be surprised if this didn't
impact scalability. Earlier, according to my notes, I saw mention of
lmbench results in the OLS 2004 slides, indicating a several percent
cost of available cpu cycles.

A feature of this size and impact needs to attract a fair bit of
discussion, because it is essential to a variety of people, or because
it is intriguing in some other way.

I suspect that the main problem is that this patch is not a mainstream
kernel feature that will gain multiple uses, but rather provides
support for a specific vendor middleware product used by that
vendor and a few closely allied vendors. If it were smaller or
less intrusive, such as a driver, this would not be a big problem.
That's not the case.

The threshold of what is sufficient review needs to be set rather high
for such a patch, quite a bit higher than I believe it has obtained
so far. It will not be easy for them to obtain that level of review,
until they get better at arousing the sustained interest of other
kernel developers.

There may well be multiple end users and applications depending on
CKRM, but I have not been able to identify how many separate vendors
provide middleware that depends on CKRM. I am guessing that only one
vendor has a serious middleware software product that provides full
CKRM support. Acceptance of CKRM would be easier if multiple competing
middleware vendors were using it. It is also a concern that CKRM
is not really usable for its primary intended purpose except if it
is accompanied by this corresponding middleware, which I presume is
proprietary code. I'd like to see a persuasive case that CKRM is
useful and used on production systems not running substantial sole
sourced proprietary middleware.

The development and maintenance costs so far of CKRM appear (to
this outsider) to have been substantial, which suggests that the
maintenance costs of CKRM once in the kernel would be non-trivial.
Given the size of the project, its impact on kernel code, and the
rather limited degree to which developers outside of the CKRM project
have participated in CKRM's development or review, this could either
leave the Linux kernel overly dependent on one vendor for maintaining
CKRM, or place an undue maintenance burden on other kernel developers.

CKRM is in part a generalization and descendant of what I call fair
share schedulers. For example, the fork hooks for CKRM include a
forkrates controller, to slow down the rate of forking of tasks using
too many resources.

No doubt the CKRM experts are already familiar with these, but for
the possible benefit of other readers:

UNICOS Resource Administration - Chapter 4. Fair-share Scheduler
http://oscinfo.osc.edu:8080/dynaweb/all/004-2302-001/@Generic__BookTextView/22883

SHARE II -- A User Administration and Resource Control System for UNIX
http://www.c-side.com/c/papers/lisa-91.html

Solaris Resource Manager White Paper
http://wwws.sun.com/software/resourcemgr/wp-mixed/

ON THE PERFORMANCE IMPACT OF FAIR SHARE SCHEDULING
http://www.cs.umb.edu/~eb/goalmode/cmg2000final.htm

A Fair Share Scheduler, J. Kay and P. Lauder
Communications of the ACM, January 1988, Volume 31, Number 1, pp 44-55.

The documentation that I've noticed (likely I've missed something)
doesn't do an adequate job of making the case - providing the
motivation and context essential to understanding this patch set.

Because CKRM provides an infrastructure for multiple controllers
(limiting forks, memory allocation and network rates) and multiple
classifiers and policies, its critical interfaces have rather
generic and abstract names. This makes it difficult for others to
approach CKRM, reducing the rate of peer review by other Linux kernel
developers, which is perhaps the key impediment to acceptance of CKRM.
If anything, CKRM tends to be a little too abstract.

Inclusion of diffstat output would help convey to others the scope
of the patchset.

My notes from many months ago indicate something about a 128 CPU
limit in CKRM. I don't know why, nor if it still applies. It is
certainly a smaller limit than the systems I care about.

A major restructuring of this patch set could be considered. This
might involve making the metric tools (that monitor memory, fork
and network usage rates per task) separate patches useful for other
purposes. It might also make the rate limiters in fork, alloc and
network i/o separately useful patches. I mean here genuinely useful
and understandable in their own right, independent of some abstract
CKRM framework.

Though hints have been dropped, I have not seen any public effort to
integrate CKRM with either cpusets or scheduler domains or process
accounting. By this I don't mean recoding cpusets using the CKRM
infrastructure; that proposal received _extensive_ consideration
earlier, and I am as certain as ever that it made no sense. Rather I
could imagine the CKRM folks extending cpusets to manage resources
on a per-cpuset basis, not just on a per-task or task class basis.
Similarly, it might make sense to use CKRM to manage resources on
a per-sched domain basis, and to integrate the resource tracking
of CKRM with the resource tracking needs of system accounting.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <p...@sgi.com> 1.925.600.0401

Mark Hahn

Jul 17, 2005, 3:10:05 PM

> I suspect that the main problem is that this patch is not a mainstream
> kernel feature that will gain multiple uses, but rather provides
> support for a specific vendor middleware product used by that
> vendor and a few closely allied vendors. If it were smaller or
> less intrusive, such as a driver, this would not be a big problem.
> That's not the case.

yes, that's the crux. CKRM is all about resolving conflicting resource
demands in a multi-user, multi-server, multi-purpose machine. this is a
huge undertaking, and I'd argue that it's completely inappropriate for
*most* servers. that is, computers are generally so damn cheap that
the clear trend is towards dedicating a machine to a specific purpose,
rather than running eg, shell/MUA/MTA/FS/DB/etc all on a single machine.

this is *directly* in conflict with certain prominent products, such as
the Altix and various less-prominent Linux-based mainframes. they're all
about partitioning/virtualization - the big-iron aesthetic of splitting up
a single machine. note that it's not just about "big", since cluster-based
approaches can clearly scale far past big-iron, and are in effect statically
partitioned. yes, buying a hideously expensive single box, and then chopping
it into little pieces is more than a little bizarre, and is mainly based
on a couple assumptions:

- that clusters are hard. really, they aren't. they are not
necessarily higher-maintenance, can be far more robust, usually
do cost less. just about the only bad thing about clusters is
that they tend to be somewhat larger in size.

- that partitioning actually makes sense. the appeal is that if
you have a partition to yourself, you can only hurt yourself.
but it also follows that burstiness in resource demand cannot be
overlapped without either constantly tuning the partitions or
infringing on the guarantee.

CKRM is one of those things that could be done to Linux, and will benefit a
few, but which will almost certainly hurt *most* of the community.

let me say that the CKRM design is actually quite good. the issue is whether
the extensive hooks it requires can be done (at all) in a way which does
not disproportionately hurt maintainability or efficiency.

CKRM requires hooks into every resource-allocation decision fastpath:
- if CKRM is not CONFIG, the only overhead is software maintenance.
- if CKRM is CONFIG but not loaded, the overhead is a pointer check.
- if CKRM is CONFIG and loaded, the overhead is a pointer check
and a nontrivial callback.
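
The "configured but not loaded" and "configured and loaded" cases above can be sketched with a plain function-pointer hook. This is a generic illustration of the pattern, not CKRM's actual interface: every name here (`resource_hook_t`, `resource_hook`, `charge_allowed`, `cap_at_100`) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical callback type: returns nonzero to allow the request. */
typedef int (*resource_hook_t)(int class_id, unsigned long amount);

/* Stays NULL while no controller is loaded. */
static resource_hook_t resource_hook;

/*
 * Called on the allocation path.  With no controller loaded the cost is a
 * single pointer check; with one loaded it is the check plus the callback.
 */
static int charge_allowed(int class_id, unsigned long amount)
{
	resource_hook_t hook = resource_hook;

	if (hook == NULL)
		return 1;
	return hook(class_id, amount);
}

/* Example controller: a flat cap of 100 units per request. */
static int cap_at_100(int class_id, unsigned long amount)
{
	(void)class_id;	/* unused in this toy policy */
	return amount <= 100;
}
```

Assigning `cap_at_100` to `resource_hook` corresponds to loading a controller. Compiling the feature out entirely would replace `charge_allowed` with a stub that always returns 1, which leaves only the maintenance cost of carrying the call sites.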

but really, this is only for CKRM-enforced limits. CKRM really wants to
change behavior in a more "weighted" way, not just causing an
allocation/fork/packet to fail. a really meaningful CKRM needs to
be tightly integrated into each resource manager - affecting each scheduler
(process, memory, IO, net). I don't really see how full-on CKRM can be
compiled out, unless these schedulers are made fully pluggable.

finally, I observe that pluggable, class-based resource _limits_ could
probably be done without callbacks and potentially with low overhead.
but mere limits don't meet CKRM's goal of flexible, widespread resource
partitioning within a large, shared machine.

regards, mark hahn.

Rafael J. Wysocki

Jul 17, 2005, 4:20:09 PM

On Saturday, 16 of July 2005 23:39, Andrew Morton wrote:
> "Rafael J. Wysocki" <r...@sisk.pl> wrote:
> >
> > On Friday, 15 of July 2005 10:36, Andrew Morton wrote:
> > >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> > >
> > > (http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
> > > kernel.org syncs up)
> >
> > There seems to be a regression wrt 2.6.13-rc3 which causes my box (Asus L5D,
> > Athlon 64 + nForce3) to hang solid during resume from disk on battery power.
> >
> > First, 2.6.13-rc3-mm1 is affected by the problems described at:
> > http://bugzilla.kernel.org/show_bug.cgi?id=4416
> > http://bugzilla.kernel.org/show_bug.cgi?id=4665
> > These problems go away after applying the two attached patches. Then, the
> > box resumes on AC power but hangs solid during resume on battery power.
> > The problem is 100% reproducible and I think it's related to ACPI.
>
> That recent acpi merge seems to have damaged a number of people...
>
> Are you able to test Linus's latest -git snapshot? See if there's a
> difference between -linus and -mm behaviour?

I was afraid you would say so. ;-)

The -rc3-git-[2-4] kernels are unaffected by the problem described, so it seems
to be specific to -rc3-mm1.

Greets,
Rafael

--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"

Rafael J. Wysocki

Jul 17, 2005, 4:31:10 PM

On Friday, 15 of July 2005 10:36, Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
> (http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.13-rc3-mm1.gz until
> kernel.org syncs up)

Apparently, mount does not work with partitions located on a 3ware RAID
(8006-2PL controller) in a dual-Opteron box (64-bit kernel).

If the kernel is configured with preemption and NUMA, it cannot mount any
"real" filesystems and the output of "mount" is the following:

rootfs on / type ext3 (rw)
/dev/root on / type ext3 (rw)
proc on /proc type proc (rw,nodiratime)
sysfs on /sys type sysfs (rw)
tmpfs on /dev/shm type tmpfs (rw)

(hand-copied from the screen). I have tried some other combinations (i.e.
preemption w/o NUMA, NUMA w/o preemption etc.) and it seems that it works
better with CONFIG_PREEMPT_NONE set, although even in this case some
filesystems are mounted read-only.

The mainline kernels (ie. -rc3 and -rc3-git[1-4]) have no such problems.

Hirokazu Takahashi

Jul 18, 2005, 6:20:08 AM

Hi,

> > What, in your opinion, makes it "obviously unmergeable"?

I think the concept of controlling resource assignment is good.
But the design is another matter: the current CKRM seems somewhat
overkill.

> I suspect that the main problem is that this patch is not a mainstream
> kernel feature that will gain multiple uses, but rather provides
> support for a specific vendor middleware product used by that
> vendor and a few closely allied vendors. If it were smaller or
> less intrusive, such as a driver, this would not be a big problem.
> That's not the case.

I believe this feature would also make desktop users happier -- controlling
X-server, mpeg player, video capturing and all that -- if the code
becomes much simpler and easier to use.

> A major restructuring of this patch set could be considered. This
> might involve making the metric tools (that monitor memory, fork
> and network usage rates per task) separate patches useful for other
> purposes. It might also make the rate limiters in fork, alloc and
> network i/o separately useful patches. I mean here genuinely useful
> and understandable in their own right, independent of some abstract
> CKRM framework.

That makes sense.

> Though hints have been dropped, I have not seen any public effort to
> integrate CKRM with either cpusets or scheduler domains or process
> accounting. By this I don't mean recoding cpusets using the CKRM
> infrastructure; that proposal received _extensive_ consideration
> earlier, and I am as certain as ever that it made no sense. Rather I
> could imagine the CKRM folks extending cpusets to manage resources
> on a per-cpuset basis, not just on a per-task or task class basis.
> Similarly, it might make sense to use CKRM to manage resources on
> a per-sched domain basis, and to integrate the resource tracking
> of CKRM with the resource tracking needs of system accounting.

From the users' standpoint, CKRM and CPUSETS should be managed
seamlessly through the same interface, though I'm not yet sure whether
your idea is the best one.

Thanks,
Hirokazu Takahashi.

Paulo Marques

Jul 18, 2005, 7:20:28 AM

Sam Ravnborg wrote:
> On Fri, Jul 15, 2005 at 10:14:43PM +0000, J.A. Magallon wrote:
>
>>On 07.16, J.A. Magallon wrote:

>>[...]

>>This time I did not break anything... and they shut up gcc4 ;)
>
> Thanks.
> Can you please resend with a proper changelog and Signed-off-by?
> The diff should preferably be done on top of the latest -linus.
> Also this patch seems relatively small compared to the others floating
> around to cure signed warnings in scripts/
> Does this really fix all of them or only a subset of the warnings?

Well, current -linus already has a patch from me to change the
compression scheme that also fixes most of the signedness problems. The
ones below escaped me because my gcc3.3.2 didn't complain about them
even with all the -W[xxx] switches I could find.

This takes a big hunk out of previous patches I've seen, so that might
explain the difference.

--
Paulo Marques - www.grupopie.com

It is a mistake to think you can solve any major problems
just with potatoes.
Douglas Adams

Paulo Marques

Jul 18, 2005, 7:40:25 AM

Paulo Marques wrote:
> Sam Ravnborg wrote:
> [...]

>> Also this patch seems relatively small compared to the others floating
>> around to cure signed warnings in scripts/
>> Does this really fix all of them or only a subset of the warnings?
>
> Well, current -linus already has a patch from me to change the

^^^^^^
I meant -mm... :P

Pavel Machek

Jul 18, 2005, 7:50:12 AM

Hi!

> I'm getting this (on ppc32, though I don't think it matters):
>
> CC drivers/video/chipsfb.o
> drivers/video/chipsfb.c: In function `chipsfb_pci_suspend':
> drivers/video/chipsfb.c:465: error: invalid operands to binary ==
> drivers/video/chipsfb.c:467: error: invalid operands to binary !=
> make[3]: *** [drivers/video/chipsfb.o] Error 1
> make[2]: *** [drivers/video] Error 2
> make[1]: *** [drivers] Error 2
> make[1]: Leaving directory
> `/usr/src/linux-ctesiphon/linux-2.6.13-rc3-mm1'
> make: *** [stamp-build] Error 2
>
> The above-quoted patches seem to be the culprit, but my feeble
> attempts at making a patch didn't work out.

Should be easy. Just add .event at the right places...
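
For anyone hunting for those places, a rough sketch of how candidate lines could be listed. It assumes the affected code still reads or writes `power_state` without `.event`; the helper name `find_plain_power_state` is made up for this example, and the match is only a heuristic (comparisons against the `state` argument alone will not show up, so the compiler errors remain the authoritative list).

```shell
# Hypothetical helper: list lines that mention power_state but never
# power_state.event, i.e. likely leftovers from the u32 -> struct switch.
find_plain_power_state() {
    # grep -n prefixes each hit with its line number (and with the file
    # name when more than one file is given).
    grep -n 'power_state' "$@" | grep -v 'power_state\.event'
}
```

It could be run as `find_plain_power_state drivers/video/chipsfb.c`, or over several files at once.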

> But I can't help but notice that every linux-suspend HOWTO tells
> you to patch in swsusp2 as a first step -- the consensus seems to be
> that if you want clean and conservative code, use swsusp1; if you want
> suspending to *work*, use swsusp2. How many people are actually able
> to make use of swsusp1? Is anyone testing it besides Mr. Machek?

SuSE ships it in production, so I believe we have at least as many
users as suspend2...
Pavel
--
teflon -- maybe it is a trademark, but it should not be.

Joseph Fannin

Jul 18, 2005, 10:30:20 AM

On Sat, Jul 16, 2005 at 09:32:49PM -0400, wrote:
> On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
> >
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
> > +suspend-update-documentation.patch
> > +swsusp-fix-printks-and-cleanups.patch
> > +swsusp-fix-remaining-u32-vs-pm_message_t-confusion.patch
> > +swsusp-switch-pm_message_t-to-struct.patch
> > +swsusp-switch-pm_message_t-to-struct-pmac_zilog-fix.patch
> > +swsusp-switch-pm_message_t-to-struct-ppc32-fixes.patch
> > +fix-pm_message_t-stuff-in-mm-tree-netdev.patch
>

I needed this little patch too. It's boot-tested; I have a MESH
controller.

Thanks!

-

diff -aurN linux-2.6.13-rc3-mm1/drivers/scsi/mesh.c linux-2.6.13-rc3-mm1_changed/drivers/scsi/mesh.c
--- linux-2.6.13-rc3-mm1/drivers/scsi/mesh.c 2005-07-16 01:46:44.000000000 -0400
+++ linux-2.6.13-rc3-mm1_changed/drivers/scsi/mesh.c 2005-07-18 07:52:04.000000000 -0400
@@ -1766,7 +1766,7 @@
 	struct mesh_state *ms = (struct mesh_state *)macio_get_drvdata(mdev);
 	unsigned long flags;

-	if (state == mdev->ofdev.dev.power.power_state || state < 2)
+	if (state.event == mdev->ofdev.dev.power.power_state.event || state.event < 2)
 		return 0;

 	scsi_block_requests(ms->host);
@@ -1781,7 +1781,7 @@
 	disable_irq(ms->meshintr);
 	set_mesh_power(ms, 0);

-	mdev->ofdev.dev.power.power_state = state;
+	mdev->ofdev.dev.power.power_state.event = state.event;

 	return 0;
 }
@@ -1791,7 +1791,7 @@
 	struct mesh_state *ms = (struct mesh_state *)macio_get_drvdata(mdev);
 	unsigned long flags;

-	if (mdev->ofdev.dev.power.power_state == 0)
+	if (mdev->ofdev.dev.power.power_state.event == 0)
 		return 0;

 	set_mesh_power(ms, 1);
@@ -1802,7 +1802,7 @@
 	enable_irq(ms->meshintr);
 	scsi_unblock_requests(ms->host);

-	mdev->ofdev.dev.power.power_state = 0;
+	mdev->ofdev.dev.power.power_state.event = 0;

 	return 0;
 }

--
Joseph Fannin
jfa...@gmail.com

"That's all I have to say about that." -- Forrest Gump.

Adrian Bunk

Jul 19, 2005, 10:20:06 AM

[ The subject was adapted to linux-kernel spam filters... ]

On Sat, Jul 16, 2005 at 07:26:44PM +0200, Jindrich Makovicka wrote:
> Andrew Vasquez wrote:
> > Yes, quite. How about the following to correct the intention.
> >
> >
> >
> > Add correct Kconfig option for ISP24xx support.
> >
> > Signed-off-by: Andrew Vasquez <andrew....@qlogic.com>
> > ---
> >
> > diff --git a/drivers/scsi/qla2xxx/Kconfig b/drivers/scsi/qla2xxx/Kconfig
> > --- a/drivers/scsi/qla2xxx/Kconfig
> > +++ b/drivers/scsi/qla2xxx/Kconfig
> > @@ -39,3 +39,11 @@ config SCSI_QLA6312
> > ---help---
> > This driver supports the QLogic 63xx (ISP6312 and ISP6322) host
> > adapter family.
> > +
> > +config SCSI_QLA24XX
> > + tristate "QLogic ISP24xx host adapter family support"
> > + depends on SCSI_QLA2XXX
> > + select SCSI_FC_ATTRS
>
> there should be also "select FW_LOADER", as it uses request_firmware &
> release_firmware

>...

You are right, patch below.

> Jindrich Makovicka

cu
Adrian

<-- snip -->

qla_init.c now uses code that requires FW_LOADER.

Additionally, this patch replaces spaces with tabs at the
SCSI_FC_ATTRS selects.

Signed-off-by: Adrian Bunk <bu...@stusta.de>

--- linux-2.6.13-rc3-mm1-full/drivers/scsi/qla2xxx/Kconfig.old 2005-07-17 15:44:26.000000000 +0200
+++ linux-2.6.13-rc3-mm1-full/drivers/scsi/qla2xxx/Kconfig 2005-07-17 15:45:45.000000000 +0200
@@ -1,49 +1,55 @@
 config SCSI_QLA2XXX
 	tristate
 	depends on SCSI && PCI
 	default y

 config SCSI_QLA21XX
 	tristate "QLogic ISP2100 host adapter family support"
 	depends on SCSI_QLA2XXX
-        select SCSI_FC_ATTRS
+	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 21xx (ISP2100) host adapter family.

 config SCSI_QLA22XX
 	tristate "QLogic ISP2200 host adapter family support"
 	depends on SCSI_QLA2XXX
-        select SCSI_FC_ATTRS
+	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 22xx (ISP2200) host adapter family.

 config SCSI_QLA2300
 	tristate "QLogic ISP2300 host adapter family support"
 	depends on SCSI_QLA2XXX
-        select SCSI_FC_ATTRS
+	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 2300 (ISP2300 and ISP2312) host
 	  adapter family.

 config SCSI_QLA2322
 	tristate "QLogic ISP2322 host adapter family support"
 	depends on SCSI_QLA2XXX
-        select SCSI_FC_ATTRS
+	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 2322 (ISP2322) host adapter family.

 config SCSI_QLA6312
 	tristate "QLogic ISP63xx host adapter family support"
 	depends on SCSI_QLA2XXX
-        select SCSI_FC_ATTRS
+	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 63xx (ISP6312 and ISP6322) host
 	  adapter family.

 config SCSI_QLA24XX
 	tristate "QLogic ISP24xx host adapter family support"
 	depends on SCSI_QLA2XXX
-        select SCSI_FC_ATTRS
+	select SCSI_FC_ATTRS
+	select FW_LOADER
 	---help---
 	  This driver supports the QLogic 24xx (ISP2422 and ISP2432) host
 	  adapter family.
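
A quick way to check the result of a patch like this, sketched under the assumption that every option starts with a `config` line at column 0 and that selects are written one per line. The helper name `configs_missing_select` is invented for this example; it prints each config entry in a Kconfig file that lacks a given `select`.

```shell
# Hypothetical helper: print the config symbols in a Kconfig file whose
# entry does not contain "select <symbol>".
configs_missing_select() {
    symbol=$1
    kconfig=$2
    awk -v want="select $symbol" '
        # Print the previous entry if it never selected the symbol.
        function report() { if (name != "" && !found) print name }
        /^config / { report(); name = $2; found = 0; next }
        {
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim surrounding whitespace
            if (line == want) found = 1
        }
        END { report() }
    ' "$kconfig"
}
```

With the patch above applied, `configs_missing_select FW_LOADER drivers/scsi/qla2xxx/Kconfig` should list only SCSI_QLA2XXX, which has no select lines of its own.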

Coywolf Qi Hunt

Jul 19, 2005, 10:30:14 AM

On 7/15/05, Andrew Morton <ak...@osdl.org> wrote:
>
>
> Changes since 2.6.13-rc2-mm2:
>
>
> git-drm.patch
> git-audit.patch
> git-input.patch
> git-kbuild.patch

make help br0ken, missing matching `'' for binrpm-pkg.

--
Coywolf Qi Hunt
http://ahbl.org/~coywolf/

Coywolf Qi Hunt

Jul 19, 2005, 10:50:10 AM

On Tue, Jul 19, 2005 at 10:21:30PM +0800, Coywolf Qi Hunt wrote:
> On 7/15/05, Andrew Morton <ak...@osdl.org> wrote:
> >
> >
> > Changes since 2.6.13-rc2-mm2:
> >
> >
> > git-drm.patch
> > git-audit.patch
> > git-input.patch
> > git-kbuild.patch
>
> make help br0ken, missing matching `'' for binrpm-pkg.
>

This fixes kbuild make help binrpm-pkg missing `''.

Signed-off-by: Coywolf Qi Hunt <coy...@lovecn.org>

--- 2.6.13-rc3-mm1-cy/scripts/package/Makefile~binrpm-pkg-fix 2005-07-19 22:25:27.000000000 +0800
+++ 2.6.13-rc3-mm1-cy/scripts/package/Makefile 2005-07-19 22:25:47.000000000 +0800
@@ -94,7 +94,7 @@ clean-dirs += $(objtree)/tar-install/
# ---------------------------------------------------------------------------
help:
@echo ' rpm-pkg - Build the kernel as an RPM package'
- @echo ' binrpm-pkg - Build an rpm package containing the compiled kernel
+ @echo ' binrpm-pkg - Build an rpm package containing the compiled kernel'
@echo ' and modules'
@echo ' deb-pkg - Build the kernel as an deb package'
@echo ' tar-pkg - Build the kernel as an uncompressed tarball'
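
This kind of slip can be caught mechanically. A minimal sketch, assuming the help text is written as one single-quoted string per `@echo` line (the function name is invented here): print every `@echo` line that contains an odd number of single quotes.

```shell
# Hypothetical check: report @echo lines whose single quotes do not pair up.
unbalanced_echo_lines() {
    # split() on the quote character yields (number of quotes + 1) pieces.
    awk -v q="'" '/@echo/ {
        n = split($0, parts, q) - 1
        if (n % 2) print NR ": " $0
    }' "$1"
}
```

It could be run as `unbalanced_echo_lines scripts/package/Makefile`; lines that legitimately contain an apostrophe inside double quotes would be reported as false positives.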

Jesper Juhl

Jul 20, 2005, 9:40:08 AM

I sent a patch for this yesterday that lets SCSI_QLA2XXX select
FW_LOADER. I believe that's a bit better since the other options
depend on SCSI_QLA2XXX anyway, there's no point in having them all set
FW_LOADER. My patch also fixes another little issue; that you cannot
disable SCSI_QLA2XXX if you don't need it.

See the patch here: http://lkml.org/lkml/2005/7/19/147
The mail contains 3 patches, but the third one is the best fix IMHO.

--
Jesper Juhl <jespe...@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

Paul Jackson

Jul 20, 2005, 9:50:06 PM

Well said, Mark. Thanks.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <p...@sgi.com> 1.925.600.0401

Ed Tomlinson

Jul 21, 2005, 7:40:07 AM

Hi,

I just tried 13-rc2-mm1 and dri is working again. It's reported to also work with
13-rc3. What in mm1 is apt to be breaking dri?

Thanks
Ed Tomlinson

---------- Forwarded Message ----------

Subject: Re: Xorg and RADEON (dri disabled)
Date: Wednesday 20 July 2005 21:25
From: Ed Tomlinson <tom...@cam.org>
To: debian...@lists.debian.org
Cc: Michal Schmidt <xsch...@stud.feec.vutbr.cz>

On Wednesday 20 July 2005 21:13, Michal Schmidt wrote:
> Ed Tomlinson wrote:
> > Hi,
> >
> > With Xorg I get:
> >
> > (==) RADEON(0): Write-combining range (0xd0000000,0x8000000)
> > drmOpenDevice: node name is /dev/dri/card0
> > drmOpenDevice: open result is -1, (No such device)
> > drmOpenDevice: open result is -1, (No such device)
> > drmOpenDevice: Open failed
> > drmOpenDevice: node name is /dev/dri/card0
> > drmOpenDevice: open result is -1, (No such device)
> > drmOpenDevice: open result is -1, (No such device)
> > drmOpenDevice: Open failed
> > drmOpenByBusid: Searching for BusID pci:0000:01:00.0
> > drmOpenDevice: node name is /dev/dri/card0
> > drmOpenDevice: open result is 7, (OK)
> > drmOpenByBusid: drmOpenMinor returns 7
> > drmOpenByBusid: drmGetBusid reports pci:0000:01:00.0
> > (II) RADEON(0): [drm] loaded kernel module for "radeon" driver
> > (II) RADEON(0): [drm] DRM interface version 1.2
> > (II) RADEON(0): [drm] created "radeon" driver at busid "pci:0000:01:00.0"
> > (II) RADEON(0): [drm] added 8192 byte SAREA at 0xffffc20000411000
> > (II) RADEON(0): [drm] drmMap failed
> > (EE) RADEON(0): [dri] DRIScreenInit failed. Disabling DRI.
> >
> > And glxgears reports 300 frames per second. How do I get dri back? It
> > was working fine with XFree. The XF86Config-4 was changed by the upgrade
> > dropping some parms in the Device section. Restoring them has no effect
> > on the problem.

> What kernel do you use? I get the same behaviour with 2.6.13-rc3-mm1,
> but it works with 2.6.13-rc3.

I also use 2.6.13-rc3-mm1. Will try with a previous version and report to lkml if
it works.

Thanks
Ed

-------------------------------------------------------

Adrian Bunk

Jul 21, 2005, 11:30:21 AM

On Wed, Jul 20, 2005 at 03:38:02PM +0200, Jesper Juhl wrote:
>...

> I sent a patch for this yesterday that lets SCSI_QLA2XXX select
> FW_LOADER. I believe that's a bit better since the other options
> depend on SCSI_QLA2XXX anyway, there's no point in having them all set
> FW_LOADER. My patch also fixes another little issue; that you cannot
> disable SCSI_QLA2XXX if you don't need it.

>...

That's not an issue, this seems to be intentional.

Whether SCSI_QLA2XXX should be user-visible (as your patches make it) or
stay as it is (with the fixes from my patches) doesn't matter much -
both are valid setups.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Andrew Morton

Jul 21, 2005, 12:00:40 PM

> I just tried 13-rc2-mm1 and dri is working again. It's reported to also work
> with 13-rc3.

Useful info, thanks.

> What in mm1 is apt to be breaking dri?

Faulty kernel programming ;)

I assume that the failure to open /dev/dri/card0 only happens in rc3-mm1?

Could you compare the dmesg output for 2.6.13-rc3 versus 2.6.13-rc3-mm1?
And double-check the .config settings: occasionally config options will be
renamed and `make oldconfig' causes things to get accidentally disabled.
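
One way to do that .config double-check, as a sketch only: list the options that were set to y or m in the old config but are no longer enabled in the new one. The function names are invented for this example, and string or integer options are deliberately ignored.

```shell
# Names of options set to y or m in a .config file.
enabled_options() {
    grep -E '^CONFIG_[A-Za-z0-9_]+=[ym]$' "$1" | cut -d= -f1
}

# Hypothetical helper: options enabled in the old config ($1) that are not
# enabled in the new config ($2).
dropped_options() {
    new_enabled=$(enabled_options "$2")
    enabled_options "$1" | while read -r opt; do
        # -x matches whole lines only, -F treats the name as a fixed string.
        printf '%s\n' "$new_enabled" | grep -qxF "$opt" || printf '%s\n' "$opt"
    done
}
```

For example, `dropped_options config-2.6.13-rc3 config-2.6.13-rc3-mm1` (file names are placeholders) would print one option name per line, or nothing if no enabled option was lost.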

Sam Ravnborg

Jul 21, 2005, 4:20:11 PM

On Tue, Jul 19, 2005 at 09:42:54AM -0500, Coywolf Qi Hunt wrote:
> This fixes kbuild make help binrpm-pkg missing `''.

Applied, thanks.

Sam

Matthew Helsley

Jul 21, 2005, 6:50:07 PM

On Sun, 2005-07-17 at 08:20 -0700, Paul Jackson wrote:
<snip>

> It is somewhat intrusive in the areas it controls, such as some large
> ifdef's in kernel/sched.c.

I don't see the large ifdefs you're referring to in -mm's
kernel/sched.c.

> The sched hooks may well impact the cost of maintaining the sched code,
> which is always a hotbed of Linux kernel development. However others
> who work in that area will have to speak to that concern.

I don't see the hooks you're referring to in the -mm scheduler.

> I tried just now to read through the ckrm hooks in fork, to see
> what sort of impact they might have on scalability on large systems.
> But I gave up after a couple layers of indirection. I saw several
> atomic counters and a couple of spinlocks that I suspect (not at all
> sure) lay on the fork main code path. I'd be surprised if this didn't
> impact scalability. Earlier, according to my notes, I saw mention of
> lmbench results in the OLS 2004 slides, indicating a several percent
> cost of available cpu cycles.

The OLS2004 slides are roughly 1 year old. Have you looked at more
recent benchmarks posted on CKRM-Tech around April 15th 2005? They
should be available in the CKRM-Tech archives on SourceForge at
http://sourceforge.net/mailarchive/forum.php?thread_id=7025751&forum_id=35191

(OLS 2004 Slide 24 of
http://ckrm.sourceforge.net/downloads/ckrm-ols04-slides.pdf )

The OLS slide indicates that the overhead is generally less than
0.5usec compared to a total context switch time of anywhere from 2 to
5.5usec. There appears to be little difference in scalability since the
overhead appears to oscillate around a constant.

<snip>

> vendor has a serious middleware software product that provides full
> CKRM support. Acceptance of CKRM would be easier if multiple competing
> middleware vendors were using it. It is also a concern that CKRM
> is not really usable for its primary intended purpose except if it
> is accompanied by this corresponding middleware, which I presume is

The Rule-Based Classification Engine (RBCE) makes CKRM useful without
middleware. It uses a table of rules to classify tasks. For example
rules that would classify shells:

echo 'path=/bin/bash,class=/rcfs/taskclass/shells' > /rcfs/ce/rules/classify_bash_shells
echo 'path=/bin/tcsh,class=/rcfs/taskclass/shells' > /rcfs/ce/rules/classify_tcsh_shells
..

And class shares would control the fork rate of those shells:

echo 'res=numtasks,forkrate=10000,forkrate_interval=1' > '/rcfs/taskclass/config'
echo 'res=numtasks,guarantee=1000,limit=5000' > '/rcfs/taskclass/shells'

No middleware necessary.
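
The classification rules above follow one pattern, so they could be wrapped in a small shell function. This is only a sketch: the rule syntax is copied from the commands above, while `RCFS_ROOT` and `classify_shell` are names invented for the example, and the rules directory is assumed to exist already.

```shell
# Where rcfs is mounted; /rcfs as in the commands above. Override for a dry run.
RCFS_ROOT=${RCFS_ROOT:-/rcfs}

# Hypothetical helper: add an RBCE rule that puts a shell binary into the
# "shells" task class.
classify_shell() {
    shell_path=$1                      # e.g. /bin/bash
    name=$(basename "$shell_path")     # e.g. bash
    printf 'path=%s,class=%s/taskclass/shells\n' "$shell_path" "$RCFS_ROOT" \
        > "$RCFS_ROOT/ce/rules/classify_${name}_shells"
}
```

A loop such as `for s in /bin/bash /bin/tcsh; do classify_shell "$s"; done` would then write the same two rules as the echo commands.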

<snip>

> CKRM is in part a generalization and descendent of what I call fair
> share schedulers. For example, the fork hooks for CKRM include a
> forkrates controller, to slow down the rate of forking of tasks using
> too much resources.
>
> No doubt the CKRM experts are already familiar with these, but for
> the possible benefit of other readers:
>
> UNICOS Resource Administration - Chapter 4. Fair-share Scheduler
> http://oscinfo.osc.edu:8080/dynaweb/all/004-2302-001/@Generic__BookTextView/22883
>
> SHARE II -- A User Administration and Resource Control System for UNIX
> http://www.c-side.com/c/papers/lisa-91.html
>
> Solaris Resource Manager White Paper
> http://wwws.sun.com/software/resourcemgr/wp-mixed/
>
> ON THE PERFORMANCE IMPACT OF FAIR SHARE SCHEDULING
> http://www.cs.umb.edu/~eb/goalmode/cmg2000final.htm
>
> A Fair Share Scheduler, J. Kay and P. Lauder
> Communications of the ACM, January 1988, Volume 31, Number 1, pp 44-55.
>
> The documentation that I've noticed (likely I've missed something)
> doesn't do an adequate job of making the case - providing the
> motivation and context essential to understanding this patch set.

The choice of algorithm is entirely up to the scheduler, memory
allocator, etc. CKRM currently provides an interface for reading share
values and does not impose any meaning on those shares -- that is the
role of the scheduler.

> Because CKRM provides an infrastructure for multiple controllers
> (limiting forks, memory allocation and network rates) and multiple
> classifiers and policies, its critical interfaces have rather
> generic and abstract names. This makes it difficult for others to
> approach CKRM, reducing the rate of peer review by other Linux kernel
> developers, which is perhaps the key impediment to acceptance of CKRM.
> If anything, CKRM tends to be a little too abstract.

Generic and abstract names are appropriate for infrastructure that is
not tied to hardware. If you could be more specific I'd be able to
respond in less general and abstract terms.

<snip>

> My notes from many months ago indicate something about a 128 CPU
> limit in CKRM. I don't know why, nor if it still applies. It is
> certainly a smaller limit than the systems I care about.

I haven't seen this limitation in the CKRM patches that went into -mm
and I'd like to look into this. Where did you see this limit?

Thanks,
-Matt Helsley

Ed Tomlinson

Jul 21, 2005, 6:50:06 PM

The difference in the X logs is that the working one does not have the:

(II) RADEON(0): [drm] created "radeon" driver at busid "pci:0000:01:00.0"
(II) RADEON(0): [drm] added 8192 byte SAREA at 0xffffc20000411000
(II) RADEON(0): [drm] drmMap failed

message. When it works it has:

(II) RADEON(0): [drm] created "radeon" driver at busid "pci:0000:01:00.0"

(II) RADEON(0): [drm] added 8192 byte SAREA at 0x10001000
(II) RADEON(0): [drm] mapped SAREA 0x10001000 to 0x2aaab2e67000
(II) RADEON(0): [drm] framebuffer handle = 0xd0000000
(II) RADEON(0): [drm] added 1 reserved context for kernel
(II) RADEON(0): [agp] Mode 0x1f004209 [AGP 0x10de/0x00e1; Card 0x1002/0x5961]
(II) RADEON(0): [agp] 8192 kB allocated with handle 0x00000001
(II) RADEON(0): [agp] ring handle = 0xe0000000

> Could you compare the dmesg output for 2.6.13-rc3 versus 2.6.13-rc3-mm1?
> And double-check the .config settings: occasionally config options will be
> renamed and `make oldconfig' causes things to get accidentally disabled.

From 13-rc2-mm1:

Jul 21 07:31:20 grover kernel: [ 13.652465] Linux agpgart interface v0.101 (c) Dave Jones
Jul 21 07:31:20 grover kernel: [ 13.652492] [drm] Initialized drm 1.0.0 20040925

and later

Jul 21 07:31:34 grover kernel: [ 72.401795] [drm] Initialized radeon 1.16.0 20050311 on minor 0: ATI Technologies Inc RV280 [Radeon 9200]
Jul 21 07:31:34 grover kernel: [ 72.402388] agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
Jul 21 07:31:34 grover kernel: [ 72.402399] agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
Jul 21 07:31:34 grover kernel: [ 72.402419] agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
Jul 21 07:31:35 grover kernel: [ 72.421888] [drm] Loading R200 Microcode

From 13-rc3-mm1:

Jul 20 18:59:34 grover kernel: [ 13.837537] Linux agpgart interface v0.101 (c) Dave Jones
Jul 20 18:59:34 grover kernel: [ 13.837565] [drm] Initialized drm 1.0.0 20040925

and later

Jul 20 18:59:48 grover kernel: [ 71.638470] [drm] Initialized radeon 1.16.0 20050311 on minor 0: ATI Technologies Inc RV280 [Radeon 9200]
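
Since the two excerpts differ in dates and timestamps, a plain diff is noisy. A sketch of one way to compare them, assuming syslog-style lines of the form "date host kernel: [ timestamp] message" as shown above (the function names are invented for this example): strip the prefix, then print the messages that appear only in the working kernel's log.

```shell
# Drop the "date host kernel: [ timestamp] " prefix from each syslog line.
strip_log_prefix() {
    sed 's/^.*kernel: \[ *[0-9.]*] //' "$1"
}

# Hypothetical helper: messages present in the first log but not the second.
missing_messages() {
    good=$(mktemp)
    bad=$(mktemp)
    strip_log_prefix "$1" | sort -u > "$good"
    strip_log_prefix "$2" | sort -u > "$bad"
    comm -23 "$good" "$bad"    # column 1 only: lines unique to the first log
    rm -f "$good" "$bad"
}
```

Applied to the excerpts above, this should surface the agpgart and "Loading R200 Microcode" lines that appear only in the 13-rc2-mm1 log.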

Both .configs are fine. Kernels have agp compiled in with DRM modular.

CONFIG_GART_IOMMU=y

CONFIG_AGP=y
CONFIG_AGP_AMD64=y
# CONFIG_AGP_INTEL is not set
CONFIG_DRM=y
CONFIG_DRM_TDFX=m
CONFIG_DRM_R128=m
CONFIG_DRM_RADEON=m

Hope this helps (it's an AMD64 kernel),

Ed

Dave Airlie

Jul 21, 2005, 7:30:15 PM

> >>
> >> I also use 2.6.13-rc3-mm1. Will try with a previous version and report to lkml if
> >> it works.
> >>
> >
> > I just tried 13-rc2-mm1 and dri is working again. It's reported to also work
> > with 13-rc3.
>

Hmm no idea what could have broken it, I'm at OLS and don't have any
DRI capable machine here yet.. so it'll be a while before I get to
take a look at it .. I wouldn't be terribly surprised if some of the
new mapping code might have some issues..

Dave.

Paul Jackson

Jul 21, 2005, 7:40:04 PM

Matthew wrote:
> I don't see the large ifdefs you're referring to in -mm's
> kernel/sched.c.

Perhaps someone who knows CKRM better than I can explain why the CKRM
version in some SuSE releases based on 2.6.5 kernels has substantial
code and some large ifdef's in sched.c, but the CKRM in *-mm doesn't.
Or perhaps I'm confused. There's a good chance that this represents
ongoing improvements that CKRM is making to reduce their footprint
in core kernel code. Or perhaps there is a more sophisticated cpu
controller in the SuSE kernel.

> Have you looked at more
> recent benchmarks posted on CKRM-Tech around April 15th 2005?

> ...
> http://ckrm.sourceforge.net/downloads/ckrm-ols04-slides.pdf

I had not seen these before. Thanks for the pointer.

> The Rule-Based Classification Engine (RBCE) makes CKRM useful
> without middleware.

I'd be encouraged more if this went one step further, past pointing
out that the API can be manipulated from the shell without requiring C
code, to providing examples of who intends to _directly_ use this
interface. The issue is perhaps less whether its API is naturally C or
shell code, and more how many actual, independent uses of this API
are known to the community. A non-trivial API and mechanism that
is de facto captive to a single middleware implementation (which
may or may not apply here - I don't know) creates an additional review
burden, because some of the natural forces that guide us to healthy
long lasting interfaces are missing. If that concern applies here,
it's certainly not insurmountable - but it should in my view raise the
review barrier to acceptance. If other middleware or direct users
are not essentially performing some of the review for us, we have to do
it here with greater thoroughness.

> If you could be more specific I'd be able to
> respond in less general and abstract terms.

Good comeback <grin>.

I made an effort along these lines last year, in the thread
I referenced a few days ago:

Classes: 1) what are they, 2) what is their name?
http://sourceforge.net/mailarchive/forum.php?thread_id=5328162&forum_id=35191

I doubt that I have much more to contribute along
these lines now.

Sorry.

> I haven't seen this limitation [128 cpus] ...

Good - I presume that there is no longer, if there ever was, such a
limitation.

Thanks for your reply.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <p...@sgi.com> 1.925.600.0401

Martin J. Bligh

Jul 21, 2005, 8:40:04 PM

Paul Jackson wrote:

>Matthew wrote:
>
>
>Perhaps someone who knows CKRM better than I can explain why the CKRM
>version in some SuSE releases based on 2.6.5 kernels has substantial
>code and some large ifdef's in sched.c, but the CKRM in *-mm doesn't.
>Or perhaps I'm confused. There's a good chance that this represents
>ongoing improvements that CKRM is making to reduce their footprint
>in core kernel code. Or perhaps there is a more sophisticated cpu
>controller in the SuSE kernel.
>
>

No offense, but I really don't see why this matters at all ... the stuff
in -mm is what's under consideration for merging - what's in SuSE is
wholly irrelevant ? One obvious thing is that that codebase will be
much older ... would be useful if people can direct critiques at the
current codebase ;-)

M.

Peter Williams

Jul 21, 2005, 9:10:08 PM

Paul Jackson wrote:
> Matthew wrote:
>
>>I don't see the large ifdefs you're referring to in -mm's
>>kernel/sched.c.
>
>
> Perhaps someone who knows CKRM better than I can explain why the CKRM
> version in some SuSE releases based on 2.6.5 kernels has substantial
> code and some large ifdef's in sched.c, but the CKRM in *-mm doesn't.
> Or perhaps I'm confused. There's a good chance that this represents
> ongoing improvements that CKRM is making to reduce their footprint
> in core kernel code. Or perhaps there is a more sophisticated cpu
> controller in the SuSE kernel.

As there is NO CKRM cpu controller in 2.6.13-rc3-mm1 (that I can see)
the one in 2.6.5 is certainly more sophisticated :-). So the reason
that the considerable mangling of sched.c evident in SuSE's 2.6.5 kernel
source is not present is that the cpu controller is not included in
these patches.

I imagine that the cpu controller is missing from this version of CKRM
because the bugs introduced to the cpu controller during upgrading from
2.6.5 to 2.6.10 version have not yet been resolved.

Peter
--
Peter Williams pwil...@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce

Gerrit Huizenga

Jul 21, 2005, 11:30:11 PM

On Fri, 22 Jul 2005 11:06:14 +1000, Peter Williams wrote:
> Paul Jackson wrote:
> > Matthew wrote:
> >
> >>I don't see the large ifdefs you're referring to in -mm's
> >>kernel/sched.c.
> >
> >
> > Perhaps someone who knows CKRM better than I can explain why the CKRM
> > version in some SuSE releases based on 2.6.5 kernels has substantial
> > code and some large ifdef's in sched.c, but the CKRM in *-mm doesn't.
> > Or perhaps I'm confused. There's a good chance that this represents
> > ongoing improvements that CKRM is making to reduce their footprint
> > in core kernel code. Or perhaps there is a more sophisticated cpu
> > controller in the SuSE kernel.
>
> As there is NO CKRM cpu controller in 2.6.13-rc3-mm1 (that I can see)
> the one in 2.6.5 is certainly more sophisticated :-). So the reason
> that the considerable mangling of sched.c evident in SuSE's 2.6.5 kernel
> source is not present is that the cpu controller is not included in
> these patches.

Yeah - I don't really consider the current CPU controller code something
ready for consideration yet for mainline merging. That doesn't mean
we don't want a CPU controller for CKRM - just that what we have
doesn't integrate cleanly/nicely with mainline.

> I imagine that the cpu controller is missing from this version of CKRM
> because the bugs introduced to the cpu controller during upgrading from
> 2.6.5 to 2.6.10 version have not yet been resolved.

I don't know what bugs you are referring to here. I don't think we
have any open defects with SuSE on the CPU scheduler in their releases.
And that is not at all related to the reason for not having a CPU
controller in the current patch set.

gerrit

Shailabh Nagar

Jul 22, 2005, 12:00:12 AM

Mark Hahn wrote:
>>I suspect that the main problem is that this patch is not a mainstream
>>kernel feature that will gain multiple uses, but rather provides
>>support for a specific vendor middleware product used by that
>>vendor and a few closely allied vendors. If it were smaller or
>>less intrusive, such as a driver, this would not be a big problem.
>>That's not the case.
>
>
> yes, that's the crux. CKRM is all about resolving conflicting resource
> demands in a multi-user, multi-server, multi-purpose machine. this is a
> huge undertaking, and I'd argue that it's completely inappropriate for
> *most* servers. that is, computers are generally so damn cheap that
> the clear trend is towards dedicating a machine to a specific purpose,
> rather than running eg, shell/MUA/MTA/FS/DB/etc all on a single machine.

The argument about scale-up vs. scale-out is nowhere close to being
resolved. To argue that any support for performance partitioning (which
CKRM does) is in support of a lost cause is premature to say the least.

> this is *directly* in conflict with certain prominent products, such as
> the Altix and various less-prominent Linux-based mainframes. they're all
> about partitioning/virtualization - the big-iron aesthetic of splitting up
> a single machine. note that it's not just about "big", since cluster-based
> approaches can clearly scale far past big-iron, and are in effect statically
> partitioned. yes, buying a hideously expensive single box, and then chopping
> it into little pieces is more than a little bizarre, and is mainly based
> on a couple assumptions:
>
> - that clusters are hard. really, they aren't. they are not
> necessarily higher-maintenance, can be far more robust, usually
> do cost less. just about the only bad thing about clusters is
> that they tend to be somewhat larger in size.
>
> - that partitioning actually makes sense. the appeal is that if
> you have a partition to yourself, you can only hurt yourself.
> but it also follows that burstiness in resource demand cannot be
> overlapped without either constantly tuning the partitions or
> infringing on the guarantee.

"constantly tuning the partitions" is effectively whats done by workload
managers. But our earlier presentations and papers have made the case
that this is not the only use for performance isolation - a simple
need like isolating one user from another on a general purpose server
also cannot be met by any existing or proposed Linux kernel mechanism
today.

If partitioning made so little sense and the case for clusters was that
obvious, one would be hard put to explain why server consolidation is
being actively pursued by so many firms, why Solaris is bothering to
come up with Containers, and why Xen/VMware are getting all this attention.
I don't think the concept of partitioning can be dismissed so easily.

Of course, it must be noted that CKRM provides only performance
isolation, not fault isolation. But there is a need for that. Whether
Linux chooses to let this need influence its design is another matter
(which I hope we'll also discuss besides the implementation issues).

> CKRM is one of those things that could be done to Linux, and will benefit a
> few, but which will almost certainly hurt *most* of the community.
>
> let me say that the CKRM design is actually quite good. the issue is whether
> the extensive hooks it requires can be done (at all) in a way which does
> not disproportionately hurt maintainability or efficiency.

If there are suggestions on implementing this better, it'll certainly be
very welcome.

>
> CKRM requires hooks into every resource-allocation decision fastpath:
> - if CKRM is not CONFIG, the only overhead is software maintenance.
> - if CKRM is CONFIG but not loaded, the overhead is a pointer check.
> - if CKRM is CONFIG and loaded, the overhead is a pointer check
> and a nontrivial callback.
>
> but really, this is only for CKRM-enforced limits. CKRM really wants to
> change behavior in a more "weighted" way, not just causing an
> allocation/fork/packet to fail. a really meaningful CKRM needs to
> be tightly integrated into each resource manager - affecting each scheduler
> (process, memory, IO, net). I don't really see how full-on CKRM can be
> compiled out, unless these schedulers are made fully pluggable.

This is a valid point for the CPU, memory and network controllers (I/O
can be made pluggable quite easily). For the CPU controller in SuSE, the
CKRM CPU controller can be turned on and off dynamically at runtime.
A similar option for memory and network (incurring only a
pointer check) could be explored. Keeping the overhead close to zero for
kernel users not interested in CKRM is certainly one of our objectives.
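
As a rough illustration of the three overhead cases quoted above (not
CONFIG, CONFIG but not loaded, CONFIG and loaded), here is a minimal
userspace sketch in C. Every name in it (DEMO_CONFIG_RM, demo_rm_active,
demo_rm_charge, demo_limit_ops) is hypothetical and is not taken from the
CKRM patches; it only models the shape of a hook that costs one pointer
check until a controller is registered.

```c
#include <stddef.h>

/* All names here are hypothetical; none are CKRM's real symbols. */
#define DEMO_CONFIG_RM 1        /* 0 models a kernel built without the feature */

struct demo_task {
    int class_id;
};

/* Callback table that a loadable controller would register. */
struct demo_rm_ops {
    int (*charge)(struct demo_task *t, unsigned long pages);
};

#if DEMO_CONFIG_RM
static const struct demo_rm_ops *demo_rm_active;   /* NULL until "loaded" */

/* Register a controller, or pass NULL to unregister it. */
static void demo_rm_load(const struct demo_rm_ops *ops)
{
    demo_rm_active = ops;
}

/* Hook placed on the allocation path. */
static inline int demo_rm_charge(struct demo_task *t, unsigned long pages)
{
    const struct demo_rm_ops *ops = demo_rm_active;

    if (!ops)                       /* configured, not loaded: pointer check only */
        return 0;
    return ops->charge(t, pages);   /* loaded: pointer check plus callback */
}
#else
/* Not configured: the hook is an empty inline the compiler can drop. */
static inline int demo_rm_charge(struct demo_task *t, unsigned long pages)
{
    (void)t;
    (void)pages;
    return 0;
}
#endif

/* Toy controller: refuse any single charge above 16 pages. */
static int demo_limit_charge(struct demo_task *t, unsigned long pages)
{
    (void)t;
    return pages > 16 ? -1 : 0;
}

static const struct demo_rm_ops demo_limit_ops = {
    .charge = demo_limit_charge,
};
```

Calling demo_rm_load(&demo_limit_ops) switches the hook on at runtime and
demo_rm_load(NULL) switches it off again. The toy controller only enforces
a hard limit; the "weighted" behaviour discussed above would need logic
inside each scheduler, which this sketch does not attempt.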

> finally, I observe that pluggable, class-based resource _limits_ could
> probably be done without callbacks and potentially with low overhead.
> but mere limits doesn't meet CKRM's goal of flexible, wide-spread resource
> partitioning within a large, shared machine.

True, but limits alone are not as useful for general workload management.

Peter Williams

Jul 22, 2005, 12:00:11 AM

The bugs were in the patches for the 2.6.10 kernel, not SuSE's 2.6.5
kernel. I reported some of them to the ckrm-tech mailing list at the
time. There were changes to the vanilla scheduler between 2.6.5 and
2.6.10 that were not handled properly when the CKRM scheduler was
upgraded to the 2.6.10 kernel.

Peter
--
Peter Williams pwil...@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce

Gerrit Huizenga

Jul 22, 2005, 12:00:12 AM

On Fri, 22 Jul 2005 13:46:37 +1000, Peter Williams wrote:
> Gerrit Huizenga wrote:
> >>I imagine that the cpu controller is missing from this version of CKRM
> >>because the bugs introduced to the cpu controller during upgrading from
> >>2.6.5 to 2.6.10 version have not yet been resolved.
> >
> >
> > I don't know what bugs you are referring to here. I don't think we
> > have any open defects with SuSE on the CPU scheduler in their releases.
> > And that is not at all related to the reason for not having a CPU
> > controller in the current patch set.
>
> The bugs were in the patches for the 2.6.10 kernel not SuSE's 2.6.5
> kernel. I reported some of them to the ckrm-tech mailing list at the
> time. There were changes to the vanilla scheduler between 2.6.5 and
> 2.6.10 that were not handled properly when the CKRM scheduler was
> upgraded to the 2.6.10 kernel.

Ah - okay - that makes sense. Those patches haven't gone through my
review yet and I'm not directly tracking their status until I figure
out what the right direction is with respect to a fair share style
scheduler of some sort. I'm not convinced that the current one is
something that is ready for mainline or is necessarily the right answer
currently. But we do need to figure out something that will provide
some level of CPU allocation minima & maxima for a class, where that
solution will work well on a laptop or a huge server.

Ideas in that space are welcome - I know of several proposed ideas
in progress - the scheduler in SuSE and the forward port to 2.6.10
that you referred to; an idea for building a very simple interface
on top of sched_domains for SMP systems (no fairness within a
single CPU) and a proposal for timeslice manipulation that might
provide some fairness that the Fujitsu folks are thinking about.
There are probably others and honestly, I don't have any clue yet as
to what the right long term/mainline direction should be here.

gerrit

Paul Jackson

Jul 22, 2005, 12:00:13 AM

Martin wrote:
> No offense, but I really don't see why this matters at all ... the stuff
> in -mm is what's under consideration for merging - what's in SuSE is ...

Yes - what's in SuSE doesn't matter, at least not directly.

No - we are not just considering the CKRM that is in *-mm now, but also
what can be expected to be proposed as part of CKRM in the future.

If the CPU controller is not in *-mm now, but if one might reasonably
expect it to be proposed as part of CKRM in the future, then we need to
understand that. This is perhaps especially important in this case,
where there is some reason to suspect that this additional piece is
both non-trivial and essential to CKRM's purpose.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <p...@sgi.com> 1.925.600.0401

Shailabh Nagar

Jul 22, 2005, 12:10:07 AM

Paul Jackson wrote:
> Martin wrote:
>
>>No offense, but I really don't see why this matters at all ... the stuff
>>in -mm is what's under consideration for merging - what's in SuSE is ...
>
>
> Yes - what's in SuSE doesn't matter, at least not directly.
>
> No - we are not just considering the CKRM that is in *-mm now, but also
> what can be expected to be proposed as part of CKRM in the future.
>
> If the CPU controller is not in *-mm now, but if one might reasonably
> expect it to be proposed as part of CKRM in the future, then we need to
> understand that. This is perhaps especially important in this case,
> where there is some reason to suspect that this additional piece is
> both non-trivial and essential to CKRM's purpose.
>

The CKRM design explicitly considered this problem of some controllers
being less acceptable than the rest, and part of the indirection
introduced in CKRM is there to allow the kernel community the flexibility
of cherry-picking acceptable controllers. So if the current CPU controller
implementation is considered too intrusive/unacceptable, it can be
reworked or (and we certainly hope not) even rejected in perpetuity.
Same for the other controllers as and when they're introduced and
proposed for inclusion.

-- Shailabh

Gerrit Huizenga

Jul 22, 2005, 12:30:12 AM

Sorry - I didn't see Mark's original comment, so I'm replying to
a reply which I did get. ;-)

On Thu, 21 Jul 2005 23:59:09 EDT, Shailabh Nagar wrote:
> Mark Hahn wrote:
> >>I suspect that the main problem is that this patch is not a mainstream
> >>kernel feature that will gain multiple uses, but rather provides
> >>support for a specific vendor middleware product used by that
> >>vendor and a few closely allied vendors. If it were smaller or
> >>less intrusive, such as a driver, this would not be a big problem.
> >>That's not the case.
> >
> >
> > yes, that's the crux. CKRM is all about resolving conflicting resource
> > demands in a multi-user, multi-server, multi-purpose machine. this is a
> > huge undertaking, and I'd argue that it's completely inappropriate for
> > *most* servers. that is, computers are generally so damn cheap that
> > the clear trend is towards dedicating a machine to a specific purpose,
> > rather than running eg, shell/MUA/MTA/FS/DB/etc all on a single machine.

This is a big NAK - if computers are so damn cheap, why is virtualization
and consolidation such a big deal? Well, the answer is actually that
floor space, heat, and power are also continuing to be very important
in the overall equation. And, buying machines which are dedicated but
often 80-99% idle occasionally bothers people who are concerned about
wasting planetary resources for no good reason. Yeah, we can stamp out
thousands of metal boxes, but if just a couple can do the same work,
well, let's consolidate. Less wasted metal, less wasted heat, less
wasted power, less air conditioning, wow, we are now part of the
eco-computing movement! ;-)

> > this is *directly* in conflict with certain prominent products, such as
> > the Altix and various less-prominent Linux-based mainframes. they're all
> > about partitioning/virtualization - the big-iron aesthetic of splitting up
> > a single machine. note that it's not just about "big", since cluster-based
> > approaches can clearly scale far past big-iron, and are in effect statically
> > partitioned. yes, buying a hideously expensive single box, and then chopping
> > it into little pieces is more than a little bizarre, and is mainly based
> > on a couple assumptions:

Well, yeah IBM has been doing this virtualization & partitioning stuff
for ages at lots of different levels for lots of reasons. If we are
in such direct conflict with Altix, aren't we also in conflict with our
own lines of business which do the same thing? But, well, we aren't
in conflict - this is a complementary part of our overall capabilities.

> > - that clusters are hard. really, they aren't. they are not
> > necessarily higher-maintenance, can be far more robust, usually
> > do cost less. just about the only bad thing about clusters is
> > that they tend to be somewhat larger in size.

This is orthogonal to clusters. Or, well, we are even using CKRM today
in some grid/cluster style applications. But that has no bearing on
whether or not clusters are useful.

> > - that partitioning actually makes sense. the appeal is that if
> > you have a partition to yourself, you can only hurt yourself.
> > but it also follows that burstiness in resource demand cannot be
> > overlapped without either constantly tuning the partitions or
> > infringing on the guarantee.

Well, if you don't think it makes sense, don't buy one. And stay away
from Xen, VMware, VirtualIron, PowerPC/pSeries hardware, Mainframes,
Altix, IA64 platforms, Intel VT, AMD Pacifica, and, well, anyone else
that is working to support virtualization, which is one key level of
partitioning.

I'm sorry but I'm not buying your argument here at all - it just has
no relationship to what's going on at the user side as near as I can
tell.

> > CKRM is one of those things that could be done to Linux, and will benefit a
> > few, but which will almost certainly hurt *most* of the community.
> >
> > let me say that the CKRM design is actually quite good. the issue is whether
> > the extensive hooks it requires can be done (at all) in a way which does
> > not disproportionately hurt maintainability or efficiency.

Can you be more clear on how this will hurt *most* of the community?
CKRM when not in use is not in any way intrusive. Can you take a look
at the patch again and point out the "extensive" hooks for me? I've
looked at "all" of them and I have trouble calling a couple of callbacks
"extensive hooks".

> > CKRM requires hooks into every resource-allocation decision fastpath:
> > - if CKRM is not CONFIG, the only overhead is software maintenance.
> > - if CKRM is CONFIG but not loaded, the overhead is a pointer check.
> > - if CKRM is CONFIG and loaded, the overhead is a pointer check
> > and a nontrivial callback.

You left out a case here: CKRM is CONFIG and loaded and classes are
defined.

In all of the cases that you mentioned, if there are no classes
defined, the overhead is still unmeasurable for any real workload.
Refer to the archives referenced earlier where Nish did some performance
measurements/comparisons. If you think there are real workloads where
this is non-trivial, can you run, measure, and point out the cost? Also,
do this with and without classes defined. I think that without classes
defined, you'll be hard pressed to find a real workload that shows any
statistically significant performance impact.

> > but really, this is only for CKRM-enforced limits. CKRM really wants to
> > change behavior in a more "weighted" way, not just causing an
> > allocation/fork/packet to fail. a really meaningful CKRM needs to
> > be tightly integrated into each resource manager - affecting each scheduler
> > (process, memory, IO, net). I don't really see how full-on CKRM can be
> > compiled out, unless these schedulers are made fully pluggable.

Well, loadable modules make sense for some items - what cost there is
(which is still often small) is borne only by the users that use that
specific portion of CKRM. But not everything is likely to be a loadable
module, e.g. the memory controller and the CPU scheduler components
are a bit tougher. And yes, we are concerned about performance on big
and small machines, so that is always a concern with CKRM as well as
with any patches in any subsystem.

> > finally, I observe that pluggable, class-based resource _limits_ could
> > probably be done without callbacks and potentially with low overhead.
> > but mere limits doesn't meet CKRM's goal of flexible, wide-spread resource
> > partitioning within a large, shared machine.

CKRM's goal is to do simple workload management both on laptops and
on servers. I'm not opposed to doing a few things overly simply as long
as we get some basic capability. And we can refine with experience. I'm
definitely not looking to make CKRM any more complex than it has to
be, and yet I also want it to be useful on a laptop, small single CPU
machine, as well as larger servers.

gerrit

Mark Hahn

Jul 22, 2005, 1:00:10 AM

> > > yes, that's the crux. CKRM is all about resolving conflicting resource
> > > demands in a multi-user, multi-server, multi-purpose machine. this is a
> > > huge undertaking, and I'd argue that it's completely inappropriate for
> > > *most* servers. that is, computers are generally so damn cheap that
> > > the clear trend is towards dedicating a machine to a specific purpose,
> > > rather than running eg, shell/MUA/MTA/FS/DB/etc all on a single machine.
>
> This is a big NAK - if computers are so damn cheap, why is virtualization
> and consolidation such a big deal? Well, the answer is actually that

yes, you did miss my point. I'm actually arguing that it's bad design
to attempt to arbitrate within a single shared user-space. you make
the fast path slower and less maintainable. if you are really concerned
about isolating many competing servers on a single piece of hardware, then
run separate virtualized environments, each with its own user-space.

Gerrit Huizenga

Jul 22, 2005, 1:10:05 AM

On Fri, 22 Jul 2005 00:53:58 EDT, Mark Hahn wrote:
> > > > yes, that's the crux. CKRM is all about resolving conflicting resource
> > > > demands in a multi-user, multi-server, multi-purpose machine. this is a
> > > > huge undertaking, and I'd argue that it's completely inappropriate for
> > > > *most* servers. that is, computers are generally so damn cheap that
> > > > the clear trend is towards dedicating a machine to a specific purpose,
> > > > rather than running eg, shell/MUA/MTA/FS/DB/etc all on a single machine.
> >
> > This is a big NAK - if computers are so damn cheap, why is virtualization
> > and consolidation such a big deal? Well, the answer is actually that
>
> yes, you did miss my point. I'm actually arguing that it's bad design
> to attempt to arbitrate within a single shared user-space. you make
> the fast path slower and less maintainable. if you are really concerned
> about isolating many competing servers on a single piece of hardware, then
> run separate virtualized environments, each with its own user-space.

I'm willing to agree to disagree. I'm in favor of full virtualization
as well, as it is appropriate to certain styles of workloads. I also
have enough end users who also want to share user level, share tasks,
yet also have some level of balancing between the resource consumption
of the various environments. I don't think you are one of those end
users, though. I don't think I'm required to make everyone happy all
the time. ;)

BTW, does your mailer purposefully remove cc:'s? Seems like that is
normally considered impolite.

gerrit

Mark Hahn

Jul 22, 2005, 1:50:07 AM

> of the various environments. I don't think you are one of those end
> users, though. I don't think I'm required to make everyone happy all
> the time. ;)

the issue is whether CKRM (in its real form, not this thin edge)
will noticeably hurt Linux's fast-path.

Alan Cox

Jul 22, 2005, 10:40:08 AM

On Gwe, 2005-07-22 at 00:53 -0400, Mark Hahn wrote:
> the fast path slower and less maintainable. if you are really concerned
> about isolating many competing servers on a single piece of hardware, then
> run separate virtualized environments, each with its own user-space.

And the virtualisation layer has to do the same job with less
information. That to me implies that the virtualisation case is likely
to be materially less efficient; it's just that the inefficiency you are
worried about is hidden in a different piece of code.

Secondly a lot of this doesn't matter if CKRM=n compiles to no code
anyway

Gerrit Huizenga

Jul 22, 2005, 12:00:35 PM

On Fri, 22 Jul 2005 15:53:55 BST, Alan Cox wrote:
> On Gwe, 2005-07-22 at 00:53 -0400, Mark Hahn wrote:
> > the fast path slower and less maintainable. if you are really concerned
> > about isolating many competing servers on a single piece of hardware, then
> > run separate virtualized environments, each with its own user-space.
>
> And the virtualisation layer has to do the same job with less
> information. That to me implies that the virtualisation case is likely
> to be materially less efficient, its just the inefficiency you are
> worried about is hidden in a different pieces of code.
>
> Secondly a lot of this doesnt matter if CKRM=n compiles to no code
> anyway

I'm actually trying to keep the impact of CKRM=y to near-zero, ergo
only an impact if you create classes. And even then, the goal is to
keep that impact pretty small as well.

And yes, a hypervisor does have a lot more overhead in many forms.
Something like an overall 2-3% everywhere, where the CKRM impact is
likely to be so small as to be hard to measure in the individual
subsystems, and overall performance impact should be even smaller.
Plus you won't have to manage each operating system instance which
can grow into a pain under virtualization. But I still maintain that
both have their place.

gerrit

Mark Hahn

Jul 22, 2005, 12:40:10 PM

> > > the fast path slower and less maintainable. if you are really concerned
> > > about isolating many competing servers on a single piece of hardware, then
> > > run separate virtualized environments, each with its own user-space.
> >
> > And the virtualisation layer has to do the same job with less
> > information. That to me implies that the virtualisation case is likely
> > to be materially less efficient, its just the inefficiency you are
> > worried about is hidden in a different pieces of code.

I imagine you, like me, are currently sitting in the Xen talk,
and I don't believe they are or will do anything so dumb as to throw away
or lose information. yes, in principle, the logic will need to be
somewhere, and I'm suggesting that the virtualization logic should
be in VMM-only code so it has literally zero effect on host-native
processes. *or* the host-native fast-path.

> > Secondly a lot of this doesnt matter if CKRM=n compiles to no code
> > anyway
>
> I'm actually trying to keep the impact of CKRM=y to near-zero, ergo
> only an impact if you create classes. And even then, the goal is to
> keep that impact pretty small as well.

but to really do CKRM, you are going to want quite extensive interaction with
the scheduler, VM page replacement policies, etc. all incredibly
performance-sensitive areas.

actually, let me also say that CKRM is on a continuum that includes
current (global) /proc tuning for various subsystems, ulimits, and
at the other end, Xen/VMM's. it's conceivable that CKRM could wind up
being useful and fast enough to subsume the current global and per-proc
tunables. after all, there are MANY places where the kernel tries to
maintain some sort of context to allow it to tune/throttle/readahead
based on some process-linked context. "embracing and extending"
those could make CKRM attractive to people outside the mainframe market.

> Plus you won't have to manage each operating system instance which
> can grow into a pain under virtualization. But I still maintain that
> both have their place.

CKRM may have its place in an externally-maintained patch ;)

regards, mark hahn.

Alan Cox

Jul 22, 2005, 3:10:08 PM

On Gwe, 2005-07-22 at 12:35 -0400, Mark Hahn wrote:
> I imagine you, like me, are currently sitting in the Xen talk,

Out by a few thousand miles ;)

> and I don't believe they are or will do anything so dumb as to throw away
> or lose information. yes, in principle, the logic will need to be

They don't have it in the first place.

> somewhere, and I'm suggesting that the virtualization logic should
> be in VMM-only code so it has literally zero effect on host-native
> processes. *or* the host-native fast-path.

I don't see why you are concerned. If the CKRM=n path is zero impact
then it's irrelevant to you. It's more expensive to do a lot of resource
management at the VMM level because the virtualisation engine doesn't
know anything but it's getting indications someone wants to be
bigger/smaller.

> but to really do CKRM, you are going to want quite extensive interaction with
> the scheduler, VM page replacement policies, etc. all incredibly
> performance-sensitive areas.

Bingo - and areas the virtualiser can't see into, at least not unless it
uses the same hooks CKRM uses

Alan

Paul Jackson

Jul 22, 2005, 4:00:15 PM

Shailabh wrote:
> So if the current CPU controller
> implementation is considered too intrusive/unacceptable, it can be
> reworked or (and we certainly hope not) even rejected in perpetuity.

It is certainly reasonable that you would hope such.

But this hypothetical possibility concerns me a little. Where would
that leave CKRM, if it was in the mainline kernel, but there was no CPU
controller in the mainline kernel? Wouldn't that be a rather serious
problem for many users of CKRM if they wanted to work on mainline
kernels?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <p...@sgi.com> 1.925.600.0401

Matthew Helsley

Jul 22, 2005, 4:30:11 PM

On Fri, 2005-07-22 at 12:35 -0400, Mark Hahn wrote:
<snip>

> actually, let me also say that CKRM is on a continuum that includes
> current (global) /proc tuning for various subsystems, ulimits, and
> at the other end, Xen/VMM's. it's conceivable that CKRM could wind up
> being useful and fast enough to subsume the current global and per-proc
> tunables. after all, there are MANY places where the kernel tries to
> maintain some sort of context to allow it to tune/throttle/readahead
> based on some process-linked context. "embracing and extending"
> those could make CKRM attractive to people outside the mainframe market.

Seems like an excellent suggestion to me! Yeah, it may be possible to
maintain the context the kernel keeps on a per-class basis instead of
globally or per-process. The real question is what constitutes a useful
"extension" :).

I was thinking that per-class nice values might be a good place to
start as well. One advantage of per-class as opposed to per-process nice
is the class is less transient than the process since its lifetime is
determined solely by the system administrator.

CKRM calls this kind of module a "resource controller". There's a small
HOWTO on writing resource controllers here:
http://ckrm.sourceforge.net/ckrm-controller-howto.txt
If anyone wants to investigate writing such a controller please feel
free to ask questions or send HOWTO feedback on the CKRM-Tech mailing
list at <ckrm...@lists.sourceforge.net>.

Thanks,
-Matt Helsley

Adrian Bunk

Jul 22, 2005, 5:30:12 PM

On Fri, Jul 15, 2005 at 01:36:53AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.13-rc2-mm2:
>...
> +ckrm-rule-based-classification-engine-full-ce.patch
>...
> Class-based kernel resource management
>...

This patch fixes the following warning with -Wundef:

<-- snip -->

...
CC kernel/ckrm/rbce/rbce_core.o
kernel/ckrm/rbce/rbce_core.c:323:5: warning: "__NOT_YET__" is not defined
...

<-- snip -->

Signed-off-by: Adrian Bunk <bu...@stusta.de>

--- linux-2.6.13-rc3-mm1-full/kernel/ckrm/rbce/rbce_core.c.old 2005-07-22 18:04:28.000000000 +0200
+++ linux-2.6.13-rc3-mm1-full/kernel/ckrm/rbce/rbce_core.c 2005-07-22 18:04:36.000000000 +0200
@@ -320,7 +320,7 @@

case RBCE_RULE_CMD_PATH:
case RBCE_RULE_CMD:
-#if __NOT_YET__
+#ifdef __NOT_YET__
if (!*filename) { /* get this once */
if (((*filename =
kmalloc(NAME_MAX,
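
For readers unfamiliar with the warning, the following small C sketch
shows how "#if" and "#ifdef" treat a macro that is never defined, and
where they diverge. The macro names DEMO_NOT_YET and DEMO_ZERO are made
up for illustration; DEMO_NOT_YET stands in for __NOT_YET__. Building it
with gcc -Wundef should warn only on the "#if DEMO_NOT_YET" line.

```c
/* DEMO_NOT_YET is deliberately never defined, mirroring __NOT_YET__. */
#define DEMO_ZERO 0

/* "#if" on an undefined macro: it evaluates as 0, and -Wundef warns. */
static int if_on_undefined(void)
{
#if DEMO_NOT_YET
    return 1;
#else
    return 0;
#endif
}

/* "#ifdef" on an undefined macro: same result, no warning. */
static int ifdef_on_undefined(void)
{
#ifdef DEMO_NOT_YET
    return 1;
#else
    return 0;
#endif
}

/* "#if" on a macro defined as 0: the body is skipped. */
static int if_on_zero(void)
{
#if DEMO_ZERO
    return 1;
#else
    return 0;
#endif
}

/* "#ifdef" on a macro defined as 0: the body is kept. */
static int ifdef_on_zero(void)
{
#ifdef DEMO_ZERO
    return 1;
#else
    return 0;
#endif
}
```

The two forms agree as long as the macro stays undefined, which is the
case the warning above reports. They differ only if someone later defines
the macro to 0, where "#ifdef" would enable the guarded code.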

Mark Hahn

Jul 22, 2005, 8:30:13 PM

> > actually, let me also say that CKRM is on a continuum that includes
> > current (global) /proc tuning for various subsystems, ulimits, and
> > at the other end, Xen/VMM's. it's conceivable that CKRM could wind up
> > being useful and fast enough to subsume the current global and per-proc
> > tunables. after all, there are MANY places where the kernel tries to
> > maintain some sort of context to allow it to tune/throttle/readahead
> > based on some process-linked context. "embracing and extending"
> > those could make CKRM attractive to people outside the mainframe market.
>
> Seems like an excellent suggestion to me! Yeah, it may be possible to
> maintain the context the kernel keeps on a per-class basis instead of
> globally or per-process.

right, but are the CKRM people ready to take this on? for instance,
I just grepped 'throttle' in kernel/mm and found a per-task RM in
page-writeback.c. it even has a vaguely class-oriented logic, since
it exempts RT tasks. if CKRM can become a way to make this stuff
cleaner and more effective (again, for normal tasks), then great.
but bolting on a big new different, intrusive mechanism that slows
down all normal jobs by 3% just so someone can run 10K mostly-idle
guests on a giant Power box, well, that's gross.

> The real question is what constitutes a useful
> "extension" :).

if CKRM is just extensions, I think it should be an external patch.
if it provides a path towards unifying the many disparate RM mechanisms
already in the kernel, great!

> I was thinking that per-class nice values might be a good place to
> start as well. One advantage of per-class as opposed to per-process nice
> is the class is less transient than the process since its lifetime is
> determined solely by the system administrator.

but the Linux RM needs to subsume traditional Unix process groups,
and inherited nice/sched class, and even CAP_ stuff. I think CKRM
could start to do this, since classes are very general.
but merely adding a new, incompatible feature is just Not A Good Idea.

regards, mark hahn.

Matthew Helsley

Jul 23, 2005, 12:40:04 AM

OK, so if it provides a path towards unifying these, what should happen
to the old interfaces when they conflict with those offered by CKRM?

For instance, I'm considering how a per-class (re)nice setting would
work. What should happen when the user (re)nices a process to a
different value than the nice of the process' class? Should CKRM:

a) disable the old interface by
i) removing it
ii) return an error when CKRM is active
iii) return an error when CKRM has specified a nice value for the
process via membership in a class
iv) return an error when the (re)nice value is inconsistent with the
nice value assigned to the class

b) trust the user, ignore the class nice value, and allow the new nice
value

I'd be tempted to do a.iv but it would require some modifications to a
system call. b probably wouldn't require any modifications to non-CKRM
files/dirs.
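
To make options a.iv and b concrete, here is a small userspace model in
C. It is only a sketch: the types and names (demo_class, demo_proc,
demo_renice, the policy enum) are invented for illustration, and it
assumes that "inconsistent" means asking for a higher priority
(numerically lower nice) than the class value, which is just one possible
reading.

```c
#include <errno.h>

/* Hypothetical model; none of these names come from CKRM. */
enum demo_policy {
    DEMO_REJECT_INCONSISTENT,   /* option a.iv */
    DEMO_TRUST_USER             /* option b */
};

struct demo_class {
    int nice;
};

struct demo_proc {
    int nice;
    const struct demo_class *cls;   /* NULL if the process is in no class */
};

/*
 * Assumption: a request is "inconsistent" when it asks for a nice value
 * below (higher priority than) the one assigned to the class.
 */
static int demo_renice(struct demo_proc *p, int nice, enum demo_policy policy)
{
    if (nice < -20 || nice > 19)
        return -EINVAL;
    if (policy == DEMO_REJECT_INCONSISTENT && p->cls && nice < p->cls->nice)
        return -EPERM;
    p->nice = nice;             /* option b always ends up here */
    return 0;
}
```

Under a.iv the check sits in the renice path itself, which is why it
would touch a system call; under b the class value is simply ignored.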

This sort of question would probably come up for any other CKRM
"embraced-and-extended" tunables. Should they use the answer to this
one, or would it go on a case-by-case basis?

Thanks,
-Matt Helsley

Mark Hahn

Jul 23, 2005, 11:50:12 AM

> > if CKRM is just extensions, I think it should be an external patch.
> > if it provides a path towards unifying the many disparate RM mechanisms
> > already in the kernel, great!
>
> OK, so if it provides a path towards unifying these, what should happen
> to the old interfaces when they conflict with those offered by CKRM?

I don't think the name matters, as long as the RM code is simplified/unified.
that is, the only difference at first would be a change in name -
same behavior.

> For instance, I'm considering how a per-class (re)nice setting would
> work. What should happen when the user (re)nices a process to a
> different value than the nice of the process' class? Should CKRM:

it has to behave as it does now, unless the admin has imposed some
class structure other than the normal POSIX one (ie, nice pertains
only to a process and is inherited by future children.)

> a) disable the old interface by
> i) removing it
> ii) return an error when CKRM is active
> iii) return an error when CKRM has specified a nice value for the
> process via membership in a class
> iv) return an error when the (re)nice value is inconsistent with the
> nice value assigned to the class

some interfaces must remain (renice), and if their behavior is implemented
via CKRM, it must, by default, act as before. other interfaces (say
overcommit_ratio) probably don't need to remain.

> b) trust the user, ignore the class nice value, and allow the new nice
> value

users can only nice up, and that policy needs to remain, obviously.
you appear to be asking what happens when the scope of the old mechanism
conflicts with the scope determined by admin-set CKRM classes. I'd
say that nicing a single process should change the nice of the whole
class that the process is in, if any. otherwise, it acts to rip that
process out of the class, which is probably even less 'least surprise'.
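
a rough C model of that rule (nicing one process changes the whole class,
if any) might look like the following. the struct and function names are
hypothetical, class_id < 0 is an assumed marker for "no admin-set class",
and the "users can only nice up" permission check is left out to keep it
short.

```c
#include <stddef.h>

/* Hypothetical model; class_id < 0 means "not in any admin-set class". */
struct demo_member {
    int nice;
    int class_id;
};

/*
 * Renice procs[target]. If it belongs to a class, every member of that
 * class gets the same value; otherwise only the target changes.
 */
static void demo_renice_class_wide(struct demo_member *procs, size_t n,
                                   size_t target, int nice)
{
    int cls = procs[target].class_id;
    size_t i;

    procs[target].nice = nice;
    if (cls < 0)
        return;                 /* unclassified: plain per-process nice */
    for (i = 0; i < n; i++)
        if (procs[i].class_id == cls)
            procs[i].nice = nice;
}
```

with no classes defined, every process takes the early return, so the
behavior matches plain per-process nice.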

> This sort of question would probably come up for any other CKRM
> "embraced-and-extended" tunables. Should they use the answer to this
> one, or would it go on a case-by-case basis?

I don't see that CKRM should play by rules different from other
kernel improvements - preserve standard/former behavior when that
behavior is documented (certainly nice is). in the absence of admin-set
classes, nice would behave the same.

all CKRM is doing here is providing a broader framework to hang the tunables
on. it should be able to express all existing tunables in scope.

Richard Purdie

Jul 24, 2005, 12:30:12 PM

On Fri, 2005-07-15 at 01:36 -0700, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/

On the Zaurus I'm seeing a couple of false "BUG: soft lockup detected on
CPU#0!" reports. These didn't show under 2.6.12-mm1 which was the last
-mm kernel I tested. softlockup.c seems identical between these versions
so it looks like some other change has caused this to appear...

Both of these are triggered from the nand driver. The functions
concerned (nand_wait_ready and nand_read_buf) are known to be slow (they
wait on hardware).

Richard

BUG: soft lockup detected on CPU#0!
Pid: 1, comm: swapper
CPU: 0
PC is at sharpsl_nand_dev_ready+0x14/0x28
LR is at nand_wait_ready+0x30/0x50
pc : [<c015f15c>] lr : [<c0159ea4>] Not tainted
sp : c034fa24 ip : c034fa34 fp : c034fa30
r10: c3c09400 r9 : c035e890 r8 : 0000c75a
r7 : c022533c r6 : c3c09580 r5 : c3c09400 r4 : ffff8fc3
r3 : c027f0bc r2 : c4856000 r1 : c4856000 r0 : c3c09400
Flags: NzCv IRQs on FIQs on Mode SVC_32 Segment kernel
Control: 397F Table: A0004000 DAC: 00000017
[<c001dd38>] (show_regs+0x0/0x4c) from [<c0059f48>] (softlockup_tick
+0x7c/0xb0)
r4 = C034E000
[<c0059ecc>] (softlockup_tick+0x0/0xb0) from [<c0042198>] (do_timer
+0x278/0x500)
r5 = 00000000 r4 = 00000001
[<c0041f20>] (do_timer+0x0/0x500) from [<c00211ac>] (timer_tick
+0xb4/0xf8)
[<c00210f8>] (timer_tick+0x0/0xf8) from [<c00279a0>]
(pxa_timer_interrupt+0x48/0xa8)
r6 = C034F9DC r5 = C034E000 r4 = F2A00000
[<c0027958>] (pxa_timer_interrupt+0x0/0xa8) from [<c001cb60>] (__do_irq
+0x6c/0xc4)
r8 = C034F9DC r7 = 00000000 r6 = 00000000 r5 = C034E000
r4 = C0226314
[<c001caf4>] (__do_irq+0x0/0xc4) from [<c001cddc>] (do_level_IRQ
+0x68/0xb8)
[<c001cd74>] (do_level_IRQ+0x0/0xb8) from [<c001ce80>] (asm_do_IRQ
+0x54/0x164)
r6 = 04000000 r5 = F2D00000 r4 = FFFFFFFF
[<c001ce2c>] (asm_do_IRQ+0x0/0x164) from [<c001b9b8>] (__irq_svc
+0x38/0x78)
[<c015f148>] (sharpsl_nand_dev_ready+0x0/0x28) from [<c0159ea4>]
(nand_wait_ready+0x30/0x50)
[<c0159e74>] (nand_wait_ready+0x0/0x50) from [<c0159f60>] (nand_command
+0x9c/0x1f0)
r7 = 00000000 r6 = C3C09400 r5 = C3C09580 r4 = 00000000
[<c0159ec4>] (nand_command+0x0/0x1f0) from [<c015b4b8>]
(nand_do_read_ecc+0x720/0x7c8)
r8 = C034FACC r7 = 00000200 r6 = C3C09580 r5 = 0000C75A
r4 = 00000000
[<c015ad98>] (nand_do_read_ecc+0x0/0x7c8) from [<c015b5e8>]
(nand_read_ecc+0x44/0x4c)
[<c015b5a4>] (nand_read_ecc+0x0/0x4c) from [<c0155364>] (part_read_ecc
+0xa4/0xbc)
r4 = 00000000
[<c01552c0>] (part_read_ecc+0x0/0xbc) from [<c00e5280>]
(jffs2_flash_read+0x1fc/0x2b0)
r7 = 00000000 r6 = 011E8B70 r5 = 00000000 r4 = C034FBF0
[<c00e5084>] (jffs2_flash_read+0x0/0x2b0) from [<c00dbd48>]
(jffs2_fill_scan_buf+0x2c/0x4c)
[<c00dbd1c>] (jffs2_fill_scan_buf+0x0/0x4c) from [<c00dc424>]
(jffs2_scan_medium+0x63c/0x1884)
r4 = 011E8B7C
[<c00dbde8>] (jffs2_scan_medium+0x0/0x1884) from [<c00e0020>]
(jffs2_do_mount_fs+0x1bc/0x6cc)
[<c00dfe64>] (jffs2_do_mount_fs+0x0/0x6cc) from [<c00e2d60>]
(jffs2_do_fill_super+0x130/0x2b4)
[<c00e2c30>] (jffs2_do_fill_super+0x0/0x2b4) from [<c00e3244>]
(jffs2_get_sb_mtd+0xf4/0x134)
r8 = 00008401 r7 = C3C4B4E0 r6 = C3C4B4FC r5 = C3C4B200
r4 = C3C4B400
[<c00e3150>] (jffs2_get_sb_mtd+0x0/0x134) from [<c00e32d4>]
(jffs2_get_sb_mtdnr+0x50/0x5c)
[<c00e3284>] (jffs2_get_sb_mtdnr+0x0/0x5c) from [<c00e3410>]
(jffs2_get_sb+0x130/0x1c0)
r7 = 00008001 r6 = C034FD5C r5 = C3C50000 r4 = FFFFFFEA
[<c00e32e0>] (jffs2_get_sb+0x0/0x1c0) from [<c00890d0>] (do_kern_mount
+0x50/0xf4)
[<c0089080>] (do_kern_mount+0x0/0xf4) from [<c00a3de8>] (do_mount
+0x3ac/0x650)
[<c00a3a3c>] (do_mount+0x0/0x650) from [<c00a44c8>] (sys_mount
+0x9c/0xe8)
[<c00a442c>] (sys_mount+0x0/0xe8) from [<c0008b64>] (mount_block_root
+0xb0/0x264)
r7 = C0343000 r6 = C034FF54 r5 = C0343006 r4 = C0343000
[<c0008ab4>] (mount_block_root+0x0/0x264) from [<c0008d74>] (mount_root
+0x5c/0x6c)
[<c0008d18>] (mount_root+0x0/0x6c) from [<c0008dc4>] (prepare_namespace
+0x40/0xe4)
r5 = C0019C70 r4 = C0019CC0
[<c0008d84>] (prepare_namespace+0x0/0xe4) from [<c001b200>] (init
+0x190/0x1fc)
r5 = C034E000 r4 = C01F5AA0
[<c001b070>] (init+0x0/0x1fc) from [<c0039a10>] (do_exit+0x0/0xd8c)
r8 = 00000000 r7 = 00000000 r6 = 00000000 r5 = 00000000
r4 = 00000000
VFS: Mounted root (jffs2 filesystem) readonly.

and

BUG: soft lockup detected on CPU#0!
Pid: 1063, comm: busybox
CPU: 0
PC is at nand_read_buf+0x28/0x3c
LR is at 0x100
pc : [<c0159cb8>] lr : [<00000100>] Not tainted
sp : c355dac8 ip : 0000003b fp : c355dad4
r10: c3c09400 r9 : c3b20884 r8 : 00000002
r7 : 00000000 r6 : c3c09580 r5 : 00000000 r4 : c3b20884
r3 : c4856014 r2 : 000000d3 r1 : c3b20884 r0 : c3c09580
Flags: Nzcv IRQs on FIQs on Mode SVC_32 Segment user
Control: 397F Table: A354C000 DAC: 00000015
[<c001dd38>] (show_regs+0x0/0x4c) from [<c0059f48>] (softlockup_tick
+0x7c/0xb0)
r4 = C355C000
[<c0059ecc>] (softlockup_tick+0x0/0xb0) from [<c0042198>] (do_timer
+0x278/0x500)
r5 = 00000000 r4 = 00000001
[<c0041f20>] (do_timer+0x0/0x500) from [<c00211ac>] (timer_tick
+0xb4/0xf8)
[<c00210f8>] (timer_tick+0x0/0xf8) from [<c00279a0>]
(pxa_timer_interrupt+0x48/0xa8)
r6 = C355DA80 r5 = C355C000 r4 = F2A00000
[<c0027958>] (pxa_timer_interrupt+0x0/0xa8) from [<c001cb60>] (__do_irq
+0x6c/0xc4)
r8 = C355DA80 r7 = 00000000 r6 = 00000000 r5 = C355C000
r4 = C0226314
[<c001caf4>] (__do_irq+0x0/0xc4) from [<c001cddc>] (do_level_IRQ
+0x68/0xb8)
[<c001cd74>] (do_level_IRQ+0x0/0xb8) from [<c001ce80>] (asm_do_IRQ
+0x54/0x164)
r6 = 04000000 r5 = F2D00000 r4 = FFFFFFFF
[<c001ce2c>] (asm_do_IRQ+0x0/0x164) from [<c001b9b8>] (__irq_svc
+0x38/0x78)
[<c0159c90>] (nand_read_buf+0x0/0x3c) from [<c015b328>]
(nand_do_read_ecc+0x590/0x7c8)
[<c015ad98>] (nand_do_read_ecc+0x0/0x7c8) from [<c015b5e8>]
(nand_read_ecc+0x44/0x4c)
[<c015b5a4>] (nand_read_ecc+0x0/0x4c) from [<c0155364>] (part_read_ecc
+0xa4/0xbc)
r4 = 00000000
[<c01552c0>] (part_read_ecc+0x0/0xbc) from [<c00e5280>]
(jffs2_flash_read+0x1fc/0x2b0)
r7 = 00000000 r6 = 0242897C r5 = 00000000 r4 = C355DC4C
[<c00e5084>] (jffs2_flash_read+0x0/0x2b0) from [<c00dbd48>]
(jffs2_fill_scan_buf+0x2c/0x4c)
[<c00dbd1c>] (jffs2_fill_scan_buf+0x0/0x4c) from [<c00dc424>]
(jffs2_scan_medium+0x63c/0x1884)
r4 = 02428988
[<c00dbde8>] (jffs2_scan_medium+0x0/0x1884) from [<c00e0020>]
(jffs2_do_mount_fs+0x1bc/0x6cc)
[<c00dfe64>] (jffs2_do_mount_fs+0x0/0x6cc) from [<c00e2d60>]
(jffs2_do_fill_super+0x130/0x2b4)
[<c00e2c30>] (jffs2_do_fill_super+0x0/0x2b4) from [<c00e3244>]
(jffs2_get_sb_mtd+0xf4/0x134)
r8 = 00000400 r7 = C3C510E0 r6 = C3C510FC r5 = C3F59C00
r4 = C3C51000
[<c00e3150>] (jffs2_get_sb_mtd+0x0/0x134) from [<c00e32d4>]
(jffs2_get_sb_mtdnr+0x50/0x5c)
[<c00e3284>] (jffs2_get_sb_mtdnr+0x0/0x5c) from [<c00e3410>]
(jffs2_get_sb+0x130/0x1c0)
r7 = 00000400 r6 = C355DDB8 r5 = C3D70000 r4 = FFFFFFEA
[<c00e32e0>] (jffs2_get_sb+0x0/0x1c0) from [<c00890d0>] (do_kern_mount
+0x50/0xf4)
[<c0089080>] (do_kern_mount+0x0/0xf4) from [<c00a3de8>] (do_mount
+0x3ac/0x650)
[<c00a3a3c>] (do_mount+0x0/0x650) from [<c00a44c8>] (sys_mount
+0x9c/0xe8)
[<c00a442c>] (sys_mount+0x0/0xe8) from [<c001bdc0>] (ret_fast_syscall
+0x0/0x2c)
r7 = 00000015 r6 = 000AC1F8 r5 = 00000000 r4 = 000A9050

Andrew Morton

Jul 25, 2005, 2:50:31 AM

Richard Purdie <rpu...@rpsys.net> wrote:
>
> On Fri, 2005-07-15 at 01:36 -0700, Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>
> On the Zaurus I'm seeing a couple of false "BUG: soft lockup detected on
> CPU#0!" reports. These didn't show under 2.6.12-mm1 which was the last
> -mm kernel I tested. softlockup.c seems identical between these versions
> so it looks like some other change has caused this to appear...
>
> Both of these are triggered from the nand driver. The functions
> concerned (nand_wait_ready and nand_read_buf) are known to be slow (they
> wait on hardware).

OK, thanks. We can stick a touch_softlockup_watchdog() into those two
functions to tell them that we know what we're doing. If you have time to
write-and-test a patch then please do so - otherwise I'll take an untested
shot at it.

Richard Purdie

Jul 25, 2005, 5:40:12 AM

Stop the nand functions triggering false softlockup reports.

Signed-off-by: Richard Purdie <rpu...@rpsys.net>

Index: linux-2.6.12/drivers/mtd/nand/nand_base.c
===================================================================
--- linux-2.6.12.orig/drivers/mtd/nand/nand_base.c 2005-07-24 18:49:35.000000000 +0100
+++ linux-2.6.12/drivers/mtd/nand/nand_base.c 2005-07-25 09:31:51.000000000 +0100
@@ -526,6 +526,7 @@
 	do {
 		if (this->dev_ready(mtd))
 			return;
+		touch_softlockup_watchdog();
 	} while (time_before(jiffies, timeo));
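
The idea behind the hunk above can be sketched in user space. This is only an illustration under stated assumptions, not kernel code: `fake_watchdog`, `fake_touch_watchdog()`, `fake_lockup_detected()` and `poll_device()` are hypothetical names invented for the sketch, and time is modelled as a plain tick counter. Only `touch_softlockup_watchdog()` itself comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the detector's "last seen alive" timestamp. */
struct fake_watchdog {
	unsigned long last_touch;	/* tick at which the watchdog was last reset */
	unsigned long threshold;	/* ticks without a touch before a report */
};

/* Reset the timestamp, as a touch call would. */
static void fake_touch_watchdog(struct fake_watchdog *wd, unsigned long now)
{
	wd->last_touch = now;
}

/* What a periodic checker would conclude at tick 'now'. */
static bool fake_lockup_detected(const struct fake_watchdog *wd, unsigned long now)
{
	return now - wd->last_touch > wd->threshold;
}

/*
 * Busy-wait on a pretend device that becomes ready at tick 'ready_at',
 * giving up after 'timeout' ticks. Returns true if the checker would
 * have reported a lockup at any point during the wait.
 */
static bool poll_device(struct fake_watchdog *wd, unsigned long start,
			unsigned long ready_at, unsigned long timeout, bool touch)
{
	bool reported = false;
	unsigned long now;

	assert(wd != NULL);
	for (now = start; now < start + timeout; now++) {
		if (fake_lockup_detected(wd, now))
			reported = true;
		if (now >= ready_at)
			break;
		if (touch)
			fake_touch_watchdog(wd, now);
	}
	return reported;
}
```

With a threshold of 10 ticks and a device that needs 50 ticks, the untouched loop trips the checker while the touching loop does not, which is the false positive the patch is meant to silence.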

Sam Ravnborg

Jul 27, 2005, 7:00:23 PM

On Fri, Jul 15, 2005 at 10:14:43PM +0000, J.A. Magallon wrote:
>
> On 07.16, J.A. Magallon wrote:

> >
> > On 07.15, Andrew Morton wrote:
> > >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> > >
>

> This time I did not break anything... and they shut up gcc4 ;)

I have applied it to my tree. There still is a lot left when I compile
with -Wsign-compare.

Sam

J.A. Magallon

Jul 27, 2005, 7:50:09 PM

On 07.27, Sam Ravnborg wrote:
> On Fri, Jul 15, 2005 at 10:14:43PM +0000, J.A. Magallon wrote:
> >
> > On 07.16, J.A. Magallon wrote:
> > >
> > > On 07.15, Andrew Morton wrote:
> > > >
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> > > >
> >
> > This time I did not break anything... and they shut up gcc4 ;)
>
> I have applied it to my tree. There still is a lot left when I compile
> with -Wsign-compare.
>

All the problems are born here:

struct sym_entry {
	unsigned long long addr;
	unsigned int len;
	unsigned char *sym;
};

I suppose you want sym to be an unsigned char to store the type and to do
the checksum math in there.
And why use a 64-bit address on 32-bit archs? There is no math involved
with 'addr', so you can make it a pointer and let the compiler decide its
size.

Why don't you do something like:

struct sym_entry {
	void *addr;
	unsigned char type;
	unsigned short len;
	union {
		unsigned char data[KSYM_NAME_LEN+1];
		char name[KSYM_NAME_LEN+1];
	};
};

Option b) is to identify the five lines that do the checksum math and plague
them with (unsigned char) casts...
Will try to do it...
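
A minimal sketch of how the union layout proposed above could be used, so that string functions see `name` and byte arithmetic sees `data` without any casts. The value of `KSYM_NAME_LEN`, the helpers `sym_set_name()` and `sym_byte_sum()`, and the trivial byte sum are all assumptions made for illustration; this is not the checksum code from scripts/kallsyms.c. The anonymous union needs C11 or the GNU extension.

```c
#include <assert.h>
#include <string.h>

#define KSYM_NAME_LEN 127	/* assumed value for this sketch */

struct sym_entry {
	void *addr;
	unsigned char type;
	unsigned short len;
	union {
		unsigned char data[KSYM_NAME_LEN + 1];	/* byte view for arithmetic */
		char name[KSYM_NAME_LEN + 1];		/* string view for strcmp() etc. */
	};
};

/* Store a NUL-terminated name through the string view. */
static void sym_set_name(struct sym_entry *s, const char *name)
{
	size_t n = strlen(name);

	assert(n <= KSYM_NAME_LEN);
	memcpy(s->name, name, n + 1);
	s->len = (unsigned short)n;
}

/* Sum the bytes through the unsigned view, so values above 127 stay positive. */
static unsigned int sym_byte_sum(const struct sym_entry *s)
{
	unsigned int sum = 0;
	unsigned short i;

	for (i = 0; i < s->len; i++)
		sum += s->data[i];
	return sum;
}
```

The cost of this layout is a fixed-size buffer per symbol instead of a pointer, which is worth weighing against simply adding casts.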

--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandriva Linux release 2006.0 (Cooker) for i586
Linux 2.6.12-jam10 (gcc 4.0.1 (4.0.1-0.2mdk for Mandriva Linux release 2006.0))

Paulo Marques

Jul 28, 2005, 6:20:17 AM

J.A. Magallon wrote:
> On 07.27, Sam Ravnborg wrote:
>
>>On Fri, Jul 15, 2005 at 10:14:43PM +0000, J.A. Magallon wrote:
>>
>>>On 07.16, J.A. Magallon wrote:
>>>
>>>>On 07.15, Andrew Morton wrote:
>>>>
>>>>>ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
>>>
>>>This time I did not break anything... and they shut up gcc4 ;)
>>
>>I have applied it to my tree. There still is a lot left when I compile
>>with -Wsign-compare.
>
> All the problems are born here:
>
> struct sym_entry {
> unsigned long long addr;
> unsigned int len;
> unsigned char *sym;
> };

What are you guys talking about?

I've just compiled the current version in -mm with -Wsign-compare and it
doesn't give me a single warning.

Is my compiler version the problem (3.3.2), or are you testing with the
old version of kallsyms?

--
Paulo Marques - www.grupopie.com

It is a mistake to think you can solve any major problems
just with potatoes.
Douglas Adams

Bernd Petrovitsch

Jul 28, 2005, 6:30:24 AM

On Thu, 2005-07-28 at 11:02 +0100, Paulo Marques wrote:
> J.A. Magallon wrote:
> > On 07.27, Sam Ravnborg wrote:
> >>On Fri, Jul 15, 2005 at 10:14:43PM +0000, J.A. Magallon wrote:
> >>>On 07.16, J.A. Magallon wrote:
> >>>>On 07.15, Andrew Morton wrote:
> >>>>
> >>>>>ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm1/
> >>>
> >>>This time I did not break anything... and they shut up gcc4 ;)

                                                            ^^^^

> >>I have applied it to my tree. There still is a lot left when I compile
> >>with -Wsign-compare.
> >
> > All the problems are born here:
> >
> > struct sym_entry {
> > unsigned long long addr;
> > unsigned int len;
> > unsigned char *sym;
> > };
>
> What are you guys talking about?

"unsigned char *" is simply the wrong type for mere text strings. "char
*" is the correct one. These are BTW two completely different types
(yes, "char" can be promoted into an "unsigned char" but essentially
these are two completely different types like "int" and "long long *").

> Is my compiler version the problem (3.3.2), or are you testing with the

Compiler version - use gcc-4.*.

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services

Paulo Marques

Jul 28, 2005, 6:50:13 AM

Bernd Petrovitsch wrote:
> On Thu, 2005-07-28 at 11:02 +0100, Paulo Marques wrote:
>>J.A. Magallon wrote:

>>[...]

>>>All the problems are born here:
>>>
>>>struct sym_entry {
>>> unsigned long long addr;
>>> unsigned int len;
>>> unsigned char *sym;
>>>};
>>
>>What are you guys talking about?
>
> "unsigned char *" is simply the wrong type for mere text strings. "char
> *" ist the corrrect one. These are BTW two completely different types
> (yes, "char" can be promoted into an "unsigned char" but essentially
> these are two completely different types like "int" and "long long *").

You're coming really late to this thread :)

The problem is that "sym" isn't really a string. It starts out as a
string, but as the compression scheme begins to work it just becomes a
"bunch of bytes" using all the values in the range 0-255 for which
unsigned char is the perfect type.

Since only the loading of the symbols uses string functions, and all the
compression process treats these as bytes, it seemed better to treat
them as unsigned chars and just typecast the first few uses.

The union suggested by J.A.Magallon might be a reasonable solution, but
we only need 4 casts in the 500 lines of code of scripts/kallsyms.c to
solve all problems, so this seems really overkill.
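
To illustrate the trade-off described above, here is a small sketch, assuming nothing about the real scripts/kallsyms.c beyond what this message says: the buffer is kept as `unsigned char` so that every value 0-255 can safely index a table, and a cast is applied only at the few places where it is still plain text. The helper names `count_bytes()` and `sym_is()` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Count how often each byte value occurs. Indexing with an unsigned
 * char is always in 0..255; a plain (possibly signed) char holding
 * 0xfe could turn into a negative index.
 */
static void count_bytes(const unsigned char *sym, size_t len,
			unsigned int freq[256])
{
	size_t i;

	assert(sym != NULL && freq != NULL);
	for (i = 0; i < len; i++)
		freq[sym[i]]++;
}

/* While the buffer is still NUL-terminated text, one cast lets strcmp() use it. */
static int sym_is(const unsigned char *sym, const char *name)
{
	return strcmp((const char *)sym, name) == 0;
}
```

Keeping the cast inside one helper also keeps the number of casts in the source small, which is the goal stated above.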

>>Is my compiler version the problem (3.3.2), or are you testing with the
>
> Compiler version - zse gcc-4.*.

Yes, I know J.A.Magallon is trying to silence the warnings of gcc 4.0,
but as I understood it, gcc 3 would also complain of the same problems
if -Wsign-compare were specified. It was just that gcc4 would complain
even without -Wsign-compare.

So the question is: is gcc4 complaining about signedness problems that
gcc3 doesn't, even with -Wsign-compare?

Now that I look at the source, I can see that it must be complaining!
There are still 3 calls to strcmp that use sym directly, and gcc3
doesn't say a thing.

I thought that these were already taken care of. In any case, the
attached patch should fix those and make the code more readable too.
With this patch we end up only having 2 casts to (char *) in the whole
source.

Can someone with gcc 4 apply this to the latest -mm and check that it
fixes everything?

kallsyms_sign.patch

Bernd Petrovitsch

Jul 28, 2005, 7:10:14 AM

On Thu, 2005-07-28 at 11:40 +0100, Paulo Marques wrote:
[...]

> You're comming really late in this thread :)

Well, the same issue arose recently somewhere else on this list too, and
lots of C programmers (not only beginners) don't know about the 3 char
types as specified in the C standard.
[ The C standard may have got problems recently with the real world
using UTF-8 for a printable character but this must be solved
elsewhere. ]
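
The point about three distinct char types can be checked with a short sketch. It relies on `_Generic`, which is C11 and therefore much newer than the compilers discussed in this thread; the macro `CHAR_KIND` and the helper functions are names made up for the example. A `_Generic` selection may not list two compatible types, so the fact that this compiles at all shows that `char`, `signed char` and `unsigned char` are three different types.

```c
#include <string.h>

/* Map an expression to the name of its character type (C11 _Generic). */
#define CHAR_KIND(x) _Generic((x),		\
	char:          "char",			\
	signed char:   "signed char",		\
	unsigned char: "unsigned char",		\
	default:       "other")

static const char *kind_of_plain(void)
{
	char c = 0;
	return CHAR_KIND(c);
}

static const char *kind_of_signed(void)
{
	signed char c = 0;
	return CHAR_KIND(c);
}

static const char *kind_of_unsigned(void)
{
	unsigned char c = 0;
	return CHAR_KIND(c);
}

/* Returns nonzero when all three variables selected different branches. */
static int char_kinds_are_distinct(void)
{
	return strcmp(kind_of_plain(), kind_of_signed()) != 0 &&
	       strcmp(kind_of_plain(), kind_of_unsigned()) != 0 &&
	       strcmp(kind_of_signed(), kind_of_unsigned()) != 0;
}
```

Plain `char` has the same range as one of the other two on any given platform, but it remains its own type, which is why pointers to them do not convert silently.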

> The problem is that "sym" isn't really a string. It starts out as a
> string, but as the compression scheme begins to work it just becomes a
> "bunch of bytes" using all the values in the range 0-255 for which
> unsigned char is the perfect type.
>
> Since only the loading of the symbols use string functions, and all the
> compression process treats these as bytes, it seemed better to treat
> them as unsigned chars and just typecast the first few uses.

ACK.

> The union suggested by J.A.Magallon might be a reasonable solution, but

Syntactically yes. Conceptually no IMHO. sizeof(char) must be ==
sizeof(unsigned char) and must have the same alignment. So a cast seems
to be the simpler and cleaner solution.

> we only need 4 casts in the 500 lines of code of scripts/kallsyms.c to
> solve all problems, so this seems really overkill.
>
> >>Is my compiler version the problem (3.3.2), or are you testing with the
> >
> > Compiler version - zse gcc-4.*.
>
> Yes, I know J.A.Magallon is trying to silence the warnings of gcc 4.0,
> but as I understood it, gcc 3 would also complain of the same problems
> if -Wsign-compare were specified. It was just that gcc4 would complain
> even without -Wsign-compare.

AFAIK -Wsign-compare in gcc-3 applies only to pure compares (<, >, ...)
and not to assignments or passed parameters.
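
For reference, a sketch of the kind of "pure compare" that -Wsign-compare is about. The function names are hypothetical. Passing an `unsigned char *` where a `char *` is expected is a separate diagnostic about pointer signedness (reported by gcc 4 as far as this thread indicates), and this example does not exercise it.

```c
/*
 * Mixed comparison: 'a' is converted to unsigned int before comparing,
 * so -1 becomes a very large value. This is the pattern that
 * -Wsign-compare flags.
 */
static int naive_less(int a, unsigned int b)
{
	return a < b;
}

/* Handle negative values first, then compare as unsigned. */
static int safe_less(int a, unsigned int b)
{
	return a < 0 || (unsigned int)a < b;
}
```

Called with -1 and 1u, the naive version reports that -1 is not less than 1, while the guarded version gives the arithmetically expected answer.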

> So the question is: is gcc4 complaining about signedness problems that
> gcc3 doesn't, even with -Wsign-compare?
>
> Now that I look at the source, I can see that it must be complaining!
> There are still 3 calls to strcmp that use sym directly, and gcc3
> doesn't say a thing.

As above - this will probably be silently promoted by gcc-3.

Helge Hafting

Jul 28, 2005, 8:50:13 AM

I usually compile without module support. This time, I turned modules
on in order to compile an external module.

To my surprise, drivers/scsi/qla2xxx/qla2xxx.ko was built even though
no actual modules are selected in my .config, and the source is
not patched at all except the mm1 patch.

Helge Hafting

Adrian Bunk

Jul 28, 2005, 9:00:45 AM

On Thu, Jul 28, 2005 at 02:50:24PM +0200, Helge Hafting wrote:

> I usually compile without module support. This time, I turned modules
> on in order to compile an external module.
>
> To my surprise, drivers/scsi/qla2xxx/qla2xxx.ko were built even though
> no actual modules are selected in my .config, and the source is
> not patched at all except the mm1 patch.

Known bug, already fixed in -mm3.

> Helge Hafting

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Shailabh Nagar

Jul 28, 2005, 4:20:10 PM

Paul Jackson wrote:

Sorry for the late response - I just saw this note.

> Shailabh wrote:
>
>>So if the current CPU controller
>> implementation is considered too intrusive/unacceptable, it can be
>>reworked or (and we certainly hope not) even rejected in perpetuity.
>
>
> It is certainly reasonable that you would hope such.
>
> But this hypothetical possibility concerns me a little. Where would
> that leave CKRM, if it was in the mainline kernel, but there was no CPU
> controller in the mainline kernel?

It would be unfortunate indeed since CPU is the first resource that
people want to try and control.

However, I feel the following are also true:

1. It is still better to have CKRM with the I/O, memory, network,
forkrate controllers than to have nothing just because the CPU
controller is unacceptable. Each controller is useful in its own right.
It may not be enough to justify the framework all by itself but together
with others (and the possibility of future controllers and per-class
metrics), it is sufficient.

2. A CPU controller which is acceptable can be developed. It may not
work as well because of the need to keep it simple and not affect the
non-CKRM user path, but it will be better than not having anything.
Years ago, people said a low-overhead SMP scheduler couldn't be written
and they were proved wrong. Currently Ingo is hard at work to make
acceptable-impact real time scheduling happen. So why should we rule out
the possibility of someone being able to develop a CKRM CPU controller
with acceptable impact?

Basically, I'm pointing out that there is no reason to hold the
acceptance of the CKRM framework + other controllers hostage to its
current CPU controller implementation (or any one controller's
implementation for that matter).

> Wouldn't that be a rather serious
> problem for many users of CKRM if they wanted to work on mainline
> kernels?

Yes it would. And one could say that it's one of the features of the
Linux kernel community that they would have to learn to accept. Just
like the embedded folks who have been rooting for realtime enhancements
to be made mainstream for years now, like the RAS folks who have been
making a case for better dump/probe tools, and you, who have tried in the
past to get the community to accept PAGG/CSA :-)

But I don't think we need to be resigned to a CPU controller-less
existence quite yet. Using the examples given earlier, realtime is
being discussed seriously now and RAS features are getting acceptance.
So why should one rule out the possibility of an acceptable CPU
controller for CKRM being developed?

We, the current developers of CKRM, hope our current design can be a
basis for the "one controller to rule them all"! But if there are other
ways of doing it or people can point out what's wrong with the
implementation, it can be reworked or rewritten from scratch.

The important thing, as Andrew said, is to get real feedback about what
is unacceptable in the current implementation and any ideas on how it
can be done better. But let's start off with what has been put out there
in -mm rather than getting stuck on discussing something that hasn't
even been put out yet.

--Shailabh

Paul Jackson

Jul 28, 2005, 7:00:21 PM

Thanks for your well worded response, Shailabh.

Others will have to make further comments and
decisions here. You have understood what I had
to say, and responded well. I have nothing to
add at this point that would help further.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <p...@sgi.com> 1.925.600.0401