problem: SATA performance drop down

GUO Zhijun

Jul 6, 2007, 3:00:13 AM
Hi all,

The box is a Tyan S2925 with a single AMD Athlon 64 X2 4800+, 4x WD2500YS
drives and 4x 1 GB DDR2. It runs etch with kernel 2.6.18-4-amd64, software
RAID 5 and software RAID 1. It serves web requests for static files and
exports its storage over NFS to other boxes.

One day its performance dropped suddenly and the load average went up
to 200~400. I used hddtemp to check the drive temperatures. sda and sdb
were normal and hddtemp returned immediately, but when hddtemp was
checking sdc and sdd it stalled for 3-5 seconds and reported that they
don't seem to have a sensor. -___-b

I also found the following in the log:

Jul 3 15:00:22 jupiter kernel: ata3: port is slow to respond, please be patient
Jul 3 15:00:22 jupiter kernel: ata3: soft resetting port
Jul 3 15:00:22 jupiter kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jul 3 15:00:22 jupiter kernel: ata3.00: configured for UDMA/133
Jul 3 15:00:22 jupiter kernel: ata3: EH complete
Jul 3 15:00:22 jupiter kernel: SCSI device sdc: 490232639 512-byte hdwr sectors (250999 MB)
Jul 3 15:00:22 jupiter kernel: sdc: Write Protect is off
Jul 3 15:00:22 jupiter kernel: SCSI device sdc: drive cache: write back

Jul 3 15:00:23 jupiter kernel: ata4: soft resetting port
Jul 3 15:00:23 jupiter kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jul 3 15:00:23 jupiter kernel: ata4.00: configured for UDMA/133
Jul 3 15:00:23 jupiter kernel: ata4: EH complete
Jul 3 15:00:23 jupiter kernel: SCSI device sdd: 490232639 512-byte hdwr sectors (250999 MB)
Jul 3 15:00:23 jupiter kernel: sdd: Write Protect is off
Jul 3 15:00:23 jupiter kernel: SCSI device sdd: drive cache: write back

The time matches, so I think hddtemp triggered this.
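
For anyone wanting to check the same kind of timing match on their own box, here is a minimal sketch in Python. It assumes the kernel messages are available as plain syslog text in the format shown in this post. The function name `resets_by_port` and the regular expression are illustrative only and are not part of any existing tool.

```python
import re
from collections import defaultdict

# Sample lines copied from the syslog excerpt in this post.
LOG = """\
Jul 3 15:00:22 jupiter kernel: ata3: port is slow to respond, please be patient
Jul 3 15:00:22 jupiter kernel: ata3: soft resetting port
Jul 3 15:00:22 jupiter kernel: ata3: EH complete
Jul 3 15:00:23 jupiter kernel: ata4: soft resetting port
Jul 3 15:00:23 jupiter kernel: ata4: EH complete
"""

# Matches "<month> <day> <hh:mm:ss> <host> kernel: ataN: soft resetting port".
RESET_RE = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+(ata\d+): soft resetting port"
)


def resets_by_port(text):
    """Return {port: [timestamp, ...]} for every soft reset found in text."""
    found = defaultdict(list)
    for line in text.splitlines():
        match = RESET_RE.match(line)
        if match:
            timestamp, port = match.groups()
            found[port].append(timestamp)
    return dict(found)


if __name__ == "__main__":
    for port, stamps in sorted(resets_by_port(LOG).items()):
        print(port, stamps)
```

Comparing the timestamps per port with the times hddtemp was run would show whether every reset lines up with a temperature query or whether some happen on their own.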

And there are some more logs:

Jul 3 15:12:22 jupiter kernel: ata3.00: limiting speed to UDMA/100

Jul 3 15:12:28 jupiter kernel: ata3: soft resetting port
Jul 3 15:12:29 jupiter kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jul 3 15:12:29 jupiter kernel: ata3.00: configured for UDMA/100
Jul 3 15:12:29 jupiter kernel: ata3: EH complete
Jul 3 15:12:29 jupiter kernel: SCSI device sdc: 490232639 512-byte hdwr sectors (250999 MB)
Jul 3 15:12:29 jupiter kernel: sdc: Write Protect is off
Jul 3 15:12:29 jupiter kernel: SCSI device sdc: drive cache: write back

I googled this and found that there is updated firmware for the WDxxxxYS
series, so I upgraded the firmware on all the drives and flashed the
motherboard BIOS.

After a cold boot the situation is still the same. Here is the hdparm output:

jupiter:~# hdparm -t /dev/sd[a-d]
/dev/sda:
Timing buffered disk reads: 16 MB in 3.17 seconds = 5.05 MB/sec
/dev/sdb:
Timing buffered disk reads: 56 MB in 3.07 seconds = 18.24 MB/sec
/dev/sdc:
Timing buffered disk reads: 34 MB in 3.19 seconds = 10.65 MB/sec
/dev/sdd:
Timing buffered disk reads: 54 MB in 3.21 seconds = 16.83 MB/sec
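
To compare the four drives at a glance, the hdparm figures can be parsed with a small script. This is only a sketch: the helper names `parse_hdparm` and `slower_than` are made up for illustration, and the 15 MB/sec threshold is an arbitrary cut-off for this example, not a specification for these drives.

```python
import re

# Output copied from the hdparm run in this post.
HDPARM_OUTPUT = """\
/dev/sda:
Timing buffered disk reads: 16 MB in 3.17 seconds = 5.05 MB/sec
/dev/sdb:
Timing buffered disk reads: 56 MB in 3.07 seconds = 18.24 MB/sec
/dev/sdc:
Timing buffered disk reads: 34 MB in 3.19 seconds = 10.65 MB/sec
/dev/sdd:
Timing buffered disk reads: 54 MB in 3.21 seconds = 16.83 MB/sec
"""


def parse_hdparm(text):
    """Return {device: MB/sec} from `hdparm -t` style output."""
    rates = {}
    device = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("/dev/") and line.endswith(":"):
            device = line[:-1]  # remember which device the next figure belongs to
            continue
        match = re.search(r"=\s*([\d.]+)\s*MB/sec", line)
        if match and device:
            rates[device] = float(match.group(1))
    return rates


def slower_than(rates, threshold_mb_s):
    """Return the devices whose read rate is below the given threshold."""
    return sorted(dev for dev, rate in rates.items() if rate < threshold_mb_s)


if __name__ == "__main__":
    rates = parse_hdparm(HDPARM_OUTPUT)
    print(rates)
    print("below 15 MB/sec:", slower_than(rates, 15.0))
```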

Crying.... Could anyone help? Any hints?

jupiter:~# lspci
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a3)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:06.0 PCI bridge: nVidia Corporation MCP55 PCI bridge (rev a2)
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:09.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
00:0a.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:0f.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] Miscellaneous Control
01:0a.0 VGA compatible controller: XGI - Xabre Graphics Inc Volari Z7

# dmesg
Bootdata ok (command line is root=/dev/md1 ro )
Linux version 2.6.18-4-amd64 (Debian 2.6.18.dfsg.1-12etch2)
(da...@debian.org) (gcc version 4.1.2 20061115 (prerelease) (Debian
4.1.1-21)) #1 SMP Fri May 4 00:37:33 UTC 2007
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000d7fd0000 (usable)
BIOS-e820: 00000000d7fd0000 - 00000000d7fde000 (ACPI data)
BIOS-e820: 00000000d7fde000 - 00000000d8000000 (ACPI NVS)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
BIOS-e820: 0000000100000000 - 0000000128000000 (usable)
DMI present.
ACPI: RSDP (v002 ACPIAM ) @ 0x00000000000f9100
ACPI: XSDT (v001 A M I OEMXSDT 0x04000710 MSFT 0x00000097) @
0x00000000d7fd0100
ACPI: FADT (v003 A M I OEMFACP 0x04000710 MSFT 0x00000097) @
0x00000000d7fd0290
ACPI: MADT (v001 A M I OEMAPIC 0x04000710 MSFT 0x00000097) @
0x00000000d7fd0390
ACPI: MCFG (v001 A M I OEMMCFG 0x04000710 MSFT 0x00000097) @
0x00000000d7fd0400
ACPI: OEMB (v001 A M I AMI_OEM 0x04000710 MSFT 0x00000097) @
0x00000000d7fde040
ACPI: HPET (v001 A M I OEMHPET0 0x04000710 MSFT 0x00000097) @
0x00000000d7fd4e40
ACPI: SSDT (v001 A M I POWERNOW 0x00000001 AMD 0x00000001) @
0x00000000d7fd4e80
ACPI: DSDT (v001 0AAAA 0AAAA000 0x00000000 INTL 0x20051117) @
0x0000000000000000
Scanning NUMA topology in Northbridge 24
Number of nodes 1
Node 0 MemBase 0000000000000000 Limit 0000000128000000
NUMA: Using 63 for the hash shift.
Using node hash shift of 63
Bootmem setup node 0 0000000000000000-0000000128000000
On node 0 totalpages: 1030966
DMA zone: 3054 pages, LIFO batch:0
DMA32 zone: 866312 pages, LIFO batch:31
Normal zone: 161600 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x2008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:11 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:11 APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Setting APIC routing to physical flat
ACPI: HPET id: 0x10de8201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at dc000000 (gap: d8000000:26c00000)
SMP: Allowing 2 CPUs, 0 hotplug CPUs
Built 1 zonelists. Total pages: 1030966
Kernel command line: root=/dev/md1 ro
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
time.c: Using 25.000000 MHz WALL HPET GTOD HPET timer.
time.c: Detected 2500.492 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Checking aperture...
CPU 0: aperture @ d8000000 size 128 MB
Memory: 4111800k/4849664k available (1930k kernel code, 81924k
reserved, 868k data, 176k init)
Calibrating delay using timer specific routine.. 5005.55 BogoMIPS (lpj=10011110)
Security Framework v1.0.0 initialized
SELinux: Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0/0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12502457
Detected 12.502 MHz APIC timer.
SMP alternatives: switching to SMP code
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 5000.17 BogoMIPS (lpj=10000349)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 1/1 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff 0 cycles, maxerr 576 cycles)
Brought up 2 CPUs
testing NMI watchdog ... OK.
migration_cost=151
checking if image is initramfs... it is
Freeing initrd memory: 5666k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: BIOS Bug: MCFG area at e0000000 is not E820-reserved
PCI: Not using MMCONFIG.
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:01:0a.0
PCI: Transparent bridge - 0000:00:06.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR10._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR11._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR12._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR13._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR14._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR15._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNKB] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNEA] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNEB] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNEC] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNED] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LUB0] (IRQs 20 21 22 23) *5
ACPI: PCI Interrupt Link [LMAD] (IRQs 20 21 22 23) *5
ACPI: PCI Interrupt Link [LUB2] (IRQs 20 21 22 23) *14
ACPI: PCI Interrupt Link [LMAC] (IRQs 20 21 22 23) *11
ACPI: PCI Interrupt Link [LAZA] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 20 21 22 23) *10
ACPI: PCI Interrupt Link [LPMU] (IRQs 20 21 22 23) *11
ACPI: PCI Interrupt Link [LSA0] (IRQs 20 21 22 23) *7
ACPI: PCI Interrupt Link [LSA1] (IRQs 20 21 22 23) *10
ACPI: PCI Interrupt Link [LATA] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [LSA2] (IRQs 20 21 22 23) *0, disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: ACPI device : hid PNP0A03
pnp: ACPI device : hid PNP0200
pnp: ACPI device : hid PNP0B00
pnp: ACPI device : hid PNP0800
pnp: ACPI device : hid PNP0C04
pnp: ACPI device : hid PNP0C02
pnp: ACPI device : hid PNP0103
pnp: ACPI device : hid PNP0C02
pnp: ACPI device : hid PNP0303
pnp: ACPI device : hid PNP0C02
pnp: ACPI device : hid PNP0C02
pnp: ACPI device : hid PNP0C01
pnp: PnP ACPI: found 12 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
hpet0: at MMIO 0xfed00000 (virtual 0xffffffffff5fe000), IRQs 2, 8, 31
hpet0: 3 32-bit timers, 25000000 Hz
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ d8000000 size 131072 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 128MB of IOMMU area in the AGP aperture
pnp: the driver 'system' has been registered
pnp: match found with the PnP device '00:05' and the driver 'system'
pnp: match found with the PnP device '00:07' and the driver 'system'
pnp: 00:07: ioport range 0xca0-0xcaf has been reserved
pnp: match found with the PnP device '00:09' and the driver 'system'
pnp: 00:09: ioport range 0xa00-0xa7f has been reserved
pnp: match found with the PnP device '00:0a' and the driver 'system'
pnp: match found with the PnP device '00:0b' and the driver 'system'
PCI: Ignore bogus resource 6 [0:0] of 0000:01:0a.0
PCI: Bridge: 0000:00:06.0
IO window: e000-efff
MEM window: feb00000-febfffff
PREFETCH window: f8000000-fbffffff
PCI: Bridge: 0000:00:0a.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0b.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0c.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0d.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0e.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0f.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Setting latency timer of device 0000:00:06.0 to 64
PCI: Setting latency timer of device 0000:00:0a.0 to 64
PCI: Setting latency timer of device 0000:00:0b.0 to 64
PCI: Setting latency timer of device 0000:00:0c.0 to 64
PCI: Setting latency timer of device 0000:00:0d.0 to 64
PCI: Setting latency timer of device 0000:00:0e.0 to 64
PCI: Setting latency timer of device 0000:00:0f.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
audit: initializing netlink socket (disabled)
audit(1183699578.360:1): initialized
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
PCI: Setting latency timer of device 0000:00:0a.0 to 64
pcie_portdrv_probe->Dev[0376:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:0a.0:pcie00]
PCI: Setting latency timer of device 0000:00:0b.0 to 64
pcie_portdrv_probe->Dev[0374:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:0b.0:pcie00]
PCI: Setting latency timer of device 0000:00:0c.0 to 64
pcie_portdrv_probe->Dev[0374:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:0c.0:pcie00]
PCI: Setting latency timer of device 0000:00:0d.0 to 64
pcie_portdrv_probe->Dev[0378:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:0d.0:pcie00]
PCI: Setting latency timer of device 0000:00:0e.0 to 64
pcie_portdrv_probe->Dev[0375:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:0e.0:pcie00]
PCI: Setting latency timer of device 0000:00:0f.0 to 64
pcie_portdrv_probe->Dev[0377:10de] has invalid IRQ. Check vendor BIOS
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:0f.0:pcie00]
Real Time Clock Driver v1.12ac
hpet_resources: 0xfed00000 is busy
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
pnp: the driver 'serial' has been registered
RAMDISK driver initialized: 16 RAM disks of 65536K size 1024 blocksize
pnp: the driver 'i8042 kbd' has been registered
pnp: match found with the PnP device '00:08' and the driver 'i8042 kbd'
pnp: the driver 'i8042 aux' has been registered
PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
PNP: PS/2 controller doesn't have AUX irq; using default 12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 8
NET: Registered protocol family 20
ACPI: (supports S0 S1 S3 S4 S5)
Freeing unused kernel memory: 176k freed
ACPI: PCI Interrupt Link [LUB2] enabled at IRQ 23
GSI 16 sharing vector 0xE1 and IRQ 16
ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [LUB2] -> GSI 23 (level,
low) -> IRQ 225
PCI: Setting latency timer of device 0000:00:02.1 to 64
ehci_hcd 0000:00:02.1: EHCI Host Controller
ehci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:02.1: debug port 1
PCI: cache line size of 64 is not supported by device 0000:00:02.1
ehci_hcd 0000:00:02.1: irq 225, io mem 0xfeafac00
ehci_hcd 0000:00:02.1: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 10 ports detected
ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.56.
ACPI: PCI Interrupt Link [LUB0] enabled at IRQ 22
GSI 17 sharing vector 0xE9 and IRQ 17
ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [LUB0] -> GSI 22 (level,
low) -> IRQ 233
PCI: Setting latency timer of device 0000:00:02.0 to 64
ohci_hcd 0000:00:02.0: OHCI Host Controller
ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:02.0: irq 233, io mem 0xfeafb000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 10 ports detected
ACPI: PCI Interrupt Link [LMAC] enabled at IRQ 21
GSI 18 sharing vector 0x32 and IRQ 18
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LMAC] -> GSI 21 (level,
low) -> IRQ 50
PCI: Setting latency timer of device 0000:00:08.0 to 64
forcedeth: using HIGHDMA
SCSI subsystem initialized
libata version 2.00 loaded.
eth0: forcedeth.c: subsystem: 010de:cb84 bound to 0000:00:08.0
ACPI: PCI Interrupt Link [LMAD] enabled at IRQ 20
GSI 19 sharing vector 0x3A and IRQ 19
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LMAD] -> GSI 20 (level,
low) -> IRQ 58
PCI: Setting latency timer of device 0000:00:09.0 to 64
forcedeth: using HIGHDMA
eth1: forcedeth.c: subsystem: 010de:cb84 bound to 0000:00:09.0
sata_nv 0000:00:05.0: version 2.0
ACPI: PCI Interrupt Link [LSA0] enabled at IRQ 23
ACPI: PCI Interrupt 0000:00:05.0[A] -> Link [LSA0] -> GSI 23 (level,
low) -> IRQ 225
PCI: Setting latency timer of device 0000:00:05.0 to 64
ata1: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xCC00 irq 225
ata2: SATA max UDMA/133 cmd 0xD080 ctl 0xD002 bmdma 0xCC08 irq 225
scsi0 : sata_nv
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA/133, 490234752 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : sata_nv
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-7, max UDMA/133, 490234752 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
Vendor: ATA Model: WDC WD2500YS-01S Rev: 20.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD2500YS-01S Rev: 20.0
Type: Direct-Access ANSI SCSI revision: 05
ACPI: PCI Interrupt Link [LSA1] enabled at IRQ 22
ACPI: PCI Interrupt 0000:00:05.1[B] -> Link [LSA1] -> GSI 22 (level,
low) -> IRQ 233
PCI: Setting latency timer of device 0000:00:05.1 to 64
ata3: SATA max UDMA/133 cmd 0xC880 ctl 0xC802 bmdma 0xC080 irq 233
ata4: SATA max UDMA/133 cmd 0xC480 ctl 0xC402 bmdma 0xC088 irq 233
scsi2 : sata_nv
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-7, max UDMA/133, 490232639 sectors: LBA48 NCQ (depth 0/32)
ata3.00: ata3: dev 0 multi count 16
ata3.00: configured for UDMA/133
scsi3 : sata_nv
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: ATA-7, max UDMA/133, 490232639 sectors: LBA48 NCQ (depth 0/32)
ata4.00: ata4: dev 0 multi count 16
ata4.00: configured for UDMA/133
Vendor: ATA Model: WDC WD2500YS-01S Rev: 20.0
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: WDC WD2500YS-01S Rev: 20.0
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4 < sda5 >
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 490234752 512-byte hdwr sectors (251000 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 490234752 512-byte hdwr sectors (251000 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 >
sd 1:0:0:0: Attached scsi disk sdb
SCSI device sdc: 490232639 512-byte hdwr sectors (250999 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
SCSI device sdc: 490232639 512-byte hdwr sectors (250999 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
sdc: sdc1 sdc2 sdc3 < sdc5 >
sd 2:0:0:0: Attached scsi disk sdc
SCSI device sdd: 490232639 512-byte hdwr sectors (250999 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: drive cache: write back
SCSI device sdd: 490232639 512-byte hdwr sectors (250999 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: drive cache: write back
sdd: sdd1 sdd2 sdd3 < sdd5 >
sd 3:0:0:0: Attached scsi disk sdd
Probing IDE interface ide0...
Probing IDE interface ide1...
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
md: raid1 personality registered for level 1
raid5: automatically using best checksumming function: generic_sse
generic_sse: 7703.000 MB/sec
raid5: using function: generic_sse (7703.000 MB/sec)
raid6: int64x1 2228 MB/s
raid6: int64x2 2955 MB/s
raid6: int64x4 2866 MB/s
raid6: int64x8 1982 MB/s
raid6: sse2x1 3350 MB/s
raid6: sse2x2 4554 MB/s
raid6: sse2x4 4675 MB/s
raid6: using algorithm sse2x4 (4675 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: md0 stopped.
md: bind<sdb1>
md: bind<sda1>
raid1: raid set md0 active with 2 out of 2 mirrors
md: md1 stopped.
md: bind<sdb3>
md: bind<sda3>
raid1: raid set md1 active with 2 out of 2 mirrors
md: md2 stopped.
md: bind<sdd2>
md: bind<sdc2>
raid1: raid set md2 active with 2 out of 2 mirrors
md: md3 stopped.
md: bind<sdb5>
md: bind<sdd5>
md: bind<sdc5>
md: bind<sda5>
md: kicking non-fresh sdc5 from array!
md: unbind<sdc5>
md: export_rdev(sdc5)
raid5: device sda5 operational as raid disk 0
raid5: device sdd5 operational as raid disk 3
raid5: device sdb5 operational as raid disk 1
raid5: allocated 4262kB for md3
raid5: raid level 5 set md3 active with 3 out of 4 devices, algorithm 2
RAID5 conf printout:
--- rd:4 wd:3 fd:1
disk 0, o:1, dev:sda5
disk 1, o:1, dev:sdb5
disk 3, o:1, dev:sdd5
device-mapper: ioctl: 4.7.0-ioctl (2006-06-24) initialised: dm-d...@redhat.com
Attempting manual resume
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
input: PC Speaker as /class/input/input0
i2c_adapter i2c-0: nForce2 SMBus adapter at 0x2d00
i2c_adapter i2c-1: nForce2 SMBus adapter at 0x2e00
Adding 498004k swap on /dev/sda2. Priority:-1 extents:1 across:498004k
Adding 498004k swap on /dev/sdb2. Priority:-2 extents:1 across:498004k
Adding 602396k swap on /dev/sdc1. Priority:-3 extents:1 across:602396k
Adding 602396k swap on /dev/sdd1. Priority:-4 extents:1 across:602396k
EXT3 FS on md1, internal journal
loop: loaded (max 8 devices)
kjournald starting. Commit interval 5 seconds
EXT3 FS on md0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on md2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-3, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
Installing knfsd (copyright (C) 1996 ok...@monad.swb.de).
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
eth0: no IPv6 routers present
eth1: no IPv6 routers present
ip_tables: (C) 2000-2006 Netfilter Core Team
jupiter:~/m#


--
Best regards,
Zhijun(Jam), GUO
jam...@gmail.com


--
To UNSUBSCRIBE, email to debian-us...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org

Thierry Chatelet

Jul 6, 2007, 3:20:14 AM
On Friday 06 July 2007 08:32, GUO Zhijun wrote:
> Hi all,
>
> It's a Tyan S2925 with single AMDx2 4800+, 4xWD2500YS, 4x1G DDR2 box.
> Running etch 2.6.18-4-amd64, soft raid5 and soft raid 1. It's serving
> web request for static files and nfs export its storage to other
> boxes.
>

> Best regards,
> Zhijun(Jam), GUO
> jam...@gmail.com

I got the same problem with an Asus M2N-E motherboard. Usually a hard reset of
the board resolves the problem.
--
Linux is like a tipee: no Windows, no Gate and an Apache inside

Douglas Allan Tutty

Jul 6, 2007, 1:40:10 PM
On Fri, Jul 06, 2007 at 02:32:12PM +0800, GUO Zhijun wrote:
>
> It's a Tyan S2925 with single AMDx2 4800+, 4xWD2500YS, 4x1G DDR2 box.
> Running etch 2.6.18-4-amd64, soft raid5 and soft raid 1. It's serving
> web request for static files and nfs export its storage to other
> boxes.
>
> One day its performance dropped down suddenly and loadavg was going up
> to 200~400. I used hddtemp to check the temperature. sda sdb is
> normal and hddtemp returned immediately but when hddtemp was checking
> it stalled for 3-5 seconds and reported they don't seem to have a
> sensor. -___-b
>
> I also found the following log
>
> Jul 3 15:00:22 jupiter kernel: ata3: port is slow to respond, please be
> patient
> Jul 3 15:00:22 jupiter kernel: ata3: soft resetting port

> Jul 3 15:00:23 jupiter kernel: ata4: soft resetting port

> jupiter:~# hdparm -t /dev/sd[a-d]
> /dev/sda:
> Timing buffered disk reads: 16 MB in 3.17 seconds = 5.05 MB/sec
> /dev/sdb:
> Timing buffered disk reads: 56 MB in 3.07 seconds = 18.24 MB/sec
> /dev/sdc:
> Timing buffered disk reads: 34 MB in 3.19 seconds = 10.65 MB/sec
> /dev/sdd:
> Timing buffered disk reads: 54 MB in 3.21 seconds = 16.83 MB/sec
>
> Crying.... could any one help? any hints?
>
> md: md3 stopped.
> md: bind<sdb5>
> md: bind<sdd5>
> md: bind<sdc5>
> md: bind<sda5>
> md: kicking non-fresh sdc5 from array!
> md: unbind<sdc5>
> md: export_rdev(sdc5)
> raid5: device sda5 operational as raid disk 0
> raid5: device sdd5 operational as raid disk 3
> raid5: device sdb5 operational as raid disk 1
> raid5: allocated 4262kB for md3
> raid5: raid level 5 set md3 active with 3 out of 4 devices, algorithm 2

It looks like something somewhere is messing up with sdc5. Without
sdc5, your CPU will be busy computing the missing data from the parity
info on the other raid5 disks. This will seriously slow down the
system.
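
To illustrate what "computing the missing data from the parity info" means, here is a toy sketch in Python of XOR-based reconstruction for a single stripe. It is not the md driver's code, and the block contents and the helper name `xor_blocks` are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-disk RAID 5: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] has dropped out of the array.
surviving = [data[0], data[2], parity]

# XORing everything that is left gives back the missing block.
rebuilt = xor_blocks(surviving)
assert rebuilt == data[1]
```

In a degraded array, a read that lands on the missing member needs this kind of calculation plus reads from all the surviving disks, which is the extra work described above.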

The question is, where is the problem? Partition, partition table,
drive, cable, controller, MB?

Since sdc5 is out of the array, what happens if you take it all the way
out, and treat it as a scratch partition, then add it back into the
array as a new partition? Follow syslog and see what errors pop up.
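
A rough sketch of that sequence, written as a Python dry run so nothing is executed by accident. It assumes the array is /dev/md3 and the kicked member is /dev/sdc5, as in the dmesg output. The helper names are made up for illustration. Note that `mdadm --zero-superblock` wipes the md metadata on that partition, so only run it for real on the member you mean to rebuild.

```python
import subprocess


def readd_commands(array, member):
    """Build the commands to wipe a kicked member and add it back as new."""
    return [
        ["mdadm", "--zero-superblock", member],  # forget the old array metadata
        ["mdadm", array, "--add", member],       # add it back; md starts a rebuild
        ["cat", "/proc/mdstat"],                 # check rebuild progress
    ]


def run(commands, dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run only: prints the commands without touching any device.
    run(readd_commands("/dev/md3", "/dev/sdc5"))
```

Keep an eye on syslog while the rebuild runs. If the resets on that port come back, that points away from the partition and towards the drive, cable, or controller.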

You could, with it out of the array, put a filesystem on it, having it
check badblocks while it does so. See if having each block read and
written causes the drive firmware to fix things.

Doug.

Bob

Jul 19, 2007, 1:20:07 AM
GUO Zhijun wrote:
> Hi all,
>
> It's a Tyan S2925 with single AMDx2 4800+, 4xWD2500YS, 4x1G DDR2 box.
> Running etch 2.6.18-4-amd64, soft raid5 and soft raid 1. It's serving
> web request for static files and nfs export its storage to other
> boxes.
<big snip> go here for thread
http://groups.google.com/group/linux.debian.user/browse_frm/thread/99dffc93e01af234/9ca08c7abe7c1949?lnk=gst&rnum=1#9ca08c7abe7c1949

Just spotted this while searching for something else, so I've CC'd the OP;
sorry if you get it twice.

I had very similar problems with one of the 3 Seagate 500GB drives in my
RAID5 array. It would happen every few weeks and I couldn't find the
cause. The problem was power: the drive was the last of 4 hung off the
same power cable. I moved 2 of the drives off that cable and onto another
and bingo, problem solved.

--
Garrr, do your bit for global warming, become a pirate, you can "borrow" my copy of Windows 95 if you want.
