03:04.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
03:06.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
We can trigger the problem within a few seconds by starting a
reconstruction on a drive hooked to port 4 (counting from 0) of the
second controller. Oddly, every other drive works reliably and the
faulty drive works if we connect it to, for example, port 4 of the first
controller.
I'd like to stress that the problem occurs systematically, on two
completely distinct machines. We swapped drives, cables and controllers
to exclude other possibilities.
Tested with Debian kernels 2.6.26-19 and 2.6.30-8. Let me know if
further details are needed.
--
// Bernie Innocenti - http://codewiz.org/
\X/ Sugar Labs - http://sugarlabs.org/
--
0x30000040 here means "MRdPerr":
"bad data parity detected during PCI master read".
Which means that a data parity error occurred
during an outgoing data transfer on the PCI-X bus.
This could happen due to noise on the bus,
dying capacitors, or (?) bad RAM (not sure about the last one).
The expected behaviour here is for sata_mv to then perform
a full SATA reset, after which the I/O will be reattempted.
But it appears to lock up before that happens.
The code does try and clear the PCI error interrupt,
but perhaps it needs clearing in more than the one register
where it currently does so.
Looking over the code and the documentation I have (NDA),
nothing obvious springs to view. There are some extra registers
we could be dumping out, to show exactly what PCI phase and address
caused the error, but reading those won't cause or prevent a lockup.
Best bet would be to try replacing the RAM in that box,
and see if the problem goes away.
Cheers
Oddly, we see this on two different machines. And only on specific
ports of the second controller card.
On one of these machines, we've also found a bunch of MCEs related to
ECC errors, but we were unable to reproduce them by exercising the CPU
and the bus with tools like cpuburn or md5sum of entire drives.
The other one has been running for 2 days with no errors whatsoever.
Both have successfully completed a 24h cycle of memtest86+.
> The expected behaviour here is for sata_mv to then perform
> a full SATA reset, after which the I/O will be reattempted.
>
> But it appears to lock up before that happens.
> The code does try and clear the PCI error interrupt,
> but perhaps it needs clearing in more than the one register
> where it currently does so.
I've got a few of these recoverable errors overnight (perhaps along with
the MCE errors I described above). The bus was reset as you describe.
The PCI errors seem to cause a system freeze only during RAID
reconstruction. Perhaps the bus reset logic is not sufficiently locked
against re-entrancy?
> Looking over the code and the documentation I have (NDA),
> nothing obvious springs to view. There are some extra registers
> we could be dumping out, to show exactly what PCI phase and address
> caused the error, but reading those won't cause or prevent a lockup.
>
> Best bet would be to try replacing the RAM in that box,
> and see if the problem goes away.
We'll try this tomorrow, thank you very much for providing these clues.
--
// Bernie Innocenti - http://codewiz.org/
\X/ Sugar Labs - http://sugarlabs.org/
--
Even the controllers were in the same slots.
My initial suspicion was that the motherboard does not drop the PCI-X
bus frequency to 100MHz and instead drives the bus at 133MHz even though
two controllers are connected. The proposed fix was to move the second
controller to the other bus, as the H8DME-2 has four PCI-X slots
(2x100MHz and 2x133MHz), but I haven't yet heard back whether it helped.
Even the kernel was the same: the latest Debian distribution kernel. It
might be worthwhile to try a vanilla kernel.org kernel if possible.
I have two 6081 controllers at home on the same bus, but at 100MHz, and
no problems yet.
--
Harri.
Close. Mine is a Supermicro H8DM8-2 with 2x Opteron 2374 HE CPU.
> My initial suspicion was that the motherboard does not drop the PCI-X
> bus frequency to 100MHz and instead drives the bus at 133MHz even though
> two controllers are connected. The proposed fix was to move the second
> controller to the other bus, as the H8DME-2 has four PCI-X slots
> (2x100MHz and 2x133MHz), but I haven't yet heard back whether it helped.
Thanks for this hint, I'll try this tomorrow,
> Even the kernel was the same: the latest Debian distribution kernel. It
> might be worthwhile to try a vanilla kernel.org kernel if possible.
As a matter of fact, yesterday I tried booting off an OpenSolaris
Nexenta CD and I couldn't reproduce the issue, although I couldn't
reproduce the exact same conditions that trigger the bug systematically
on Linux.
> I have two 6081 controllers at home on the same bus, but at 100MHz, and
> no problems yet.
Is there a way to find out what the current PCI-X bus frequency is from
Linux? And from the BIOS?
--
// Bernie Innocenti - http://codewiz.org/
\X/ Sugar Labs - http://sugarlabs.org/
--
The early revs of these chips did have a number of errata specific to PCI-X.
Cheers
See below.
Looking at the Status field, is it correct to say that the cards are
definitely running at 133MHz? Is there a way to force them to a
different speed from Linux or from the BIOS?
> The early revs of these chips did have a number of errata specific to PCI-X.
I checked the revision (09) against the sata_mv source and I couldn't
spot anything relevant to us.
03:04.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
Subsystem: Marvell Technology Group Ltd. Device 11ab
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 19
Region 0: Memory at feb00000 (64-bit, non-prefetchable) [size=1M]
Region 2: I/O ports at e800 [size=256]
Region 3: [virtual] Memory at fdc00000 (32-bit, non-prefetchable) [size=4M]
[virtual] Expansion ROM at fd800000 [disabled] [size=4M]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [60] PCI-X non-bridge device
Command: DPERE- ERO- RBC=512 OST=4
Status: Dev=03:04.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=4 DMCRS=8 RSCEM- 266MHz- 533MHz-
Kernel driver in use: sata_mv
Kernel modules: sata_mv
03:06.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
Subsystem: Marvell Technology Group Ltd. Device 11ab
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 18
Region 0: Memory at fea00000 (64-bit, non-prefetchable) [size=1M]
Region 2: I/O ports at e400 [size=256]
Region 3: [virtual] Memory at fd400000 (32-bit, non-prefetchable) [size=4M]
[virtual] Expansion ROM at fd000000 [disabled] [size=4M]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [60] PCI-X non-bridge device
Command: DPERE- ERO- RBC=512 OST=4
Status: Dev=03:06.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=4 DMCRS=8 RSCEM- 266MHz- 533MHz-
Kernel driver in use: sata_mv
Kernel modules: sata_mv
--
// Bernie Innocenti - http://codewiz.org/
\X/ Sugar Labs - http://sugarlabs.org/
--
NEWSFLASH: today we replaced the 4x500GB Seagate drives with 4x1.5TB
drives and reconstruction of the array has been running for 2h without a
glitch.
One interesting difference is that the 500GB drives were being
configured in 1.5Gbps SATA mode. Another notable difference is the
sequential read speed: ~70MB/s vs ~130MB/s with the 1.5TB model.
Could the PCI bus errors be a red herring?
Dunno. Rev.9 == "C0" in Marvell terminology,
and that's the latest/final rev for the 6081 chip,
with most of the PCI-X bugs fixed or worked around.
So not much to go on there.
The Bus error report was real, though.
But with 3.0Gb/s SATA connections, the chip will be
using some different internal clocks and timings,
which could be enough to avoid triggering the PCI errors.
I guess. Let's hope so, anyway.
Cheers
I was wrong (the BIOS DMI block is wrong). The motherboard is labeled
as H8DME-2.
Our prayers have not been answered :-(
I tried several things:
- Forcing all the 500GB Seagate drives to 3.0Gbps does not help
- Replacing the 500GB drives with 1.5TB drives seems to make
the PCI errors much less frequent
- Moving the controllers to different slots (on different busses)
does not help
- Happens with both 2.6.26 (from lenny) and 2.6.30 (from sid)
- Unplugging one of the controllers appeared to lead to a stable
configuration, but yesterday I left the machines reconstructing
the arrays and this morning one of them is not answering
pings ;-(
I want to try reducing the frequency of the PCI-X bus, but the BIOS does
not seem to provide a setting for it. Is there another way?
--
// Bernie Innocenti - http://codewiz.org/
\X/ Sugar Labs - http://sugarlabs.org/
--
Generally this is done with a physical jumper on the board instead.
You'll find it near the bridge chip, which is almost always made by NEC.
Another technique to slow the bridge down is to insert a regular PCI
card in the other slot (these bridges tend to offer 2 or 3 slots). As
the weakest link, it'll drag everything down to 33MHz.
An old PCI-X 66MHz-only card may prove helpful here as well. You don't
have to drive it in any way; getting power to it is sufficient.
Regards,
Tony V.
H8DME-2 is the same board as the H8DM8-2, just without the SCSI controller.
There are two 3-pin jumpers somewhere between the PCI-X slots, one for
each bus. With these you can force the bus to 66MHz PCI or 66MHz PCI-X.
No jumper means autodetect. Note that this information comes only from
the manual; I haven't been able to confirm what it really does :)
Oh, and in the other identical case, I heard that moving one controller
to a different bus (1st controller in the top slot and 2nd controller in
the 2nd slot from the bottom) resolved the issue, or at least it hasn't
errored yet.
--
Harri.
Nothing that's easy.
Here.. apply this patch, and post the output after you reboot with it.
--- 2.6.31/drivers/ata/sata_mv.c.orig 2009-08-21 22:16:05.000000000 -0400
+++ linux/drivers/ata/sata_mv.c 2009-10-08 23:05:37.392203506 -0400
@@ -3738,6 +3738,12 @@
hp_flags |= MV_HP_ERRATA_60X1B2;
break;
case 0x9:
+ {
+ struct mv_host_priv *hpriv = host->private_data;
+ void __iomem *mmio = hpriv->base;
+ printk(KERN_INFO "sata_mv: pcix_mode=%d\n", mv_in_pcix_mode(host));
+ printk(KERN_INFO "sata_mv: MV_PCI_COMMAND=%08x\n", readl(mmio + MV_PCI_COMMAND));
+ }
hp_flags |= MV_HP_ERRATA_60X1C0;
break;
default:
Adding to that: there is a register on the chip,
which software could use to override the normal auto-detected
PCI mode (bus speed) for the chip. This could be used to,
say, select 100MHz or 66MHz, or even 33MHz operation.
BUT.. the register is autodetected from the bus at power-on,
and so if software wants to override that (by rewriting the reg),
it will also need to reset the PCI bus afterward.
Which requires knowing how to reset a PCI bridge,
something I don't know about.
Cheers