Intel 82559 NIC corrupted EEPROM

John

Nov 3, 2006, 12:40:06 PM
Hello,

I have an EBC-2000T motherboard with 3 on-board Intel 82559 NICs.

http://www.intel.com/design/network/products/lan/controllers/82559.htm
http://www.adlinktech.com/PD/web/PD_detail.php?pid=213
http://www.intel.com/support/network/adapter/1000/linux/e100.htm

Running a 2.6.14 kernel, the e100 driver refuses to load because
it detects a corrupted EEPROM.

cf. e100_eeprom_load()

/* The checksum, stored in the last word, is calculated such that
 * the sum of words should be 0xBABA */
checksum = le16_to_cpu(0xBABA - checksum);
if(checksum != nic->eeprom[nic->eeprom_wc - 1]) {
	DPRINTK(PROBE, ERR, "EEPROM corrupted\n");
	if (!eeprom_bad_csum_allow)
		return -EAGAIN;
}
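
For anyone who wants to try the rule from that comment outside the kernel, here is a minimal sketch in Python rather than kernel C. It assumes the EEPROM contents are already available as a list of 16-bit words in host byte order with the checksum in the last word; the name `eeprom_checksum_ok` is made up for illustration and is not part of the driver.

```python
CHECKSUM_TARGET = 0xBABA  # every word, checksum word included, must add up to this


def eeprom_checksum_ok(words):
    """Return True if the 16-bit words sum to 0xBABA modulo 2**16."""
    return sum(words) & 0xFFFF == CHECKSUM_TARGET


# An erased (all-ones) 64-word image cannot satisfy the rule.
blank_image = [0xFFFF] * 64
print(eeprom_checksum_ok(blank_image))
```

Comparing the last word against `0xBABA - sum(other words)`, as the driver does, is the same test as checking that all words together sum to 0xBABA in 16-bit arithmetic.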

Several people have reported the same error. Intel's Auke Kok has
stated that ignoring the error is a BAD idea.

http://lkml.org/lkml/2006/7/10/215

What tool is used to reprogram the EEPROM? ethtool?
I suppose I'll have to ask the manufacturer for an updated EEPROM?


# ethtool -e eth0
Cannot get EEPROM data: Operation not supported

I'm not sure why I can't dump the contents of the EEPROM.
Does the driver need to be loaded?

On a totally unrelated note, does the 82559 support VLAN tagging?
(I believe the driver supports it.)

Thanks for reading this far.

Please note, email address is a bit-bucket.
I do monitor the mailing list.

Regards.

John

# lspci -vv
[...]
00:08.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
100] (rev 08)
Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 11
Region 0: Memory at e5402000 (32-bit, non-prefetchable) [size=4K]
Region 1: I/O ports at d800 [size=64]
Region 2: Memory at e5000000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 20000000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:09.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
100] (rev 08)
Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 12
Region 0: Memory at e5401000 (32-bit, non-prefetchable) [size=4K]
Region 1: I/O ports at dc00 [size=64]
Region 2: Memory at e5100000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 20100000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:0a.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
100] (rev 08)
Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 10
Region 0: Memory at e5400000 (32-bit, non-prefetchable) [size=4K]
Region 1: I/O ports at e000 [size=64]
Region 2: Memory at e5200000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 20200000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

H. Peter Anvin

Nov 3, 2006, 8:50:05 PM
John wrote:
>
> Several people have reported the same error. Intel's Auke Kok has
> stated that ignoring the error is a BAD idea.
>
> http://lkml.org/lkml/2006/7/10/215
>
> What tool is used to reprogram the EEPROM? ethtool?
> I suppose I'll have to ask the manufacturer for an updated EEPROM?
>
> # ethtool -e eth0
> Cannot get EEPROM data: Operation not supported
>
> I'm not sure why I can't dump the contents of the EEPROM.
> Does the driver need to be loaded?
>

Yes, the driver needs to be loaded.

Basically, Auke wants you to throw away your NIC and/or motherboard.
Since you're effectively dead, the only damage you can do by disabling
the check has already been done. This unfortunately seems to be fairly
common with e100, especially for the on-motherboard version, and you
basically have two options: either disable the check or write an offline
tool to reprogram the EEPROM.

The latest netdev tree (if it's not in Linus' tree already, which it
might be) does add back the option to ignore the check so you can update
the EEPROM, which will automatically fix the checksum.
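
The reason a rewrite repairs the checksum is that the last word is simply recomputed so that all words again sum to 0xBABA. A minimal Python sketch of that recomputation follows; it assumes 16-bit words in host order, and both the function name and the four-word image are hypothetical (a real image is much longer).

```python
def with_fixed_checksum(words):
    """Return a copy of `words` whose last word makes the total sum 0xBABA."""
    body = list(words[:-1])
    checksum = (0xBABA - sum(body)) & 0xFFFF
    return body + [checksum]


# Hypothetical short image; the stale last word (0x0000) gets replaced.
image = [0x3000, 0x0464, 0xE4E6, 0x0000]
fixed = with_fixed_checksum(image)
print([hex(word) for word in fixed])
```

Note that this only makes the checksum consistent with whatever data is present; it does not tell you whether that data is sensible for the board.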

-hpa

tho...@hockin.org

Nov 4, 2006, 1:30:08 AM
On Fri, Nov 03, 2006 at 10:22:51PM -0800, tho...@hockin.org wrote:

> On Fri, Nov 03, 2006 at 05:46:25PM -0800, H. Peter Anvin wrote:
> > Basically, Auke wants you to throw away your NIC and/or motherboard.
> > Since you're effectively dead, the only damage you can do by disabling
> > the check has already been done. This unfortunately seems to be fairly
> > common with e100, especially for the on-motherboard version, and you
> > basically have two options: either disable the check or write an offline
> > tool to reprogram the EEPROM.
>
> I have a tool to write the eepro100 EEPROM. Let me see if I can find it.
> It even had all the default data coded, ready to restore a NIC to default.
>
> However - back in the eepro100.c days, it was considered a warning only if
> the EEPROM had a bad checksum. There were two "supported" formats for the
> EEPROM, one of which was just the MAC address. And it worked!

One from the vaults: http://www.hockin.org/~thockin/enet_eeprom/

It's pretty simple, but easily hacked. ifdown your interface first! :)

Tim

tho...@hockin.org

Nov 4, 2006, 1:30:08 AM
On Fri, Nov 03, 2006 at 05:46:25PM -0800, H. Peter Anvin wrote:
> Basically, Auke wants you to throw away your NIC and/or motherboard.
> Since you're effectively dead, the only damage you can do by disabling
> the check has already been done. This unfortunately seems to be fairly
> common with e100, especially for the on-motherboard version, and you
> basically have two options: either disable the check or write an offline
> tool to reprogram the EEPROM.

I have a tool to write the eepro100 EEPROM. Let me see if I can find it.
It even had all the default data coded, ready to restore a NIC to default.

However - back in the eepro100.c days, it was considered a warning only if
the EEPROM had a bad checksum. There were two "supported" formats for the
EEPROM, one of which was just the MAC address. And it worked!

Tim

John

Nov 7, 2006, 6:30:14 AM
H. Peter Anvin wrote:

> John wrote:
>
>> Several people have reported the same error. Intel's Auke Kok has
>> stated that ignoring the error is a BAD idea.
>>
>> http://lkml.org/lkml/2006/7/10/215
>>
>> What tool is used to reprogram the EEPROM? ethtool?
>> I suppose I'll have to ask the manufacturer for an updated EEPROM?
>>
>> # ethtool -e eth0
>> Cannot get EEPROM data: Operation not supported
>>
>> I'm not sure why I can't dump the contents of the EEPROM.
>> Does the driver need to be loaded?
>
> Yes, the driver needs to be loaded.
>
> Basically, Auke wants you to throw away your NIC and/or motherboard.
> Since you're effectively dead, the only damage you can do by
> disabling the check has already been done. This unfortunately seems
> to be fairly common with e100, especially for the on-motherboard
> version, and you basically have two options: either disable the check
> or write an offline tool to reprogram the EEPROM.
>
> The latest netdev tree (if it's not in Linus' tree already, which it
> might be) does add back the option to ignore the check so you can
> update the EEPROM, which will automatically fix the checksum.

I have investigated further.

I changed e100_eeprom_load() to return 0 even when the checksum fails.

Loading e100.ko reports:

e100: Intel(R) PRO/100 Network Driver, 3.4.14-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 11 (level,
low) -> IRQ 11
e100: 0000:00:08.0: e100_eeprom_load: EEPROM corrupted
e100: eth0: e100_probe: addr 0xe6302000, irq 11, MAC addr FF:FF:FF:FF:FF:FF
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 12 (level,
low) -> IRQ 12
e100: eth1: e100_probe: addr 0xe6301000, irq 12, MAC addr 00:30:64:04:E6:E5
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 10 (level,
low) -> IRQ 10
e100: eth2: e100_probe: addr 0xe6300000, irq 10, MAC addr 00:30:64:04:E6:E6

I had thought all cards would have the same problem, but only the
first NIC seems affected.

The MAC address for eth0 should be 00:30:64:04:E6:E4
(0x003064 is an ADLINK OUI.)
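
The expected address follows from the two NICs that do report one: they are consecutive within the same OUI, so the first port is presumably one below. A small Python sketch of that reasoning, with helper names invented for illustration:

```python
def mac_to_int(text):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    value = 0
    for octet in text.split(":"):
        value = (value << 8) | int(octet, 16)
    return value


def int_to_mac(value):
    """Convert a 48-bit integer back to colon-separated upper-case hex."""
    return ":".join("%02X" % ((value >> shift) & 0xFF) for shift in range(40, -8, -8))


def oui(text):
    """The top three octets identify the vendor."""
    return mac_to_int(text) >> 24


seen = ["00:30:64:04:E6:E5", "00:30:64:04:E6:E6"]
lowest = min(mac_to_int(mac) for mac in seen)
print(int_to_mac(lowest - 1), hex(oui(seen[0])))
```

This is only an inference from the numbering pattern; the address printed on the board or in the vendor's paperwork is the authoritative value.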

# ip addr
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
5: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether ff:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
6: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:30:64:04:e6:e5 brd ff:ff:ff:ff:ff:ff
7: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:30:64:04:e6:e6 brd ff:ff:ff:ff:ff:ff


I then used ethtool to dump the contents of the EEPROMs.

# ethtool -e eth0
Offset Values
------ ------
0x0000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

# ethtool -e eth1
Offset Values
------ ------
0x0000 00 30 64 04 e6 e5 03 0e 00 00 01 02 01 47 00 00
0x0010 13 72 10 83 a2 40 01 00 86 80 00 00 00 00 00 00
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0060 28 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f7 91

# ethtool -e eth2
Offset Values
------ ------
0x0000 00 30 64 04 e6 e6 03 0e 00 00 01 02 01 47 00 00
0x0010 13 72 10 83 a2 40 01 00 86 80 00 00 00 00 00 00
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0060 28 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f7 90
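
The three dumps can be compared mechanically. The sketch below, in Python, parses output laid out like the dumps above, pairs the bytes into little-endian 16-bit words, and reports whether the image is blank or whether the words sum to 0xBABA. The function names are invented for illustration, and the little-endian pairing is an assumption taken from how the MAC bytes line up in these dumps.

```python
def parse_ethtool_dump(text):
    """Collect the data bytes from `ethtool -e` style output."""
    data = bytearray()
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a hex offset such as 0x0010; skip headers and rules.
        if not fields or not fields[0].lower().startswith("0x"):
            continue
        data.extend(int(field, 16) for field in fields[1:])
    return bytes(data)


def le_words(data):
    """Pair bytes into 16-bit words, low byte first."""
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data) - 1, 2)]


def diagnose(data):
    """Classify an EEPROM image as blank, consistent, or inconsistent."""
    if data and all(byte == 0xFF for byte in data):
        return "blank: every byte reads 0xff"
    if sum(le_words(data)) & 0xFFFF == 0xBABA:
        return "checksum ok"
    return "checksum mismatch"


sample = """\
Offset Values
------ ------
0x0000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
"""
print(diagnose(parse_ethtool_dump(sample)))
```

An all-0xff result cannot distinguish an erased part from a read that never reached the part, which is why a second reader is worth trying.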


Either the EEPROM image on eth0 is corrupted, or ethtool is not
able to read the contents of the EEPROM.

So I tried the other driver, eepro100.c which, AFAIU, e100.c is
supposed to supersede.

Loading eepro100.ko reports:

eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin
<s...@saw.sw.com.sg> and others
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 11 (level,
low) -> IRQ 11
eth0: 0000:00:08.0, 00:30:64:04:E6:E4, IRQ 11.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 12 (level,
low) -> IRQ 12
eth1: 0000:00:09.0, 00:30:64:04:E6:E5, IRQ 12.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 10 (level,
low) -> IRQ 10
eth2: 0000:00:0a.0, 00:30:64:04:E6:E6, IRQ 10.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).


NOTE: eepro100.ko found the correct MAC address for eth0.

# ip addr
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:30:64:04:e6:e4 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:30:64:04:e6:e5 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:30:64:04:e6:e6 brd ff:ff:ff:ff:ff:ff


I then used Donald Becker's program to dump the contents of all
the EEPROMs. ( ftp://www.scyld.com/pub/diag/ )

# eepro100-diag -ee
eepro100-diag.c:v2.13 2/28/2005 Donald Becker (bec...@scyld.com)
http://www.scyld.com/diag/index.html

Index #1: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xd800.
EEPROM contents, size 64x16:
00: 3000 0464 e4e6 0e03 0000 0201 4701 0000 _0d__________G__
0x08: 7213 8310 40a2 0001 8086 0000 0000 0000 _r___@__________
...
0x30: 0128 0000 0000 0000 0000 0000 0000 0000 (_______________
0x38: 0000 0000 0000 0000 0000 0000 0000 92f7 ________________
The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
Station address 00:30:64:04:E6:E4.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Sleep mode is enabled. This is not recommended.
Under high load the card may not respond to
PCI requests, and thus cause a master abort.
To clear sleep mode use the '-G 0 -w -w -f' options.

Index #2: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xdc00.
EEPROM contents, size 64x16:
00: 3000 0464 e5e6 0e03 0000 0201 4701 0000 _0d__________G__
0x08: 7213 8310 40a2 0001 8086 0000 0000 0000 _r___@__________
...
0x30: 0128 0000 0000 0000 0000 0000 0000 0000 (_______________
0x38: 0000 0000 0000 0000 0000 0000 0000 91f7 ________________
The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
Station address 00:30:64:04:E6:E5.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Sleep mode is enabled. This is not recommended.
Under high load the card may not respond to
PCI requests, and thus cause a master abort.
To clear sleep mode use the '-G 0 -w -w -f' options.

Index #3: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xe000.
EEPROM contents, size 64x16:
00: 3000 0464 e6e6 0e03 0000 0201 4701 0000 _0d__________G__
0x08: 7213 8310 40a2 0001 8086 0000 0000 0000 _r___@__________
...
0x30: 0128 0000 0000 0000 0000 0000 0000 0000 (_______________
0x38: 0000 0000 0000 0000 0000 0000 0000 90f7 ________________
The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
Station address 00:30:64:04:E6:E6.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Sleep mode is enabled. This is not recommended.
Under high load the card may not respond to
PCI requests, and thus cause a master abort.
To clear sleep mode use the '-G 0 -w -w -f' options.


Apparently, eepro100.ko is able to read the contents of the EEPROM on
eth0 and it declares the checksum correct. Is it possible that there is
a bug in e100.c that makes it fail to read the EEPROM on eth0?
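
To line the two tools' output up: eepro100-diag prints 16-bit words, while ethtool prints bytes, so each word appears byte-swapped relative to the ethtool rows. A short Python sketch, with invented helper names, that recovers the station address from the first row of the diag dump:

```python
def parse_diag_row(line):
    """Return the eight 16-bit words from one eepro100-diag dump row."""
    _, _, rest = line.partition(":")
    # The trailing field is the printable-character column, so stop after 8 words.
    return [int(token, 16) for token in rest.split()[:8]]


def mac_from_words(words):
    """The first three words hold the station address, low byte first."""
    octets = []
    for word in words[:3]:
        octets.extend((word & 0xFF, word >> 8))
    return ":".join("%02X" % octet for octet in octets)


row = "00: 3000 0464 e4e6 0e03 0000 0201 4701 0000 _0d__________G__"
print(mac_from_words(parse_diag_row(row)))  # prints 00:30:64:04:E6:E4
```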

Regards,

John

H. Peter Anvin

Nov 7, 2006, 12:20:18 PM
John wrote:
>
> I then used ethtool to dump the contents of the EEPROMs.
>
> # ethtool -e eth0
> Offset Values
> ------ ------
> 0x0000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
> Either the EEPROM image on eth0 is corrupted, or ethtool is not
> able to read the contents of the EEPROM.
>

[...]

Sure as heck sounds like it.

-hpa

Auke Kok

Nov 7, 2006, 12:50:15 PM

(Please CC either me or netdev on anything about the Intel NIC drivers. Thanks. I removed
`jo...@privacy.net` since it throws a bounce, and linux...@intel.com is a support
address only; it doesn't reach us developers.)

How did you do the first `ethtool` EEPROM dump? Did you have the `e100` module loaded at
that time? Did you use the new `override` mechanism graciously donated by David M?

Cheers,

Auke

H. Peter Anvin

Nov 7, 2006, 1:30:13 PM
Auke Kok wrote:
>
> (Please CC either me or at netdev on all intel nic drivers. thanks. I
> removed `jo...@privacy.net` since it throws a bounce, and
> linux...@intel.com is a support address only, doesn't reach us
> developers)
>

I think John <m...@privacy.net> is the one who can actually answer your
questions...

-hpa

Auke Kok

Nov 7, 2006, 1:40:10 PM
H. Peter Anvin wrote:
> Auke Kok wrote:
>>
>> (Please CC either me or at netdev on all intel nic drivers. thanks. I
>> removed `jo...@privacy.net` since it throws a bounce, and
>> linux...@intel.com is a support address only, doesn't reach us
>> developers)
>>
>
> I think John <m...@privacy.net> is the one who can actually answer your
> questions...

his original mail reads:

"Please note, email address is a bit-bucket.
I do monitor the mailing list. "

H. Peter Anvin

Nov 7, 2006, 1:50:13 PM
Auke Kok wrote:
> H. Peter Anvin wrote:
>> Auke Kok wrote:
>>>
>>> (Please CC either me or at netdev on all intel nic drivers. thanks. I
>>> removed `jo...@privacy.net` since it throws a bounce, and
>>> linux...@intel.com is a support address only, doesn't reach us
>>> developers)
>>>
>>
>> I think John <m...@privacy.net> is the one who can actually answer your
>> questions...
>
> his original mail reads:
>
> "Please note, email address is a bit-bucket.
> I do monitor the mailing list. "

Ah. So it makes no difference either way. It definitely doesn't bounce.

-hpa

John

Nov 8, 2006, 6:00:22 AM
Hello all,

[ E-mail address is a bit-bucket. I *do* monitor the mailing lists. ]

I will try and summarize the problem as I understand it at this point.

I've written two messages so far:
http://groups.google.com/group/linux.kernel/msg/3a05d819c66474db
http://groups.google.com/group/linux.kernel/msg/391aebbb3dfd6039

And here is a link to the complete thread:
http://lkml.org/lkml/fancy/2006/11/3/124

I have a motherboard with three on-board 82559 NICs.

o eepro100.ko properly initializes all three NICs
o e100.ko fails to initialize one of them

NOTE: With kernel 2.6.14, e100.ko fails to initialize the NIC with MAC
address 00:30:64:04:E6:E4. With kernel 2.6.18 e100.ko fails to
initialize the NIC with MAC address 00:30:64:04:E6:E5.

The problem is not an incorrect checksum. (Donald Becker's dump utility
reports a correct checksum for all three NICs.) The problem seems to be
that e100.ko fails to read the contents of one of the EEPROMs.

Auke wrote:
> How did you do the first `ethtool` eeprom dump? did you have the
> `e100` module loaded at that time? Did you use the new `override`
> mechanism graciously donated by David M?

These tests were performed on a 2.6.14 kernel. I hacked
e100_eeprom_load() to return 0 even when the checksum
fails. Thus the driver did not refuse to load, and I was
able to use ethtool to dump the contents of the 3 EEPROMs.


Here are additional examples running a 2.6.18.1-hrt kernel.

'insmod e100.ko' reports:

e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 12 (level,
low) -> IRQ 12
e100: eth0: e100_probe: addr 0xe5300000, irq 12, MAC addr 00:30:64:04:E6:E4
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 10 (level,
low) -> IRQ 10
e100: 0000:00:09.0: e100_eeprom_load: EEPROM corrupted
ACPI: PCI interrupt for device 0000:00:09.0 disabled
e100: probe of 0000:00:09.0 failed with error -11
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 11 (level,
low) -> IRQ 11
e100: eth1: e100_probe: addr 0xe5301000, irq 11, MAC addr 00:30:64:04:E6:E6
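
A note on "failed with error -11": kernel functions return negative errno values, and on Linux errno 11 is EAGAIN, which is what e100_eeprom_load() returns when the checksum test fails. A small Python sketch for decoding such numbers; the helper name is invented, and the numbering is only meaningful when run on Linux, since other systems assign errno values differently.

```python
import errno
import os


def describe_kernel_error(code):
    """Turn a negative kernel-style return value into 'NAME (message)'."""
    number = -code
    name = errno.errorcode.get(number, "unknown errno")
    return "%s (%s)" % (name, os.strerror(number))


print(describe_kernel_error(-11))
```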


'insmod e100.ko eeprom_bad_csum_allow=1' reports:

e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 12 (level,
low) -> IRQ 12
e100: eth0: e100_probe: addr 0xe5300000, irq 12, MAC addr 00:30:64:04:E6:E4
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 10 (level,
low) -> IRQ 10
e100: 0000:00:09.0: e100_eeprom_load: EEPROM corrupted
e100: 0000:00:09.0: e100_probe: Invalid MAC address from EEPROM, aborting.
ACPI: PCI interrupt for device 0000:00:09.0 disabled
e100: probe of 0000:00:09.0 failed with error -11
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 11 (level,
low) -> IRQ 11
e100: eth1: e100_probe: addr 0xe5301000, irq 11, MAC addr 00:30:64:04:E6:E6
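
The "Invalid MAC address" abort is what one would expect if the EEPROM again reads back as all ones, since FF:FF:FF:FF:FF:FF is the broadcast address. That the 2.6.18 read returned all ones is an assumption here; only the 2.6.14 dump was shown. The Python sketch below mirrors the usual kernel notion of a usable station address (not multicast, not all zero); it is an illustration, not the driver's code.

```python
def is_valid_ether_addr(octets):
    """A usable station address is unicast and not all zero."""
    is_multicast = bool(octets[0] & 0x01)  # group bit; broadcast has it set too
    is_zero = not any(octets)
    return not is_multicast and not is_zero


for label, octets in [
    ("all ones", [0xFF] * 6),
    ("eth1", [0x00, 0x30, 0x64, 0x04, 0xE6, 0xE6]),
]:
    print(label, is_valid_ether_addr(octets))
```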


'insmod e100.ko debug=16' reports:

e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 12 (level,
low) -> IRQ 12
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:08.0: e100_phy_init: phy_addr = 1
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=0, reg=0, data_in=0x0400,
data_out=0x14000400
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=1, reg=0, data_in=0x3000,
data_out=0x14203000
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=2, reg=0, data_in=0x0400,
data_out=0x14400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=3, reg=0, data_in=0x0400,
data_out=0x14600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=4, reg=0, data_in=0x0400,
data_out=0x14800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=5, reg=0, data_in=0x0400,
data_out=0x14A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=6, reg=0, data_in=0x0400,
data_out=0x14C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=7, reg=0, data_in=0x0400,
data_out=0x14E00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=8, reg=0, data_in=0x0400,
data_out=0x15000400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=9, reg=0, data_in=0x0400,
data_out=0x15200400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=10, reg=0, data_in=0x0400,
data_out=0x15400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=11, reg=0, data_in=0x0400,
data_out=0x15600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=12, reg=0, data_in=0x0400,
data_out=0x15800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=13, reg=0, data_in=0x0400,
data_out=0x15A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=14, reg=0, data_in=0x0400,
data_out=0x15C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=15, reg=0, data_in=0x0400,
data_out=0x15E00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=16, reg=0, data_in=0x0400,
data_out=0x16000400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=17, reg=0, data_in=0x0400,
data_out=0x16200400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=18, reg=0, data_in=0x0400,
data_out=0x16400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=19, reg=0, data_in=0x0400,
data_out=0x16600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=20, reg=0, data_in=0x0400,
data_out=0x16800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=21, reg=0, data_in=0x0400,
data_out=0x16A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=22, reg=0, data_in=0x0400,
data_out=0x16C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=23, reg=0, data_in=0x0400,
data_out=0x16E00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=24, reg=0, data_in=0x0400,
data_out=0x17000400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=25, reg=0, data_in=0x0400,
data_out=0x17200400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=26, reg=0, data_in=0x0400,
data_out=0x17400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=27, reg=0, data_in=0x0400,
data_out=0x17600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=28, reg=0, data_in=0x0400,
data_out=0x17800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=29, reg=0, data_in=0x0400,
data_out=0x17A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=30, reg=0, data_in=0x0400,
data_out=0x17C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=31, reg=0, data_in=0x0400,
data_out=0x17E00400
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=2, data_in=0x0000,
data_out=0x182202A8
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=3, data_in=0x0000,
data_out=0x18230154
e100: 0000:00:08.0: e100_phy_init: phy ID = 0x015402A8
e100: eth0: e100_probe: addr 0xe5300000, irq 12, MAC addr 00:30:64:04:E6:E4
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 10 (level,
low) -> IRQ 10
e100: 0000:00:09.0: e100_eeprom_load: EEPROM corrupted
ACPI: PCI interrupt for device 0000:00:09.0 disabled
e100: probe of 0000:00:09.0 failed with error -11
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 11 (level,
low) -> IRQ 11
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:0a.0: e100_phy_init: phy_addr = 1
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=0, reg=0, data_in=0x0400,
data_out=0x14000400
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=1, reg=0, data_in=0x3000,
data_out=0x14203000
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=2, reg=0, data_in=0x0400,
data_out=0x14400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=3, reg=0, data_in=0x0400,
data_out=0x14600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=4, reg=0, data_in=0x0400,
data_out=0x14800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=5, reg=0, data_in=0x0400,
data_out=0x14A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=6, reg=0, data_in=0x0400,
data_out=0x14C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=7, reg=0, data_in=0x0400,
data_out=0x14E00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=8, reg=0, data_in=0x0400,
data_out=0x15000400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=9, reg=0, data_in=0x0400,
data_out=0x15200400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=10, reg=0, data_in=0x0400,
data_out=0x15400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=11, reg=0, data_in=0x0400,
data_out=0x15600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=12, reg=0, data_in=0x0400,
data_out=0x15800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=13, reg=0, data_in=0x0400,
data_out=0x15A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=14, reg=0, data_in=0x0400,
data_out=0x15C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=15, reg=0, data_in=0x0400,
data_out=0x15E00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=16, reg=0, data_in=0x0400,
data_out=0x16000400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=17, reg=0, data_in=0x0400,
data_out=0x16200400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=18, reg=0, data_in=0x0400,
data_out=0x16400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=19, reg=0, data_in=0x0400,
data_out=0x16600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=20, reg=0, data_in=0x0400,
data_out=0x16800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=21, reg=0, data_in=0x0400,
data_out=0x16A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=22, reg=0, data_in=0x0400,
data_out=0x16C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=23, reg=0, data_in=0x0400,
data_out=0x16E00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=24, reg=0, data_in=0x0400,
data_out=0x17000400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=25, reg=0, data_in=0x0400,
data_out=0x17200400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=26, reg=0, data_in=0x0400,
data_out=0x17400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=27, reg=0, data_in=0x0400,
data_out=0x17600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=28, reg=0, data_in=0x0400,
data_out=0x17800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=29, reg=0, data_in=0x0400,
data_out=0x17A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=30, reg=0, data_in=0x0400,
data_out=0x17C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=31, reg=0, data_in=0x0400,
data_out=0x17E00400
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=2, data_in=0x0000,
data_out=0x182202A8
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=3, data_in=0x0000,
data_out=0x18230154
e100: 0000:00:0a.0: e100_phy_init: phy ID = 0x015402A8
e100: eth1: e100_probe: addr 0xe5301000, irq 11, MAC addr 00:30:64:04:E6:E6
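
For reading the mdio_ctrl lines: each data_out value appears to pack a ready flag, a read/write opcode, the PHY address, the register number, and 16 data bits into one 32-bit word. The bit positions in the Python sketch below are an assumption inferred from the log itself (they agree with every line shown above), and the function names are invented for illustration.

```python
MDI_READY = 0x10000000  # assumed: bit 28, set when the cycle has completed
MDI_READ = 0x08000000   # assumed: opcode bit for a read
MDI_WRITE = 0x04000000  # assumed: opcode bit for a write


def decode_mdi(value):
    """Split a data_out word from the log into its apparent fields."""
    if value & MDI_READ:
        op = "READ"
    elif value & MDI_WRITE:
        op = "WRITE"
    else:
        op = "NONE"
    return {
        "ready": bool(value & MDI_READY),
        "op": op,
        "phy_addr": (value >> 21) & 0x1F,
        "reg": (value >> 16) & 0x1F,
        "data": value & 0xFFFF,
    }


def phy_id(reg2_word, reg3_word):
    """Combine PHY registers 2 and 3 the way the 'phy ID' log line shows them."""
    return (decode_mdi(reg3_word)["data"] << 16) | decode_mdi(reg2_word)["data"]


print(decode_mdi(0x18203000))
print(hex(phy_id(0x182202A8, 0x18230154)))
```

Read this way, the long run of WRITE lines is the driver writing 0x0400 to register 0 of every PHY address from 0 to 31 except address 1, the PHY it selected.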


'insmod e100.ko eeprom_bad_csum_allow=1 debug=16' reports:

e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
PCI: Enabling device 0000:00:08.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 12 (level,
low) -> IRQ 12
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:08.0: e100_phy_init: phy_addr = 1
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=0, reg=0, data_in=0x0400,
data_out=0x14000400
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=1, reg=0, data_in=0x3000,
data_out=0x14203000
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=2, reg=0, data_in=0x0400,
data_out=0x14400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=3, reg=0, data_in=0x0400,
data_out=0x14600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=4, reg=0, data_in=0x0400,
data_out=0x14800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=5, reg=0, data_in=0x0400,
data_out=0x14A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=6, reg=0, data_in=0x0400,
data_out=0x14C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=7, reg=0, data_in=0x0400,
data_out=0x14E00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=8, reg=0, data_in=0x0400,
data_out=0x15000400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=9, reg=0, data_in=0x0400,
data_out=0x15200400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=10, reg=0, data_in=0x0400,
data_out=0x15400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=11, reg=0, data_in=0x0400,
data_out=0x15600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=12, reg=0, data_in=0x0400,
data_out=0x15800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=13, reg=0, data_in=0x0400,
data_out=0x15A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=14, reg=0, data_in=0x0400,
data_out=0x15C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=15, reg=0, data_in=0x0400,
data_out=0x15E00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=16, reg=0, data_in=0x0400,
data_out=0x16000400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=17, reg=0, data_in=0x0400,
data_out=0x16200400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=18, reg=0, data_in=0x0400,
data_out=0x16400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=19, reg=0, data_in=0x0400,
data_out=0x16600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=20, reg=0, data_in=0x0400,
data_out=0x16800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=21, reg=0, data_in=0x0400,
data_out=0x16A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=22, reg=0, data_in=0x0400,
data_out=0x16C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=23, reg=0, data_in=0x0400,
data_out=0x16E00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=24, reg=0, data_in=0x0400,
data_out=0x17000400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=25, reg=0, data_in=0x0400,
data_out=0x17200400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=26, reg=0, data_in=0x0400,
data_out=0x17400400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=27, reg=0, data_in=0x0400,
data_out=0x17600400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=28, reg=0, data_in=0x0400,
data_out=0x17800400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=29, reg=0, data_in=0x0400,
data_out=0x17A00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=30, reg=0, data_in=0x0400,
data_out=0x17C00400
e100: 0000:00:08.0: mdio_ctrl: WRITE:addr=31, reg=0, data_in=0x0400,
data_out=0x17E00400
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=2, data_in=0x0000,
data_out=0x182202A8
e100: 0000:00:08.0: mdio_ctrl: READ:addr=1, reg=3, data_in=0x0000,
data_out=0x18230154
e100: 0000:00:08.0: e100_phy_init: phy ID = 0x015402A8
e100: eth0: e100_probe: addr 0xe5300000, irq 12, MAC addr 00:30:64:04:E6:E4
PCI: Enabling device 0000:00:09.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 10 (level,
low) -> IRQ 10
e100: 0000:00:09.0: e100_eeprom_load: EEPROM corrupted
e100: 0000:00:09.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:09.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217829
e100: 0000:00:09.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x1821782D
e100: 0000:00:09.0: e100_phy_init: phy_addr = 1
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=0, reg=0, data_in=0x0400,
data_out=0x14000400
e100: 0000:00:09.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=1, reg=0, data_in=0x3000,
data_out=0x14203000
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=2, reg=0, data_in=0x0400,
data_out=0x14400400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=3, reg=0, data_in=0x0400,
data_out=0x14600400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=4, reg=0, data_in=0x0400,
data_out=0x14800400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=5, reg=0, data_in=0x0400,
data_out=0x14A00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=6, reg=0, data_in=0x0400,
data_out=0x14C00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=7, reg=0, data_in=0x0400,
data_out=0x14E00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=8, reg=0, data_in=0x0400,
data_out=0x15000400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=9, reg=0, data_in=0x0400,
data_out=0x15200400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=10, reg=0, data_in=0x0400,
data_out=0x15400400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=11, reg=0, data_in=0x0400,
data_out=0x15600400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=12, reg=0, data_in=0x0400,
data_out=0x15800400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=13, reg=0, data_in=0x0400,
data_out=0x15A00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=14, reg=0, data_in=0x0400,
data_out=0x15C00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=15, reg=0, data_in=0x0400,
data_out=0x15E00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=16, reg=0, data_in=0x0400,
data_out=0x16000400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=17, reg=0, data_in=0x0400,
data_out=0x16200400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=18, reg=0, data_in=0x0400,
data_out=0x16400400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=19, reg=0, data_in=0x0400,
data_out=0x16600400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=20, reg=0, data_in=0x0400,
data_out=0x16800400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=21, reg=0, data_in=0x0400,
data_out=0x16A00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=22, reg=0, data_in=0x0400,
data_out=0x16C00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=23, reg=0, data_in=0x0400,
data_out=0x16E00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=24, reg=0, data_in=0x0400,
data_out=0x17000400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=25, reg=0, data_in=0x0400,
data_out=0x17200400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=26, reg=0, data_in=0x0400,
data_out=0x17400400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=27, reg=0, data_in=0x0400,
data_out=0x17600400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=28, reg=0, data_in=0x0400,
data_out=0x17800400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=29, reg=0, data_in=0x0400,
data_out=0x17A00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=30, reg=0, data_in=0x0400,
data_out=0x17C00400
e100: 0000:00:09.0: mdio_ctrl: WRITE:addr=31, reg=0, data_in=0x0400,
data_out=0x17E00400
e100: 0000:00:09.0: mdio_ctrl: READ:addr=1, reg=2, data_in=0x0000,
data_out=0x182202A8
e100: 0000:00:09.0: mdio_ctrl: READ:addr=1, reg=3, data_in=0x0000,
data_out=0x18230154
e100: 0000:00:09.0: e100_phy_init: phy ID = 0x015402A8
e100: 0000:00:09.0: e100_probe: Invalid MAC address from EEPROM, aborting.
ACPI: PCI interrupt for device 0000:00:09.0 disabled
e100: probe of 0000:00:09.0 failed with error -11
PCI: Enabling device 0000:00:0a.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 11 (level,
low) -> IRQ 11
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=1, data_in=0x0000,
data_out=0x18217809
e100: 0000:00:0a.0: e100_phy_init: phy_addr = 1
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=0, reg=0, data_in=0x0400,
data_out=0x14000400
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=0, data_in=0x0000,
data_out=0x18203000
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=1, reg=0, data_in=0x3000,
data_out=0x14203000
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=2, reg=0, data_in=0x0400,
data_out=0x14400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=3, reg=0, data_in=0x0400,
data_out=0x14600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=4, reg=0, data_in=0x0400,
data_out=0x14800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=5, reg=0, data_in=0x0400,
data_out=0x14A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=6, reg=0, data_in=0x0400,
data_out=0x14C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=7, reg=0, data_in=0x0400,
data_out=0x14E00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=8, reg=0, data_in=0x0400,
data_out=0x15000400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=9, reg=0, data_in=0x0400,
data_out=0x15200400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=10, reg=0, data_in=0x0400,
data_out=0x15400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=11, reg=0, data_in=0x0400,
data_out=0x15600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=12, reg=0, data_in=0x0400,
data_out=0x15800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=13, reg=0, data_in=0x0400,
data_out=0x15A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=14, reg=0, data_in=0x0400,
data_out=0x15C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=15, reg=0, data_in=0x0400,
data_out=0x15E00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=16, reg=0, data_in=0x0400,
data_out=0x16000400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=17, reg=0, data_in=0x0400,
data_out=0x16200400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=18, reg=0, data_in=0x0400,
data_out=0x16400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=19, reg=0, data_in=0x0400,
data_out=0x16600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=20, reg=0, data_in=0x0400,
data_out=0x16800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=21, reg=0, data_in=0x0400,
data_out=0x16A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=22, reg=0, data_in=0x0400,
data_out=0x16C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=23, reg=0, data_in=0x0400,
data_out=0x16E00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=24, reg=0, data_in=0x0400,
data_out=0x17000400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=25, reg=0, data_in=0x0400,
data_out=0x17200400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=26, reg=0, data_in=0x0400,
data_out=0x17400400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=27, reg=0, data_in=0x0400,
data_out=0x17600400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=28, reg=0, data_in=0x0400,
data_out=0x17800400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=29, reg=0, data_in=0x0400,
data_out=0x17A00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=30, reg=0, data_in=0x0400,
data_out=0x17C00400
e100: 0000:00:0a.0: mdio_ctrl: WRITE:addr=31, reg=0, data_in=0x0400,
data_out=0x17E00400
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=2, data_in=0x0000,
data_out=0x182202A8
e100: 0000:00:0a.0: mdio_ctrl: READ:addr=1, reg=3, data_in=0x0000,
data_out=0x18230154
e100: 0000:00:0a.0: e100_phy_init: phy ID = 0x015402A8
e100: eth1: e100_probe: addr 0xe5301000, irq 11, MAC addr 00:30:64:04:E6:E6


i.e. e100.ko initializes only two NICs:

# ip addr
1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
    link/ether e2:18:f7:f8:88:4e brd ff:ff:ff:ff:ff:ff
6: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:30:64:04:e6:e4 brd ff:ff:ff:ff:ff:ff
7: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:30:64:04:e6:e6 brd ff:ff:ff:ff:ff:ff


Contrast this with eepro100.ko...

'insmod eepro100.ko debug=6' reports:

eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin
<s...@saw.sw.com.sg> and others
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 12 (level,
low) -> IRQ 12
Found Intel i82557 PCI Speedo at 0xe5300000, IRQ 12.
eth0: 0000:00:08.0, 00:30:64:04:E6:E4, IRQ 12.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 10 (level,
low) -> IRQ 10
Found Intel i82557 PCI Speedo at 0xe5302000, IRQ 10.
eth1: 0000:00:09.0, 00:30:64:04:E6:E5, IRQ 10.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC] -> GSI 11 (level,
low) -> IRQ 11
Found Intel i82557 PCI Speedo at 0xe5301000, IRQ 11.
eth2: 0000:00:0a.0, 00:30:64:04:E6:E6, IRQ 11.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).

# ip addr
1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
    link/ether e2:18:f7:f8:88:4e brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:30:64:04:e6:e4 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:30:64:04:e6:e5 brd ff:ff:ff:ff:ff:ff
5: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:30:64:04:e6:e6 brd ff:ff:ff:ff:ff:ff

# eepro100-diag -aa -ee
eepro100-diag.c:v2.13 2/28/2005 Donald Becker (bec...@scyld.com)
http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xd800.

i82557 chip registers at 0xd800:
00000000 00000000 00000000 00080002 10000000 00000000
No interrupt sources are pending.
The transmit unit state is 'Idle'.
The receive unit state is 'Idle'.
This status is unusual for an activated interface.


EEPROM contents, size 64x16:
00: 3000 0464 e4e6 0e03 0000 0201 4701 0000 _0d__________G__
0x08: 7213 8310 40a2 0001 8086 0000 0000 0000 _r___@__________
...
0x30: 0128 0000 0000 0000 0000 0000 0000 0000 (_______________
0x38: 0000 0000 0000 0000 0000 0000 0000 92f7 ________________
The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
Station address 00:30:64:04:E6:E4.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Sleep mode is enabled. This is not recommended.
Under high load the card may not respond to
PCI requests, and thus cause a master abort.
To clear sleep mode use the '-G 0 -w -w -f' options.
Index #2: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xdc00.

i82557 chip registers at 0xdc00:
00000000 00000000 00000000 00080002 10000000 00000000
No interrupt sources are pending.
The transmit unit state is 'Idle'.
The receive unit state is 'Idle'.
This status is unusual for an activated interface.


EEPROM contents, size 64x16:
00: 3000 0464 e5e6 0e03 0000 0201 4701 0000 _0d__________G__
0x08: 7213 8310 40a2 0001 8086 0000 0000 0000 _r___@__________
...
0x30: 0128 0000 0000 0000 0000 0000 0000 0000 (_______________
0x38: 0000 0000 0000 0000 0000 0000 0000 91f7 ________________
The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
Station address 00:30:64:04:E6:E5.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Sleep mode is enabled. This is not recommended.
Under high load the card may not respond to
PCI requests, and thus cause a master abort.
To clear sleep mode use the '-G 0 -w -w -f' options.
Index #3: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xe000.

i82557 chip registers at 0xe000:
00000000 00000000 00000000 00080002 10000000 00000000
No interrupt sources are pending.
The transmit unit state is 'Idle'.
The receive unit state is 'Idle'.
This status is unusual for an activated interface.


EEPROM contents, size 64x16:
00: 3000 0464 e6e6 0e03 0000 0201 4701 0000 _0d__________G__
0x08: 7213 8310 40a2 0001 8086 0000 0000 0000 _r___@__________
...
0x30: 0128 0000 0000 0000 0000 0000 0000 0000 (_______________
0x38: 0000 0000 0000 0000 0000 0000 0000 90f7 ________________
The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
Station address 00:30:64:04:E6:E6.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Sleep mode is enabled. This is not recommended.
Under high load the card may not respond to
PCI requests, and thus cause a master abort.
To clear sleep mode use the '-G 0 -w -w -f' options.


On a related note, I am concerned by this message:

Sleep mode is enabled. This is not recommended.
Under high load the card may not respond to
PCI requests, and thus cause a master abort.
To clear sleep mode use the '-G 0 -w -w -f' options.

Intel 82559 EEPROM Map and Programming Information (AP-394) states:
http://www.intel.com/design/network/applnots/ap394.htm

The Standby Enable bit enables the 82559 to enter standby mode. When
this bit equals 1b, the 82559 is able to recognize an idle state and can
enter standby mode (some internal clocks are stopped for power saving
purposes). The 82559 does not require a PCI clock signal in standby
mode. If this bit equals 0b, the idle recognition circuit is disabled
and the 82559 always remains in an active state. Thus, the 82559 always
requests PCI CLK using the Clockrun mechanism.

Auke, do you agree with Donald Becker's warning?

If I disable STB, the NICs will waste a bit more power when idle,
is that correct? Are there other implications?

Thanks for reading this far!

Auke Kok

Nov 8, 2006, 11:20:12 AM
John wrote:
> I have a motherboard with three on-board 82559 NICs.
>
> o eepro100.ko properly initializes all three NICs
> o e100.ko fails to initialize one of them
>
> NOTE: With kernel 2.6.14, e100.ko fails to initialize the NIC with MAC
> address 00:30:64:04:E6:E4. With kernel 2.6.18 e100.ko fails to
> initialize the NIC with MAC address 00:30:64:04:E6:E5.
>
> The problem is not an incorrect checksum. (Donald Becker's dump utility
> reports a correct checksum for all three NICs.) The problem seems to be
> that e100.ko fails to read the contents of one of the EEPROMs.

[snip]

> 'insmod e100.ko eeprom_bad_csum_allow=1' reports:
>
> e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
> e100: Copyright(c) 1999-2005 Intel Corporation
> ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 12
> PCI: setting IRQ 12 as level-triggered
> ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA] -> GSI 12 (level,
> low) -> IRQ 12
> e100: eth0: e100_probe: addr 0xe5300000, irq 12, MAC addr 00:30:64:04:E6:E4
> ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
> PCI: setting IRQ 10 as level-triggered
> ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB] -> GSI 10 (level,
> low) -> IRQ 10
> e100: 0000:00:09.0: e100_eeprom_load: EEPROM corrupted
> e100: 0000:00:09.0: e100_probe: Invalid MAC address from EEPROM, aborting.
> ACPI: PCI interrupt for device 0000:00:09.0 disabled
> e100: probe of 0000:00:09.0 failed with error -11

This is what I was afraid of: even though the code allows you to bypass the EEPROM
checksum, the probe fails on a further check to see if the MAC address is valid.

Since something with this NIC specifically made the EEPROM return all 0xff's, the MAC
address is automatically invalid, and thus probe fails.
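
As a rough illustration of that second check: the kernel's is_valid_ether_addr() helper rejects an address that is all zeros or has the multicast (group) bit set, and ff:ff:ff:ff:ff:ff has that bit set. The sketch below is a user-space re-implementation written for this thread, not the driver's code; mac_is_valid is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space sketch of the rule applied by the kernel's
 * is_valid_ether_addr(): a usable station address is neither
 * all-zero nor a multicast/broadcast address. */
static bool mac_is_valid(const uint8_t a[6])
{
    /* Group bit: lowest bit of the first octet. Broadcast
     * (ff:ff:ff:ff:ff:ff) has it set as well. */
    bool multicast = (a[0] & 0x01) != 0;
    bool zero = (a[0] | a[1] | a[2] | a[3] | a[4] | a[5]) == 0;

    return !multicast && !zero;
}
```

An EEPROM that reads back as all ones therefore yields an address that fails this test, whatever the checksum override says.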

It seems that the driver has more problems with this NIC than just the eeprom checksum
being bad. Needless to say this might need fixing.

Can you load the eepro driver and send me the full eeprom dump? Perhaps I can duplicate
things over here.

[snip]

> On a related note, I am concerned by this message:
>
> Sleep mode is enabled. This is not recommended.
> Under high load the card may not respond to
> PCI requests, and thus cause a master abort.
> To clear sleep mode use the '-G 0 -w -w -f' options.
>
> Intel 82559 EEPROM Map and Programming Information (AP-394) states:
> http://www.intel.com/design/network/applnots/ap394.htm
>
> The Standby Enable bit enables the 82559 to enter standby mode. When
> this bit equals 1b, the 82559 is able to recognize an idle state and can
> enter standby mode (some internal clocks are stopped for power saving
> purposes). The 82559 does not require a PCI clock signal in standby
> mode. If this bit equals 0b, the idle recognition circuit is disabled
> and the 82559 always remains in an active state. Thus, the 82559 always
> requests PCI CLK using the Clockrun mechanism.
>
> Auke, do you agree with Donald Becker's warning?

If you are using the e100 in a performance situation, I would certainly switch it off :)

> If I disable STB, the NICs will waste a bit more power when idle,
> is that correct? Are there other implications?

hm, I don't know the power specs of e100 that well, so I can't say that it saves
significant amounts of power, but I suspect it would.

Power management on NICs is hairy business. As suggested, it can take time before the
NIC powers back up, performance can be impacted, and some commands might return an
invalid or unknown value. OTOH our labs here test these things pretty well before they
get sent out to customers and resale agents, so Becker's cautious wording ("not
recommended") captures the severity pretty well.

I would say that under most circumstances, it's safe to enable STB, but you might want
to disable it for use in routing and other server applications, where most of the time
the NIC is active anyway.

hth

Auke

Jesse Brandeburg

Nov 8, 2006, 12:30:18 PM
On 11/8/06, John <m...@privacy.net> wrote:
> Hello all,
>
> [ E-mail address is a bit-bucket. I *do* monitor the mailing lists. ]
>
> I will try and summarize the problem as I understand it at this point.
>
> I've written two messages so far:
> http://groups.google.com/group/linux.kernel/msg/3a05d819c66474db
> http://groups.google.com/group/linux.kernel/msg/391aebbb3dfd6039
>
> And here is a link to the complete thread:
> http://lkml.org/lkml/fancy/2006/11/3/124
>
> I have a motherboard with three on-board 82559 NICs.
>
> o eepro100.ko properly initializes all three NICs
> o e100.ko fails to initialize one of them
>
> NOTE: With kernel 2.6.14, e100.ko fails to initialize the NIC with MAC
> address 00:30:64:04:E6:E4. With kernel 2.6.18 e100.ko fails to
> initialize the NIC with MAC address 00:30:64:04:E6:E5.
>
> The problem is not an incorrect checksum. (Donald Becker's dump utility
> reports a correct checksum for all three NICs.) The problem seems to be
> that e100.ko fails to read the contents of one of the EEPROMs.

<snip>

Thanks for the report, I have some thoughts.
I suspect that one reason Becker's code works is that it uses IO based
access (slower, and different method) to the adapter rather than
memory mapped access.

The second thought is that the adapter is in D3, and something about
your kernel or the driver doesn't successfully wake it up to D0. An
indication of this would be looking at lspci -vv before/after loading
the driver. Also, after loading/unloading eepro100 does the e100
driver work?

A third idea is to look for a master abort in lspci after e100 fails to load.

And a last idea is for us to instrument the reads/writes from/to the
device during init and see if everything is returning 0xffffffff, as
that indicates the I/O and/or memory bar is not enabled, or the
address returned from ioremap is invalid.

Jesse

John

Nov 9, 2006, 7:20:20 AM
Auke Kok wrote:

> This is what I was afraid of: even though the code allows you to bypass
> the EEPROM checksum, the probe fails on a further check to see if the
> MAC address is valid.
>
> Since something with this NIC specifically made the EEPROM return all
> 0xff's, the MAC address is automatically invalid, and thus probe fails.

I don't understand why you think there is something wrong with a
specific NIC?

In 2.6.14.7, e100.ko fails to read the EEPROM on 0000:00:08.0 (eth0)
In 2.6.18.1, e100.ko fails to read the EEPROM on 0000:00:09.0 (eth1)
In both kernels, eepro100.ko successfully reads all the EEPROMs.

> It seems that the driver has more problems with this NIC than just the
> eeprom checksum being bad. Needless to say this might need fixing.
>
> Can you load the eepro driver and send me the full eeprom dump?
> Perhaps I can duplicate things over here.

00:08.0 EEPROM contents, size 64x16

3000 0464 e4e6 0e03 0000 0201 4701 0000
7213 8310 40a2 0001 8086 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0128 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 92f7

00:09.0 EEPROM contents, size 64x16

3000 0464 e5e6 0e03 0000 0201 4701 0000
7213 8310 40a2 0001 8086 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0128 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 91f7

00:0a.0 EEPROM contents, size 64x16

3000 0464 e6e6 0e03 0000 0201 4701 0000
7213 8310 40a2 0001 8086 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0128 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 90f7
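
For anyone who wants to check these dumps by hand, here is a small user-space sketch of the rule quoted from e100_eeprom_load() at the top of the thread: the 16-bit sum of all words, including the checksum in the last word, should come to 0xBABA. The function names are made up for illustration, and the words are taken as the host-order values printed by eepro100-diag (the driver itself goes through le16_to_cpu).

```c
#include <stddef.h>
#include <stdint.h>

/* Checksum word that a valid image of n words should carry in w[n - 1]:
 * 0xBABA minus the 16-bit wrapping sum of the first n - 1 words. */
static uint16_t eeprom_expected_csum(const uint16_t *w, size_t n)
{
    uint16_t sum = 0;

    for (size_t i = 0; i + 1 < n; i++)
        sum = (uint16_t)(sum + w[i]);
    return (uint16_t)(0xBABA - sum);
}

/* Non-zero when the stored checksum matches the expected one. */
static int eeprom_csum_ok(const uint16_t *w, size_t n)
{
    return n > 0 && eeprom_expected_csum(w, n) == w[n - 1];
}
```

Fed the 64 words listed for 00:09.0, the expected checksum comes out as 0x91f7, which is what the last word holds. An image that reads back as all 0xffff does not pass.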

John

Nov 9, 2006, 9:20:13 AM
Jesse Brandeburg wrote:

> I suspect that one reason Becker's code works is that it uses IO
> based access (slower, and different method) to the adapter rather
> than memory mapped access.

I've noticed this difference.

> The second thought is that the adapter is in D3, and something about
> your kernel or the driver doesn't successfully wake it up to D0.

On my NICs, the EEPROM ID (Word 0Ah) is set to 0x40a2.
Thus DDPD (bit 6) is set to 0.

DDPD is the "Disable Deep Power Down while PME is disabled" bit.
0 - Deep Power Down is enabled in D3 state while PME-disabled.
1 - Deep Power Down disabled in D3 state while PME-disabled.
This bit should be set to 1b if a TCO controller is being used via the
SMB because it requires receive functionality at all power states.

Are you suggesting I try and set DDPD to 1?
Or is this completely unrelated?
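
To make the bit arithmetic explicit, here is a minimal sketch that tests and sets DDPD in the ID word. The word index (0Ah) and bit position (6) are the ones quoted above; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define EEPROM_ID_WORD  0x0A       /* word index of the EEPROM ID, as quoted above */
#define EEPROM_ID_DDPD  (1u << 6)  /* DDPD is bit 6 of that word, as quoted above */

/* Non-zero when Deep Power Down is disabled (DDPD = 1). */
static int ddpd_is_set(uint16_t id_word)
{
    return (id_word & EEPROM_ID_DDPD) != 0;
}

/* Returns the ID word with DDPD forced to 1; other bits are untouched. */
static uint16_t ddpd_set(uint16_t id_word)
{
    return (uint16_t)(id_word | EEPROM_ID_DDPD);
}
```

For 0x40a2 the bit is clear, and setting it would give 0x40e2. Since the checksum covers every word, the last EEPROM word would have to be recomputed after such a change.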

> An indication of this would be looking at lspci -vv before/after
> loading the driver.

$ diff -u lspci_vv_before_e100.txt lspci_vv_after_e100.txt
--- lspci_vv_before_e100.txt 2006-11-09 14:51:30.000000000 +0100
+++ lspci_vv_after_e100.txt 2006-11-09 14:51:30.000000000 +0100
@@ -74,21 +74,20 @@
Expansion ROM at 20000000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
- Status: D0 PME-Enable+ DSel=0 DScale=2 PME-
+ Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:09.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
100] (rev 08)
Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
- Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
+ Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
- Latency: 32 (2000ns min, 14000ns max), cache line size 08
Interrupt: pin A routed to IRQ 10
- Region 0: Memory at e5302000 (32-bit, non-prefetchable) [size=4K]
- Region 1: I/O ports at dc00 [size=64]
- Region 2: Memory at e5100000 (32-bit, non-prefetchable) [size=1M]
+ Region 0: Memory at e5302000 (32-bit, non-prefetchable)
[disabled] [size=4K]
+ Region 1: I/O ports at dc00 [disabled] [size=64]
+ Region 2: Memory at e5100000 (32-bit, non-prefetchable)
[disabled] [size=1M]
Expansion ROM at 20100000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
- Status: D0 PME-Enable+ DSel=0 DScale=2 PME-
+ Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:0a.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
100] (rev 08)
Subsystem: Intel Corporation EtherExpress PRO/100B (TX)

> Also, after loading/unloading eepro100 does the e100 driver work?

No.

> A third idea is look for a master abort in lspci after e100 fails to
> load.

I don't understand that one.

Auke Kok

Nov 9, 2006, 12:10:10 PM
John wrote:
> Auke Kok wrote:
>
>> This is what I was afraid of: even though the code allows you to
>> bypass the EEPROM checksum, the probe fails on a further check to see
>> if the MAC address is valid.
>>
>> Since something with this NIC specifically made the EEPROM return all
>> 0xff's, the MAC address is automatically invalid, and thus probe fails.
>
> I don't understand why you think there is something wrong with a
> specific NIC?

that was completely not my point - I was merely trying to point out that the original
problem causes a cascade of error events later on, and bypassing the eeprom check in
this case didn't help you at all. Something is wrong in the driver, but I don't
understand yet why it only affects one of the 3 nics in your system.

> In 2.6.14.7, e100.ko fails to read the EEPROM on 0000:00:08.0 (eth0)
> In 2.6.18.1, e100.ko fails to read the EEPROM on 0000:00:09.0 (eth1)

almost sounds like a bug got fixed and it introduced a regression. this wouldn't be the
right time to pull out git-bisect would it? even loading 2.6.15, 2.6.16, 2.6.17 on it
would give us some good information.


Cheers,

Auke

Jesse Brandeburg

Nov 9, 2006, 7:30:17 PM
On 11/9/06, John <m...@privacy.net> wrote:
> > The second thought is that the adapter is in D3, and something about
> > your kernel or the driver doesn't successfully wake it up to D0.
>
> On my NICs, the EEPROM ID (Word 0Ah) is set to 0x40a2.
> Thus DDPD (bit 6) is set to 0.
>
> DDPD is the "Disable Deep Power Down while PME is disabled" bit.
> 0 - Deep Power Down is enabled in D3 state while PME-disabled.
> 1 - Deep Power Down disabled in D3 state while PME-disabled.
> This bit should be set to 1b if a TCO controller is being used via the
> SMB because it requires receive functionality at all power states.
>
> Are you suggesting I try and set DDPD to 1?
> Or is this completely unrelated?

This may be related but I doubt it. Something is strange about how
memory is being mapped in your system. whatever is creating the
problem moved when you changed the kernel version. I'm wondering if
there is a device collision at e5302000. I'm not convinced at this
point it is e100's fault.

can you send output of cat /proc/iomem

> > An indication of this would be looking at lspci -vv before/after
> > loading the driver.
>
> $ diff -u lspci_vv_before_e100.txt lspci_vv_after_e100.txt
> --- lspci_vv_before_e100.txt 2006-11-09 14:51:30.000000000 +0100
> +++ lspci_vv_after_e100.txt 2006-11-09 14:51:30.000000000 +0100
> @@ -74,21 +74,20 @@
> Expansion ROM at 20000000 [disabled] [size=1M]
> Capabilities: [dc] Power Management version 2
> Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
> PME(D0+,D1+,D2+,D3hot+,D3cold+)
> - Status: D0 PME-Enable+ DSel=0 DScale=2 PME-
> + Status: D0 PME-Enable- DSel=0 DScale=2 PME-

okay when the driver loads it is clearing PME enable, but not
re-enabling it when it unloads. That is pretty much expected.

> 00:09.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro
> 100] (rev 08)
> Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
> - Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B-
> + Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B-
> Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- <TAbort- <MAbort- >SERR- <PERR-

pci_enable_device should be enabling io,mem,busmaster, they are
probably being disabled when the driver errors out of init. maybe you
should add a call to pci_set_power_state(dev, PCI_D0); before the
call to e100_reset
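
For reference, the "(0000 -> 0003)" in the "PCI: Enabling device" lines and the Control: flags in lspci look like two views of the same PCI command register. A small user-space sketch of the mapping follows; the bit values are the ones defined in the kernel's pci_regs.h, while pci_cmd_str is a hypothetical helper written only for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

#define PCI_COMMAND_IO      0x1  /* device responds to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x2  /* device responds to memory space accesses */
#define PCI_COMMAND_MASTER  0x4  /* device may act as a bus master */

/* Render the low three command bits in the style of lspci's
 * "Control:" line. Returns what snprintf returns. */
static int pci_cmd_str(uint16_t cmd, char *buf, size_t len)
{
    return snprintf(buf, len, "I/O%c Mem%c BusMaster%c",
                    (cmd & PCI_COMMAND_IO)     ? '+' : '-',
                    (cmd & PCI_COMMAND_MEMORY) ? '+' : '-',
                    (cmd & PCI_COMMAND_MASTER) ? '+' : '-');
}
```

A command value of 0x0003 renders as "I/O+ Mem+ BusMaster-": bus mastering is a separate bit, normally turned on by pci_set_master(). A value of 0x0000 renders as the "I/O- Mem- BusMaster-" seen after the failed probe.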

> > Also, after loading/unloading eepro100 does the e100 driver work?
>
> No.

now that is really odd.

> > A third idea is look for a master abort in lspci after e100 fails to
> > load.
>
> I don't understand that one.

There isn't one; otherwise MAbort+ would be showing in the above lspci
output.

Getting 0xffffffff back from every register read is a sure sign that
the hardware either isn't at the address specified or is in a
powered-down state. The only other option I can think of is that
something else is intercepting memory reads and writes.

Try something like the attached patch (compile-tested only):

e100_debug.patch

John

Nov 10, 2006, 7:10:07 AM
Jesse Brandeburg wrote:

> Can you send output of cat /proc/iomem

00000000-0009ffff : System RAM
000a0000-000bffff : Video RAM area
000f0000-000fffff : System ROM
00100000-0ffeffff : System RAM
00100000-00296a1a : Kernel code
00296a1b-0031bbe7 : Kernel data
0fff0000-0fff2fff : ACPI Non-volatile Storage
0fff3000-0fffffff : ACPI Tables
20000000-200fffff : 0000:00:08.0
20100000-201fffff : 0000:00:09.0
20200000-202fffff : 0000:00:0a.0
e0000000-e3ffffff : 0000:00:00.0
e5000000-e50fffff : 0000:00:08.0
e5100000-e51fffff : 0000:00:09.0
e5200000-e52fffff : 0000:00:0a.0
e5300000-e5300fff : 0000:00:08.0
e5301000-e5301fff : 0000:00:0a.0
e5302000-e5302fff : 0000:00:09.0
ffff0000-ffffffff : reserved
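
One way to check a listing like this for the suspected address collision
is a pairwise overlap test on the ranges of interest. The sketch below
is only an illustration: the struct and function names are hypothetical,
and the ranges have to be copied in by hand from /proc/iomem.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One /proc/iomem entry; both bounds are inclusive, as in the listing. */
struct mem_range {
    uint32_t start;
    uint32_t end;
};

/* Two inclusive ranges overlap if each one starts before the other ends. */
bool ranges_overlap(struct mem_range a, struct mem_range b)
{
    return a.start <= b.end && b.start <= a.end;
}

/* Returns true if any two of the n ranges overlap. */
bool any_collision(const struct mem_range *r, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (ranges_overlap(r[i], r[j]))
                return true;
    return false;
}
```

Applied to the three 4 KB regions at e5300000, e5301000, and e5302000,
this reports no collision. It only covers ranges the kernel knows about,
so it cannot rule out another device decoding the same addresses.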

I've also attached:

o config-2.6.18.1-adlink used to compile this kernel
o dmesg output after the machine boots

> try something like the attached patch

Loading e100-debug.ko reports:

e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation

***e100 debug: unable to set power state (error 0)


ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 12
PCI: setting IRQ 12 as level-triggered
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKA]
-> GSI 12 (level, low) -> IRQ 12

***e100 debug: read 01000000/00000000 from the same register
e100: eth0: e100_probe: addr 0xe5300000, irq 12, MAC addr 00:30:64:04:E6:E4

***e100 debug: unable to set power state (error 0)


ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:09.0[A] -> Link [LNKB]
-> GSI 10 (level, low) -> IRQ 10

***e100 debug: read 01000000/00000000 from the same register


e100: 0000:00:09.0: e100_eeprom_load: EEPROM corrupted
ACPI: PCI interrupt for device 0000:00:09.0 disabled
e100: probe of 0000:00:09.0 failed with error -11

***e100 debug: unable to set power state (error 0)


ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [LNKC]
-> GSI 11 (level, low) -> IRQ 11

***e100 debug: read 01000000/00000000 from the same register
e100: eth1: e100_probe: addr 0xe5301000, irq 11, MAC addr 00:30:64:04:E6:E6


In other words, the behavior is the same for all three NICs.

pci_set_power_state(pdev, PCI_D0) returns 0
pci_iomap returns something != NULL

Can I provide more information to help locate the problem?

config-2.6.18.1-adlink
dmesg.txt

John

Nov 15, 2006, 3:40:03 AM
John wrote:

> 00000000-0009ffff : System RAM
> 000a0000-000bffff : Video RAM area
> 000f0000-000fffff : System ROM
> 00100000-0ffeffff : System RAM
> 00100000-00296a1a : Kernel code
> 00296a1b-0031bbe7 : Kernel data
> 0fff0000-0fff2fff : ACPI Non-volatile Storage
> 0fff3000-0fffffff : ACPI Tables
> 20000000-200fffff : 0000:00:08.0
> 20100000-201fffff : 0000:00:09.0
> 20200000-202fffff : 0000:00:0a.0
> e0000000-e3ffffff : 0000:00:00.0
> e5000000-e50fffff : 0000:00:08.0
> e5100000-e51fffff : 0000:00:09.0
> e5200000-e52fffff : 0000:00:0a.0
> e5300000-e5300fff : 0000:00:08.0
> e5301000-e5301fff : 0000:00:0a.0
> e5302000-e5302fff : 0000:00:09.0
> ffff0000-ffffffff : reserved
>
> I've also attached:
>
> o config-2.6.18.1-adlink used to compile this kernel
> o dmesg output after the machine boots

I suppose the information I've sent is not enough to locate the
root of the problem. Is there more I can provide?

John

Nov 27, 2006, 9:21:24 AM
John wrote:

Here is some context for those who have been added to the CC list:
http://groups.google.com/group/linux.kernel/browse_frm/thread/bdc8fd08fb601c26

As far as I understand, some consider the eepro100 driver to be
obsolete, and it has been considered for removal.

What is the current status?

Unfortunately, e100 does not work out-of-the-box on this system.

Is there something I can do to improve the situation?

--
Regards,

John

[ E-mail address is a bit-bucket. I *do* monitor the mailing lists. ]


Jesse Brandeburg

Nov 27, 2006, 3:40:13 PM

Let's go ahead and print the output from e100_eeprom_load.
Debug patch attached.

e100_debug.patch

Jesse Brandeburg

Nov 29, 2006, 2:00:15 PM
On 11/29/06, John <m...@privacy.net> wrote:
> > Let's go ahead and print the output from e100_eeprom_load.
> > Debug patch attached.
>
> Loading (then unloading) e100.ko fails the first few times (i.e. the
> driver claims one of the EEPROMs is corrupted). Thereafter, sometimes it
> fails, other times it works. Sounds like a race, no?

Yes, or something like that. I think you may have a piece of EEPROM
hardware that is either "slow" or slightly out of spec. I wonder if
the hrt kernel makes udelay(4) much more like 4us than the regular
kernels do.

Can you try adding mdelay(100); in e100_eeprom_load before the for loop,
and then changing the multiple udelay(4) calls to mdelay(1) in
e100_eeprom_read?

> On an unrelated note, insmod_100.txt is truncated at the beginning, and
> insmod_110.txt is truncated in the middle (!!) cf. line 14. What would
> cause klogd to behave like that?

Usually it's because whatever is printing is printing too fast or too
much at a time.

Jesse Brandeburg

Dec 4, 2006, 6:30:09 PM
On 12/1/06, John <m...@privacy.net> wrote:
> > can you try adding mdelay(100); in e100_eeprom_load before the for loop,
> > and then change the multiple udelay(4) to mdelay(1) in e100_eeprom_read
>
> I applied the attached patch.
>
> Loading the driver now takes around one minute :-)

Ouch, but yep, that's what happens when you use "super extra delay".
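
The one-minute figure is roughly what a back-of-the-envelope estimate
gives. The sketch below makes that arithmetic explicit. Its inputs are
assumptions rather than facts from this thread: 256 words per EEPROM,
and about 66 delay calls per word (two per clocked bit over 32 bits,
plus one each for chip select and deselect). The function name is
hypothetical.

```c
#include <stdint.h>

/*
 * Estimated time, in microseconds, to read every EEPROM word on every
 * NIC. delays_per_word and words are assumed values supplied by the
 * caller, not numbers taken from the driver at run time.
 */
uint64_t eeprom_load_time_us(unsigned nics, unsigned words,
                             unsigned delays_per_word,
                             uint64_t delay_us, uint64_t pre_delay_us)
{
    /* One-off delay before the read loop, then a fixed cost per word. */
    uint64_t per_nic = pre_delay_us +
                       (uint64_t)words * delays_per_word * delay_us;

    return (uint64_t)nics * per_nic;
}
```

With three NICs, mdelay(1) per step, and the extra mdelay(100), the
estimate is eeprom_load_time_us(3, 256, 66, 1000, 100000), or about 51
seconds. The same call with the original udelay(4) and no extra delay
gives about 0.2 seconds.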

> I ran 'source load_unload' 25 times in a loop.
>
> The first 12 times were successful. The last 13 times failed.
> (cf. attached archive)
>
> I noticed something very strange.
>
> The number of words obviously in error (0xFFFF) returned by the EEPROM
> on 00:09.0 is not constant.

That is very strange. I would think that maybe you have something else
on the bus with the e100 that is hogging bus cycles, or that you have
failing hardware (maybe a bad EEPROM, or possibly a bad MAC chip).
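
To compare dumps from different load attempts, it can help to count the
all-ones words and re-check the checksum offline. The sketch below does
both, following the rule quoted from e100_eeprom_load() at the start of
the thread: the last word is chosen so that the 16-bit sum of all words
is 0xBABA. The function names are hypothetical, and the words are
assumed to be in host byte order already.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Count words that read back as all ones (0xFFFF). */
size_t count_ffff(const uint16_t *words, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (words[i] == 0xFFFF)
            count++;
    return count;
}

/* True if the last word makes the 16-bit sum of all words equal 0xBABA. */
bool eeprom_checksum_ok(const uint16_t *words, size_t n)
{
    uint16_t sum = 0;

    if (n == 0)
        return false;
    /* Sum every word except the stored checksum, wrapping at 16 bits. */
    for (size_t i = 0; i + 1 < n; i++)
        sum = (uint16_t)(sum + words[i]);
    return words[n - 1] == (uint16_t)(0xBABA - sum);
}
```

Even a single data word misread as 0xFFFF will normally break the
checksum, which would fit the driver reporting "EEPROM corrupted" as
soon as a dump contains any bad words.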

> $ grep -c 0xFFFF insmod*
> insmod_300.txt:0
> insmod_301.txt:0
> insmod_302.txt:0
> insmod_303.txt:0
> insmod_304.txt:0
> insmod_305.txt:0
> insmod_306.txt:0
> insmod_307.txt:0
> insmod_308.txt:0
> insmod_309.txt:0
> insmod_310.txt:0
> insmod_311.txt:0
> insmod_312.txt:1
> insmod_313.txt:5
> insmod_314.txt:24
> insmod_315.txt:45
> insmod_316.txt:243
> insmod_317.txt:256
> insmod_318.txt:256
> insmod_319.txt:256
> insmod_320.txt:256
> insmod_321.txt:256
> insmod_322.txt:256
> insmod_323.txt:253
> insmod_324.txt:240

This is even stranger. Does it cycle back down (like a sine wave) to
zero again? The delays did seem to work, at least sometimes, which
indicates that something needs that extra delay to successfully read
the EEPROM. I might try changing all the udelay(4) calls to udelay(40)
(a 10x increase) to see if that gives you a happy medium of "most times
the driver loads without error".

John, this problem seems to be very specific to your hardware. I know
that you have put in a lot of time debugging this, but I'm not sure
what we can do from here. If this were a generic code problem more
people would be reporting the issue.

What would you like to do? At this stage I would like e100 to work
better than it is, but I'm not sure what to do next.

Thanks for your patience on this issue,
Jesse
