Bug#360699: installation-reports: Installation on Sun Netra X1

Javier Fernández-Sanguino Peña

Apr 4, 2006, 5:40:31 AM
Package: installation-reports

Boot method: RARP/TFTP
Image version:
http://http.us.debian.org/debian/dists/sarge/main/installer-sparc/current/images/sparc64/netboot/2.6/
07-Mar-2005 01:32 5.3M boot.img

Date: 2nd April 2006

Machine: Sun Netra X1
Processor:
# cat /proc/cpuinfo

cpu : TI UltraSparc IIe (Hummingbird)
fpu : UltraSparc IIe integrated FPU
promlib : Version 3 Revision 0
prom : 4.0.9

Memory:
# cat /proc/meminfo
MemTotal: 1029576 kB

Partitions:
I ran 'fdisk -l /dev/hda' during the installation and got this:

Disk /dev/hda (Sun disk label): 16 heads, 255 sectors, 19156 cylinders
Units = cylinders of 4080 * 512 bytes

Device Flag Start End Blocks Id System
/dev/hda1 17722 19158 2929440 83 Linux native
/dev/hda2 16765 17722 1952280 82 Linux swap
/dev/hda3 0 19158 39082320 5 Whole disk

Output of lspci and lspci -n:
#lspci

0000:00:00.0 Host bridge: Sun Microsystems Computer Corp. Ultra IIe
0000:00:03.0 Non-VGA unclassified device: ALi Corporation M7101 Power
Management Controller [PMU]
0000:00:05.0 Ethernet controller: Davicom Semiconductor, Inc. 21x4x
DEC-Tulip compatible 10/100 Ethernet (rev 31)
0000:00:07.0 ISA bridge: ALi Corporation M1533 PCI to ISA Bridge
[Aladdin IV]
0000:00:0a.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
0000:00:0c.0 Ethernet controller: Davicom Semiconductor, Inc. 21x4x
DEC-Tulip compatible 10/100 Ethernet (rev 31)
0000:00:0d.0 IDE interface: ALi Corporation M5229 IDE (rev c3)

#lspci -n
0000:00:00.0 0600: 108e:a001
0000:00:03.0 0000: 10b9:7101
0000:00:05.0 0200: 1282:9102 (rev 31)
0000:00:07.0 0601: 10b9:1533
0000:00:0a.0 0c03: 10b9:5237 (rev 03)
0000:00:0c.0 0200: 1282:9102 (rev 31)
0000:00:0d.0 0101: 10b9:5229 (rev c3)

Base System Installation Checklist:
[O] = OK, [E] = Error (please elaborate below), [ ] = didn't try it

Initial boot worked: [ O ]
Configure network HW: [ O ]
Config network: [ E ]
Detect CD: [ ] -- This system does not have a CD-ROM
Load installer modules: [ O ]
Detect hard drives: [ O ]
Partition hard drives: [ O ]
Create file systems: [ O ]
Mount partitions: [ O ]
Install base system: [ O ]
Install boot loader: [ E ]
Reboot: [ E ]

Comments/Problems:

I have been unable to find this information at
http://www.debian.org/devel/debian-installer/errata

Network config issue
--------------------

The first problem during the installation was the DHCP configuration
step. All the other steps complete without showing any error, but this
step fails because it cannot obtain an IP address.

It turns out that the problem is caused by the drivers being loaded, and
it has to be fixed from the console. If you go to the console and run
'ifconfig', the output is the following:

eth0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

eth1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:192

lo Link encap:Local Loopback
LOOPBACK MTU:16436 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

Notice that the hardware MAC address is all '0's, which is clearly wrong.
If you run 'dmesg' you see the following errors:

SABRE0: PCI SERR signal asserted.
SABRE0: PCI bus error, PCI_STATUS[eaa0]
SABRE0: PCI SERR signal asserted.
SABRE0: PCI bus error, PCI_STATUS[caa0]

And this is how the kernel loads the drivers:

Linux Tulip driver version 1.1.13-NAPI (May 11, 2002)
dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17)
eth0: Davicom DM9102 at pci0000:00:05.0, 00:00:00:00:00:00, irq 7204992.
eth1: Davicom DM9102 at pci0000:00:0c.0, 00:00:00:00:00:00, irq 7204288.

A Google search shows that this is a known bug of the 'dmfe' driver,
which replaces the 'tulip' driver. If you run 'lsmod' in the console you
will see that both drivers are loaded:

Module Size Used by
(...)
dmfe 23900 0
tulip 54472 0
ohci_hcd 19716 0

This is wrong and has to be fixed by hand, by unloading both drivers
and then loading only 'tulip' again, like this:

$ modprobe -r dmfe
$ modprobe -r tulip
$ modprobe tulip
$ dhclient eth1

Note: The 'eth1' interface is the 'A' interface of the Netra, while 'B' is
eth0, which is somewhat confusing. In our setup we were using
'A' (eth1) to connect to the network.
(This was reported in #239873 but has not been documented in the SPARC
Installation Guide.)
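
The manual workaround above could be wrapped in a small script along the
following lines. This is only a sketch: the helper names (is_zero_mac,
iface_mac, reload_tulip) are made up for illustration and are not part of
debian-installer, and it assumes that sysfs is mounted at /sys in the
installer environment, which has not been verified here.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, not part of debian-installer.

# Succeeds if the given MAC address is non-empty and consists only of zeros.
is_zero_mac() {
    [ -n "$1" ] && [ -z "$(printf '%s' "$1" | tr -d ':0')" ]
}

# Prints the MAC address of an interface as exposed by sysfs.
# Assumption: sysfs is mounted at /sys (2.6 kernels).
iface_mac() {
    cat "/sys/class/net/$1/address" 2>/dev/null
}

# Unloads both NIC drivers and loads only 'tulip' again,
# mirroring the manual workaround described in this report.
reload_tulip() {
    modprobe -r dmfe
    modprobe -r tulip
    modprobe tulip
}

# Usage (as root, from the installer console):
#   if is_zero_mac "$(iface_mac eth1)"; then
#       reload_tulip && dhclient eth1
#   fi
```

Only the functions are defined when the script is sourced, so nothing is
unloaded until reload_tulip is called explicitly.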

NOTE: This URL
http://040.digital-bless.com/texts/installing_debian_linux_on_sun_netra_x1.html
also mentions issues with the 'dmfe' network driver. I wonder why the
installer tries to load both drivers when 'dmfe' is not recommended.
The person who did that writeup attached a CD-ROM drive to the system (the
Netra X1, unlike the T1, does not have one), so the lack of network was not
critical for him, but it would be a show-stopper for anyone else installing
on this system.


SILO issue
----------

When the installer tries to install SILO it fails:

SILO installation failed
Running "/sbin/silo" failed with error code "1".

If SILO is executed in the terminal (within a chroot), it also fails:

/etc/silo.conf appears to be valid
Fatal error: Cannot open /dev/hda3

This is the output of 'fdisk -l /dev/hda'

Disk /dev/hda (Sun disk label): 16 heads, 255 sectors, 19156 cylinders
Units = cylinders of 4080 * 512 bytes

Device Flag Start End Blocks Id System
/dev/hda1 17722 19158 2929440 83 Linux native
/dev/hda2 16765 17722 1952280 82 Linux swap
/dev/hda3 0 19158 39082320 5 Whole disk

However, if you restart the installer (without rebooting the system)
and rerun the SILO installation (either from the installer or from a
chroot) it *does* work (?).

RAID issue
----------

Software RAID cannot be configured during the installation, since
the partitions cannot be set up as RAID volumes.
The option "Use as: physical volume for RAID" is not available
in the 'partman' module.

Reboot issue
------------

(This only happened on one system, so it might be a hardware problem.)

After the installation, the system does not boot and returns the following
error message:

Allocated 8 Megs of memory at 0x40000000 for kernel
Uncompressing image...
Loaded kernel version 2.6.15
Illegal Instruction

If the system is hard-reset (soft-reset won't do) then it is able to reboot
properly and load the Linux kernel.


Regards

Javier

PS: Full log of the installation and more information available on request

Frans Pop

Apr 4, 2006, 7:30:14 AM
clone 360699 -1
reassign -1 discover1
retitle -1 Entries in exception lists should overrule default lists
tags -1 + d-i
thanks

Hi Javier,

Thanks for your extensive, well researched and well written report.

On Tuesday 04 April 2006 11:17, Javier Fernández-Sanguino Peña wrote:
> Network config issue
> --------------------
> The first problem throughout the installation was the DHCP
> configuration step, it goes through all the other steps without showing
> any error but this step fails as it is not able to get an assigned IP
> address.

[...]


> Notice that the Harware MAC address is all '0's, which is clearly
> wrong. If you use 'dmesg' you see the following errors:

[...]


> Google searching turns out that this is a known bug of the 'dmfe'
> driver which replaces the 'tulip' driver. If you run 'lsmod' in console
> you will see that both drivers are loaded:

[...]
> http://040.digital-bless.com/texts/installing_debian_linux_on_sun_netra_x1.html
> also mentions issues with the 'dmfe' network driver. I wonder
> why the installer tries to install both drivers when the 'dmfe' is not
> recommended.

This looks like a bug in discover. It uses several lists as its hardware
database; the relevant ones are pci.lst and pci-sparc.lst:

pci.lst:       12829102 ethernet dmfe  21x4x DEC-Tulip compatible 10/100 Ethernet
pci-sparc.lst: 12829102 ethernet tulip 21x4x DEC-Tulip compatible 10/100 Ethernet

As you can see, tulip _is_ listed as recommended for sparc, and IIUC this
should overrule the default in pci.lst. However, according to your report,
both are being loaded for some reason.
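
The intended rule (the architecture-specific entry overrules the default
one) can be sketched as follows. This is not discover's actual code:
lookup_module and pci_key are hypothetical helpers, and the sketch assumes
whitespace-separated fields (ID, class, module, description) as in the two
entries quoted above; the real list files may be laid out differently.

```shell
#!/bin/sh
# Sketch of the override rule; lookup_module and pci_key are hypothetical
# helpers, not taken from discover.

# pci_key VENDOR:DEVICE
# Turns an ID as printed by 'lspci -n' (e.g. 1282:9102) into a list key.
pci_key() {
    printf '%s\n' "$1" | tr -d ':'
}

# lookup_module KEY DEFAULT_LIST EXCEPTION_LIST
# Prints the module for KEY, preferring the exception list
# (e.g. pci-sparc.lst) over the default list (e.g. pci.lst).
lookup_module() {
    key=$1
    default_list=$2
    exception_list=$3
    for list in "$exception_list" "$default_list"; do
        [ -r "$list" ] || continue
        # Assumed field order: ID, class, module, description.
        module=$(awk -v key="$key" '$1 == key { print $3; exit }' "$list")
        if [ -n "$module" ]; then
            printf '%s\n' "$module"
            return 0
        fi
    done
    return 1
}

# Usage:
#   lookup_module "$(pci_key 1282:9102)" pci.lst pci-sparc.lst
```

With the two entries quoted above, such a lookup would return only 'tulip'
on sparc and fall back to 'dmfe' where no exception list exists, so at most
one driver would be loaded for the device.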

That said, this won't be fixed for Sarge anymore and the installer for
Etch no longer uses discover for hardware detection, but udev. I'd be
interested to know how the Etch Beta 2 release deals with this NIC.

> SILO issue
> ----------


> SILO installation failed
> Running "/sbin/silo" failed with error code "1".

[...]


> /etc/silo.conf appears to be valid
> Fatal error: Cannot open /dev/hda3

This is the first time I have seen that error, and Google has nothing on
it either. The only strange thing I can see about your partition table is
that cylinders 0-16765 seem unused. Is that free space?
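
For a sense of scale, the size of that range can be worked out from the
geometry in the fdisk output (16 heads, 255 sectors, 512-byte sectors).
The sketch below assumes that the End column is exclusive, which is
consistent with the Blocks values fdisk reports for hda1, hda2 and hda3;
cyl_range_kb is a made-up helper name.

```shell
#!/bin/sh
# Geometry taken from the fdisk output in the report.
heads=16
sectors=255
sector_bytes=512

# Size of one cylinder in 1 kB blocks: 16 * 255 * 512 / 1024 = 2040.
kb_per_cyl=$(( heads * sectors * sector_bytes / 1024 ))

# cyl_range_kb START END
# Prints the size in 1 kB blocks of the cylinders START..END.
# Assumption: END is exclusive, as the Blocks column suggests.
cyl_range_kb() {
    echo $(( ($2 - $1) * kb_per_cyl ))
}

cyl_range_kb 17722 19158   # hda1: prints 2929440, matching the Blocks column
cyl_range_kb 0 16765       # cylinders 0-16765: prints 34200600
```

So the apparently unused range covers roughly 34 million 1 kB blocks, which
is most of the 39082320 blocks reported for the whole disk.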

> However, if you restart the installer (without rebooting the system)
> and rerun the system (either from the installer or from a chroot)
> it *does* work. ?

Unless you can reproduce the problem on a new install, I don't see any
real possibility of tracing this. I don't see much point in reassigning
this to silo-installer to be honest. Let's hope it may prove useful
sometime having this in the archive...

> RAID issue
> ----------
> Software RAID can not be configured throughout the installation since
> the partitions cannot be configured as a RAID volume.
> The option: "Use as: physical volume for RAID" is not available
> in the 'partman' module.

Known issue; needs someone to work on adding RAID support for sparc.
There are quite a few little issues that could be improved for sparc...

> Reboot issue
> ------------


> After the installation, the system does not boot and returns the
> following error message:
>
> Allocated 8 Megs of memory at 0x40000000 for kernel
> Uncompressing image...
> Loaded kernel version 2.6.15
> Illegal Instruction
>
> If the system is hard-reset (soft-reset won't do) then it is able to
> reboot properly and load the Linux kernel.

Could this be related to the silo problem somehow?
I think I do remember earlier mails or reports that talked about needing a
power off before rebooting into the installed system.

Cheers,
FJP

Javier Fernández-Sanguino Peña

Apr 4, 2006, 8:20:25 AM
On Tue, Apr 04, 2006 at 01:13:12PM +0200, Frans Pop wrote:
>
> Thanks for your extensive, well researched and well written report

The credit is not mine; I translated (and edited) the report from a
co-worker :-)

> That said, this won't be fixed for Sarge anymore and the installer for
> Etch no longer uses discover for hardware detection, but udev. I'd be
> interested to know how the Etch Beta 2 release deals with this NIC.

I don't know if I can get my co-worker to try that one. I'll mention it and
see if that's possible but I don't believe they can do that right now.

> Unless you can reproduce the problem on a new install, I don't see any
> real possibility of tracing this. I don't see much point in reassigning
> this to silo-installer to be honest. Let's hope it may prove useful
> sometime having this in the archive...

Agreed. I'm not sure if that might be a hardware issue even...

> > If the system is hard-reset (soft-reset won't do) then it is able to
> > reboot properly and load the Linux kernel.
>
> Could this be related to the silo problem somehow?

Might be.

> I think I do remember earlier mails or reports that talked about needing a
> power off before rebooting into the installed system.

It could still be a hardware problem. It doesn't seem to happen in another
system that is being installed with the same image version.

Regards


Javier
