patchadd 137138-09 in single-user mode hung the system during the
postpatch script, inside the patch's own bootadm program.
Executing postpatch script...
Creating GRUB menu in /
- searching for UFS boot signatures
- no existing UFS boot signatures
Call 38210519 was opened.
The system could not be interrupted and had to be powered down.
Every subsequent boot ended in a system panic.
I had to reinstall the machine from scratch. I don't think my system was
unique, so this could happen to other Sun customers too.
I don't know why they did not issue an alert.
There are a few Sun Alerts posted about Update 6
kernel patches and booting problems:
246206: Solaris 10 Kernel Patches 137137-09/137138-09
May Cause Boot Failure For An MPxIO Enabled ...
246207: A Lack of Root Filesystem Space When
Installing Solaris 10 Kernel Patch 137137-09/137138-09
> Horst Scheuermann wrote:
> > System X4500 Thumper 5.10 Generic_137112-08, mirrored root, swap, var
> there are a few Sun Alerts posted about update 6
> kernel patches and booting problems:
> 246206: Solaris 10 Kernel Patches 137137-09/137138-09
> May Cause Boot Failure For An MPxIO Enabled ...
The symptoms were different from those in 246206; MPxIO was not enabled.
> 246207: A Lack of Root Filesystem Space When
> Installing Solaris 10 Kernel Patch 137137-09/137138-09
There was plenty of free space in the root filesystem.
--
11th Commandment: When you hear a bicycle bell, turn around, open your
mouth, nose and eyes wide, but under no circumstances step aside.
There are more Sun Alerts for the 137137-09 issues:
Document Audience: PUBLIC
Document ID: 244606
Title: Solaris 10 SPARC Kernel patch 137111-01 through 137111-08
Enforces Mutex Alignment Rules and May Cause Some Applications to Fail
Document Audience: PUBLIC
Document ID: 245626
Title: ZFS Pool Corruption May Occur With Sun Cluster 3.2 Running
Solaris 10 with patch 137137-09 or 137138-09
It doesn't look like either of those matches this problem.
I looked at your ticket in the system and it's the only one I could find
with this precise error and patch 137138-09. If this comes up again
with you or another customer, we'd obviously have to investigate further
prior to any OS reload.
But at least it's now recorded here on Usenet.
137138-09 has more problems than that.
According to smpatch on Solaris 10 x86 U5, the following patches are
needed and all require 137138-09:
139573-01
139499-01
139484-01
139552-01
139580-01
But it fails to install with a weird error about not being able to mount
user's home directories to a path under /var/run.
--
Jim Pennino
Remove .spam.sux to reply.
Hi Horst,
I ran into exactly the same condition as you did, and I got out of
it with some sweat and swearing. ;)
For everyone else who reads this while trying to find out what has
happened to their system, here is how I did it (it may not
work for you, but it's a chance):
My system (a Sun Fire X4100, so it's real Sun hardware and there is no
chance for Sun to say "not supported" ;) ) hung just as Horst described.
My root filesystem is about 52 GB and I don't use MPxIO.
* I switched to the ILOM and reset the hanging system (power off/on
would work as well).
* Every new boot into the standard Solaris system made the system crash.
* So I started a "Solaris failsafe" session from GRUB. The system
told me that no installed Solaris was found, because I have metadevices
from SVM (SDS root disk mirroring) that it can't handle. So I mounted
the primary mirror of my boot device (c0t2d0s0) on /mnt:
# mount -F ufs /dev/dsk/c0t2d0s0 /mnt
# bootadm update-archive -R /mnt
# sync
# umount /mnt
# reboot
Next boot (not failsafe!): (still the old kernel 127128-11 was
displayed in the boot-screen)
msg:
files in / differ from the boot archive: ....
action:
# svcadm clear system/boot-archive
State: system crash while booting up
Next boot (137138-09) !!!! (new Kernel found!)
msg:
....
WARNING: The following files in / differ from the boot archive: ....
(many more than before!!!!)
# svcadm clear system/boot-archive
State: error due to filesystem corruption.
action: fsck (3 times, until no more errors were displayed)
# fsck -y -F ufs /dev/rdsk/c0t2d0s0
after that (that's a bit of screen logging here):
# mount
/ on /pci@0,0/pci1022,7450@2/pci1000,3060@3/sd@2,0:a
# svcadm clear system/boot-archive
svcadm: Instance "svc:/system/boot-archive:default" is not in a maintenance or degraded state.
# svcs -xv svc:/system/filesystem/usr:default
(read/write root file systems mounts)
 State: maintenance since Tue Dec 16 12:18:52 2008
Reason: Start method exited with $SMF_EXIT_ERR_FATAL.
   See: http://sun.com/msg/SMF-8000-KS
   See: /etc/svc/volatile/system-filesystem-usr:default.log
Impact: 68 dependent services are not running:
.....(68 services are listed here)
## this was because of the corrupted root-filesystem before, so let's
clear this maintenance-state now
# svcadm clear system/filesystem/usr
Configuring devices.
Loading smf(5) service descriptions: 4/4
No pending job.
Reading ZFS config: done.
SYSTEMNAME console login: root
Password: *************
# showrev -p | grep 137138
Patch: 137138-09 Obsoletes: 118997-10 ....
=> Now another test-reboot -> successful, no crash
So my system is up to date and running again.
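The "svcadm clear" steps above can be wrapped in a small check, since clearing a service that is not in maintenance only produces the error shown in the log. This is a rough sketch, not a tested recovery tool: the function name clear_if_stuck is made up for illustration, and it assumes a POSIX-style shell (ksh or bash) with the standard svcs and svcadm commands in the PATH.

```shell
#!/bin/sh
# Sketch only: clear an SMF service, but only when it is actually in the
# maintenance or degraded state. clear_if_stuck is a hypothetical helper name.

# clear_if_stuck FMRI
clear_if_stuck() {
    # "svcs -H -o state" prints just the state column, without a header line
    state=$(svcs -H -o state "$1") || return 1
    case "$state" in
        maintenance|degraded)
            svcadm clear "$1"
            ;;
        *)
            echo "$1 is $state, nothing to clear"
            ;;
    esac
}

# Example calls for the two services from the log above:
#   clear_if_stuck system/boot-archive
#   clear_if_stuck system/filesystem/usr
```

Whether clearing is enough still depends on the underlying cause; in my case the root filesystem had to be repaired with fsck first.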
What seemed strange was the mirror disk of root. After a minute it
told me it was 77% resynced (52 GB!) and a few seconds later the resync
was finished. I have never known an SVM mirror to do an incremental
resync after a crash, so I broke up the mirror and started a normal sync
by attaching the mirror device again (this took more than one hour!).
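One way to read "breaking up the mirror and attaching it again" is a metadetach followed by a metattach, which makes SVM resync the whole submirror. The post does not give the metadevice names, so d0 (mirror) and d20 (submirror) below are purely hypothetical, as are the helper names run and full_resync. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch only: force a full resync of one SVM submirror by detaching and
# re-attaching it. Device names are supplied by the caller.
# DRY_RUN=1 (the default) prints the commands instead of running them.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# full_resync MIRROR SUBMIRROR
full_resync() {
    run metadetach "$1" "$2" || return 1   # take the submirror out of the mirror
    run metattach "$1" "$2" || return 1    # attach it again, which starts a resync
    run metastat "$1"                      # show the resync progress
}

# Example with hypothetical names (mirror d0, submirror d20):
#   full_resync d0 d20
```

Check the real names with metastat before running anything with DRY_RUN=0.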
Happy repairing and:
Sun-Support, replace this buggy patch ASAP!
....
user60...@spamcorptastic.com writes:
>Hi Horst,
>as I went exactly into the same condition as you did - I got out of
>that with some sweat and swearing. ;)
>[...]
I also experienced the same problem on a SPARC system with corresponding
patch #137137-09. The patch looks like it will install properly but then
the system fails to boot. I had a mirrored root filesystem on c0t0d0s0 and
c0t1d0s0 and these were the steps I took to repair things. These are
the instructions for a SPARC system with OBP, not GRUB.
1: Upon reboot, when the system comes up with the error messages, enter
maintenance mode using the root password and then immediately halt the
system:
telinit 0
2: Reboot in failsafe mode and login with the root password:
boot -F failsafe
3: You should get instructions telling you what to do. If you have a mirrored
system they instruct you to fix the "primary" side of the mirror and
reboot - presumably with md resyncing the other disk to match. However,
this did not work for me, so I had to repeat the process and perform the
operation on both disks [substitute your own disks here]:
mount /dev/dsk/c0t0d0s0 /mnt
bootadm update-archive -R /mnt
umount /mnt
mount /dev/dsk/c0t1d0s0 /mnt
bootadm update-archive -R /mnt
umount /mnt
halt
4: Reboot the system. Since I had installed a bunch of patches along with
this one, I decided to do a "boot -r" just to be on the safe side. However,
I doubt that this is necessary. Surprisingly, the system came up clean
and the mirrors didn't even need to resync! I rebooted a couple of times
after that and everything seems to be OK.
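The repair of both mirror halves described above can be written as a small loop, to be run from the failsafe shell. This is only a sketch under assumptions: the helper names run and update_archives are invented here, the slice names must be replaced with your own, and by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Sketch only: rebuild the boot archive on each half of a mirrored root.
# DRY_RUN=1 (the default) prints the commands instead of running them.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# update_archives SLICE...   e.g. update_archives c0t0d0s0 c0t1d0s0
update_archives() {
    for slice in "$@"; do
        run mount "/dev/dsk/$slice" /mnt || return 1
        if ! run bootadm update-archive -R /mnt; then
            run umount /mnt            # do not leave /mnt mounted on failure
            return 1
        fi
        run umount /mnt || return 1
    done
}

# Example with the disks from this post:
#   update_archives c0t0d0s0 c0t1d0s0
```

Review the printed commands first, then set DRY_RUN=0 and finish with "halt" as in the steps above.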
I agree that Sun should have immediately pulled these patches after the first
report. What are they waiting for?
Regards,
--
Jeffery Small
Jeffery, it seems you did not have to resync because you wrote the
new boot block on both disks that make up your boot metadevices ;)
Looking at SunSolve again, I found that the problem described in
http://sunsolve.sun.com/search/document.do?assetkey=1-1-6772822-1
seems to affect NOT ONLY MPxIO and FC disks! So installing 125556-01
(for x86; 125555-01 for SPARC) before installing this 13713[78]-09
patch might generally avoid the situation we ran into.
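A pre-check along those lines could look like the sketch below. It is untested against a real patch run and rests on assumptions: the helper names prereq_for_arch and has_patch are made up, the patch IDs are simply the ones named in this thread, and "showrev -p" is assumed to print lines starting with "Patch: <id>-<rev>" as in the output quoted earlier.

```shell
#!/bin/sh
# Sketch only: check for the prerequisite patch before adding the kernel patch.

# prereq_for_arch ARCH : print the prerequisite patch ID for "uname -p" output
prereq_for_arch() {
    case "$1" in
        i386)  echo 125556 ;;
        sparc) echo 125555 ;;
        *)     return 1 ;;
    esac
}

# has_patch ID : read "showrev -p" style output on stdin and succeed
# if the patch ID is listed in any revision
has_patch() {
    grep "^Patch: $1-" >/dev/null
}

# Example use on a live system:
#   id=$(prereq_for_arch "$(uname -p)") || exit 1
#   if showrev -p | has_patch "$id"; then
#       echo "prerequisite $id found, kernel patch can be added"
#   else
#       echo "install $id first"
#   fi
```

This only tells you whether the prerequisite is present; whether it really prevents the hang is still just my guess from the SunSolve document.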
Regards,
Burkard