Here is what I did: I installed Solaris 10 update 8 from DVD onto an
x86 system with a pair of 250GB SATA disks. After installing the root
filesystem onto a UFS disk slice, I used Solaris Volume Manager to
create a mirror metadevice. I also created a mirrored zpool on
another pair of disk slices.
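For reference, the mirror setup was roughly the following; the cXtXdX
device names here are only illustrative, not necessarily the ones on
my disks:
# metadb -a -f -c 3 c1t0d0s7 c1t1d0s7    (state database replicas on both disks)
# metainit -f d11 1 1 c1t0d0s0           (submirror on the existing root slice)
# metainit d12 1 1 c1t1d0s0
# metainit d10 -m d11                    (one-way mirror for /)
# metaroot d10                           (updates /etc/vfstab and /etc/system)
# metattach d10 d12                      (attach second submirror after rebooting onto d10)
# zpool create tank mirror c1t0d0s5 c1t1d0s5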
I used lucreate to make a copy of the s10u8 boot environment onto
another disk slice without problems. I was able to run luactivate and
boot into it without problems.
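That first round was essentially the following (BE names as I
remember them; d40 is the metadevice the copy went onto):
# lucreate -c s10u8a -n s10u8b -m /:/dev/md/dsk/d40:ufs
# luactivate s10u8b
# init 6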
Now, I want to use Live Upgrade to upgrade to Solaris 10 update 9.
I replaced the SUNWlu packages by running the
Solaris_10/Tools/Installers/liveupgrade20 script on the s10u9 DVD.
That script says I am supposed to install some patches to make Live
Upgrade work, but I can't find a clear list of which patches are
needed. The script says to find infodoc 206844 on SunSolve, but that
doesn't seem to exist anymore. Is it 1004881.1? But that was last
updated 2009-11-12 (before s10u9 was released?).
I installed these patches (with patchadd; a sketch follows the list):
119255-76 SunOS 5.10_x86: Install and Patch Utilities Patch
119253-33 SunOS 5.10_x86: System Administration Applications Patch
119535-19 SunOS 5.10_x86: Flash Archive Patch
120200-16 SunOS 5.10_x86: sysidtool Patch
121431-54 SunOS 5.8_x86 5.9_x86 5.10_x86: Live Upgrade Patch
124629-15 SunOS 5.10_x86: CD-ROM Install Boot Image Patch
124631-44 SunOS 5.10_x86: System Administration Applications, Network, and C
140915-02 SunOS 5.10_x86: cpio patch
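For completeness, the package replacement and patching looked roughly
like this (DVD mount point and patch directory are illustrative):
# cd /cdrom/cdrom0/Solaris_10/Tools/Installers
# ./liveupgrade20                     (replaces the SUNWlu* packages with the u9 versions)
# cd /var/tmp/10_x86_patches          (wherever the downloaded patches were unpacked)
# patchadd -M . 119255-76 119253-33 119535-19 120200-16 \
      121431-54 124629-15 124631-44 140915-02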
Now, when I try to run lucreate, it just hangs; here is an excerpt of
"ptree root" output:
11387 lucreate -n s10u8c -m /:/dev/md/dsk/d40:ufs
11388 /etc/lib/lu/plugins/lupi_zones plugin
11389 /etc/lib/lu/plugins/lupi_svmio plugin
11391 /etc/lib/lu/plugins/lupi_bebasic plugin
11400 /sbin/sh /usr/lib/lu/lucreate -n s10u8c -m /:/dev/md/dsk/d40:ufs
12012 /sbin/sh /usr/lib/lu/lumake -b 11400 -c -s s10u8a -n s10u8c -i /etc/lu/INODE.3
12383 /sbin/sh /usr/lib/lu/lupop -i /etc/lu/ICF.3 -p s10u8a -s /tmp/.liveupgrade.4816
12541 /sbin/sh /usr/lib/lu/lucopy -i /etc/lu/ICF.3 -c s10u8a -p /etc/lu/ICF.1 -z /tmp
12909 /sbin/sh /usr/lib/lu/lucopy -i /etc/lu/ICF.3 -c s10u8a -p /etc/lu/ICF.1 -z /tmp
12912 /bin/awk -F: { if ($2 == "/") { printf("%s %s\n", $3, $4); }
12913 /sbin/sh /usr/lib/lu/lumk_iconf s10u8b
12927 /usr/lib/lu/lumount -f -Z s10u8b
12929 /etc/lib/lu/plugins/lupi_svmio plugin
12931 /etc/lib/lu/plugins/lupi_zones plugin
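(The tree above is just "ptree root", trimmed to the lucreate branch.
Attaching truss to the stuck children is one way to see where they
are sitting, e.g.:)
# ptree `pgrep -o lucreate`           (same branch, without the rest of root's processes)
# truss -f -p 12927                   (e.g. the lumount child from the tree above)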
I interrupt it with Ctrl-C. When I then try ludelete, it fails with:
Assertion failed: ("attempt to free unallocated memory", *ptrKey == (unsigned long long)_lu_malloc), file lu_mem.c, line 365
Even "init 6" fails with the same message.
But, if I "zpool export" the pool so that no zfs filesystem is
present, then lucreate and the other LU commands work. So, I think
the zpool/zfs is the cause of this problem.
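Concretely, this sequence completes for me, while the same lucreate
with tank imported hangs:
# zpool export tank
# lucreate -n s10u8c -m /:/dev/md/dsk/d40:ufs
# zpool import tank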
If I put back the three SUNWlu packages from the s10u8 DVD, then I am
able to lucreate without problems (with a mounted zfs filesystem). I
was then able to luupgrade to s10u9 (using the s10u8 Live Upgrade
programs).
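In case it is useful to someone else, rolling back to the s10u8 LU
bits and doing the upgrade went roughly like this (BE name and mount
points illustrative; SUNWlucfg/SUNWlur/SUNWluu are the three LU
packages):
# pkgrm SUNWluu SUNWlur SUNWlucfg           (remove the u9 versions)
# pkgadd -d /cdrom/cdrom0/Solaris_10/Product SUNWlucfg SUNWlur SUNWluu
# lucreate -n s10u9 -m /:/dev/md/dsk/d40:ufs
# luupgrade -u -n s10u9 -s /cdrom/cdrom0    (with the s10u9 DVD mounted)
# luactivate s10u9
# init 6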
The s10u9 environment booted, but there were problems:
$ svcs -xv
svc:/milestone/multi-user:default (multi-user milestone)
State: offline since Thu Oct 14 12:03:42 2010
Reason: Start method is running.
See: http://sun.com/msg/SMF-8000-C4
See: man -M /usr/share/man -s 1M init
See: /var/svc/log/milestone-multi-user:default.log
Impact: 7 dependent services are not running:
svc:/system/boot-config:default
svc:/milestone/multi-user-server:default
svc:/application/autoreg:default
svc:/system/basicreg:default
svc:/system/zones:default
svc:/application/stosreg:default
svc:/application/cde-printinfo:default
$ tail -4 /var/svc/log/milestone-multi-user:default.log
[ Oct 14 12:03:42 Executing start method ("/sbin/rc2 start") ]
Executing legacy init script "/etc/rc2.d/S10lu".
last activated environment: <s10u8a>.
Assertion failed: ("attempt to free unallocated memory", *ptrKey ==
(unsigned long long)_lu_malloc), file lu_mem.c, line 365
I'll submit a support request to Oracle, but any advice in the
meantime?
Thanks
> $ tail -4 /var/svc/log/milestone-multi-user:default.log
>
> [ Oct 14 12:03:42 Executing start method ("/sbin/rc2 start") ]
> Executing legacy init script "/etc/rc2.d/S10lu".
> last activated environment: <s10u8a>.
> Assertion failed: ("attempt to free unallocated memory", *ptrKey ==
> (unsigned long long)_lu_malloc), file lu_mem.c, line 365
>
> I'll submit a support request to Oracle, but any advice in the
> meantime?
I had exactly the same problem with a Live Upgrade from S10 U7 to S10
U9. I'm not completely sure whether the mere existence of the zpool
is the problem or its name. I suspected that the "-" in the pool name
was the culprit, but it could well be that any pool would cause the
same problem. We had a pool called j4500-01, and I worked around the
problem by replacing /sbin/zfs with a wrapper script which simply
filters that pool name out of the zfs output:
#!/bin/sh
# Wrapper around the real zfs binary (moved to /sbin/zfs.bin) that hides
# the j4500-01 pool from Live Upgrade.  Also handles being invoked via a
# symlink from /etc/fs/zfs/*mount, in which case the link name is passed
# on as the subcommand.
base=`basename $0`
case $base in
    zfs)
        /sbin/zfs.bin "$@" > /tmp/zfs.out.$$
        ;;
    *mount)
        /sbin/zfs.bin $base "$@" > /tmp/zfs.out.$$
        ;;
esac
status=$?
grep -v j4500-01 /tmp/zfs.out.$$
rm -f /tmp/zfs.out.$$
exit $status
It's a bit complicated because it is sometimes called directly and
sometimes via a symlink from /etc/fs/zfs/*mount, so the wrapper has
to handle both names.
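To put it in place, the real binary gets moved aside and the wrapper
dropped in, roughly as follows (check what /etc/fs/zfs contains on
your system before touching it):
# mv /sbin/zfs /sbin/zfs.bin
# cp /var/tmp/zfs-wrapper /sbin/zfs       (the script above)
# chmod 755 /sbin/zfs
# ls -l /etc/fs/zfs                       (mount/umount there end up calling /sbin/zfs)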
I meant to open a case with Oracle, but haven't gotten around to it yet.
Hope this helps.
Rainer
--
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
Thanks for the info. But, I named my zpool "tank" and used "zfs
create tank/scratch" for two zfs filesystems in my example above, so
no hyphen in a zfs/zpool name for me.
> Thanks for the info. But, I named my zpool "tank" and used "zfs
> create tank/scratch" for two zfs filesystems in my example above, so
> no hyphen in a zfs/zpool name for me.
You could still try the script to filter your pool name from the zfs
list output. If this helps, we know that the pool name is irrelevant,
which would speak volumes about the quality of LU testing at Oracle ;-(
Have you tried migrating to ZFS root first?
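Something along these lines (pool name illustrative; the slice needs
an SMI label):
# zpool create rpool c1t0d0s4
# lucreate -n s10u8-zfs -p rpool
# luactivate s10u8-zfs
# init 6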
--
Ian Collins
I've seen this exact issue *with* ZFS root, so this won't help.
Did you have more than one zpool on your system? Or did your
/sbin/zfs wrapper script always output an empty list?
I've upgraded systems with more than one pool, but never one with a
hyphen in the name. So if there is a problem with pool names, it's
with the hyphen.
--
Ian Collins
Have you tried exporting the pool first?
--
Ian Collins
> Did you have more than one zpool on your system? Or did your /sbin/
Sure, the rpool and the data (j4500-01) pool.
> zfs wrapper script always output an empty list?
No, it still emitted the rpool info, otherwise lu wouldn't have worked.
OK, thanks again for all the good advice.
The root filesystem is on UFS. I also use zpool/zfs for non-OS
files. Some systems use Solaris zones/containers. The zone roots are
sometimes on zfs.
I could "zpool export ..." prior to live upgrade and then import the
zpools again once I have booted into s10u9. But, then the zones would
not work. (I guess I could do a "zone upgrade on attach" and hope
that upgrades the zone.) But I would rather use live upgrade to do
what it is supposed to do.
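If I did go the export route, the zone handling would presumably be
something like this (zone name illustrative):
# zoneadm -z myzone detach            (before activating the new BE)
  ... boot into s10u9, zpool import the data pools, then:
# zoneadm -z myzone attach -u         ("update on attach")
# zoneadm -z myzone boot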
I think LU has become more fragile and buggy as more Solaris updates
have been released (the opposite of what should be happening as
Solaris 10 "matures"). I get that LU now has to deal with UFS, ZFS,
zones, etc. But I hate having to guess which patches I am supposed
to install before running LU, and the suspense of wondering whether
it will work or fail with a random shell error message or, in this
case, a pointlessly cryptic message.
Hi Doug,
The Superblock message is definitely coming from SAM-QFS, but this
message emitted by lu* and friends:
Assertion failed: ("attempt to free unallocated memory",
is CR 6990618, just recently filed. It was difficult to reproduce,
but we hope to have some information soon.
I've tested hundreds of LUs from Solaris 10 10/08, 5/09, 10/09, and
9/10 on lab systems with low memory and existing ZFS storage pools.
Never saw this, and I've seen a lot.
To reproduce, I just started with an x86 S10U8 full+OEM install from
the DVD. The root filesystem and swap are on SVM mirrored slices; a
zfs filesystem is on a zpool mirrored on slices. Then I installed the
patches I mentioned in my first message and ran the "liveupgrade20"
script from the S10U9 DVD to replace the three SUNWlu* packages.
Finally, I ran lucreate to install a boot environment onto another
mirrored SVM UFS filesystem. The lucreate hangs (I let it run for
over an hour; when I ran the lucreate from S10U8, without the Live
Upgrade patch 121431-54, it finished OK in 20 minutes). After
aborting the lucreate, ludelete and init 6 fail with "Assertion
failed..."
This zfs config consistently triggers the LU problems on s10u9:
$ zfs list -r
NAME          USED  AVAIL  REFER  MOUNTPOINT
tank          122K  19.2G    22K  none
tank/scratch   21K  19.2G    21K  /scratch
I set that up with these commands:
# zfs set mountpoint=none tank
# zfs set mountpoint=/scratch tank/scratch
That will cause lucreate, ludelete, init 6, etc. to fail, either by
hanging or with "Assertion failed" messages.
The workaround is to set a non-null mountpoint for the zfs dataset
that corresponds to the zpool (tank, in this case).
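In other words, something like the following (the /tank path is just
an example of a non-null mountpoint):
# zfs set mountpoint=/tank tank       (any mountpoint other than "none")
# zfs get -r mountpoint tank          (verify nothing is left at mountpoint=none)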
I bet Rainer's "j4500-01" zfs had mountpoint=none which triggered the
problem.
This LU bug is a regression since the LU that came with s10u8 did not
have a problem with zfs mountpoints.
Very nice sleuthing. Unfortunately, with zvols there is no option to
set a mountpoint to avoid this problem.
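For example (names illustrative), a volume has no mountpoint to
adjust:
# zfs create -V 10g tank/vol01            (a zvol, e.g. for swap or iSCSI backing)
# zfs set mountpoint=/vol01 tank/vol01    (expected to fail: mountpoint does not apply to volumes)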