For the non-exp this would be the best way to do it, Phil, as long as the drives
belong to the rootvg volume group. And guess what: if you lose a drive, the
system won't crash. And yes, this would be software mirroring.
Good Luck!!
One thing to watch out for: if your dump device is hd6, be aware that the dump
device cannot be mirrored. You have to create a separate LV for the dump device.
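(A minimal sketch of what that might look like - the LV name hd7 and the size of
8 LPs are just examples, not something from the original post:
# mklv -y hd7 -t sysdump rootvg 8     <- create a separate dump LV in rootvg
# sysdumpdev -P -p /dev/hd7           <- make it the primary dump device permanently
Size it according to whatever "sysdumpdev -e" estimates for your system.)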
--
Andy Carlson |\ _,,,---,,_
an...@andyc.carenet.org ZZZzz /,`.-'`' -. ;-;;,_
BJC Health System |,4- ) )-,_. ,\ ( `'-'
St. Louis, Missouri '---''(_/--' `-'\_)
Cat Pics: http://andyc.dyndns.org/animal.html
Andy Carlson <an...@andyc.carenet.org> wrote in message
news:ov5nv7...@andyc.carenet.org...
Quorum is nothing more than a collection of votes for all disks in a
volume group. Each VGDA and VGSA has a vote. If GREATER than 50%
(stated as 51% or more in the manual) is maintained in the VG, quorum is
maintained. If 50% or less of the VGDA/VGSAs are available, quorum is lost.
The breakdown of how a disk gets its votes is as follows:
1 disk VG - 1 disk has 2 VGDA and 2 VGSA
2 disk VG - 1 disk has 2 VGDA/2 VGSA and 1 disk has 1 VGDA/1 VGSA
3+ disk VG - EACH disk has 1VGDA/1VGSA
That means for a 1 disk VG, the loss of 1 disk is loss of quorum. For a
3+ disk VG, loss of 1 disk is NOT loss of quorum (only loss of more than
50% is loss of quorum). And for a 2 disk VG (the way many rootvgs are
mirrored) it depends on which disk is lost. If the disk with 2
VGDA/VGSAs is lost, quorum is lost. If the disk with 1 VGDA/VGSA is
lost, quorum is maintained.
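(If you want to see how the VGDAs are spread across your disks, something like
this should show it - hdisk0 is just an example name:
# lsvg rootvg                              <- look at the "VG DESCRIPTORS" and "QUORUM" fields
# lspv hdisk0 | grep -i "VG DESCRIPTORS"   <- VGDA count held on this particular disk
# lsvg -p rootvg                           <- state of each PV in the VG)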
Now, by default quorum is enabled. That means that if there is 51% or
more VGDA/VGSAs the VG will remain ONLINE and WILL be able to be varied
on and varied off. This is crucial for rootvg. To replace a disk of
rootvg, the system probably has to be brought down, rootvg varied off,
and rootvg HAS to be able to varyon again. However, if quorum is NOT
maintained, the VG will automatically varyoff and will not varyon unless
it is forced (I am not sure if rootvg can be forced). So if you have
rootvg mirrored and you only have 2 disks in rootvg, you could be in for
some trouble.
Please REMEMBER quorum is NOT related to the number of copies in a
mirror. It is only related to the number of DISKS in a VG. So if you
have rootvg with 4 disks and 2 copies, the loss of one disk is a loss of
only 25% of the VGDA/VGSAs, and rootvg will remain up.
Now, you can try to protect yourself from the 2 disk VG case by disabling
quorum (this is also desirable when mirroring across disk subsystems, where
the loss of one cabinet could mean the loss of 50% of the disks). This allows
the VG to remain online as long as AT LEAST 1 VGDA/VGSA is good and available.
However, here is the catch. To varyon a VG with quorum disabled, YOU MUST
HAVE 100% OF THE VGDA/VGSAs FOR IT TO SUCCEED!! This can be worked around
again with the force varyon flag (not sure rootvg can do this); the rough
commands are sketched below.
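(Roughly, the commands involved - "datavg" here is a made-up VG name:
# chvg -Qn datavg        <- disable quorum checking
# chvg -Qy datavg        <- re-enable it
# varyonvg -f datavg     <- forced varyon when not all VGDAs are available
For rootvg, as far as I know, the quorum change only takes effect the next time
the VG is varied on, i.e. at the next boot.)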
So for someone mirroring rootvg, I would have AT LEAST 3 disks in
rootvg. You don't necessarily need 3 copies, just at least 3 disks. And
LEAVE QUORUM ENABLED. Following the two points above (assuming your rootvg is
mirrored correctly... different subject), the system should be able to
handle a single disk failure, a system shutdown, the disk replacement, and
coming back online after the OS disk replacement. If quorum is DISABLED, you
may not get rootvg back online EVEN IF ONLY 1 DISK FAILED (100% of the VGDAs
are needed for varyon without the force flag.)
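(A rough sketch of mirroring rootvg onto a second disk - the disk names are
examples, and this assumes AIX 4.2.1 or later where mirrorvg exists; on older
levels you would use mklvcopy/syncvg per LV:
# extendvg rootvg hdisk1
# mirrorvg rootvg hdisk1            <- note: by default this turns quorum checking OFF
# bosboot -ad /dev/hdisk1           <- rebuild the boot image on the new copy
# bootlist -m normal hdisk0 hdisk1
# chvg -Qy rootvg                   <- if you want to keep quorum ON, per the advice above
)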
I hope this clears things up.
- Matt
Bill Verzal wrote:
>
> Wouldn't the system crash if quorum checking was left on ? Also, if you do
> mirror hd6 (and you CAN do it), it will just be unreadable by IBM if you
> ever need to do so. Do as it says below - create a separate dump device.
>
> Andy Carlson <an...@andyc.carenet.org> wrote in message
> news:ov5nv7...@andyc.carenet.org...
> > In article <19991022150843...@ng-fi1.aol.com>,
> > begg...@aol.com (BEggertDP) writes:
> > > Hi Phil,
> > >
> > > For the non-exp this would be the best way to do it Phil..as long as the
> drives
> > > belong to the rootvg volume group. And guess what, if you lose a drive,
> the
> > > system won't crash.. and yes, this would be software mirroring..
> > >
> >
> > One thing to watch out for - if your dump device is hd6. The dump device
> > cannot be mirrored. You have to create a different LV for the dump
> device.
--
_______________________________________________________________________
Matthew Landt - AIX and HACMP Cert. Specialist - la...@austin.ibm.com
Comments, views, and opinions are mine alone, not IBM's.
1. missing disks: If a disk dies, LVM eventually will mark it as "missing"
and will cease all I/O to it. The VG cannot be varied on if there are any
disks in the "missing" state, even if quorum is present; the force option
must be used to varyon. (If you realize that the disk is in the missing state
before the VG is taken down, and you run "chpv -v r <disk>", the disk will go
into the "removed" state and the VG can be varied on without force - see the
short sketch after item 2. Of course, replacing the disk is a better option
than putting it in the removed state, if the disk is completely dead. Forcing
a VG online will also put the missing disks into the removed state.)
2. rootvg: rootvg will varyon even if disks are in the missing state.
However, it will not if quorum is not present. Quorum checking must
be turned off for rootvg for it to stay up AND to varyon in case
of quorum loss.
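(A rough sketch of the sequence described in item 1 - datavg and hdisk3 are
made-up names:
# lsvg -p datavg         <- while the VG is still online, spot the PV marked "missing"
# chpv -v r hdisk3       <- put it in the "removed" state before the VG is taken down
After that, the VG should varyon again without the -f flag. Once the disk is
back or replaced:
# chpv -v a hdisk3       <- make it available to the VG again
# syncvg -v datavg       <- resync the stale copies)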
When the VGs are mirrored, it is always safe to replace the disks and
remirror without taking the system down. After all, the very idea behind
mirroring is to avoid taking the system down for maintenance in
case of disk failures.
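(For a mirrored non-rootvg VG, the online replacement being talked about here
could go roughly like this, assuming hot-swappable drives and a replacement disk
of at least the same size - all names are examples:
# unmirrorvg datavg hdisk4      <- drop the copies that live on the dead disk
# reducevg datavg hdisk4        <- remove the disk from the VG
# rmdev -dl hdisk4              <- delete the device definition, then swap the hardware
# cfgmgr                        <- discover the replacement disk
# extendvg datavg hdisk4        <- add it back to the VG
# mirrorvg datavg hdisk4        <- remirror onto it
The whole thing can be done with the filesystems still mounted, which is the point.)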
Clear ? :-)
So there are two different scenarios to keep straight when quorum checking is disabled.
--
______________________________________________
Ram Pandiri pan...@austin.ibm.com
My views are mine, not IBMs.
If yes, how do I implement it??
Yes, it's called RAID - RAID level 1 or 1/0, depending on who you ask.
This means less system overhead and better performance than RAID 5 under write
conditions, though it is usually slower for reads than RAID 5.
The way to implement it is to buy a disk subsystem that supports RAID 1 (1/0).
The 7135 supported this level of RAID.
- Matt
This was the BIGGEST complaint of students. Having quorum enabled means you
check quorum. If you have quorum (and it's enabled) the volume group can be
online and be brought up and down. However, if quorum is DISABLED you do NOT
check quorum. That means that as long as you have one valid disk (VGDA/VGSA)
available, the VG will remain online. However, the BAD side to this is that 100% of
a VG's VGDAs and VGSAs must be available for the varyon process WITHOUT the
force flag.
The best way I could describe it is:
1) No quorum checking when the VG is online:
	The system does NOT check quorum on a live VG. As long as
	one PV is accessible, the VG is accessible.
2) No quorum checking when trying to get the VG online:
	Quorum is NOT checked to bring a VG online. That is because
	a quorum (51% or more) is not enough to bring up a VG; 100% of
	the PVs must be available and accessible (this can be overridden
	with the force flag.)
Watch the following process on a VG with 3 disks (1 VGDA/1 VGSA per disk). Loss
of one disk is NOT loss of quorum. The LV and the filesystem log LV are mirrored
across all 3 disks. If one disk fails, this VG is GOOD. Even if 2 disks fail,
the data in this VG is still good, but there is no longer a quorum of disks available.
########################## QUORUM ENABLED #############################
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testvglog jfslog 1 3 3 open/syncd N/A
testlv jfs 100 300 3 open/syncd /testfs
# chvg -Qy testvg
# ####remove disk####
# cat /smit* > /testfs/datafile
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testvglog jfslog 1 3 3 open/stale N/A
testlv jfs 100 300 3 open/stale /testfs
# unmount /testfs
# varyoffvg testvg
# varyonvg testvg
PV Status: hdisk11 00810482436dae8d PVMISSING
hdisk12 00810482436db6f3 PVACTIVE
hdisk13 00810482436dbd0d PVACTIVE
varyonvg: Volume group testvg is varied on.
# 0516-068 lresynclv: Unable to completely resynchronize volume. Run
diagnostics if neccessary.
0516-932 /etc/syncvg: Unable to synchronize volume group testvg.
0516-068 lresynclv: Unable to completely resynchronize volume. Run
diagnostics if neccessary.
0516-932 /etc/syncvg: Unable to synchronize volume group testvg.
# mount /testfs
# cd /testfs
# ls
datafile lost+found
# #### unplug next disk ####
# cat /smit* >/testfs/datafile.2
There is an input or output error.
ksh: /testfs/datafile.2: 0403-005 Cannot create the specified file.
# lsvg -l testvg
0516-034 : Unable to access volume group device. Execute
redefinevg to build correct environment.
# unmount /testfs
# varyoffvg testvg
0516-010 lvaryoffvg: Volume group must be varied on; use varyonvg command.
0516-942 varyoffvg: Unable to vary off volume group testvg.
# varyonvg testvg
PV Status: hdisk11 00810482436dae8d PVNOTFND
hdisk12 00810482436db6f3 PVNOTFND
hdisk13 00810482436dbd0d PVINVG
0516-052 varyonvg: Volume group cannot be varied on without a
quorum. More physical volumes in the group must be active.
Run diagnostics on inactive PVs.
# #### Quorum enabled and loss of quorum required "-f" flag ####
# varyonvg -f testvg
PV Status: hdisk11 00810482436dae8d PVREMOVED
hdisk12 00810482436db6f3 PVREMOVED
hdisk13 00810482436dbd0d PVACTIVE
varyonvg: Volume group testvg is varied on.
# 0516-068 lresynclv: Unable to completely resynchronize volume. Run
diagnostics if neccessary.
0516-932 /etc/syncvg: Unable to synchronize volume group testvg.
0516-068 lresynclv: Unable to completely resynchronize volume. Run
diagnostics if neccessary.
0516-932 /etc/syncvg: Unable to synchronize volume group testvg.
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testvglog jfslog 1 3 3 closed/stale N/A
testlv jfs 100 300 3 closed/stale /testfs
########################## QUORUM DISABLED #############################
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testvglog jfslog 1 3 3 open/syncd N/A
testlv jfs 100 300 3 open/syncd /testfs
# chvg -Qn testvg
# #### REMOVE DISK ####
# cat /smit* > /testfs/datafile
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testvglog jfslog 1 3 3 open/stale N/A
testlv jfs 100 300 3 open/stale /testfs
# varyoffvg testvg
0516-012 lvaryoffvg: Logical volume must be closed. If the logical
volume contains a filesystem, the umount command will close
the LV device.
0516-942 varyoffvg: Unable to vary off volume group testvg.
# unmount /testfs
# varyoffvg testvg
# varyonvg testvg
PV Status: hdisk11 00810482436dae8d PVMISSING
hdisk12 00810482436db6f3 PVACTIVE
hdisk13 00810482436dbd0d PVACTIVE
0516-056 varyonvg: The volume group is not varied on because a
physical volume is marked missing. Run diagnostics.
# varyonvg -f testvg
PV Status: hdisk11 00810482436dae8d PVMISSING
hdisk12 00810482436db6f3 PVACTIVE
hdisk13 00810482436dbd0d PVACTIVE
varyonvg: Volume group testvg is varied on.
# 0516-068 lresynclv: Unable to completely resynchronize volume. Run
diagnostics if neccessary.
0516-932 /etc/syncvg: Unable to synchronize volume group testvg.
0516-068 lresynclv: Unable to completely resynchronize volume. Run
diagnostics if neccessary.
0516-932 /etc/syncvg: Unable to synchronize volume group testvg.
# #### there is 66% disk avail not 100%. ####
# #### Quorum was DISABLED #####
# #### varyon REQUIRED "-f" flag since 100% disk not available ####
# ### REMOVE NEXT PV ####
# mount /testfs2
mount: 0506-334 /testfs2 is not a known file system.
# mount /testfs
# cat /smit* > /testfs/datafile2
# ls /testfs/datafile2
/testfs/datafile2
# lsvg -l testvg
testvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
testvglog jfslog 1 3 3 open/stale N/A
testlv jfs 100 300 3 open/stale /testfs
# #### Quorum disabled and only 1 disk online. VG still up! ####
My logic has always been, all you really need is one good VGDA in any
LVM system. It is only because of IBM's concern for customer data
and taking the conservative approach that you needed a quorum for a
varyon. After all, while other systems might put their LVM definitions
in a file in the filesystems (why don't we call it a 'repository', for
example) and then let the administrator blow it away, requiring a
complete restore, IBM takes the conservative approach where the VGDA
is replicated across every volume in the group. That handles those
pesky I/O read errors when you can't get data from one of the VGDAs.
But all you REALLY need is one good one, and you should fly.
You mention,
> However, the BAD side to this is, 100% of
> a VG's VGDAs and VGSAs must be available for the varyon process WITHOUT the
> force flag.
But that just reinforces that you CAN force the VG on. As long as you know
you have one good one, that is the very GOOD side. Once again, IBM, taking the
conservative approach, says "if you want to vary the VG on with only one good
VGDA - I want you, the customer, to realize we don't have quorum."
That goes along with the same philosophy of why you can't create a VG with
quorum initially disabled. You have to go in with a chvg to turn it off. IBM
wants you to know that you are making a choice, and it may not be the most
secure choice (but we know, once in a row is good enough).
-------------
Norman Levin - VM/dynAmIX inc - an IBM training partner
specializing in VM and AIX education
817 421-0123 Voice - 208 955-5282 Efax
Matthew Landt wrote:
> Now, by default quorum is enabled. That means that if there is 51% or
> more VGDA/VGSAs the VG will remain ONLINE and WILL be able to be varied
> on and varied off. This is crucial for rootvg. To replace a disk of
> rootvg, the system probably has to be brought down, rootvg varied off,
> and rootvg HAS to be able to varyon again. However, if quorum is NOT
> maintained, the VG will automatically varyoff and will not varyon unless
> it is forced (I am not sure if rootvg can be forced). So if you have
> rootvg mirrored and you only have 2 disks in rootvg, you could be in for
> some trouble.
> So for someone mirroring rootvg, I would have AT LEAST 3 disks in
> rootvg. You don't necessarily need 3 copies, just at least 3 disks. And
> LEAVE QUORUM ENABLED. Following the 2 above (assuming your rootvg is
> mirrored correctly... different subject), the system should be able to
> handle a single disk failure, a system shutdown, disk replacement, and a
> system back online for OS disk replacement. If quorum is DISABLED, you
> may not get rootvg back online EVEN IF ONLY 1 DISK FAILED (100% quorum
> needed for varyon without force flag.)
Just to add some info: at least in AIX 4.2.1 and above,
"The Logical Volume Manager (LVM) always uses the -f flag
to forcibly activate (vary on) a nonquorum rootvg; this operation involves
risk. The reason for the forced
activation is that the system cannot be brought up unless rootvg is
activated. In other words, LVM makes a
last ditch attempt to activate (vary on) a nonquorum rootvg even if only a
single disk is accessible. "
From the manuals.
And when you tell it to mirror rootvg, it sets it to non-quorum by default.
So it seems that some of the confusion at least has been alleviated by OS
improvements.
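(You can check what mirroring left you with - and put quorum back if you prefer
the keep-quorum-with-3-disks approach described earlier; just a sketch based on
the behaviour described above:
# lsvg rootvg | grep -i quorum     <- a value of 1 generally means checking is disabled
# chvg -Qy rootvg                  <- re-enable it; for rootvg this takes effect at the next boot
)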