> $ show dev d
>
> Device                  Device           Error    Volume         Free  Trans Mnt
>  Name                   Status           Count     Label        Blocks Count Cnt
> $4$DIA0:      (DISK41)  HostUnavailable      0
> $4$DIA2:      (DISK42)  HostUnavailable      0
> $4$DIA3:      (DISK41)  HostUnavailable      0
> $4$DIA4:      (DISK44)  HostUnavailable      0
> $4$DIA5:      (DISK45)  HostUnavailable      0
On node CHAIN:
> $ show dev d
>
> Device                  Device           Error    Volume         Free  Trans Mnt
>  Name                   Status           Count     Label        Blocks Count Cnt
> $4$DIA0:      (VELOVX)  Online               0
> $4$DIA2:      (VELOVX)  Online               0
> $4$DIA3:      (VELOVX)  Online               0
> $4$DIA4:      (VELOVX)  Online               0
> $4$DIA5:      (VELOVX)  Online               0
> $ mount $4$dia0/override=id
> %MOUNT-F-MEDOFL, medium is offline
> $ show dev d
>
> Device                  Device           Error    Volume         Free  Trans Mnt
>  Name                   Status           Count     Label        Blocks Count Cnt
> $4$DIA0:      (VELOVX)  Online               0
Interesting that the Alpha node is not made aware that the drives are
not there, and continues to think they are "online" despite a mount
attempt having failed.
(Those drives are not only offline, but in a box, and the cabinet they
were in is partly dismantled so they are *really* offline :-)
VELOVX is MSCP-serving those DSSI disks. CHAIN - as the MSCP client -
is seeing the MSCP-served disks. There is no mechanism in the MSCP
protocol to indicate that the MSCP server has lost its connection to the
ISEs of the real DSSI disks.
Volker.
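To make the point above concrete, here is a toy model in Python of why the client keeps showing "Online". It is only a sketch: the class and method names are hypothetical and do not correspond to any OpenVMS or MSCP interface. The one assumption it encodes is the one stated above, that the server has no way to push a "device gone" notice to the client.

```python
# Toy model only: all names here are hypothetical, not OpenVMS interfaces.

class ServedDisk:
    """Server-side view of one MSCP-served disk."""
    def __init__(self):
        self.reachable = True  # can the server still reach the physical ISE?


class ClientView:
    """Client-side view: it only learns about a failure when it issues a request."""
    def __init__(self, disk):
        self._disk = disk
        self.status = "Online"

    def mount(self):
        # The error surfaces only here, when the device is actually used.
        if not self._disk.reachable:
            return "%MOUNT-F-MEDOFL, medium is offline"
        self.status = "Mounted"
        return "mounted"


disk = ServedDisk()
client = ClientView(disk)

disk.reachable = False      # server loses the DSSI disk; nothing is sent to the client
before = client.status      # client still believes the disk is online
result = client.mount()     # the failure shows up only on use
after = client.status       # status is unchanged by the failed mount

print(before)
print(result)
print(after)
```

In this model the displayed status is just cached state on the client, which would match the SHOW DEV output on CHAIN after the failed MOUNT.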
Say node 1 sees MSCP advertisements for Disk A by both nodes 2 and 3.
If node 2 loses connection to Disk A, and node 2 can't tell other nodes
that it has lost access to Disk A, how is node 1 going to know that it
must now ask node 3 for any access to Disk A, since node 2 has lost its
connection to it?
JF,
if the MSCP device is served by 2 nodes, node 1 will know about that.
During MOUNT, it will try both paths to the disk via node 2 and also
via node 3. If one path works, it will be able to mount the disk.
Volker.
> if the MSCP device is served by 2 nodes, node 1 will know about that.
> During MOUNT, it will try both paths to the disk via node 2 and also
> via node 3. If one path works, it will be able to mount the disk.
But how does node 1 learn that node 2, via which it had mounted the drive,
has lost connection to Disk A? (So that it can fail over to node 3 as MSCP
server for that disk.)
>Volker Halle wrote:
Node 1 DUDRIVER will issue an MSCP PACKACK (I forget the exact name of the
command at that level) to Node 2. It will fail, and Node 1 will issue
another PACKACK to Node 3, where it will hopefully work if the device is
valid from there. As Volker mentioned, there isn't any provision in MSCP
to say "this device has gone away" asynchronously; an error status is
returned only when something attempts to use the device. There is
more complication at the PACKACK level trying to bring things online.
-Mike
(former DUDRIVER/MSCP Server maintainer)
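The sequence Mike describes can be sketched roughly as follows, in Python. This is a simplified illustration and not DUDRIVER's actual logic: the function name, the node and disk names, and the `can_reach` callback (which stands in for a PACKACK-style request succeeding) are all hypothetical.

```python
# Toy model: names are hypothetical; the real driver logic is more involved.

def bring_online(disk, servers, can_reach):
    """Try each server advertising `disk` in turn and return the first that answers.

    can_reach(server, disk) stands in for a PACKACK-style request succeeding.
    Returns None if no server can reach the disk.
    """
    for server in servers:
        if can_reach(server, disk):
            return server
    return None  # no path works, so the mount or I/O fails


# Node 2 has lost Disk A; node 3 can still reach it.
reach = {("NODE2", "DISKA"): False, ("NODE3", "DISKA"): True}
path = bring_online("DISKA", ["NODE2", "NODE3"], lambda s, d: reach[(s, d)])
print(path)
```

Note that nothing here runs until the client actually tries to use the disk, which is the point made above: the failure of the path through node 2 is discovered on use, not announced.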