I have a small problem with failovers in an HA
Linux cluster. Here is my configuration:
Two machines, Sunset and Horizon, replicate a block device with DRBD
(RAID 1 over IP). Heartbeat runs on top of this, so when Horizon, the
master, fails, Sunset takes over. The replicated block device is
exported as an iSCSI target. A third machine, called client, runs the
open-iscsi initiator. The client attaches to the iSCSI target through a
floating IP address that Heartbeat controls and moves to Sunset if
Horizon goes down.
Here is the problem: the iSCSI device is attached and I can use the
target as I want. I simulate a failover: Horizon goes down, Sunset
takes over, and Heartbeat starts iscsitarget on it. From that point on,
any further read or write on the iSCSI device on the client returns an
I/O error.
So I'd like to know if someone has already tried such a configuration
and if it was successful.
Any help would be appreciated.
Thanks in advance,
Nicolas
> > From that point on, any further read or write on the iSCSI
> > device on the client returns an I/O error.
> > So I'd like to know if someone has already tried such a
> > configuration and if it was successful.
> > Any help would be appreciated.
> > Thanks in advance,
>
>
> > To complete my previous message:
> > It looks as if open-iscsi doesn't see that the connection went
> > down for a few seconds (the time Heartbeat needs to switch the
> > nodes and the services), and so open-iscsi doesn't retry
> > connecting to the Heartbeat floating IP.
>
> It should. What's the "heartbeat floating IP"?
Heartbeat can make an IP address float between the nodes of the
cluster. For example, my iSCSI portal is at 192.168.5.3, so every
initiator connects to this IP. But this IP can move if the master node
goes down. As the nodes hold the same consistent data, the move is
transparent for the initiator.
>
> > Is there maybe a parameter to tune so that the initiator
> > tries to reconnect?
>
> The parameter is called reopen_max, default 32.
>
> Send us iscsid trace and (if possible) ethereal trace. What rev# are
> you using?
In fact I managed to make it work. I ran
echo 1 > /sys/bus/scsi/devices/<iscsi_controler_number>/rescan
and then it was OK.
I will try tuning reopen_max too.
Thanks for the advice.
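
For anyone who wants to script the rescan step after a failover, here is a rough sketch of it as a small shell function. The function name rescan_scsi_device and the SYSFS_ROOT override are made up for illustration (SYSFS_ROOT only exists so the function can be tried against a scratch directory), and the device id is whatever entry shows up under /sys/bus/scsi/devices on the client.

```shell
#!/bin/sh
# Sketch only: ask the kernel to rescan one SCSI device through sysfs.
# SYSFS_ROOT is a hypothetical override for trying this out safely;
# on a real system it is left unset and defaults to /sys.

rescan_scsi_device() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys}"
    node="$root/bus/scsi/devices/$dev/rescan"

    # Bail out if the device has no rescan attribute at this path.
    if [ ! -e "$node" ]; then
        echo "no rescan attribute at $node" >&2
        return 1
    fi

    # Same operation as the manual command: write 1 to the rescan file.
    echo 1 > "$node"
}
```

It would be called on the client as root with the id of the iSCSI device, for example rescan_scsi_device "<iscsi_controler_number>" with the placeholder replaced by the real entry name. This only automates the manual workaround; it does not explain why the initiator failed to recover by itself.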