iscsi over HAST backed storage partial success

Kevin Day

Mar 9, 2010, 6:03:41 PM
to freeb...@freebsd.org

I'm running istgt (an iSCSI target) on HAST-backed storage. For the most part, it works really well. I have ucarp running to move the IP address that istgt is bound to, and I modified the ucarp scripts to start or stop istgt depending on which side is the master. If I shut down the primary, the secondary takes over and all seems well.
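[Editor's sketch] The ucarp hook arrangement described above might look roughly like the following. This is a minimal sketch, not Kevin's actual scripts: the resource name `iscsi1` is taken from the log lines below, while the rc.d path, the `failover_commands` helper, and the idea of switching the HAST role from the same hook are all assumptions made for illustration.

```python
import subprocess

# Assumptions: resource name taken from the hastd log lines; the rc.d path
# is where a ports-installed istgt would typically live.
RESOURCE = "iscsi1"
ISTGT_RC = "/usr/local/etc/rc.d/istgt"


def failover_commands(event, resource=RESOURCE):
    """Return the commands to run, in order, for a ucarp "up" or "down" event."""
    if event == "up":
        # Becoming master: promote HAST first so the provider exists,
        # then start the iSCSI target on top of it.
        return [
            ["hastctl", "role", "primary", resource],
            [ISTGT_RC, "onestart"],
        ]
    if event == "down":
        # Losing master: stop the target before demoting the storage.
        return [
            [ISTGT_RC, "onestop"],
            ["hastctl", "role", "secondary", resource],
        ]
    raise ValueError("unknown ucarp event: %r" % (event,))


def run(event, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for cmd in failover_commands(event):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run("up")` from the ucarp up-script and `run("down")` from the down-script (with `dry_run=False`) would apply the ordering; the point of the sketch is only that the target is started after, and stopped before, the storage role changes.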

However, if I reboot the secondary, the primary starts freezing up for long periods:

Mar 9 22:46:27 cs04 hastd: [iscsi1] (primary) Unable to r: Socket is not connected.
Mar 9 22:46:27 cs04 hastd: [iscsi1] (primary) Unable to co: Connection refused.
Mar 9 22:46:42 cs04 last message repeated 3 times
Mar 9 22:46:53 cs04 istgt[14298]: ABORT_TASK
Mar 9 22:47:35 cs04 last message repeated 3 times
Mar 9 22:48:02 cs04 hastd: [iscsi1] (primary) Unable to co: Operation timed out.
Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(45748), OP=0x2a, ElapsedTime=74 cleared
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c: 640:istgt_iscsi_write_pdu: ***ERROR*** iscsi_write() failed (errno=32)
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:3327:istgt_iscsi_op_task: ***ERROR*** iscsi_write_pdu() failed
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:3867:istgt_iscsi_execute: ***ERROR*** iscsi_op_task() failed
Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:4337:worker: ***ERROR*** iscsi_execute() failed
Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(490802), OP=0x2a, ElapsedTime=73 cleared
Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(28387), OP=0x2a, ElapsedTime=73 cleared
Mar 9 22:48:14 cs04 istgt[14298]: ABORT_TASK
Mar 9 22:48:52 cs04 last message repeated 2 times
Mar 9 22:49:22 cs04 hastd: [iscsi1] (primary) Unable to co: Operation timed out.

As soon as the secondary comes back online, everything starts behaving again and all is well.

Is this expected behavior at this point, or should hastd not block like this?

-- Kevin

Pawel Jakub Dawidek

Mar 10, 2010, 12:46:12 PM
to Kevin Day, freeb...@freebsd.org

Of course, it shouldn't block like this. There is a separate thread
responsible for reconnecting, which shouldn't interact with the I/O threads.
I'll try to reproduce and will let you know.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
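[Editor's sketch] The design Pawel describes, where reconnection happens in its own thread and the I/O path never waits on it, can be illustrated with a small Python model. This is not hastd's code (hastd is written in C), and the `Replica` class, its method names, and the `connect` callable are all hypothetical; it only shows the general pattern under those assumptions.

```python
import threading


class Replica:
    """Toy model: writes always go local; the remote copy is best-effort."""

    def __init__(self, connect, retry_interval=1.0):
        # `connect` is a hypothetical callable that returns an object with
        # a send() method, or raises OSError if the peer is unreachable.
        self._connect = connect
        self._retry_interval = retry_interval
        self._lock = threading.Lock()
        self._conn = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._reconnect_loop, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _try_connect_once(self):
        with self._lock:
            if self._conn is not None:
                return
        try:
            conn = self._connect()  # may block, but only in this thread
        except OSError:
            return
        with self._lock:
            self._conn = conn

    def _reconnect_loop(self):
        while not self._stop.is_set():
            self._try_connect_once()
            self._stop.wait(self._retry_interval)

    def write(self, data, local):
        """Write locally, then to the peer if connected. Never tries to connect."""
        local.append(data)
        with self._lock:
            conn = self._conn
        if conn is None:
            return False  # degraded mode: the peer will need a resync later
        try:
            conn.send(data)
            return True
        except OSError:
            with self._lock:
                if self._conn is conn:
                    self._conn = None  # hand the problem to the reconnect thread
            return False
```

The lock is held only long enough to read or swap the connection reference, so a slow or refused connection attempt in the reconnect thread cannot stall `write()`. The freezes in the original report are the kind of symptom that would appear if that separation broke down.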

Pawel Jakub Dawidek

Mar 10, 2010, 3:57:11 PM
to Kevin Day, freeb...@freebsd.org
On Tue, Mar 09, 2010 at 05:03:41PM -0600, Kevin Day wrote:
>

Could you try the following patch?

http://people.freebsd.org/~pjd/patches/hastd_primary.c.patch

Kevin Day

Apr 6, 2010, 1:34:10 PM
to Pawel Jakub Dawidek, freeb...@freebsd.org


Sorry for the long delay.

This does seem to fix that problem, yes. :)

-- Kevin
