[Iscsitarget-devel] IET chunk + ESXI5


Yucong Sun (叶雨飞)

Mar 16, 2012, 1:32:54 AM
to iscsitarget-devel
Hi, I've seen this enough times now that I believe it is a real problem.

About once a week, one of my servers gets stuck against one of my IET boxes, endlessly retrying a specific task that never completes. Here is what it looks like when it happens; all the other boxes stay connected to the same target and keep working perfectly:

Mar 15 08:09:30 vstore-2 kernel: [656805.027625] iscsi_trgt: Abort Task (01) issued on tid:1 lun:1 by sid:3662473236120064 (Unknown LUN)
Mar 15 08:09:32 vstore-2 kernel: [656807.039642] iscsi_trgt: Abort Task (01) issued on tid:1 lun:1 by sid:3662473236120064 (Unknown LUN)
Mar 15 08:09:34 vstore-2 kernel: [656809.054532] iscsi_trgt: Abort Task (01) issued on tid:1 lun:1 by sid:3662473236120064 (Unknown LUN)
Mar 15 08:09:35 vstore-2 kernel: [656809.879225] iscsi_trgt: Abort Task (01) issued on tid:1 lun:0 by sid:3662473236120064 (Function Complete)
Mar 15 08:09:36 vstore-2 kernel: [656811.066953] iscsi_trgt: Abort Task (01) issued on tid:1 lun:1 by sid:3662473236120064 (Unknown LUN)
Mar 15 08:09:38 vstore-2 kernel: [656813.083320] iscsi_trgt: Abort Task (01) issued on tid:1 lun:1 by sid:3662473236120064 (Unknown LUN)
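
When this recurs it can help to summarise which LUN the initiator keeps aborting against. The following is a minimal Python sketch, assuming the kernel log lines keep exactly the format quoted above; the function name `tally_aborts` and the `SAMPLE` list are illustrative only and are not part of IET.

```python
import re
from collections import Counter

# Matches the iscsi_trgt "Abort Task" lines in the format quoted above.
ABORT_RE = re.compile(
    r"iscsi_trgt: Abort Task \(01\) issued on "
    r"tid:(?P<tid>\d+) lun:(?P<lun>\d+) by sid:(?P<sid>\d+) "
    r"\((?P<result>[^)]*)\)"
)


def tally_aborts(lines):
    """Count Abort Task messages per (lun, result) pair."""
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            counts[(int(match["lun"]), match["result"])] += 1
    return counts


# Three of the lines from the log excerpt above, used as sample input.
SAMPLE = [
    "Mar 15 08:09:34 vstore-2 kernel: [656809.054532] iscsi_trgt: Abort Task (01) issued on tid:1 lun:1 by sid:3662473236120064 (Unknown LUN)",
    "Mar 15 08:09:35 vstore-2 kernel: [656809.879225] iscsi_trgt: Abort Task (01) issued on tid:1 lun:0 by sid:3662473236120064 (Function Complete)",
    "Mar 15 08:09:36 vstore-2 kernel: [656811.066953] iscsi_trgt: Abort Task (01) issued on tid:1 lun:1 by sid:3662473236120064 (Unknown LUN)",
]

if __name__ == "__main__":
    for (lun, result), count in sorted(tally_aborts(SAMPLE).items()):
        print(f"lun {lun}: {count} x {result}")
```

A tally dominated by "Unknown LUN" for a LUN number that is not configured on the target points at the initiator addressing the wrong LUN rather than at a slow disk.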

And an IET restart solves the problem right away:
 
Mar 15 15:12:09 vstore-2 kernel: [682100.606834] iscsi_trgt: Removing all connections, sessions and targets

This has happened several times, so I believe it is some protocol-level hiccup between ESXi and IET. Has anyone else noticed it?

Thanks.

Emmanuel Florac

Mar 16, 2012, 5:45:06 PM
to Yucong Sun, iscsitarget-devel
On Thu, 15 Mar 2012 22:32:54 -0700, you wrote:

> This has happened several times, so I believe it is some protocol-level
> hiccup between ESXi and IET. Has anyone else noticed it?

Which version of IET are you running? This sort of problem has been
quite common in the recent past, even with VMware 4, unless you're
running the latest trunk version.

--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <efl...@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------


Yucong Sun (叶雨飞)

Mar 16, 2012, 6:56:37 PM
to Emmanuel Florac, iscsitarget-devel
I'm running trunk.

Emmanuel Florac

Mar 17, 2012, 6:49:28 AM
to Yucong Sun, iscsitarget-devel
On Fri, 16 Mar 2012 15:56:37 -0700, you wrote:

> I'm running trunk.

OK, I did a fresh install this week that should be connected to
an ESXi 5 host; I'll see if anything similar happens. Are you using
multiple targets, and do you have multiple LUNs on each target?

Yucong Sun (叶雨飞)

Mar 17, 2012, 7:35:20 PM
to Emmanuel Florac, iscsitarget-devel
I only have a single target with a single LUN. The problem is this:

for example, this target only has LUN 1 (no LUN 0), but somehow
ESXi gets stuck trying to issue commands to LUN 0. A similar
thing happens on another machine that only has LUN 0 configured, where
ESXi starts sending traffic to LUN 1. What could possibly be the
problem?

[88410.868981] iscsi_trgt: Abort Task (01) issued on tid:1 lun:0 by sid:283674003964416 (Unknown LUN)
[88412.880949] iscsi_trgt: Abort Task (01) issued on tid:1 lun:0 by sid:283674003964416 (Unknown LUN)
[88414.892930] iscsi_trgt: Abort Task (01) issued on tid:1 lun:0 by sid:283674003964416 (Unknown LUN)

Cheers.

Ross S. W. Walker

Mar 17, 2012, 7:40:37 PM
to Yucong Sun (叶雨飞), iscsitarget-devel, Emmanuel Florac
I would definitely make it LUN 0 as there cannot exist a LUN 1 without a LUN 0 per the SCSI spec.

-Ross


Yucong Sun (叶雨飞)

Mar 17, 2012, 7:48:25 PM
to Ross S. W. Walker, iscsitarget-devel, Emmanuel Florac
The problem is that I have two machines configured like this:

Target iqn.2009-10.com.outofwall:vstore1
Lun 1 Path=/dev/sdb,Type=fileio,ScsiId=vstore1,ScsiSN=1,IOMode=wt

Target iqn.2010-09.com.outofwall:vstore2
Lun 0 Path=/dev/sdb,Type=fileio,ScsiId=vstore2,ScsiSN=1,IOMode=wt

and if I use LUN 0 on both, ESXi thinks they have the same device address:

t10.94545xxxxxxxxx 673747F627516<LUN number>3000000000

I couldn't figure out how to make them not appear as the same address, so
I resorted to using different LUN numbers. What did I do wrong?

Ross S. W. Walker

Mar 17, 2012, 7:51:13 PM
to Yucong Sun (叶雨飞), iscsitarget-devel, Emmanuel Florac
Ahh, the problem is that ScsiSN must be globally unique across all disks.

ESXi thinks they are two paths to the same disk with that config.
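
A quick way to catch this before ESXi does is to scan the configs for repeated serials. The following is a rough Python sketch, assuming the `Target` / `Lun` line layout shown in the config quoted earlier in the thread and `#` as the comment character; the function name `find_duplicate_serials` is hypothetical, not an IET tool. Since the two targets live on different machines, the configs are simply concatenated before checking.

```python
from collections import defaultdict


def find_duplicate_serials(conf_text):
    """Return {ScsiSN: [(target, lun), ...]} for serials used more than once."""
    seen = defaultdict(list)
    target = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        parts = line.split(None, 2)
        keyword = parts[0].lower()
        if keyword == "target" and len(parts) >= 2:
            target = parts[1]
        elif keyword == "lun" and len(parts) == 3:
            lun = parts[1]
            # Lun parameters are a comma-separated list of Key=Value pairs.
            params = dict(
                item.split("=", 1) for item in parts[2].split(",") if "=" in item
            )
            serial = params.get("ScsiSN")
            if serial is not None:
                seen[serial].append((target, lun))
    return {sn: uses for sn, uses in seen.items() if len(uses) > 1}


# The two configs from the thread, concatenated.
COMBINED = """
Target iqn.2009-10.com.outofwall:vstore1
Lun 1 Path=/dev/sdb,Type=fileio,ScsiId=vstore1,ScsiSN=1,IOMode=wt

Target iqn.2010-09.com.outofwall:vstore2
Lun 0 Path=/dev/sdb,Type=fileio,ScsiId=vstore2,ScsiSN=1,IOMode=wt
"""

if __name__ == "__main__":
    for serial, uses in find_duplicate_serials(COMBINED).items():
        print(f"ScsiSN={serial} is shared by: {uses}")
```

LUNs that omit ScsiSN are skipped here, because their default serial is derived elsewhere and cannot be read from the config text.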

-Ross

Yucong Sun (叶雨飞)

Mar 17, 2012, 7:53:44 PM
to Ross S. W. Walker, iscsitarget-devel, Emmanuel Florac
Ahhh, I'll reconfigure them right away today. But this doesn't
really explain the mysterious unknown-LUN commands.

And most of the time, my current approach actually works fine.


Ross S. W. Walker

Mar 17, 2012, 8:20:50 PM
to Yucong Sun (叶雨飞), iscsitarget-devel, Emmanuel Florac
Well, the results are undefined if two disks from different targets with different LUNs have the same ScsiSN, so no one can say whether it explains this or not.

-Ross

Yucong Sun (叶雨飞)

Mar 19, 2012, 5:11:52 PM
to Ross S. W. Walker, iscsitarget-devel, Emmanuel Florac
I finally managed to switch everything to LUN 0; let's see how it goes.
Can someone make it clearer what is allowed in ScsiId and ScsiSN,
and how they are used to generate the t10.xxxx value?

Plus, as a reminder to anyone who has the same issue: DO NOT just change
LUN numbers and restart your IETD. Doing that will cause a signature
mismatch on VMFS and cause the datastore to be detected as a snapshot...
To fix that you need to disconnect everything and re-signature it,
very painful :-(

But anyway, I got it working again, and I will watch out for the errors.

Cheers.


Joseph L. Casale

Mar 19, 2012, 5:25:03 PM
to iscsitarget-devel ‎[iscsitarget-devel@lists.sourceforge.net]‎
>I finally managed to switch everything to LUN 0; let's see how it goes.
>Can someone make it clearer what is allowed in ScsiId and ScsiSN,
>and how they are used to generate the t10.xxxx value?

See the ietd.conf man page. I'm pretty sure that would be in the RFC as well...

Ross S. W. Walker

Mar 19, 2012, 5:33:50 PM
to suny...@gmail.com, iscsitarget-devel, Emmanuel Florac
Yucong Sun (叶雨飞) [mailto:suny...@gmail.com] wrote:
>
> I finally managed to switch everything to LUN 0; let's see how it goes.
> Can someone make it clearer what is allowed in ScsiId and ScsiSN,
> and how they are used to generate the t10.xxxx value?

The ScsiId is a 16 character field that is used to generate a unique
T10 SCSI Unit Identifier as part of page 0x83 in the SCSI inquiry. If
no value is provided then a 128-bit MD5 hash of the target's IQN and
the disk's LUN is generated and used. Since the page type is set to
binary this hash is used as-is. Most systems use this to uniquely
identify a disk.

The ScsiSN is a 32 character field that represents a disk's unique
vendor serial number as presented in page 0x80 of the SCSI inquiry.
If none is provided the default is to use the hexadecimal
representation of the ScsiId. One could also use uuidgen and strip
the hyphens out to generate a globally unique ScsiSN. As most people
realize by now, VMware uses this to uniquely identify a disk
throughout its lifetime.
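
The uuidgen suggestion can also be scripted. The following is a small Python sketch that uses the standard `uuid` module as a stand-in for the `uuidgen` command, with the 16 and 32 character limits taken from the description above; the names `new_scsi_sn` and `check_lengths` are made up for illustration.

```python
import uuid

SCSI_ID_MAX = 16  # ScsiId field width, as described above
SCSI_SN_MAX = 32  # ScsiSN field width, as described above


def new_scsi_sn():
    """Random UUID with the hyphens stripped: 32 hexadecimal characters."""
    return uuid.uuid4().hex


def check_lengths(scsi_id, scsi_sn):
    """Return a list of human-readable problems; empty if both values fit."""
    problems = []
    if len(scsi_id) > SCSI_ID_MAX:
        problems.append(f"ScsiId is {len(scsi_id)} chars, limit is {SCSI_ID_MAX}")
    if len(scsi_sn) > SCSI_SN_MAX:
        problems.append(f"ScsiSN is {len(scsi_sn)} chars, limit is {SCSI_SN_MAX}")
    return problems


if __name__ == "__main__":
    serial = new_scsi_sn()
    print(f"ScsiId=vstore1,ScsiSN={serial}")
    print(check_lengths("vstore1", serial) or "lengths OK")
```

Generate the serial once per disk and keep it in the config; as explained below, changing the ScsiSN of a disk that already holds a VMFS volume triggers the snapshot detection.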

> Plus, as a reminder to whoever had the same issue, DO NOT just change
> lun numbers and restart your IETD, doing that will cause a signature
> mis-match on VMFS and cause it being detected as a snapshot...
> To fix that you need to disconnect everything and re-signature it,
> very painful :-(

It's the ScsiSN that causes a signature mismatch. The ScsiSN is
written into the label of the VMFS when created and used to identify
it throughout its lifetime.

VMware has a write-up on how to safely re-signature a disk whose
ScsiSN has changed. I believe a change is planned to have ESX
prompt the admin, when it sees this change, whether the disk was
moved or is a snapshot, instead of assuming it's a snapshot; if
the admin selects 'Moved', it re-signatures the disk.

-Ross