ESOS and ESX - shared datastore


Martin Dumont

Apr 2, 2015, 11:10:43 AM
to esos-...@googlegroups.com
Hello, I'm having trouble configuring and using a shared datastore across two ESXi nodes.
One datastore from one ESOS host shared to two ESX hosts works fine.
When I try to create the second datastore on the second ESX (with the disk mapped on both ESX hosts), it gets stuck at "loading...".

The ESX nodes are stuck. The VMs are working great, but I can't add disks or rescan anything.

I would like to know whether any special configuration is needed for ESX hosts to share a datastore correctly.
I didn't have these issues when working with iSCSI disks.



Then...


Marc Smith

Apr 2, 2015, 11:17:09 AM
to esos-...@googlegroups.com
Hi,

Please post the following so we can help:
- The /var/log/kern.log and /etc/scst.conf files from each ESOS host.
- The /var/log/vmkernel.log file from each ESXi host.


--Marc
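
For anyone gathering the files Marc lists, here is a minimal sketch of bundling them into one archive to attach to a reply. It assumes Python 3 is available on whichever machine holds the files (or a copy of them); the function name `bundle_logs` and the archive name in the usage note are made up for illustration.

```python
import tarfile
from pathlib import Path


def bundle_logs(paths, archive_path):
    """Pack every listed file that exists into a gzip tarball.

    Returns the list of paths that were missing, so you can tell
    at a glance which requested file still has to be collected.
    """
    missing = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in map(Path, paths):
            if p.is_file():
                # Store only the base name so the archive unpacks flat.
                tar.add(p, arcname=p.name)
            else:
                missing.append(str(p))
    return missing
```

On an ESOS host this could be called as `bundle_logs(["/var/log/kern.log", "/etc/scst.conf"], "esos-diag.tar.gz")`; the ESXi side would need `/var/log/vmkernel.log` copied off the host first.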


Steve Jones

Apr 2, 2015, 12:12:22 PM
to esos-...@googlegroups.com
You don't want to CREATE the datastore on the second ESXi machine. You should be able to just rescan, and it should find it, provided the zoning is right and both initiators are in the group.

I've got a similar setup here at home, and it's working great. I built an ESOS machine with four 1 TB 7200 RPM SATA disks and one 3 TB SATA disk, and they're all separate LUNs going to two ESXi machines. I was cheap, so I didn't buy an FC switch, but instead bought a 4-port FC HBA for my ESOS machine, so my connections are point to point. I used to have the same thing connected through two FC switches, and it worked the same.

Are you using the block driver or the disk driver? I had trouble when using the disk driver in ESOS, but it works fine on block_io with the default block size of 512 (I think that's what it is). If I increased that 512, it failed for me too, and not only on the second machine: it failed to format on the first machine.
-Steve
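
A quick way to check the block size point above is to scan the scst.conf text for devices whose `blocksize` is not 512. This is only a sketch: the sample configuration, device names, and LVM paths are invented, and it assumes the usual `DEVICE name { attribute value }` layout with a `blocksize` attribute. Devices without an explicit `blocksize` are skipped, on the assumption that they fall back to the default, which Steve recalls as 512.

```python
import re

# Hypothetical scst.conf fragment; names and paths are illustrative only.
SAMPLE_CONF = """
HANDLER vdisk_blockio {
    DEVICE san1_lun0 {
        filename /dev/vg0/esx_lun0
        blocksize 512
    }
    DEVICE san2_lun0 {
        filename /dev/vg0/esx_lun1
        blocksize 4096
    }
    DEVICE san2_lun1 {
        filename /dev/vg0/esx_lun2
    }
}
"""

# A DEVICE block has no nested braces, so a brace-free body match is enough.
DEVICE_RE = re.compile(r"DEVICE\s+(\S+)\s*\{([^{}]*)\}")
BLOCKSIZE_RE = re.compile(r"^\s*blocksize\s+(\d+)\s*$", re.MULTILINE)


def non_512_devices(conf_text):
    """Return {device_name: blocksize} for devices with an explicit
    blocksize other than 512."""
    bad = {}
    for name, body in DEVICE_RE.findall(conf_text):
        match = BLOCKSIZE_RE.search(body)
        if match and int(match.group(1)) != 512:
            bad[name] = int(match.group(1))
    return bad


print(non_512_devices(SAMPLE_CONF))
```

Any device this reports is a candidate for the kind of format or "loading..." hang described in this thread.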


Dan Swartzendruber

Apr 2, 2015, 12:38:09 PM
to esos-...@googlegroups.com
Yeah, it sounds almost as if the 2nd ESXi host is not able to write lock files or something...

Dan Swartzendruber

Apr 2, 2015, 12:39:16 PM
to esos-...@googlegroups.com

True, Steve.  May just have been bad phrasing on his part.  If the second host tries to do 'add storage' on that LUN, it should see it as an existing LUN.

Martin Dumont

Apr 2, 2015, 1:08:39 PM
to esos-...@googlegroups.com
You all got it.
I wasn't re-creating the datastore on the second ESX. The problem was that, in all my excitement testing this, I had set the block size on SAN2 to 1M instead of 512. I think this was the culprit. Everything seems to work fine now.
I'm now using blockio on an LVM volume. Fileio on a vdisk seems to work OK too. So it seems ESX doesn't require much customization to work with ESOS.
About the LUN numbering, I found that I must use different numbers on each of my SANs. Are you also doing this? (Besides, I'm preparing to try some ALUA configuration; I have just received two 10GbE HBAs.) If you can share your experience with this, I would greatly appreciate it.

Dan Swartzendruber

Apr 2, 2015, 1:25:00 PM
to esos-...@googlegroups.com

I haven't yet gotten the 2nd vSphere host spun up (waiting for the InfiniBand card to arrive), but I'm surprised you need a different LUN. What happens if you try the same one? An error? It just doesn't work? Something else?

Martin Dumont

Apr 2, 2015, 1:35:01 PM
to esos-...@googlegroups.com
It thinks it's another path to the same disks.
You can see 4 paths instead of two.

Dan Swartzendruber

Apr 2, 2015, 1:36:43 PM
to esos-...@googlegroups.com

that's odd :)

Steve Jones

Apr 2, 2015, 1:57:41 PM
to esos-...@googlegroups.com
If you want to be able to vMotion between the hosts, I believe the LUN numbers need to be the same for both ESXi hosts. I have zero InfiniBand experience and zero ALUA experience though, so maybe this restriction applies only to FC.

Just in case it helps, I attached my scst.conf. It's bigger than it needs to be, because I was using iSCSI while waiting for my eBay FC HBA purchase to arrive, but you get the idea. I would think that if you are using a single port, you'd have only one target with multiple LUNs and multiple initiators under it, whereas I have a target for each of the two ports that have an ESXi server hooked to them. The other 2 targets (ports) are disabled until I get more ESXi hosts.

Back when I had two switches, I had 2 targets, and EACH of them had both initiators listed with the same LUNs, so multiple ESXi boxes saw multiple paths to the same LUNs and it worked great. I could turn off a switch or pull a fiber cable, and it would fail right over.

If you don't have a consistent view between the ESXi machines, you won't be able to vMotion, even if both boxes effectively see the same data.
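
The "consistent view" idea can be sketched as a simple comparison: for each LUN number, every ESXi host should be presented the same device. The host names, device names, and the `lun_view_mismatches` helper below are all hypothetical; in practice the per-host mapping would come from the LUN entries for each initiator in scst.conf.

```python
def lun_view_mismatches(views):
    """views maps host -> {lun_number: device_name}.

    Returns {lun_number: {host: device_or_None}} for every LUN number
    that does not resolve to the same device on all hosts.
    """
    all_luns = set()
    for mapping in views.values():
        all_luns.update(mapping)

    mismatches = {}
    for lun in sorted(all_luns):
        # None marks a host that is not presented this LUN at all.
        seen = {host: mapping.get(lun) for host, mapping in views.items()}
        if len(set(seen.values())) > 1:
            mismatches[lun] = seen
    return mismatches


# Illustrative data: both hosts see the same three disks,
# but LUNs 1 and 2 are swapped on the second host.
views = {
    "esxi-a": {0: "disk_1tb_a", 1: "disk_1tb_b", 2: "disk_3tb"},
    "esxi-b": {0: "disk_1tb_a", 1: "disk_3tb", 2: "disk_1tb_b"},
}

for lun, seen in lun_view_mismatches(views).items():
    print(f"LUN {lun} differs: {seen}")
```

An empty result means every host sees the same device at each LUN number, which is the situation described above as working for vMotion.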




--
scst.conf.txt

Curtis Grice

Apr 3, 2015, 9:04:35 PM
to esos-...@googlegroups.com
On the subject of block sizes, it MUST be 512 in ESXi 5.5 and older. I have not checked version 6 but don't see why it would have changed. This is a limitation of the underlying GNU fdisk utility used by ESXi to partition the disks.

As for Fibre Channel, I wish I had some experience with it. My home lab is all 1 Gb iSCSI.