Question about ESOS Block Storage & ESXi 6.5 support


최재훈

Jun 25, 2017, 3:24:39 AM
to esos-users
Hi Marc!

I came across your blog several years ago.

I sent you a message on STH when I found that Mellanox vSphere OFED 1.8.2.5 supports SRP on ESXi 6.x... :)

OmniOS has announced that its support will be discontinued.

I saw that ESOS has reached version 1.0 (GA level), and I have been testing an ESOS SRP target for the last two months.

Performance with the SRP target has been great, and ESOS also supports ZFS on Linux, like OmniOS did.

The compact Linux-based kernel also supports the ConnectX-3 HCA.

The ESOS ZFS LUN and SRP target work rock solid, but the ESXi 6.5 console log shows a problem message for the ESOS LUN, like the attached images.

I created a zvol with a 128k block size for throughput performance, then made an SCST_BIO device with the 512-byte option and exported it to the SRP target.
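
(For reference, that setup corresponds roughly to the following commands; the pool, zvol, and device names here are placeholders, not the ones actually used.)

# 128k-block zvol for throughput, exported as an SCST BLOCKIO device with a 512-byte logical block size
zfs create -b 128K -V 500G Storage/LUN11
scstadmin -open_dev LUN11 -handler vdisk_blockio -attributes filename=/dev/Storage/LUN11,blocksize=512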

But ESXi 6.5 picks up the ZFS zvol block size (not the SCST_BIO block size) and complains about it.

Is there any solution?

photo_2017-05-20_00-44-04.jpg
photo_2017-05-20_00-44-31.jpg

Marc Smith

Jun 25, 2017, 2:29:35 PM
to esos-...@googlegroups.com
On Sun, Jun 25, 2017 at 3:24 AM, 최재훈 <inbusine...@gmail.com> wrote:
> Performance with the SRP target has been great, and ESOS also supports ZFS on Linux, like OmniOS did. The compact Linux-based kernel also supports the ConnectX-3 HCA.

Great to hear, thank you.

 

ESOS ZFS LUN & SRP Target work rock solid, but ESXi 6.5 console log show me a problem message with ESOS LUN like attched images.

I made a zvol with 128k block for throughput performance, then make a SCST_BIO device with 512 bytes option and export it to SRP Target.

Try the zvol with a 4K (4096-byte) block size, and use a 512-byte block size for the SCST device. According to that message, ESXi supports a physical block size of 512 or 4096.
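
(A minimal sketch of that suggestion; the zvol name and size are placeholders, and note that volblocksize can only be set at creation time, so the zvol would need to be recreated.)

zfs create -b 4K -V 900G Storage/LUN11
scstadmin -open_dev LUN11 -handler vdisk_blockio -attributes filename=/dev/Storage/LUN11,blocksize=512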



> But ESXi 6.5 picks up the ZFS zvol block size (not the SCST_BIO block size) and complains about it.

Yes, there is a "physical" block size and a "logical" block size... in the case of ZFS volumes, the block size specified there comes across as the "physical" block size to the initiators, and the SCST device block size specified on the target (ESOS) side is the "logical" block size to the initiators.
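
(One way to see both values on the ESOS side is the block layer's sysfs attributes for the zvol; "zd0" below is a placeholder for whatever device node the zvol gets.)

cat /sys/block/zd0/queue/logical_block_size    # the zvol's own logical sector size (512)
cat /sys/block/zd0/queue/physical_block_size   # follows the zvol's volblocksize; this is what initiators see as the physical block size for a BLOCKIO device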

--Marc

 



최재훈

Jun 27, 2017, 12:31:47 AM
to esos-...@googlegroups.com
Hi!
The test with a 4k zvol and the default (512-byte) SCST_BIO device is complete.

But throughput performance was terrible with this configuration... :(

I think this is VMware's problem, not ESOS's or ZFS's.

I think ESXi 6.5 will turn out to be a terrible release, like Windows ME or Vista.

Best regards,
Jae-Hoon Choi


Marc Smith

Jun 27, 2017, 12:45:03 AM
to esos-...@googlegroups.com
On Tue, Jun 27, 2017 at 12:31 AM, 최재훈 <inbusine...@gmail.com> wrote:
> Hi!
> The test with a 4k zvol and the default (512-byte) SCST_BIO device is complete.

What's performance like if you make the ZFS volume block size 512 and the SCST device block size 512? Just curious...




최재훈

Jun 27, 2017, 12:51:30 AM
to esos-...@googlegroups.com
ZFS block size affects throughput performance.

Here is a sample:

128k zvol block size + SCST_BIO 512 bytes + 2-port 56Gb FDR HCA SRP target = 6.6 ~ 7.4 GB/s throughput (iometer, 100% read)

4k zvol block size + SCST_BIO 512 bytes + 2-port 56Gb FDR HCA SRP target = 800 MB/s ~ 1.2 GB/s throughput (iometer, 100% read)

There is a performance degradation in the iometer 4k test, too.

I don't want to give up that performance just to get rid of the ESXi console warning.

Best regards,
Jae-Hoon Choi

최재훈

Aug 5, 2017, 4:52:55 AM
to esos-users
Hi again... :)

Is there a way in ESOS (or the SCST target) to not report the physical block size, the way FreeNAS does, as described here?

http://www.virten.net/2016/12/the-physical-block-size-reported-by-the-device-is-not-supported/

Marc Smith

Aug 11, 2017, 11:36:16 AM
to esos-...@googlegroups.com
Sorry for the delay... I believe FreeNAS uses LIO (or something else)
and not SCST. I'm not aware of a flag/setting in SCST to disable
reporting the physical block size to initiators, or any way to
override the physical block size that is reported to the initiators.
If it's merely a warning from ESXi, maybe you can live with the log
pollution (or perhaps there is a config setting for ESXi to silence
the warnings). But ESXi may also use that physical block size
information for something (e.g., aligning I/O), and using something
different from what is recommended may have its own implications.

--Marc

최재훈

Aug 11, 2017, 1:51:48 PM
to esos-users
Hi!

I have completed some tests with VMFS6 on ESXi 6.5 Update 1.

- VMFS6 can support 512e (512-byte sector format emulation)

Here is the test list.

* 128k zvol created for these tests:

zfs create -s -b 128KB -o compression=lz4 -V 900g Storage/ESOS01_LUN11

01. vdisk_fileio

A. scstadmin -open_dev LUN11 -handler vdisk_fileio -attributes filename=/dev/Storage/ESOS01_LUN11,thin_provisioned

 - ESXi 6.5U1 shows a 512e sector format on VMFS6 and supports VAAI SCSI UNMAP

 - Write performance was bad

B. scstadmin -open_dev LUN11 -handler vdisk_fileio -attributes filename=/dev/Storage/ESOS01_LUN11

 - ESXi 6.5U1 shows a 512e sector format on VMFS6, but cannot support VAAI SCSI UNMAP

 - Write performance was bad

C. scstadmin -open_dev LUN11 -handler vdisk_blockio -attributes filename=/dev/Storage/ESOS01_LUN11,thin_provisioned

 - ESXi 6.5U1 cannot recognize the sector format on VMFS6 and shows the warning message I wrote about previously

 - Performance was good

I think SCST vdisk_blockio lacks the ability that vdisk_fileio has to present a volume whose physical sector size is not 512 to the initiator as a 512e (emulated) sector format.

Do you have any opinion?

With the launch of ESXi 6.5, VMware added 512e sector format support to their hypervisor, but SCST can't support it properly.

Do you have any resolution?

-- Jae-Hoon Choi

Marc Smith

Aug 11, 2017, 2:48:52 PM
to esos-...@googlegroups.com
On Fri, Aug 11, 2017 at 1:51 PM, 최재훈 <inbusine...@gmail.com> wrote:
> Hi!
>
> I have completed some tests with VMFS6 on ESXi 6.5 Update 1.
>
> - VMFS6 can support 512e (512-byte sector format emulation)

Yes, for 512e and 4K native drives that are attached locally, NOT
external/SAN LUNs:
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2091600


> I think SCST vdisk_blockio lacks the ability that vdisk_fileio has to present a volume whose physical sector size is not 512 to the initiator as a 512e (emulated) sector format.
>
> Do you have any opinion?

There are two components here, the "logical sector size" and the
"physical sector size". In SCST we can set the logical sector size as
seen by initiators using the "block_size" attribute for vdisk_*
devices, and for vdisk_* devices the physical sector size given to
initiators is the value of queue_physical_block_size() if it's a block
device, and 4096 if it's not.

This explains the difference you're seeing when using vdisk_fileio vs.
vdisk_blockio above (I assume when you're using vdisk_fileio the
physical sector size as seen by ESXi is 4096). For vdisk_blockio the
value is visible in /sys/block/nvme0n1/queue/physical_block_size (e.g.,
for the "nvme0n1" block device).


> With the launch of ESXi 6.5, VMware added 512e sector format support to their hypervisor, but SCST can't support it properly.

Technically, 512e and 4K are not supported by VMware for SAN storage,
only for locally attached drives. I'd stick with your best performing
option if the warnings aren't too much of a nuisance. =)

--Marc

최재훈

Aug 12, 2017, 6:37:37 AM
to esos-...@googlegroups.com
> Yes, for 512e and 4K native drives that are attached locally, NOT external/SAN LUNs:

Yes. Absolutely!
VMware is talking about local disks.

But ESOS is a SAN.

ESXi 6.5 treats the ESOS SAN vdisk_fileio volume as a 512e disk.

But the ESOS SAN vdisk_blockio volume isn't treated that way.

That is my question... :)

Best regards,
Jae-Hoon Choi



Marc Smith

Aug 12, 2017, 3:53:16 PM
to esos-...@googlegroups.com
On Sat, Aug 12, 2017 at 6:37 AM, 최재훈 <inbusine...@gmail.com> wrote:
> ESXi 6.5 treats the ESOS SAN vdisk_fileio volume as a 512e disk.

With SCST, when using vdisk_fileio the physical sector size returned
is 4096, and if it's vdisk_blockio then the value of
"physical_block_size" of the block device is returned. And I believe
the definition of "512e" is a 512-byte logical sector size with a
4096-byte physical sector size, so that's why you're seeing it this way.
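
(If you want to confirm what ESXi itself has detected for a LUN, newer ESXi builds expose the logical/physical sector sizes and the resulting format type per device; the exact esxcli subcommand below is from memory, so treat it as an assumption and check the "esxcli storage core device" namespace on your build.)

esxcli storage core device capacity list   # shows Logical Block Size, Physical Block Size, and Format Type (512n/512e/...)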

--Marc

최재훈

Aug 13, 2017, 9:03:26 PM
to esos-...@googlegroups.com
I see... :)

vdisk_fileio emulates a 4k physical sector and a 512 logical sector, so VMFS6 recognizes it as the 512e sector format.

Here is another question.

With the same physical configuration, vdisk_blockio shows good write performance,
but vdisk_fileio shows bad performance, unlike vdisk_blockio.

I've heard that vdisk_fileio uses the Linux page cache.

Do you have any opinion?

Best regards,
Jae-Hoon Choi



Marc Smith

Aug 14, 2017, 9:36:29 AM
to esos-...@googlegroups.com
On Sun, Aug 13, 2017 at 9:03 PM, 최재훈 <inbusine...@gmail.com> wrote:
> I see... :)
>
> vdisk_fileio emulates a 4k physical sector and a 512 logical sector, so VMFS6 recognizes it as the 512e sector format.
>
> Here is another question.
>
> With the same physical configuration, vdisk_blockio shows good write performance, but vdisk_fileio shows bad performance, unlike vdisk_blockio.
>
> I've heard that vdisk_fileio uses the Linux page cache.
>
> Do you have any opinion?

With vdisk_fileio, are you using "nv_cache=1"? If not, try that and see
if there is any difference. And when you say "bad performance", is it
extremely poor, or close to what you get with vdisk_blockio but somewhat
lower?
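
(For reference, nv_cache is set when the device is opened; a minimal sketch, with placeholder names:)

scstadmin -open_dev LUN11 -handler vdisk_fileio -attributes filename=/dev/Storage/ESOS01_LUN11,nv_cache=1,thin_provisioned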

--Marc
최재훈

Aug 16, 2017, 2:09:44 AM
to esos-...@googlegroups.com
I'm so sorry for the late reply.

Both IOPS and throughput were lower than with vdisk_blockio.

I have a 56Gb FDR InfiniBand gateway (SX6036G) and a bunch of 56Gb FDR InfiniBand HCAs.

vdisk_blockio and vdisk_fileio show different performance with the same configuration.

Strictly speaking, read performance was similar between vdisk_blockio and vdisk_fileio, but write performance was terrible.

I found a pattern with vdisk_fileio:

vdisk_fileio shows good write performance during the initial stage of the test,
but some time later write performance decreases.

I also tested nv_cache=0 and 1,
but got the same result.

Here is my theory:

If the Linux page cache used by vdisk_fileio is fighting with the ZFS cache configuration, I need to reconfigure the zvol's cache settings.

Best regards,
Jae-Hoon Choi



Marc Smith

Aug 16, 2017, 9:17:26 AM
to esos-...@googlegroups.com
On Wed, Aug 16, 2017 at 2:09 AM, 최재훈 <inbusine...@gmail.com> wrote:
> I'm so sorry for the late reply.
>
> Both IOPS and throughput were lower than with vdisk_blockio.
>
> Strictly speaking, read performance was similar between vdisk_blockio and vdisk_fileio, but write performance was terrible.
>
> I found a pattern with vdisk_fileio:
>
> vdisk_fileio shows good write performance during the initial stage of the test, but some time later write performance decreases.

This has been my experience too, and I've always suspected (but
haven't officially tested) that this can be overcome by tweaking the
Linux page cache settings... lots of articles out there, this one
seems promising:
https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
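
(The knobs discussed there are ordinary sysctls, so they're cheap to experiment with; a sketch only, and the values below are common starting points rather than recommendations.)

# start background writeback earlier and cap dirty pages sooner
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=10
# to persist, add the same keys to sysctl.conf (or wherever your install keeps sysctl settings)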


> I also tested nv_cache=0 and 1, but got the same result.
>
> Here is my theory:
>
> If the Linux page cache used by vdisk_fileio is fighting with the ZFS cache configuration, I need to reconfigure the zvol's cache settings.

Yes, could be a bit of both, but tune/tweak the Linux page cache settings too.
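
(On the ZFS side, the zvol-level properties worth checking are things like primarycache and logbias; a sketch only, since whether any of them helps depends on the workload.)

zfs get primarycache,logbias,sync Storage/ESOS01_LUN11
zfs set primarycache=metadata Storage/ESOS01_LUN11   # avoid caching data twice, in the ARC and in the page cache
zfs set logbias=throughput Storage/ESOS01_LUN11      # favor throughput over latency for large streaming writes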