I set up the page http://scst.sourceforge.net/comparison.html, which
compares the features of the existing SCSI target subsystems for Linux.
The comparison includes SCST, STGT, IET and LIO.
I may not be fully correct everywhere, so if you disagree with any
item(s) in the comparison table, please let me know and I will fix
them.
Vlad
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Performance is a bit debatable.
I ran some simple SCST and STGT tests last week; SCST won some of them
and STGT won others.
What surprised me: although STGT has a bigger CPU impact than SCST,
STGT was faster when reading from an encrypted (dm-crypt) volume on a
system where the CPU is the bottleneck (it can't decrypt as fast as the
HDD can deliver data).
STGT was much slower when reading from a non-encrypted volume when the
target had "blockdev --setra 16384 ..." set for the given target.
On the other hand, STGT was faster than SCST with the default blockdev
read-ahead setting (256).
If anyone's interested, I can show results in a readable form on Monday
(right now, I have only raw data which is pretty long and would be hard
to compare).
--
Tomasz Chmielewski
http://wpkg.org
The result "on average" was listed in the comparison. Of course, one
target can be better in some cases and another one in others. That's
the nature of storage: it's pretty hard to optimize for everything at
once.
BTW, if I remember your logs correctly, you didn't apply all the SCST
kernel patches to your kernel. So your results aren't really applicable
to this comparison, because it assumes all SCST kernel patches are
applied.
--
Hello Tomasz,
It would be great if you could publish the details of your setup and
the tests you have run, such that we are able to reproduce the tests
and analyze the results further.
Bart.
OK. It seems I should add that STGT has additional emulation for SMC
and SSC devices, marked as "Under development", correct?
> On Sun, Apr 5, 2009 at 5:56 AM, Vladislav Bolkhovitin <v...@vlnb.net> wrote:
>
> Hi All,
>
> I set up http://scst.sourceforge.net/comparison.html page, which
> compares features of existing SCSI target subsystems for Linux. The
> comparison includes SCST, STGT, IET and LIO.
>
> I might be not fully correct somewhere, so, if you don't agree with
> me about some item(s) in the comparison table, please let me know
> and I will fix that.
>
> Vlad
True.
> BTW, if I remember your logs correctly, you didn't apply all the SCST
> kernel patches to your kernel. So your results aren't really applicable
> to this comparison, because it assumes all SCST kernel patches are
> applied.
I ran three tests:
- STGT (with the standard Debian Lenny kernel)
- SCST with default build options (i.e. no "make debug2perf"), no kernel
patches (standard Debian Lenny kernel)
- SCST + "make debug2perf", with kernel patches (Debian Lenny .config +
SCST patches and the proper option enabled)
I'll post the results shortly.
--
Tomasz Chmielewski
http://wpkg.org
--
> Hello Tomasz,
>
> It would be great if you could publish the details of your setup and
> the tests you have run, such that we are able to reproduce the tests
> and analyze the results further.
Here it is.
The target is running Debian Lenny 64bit userspace on an Intel Celeron 2.93GHz CPU, 2 GB RAM.
Initiator is running Debian Etch 64 bit userspace, open-iscsi 2.0-869, Intel Xeon 3050/2.13GHz, 8 GB RAM.
Each test was repeated 6 times; before each test was started, "sync" was run and caches were dropped on both sides.
dd parameters were like below, so 6.6 GB of data was read each time:
dd if=/dev/sdag of=/dev/null bs=64k count=100000
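As a quick check of the data volume per run, here is a minimal Python sketch. It assumes GNU dd semantics, where bs=64k means 64 * 1024 bytes and the reported MB/s figure uses 10^6 bytes; the function names are illustrative and not part of dd or of any tool used in these tests.

```python
# Sketch: how much data "dd bs=64k count=100000" reads, and how long one
# run takes at a given throughput. Helper names are illustrative only.

BLOCK_SIZE = 64 * 1024   # dd "bs=64k": 64 KiB per block (GNU dd, k = 1024)
BLOCK_COUNT = 100_000    # dd "count=100000"


def total_bytes(block_size: int = BLOCK_SIZE, count: int = BLOCK_COUNT) -> int:
    """Total number of bytes read by one dd run."""
    return block_size * count


def run_seconds(throughput_mb_s: float, nbytes: int) -> float:
    """Duration of one run at the given throughput (decimal MB/s)."""
    return nbytes / (throughput_mb_s * 1_000_000)


if __name__ == "__main__":
    nbytes = total_bytes()
    print(f"{nbytes} bytes = {nbytes / 1e9:.2f} GB")
    # 102.17 MB/s is the raw md0 speed reported for the target.
    print(f"one run at 102.17 MB/s takes about {run_seconds(102.17, nbytes):.0f} s")
```

The byte count rounds to the 6.6 GB quoted above; the duration helper only gives a feel for how long each of the repeated runs takes.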
Data was read from two block devices:
- /dev/md0, which is RAID-1 on two ST31500341AS 1.5 TB drives
- encrypted dm-crypt device which is on top of /dev/md0
The encrypted device was created with the following additional options passed to cryptsetup
(they give the best performance on systems where the CPU is the bottleneck, at the cost of
reduced security compared to the default options):
-c aes-ecb-plain -s 128
Generally, the CPU on the target was the bottleneck, so I also measured the load on the target.
md0, crypt columns - averages from dd
us, sy, id, wa columns - averages from vmstat
1. Disk speeds on the target

Raw performance:             102.17 MB/s
Raw performance (encrypted):  50.21 MB/s

2. Read-ahead on the initiator: 256 (default); md0, crypt - MB/s

                             md0   us   sy   id   wa | crypt   us   sy   id   wa
STGT                       50.63   4%  45%  18%  33% | 32.52   3%  62%  16%  19%
SCST (debug + no patches)  43.75   0%  26%  30%  44% | 42.05   0%  84%   1%  15%
SCST (fullperf + patches)  45.18   0%  25%  33%  42% | 44.12   0%  81%   2%  17%

3. Read-ahead on the initiator: 16384; md0, crypt - MB/s

                             md0   us   sy   id   wa | crypt   us   sy   id   wa
STGT                       56.43   3%  55%   2%  40% | 46.90   3%  90%   3%   4%
SCST (debug + no patches)  73.85   0%  58%   1%  41% | 42.70   0%  85%   0%  15%
SCST (fullperf + patches)  76.27   0%  63%   1%  36% | 42.52   0%  85%   0%  15%
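
To make the tables easier to compare, the following Python sketch computes by how much SCST (fullperf + patches) is faster or slower than STGT for each read-ahead setting. Only the dd averages from the tables are used; the dictionary layout and the function name are illustrative, and pairing rows by identical read-ahead setting is just one possible way to compare them.

```python
# Sketch: relative throughput difference between SCST and STGT, using the
# dd averages (MB/s) from the tables above. The dictionary layout and the
# function name are illustrative.

RESULTS = {
    # read-ahead on initiator -> target -> (md0 MB/s, crypt MB/s)
    256: {
        "STGT": (50.63, 32.52),
        "SCST (debug + no patches)": (43.75, 42.05),
        "SCST (fullperf + patches)": (45.18, 44.12),
    },
    16384: {
        "STGT": (56.43, 46.90),
        "SCST (debug + no patches)": (73.85, 42.70),
        "SCST (fullperf + patches)": (76.27, 42.52),
    },
}


def relative_gain(candidate: float, baseline: float) -> float:
    """Percentage by which candidate exceeds baseline (negative if slower)."""
    return (candidate / baseline - 1.0) * 100.0


if __name__ == "__main__":
    for readahead, rows in RESULTS.items():
        stgt_md0, stgt_crypt = rows["STGT"]
        scst_md0, scst_crypt = rows["SCST (fullperf + patches)"]
        print(f"read-ahead {readahead}: "
              f"md0 {relative_gain(scst_md0, stgt_md0):+.1f}%, "
              f"crypt {relative_gain(scst_crypt, stgt_crypt):+.1f}%")
```

A positive value means SCST was faster for that device and read-ahead setting, a negative value means STGT was faster.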
--
Tomasz Chmielewski
http://wpkg.org
(...)
> 2. Read-ahead on the initiator: 256 (default); md0, crypt - MB/s
>
> md0 us sy id wa | crypt us sy id
> wa STGT 50.63 4% 45% 18% 33% | 32.52 3% 62%
If these results are not wrapped properly for you, they look better here:
http://marc.info/?l=linux-kernel&m=123901387318274&w=4
So, I'll add for STGT:
- "Emulation of virtual tape and media changer" "Experimental"
- "Possibility to write to emulated CD-R devices" "+"
OK?
Thanks,
Vlad
> On Mon, Apr 6, 2009 at 6:32 PM, Vladislav Bolkhovitin <v...@vlnb.net> wrote:
>
> ronnie sahlberg, on 04/06/2009 07:19 AM wrote:
>
> Maybe you should also show which device types each
> implementation can emulate.
Good! You proved that:
1. SCST is capable of working much better than STGT: 35% for md and 37%
for crypt, considering maximum values.
2. The default read-ahead size isn't appropriate for remote data access
and should be increased. I have been slowly discussing this over the
past few months with Wu Fengguang, the read-ahead maintainer.
Which IO scheduler did you use on the target? I guess deadline? If so,
you should try CFQ as well.
Thanks,
Vlad
For the related functionality, see http://lkml.org/lkml/2009/3/26/290
> Has AEN not been replaced with sense codes nowadays anyway?
AEN, in particular, is one of the ways to deliver those sense codes.
> Support for Asynchronous Event Notifications (AEN) + - - -
This is about the AEN infrastructure in the target's core.
> AEN for devices added/removed + - - -
> AEN for devices resized + - - -
Those two are about the ability to generate particular events.
> Support for Asynchronous Event Notifications (AEN) + - - -
This is about support for AEN delivery by the target driver.
Note that crypt performance for SCST was worse than that of STGT for
large read-ahead values.
Also, SCST performance on the crypt device was more or less the same
with read-ahead values of 256 and 16384. I wonder why performance
didn't increase here when the read-ahead value was increased. Could
anyone recheck whether it's the same on some other system?
> Which IO scheduler on the target did you use? I guess, deadline? If so,
> you should try with CFQ as well.
I used CFQ.
--
Tomasz Chmielewski
http://wpkg.org
Hello Tomasz,
How is it possible that for this test the read performance through
STGT (50.63 MB/s) was higher than the read performance on the target
(50.21 MB/s)? Are you sure that all read buffers were flushed before
this test was started?
Bart.
You're looking at the wrong columns:
         md0   crypt
STGT   50.63   32.52
RAW   102.17   50.21
--
Tomasz Chmielewski
http://wpkg.org
--
I have repeated the test for the non-encrypted case. Setup details:
* target: 2.6.29.1 kernel, 64-bit, Intel E8400 CPU @ 3 GHz, 4 GB RAM,
two ST3250410AS disks, with /dev/md3 set up in RAID-1 with a stripe
size of 32 KB, local reading speed of /dev/md3: 120 MB/s, I/O
scheduler: CFQ.
* initiator: 2.6.28.7 kernel, 64-bit, Intel E6750 CPU @ 2.66 GHz, 2 GB RAM.
* network: 1 Gbit/s Ethernet, two systems connected back to back via a
crossed cable.
Each test was repeated four times. Before each test the target caches
were dropped via the command "sync; echo 3 > /proc/sys/vm/drop_caches".
The following test has been run on the initiator:
sync; echo 3 > /proc/sys/vm/drop_caches; dd if=/dev/sdb of=/dev/null bs=64K count=100000
Results with read-ahead set to 256 on the initiator, in MB/s:
STGT  56.7 +/- 0.3
SCST  56.9 +/- 1.1
Results with read-ahead set to 16384 on the initiator, in MB/s:
STGT  59.9 +/- 0.1
SCST  59.5 +/- 0.0
In short: slightly better results with the larger read-ahead value, and
a difference well below 1% between the STGT and SCST results.
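The figures above can be cross-checked with a small Python sketch. What "+/-" denotes is not stated in the post, so the sketch simply treats it as a symmetric interval around the mean; that interpretation, the class and the function names are assumptions made for illustration.

```python
# Sketch: compare two "mean +/- spread" throughput results (MB/s).
# What "+/-" denotes is not stated in the post; it is treated here simply
# as a symmetric interval around the mean, which is an assumption.
from typing import NamedTuple


class Result(NamedTuple):
    mean: float
    spread: float


def percent_difference(a: Result, b: Result) -> float:
    """Absolute difference of the means, as a percentage of the larger mean."""
    return abs(a.mean - b.mean) / max(a.mean, b.mean) * 100.0


def intervals_overlap(a: Result, b: Result) -> bool:
    """True if [mean - spread, mean + spread] of a and b intersect."""
    return abs(a.mean - b.mean) <= a.spread + b.spread


MEASUREMENTS = {
    256: (Result(56.7, 0.3), Result(56.9, 1.1)),     # (STGT, SCST)
    16384: (Result(59.9, 0.1), Result(59.5, 0.0)),   # (STGT, SCST)
}

if __name__ == "__main__":
    for readahead, (stgt, scst) in MEASUREMENTS.items():
        print(f"read-ahead {readahead}: "
              f"difference {percent_difference(stgt, scst):.2f}%, "
              f"intervals overlap: {intervals_overlap(stgt, scst)}")
```

The overlap test is only a crude indication of whether a difference is larger than the reported run-to-run variation; it is not a statistical significance test.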
Bart.
This is a very big topic. In short, increasing RA alone isn't
sufficient: while the larger read-ahead chunk is being transferred over
the uplink, the backend disk can rotate too far, so continuing to read
from it means waiting for that rotation to complete.
Together with the RA increase, also try decreasing max_sectors_kb to
128K or even to 64K.
Applying the same changes on the target can also have quite a positive
effect.
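For reference, both settings can be applied through sysfs. The Python sketch below assumes the standard Linux block-layer attributes /sys/block/<dev>/queue/read_ahead_kb and /sys/block/<dev>/queue/max_sectors_kb, and that "blockdev --setra" counts 512-byte sectors; the helper names and the sysfs_root parameter (which allows a dry run against a scratch directory) are illustrative, and writing to the real /sys requires root.

```python
# Sketch: apply read-ahead and max_sectors_kb settings through sysfs.
# read_ahead_kb and max_sectors_kb under /sys/block/<dev>/queue/ are
# standard Linux block-layer attributes; the helper names and the
# sysfs_root parameter are illustrative.
from pathlib import Path


def sectors_to_kb(sectors: int) -> int:
    """Convert a "blockdev --setra" value (512-byte sectors) to KB."""
    return sectors * 512 // 1024


def tune_queue(device: str, readahead_sectors: int, max_sectors_kb: int,
               sysfs_root: str = "/sys") -> None:
    """Write both settings for one block device, e.g. device="sdb"."""
    queue = Path(sysfs_root) / "block" / device / "queue"
    (queue / "read_ahead_kb").write_text(f"{sectors_to_kb(readahead_sectors)}\n")
    (queue / "max_sectors_kb").write_text(f"{max_sectors_kb}\n")


if __name__ == "__main__":
    # Dry run: only show the unit conversion for the two read-ahead values
    # discussed in this thread. A real call (as root) would look like
    #   tune_queue("sdb", readahead_sectors=16384, max_sectors_kb=64)
    # where "sdb" is a placeholder device name.
    for sectors in (256, 16384):
        print(f"--setra {sectors} = read_ahead_kb {sectors_to_kb(sectors)}")
```

Keeping the sysfs root as a parameter makes it possible to try the helper on a temporary directory before touching a live system.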
> Could anyone recheck if it's the same on some other system?
>
>> Which IO scheduler on the target did you use? I guess, deadline? If so,
>> you should try with CFQ as well.
>
> I used CFQ.
You didn't apply io_context-XXX.patch, correct? With it you should see a
noticeable increase, like in http://scst.sourceforge.net/vl_res.txt.
> You didn't apply io_context-XXX.patch, correct? With it you should see a
> noticeable increase, like in http://scst.sourceforge.net/vl_res.txt.
I didn't apply this one.
I used a 2.6.26.x kernel, and io_context-XXX.patch exists only for
2.6.27, 2.6.28 and 2.6.29; the 2.6.27 version fails to apply to the
2.6.26.8 kernel (perhaps in a trivial way, I didn't check).
--
Tomasz Chmielewski
http://wpkg.org
Yes, there is no io_context patch for 2.6.26.x kernels, because it
isn't clear whether those kernels have the necessary functionality.
But it would be worth upgrading to 2.6.27.x. Have you seen
http://scst.sourceforge.net/vl_res.txt?
Vlad