I'm running some performance tests with open-iscsi (iops & throughput).
With open-iscsi over TCP, I see very low numbers:
* iops: READ - 20000, WRITE - 13000
* throughput: READ - 185 MB/s, WRITE - 185 MB/s
On open-iscsi.org, I see much higher numbers. Has anyone measured
open-iscsi performance lately? Can you share your numbers? Which
iscsid.conf file did you use?
Thanks,
Erez
You're talking only about throughput (not iops). Here's more info:
* benchmark: sgp_dd with 512K messages.
* Target: IPStor with a RAM disk (so the storage isn't the bottleneck)
* NIC: DDR IB HCAs (running IPoIB)
* I'm using the default iscsid.conf that comes with open-iscsi
Which target are you using? Are you using the default iscsid.conf?
Anything else? Did you measure iops?
Thanks,
Erez
Yeah, I have been running tests for a while. What test program are you
using, what IO sizes, what IO scheduler, kernel, and NIC module?
And what are the throughput values in? Is that 185 KB/s?
With smaller IOs I get really bad iop numbers. We just talked about this
on the list. For throughput if I use larger IOs I get this though (this
is from some tests I did when I was testing what I was putting in git):
disktest -PT -T30 -h1 -K32 -B256k -ID /dev/sdb -D 0:100
| 2007/11/18-12:54:17 | STAT | 4176 | v1.2.8 | /dev/sdb | Write throughput: 117615274.7B/s (112.17MB/s), IOPS 454.0/s.
disktest -PT -T30 -h1 -K32 -B256k -ID /dev/sdb
| 2007/11/18-12:49:58 | STAT | 3749 | v1.2.8 | /dev/sdb | Read throughput: 96521420.8B/s (92.05MB/s), IOPS 374.6/s.
Normally for reads I also get 112 MB/s. For some reason with my home
setup where I took those numbers, though, I am getting really bad read
numbers. I did a patch to switch the read path to always use a thread
and then the throughput went back to 112. Not sure if you are hitting
that problem, but I also noticed with some targets and workloads I need
to switch to noop instead of cfq or throughput drops to 3-12 MB/s even
with large IOs like above.
This is using the default values in the iscsid.conf.
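In case it helps, this is roughly how to check and switch the elevator (just a
sketch, assuming the iSCSI disk shows up as /dev/sdb like in the runs above;
adjust the device name for your setup):
cat /sys/block/sdb/queue/scheduler          # the scheduler in use is shown in brackets
echo noop > /sys/block/sdb/queue/scheduler  # switch this device to noop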
Oh yeah what kernel are you running? I ported linux-iscsi to a recent
kernel so I could test if there is a problem with open-iscsi or
something in the network, block or scsi layer (this is why I tried
switching open-iscsi to use a recv thread like how linux-iscsi does
instead of running from the network softirq).
I will try to tar up my linux-iscsi code, so you can try it out on your
kernel.
Oh ignore that. I am not seeing any write problems (have to switch to
noop io sched to get things going sometimes though). I just see slow
reads, so we are not hitting the same problem since both your numbers
look to be the same.
Now, I'm running the following benchmarks:
* For BW checks: sgp_dd bs=512 of=/dev/null if=/dev/sg2 bpt=1024
thr=8 count=20480000 time=1 dio=1 (and a similar command for write
ops)
* For small iops: sgp_dd bs=512 of=/dev/null if=/dev/sg2 bpt=2 thr=8
count=1000k time=1 (and a similar command for write ops)
I'm using v2.0-865.15 (user & kernel) from OFED 1.3 on SLES 10 sp1. I'm
running over IPoIB on a ConnectX IB HCA.
Now, I see the following numbers:
* BW: read - 260 MB/sec, write - 190 MB/sec
* iops: read - 27000, write - 17000
Actually, I'm trying to improve the performance of open-iscsi over iSER.
I just wanted to compare it to open-iscsi over TCP (because the
numbers on open-iscsi.org look very impressive).
> And what are the throughput values in? Is that 185 KB/s?
>
No, it was MB/sec.
CPU utilization is low on both sides (initiator & target).
Erez
Thanks!
--
Marcos G. M. Santos
SysAdmin - DIGILAB S.A.
Tel: 55 48 3234 4041
www.digilab.com.br
I have tried using open-iscsi and samba mounts and I'm getting around 10
MB/s of performance. I've asked in the HP forums:
And they say that this is the normal performance of iSCSI with SATA
drives. Is that true?
Thanks,
Miguel
10 MB/sec sounds really poor.
What kind of disk performance do you get locally on the target/server from
the SATA RAID array? You can try iometer or some other disk benchmarking tools.
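If the target box happens to run Linux, one rough way to get a local
sequential read number is something like this (only a sketch; assumes the
array shows up as /dev/sda, and uses direct IO so the page cache doesn't
inflate the result):
dd if=/dev/sda of=/dev/null bs=1M count=4096 iflag=direct   # ~4 GB sequential read, dd reports MB/s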
What kind of latency (ping roundtrip with 4k packets) do you have from the
initiator to the target? latency is one of the key factors limiting IOPS
you will get.
How's your network between the target and initiator? gigabit? jumbo frames?
flow control? what kind of switch are you using?
How's the performance with FTP/HTTP/CIFS between the target and the
initiator?
-- Pasi
ping -s 4048 10.0.6.41
4056 bytes from 10.0.6.41: icmp_seq=179 ttl=128 time=0.795 ms
--- 10.0.6.41 ping statistics ---
179 packets transmitted, 178 received, 0% packet loss, time 178261ms
rtt min/avg/max/mdev = 0.789/1.429/1.990/0.534 ms
> How's your network between the target and initiator? gigabit? jumbo frames?
> flow control? what kind of switch are you using?
>
Gigabit on both ends. As for the switch, I'm not in front of the machine so I
can't tell you until next week, but my guess is that it's a regular switch.
I think I'm going to try a CAT 6 crossover cable to rule out any
other issue. I think jumbo frames are deactivated, but how can I know?
> How's the performance with FTP/HTTP/CIFS between the target and the
> initiator?
>
It is the same as mentioned before: 10 MB/s
Miguel
You can try running tests with different block/request sizes.. with small
requests (512 bytes) you can test the maximum IOPS you can get, and with
large request sizes (64 kB and more) you can test the maximum throughput you
can get..
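For example, something along these lines on the initiator (only a sketch;
assumes the iSCSI disk shows up as /dev/sdb, adjust the device and counts to
taste):
dd if=/dev/sdb of=/dev/null bs=512 count=200000 iflag=direct   # small requests -> shows the IOPS limit
dd if=/dev/sdb of=/dev/null bs=128k count=8000 iflag=direct    # large requests -> shows the throughput limit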
> > What kind of latency (ping roundtrip with 4k packets) do you have from the
> > initiator to the target? latency is one of the key factors limiting IOPS
> > you will get.
> >
> ping -s 4048 10.0.6.41
>
> 4056 bytes from 10.0.6.41: icmp_seq=179 ttl=128 time=0.795 ms
>
> --- 10.0.6.41 ping statistics ---
> 179 packets transmitted, 178 received, 0% packet loss, time 178261ms
> rtt min/avg/max/mdev = 0.789/1.429/1.990/0.534 ms
>
You should try with 4096-byte packets.. but anyway, that's close enough.
1000ms / 0.795 ms == 1258 IOPS
That's the maximum number of IO operations per second you can get (assuming 1
outstanding IO):
1258 * 4k = 5032k
That's around 5 MB/sec.
> > How's your network between the target and initiator? gigabit? jumbo frames?
> > flow control? what kind of switch are you using?
> >
> Gigabit on both ends. As for the switch, I'm not in front of the machine so I
> can't tell you until next week, but my guess is that it's a regular switch.
> I think I'm going to try a CAT 6 crossover cable to rule out any
> other issue. I think jumbo frames are deactivated, but how can I know?
Yep, testing with a straight cable between the target and the initiator is a
good idea.
> > How's the performance with FTP/HTTP/CIFS between the target and the
> > initiator?
> >
> It is the same as mentioned before: 10 MB/s
>
Well, this is your problem. Fix it first.
If you can't get better throughput with FTP/HTTP/CIFS, how could you with
iSCSI?
Sounds like you're running at 100 Mbit/sec.
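On the Linux end you can verify what actually got negotiated with ethtool
(assuming the interface is eth0):
ethtool eth0    # check the Speed: and Duplex: lines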
-- Pasi
hrping -l 4096 -t 10.0.6.41
Statistics for 10.0.6.41:
[Aborting...]
Packets: sent=70, rcvd=70, error=0, lost=0 (0% loss) in 34.500313 sec
RTTs of replies in ms: min/avg/max: 0.304 / 0.331 / 0.426
From your calculation that means around 13 MB/s (1000 ms / 0.331 ms is about
3000 IOPS, and 3000 x 4 kB is roughly 12-13 MB/s).
Even on my home LAN, where I have a very cheap gigabit switch and CAT 5e
cabling (so I'd expect to get around 200 Mb/s), the hrping numbers work out
to around 15 MB/s.
Maybe we have something wrong with our switches (my boss says they are all
gigabit) or with our cabling.
Thanks!
Miguel
OK.
Remember, the 4k I mentioned was just an example, because that's the block
size often used by many filesystems..
If you do (or your application does) larger requests, you can easily get
much more throughput.. for example, 128k requests will give you a lot more..
And then you can have many outstanding IOs active at the same time..
depending on the queue depth used, IO elevator, etc..
iometer (on Windows) lets you choose the number of outstanding IOs..
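On Linux you can do the same kind of thing with disktest, along the lines of
the runs earlier in this thread (just a sketch, assuming the iSCSI disk is
/dev/sdb; -K sets the number of IO threads, so roughly the number of
outstanding IOs here, and -B the request size):
disktest -PT -T30 -h1 -K64 -B128k -ID /dev/sdb            # reads, 64 threads, 128k requests
disktest -PT -T30 -h1 -K64 -B128k -ID /dev/sdb -D 0:100   # same, but 100% writes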
But yeah, first fix the throughput to be good with FTP/HTTP/CIFS and then
start playing with iSCSI.
-- Pasi
hrping -L 4096 -t 10.0.7.41
RTTs of replies in ms: min/avg/max: 0.228 / 0.266 / 1.121
However, I have installed bing. It reports around 500 Mbps on the crossover
network.
The problem here is that both machines are running Windows; the open-iscsi
client would be running in a Debian virtual machine under MS Virtual Server.
Running bing from the host Virtual Server against the debian virtual
machine reduces the performance to around 60 Mbps. The tulip driver in
Debian for Virtual Server configures a Fast Ethernet network card.
Although Microsoft claims that the real performance is limited by the
physical network, apparently that is not the case for Debian virtual machines.
About the MTU, I get 1500 in the client part, in the initiator. Do I
need a bigger MTU? Is it possible to change that?
Thanks,
Miguel
500 Mbps sounds OK, but not very good.. you should get more. Have you done
any tcp/ip stack option tweaking? There are a lot of network memory/socket
options to tune for gigabit links/transfers.. (at least in Linux).
Also some network driver settings/parameters affect the performance.
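The usual suspects on the Linux tcp/ip side look something like this (the
values are only examples, not tuned recommendations for your setup; adjust
for your kernel and memory):
sysctl -w net.core.rmem_max=16777216                 # max socket receive buffer
sysctl -w net.core.wmem_max=16777216                 # max socket send buffer
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"    # TCP receive buffer min/default/max
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"    # TCP send buffer min/default/max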
> The problem here is that both machines are running Windows; the open-iscsi
> client would be running in a Debian virtual machine under MS Virtual Server.
>
Ouch. I have never measured the performance of a Linux VM under MS Virtual
Server.. so no idea about that.
> Running bing from the host Virtual Server against the debian virtual
> machine reduces the performance to around 60 Mbps. The tulip driver in
> Debian for Virtual Server configures a Fast Ethernet network card.
> Although Microsoft claims that the real performance is limited by the
> physical network, apparently that is not the case for Debian virtual machines.
>
Well, here you go.. if you only get 60 Mbit/sec between the host and the
Linux VM, that's the problem..
Can you change the emulated NIC to something else from the Linux VM?
Does MS have optimized (=paravirtualized) NIC drivers available for
use in the Linux VM?
Have you tried VMware Server or ESX?
> About the MTU, I get 1500 in the client part, in the initiator. Do I
> need a bigger MTU? Is it possible to change that?
>
I don't know what MS Virtual Server supports.. you might get better
performance with jumbo frames (9000 bytes), but not necessarily.. it depends
a lot on the switches used, etc.
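On the Linux side you can check and change the MTU like this (assuming the
interface is eth0, and that every hop in between also supports jumbo frames,
otherwise things will break):
ifconfig eth0             # the current MTU is printed in the output
ifconfig eth0 mtu 9000    # enable jumbo frames (go back with "mtu 1500")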
Good ethernet flow control implementation is more important for iSCSI than
jumbo frames. If you need to choose between flow control and jumbo frames,
choose flow control.. (some switches can't do both at the same time - and
many switches have bad flow control implementations - so be careful).
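ethtool can show and change the pause settings on Linux (assuming eth0 and a
driver that supports it; the switch port has to have flow control enabled too):
ethtool -a eth0                 # show current pause parameters
ethtool -A eth0 rx on tx on     # enable RX/TX flow control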
-- Pasi
If you have not already, try this:
http://www.open-iscsi.org/bits/open-iscsi-2.0-868-test1.tar.gz
I am not sure exactly why yet, but we found that with someone's setup we
saw IO taking 6 or 7 seconds to get sent. With that tarball the problem
is fixed and performance is normal.
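If it helps, the rough way to try a tarball like that is (only a sketch; check
the README in the tarball for the exact steps and for how to restart iscsid on
your distro):
tar xzf open-iscsi-2.0-868-test1.tar.gz
cd open-iscsi-2.0-868-test1
make
make install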