iscsi over RBD performance tips?


Wyllys Ingersoll

Aug 21, 2014, 5:10:04 PM
to open-...@googlegroups.com

I'm looking for suggestions about maximizing performance when using an RBD backend (Ceph) over a 10GbE link.  In my testing, I see read throughput max out at about 100 MB/s for just about any block size above 4K (below 4K it becomes horribly slow), and write operations run at about 40 MB/s.

Using librados directly to read from the same backend pool/image yields much higher numbers, so the issue seems to be in the iscsi/bs_rbd backend.  Regardless of the data sizes being read, the max throughput I am seeing is about 80% slower than using librados directly.

Any suggestions would be much appreciated.

thanks,
  Wyllys

Michael Christie

Aug 22, 2014, 5:00:33 PM
to open-...@googlegroups.com
Are you using Linux for the initiator? If so, what throughput do you get from just the open-iscsi initiator connected to tgt with a RAM disk?

I just installed RBD here for work, so let me check it out. What I/O tool are you using, and if it is something like fio, could you post the arguments you used to run it?


--
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to open-iscsi+...@googlegroups.com.
To post to this group, send email to open-...@googlegroups.com.
Visit this group at http://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.

Wyllys Ingersoll

Aug 25, 2014, 6:48:54 PM
to open-...@googlegroups.com
Yes, using open-iscsi with tgt as the target side.

I used fio with the following job file.  I only used one job (thread) because I want to see the max that a single job can read at a time.  Even after maximizing MaxXmitDataSegmentLength and MaxRecvDataSegmentLength, I don't see much difference.

[default]
rw=randread
size=20g
bs=16m
ioengine=libaio
direct=1
numjobs=1
filename=/dev/sdb
runtime=600
write_bw_log=iscsiread


Then I ran fio as follows: 
$ fio iscsi.job

Gruher, Joseph R

Aug 25, 2014, 7:44:04 PM
to open-...@googlegroups.com
Try setting some queue depth, like 64. fio defaults to an iodepth of 1 if not specified, and a queue depth of 1 won't yield good performance.
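For example, the job file posted earlier could be changed along these lines (a sketch only; the device path and values are whatever fits your setup, and note that libaio only delivers real parallelism with direct=1, which the original job already uses):

```ini
[default]
rw=randread
size=20g
bs=256k
ioengine=libaio
direct=1
numjobs=1
iodepth=64
filename=/dev/sdb
runtime=600
write_bw_log=iscsiread
```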

Mike Christie

Aug 25, 2014, 9:39:50 PM
to open-...@googlegroups.com
Also see what Donald recommends for increasing the iscsi and device
queue depths. You will want the device and fio queue depths to be
similar. For bs, you should use something like 256K. I think you
also want --iodepth_batch to be around the queue depth.

To check that the iscsi settings actually got negotiated, run

iscsiadm -m session -P 2

after you log in.

On the tgt side, you would also want to increase the per-session queue
depth from 128 to whatever you set for node.session.cmds_max.

Also remember, open-iscsi is a little odd in that you cannot just change
the iscsid.conf settings and have them take effect at the next login. You
would have to do discovery and then relogin, or, if you want to set an
iscsi setting for a specific target portal, you would do

iscsiadm -m node -T target -p ip -o update -n mysetting_like_node.session.cmds_max -v 1024
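Put together, the full sequence might look like the following (a sketch only; the target IQN and portal IP are placeholders for your own, and 1024 is just an example value):

```
# Update the per-session command queue depth for one target portal
iscsiadm -m node -T iqn.2014-08.example:rbd-target -p 192.168.1.10 \
    -o update -n node.session.cmds_max -v 1024

# Log out and back in so the new value takes effect
iscsiadm -m node -T iqn.2014-08.example:rbd-target -p 192.168.1.10 --logout
iscsiadm -m node -T iqn.2014-08.example:rbd-target -p 192.168.1.10 --login

# Verify what was actually negotiated for the session
iscsiadm -m session -P 2
```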


I did not see your reply about whether iscsi alone was slow, or just
iscsi with rbd.

Wyllys Ingersoll

Aug 26, 2014, 8:55:18 AM
to open-...@googlegroups.com
Thanks for the tips. I'm changing the fio settings to see if I get an
improvement; I will post results later.

I'm mostly concerned with iscsi/rbd. I haven't yet isolated iscsi by
itself to a file, though I have run tests using straight librados
(non-iscsi) that show that better performance IS possible via rbd.
I'm trying to isolate the bottleneck between iscsi and tgt, but it may
end up being a combination of both. The tgt rbd backend is not using
the asynchronous I/O functions from librados, so I will also modify
that code to see if it improves things. I've been talking to the tgt
developers separately about the same issue.
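For reference, the synchronous-vs-asynchronous difference in the librados C API looks roughly like this. This is only an illustrative sketch of the AIO pattern, not the actual tgt bs_rbd code; the helper function, object names, and the assumption of an already-open `rados_ioctx_t` are all made up for the example, and error handling is omitted:

```
#include <stdlib.h>
#include <rados/librados.h>

/* Sketch: read `count` objects with librados AIO instead of one
 * blocking rados_read() at a time, so many ops are in flight at once.
 * Assumes `io` is an open rados_ioctx_t. */
static int read_objects_async(rados_ioctx_t io, const char **oids,
                              char **bufs, size_t len, int count)
{
    rados_completion_t *comps = calloc(count, sizeof(*comps));

    /* Queue all of the reads without waiting on any of them. */
    for (int i = 0; i < count; i++) {
        rados_aio_create_completion(NULL, NULL, NULL, &comps[i]);
        rados_aio_read(io, oids[i], comps[i], bufs[i], len, 0);
    }

    /* Now wait for all of them to finish. */
    for (int i = 0; i < count; i++) {
        rados_aio_wait_for_complete(comps[i]);
        rados_aio_release(comps[i]);
    }

    free(comps);
    return 0;
}
```

The blocking equivalent would issue one `rados_read()` per object and wait for each before starting the next, which keeps only one request on the wire at a time, the same effect as running fio with iodepth=1.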

thanks,
Wyllys

Mike Christie

Aug 26, 2014, 1:33:02 PM
to open-...@googlegroups.com
On 08/26/2014 07:55 AM, Wyllys Ingersoll wrote:
> I'm mostly concerned with iscsi/rbd. I haven't yet isolated iscsi by
> itself to a file, though I have run tests using straight librados

It is just easier to make sure iscsi is ok first since that is what we
are experts on here. If that is already slow, then we can start from
there by changing network/iscsi/scsi/block/tcp/fio tunables.

This is the first iscsi/rbd question on the list and I have just
installed it for the first time the other day.

Wyllys Ingersoll

Aug 26, 2014, 5:04:50 PM
to open-...@googlegroups.com
iscsi performance to a RAM-disk iscsi target, using the same fio parameters, yields about 1 GB/s throughput for both read and write operations, compared to about 400 MB/s (read) / 400 MB/s (write) using the RBD-based backend over the 10GbE link.

My fio job file looks like this (change 'randread' to 'randwrite' for the write test, and change the 'filename' device to switch between the ramdisk and rbd).

[default]
rw=randread
size=10g
bs=1m

ioengine=libaio
direct=1
numjobs=1
filename=/dev/sdb
runtime=600
write_bw_log=iscsiread
iodepth=256
iodepth_batch=256



Gruher, Joseph R

Aug 28, 2014, 9:24:43 PM
to open-...@googlegroups.com

Note that librados can access data from all nodes in the cluster, while iSCSI will funnel the data through whatever node or proxy is hosting the iSCSI target.  What does your overall network and storage layout look like?  Number of systems, number and type of disks, how are the Ceph journals set up, etc.?


Wyllys Ingersoll

Aug 29, 2014, 9:53:41 AM
to open-iscsi
I understand how rados and ceph work.  The host being used for iscsi testing has a direct 10GbE link to the cluster, which has 6 storage servers, each with twelve 4 TB drives, for a total of 72 OSDs and about 288 TB raw capacity. The journals are writing to disk (not SSD).

The problem I was trying to identify was the bottleneck in the iscsi-RBD chain, and I think it is in the tgt RBD backend, since all other combinations appear to perform OK. Bumping the number of threads that tgtd uses to service iSCSI commands seems to help some, though.

I realize tgtd is a separate project with different devs, but originally I was gathering info on both the initiator and target sides, trying to squeeze the most performance out of both.



