ceph


Valentin Atanassov

unread,
Jun 2, 2013, 2:26:17 AM6/2/13
to esos-...@googlegroups.com
Any plans to include Ceph libraries into ESOS?

Marc Smith

unread,
Jun 2, 2013, 2:45:55 PM6/2/13
to esos-...@googlegroups.com
Hi,

I have never heard of "ceph" before, so I took a look at the Wikipedia article:

I'm not sure how Ceph would fit with ESOS -- could you give me an example?


--Marc


On Sun, Jun 2, 2013 at 2:26 AM, Valentin Atanassov <valentin....@gmail.com> wrote:
Any plans to include Ceph libraries into ESOS?

--
You received this message because you are subscribed to the Google Groups "esos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to esos-users+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Valentin Atanassov

unread,
Jun 3, 2013, 12:37:25 AM6/3/13
to esos-...@googlegroups.com
Ceph’s RADOS Block Device (RBD) provides access to block device images that are striped and replicated across the entire storage cluster. Basically, it's a clustered storage system like GlusterFS. Ceph also provides a POSIX-compliant network file system that aims for high performance, large data storage, and maximum compatibility with legacy applications.

Jon Busey

unread,
Jun 8, 2013, 10:02:19 PM6/8/13
to esos-...@googlegroups.com
Ceph only really makes sense if you plan to have a sea of machines running ESOS, as Ceph's main selling point is expandability ("the last storage subsystem you ever migrate to..."). A single- or even dual-node setup, in the fashion ESOS is destined to run, is not the typical Ceph setup.

It's also still pretty dynamic, and the updates are frequent. I've seen it run on several racks of servers (very well) and grow dynamically as the compute needs increased. Its object storage is production-ready, but the block storage seems to still have some stability issues; i.e., the RADOS gateway is usable now but RBD didn't make our production cut.

I think a more typical location for Ceph would be spread across your compute nodes, run under an isolated container via cgroups if necessary, rather than on a dedicated device or couplet.

I hope this helps,

Jon

Adrian Lewis

unread,
Aug 13, 2014, 11:21:00 AM8/13/14
to esos-...@googlegroups.com
"Ceph only really makes sense if you plan to have a sea of machines running esos"

Or what would be pretty great is the ability to use ESOS as a gateway providing iSCSI/FC access to one or more Ceph RBD backing devices on a separate cluster. This would not mean using ESOS as a full Ceph node but adding just the client-side RBD functionality enabling ESOS to give access to a Ceph cluster via 'legacy' protocols.

One significant use case is providing iSCSI access to an RBD cluster for ESXi, XenServer, and Hyper-V, which don't and won't speak RBD natively. There's potentially some scope for XenServer to speak RBD natively, but I can't see MS or VMware providing native RBD access in the near future.

With two or three ESOS nodes using Pacemaker/Corosync, this would be a great solution to retain the redundancy that Ceph provides. It might take a bit of work to implement any sort of active-active HA, but simple active-passive would (I think) be relatively simple.

I'm fairly sure that Inktank/Red Hat are working on something similar (with either STGT or LIO), but a flash-based, run-from-RAM solution would be pretty cool.

Adrian Lewis

unread,
Aug 13, 2014, 11:25:15 AM8/13/14
to esos-...@googlegroups.com
Add in bcache locally on each ESOS node and you've got a very cool caching layer for the Ceph cluster as well that might appeal to those who can't afford 10G networking throughout.
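A rough sketch of how that could be wired up (device paths here are hypothetical, and the command is only echoed as a dry run rather than executed):

```shell
# Hypothetical devices: a local SSD as the cache, a mapped RBD image as backing.
SSD=/dev/sdb
RBD=/dev/rbd0
# make-bcache can format both in one call; the resulting /dev/bcache0 would
# then be the device exported by ESOS instead of the raw RBD device.
CACHE_CMD="make-bcache -C $SSD -B $RBD"
echo "$CACHE_CMD"   # dry run; run the command itself on a real node
```

After registering both devices via /sys/fs/bcache/register, reads and writes to /dev/bcache0 would be cached on the local SSD before hitting the network.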

Valentin Atanassov

unread,
Aug 20, 2014, 1:07:22 AM8/20/14
to esos-...@googlegroups.com
I know it has been a long time since my first post, but out of curiosity I compiled the SCST stack on Ubuntu 14.04 and installed a single-node Ceph cluster on it. Then I mounted an RBD image and presented it to an ESXi host via iSCSI (SCST). Interestingly, the single-node Ceph cluster performed better than Linux RAID 6 on the same node! I still think it is worth having the Ceph code in ESOS.

Marc Smith

unread,
Aug 20, 2014, 1:03:11 PM8/20/14
to esos-...@googlegroups.com
Hi,

At this point, I picture ESOS being the back-end storage used for your Ceph nodes... ESOS is the platform for the disk arrays on your SAN, and then you provision storage to your Ceph servers via the SAN (and ESOS).

I think the inclusion of NAS features (like Ceph, Gluster, Samba, NFS, etc.) into ESOS is inevitable. I believe in the enterprise storage community the trend is moving (and has been, really) to unified storage, and it's going to be the expectation.

I've read very little about Ceph (and related technologies), but the idea of object-based storage is intriguing. I think it's going to be a while before we start seeing our current and new applications written (updated/modified) to use this type of storage natively.

I'm sure the Ceph people would like to see all of the SAN disk arrays replaced with Ceph clusters that provide block-, network-, and object-based/level storage, and I think it's definitely a possibility at some point in the future. Until then, for low-latency, high-IOPS storage, running with the fewest number of layers/bridges is going to bring the best performance.


So, for now, we need to get a stable branch of ESOS (coming soon), and then we can consider and discuss adding NAS (and NAS-like) features to ESOS. I feel it's the right direction for this project, and it will ultimately increase its adoption.


--Marc



Valentin Atanassov

unread,
Aug 20, 2014, 3:35:13 PM8/20/14
to esos-...@googlegroups.com
Ceph has been incorporated into the mainline kernel since 2.6.34. In order to map Ceph block devices, two packages are needed in ESOS: ceph-common and ceph-fs-common. This would make ESOS a perfect storage gateway between Ceph clusters and all types of initiators (iSCSI, FC, FCoE, etc.). Furthermore, Ceph storage currently cannot be mounted directly on ESXi or Windows Hyper-V servers, so ESOS could become a unique storage consolidation gateway, including multipath capability.
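As a sketch of that gateway flow (the pool, image, and keyring names are assumptions, and the commands are only echoed here since they need a live cluster and SCST):

```shell
# Step 1: map the RBD image on the ESOS node (assumed pool/image names).
POOL=rbd
IMAGE=disk01
MAP_CMD="rbd map $IMAGE --pool $POOL --name client.admin -k /etc/ceph/ceph.client.admin.keyring"
# Step 2: hand the resulting block device to SCST's vdisk_blockio handler,
# which can then export it over iSCSI/FC like any other backing device.
SCST_CMD="echo \"add_device $IMAGE filename=/dev/rbd0\" > /sys/kernel/scst_tgt/handlers/vdisk_blockio/mgmt"
printf '%s\n' "$MAP_CMD" "$SCST_CMD"   # dry run only
```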

Marc Smith

unread,
Aug 21, 2014, 11:28:27 AM8/21/14
to esos-...@googlegroups.com
I'll bite... but since you're familiar, you can save me some time...

Looks like a couple kernel options:
CONFIG_CEPH_LIB
CONFIG_CEPH_FS

Both need to be enabled? Is that all on the kernel side?


User-land, it appears this is the latest source package:

Correct? Do the user-land tools need to match specific kernel versions (like DRBD), or is the latest fine? We're on Linux 3.14.16 currently.

In the "ceph-0.80.5" source package (assuming that's the right version) what configure options are needed to produce binaries that include what you need in those two distribution-specific packages (ceph-common and ceph-fs-common)?


--Marc



Adrian Lewis

unread,
Aug 21, 2014, 1:43:35 PM8/21/14
to esos-...@googlegroups.com
I'd leave CONFIG_CEPH_FS out for now, as CephFS is still experimental, and if you're just looking at ESOS as a "storage gateway between CEPH clusters and all types of initiators (iSCSI, FC, FCoE etc.)", I don't think it's really relevant. Regarding kernel version support, it looks like 3.14 and higher are well supported, apart from 3.15 which apparently has some form of issue: http://ceph.com/docs/master/start/os-recommendations/

All very exciting!

Valentin Atanassov

unread,
Aug 21, 2014, 2:34:06 PM8/21/14
to esos-...@googlegroups.com
CONFIG_CEPH_FS is not needed -- only if you intend to use FUSE. DRBD is also not needed. The kernel should be on the 3.14 branch, not 3.15. Two libraries are needed, librados2 and librbd1, plus python-ceph, for mapping RBD block devices.


Valentin Atanassov

unread,
Aug 21, 2014, 2:43:40 PM8/21/14
to esos-...@googlegroups.com
Ah... I also forgot the rbdmap start-up script for mapping RBD block devices...


Marc Smith

unread,
Aug 22, 2014, 2:04:09 PM8/22/14
to esos-...@googlegroups.com
I'm getting towards the end of this, but in my testing, I noticed something -- not sure if it's funny or if I should be scared:

[marc.smith@marc-ws ceph-0.80.5]$ make DESTDIR=/tmp/ceph install
[marc.smith@marc-ws tmp]$ du -skh ceph
1.9G ceph


Nearly 2 gigabytes!? Huh?

I'll see how far I can pare it down to get the minimum of what we need to map those block devices, but it may be a no-go...



Marc Smith

unread,
Aug 22, 2014, 5:06:36 PM8/22/14
to esos-...@googlegroups.com
I stripped everything and got the temporary install directory size down to 118M. Still insane, but doable. The static libraries will be left out too, and that should put us below the 100M mark; plus, we probably won't keep all of the binaries for now, since we're just after the block device support (from a Ceph server/node).

Can someone walk me through the commands they would use to make a Ceph block device visible on an ESOS host (examples)? This will help me weed out what isn't needed.


Thanks,

Marc

Valentin Atanassov

unread,
Aug 22, 2014, 6:28:54 PM8/22/14
to esos-...@googlegroups.com
It is actually very easy to map RBD block devices. Somebody has already written a good article about using rbdmap; have a look at this: http://www.sebastien-han.fr/blog/2013/11/22/map-slash-unmap-rbd-device-on-boot-slash-shutdown/
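The lifecycle in that article boils down to three commands (the image name is assumed; they are echoed as a dry run here since they need a live cluster):

```shell
IMAGE=rbd/disk01                # assumed pool/image name
MAP="rbd map $IMAGE"            # creates a /dev/rbdN device on success
SHOW="rbd showmapped"           # lists image-to-device mappings
UNMAP="rbd unmap /dev/rbd0"     # tears the mapping down at shutdown
printf '%s\n' "$MAP" "$SHOW" "$UNMAP"
```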


Valentin Atanassov

unread,
Aug 22, 2014, 7:35:42 PM8/22/14
to esos-...@googlegroups.com

Apparently we might not need to install the ceph-common binaries. Have a look at this: http://cephnotes.ksperis.com/blog/2014/01/09/map-rbd-kernel-without-install-ceph-common. It looks like we might only need the kernel module, which would save ~100 MB in the temporary install.

Marc Smith

unread,
Aug 25, 2014, 5:18:48 PM8/25/14
to esos-...@googlegroups.com
I did the initial commit for this (r677); I pared Ceph down to the 'rbd' binary (and required libraries). I have not created the rc/init script yet. After the new build posts, I'll test that and then work on the rc script.


--Marc



Valentin Atanassov

unread,
Aug 26, 2014, 1:50:02 AM8/26/14
to esos-...@googlegroups.com
Great news. The real test is actually whether the rbd module is able to load (modprobe rbd). Yesterday I tested a fresh Ubuntu 14.04 install; rbd loads without ceph-common and is able to map an RBD block device from the Ceph cluster.


Valentin Atanassov

unread,
Aug 26, 2014, 11:36:15 AM8/26/14
to esos-...@googlegroups.com
I have compiled r677, but the rbd kernel module is missing; we also need the rbdmap script and ceph.conf files.


Marc Smith

unread,
Aug 26, 2014, 11:52:40 AM8/26/14
to esos-...@googlegroups.com
CONFIG_BLK_DEV_RBD was not mentioned as a requirement previously, so I did not include it. I enabled it (built-in) and will commit this later today.


--Marc



Marc Smith

unread,
Aug 26, 2014, 5:01:20 PM8/26/14
to esos-...@googlegroups.com
I just committed this... r679. If you could also try doing it using the 'rbd' tool I'd appreciate it. Once you confirm, I'll make an rc script and add it.


--Marc

Valentin Atanassov

unread,
Aug 27, 2014, 1:09:41 AM8/27/14
to esos-...@googlegroups.com
Unfortunately it's still not working. You are missing rbd.ko under /lib/modules/. See below what a standard Ubuntu install with rbd has:

/etc/bash_completion.d/rbd

/etc/init.d/rbdmap

/etc/ceph/rbdmap

/lib/modules/3.15.9/kernel/drivers/block/rbd.ko

/usr/bin/rbd

/usr/lib/python2.7/dist-packages/rbd.py

/usr/lib/python2.7/dist-packages/rbd.pyc

If the rbd.ko module is compiled, you should have it under
/lib/modules/3.14.16.esos.prod/kernel/drivers/block/


Marc Smith

unread,
Aug 27, 2014, 8:58:01 AM8/27/14
to esos-...@googlegroups.com
CONFIG_BLK_DEV_RBD is not compiled as a module; it's built-in, which is why there is no 'rbd.ko' module.

Try just using the "rbd" tool on the command line... I looked at init-rbdmap and that appears to be all it's doing:
rbd map $DEV $CMDPARAMS

Looks like this is the syntax for that command:
rbd map [image-name] [-o | --options map-options] [--read-only]

I assume you'll need to create the /etc/ceph/rbdmap file; the init.d script (init-rbdmap) just loops over this file and runs the 'rbd map' command for each entry. So you can test it by running the 'rbd map' command directly, and then I'll add the rest of the pieces in once we're sure that's all it needs.

The format of /etc/ceph/rbdmap looks like this:
# RbdDevice             Parameters
#poolname/imagename     id=client,keyring=/etc/ceph/ceph.client.keyring

After using "rbd map", you should get a device node in /dev (somewhere).
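To illustrate, here is (roughly) what the init script does with one such line -- a runnable sketch with an assumed pool/image/keyring, building the command string rather than executing it (the id must match whatever your cluster's keyring actually uses):

```shell
# One example /etc/ceph/rbdmap entry (assumed names):
LINE="rbd/disk01 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring"
DEV=${LINE%% *}      # pool/image -> rbd/disk01
PARAMS=${LINE#* }    # comma-separated key=value pairs
OPTS=""
OLDIFS=$IFS; IFS=','
for P in $PARAMS; do
  OPTS="$OPTS --${P%%=*} ${P#*=}"   # id=admin -> --id admin, etc.
done
IFS=$OLDIFS
MAP_CMD="rbd map $DEV$OPTS"
echo "$MAP_CMD"      # the script would run this once per entry
```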


--Marc




Valentin Atanassov

unread,
Aug 27, 2014, 4:13:38 PM8/27/14
to esos-...@googlegroups.com

OK, I am using the following command to map an image from a real Ceph cluster:
rbd map disk01 --pool rbd --name client.admin -m 192.168.1.102 -k client.admin

And I am getting the following:
sh: /sbin/udevadm: not found
rbd: '/sbin/udevadm settle' failed! (32512)



Marc Smith

unread,
Aug 27, 2014, 4:19:39 PM8/27/14
to esos-...@googlegroups.com
Try adding this option to your command line string: --no-settle


--Marc



Valentin Atanassov

unread,
Aug 27, 2014, 4:49:03 PM8/27/14
to esos-...@googlegroups.com
With that I am getting: rbd: add failed: (5) Input/output error

Marc Smith

unread,
Aug 27, 2014, 4:56:45 PM8/27/14
to esos-...@googlegroups.com
Do this and see if there is anything interesting in the kernel logs: dmesg | tail

Post that here.


--Marc



Valentin Atanassov

unread,
Aug 27, 2014, 5:30:35 PM8/27/14
to esos-...@googlegroups.com
OK, finally success. I had to set the Ceph cluster to:
ceph osd crush tunables legacy
because dmesg showed:
[ 8131.240382] libceph: mon0 192.168.1.102:6789 socket error on read
feature set mismatch

This link helped me:

Valentin Atanassov

unread,
Aug 27, 2014, 5:43:33 PM8/27/14
to esos-...@googlegroups.com
Dmesg:
[ 8131.240371] libceph: mon0 192.168.1.102:6789 feature set mismatch, my 384a042a42 < server's 2384a042a42, missing 20000000000
[ 8131.240382] libceph: mon0 192.168.1.102:6789 socket error on read

I had to change the Ceph cluster:
ceph osd crush tunables legacy

So it is working fine now; I can see the device /dev/rbd0:

[root@localhost ceph]# ls /dev/rbd0
/dev/rbd0

Marc Smith

unread,
Aug 29, 2014, 2:38:27 PM8/29/14
to esos-...@googlegroups.com
I got this committed (r681); there is a default/example /etc/ceph/rbdmap file now, and the /etc/rc.d/rc.rbdmap script. Edit /etc/rc.conf and add "rc.rbdmap_enable=YES" to it so it starts on boot. Then edit the /etc/ceph/rbdmap file, and use "/etc/rc.d/rc.rbdmap start" to try it out. I included the '--no-settle' option in rc.rbdmap so it ignores the udev stuff.
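The enable step can be exercised safely against a scratch copy of rc.conf (on a real ESOS host you would edit /etc/rc.conf itself):

```shell
RC=$(mktemp)                          # stand-in for /etc/rc.conf
echo 'rc.rbdmap_enable=YES' >> "$RC"  # enable start-on-boot
grep -q '^rc.rbdmap_enable=YES$' "$RC" && STATUS=enabled
echo "rc.rbdmap: $STATUS"
rm -f "$RC"
```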

Let me know what (if anything) needs to be changed.


--Marc



Valentin Atanassov

unread,
Aug 29, 2014, 6:50:04 PM8/29/14
to esos-...@googlegroups.com
Where should the Ceph mon configuration (needed for connecting to the Ceph cluster) go? Normally it is defined in /etc/ceph/ceph.conf. We also need the auth keyring file to connect to the Ceph cluster; by default it lives in /etc/ceph.

Marc Smith

unread,
Aug 29, 2014, 8:07:24 PM8/29/14
to esos-...@googlegroups.com
Yeah, create the configuration files in the /etc/ceph directory. I can include a default (example) /etc/ceph/ceph.conf file.

--Marc



Valentin Atanassov

unread,
Aug 30, 2014, 6:32:21 AM8/30/14
to esos-...@googlegroups.com
Well, it does work. It only needs ceph.conf with the mons included and the client key file from the Ceph cluster. However, it does not start after a reboot; I need to manually issue /etc/rc.d/rc.rbdmap start. I think the boot script should start after the network interface is up.

Marc Smith

unread,
Aug 30, 2014, 8:53:57 AM8/30/14
to esos-...@googlegroups.com
That's great! For the no-start-on-reboot issue, did you add "rc.rbdmap_enable=YES" to the /etc/rc.conf file?

I added a default/sample /etc/ceph/ceph.conf file and will commit this sometime in the next few days. Do you mind providing the lines you need to add/change in ceph.conf and your rbdmap file so I can include an example in the wiki?


Thanks,

Marc





Valentin Atanassov

unread,
Aug 30, 2014, 10:50:16 AM8/30/14
to esos-...@googlegroups.com
Yes, I did. But I have experienced rbdmap not starting on boot with other Linux distributions too. What I think is that the network stack should start first, then immediately the rbdmap script, and then SCST. I am not sure whether we need some delay after the network is started. As for ceph.conf, it is very simple. Typically each Ceph cluster has one, three, or five monitors (nodes); this is needed to ensure quorum and avoid a split-brain scenario. See the example below:

[mon.0]
         host = node1
         mon addr = 192.168.1.101:6789
[mon.1]
         host = node2
         mon addr = 192.168.1.102:6789
[mon.2]
         host = node3
         mon addr = 192.168.1.103:6789

These lines are sufficient for rbdmap to discover the Ceph cluster. On another note, we also need the keyring file. It contains one line with the key string; see below:
[client.admin]
        key = AQC2WFlTYPvVHhAAuk1jxZ4u86EkMdeUyn6LYA==

These two files can simply be scp'd from any working Ceph cluster node.
That is all. Very happy with a working Ceph client available on ESOS. Keep up the good work!!!
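That copy step might look like this ("ceph-node1" is an assumed hostname; the commands are echoed as a dry run):

```shell
SRC=ceph-node1   # any existing monitor/OSD node in the cluster
CONF_CMD="scp root@$SRC:/etc/ceph/ceph.conf /etc/ceph/"
KEY_CMD="scp root@$SRC:/etc/ceph/ceph.client.admin.keyring /etc/ceph/"
printf '%s\n' "$CONF_CMD" "$KEY_CMD"   # dry run only
```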

Marc Smith

unread,
Sep 3, 2014, 9:19:15 AM9/3/14
to esos-...@googlegroups.com
Thank you; I updated the wiki article here:

If you have a moment, take a look at that post... I have "id=client" but should it actually be "id=client.admin" (does that align with the configuration I described)?


For the boot order, it already starts in the order you described: rc.network, rc.rbdmap, rc.scst (and other stuff in-between). Check the /var/log/boot.log file for start-up output. Also check the kernel logs after a boot; they may give some clues.


--Marc



Valentin Atanassov

unread,
Sep 18, 2014, 1:49:07 AM9/18/14
to esos-...@googlegroups.com
I have been on vacation... It should actually be client.admin.

Marc Smith

unread,
Sep 19, 2014, 11:18:50 AM9/19/14
to esos-...@googlegroups.com
Thanks, I updated the wiki page with the change.


--Marc
