Problem with mounting ceph storage to coreos/kubernetes


Thomas Privat

Jul 14, 2016, 8:09:22 AM
to CoreOS User
Hello,

I'm having some trouble mounting Ceph storage into my Kubernetes cluster.
My Kubernetes cluster is based on 3 CoreOS hosts and 1 CentOS host.

Here are the steps I'm doing to mount the Ceph storage into Kubernetes:

1. I create a PersistentVolume.
2. I create a PersistentVolumeClaim.
3. I create a DaemonSet which starts a busybox image on each node with the claimed Ceph storage. I'm using a DaemonSet because I want to have this running on CoreOS and CentOS for comparison.

Here is my yaml file for the steps above:
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
data:
  key: QVFBUkySkZWVFE9PQ==
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "ceph"
spec:
  capacity:
    storage: "2Gi"
  accessModes:
    - "ReadWriteOnce"
  rbd:
    monitors:
      - "172.28.150.31:6789"
      - "172.28.150.32:6789"
      - "172.28.150.33:6789"
    pool: rbd
    image: foo2
    user: admin
    keyring: "/etc/ceph/ceph.client.admin.keyring"
    secretRef:
      name: "ceph-secret"
    fsType: ext4
    readOnly: false
  persistentVolumeReclaimPolicy: "Recycle"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: ceph-pod1
spec:
  template:
    metadata:
      labels:
        app: ceph-pod1
      name: ceph-pod1
    spec:
      containers:
        - image: busybox
          name: ceph-busybox
          command: ["sleep", "60000"]
          volumeMounts:
            - name: ceph-vol1
              mountPath: /usr/share/busybox
              readOnly: false
          securityContext:
            privileged: true
      volumes:
        - name: ceph-vol1
          persistentVolumeClaim:
            claimName: ceph-claim
      hostNetwork: true
      hostPID: true


After a while, the pod on the CentOS host is in Running state while the pods on the CoreOS machines remain in "ContainerCreating" state:

ceph-pod1-2om7i   0/1       ContainerCreating   0          8m
ceph-pod1-8dwrc   0/1       ContainerCreating   0          8m
ceph-pod1-dvip9   1/1       Running             0          8m
ceph-pod1-rk57h   0/1       ContainerCreating   0          8m
ceph-pod1-rsw6l   0/1       ContainerCreating   0          8m

So it seems that the storage can be mounted into the pods on the CentOS host (I checked inside the busybox container).
On the CentOS host there was one additional step I did: "yum install -y ceph-common". As we know, this is not possible on CoreOS :-)

From my point of view it seems that some Ceph utilities are missing on the CoreOS hosts. A look into the logs shows the error "rbd: failed to modprobe rbd error:executable file not found in $PATH":

Jul 14 12:03:27 coreos2 kubelet-wrapper[727]: E0714 12:03:27.360815     727 disk_manager.go:56] failed to attach disk
Jul 14 12:03:27 coreos2 kubelet-wrapper[727]: E0714 12:03:27.361464     727 rbd.go:215] rbd: failed to setup
Jul 14 12:03:27 coreos2 kubelet-wrapper[727]: E0714 12:03:27.361854     727 goroutinemap.go:155] Operation for "kubernetes.io/rbd/[172.28.50.231:6789 172.28.50.232:6789 172.28.50.233:6789]:foo2" failed. No retries permitted until 2016-07-14 12:03:27.861840454 +0000 UTC (durationBeforeRetry 500ms). error: MountVolume.SetUp failed for volume "kubernetes.io/rbd/[172.28.50.231:6789 172.28.50.232:6789 172.28.50.233:6789]:foo2" (spec.Name: "ceph") pod "f8606d63-49ba-11e6-8908-326261363032" (UID: "f8606d63-49ba-11e6-8908-326261363032") with: rbd: failed to modprobe rbd error:executable file not found in $PATH
Jul 14 12:03:27 coreos2 kubelet-wrapper[727]: I0714 12:03:27.895290     727 reconciler.go:253] MountVolume operation started for volume "kubernetes.io/rbd/[172.28.50.231:6789 172.28.50.232:6789 172.28.50.233:6789]:foo2" (spec.Name: "ceph") to pod "f8606d63-49ba-11e6-8908-326261363032" (UID: "f8606d63-49ba-11e6-8908-326261363032").

Is there something missing on the CoreOS hosts?
Or am I going about mounting external Ceph storage into Kubernetes the wrong way?

Cheers, Thomas







Seán C. McCord

Jul 14, 2016, 2:00:46 PM
to Thomas Privat, CoreOS User
It's not missing from the CoreOS _host_, it's missing from the kubelet container.  None of the Ceph utilities are distributed with the kubelet container.  You have to link them into place when running the kubelet.  It's worse, actually, because the kubelet has the paths hard-coded internally, so you have to match paths, as well.


--
Seán C McCord
CyCore Systems, Inc

Seán C. McCord

Jul 14, 2016, 2:02:49 PM
to Thomas Privat, CoreOS User
Sorry... you're right; that's the other blocker.  The kubelet tries to load the module even if it is already loaded, and the module is not present inside the kubelet, either (as it should not be).  Worse, the kubelet fails if it fails to load the module (again, regardless of whether it is already loaded).


Aaron Levy

Jul 15, 2016, 2:22:52 PM
to Seán C. McCord, Thomas Privat, CoreOS User
Yeah, there are a few hoops that need to be jumped through in the current state.

The kubelet will try to load the module, as Seán mentioned, so you would likely need to bind-mount /lib/modules and possibly /sbin/modprobe into the kubelet container (so it can try to load the module, even if it is loaded already). See the kubelet-wrapper docs for examples of how to set RKT_OPTS with additional volume options.
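For example, a rough sketch of what those extra volume options could look like in the kubelet systemd unit (the paths are illustrative; check where modprobe actually lives on your hosts and where the kubelet expects to find it):

Environment="RKT_OPTS=\
  --volume modprobe,kind=host,source=/usr/sbin/modprobe \
  --mount volume=modprobe,target=/usr/sbin/modprobe \
  --volume lib-modules,kind=host,source=/lib/modules \
  --mount volume=lib-modules,target=/lib/modules"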

Next, you would also need to make the rbd binary available inside the hyperkube container. This could be done by bind-mounting the binary from the host (following the same instructions above), but it's possible that the kubelet container does not have all the necessary shared libs, and this would be a bit fragile. See my (similar) response regarding nfs tools: https://github.com/coreos/coreos-kubernetes/issues/372#issuecomment-218533390

Another option would be to install ceph-common in the hyperkube container itself (in addition to mounting /lib/modules & /sbin/modprobe). See the discussion here: https://github.com/kubernetes/kubernetes/issues/23924#issuecomment-206898274
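As a rough sketch of that approach (assuming the hyperkube image you use is Debian-based so apt-get works; the image tag below is only an example):

FROM quay.io/coreos/hyperkube:v1.3.0_coreos.0
# Install the Ceph client tools (rbd, etc.) so the kubelet can find them in $PATH.
RUN apt-get update && \
    apt-get install -y ceph-common && \
    rm -rf /var/lib/apt/lists/*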

Looking forward, we have a general issue open to track improving how we can ship these types of tools: https://github.com/coreos/coreos-kubernetes/issues/287

Or if you would like, you can open an issue in https://github.com/coreos/kubernetes to possibly add ceph-common to the coreos hyperkube image.

Seán C. McCord

Jul 15, 2016, 2:33:21 PM
to Aaron Levy, Thomas Privat, CoreOS User
Simply bind-mounting rbd into hyperkube is, alas, not sufficient. The rbd tool has a whole host of library dependencies; critically, these include librbd and librados from Ceph and also libboost. If you can statically compile rbd, then that should be fine, but I don't know if the Ceph build environment provides for static compilation.
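You can see the dependencies in question on a host that already has ceph-common installed, with something like:

ldd $(which rbd) | grep -E 'librbd|librados|libboost'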

One of the github.com/ceph/ceph-docker folks produced a wrapper for rbd which executes rbd itself inside a container, but there is going to be a lot of directory mapping trickery to get that to work with the kubelet.
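A minimal sketch of that kind of wrapper (not the actual ceph-docker script; the image name and mounts here are assumptions) might look like:

#!/bin/sh
# Hypothetical wrapper: run the real rbd client from a Ceph container,
# sharing the host's /dev, /sys and Ceph config so "rbd map" can create
# the block device on the host.
exec docker run --rm --privileged --net=host \
  -v /dev:/dev -v /sys:/sys -v /etc/ceph:/etc/ceph \
  ceph/base rbd "$@"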

Thomas Privat

Jul 19, 2016, 7:33:15 AM
to CoreOS User, ule...@gmail.com, ta9...@gmail.com
Hi !

I tried installing ceph-common into the container itself and have the following outcome:

I created an Ubuntu image with the ceph-common package installed. When I start this container as a DaemonSet without mounting the Ceph storage via the YAML file, all containers come up as expected.
When I connect directly into a container on a node with "docker exec -it 73498ea97a9b /bin/bash" and try to map the Ceph storage via "rbd map rbd/foo -m 172.28.28.28:6789", I get the following output:

2016-07-19 11:28:48.617466 7fd82777a100 -1 did not load config file, using default settings.
rbd: sysfs write failed
rbd: map failed: (6) No such device or address

An "rbd info rbd/foo -m 172.28.28.28:6789" is working correctly:

2016-07-19 11:32:01.607591 7fc60a74e100 -1 did not load config file, using default settings.
rbd image 'foo':
    size 4096 MB in 1024 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.4ba0d238e1f29
    format: 2
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
    flags:

Any ideas what the problem with mapping the storage could be?

Cheers, Thomas

Seán C. McCord

Jul 19, 2016, 7:52:12 AM
to Thomas Privat, CoreOS User
It is complaining that it has no write access to /sys.  You'll need to give the container write access to both /dev and /sys.  The mapping of an RBD involves creating that rbd "block device" as a real, local device.
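In a Kubernetes pod spec that usually means hostPath volumes, roughly like this (a sketch only; the volume names are made up, and this would be merged into the existing DaemonSet spec):

      # Added to the existing container spec:
          volumeMounts:
            - name: host-dev
              mountPath: /dev
            - name: host-sys
              mountPath: /sys
      # Added to the existing pod spec:
          volumes:
            - name: host-dev
              hostPath:
                path: /dev
            - name: host-sys
              hostPath:
                path: /sys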

Thomas Privat

Jul 19, 2016, 8:09:48 AM
to CoreOS User, ta9...@gmail.com
Hi Seán,

I had a deeper look into this; a "strace rbd map rbd/foo -m 172.28.50.231:6789" shows exactly where the problem is:
>open("/sys/bus/rbd/add", O_WRONLY)      = 4
>write(4, "172.28.50.231:6789 name=admin,ke"..., 56) = -1 ENXIO (No such device or address)

So it seems that the container has no write permission to /sys/bus/rbd/add (as you mentioned).

I double checked with the command:
echo "172.28.28.28 name=admin,secret=AQDFSGSDFGSDFGSDFGF9Hd37SgJSI2JFVTQ== rbd foo" | tee /sys/bus/rbd/add

This works directly on the host but not inside the container.

So my last question (I hope :-)) is: how can I give write permission to /dev and /sys?
Do I have to add these directories to the volumes section in the YAML file with write permission?
Or is there another way to set write permissions on directories?
Btw: the pods are started with:
        securityContext:
          privileged: true

Doesn't this mean that the container has full root access?

Pavel Kutishchev

Sep 13, 2016, 2:55:26 PM
to CoreOS User, ule...@gmail.com, ta9...@gmail.com
Man, I found a way to avoid this issue: just create the RBD image with the option --image-feature layering.
This will help you.
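For example (the monitor address and size are just placeholders):

rbd create foo2 --size 2048 --image-feature layering -m 172.28.150.31:6789

This likely helps because the kernel rbd module on the hosts does not support the newer image features (exclusive-lock, object-map, fast-diff, deep-flatten) shown in the rbd info output above, and mapping such an image fails with "(6) No such device or address".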