Re: [kubernetes/kubernetes] Error creating rbd image: executable file not found in $PATH (#38923)


Lucas Käldström

unread,
May 30, 2017, 2:32:50 AM5/30/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

cc @kubernetes/sig-storage-bugs

I think it's out of scope to include rbd, but it's a very unfortunate bug indeed.



Jan Šafránek

unread,
May 30, 2017, 4:03:41 AM5/30/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

I am working on kubernetes/features#278 - /usr/bin/rbd (and similar tools) could run in containers. On GKE you would run a DaemonSet with all the Ceph utilities, and you wouldn't need anything on the nodes or master(s).

Lucas Käldström

unread,
May 30, 2017, 4:24:30 AM5/30/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@jsafrane Would that incorporate dynamic storage provisioning for the controller manager as well?

Jan Šafránek

unread,
May 30, 2017, 4:39:56 AM5/30/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@luxas, probably not in alpha, but in the end yes, no /usr/bin/rbd on controller-manager host.

Andor Uhlár

unread,
Jun 15, 2017, 11:46:09 AM6/15/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

The hyperkube version of controller-manager includes a GlusterFS client (albeit a very out-of-date one), so why the disparity with Ceph? I understand that it can't and shouldn't support all the different storage provisioners, but I think it is a reasonable expectation to at least support the most common ones.

Jonas Kongslund

unread,
Jun 16, 2017, 3:57:55 AM6/16/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

As I understand it, the efforts being made at https://github.com/kubernetes-incubator/external-storage will eventually result in provisioner support in a clean and nicely separated fashion. There is currently nothing for Ceph RBD (see kubernetes-incubator/external-storage#99) but there is preliminary support for CephFS.

On a related note, the recent Kubernetes Community Meeting 20170615 featured a 20-minute demo of another provisioner in that project: Local Persistent Storage by Michelle Au (@msau42). That's expected to be included in Kubernetes 1.7, with more features planned for 1.8 and 1.9, so stuff is happening.

Guang Ya Liu

unread,
Jul 5, 2017, 4:54:08 AM7/5/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

I was confused here: why is rbd needed in the controller-manager container? The failure log below is from the kubelet, so it seems we need to put the rbd binary on the kubelet host rather than in the controller-manager?

E0705 08:31:20.223470   31869 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/rbd/9f3c21b9-615a-11e7-aea9-525400852aca-rbdpd\" (\"9f3c21b9-615a-11e7-aea9-525400852aca\")" failed. No retries permitted until 2017-07-05 08:33:20.223442369 +0000 UTC (durationBeforeRetry 2m0s). Error: MountVolume.SetUp failed for volume "kubernetes.io/rbd/9f3c21b9-615a-11e7-aea9-525400852aca-rbdpd" (spec.Name: "rbdpd") pod "9f3c21b9-615a-11e7-aea9-525400852aca" (UID: "9f3c21b9-615a-11e7-aea9-525400852aca") with: rbd: failed to modprobe rbd error:executable file not found in $PATH

Yecheng Fu

unread,
Jul 5, 2017, 5:43:05 AM7/5/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@gyliu513

hi, there are two parts:

  • Volume Provisioning: currently, if you want dynamic provisioning, the RBD provisioner in controller-manager needs access to the rbd binary to create a new image in the Ceph cluster for your PVC.
    external-storage plans to move volume provisioners from in-tree to out-of-tree; there will be a separate RBD provisioner container image with the rbd utility included (kubernetes-incubator/external-storage#200), so controller-manager will no longer need access to the rbd binary.
  • Volume Attach/Detach: kubelet needs access to the rbd binary to attach (rbd map) and detach (rbd unmap) RBD images on the node. If kubelet is running on the host, the host needs the rbd utility installed (install the ceph-common package on most Linux distributions); see the node-prep sketch below.
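
A minimal node-prep sketch, assuming a Debian/Ubuntu host (package and module names may differ on other distributions):

apt-get install -y ceph-common     # provides /usr/bin/rbd for the kubelet
modprobe rbd                       # kernel module the RBD plugin expects
which rbd && rbd --version         # verify the binary is on the kubelet's $PATH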

Guang Ya Liu

unread,
Jul 5, 2017, 5:59:09 AM7/5/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Ah, I see. I was not using dynamic provisioning, so there's no need to update controller-manager. But since most people now use hyperkube and all Kubernetes services use the same image, if we put the rbd binary into the hyperkube image it should work. Thanks @cofyc

Guang Ya Liu

unread,
Jul 5, 2017, 9:03:37 AM7/5/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@cofyc another question: if I run Kubernetes directly on the host rather than in containers, then once I've installed the rbd utility, Ceph + Kubernetes should work for both controller-manager and kubelet, right?

Simon Lepla

unread,
Jul 5, 2017, 11:11:25 AM7/5/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

We solved this by building our own image and adding the kube-controller-manager binary to it.

Dockerfile:

FROM ubuntu:16.04

ARG KUBERNETES_VERSION=v1.6.4

ENV DEBIAN_FRONTEND=noninteractive \
    container=docker \
    KUBERNETES_DOWNLOAD_ROOT=https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_VERSION}/bin/linux/amd64 \
    KUBERNETES_COMPONENT=kube-controller-manager

RUN echo 'deb http://download.ceph.com/debian-kraken xenial main' > /etc/apt/sources.list.d/download_ceph_com_debian_kraken.list

RUN set -x \
    && apt-get update \
    && apt-get install -y --allow-unauthenticated \
        ceph-common=11.2.0-1xenial \
        curl \
    && curl -L ${KUBERNETES_DOWNLOAD_ROOT}/${KUBERNETES_COMPONENT} -o /usr/bin/${KUBERNETES_COMPONENT} \
    && chmod +x /usr/bin/${KUBERNETES_COMPONENT} \
    && apt-get purge -y --auto-remove \
        curl \
    && rm -rf /var/lib/apt/lists/*

Note that we are requesting a specific version of the ceph-common package (11.2.0)
Feel free to use/adapt for your needs
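
A minimal build sketch (the image name/tag is just a placeholder; adjust the build arg to the Kubernetes version you run):

docker build --build-arg KUBERNETES_VERSION=v1.6.4 -t my-registry/kube-controller-manager-ceph:v1.6.4 .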

Yecheng Fu

unread,
Jul 5, 2017, 12:02:48 PM7/5/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@gyliu513 Yes

Yecheng Fu

unread,
Jul 13, 2017, 10:47:52 PM7/13/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

hi, guys,

Now you can avoid using a customized kube-controller-manager image; the external-storage out-of-tree RBD provisioner is merged, so you can use it instead. Here is a guide:

1. Deploy the standalone rbd-provisioner controller:

Note: v0.1.0 is currently the latest version; you can always check for the newest version here.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:v0.1.0"

2. Then configure a storage class:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd 
provisioner: ceph.com/rbd
parameters:
  monitors: <ceph monitors addresses>
  pool: <pool to use>
  adminId: <admin id>
  adminSecretNamespace: <admin id secret namespace>
  adminSecretName: <admin id secret name>
  userId: <user id>
  userSecretName: <user id secret name>
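
For reference, the admin and user secrets those parameters point at can be created roughly like this (a sketch only; the placeholder names match the class above, the user secret must live in the namespace of the PVCs, and using ceph auth get-key avoids base64/newline mistakes):

kubectl create secret generic <admin id secret name> -n <admin id secret namespace> \
  --type=kubernetes.io/rbd --from-literal=key="$(ceph auth get-key client.admin)"
kubectl create secret generic <user id secret name> \
  --type=kubernetes.io/rbd --from-literal=key="$(ceph auth get-key client.<user id>)"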

Now you can create a PVC using rbd as the storageClassName, e.g.:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-pvc
spec:
  accessModes: 
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi 
  storageClassName: rbd

See also kubernetes-incubator/external-storage#206 kubernetes-incubator/external-storage#200.

cc @sbezverk @gyliu513 @Platzii @ianchakeres @kongslund @jingxu97 @thanodnl @v1k0d3n

zhangqx2010

unread,
Jul 18, 2017, 1:30:38 AM7/18/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@cofyc While using this external provisioner, I got an error in the provisioner container:

kubectl logs rbd-provisioner-1825796386-01nqz -n kube-system |more
I0718 02:09:24.062514       1 main.go:70] Creating RBD provisioner with identity: 1e7aa80f-6b5e-11e7-a22f-faa5ad76f27a
I0718 02:09:24.064650       1 controller.go:407] Starting provisioner controller 1e7bfb96-6b5e-11e7-a22f-faa5ad76f27a!
E0718 02:09:24.066793       1 reflector.go:201] github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:411: Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:kube-system:default" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
E0718 02:09:24.066796       1 reflector.go:201] github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:412: Failed to list *v1.PersistentVolume: User "system:serviceaccount:kube-system:default" cannot list persistentvolumes at the cluster scope. (get persistentvolumes)

The issue may be caused by RBAC.
So I think the deployment should be modified like this:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:v0.1.0
"
      serviceAccountName: persistent-volume-binder  ### add service account here

Yecheng Fu

unread,
Jul 18, 2017, 1:42:33 AM7/18/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@zhangqx2010

Yes, if the default ServiceAccount does not have enough permissions to access the apiserver, you can add your own ServiceAccount for rbd-provisioner, which is the recommended way, especially in production.

Also, the namespace in the guide example can be changed if you don't want to deploy it in the kube-system namespace.

zhangqx2010

unread,
Jul 18, 2017, 3:01:29 AM7/18/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@cofyc
Why is the key not recognized by the system?
kubectl logs rbd-provisioner-4059846714-blk2s -n kube-system

E0718 06:44:50.606634       1 goroutinemap.go:166] Operation for "provision-kube-system/ceph-pvc[47e83dbe-6b84-11e7-8632-00505682fcc1]" failed. No retries permitted until 2017-07-18 06:45:54.606613353 +0000 UTC (durationBeforeRetry 1m4s). Error: failed to create rbd image: exit status 22, command output: rbd: image format 1 is deprecated
2017-07-18 06:44:50.604709 7f83a44f2d80 -1 auth: failed to decode key 'QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo=='
2017-07-18 06:44:50.604719 7f83a44f2d80  0 librados: client.admin initialization error (22) Invalid argument
rbd: couldn't connect to the cluster!

Yecheng Fu

unread,
Jul 18, 2017, 3:28:31 AM7/18/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

hi, @zhangqx2010

$ echo QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo== | base64 -d
AQAPHF9Z1MM/BxAA6RhGKnXfaqjTQ7Z3jgLCsQ==
base64: invalid input

The base64-encoded string you provided, QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo==, is invalid. It seems to have an extra = character at the end of the string.

$ echo QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo= | base64 -d
AQAPHF9Z1MM/BxAA6RhGKnXfaqjTQ7Z3jgLCsQ==

Please remove it and retry.

What's your k8s version (kubectl version)? In my local 1.6.4 environment, if the base64-encoded string is invalid, the apiserver does not accept it.

$ cat <<EOF > t.yaml
apiVersion: v1
kind: Secret
metadata:
  name: test-secret
type: "kubernetes.io/rbd"  
data:
  key: QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo==
EOF
$ kubectl apply -f t.yaml 
Error from server (BadRequest): error when creating "t.yaml": Secret in version "v1" cannot be handled as a Secret: [pos 92]: json: error decoding base64 binary 'QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo==': illegal base64 data at input byte 56

Vladimir Pouzanov

unread,
Jul 18, 2017, 3:40:50 AM7/18/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

You can use the kube-system/persistent-volume-binder serviceaccount with the rbd provisioner, although it still lacks the events/get permission (this doesn't seem to be critical for rbd-provisioner, though).

zhangqx2010

unread,
Jul 18, 2017, 4:15:13 AM7/18/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.1", GitCommit:"1dc5c66f5dd61da08412a74221ecc79208c2165b", GitTreeState:"clean", BuildDate:"2017-07-14T01:48:01Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

It was a typo in my first try. After correcting the key, the issue still exists.

cat ceph.client.admin.keyring 
[client.admin]
	key = AQAPHF9Z1MM/BxAA6RhGKnXfaqjTQ7Z3jgLCsQ==
	caps mds = "allow *"
	caps mon = "allow *"
	caps osd = "allow *"

echo 'AQAPHF9Z1MM/BxAA6RhGKnXfaqjTQ7Z3jgLCsQ==' | base64 
QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo=
W0718 08:10:53.418428       1 rbd_util.go:71] failed to create rbd image, output 2017-07-18 08:10:53.386902 7ff38f0fad80 -1 did not load config file, using default settings.
rbd: image format 1 is deprecated
2017-07-18 08:10:53.416330 7ff38f0fad80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2017-07-18 08:10:53.416431 7ff38f0fad80 -1 auth: failed to decode key 'QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo='
2017-07-18 08:10:53.416446 7ff38f0fad80  0 librados: client.admin initialization error (22) Invalid argument
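
Side note (an observation, not confirmed as the root cause here): the base64 value in that error decodes to the key plus a trailing newline, which is what echo without -n produces when generating the secret value:

echo 'AQAPHF9Z1MM/BxAA6RhGKnXfaqjTQ7Z3jgLCsQ==' | base64      # trailing newline gets encoded too
QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQo=
echo -n 'AQAPHF9Z1MM/BxAA6RhGKnXfaqjTQ7Z3jgLCsQ==' | base64   # -n encodes only the key itself
QVFBUEhGOVoxTU0vQnhBQTZSaEdLblhmYXFqVFE3WjNqZ0xDc1E9PQ==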

zhangqx2010

unread,
Jul 25, 2017, 4:35:29 AM7/25/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@cofyc
After successfully using your kubernetes-incubator/external-storage#200, the PV is created as claimed.

#kubectl get pv
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                STORAGECLASS   REASON    AGE
pvc-b3a244da-7111-11e7-bb10-00505682fcc1   2Gi        RWO           Delete          Bound     default/dy-rbd-c-1   rbd-dynamic              11m
pvc-b3ab0cd5-7111-11e7-bb10-00505682fcc1   1Gi        RWO           Delete          Bound     default/dy-rbd-c-2   rbd-dynamic              11m
# kubectl get pvc
NAME         STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
dy-rbd-c-1   Bound     pvc-b3a244da-7111-11e7-bb10-00505682fcc1   2Gi        RWO           rbd-dynamic    11m
dy-rbd-c-2   Bound     pvc-b3ab0cd5-7111-11e7-bb10-00505682fcc1   1Gi        RWO           rbd-dynamic    11m

But my deployment of one pod get errors:

Unable to mount volumes for pod "gocd-server-2-197958991-s1z14_default(b39c4442-7111-11e7-bb10-00505682fcc1)": timeout expired waiting for volumes to attach/mount for pod "default"/"gocd-server-2-197958991-s1z14". list of unattached/unmounted volumes=[dy-rbd-1 dy-rbd-2]

The provisioner logs:

#kubectl logs  rbd-provisioner-2785693406-7s3r0 -f
I0725 08:17:33.112058       1 provision.go:110] successfully created rbd image "kubernetes-dynamic-pvc-b5713380-7111-11e7-b630-46fbafb56e36"
I0725 08:17:33.112083       1 controller.go:801] volume "pvc-b3a244da-7111-11e7-bb10-00505682fcc1" for claim "default/dy-rbd-c-1" created
I0725 08:17:33.218255       1 provision.go:110] successfully created rbd image "kubernetes-dynamic-pvc-b5816b74-7111-11e7-b630-46fbafb56e36"
I0725 08:17:33.218305       1 controller.go:801] volume "pvc-b3ab0cd5-7111-11e7-bb10-00505682fcc1" for claim "default/dy-rbd-c-2" created
I0725 08:17:33.564263       1 controller.go:818] volume "pvc-b3a244da-7111-11e7-bb10-00505682fcc1" for claim "default/dy-rbd-c-1" saved
I0725 08:17:33.564283       1 controller.go:854] volume "pvc-b3a244da-7111-11e7-bb10-00505682fcc1" provisioned for claim "default/dy-rbd-c-1"
I0725 08:17:33.763928       1 controller.go:818] volume "pvc-b3ab0cd5-7111-11e7-bb10-00505682fcc1" for claim "default/dy-rbd-c-2" saved
I0725 08:17:33.763946       1 controller.go:854] volume "pvc-b3ab0cd5-7111-11e7-bb10-00505682fcc1" provisioned for claim "default/dy-rbd-c-2"
I0725 08:18:03.136684       1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/dy-rbd-c-1, timeout reached
I0725 08:18:03.151513       1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/dy-rbd-c-2, timeout reached

What could cause this mount timeout?
I noticed that the fstype is not specified anywhere. But when I tried to add fsType: ext4 to the storage class, I saw another error:

E0725 08:09:14.913472       1 goroutinemap.go:166] Operation for "provision-default/dy-rbd-c-1[81e6c6c7-7110-11e7-bb10-00505682fcc1]" failed. No retries permitted until 2017-07-25 08:09:15.913462177 +0000 UTC (durationBeforeRetry 1s). Error: invalid option "fsType" for ceph.com/rbd provisioner

Yecheng Fu

unread,
Jul 25, 2017, 5:13:02 AM7/25/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@zhangqx2010

There is no need to add fsType: ext4 to the storage class, and you should not add it.

Have you checked userId and secret key in userSecretName?

You can execute the rbd command on your minion nodes manually, e.g. rbd ls -m <ceph-monitor-addrs> -p <your-pool> --id <userId> --key=<ceph secret key of userId>, to make sure the node has the rbd utility installed and can access the Ceph cluster with the user ID and user secret you provided.

As for the RBD plugin in kubelet, it simply calls the rbd utility with the Ceph cluster information you provided to map the image onto the host.
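
Roughly what that boils down to on the node, as a sketch with placeholders (not the exact invocation the kubelet uses):

rbd map <pool>/<image-name> --id <userId> --key=<user key> -m <ceph-monitor-addrs>
rbd showmapped                 # the image should show up as /dev/rbdN
rbd unmap /dev/rbd0            # what detach boils down to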

zhangqx2010

unread,
Jul 25, 2017, 9:41:53 PM7/25/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@cofyc
Yes, you are right. I typed the wrong pool for the ceph auth add command. The problem was solved after correcting this. Thanks for helping!

Rob Mason

unread,
Aug 9, 2017, 6:30:06 AM8/9/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

I have done as suggested by @farcaller and am running my rbd-provisioner under the service account persistent-volume-binder.
The PVC now creates a volume and binds. However, I see errors about lack of access to events in the rbd-provisioner log:

E0809 10:26:06.901024 1 controller.go:682] Error watching for provisioning success, can't provision for claim "default/dbvolclaim": User "system:serviceaccount:kube-system:persistent-volume-binder" cannot list events in the namespace "default". (get events)

Also, the pod cannot mount the volume that was created:
kubectl describe po mysql-3673113032-7k059
Name: mysql-3673113032-7k059
Namespace: default
Node: knode22.robm.ammeon.com/10.168.170.22
Start Time: Wed, 09 Aug 2017 11:26:08 +0100
Labels: app=mysql
pod-template-hash=3673113032
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"mysql-3673113032","uid":"277b204b-7ced-11e7-add2-00163e371bd3","...
Status: Pending
IP:
Controllers: ReplicaSet/mysql-3673113032
Containers:
mysql:
Container ID:
Image: mysql:5.6
Image ID:
Port: 3306/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
MYSQL_ROOT_PASSWORD: password
Mounts:
/var/lib/mysql from mysql-persistent-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-702kp (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
mysql-persistent-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: dbvolclaim
ReadOnly: false
default-token-702kp:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-702kp
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message


1m 1m 2 default-scheduler Warning FailedScheduling PersistentVolumeClaim is not bound: "dbvolclaim" (repeated 4 times)
1m 1m 1 default-scheduler Normal Scheduled Successfully assigned mysql-3673113032-7k059 to knode22.robm.ammeon.com
1m 1m 1 kubelet, knode22.robm.ammeon.com Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "default-token-702kp"
1m 20s 8 kubelet, knode22.robm.ammeon.com Warning FailedMount MountVolume.SetUp failed for volume "pvc-276d654d-7ced-11e7-add2-00163e371bd3" : rbd: image kubernetes-dynamic-pvc-277d877d-7ced-11e7-b9a9-5e25ff659549 is locked by other nodes
Is it reasonable to assume that because the provisioner cannot access events, it is never informed that the PV was successfully created and hence never unlocks the PV for use?

zhangqx2010

unread,
Aug 9, 2017, 10:13:08 PM8/9/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@rtmie
You may create a service account for this provisioner, e.g. rbd-provisioner, then bind this SA to the cluster role system:controller:persistent-volume-binder.
When you deploy the provisioner pod, use the SA you just created.

apiVersion: extensions/v1beta1
kind: Deployment
    ...
    spec:
      serviceAccountName: rbd-provisioner
   ...
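
An equivalent imperative sketch of those two steps (the SA name is just an example):

kubectl create serviceaccount rbd-provisioner -n kube-system
kubectl create clusterrolebinding rbd-provisioner \
  --clusterrole=system:controller:persistent-volume-binder \
  --serviceaccount=kube-system:rbd-provisioner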

Rob Mason

unread,
Aug 10, 2017, 5:04:17 AM8/10/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@zhangqx2010 thanks for the suggestion. However, to me this is creating another SA with the same permissions as persistent-volume-binder.
In any case I tried it (hope I have configured it correctly!):

kind: ServiceAccount
apiVersion: v1
metadata:
  name: rbd-provisioner
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: rbd-provisioner
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:controller:persistent-volume-binder
  apiGroup: rbac.authorization.k8s.io

with a similar result in the deployment:

kubectl get po
NAME                     READY     STATUS              RESTARTS   AGE
mysql-3673113032-1021n   0/1       ContainerCreating   0          3m
kubectl describe po mysql-3673113032-1021n
Name:           mysql-3673113032-1021n
Namespace:      default
Node:           knode22.robm.ammeon.com/10.168.170.22
Start Time:     Thu, 10 Aug 2017 09:58:37 +0100
Labels:         app=mysql
                pod-template-hash=3673113032
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"mysql-3673113032","uid":"184ff164-7daa-11e7-add2-00163e371bd3","...
Status:         Pending
IP:
Controllers:    ReplicaSet/mysql-3673113032
 from default-token-702kp (ro)
Conditions:
  Type          Status
  Initialized   True 
  Ready         False 
  PodScheduled  True 
Volumes:
  mysql-persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  dbvolclaim
    ReadOnly:   false
  default-token-702kp:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-702kp
    Optional:   false
QoS Class:      BestEffort
Node-Selectors: <none>
Tolerations:    node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
                node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events:
  FirstSeen     LastSeen        Count   From                                    SubObjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                                    -------------   --------        ------                  -------
  3m            3m              2       default-scheduler                                       Warning         FailedScheduling        PersistentVolumeClaim is not bound: "dbvolclaim" (repeated 4 times)
  3m            3m              1       default-scheduler                                       Normal          Scheduled               Successfully assigned mysql-3673113032-1021n to knode22.robm.ammeon.com
  3m            3m              1       kubelet, knode22.robm.ammeon.com                        Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "default-token-702kp" 
  1m            1m              1       kubelet, knode22.robm.ammeon.com                        Warning         FailedMount             Unable to mount volumes for pod "mysql-3673113032-1021n_default(185505a6-7daa-11e7-add2-00163e371bd3)": timeout expired waiting for volumes to attach/mount for pod "default"/"mysql-3673113032-1021n". list of unattached/unmounted volumes=[mysql-persistent-storage]
  1m            1m              1       kubelet, knode22.robm.ammeon.com                        Warning         FailedSync              Error syncing pod
  3m            1m              9       kubelet, knode22.robm.ammeon.com                        Warning         FailedMount             MountVolume.SetUp failed for volume "pvc-1844c146-7daa-11e7-add2-00163e371bd3" : rbd: image kubernetes-dynamic-pvc-1854ce34-7daa-11e7-b463-ba9a43d562ef is locked by other nodes

zhangqx2010

unread,
Aug 10, 2017, 9:17:29 PM8/10/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Need more information. Did you use the pvc that you mentioned above? And please show your storageclass.

Rob Mason

unread,
Aug 11, 2017, 6:32:24 AM8/11/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Hi @zhangqx2010 ,
Storage class

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd
provisioner: ceph.com/rbd

parameters:
  monitors: 10.168.170.99:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: kube-system
  pool: kubernetes
  userId: kube
  userSecretName: ceph-secret-user
  imageFormat: "2"
  imageFeatures: layering

PVC and PV Status

kubectl get pvc
NAME         STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
dbvolclaim   Bound     pvc-0df7359a-7ddf-11e7-add2-00163e371bd3   5Gi        RWO           rbd            19h

kubectl get pv
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                STORAGECLASS   REASON    AGE
pvc-0df7359a-7ddf-11e7-add2-00163e371bd3   5Gi        RWO           Delete          Bound     default/dbvolclaim   rbd                      19h

I think the problem is on the node. I have just found an RBD error in the kubelet log:

Aug 10 16:14:53 knode22.robm.ammeon.com kubelet[540]: I0810 16:14:53.456634     540 rbd_util.go:141] lock list output "2017-08-10 16:14:53.450147 7fb1cb3ba7c0 -1 auth: failed to decode key 'XXXXXXXXXXXXXXXXXXXXXXXX\n'\n2017-08-10 16:14:53.450196 7fb1cb3ba7c0  0 librados: client.kube initialization error (22) Invalid argument\nrbd: couldn't connect to the cluster!\n"

The key, which I have x'ed out, is correctly read from the secret and matches the Ceph user setup. However, I don't like the look of the \n.

Rob Mason

unread,
Aug 11, 2017, 11:54:22 AM8/11/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Updated ceph-auth for user kube

sudo ceph auth get client.kube
exported keyring for client.kube
[client.kube]
        key = XXXXXXXXXXXXXXXXXXXXXXXXX
        caps mds = "allow * pool=kubernetes"
        caps mon = "allow r"
        caps osd = "allow * pool=kubernetes"

Different error on the kubelet. Can anyone provide the correct ceph auth caps?

lock list output "2017-08-11 15:47:59.852439 7f01ddf547c0  0 librados: client.kube authentication error (1) Operation not permitted\nrbd: couldn't connect to the cluster!\n"

Vladimir Pouzanov

unread,
Aug 11, 2017, 12:09:22 PM8/11/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

mon="allow r" (read mon to find osd)
osd="allow class-read object_prefix rbd_children, allow rwx pool=kubernetes" (read rbd_children prefix, full access to kubernetes pool)

Rob Mason

unread,
Aug 14, 2017, 6:41:11 AM8/14/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@farcaller - thanks for that. All good now!

Jakub Błaszczyk

unread,
Sep 1, 2017, 6:08:04 AM9/1/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@cofyc Thanks for the rbd image, it works wonders here :)
However, I am stuck on a similar timeout problem to the one @rtmie had.

My kubernetes version is 1.6.7 installed by kubeadm

PVC:
NAME       STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
ceph-pvc   Bound     pvc-a6abf36b-8ef9-11e7-a959-02000a1ba70c   5Gi        RWO           rbd            23m

PV:
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM              STORAGECLASS   REASON    AGE
pvc-a6abf36b-8ef9-11e7-a959-02000a1ba70c   5Gi        RWO           Delete          Bound     default/ceph-pvc   rbd                      23m

Logs from the rbd-provisioner:
I0901 09:41:47.382496 1 main.go:84] Creating RBD provisioner with identity: ceph.com/rbd
I0901 09:41:47.384600 1 controller.go:407] Starting provisioner controller c5c0a291-8ef9-11e7-9f54-8efcc95066fd!
I0901 09:41:47.387744 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[a6abf36b-8ef9-11e7-a959-02000a1ba70c]]
I0901 09:41:47.399585 1 leaderelection.go:156] attempting to acquire leader lease...
I0901 09:41:47.408363 1 leaderelection.go:178] successfully acquired lease to provision for pvc default/ceph-pvc
I0901 09:41:47.408480 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[a6abf36b-8ef9-11e7-a959-02000a1ba70c]]
I0901 09:41:47.480116 1 provision.go:110] successfully created rbd image "kubernetes-dynamic-pvc-c5c5a17f-8ef9-11e7-9f54-8efcc95066fd"
I0901 09:41:47.480189 1 controller.go:801] volume "pvc-a6abf36b-8ef9-11e7-a959-02000a1ba70c" for claim "default/ceph-pvc" created
I0901 09:41:47.485847 1 controller.go:818] volume "pvc-a6abf36b-8ef9-11e7-a959-02000a1ba70c" for claim "default/ceph-pvc" saved
I0901 09:41:47.485890 1 controller.go:854] volume "pvc-a6abf36b-8ef9-11e7-a959-02000a1ba70c" provisioned for claim "default/ceph-pvc"
I0901 09:41:49.415265 1 leaderelection.go:198] stopped trying to renew lease to provision for pvc default/ceph-pvc, task succeeded

Test pod in pending state:
Events:


FirstSeen LastSeen Count From SubObjectPath Type Reason Message


25m 24m 7 default-scheduler Warning FailedScheduling [SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "ceph-pvc", which is unexpected., SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "ceph-pvc", which is unexpected.]
24m 24m 1 default-scheduler Normal Scheduled Successfully assigned test-pod to cepf-slave-curious-tiger
22m 2m 10 kubelet, cepf-slave-curious-tiger Warning FailedMount Unable to mount volumes for pod "test-pod_default(a72f09cd-8ef9-11e7-a959-02000a1ba70c)": timeout expired waiting for volumes to attach/mount for pod "default"/"test-pod". list of unattached/unmounted volumes=[pvc]
22m 2m 10 kubelet, cepf-slave-curious-tiger Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"test-pod". list of unattached/unmounted volumes=[pvc]

Any ideas?

Jing Xu

unread,
Sep 1, 2017, 12:21:23 PM9/1/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@Demonsthere could you provide more details about your issue? How did you set up your pod, PVC and PV? The yaml files could be helpful. Also, if you have the kubelet log from the node, we could take a look to help debug.

Rob Mason

unread,
Sep 1, 2017, 7:25:05 PM9/1/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Can you take a look at the kubelet logs on the node where the pod is instantiated and look for anything related to ceph or RBD?

Jimmy Song

unread,
Sep 4, 2017, 3:32:39 AM9/4/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

I encountered the same issue.
kube-controller-manager logs:

Sep  4 15:25:36 bj-xg-oam-kubernetes-001 kube-controller-manager: W0904 15:25:36.032128   13211 rbd_util.go:364] failed to create rbd image, output
Sep  4 15:25:36 bj-xg-oam-kubernetes-001 kube-controller-manager: W0904 15:25:36.032201   13211 rbd_util.go:364] failed to create rbd image, output
Sep  4 15:25:36 bj-xg-oam-kubernetes-001 kube-controller-manager: W0904 15:25:36.032252   13211 rbd_util.go:364] failed to create rbd image, output
Sep  4 15:25:36 bj-xg-oam-kubernetes-001 kube-controller-manager: E0904 15:25:36.032276   13211 rbd.go:317] rbd: create volume failed, err: failed to create rbd image: fork/exec /usr/bin/rbd: invalid argument, command output:

Jakub Błaszczyk

unread,
Sep 4, 2017, 4:32:03 AM9/4/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@jingxu97 I am using the yamls from https://github.com/kubernetes-incubator/external-storage/tree/master/ceph/rbd

controller:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:latest"
        env:
        - name: PROVISIONER_NAME
          value: ceph.com/rbd
      # serviceAccountName: rbd-provisioner

pvc:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-pvc
spec:
  accessModes: 
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi 
  storageClassName: rbd

secrets:

---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-admin
  namespace: kube-system
type: "kubernetes.io/rbd"  
data:
  key: {{ ceph_key_admin | b64encode }}
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-user
type: "kubernetes.io/rbd"  
data:
  key: {{ ceph_key_user | b64encode }}

storageClass

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd
provisioner: ceph.com/rbd
parameters:
  monitors: {{ ceph_monitor_list }}
  pool: k8s-test
  adminId: k8s-admin
  adminSecretName: ceph-secret-admin
  adminSecretNamespace: kube-system
  userId: k8s-user
  userSecretName: ceph-secret-user
  imageFormat: "2"
  imageFeatures: layering

Rob Mason

unread,
Sep 4, 2017, 9:09:39 AM9/4/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@rootsongjc can you share the K8S deployment for the rbd-provisioner?

Rob Mason

unread,
Sep 4, 2017, 9:15:36 AM9/4/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@Demonsthere that looks like a problem on your Ceph server. Have you tried creating an RBD volume with the ceph client tools and mounting it?

Serguei Bezverkhi

unread,
Sep 4, 2017, 9:16:41 AM9/4/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@rtmie Please check this link: https://github.com/kubernetes-incubator/external-storage/tree/master/ceph/rbd
I have not tried it personally, but it should work.

Rob Mason

unread,
Sep 4, 2017, 9:19:22 AM9/4/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@sbezverk thanks, I do not have any issues with my setup, just answering someone else.

Jakub Błaszczyk

unread,
Sep 4, 2017, 9:45:30 AM9/4/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@rtmie I was able to mount the volume created by k8s (rbd map) manually

rbd map k8s-test/kubernetes-dynamic-pvc-2519acf4-8f12-11e7-9da6-4e6002ec91dd --id k8s-user
/dev/rbd1

Jerome Pin

unread,
Dec 21, 2017, 9:05:24 AM12/21/17
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@rootsongjc Maybe your Secret key isn't base64-encoded?

bamb00

unread,
Feb 5, 2018, 3:38:18 PM2/5/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Hi,

I'm running into the same error. Is there a workaround?

    Feb 05 20:05:53 minikube kubelet[3704]: E0205 20:05:53.570497    3704 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/rbd/[10.97.152.94:6790 10.104.200.62:6790 10.111.171.163:6790]:k8s-dynamic-pvc-ea65c0b0-0a99-11e8-a29b-0800272637bb-ea742ba1-0a99-11e8-9a4d-0242ac110004\"" failed. No retries permitted until 2018-02-05 20:07:55.570463699 +0000 UTC m=+40710.307964873 (durationBeforeRetry 2m2s). Error: "MountVolume.WaitForAttach failed for volume \"pvc-ea65c0b0-0a99-11e8-a29b-0800272637bb\" (UniqueName: \"kubernetes.io/rbd/[10.97.152.94:6790 10.104.200.62:6790 10.111.171.163:6790]:k8s-dynamic-pvc-ea65c0b0-0a99-11e8-a29b-0800272637bb-ea742ba1-0a99-11e8-9a4d-0242ac110004\") pod \"prometheus-sample-metrics-prom-0\" (UID: \"38ed9d87-0aa9-11e8-a29b-0800272637bb\") : **error: executable file not found in $PATH, rbd output:** "

Kubernetes: v1.9.1 (Minikube)
OS: Linux minikube 4.9.13 #1 SMP Thu Oct 19 17:14:00 UTC 2017 x86_64 GNU/Linux

Thanks in Advance.

Jakub Błaszczyk

unread,
Feb 6, 2018, 3:14:01 AM2/6/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@bamb00 I found that the Linux kernel required for Ceph is at least 4.10 on Ubuntu. Plus you need the kernel modules libceph and rbd enabled.
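
A quick sketch for checking a node (assumes root on the host):

uname -r                          # kernel version, 4.10+ per the note above
modprobe libceph && modprobe rbd  # load the modules if they are built but not loaded
lsmod | grep -E 'libceph|rbd'     # confirm they are present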

Simon Fredsted

unread,
Mar 5, 2018, 12:41:00 PM3/5/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Hi everybody,

After following this guide: http://docs.ceph.com/docs/master/start/kube-helm/

And then trying so many different things in this thread, I'm now stuck here:

Mar  5 17:28:04 ip-172-25-37-183 kubelet[2495]: E0305 17:28:04.279492    2495 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/rbd/[172.25.37.183:6789]:kubernetes-dynamic-pvc-7b423f8a-209a-11e8-a595-f29dbaa32808\"" failed. No retries permitted until 2018-03-05 17:28:12.279449171 +0000 UTC m=+16066.929917280 (durationBeforeRetry 8s). Error: "MountVolume.WaitForAttach failed for volume \"pvc-7b39bd6a-209a-11e8-a55b-02e13b5c0864\" (UniqueName: \"kubernetes.io/rbd/[172.25.37.183:6789]:kubernetes-dynamic-pvc-7b423f8a-209a-11e8-a595-f29dbaa32808\") pod \"mypod\" (UID: \"804e2b5a-209a-11e8-a55b-02e13b5c0864\") : error: exit status 1, rbd output: 2018-03-05 17:28:04.271178 7f05573bad40 -1 did not load config file, using default settings.\n2018-03-05 17:28:04.276321 7f05573bad40 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory\n2018-03-05 17:28:04.277661 7f05573bad40  0 librados: client.admin authentication error (1) Operation not permitted\nrbd: couldn't connect to the cluster!\n

I've manually installed the correct versions of ceph on each node of my k8s cluster – a little cumbersome, and weirdly, not documented anywhere (does anyone know why this isn't written down on docs.ceph.com?).

Then I had to use the service IP address in my Storage Class instead of the hostname.

And finally, it seems there are errors with authentication. I'm pretty sure I've created my userSecret correctly:

{
  "kind": "Secret",
  "apiVersion": "v1",
  "metadata": {
    "name": "pvc-ceph-client-key",
    "namespace": "default",
    "selfLink": "/api/v1/namespaces/default/secrets/pvc-ceph-client-key",
    "uid": "0c1a93eb-2080-11e8-a55b-02e13b5c0864",
    "resourceVersion": "18641341",
    "creationTimestamp": "2018-03-05T14:18:16Z"
  },
  "data": {
    "key": "QVFDeVVKMWF3ODdHS2hBQVBSc3NrRHYrMThnSVl0T3B1Qnlyb3c9PQo="
  },
  "type": "kubernetes.io/rbd"
}

Any pointers?

jianglingxia

unread,
Mar 6, 2018, 6:06:07 AM3/6/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

I have a problem; the PVC events show:

Warning ProvisioningFailed 55m (x6 over 1h) ceph.com/rbd rbd-provisioner-5b89b9bb7c-56jgz c4f87b20-2112-11e8-82cb-0242ac110005 (combined from similar events): Failed to provision volume with StorageClass "fast": failed to create rbd image: exit status 110, command output: 2018-03-06 09:40:33.361766 7fedad8c7d80 -1 did not load config file, using default settings.
rbd: image format 1 is deprecated
2018-03-06 09:40:33.411473 7fedad8c7d80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2018-03-06 09:45:33.411867 7fedad8c7d80 0 monclient(hunting): authenticate timed out after 300
2018-03-06 09:45:33.411971 7fedad8c7d80 0 librados: client.admin authentication error (110) Connection timed out


rbd: couldn't connect to the cluster!

Normal ExternalProvisioning 3m (x1839 over 2h) persistentvolume-controller waiting for a volume to be created, either by external provisioner "ceph.com/rbd" or manually created by system administrator

First I create the deployment in a non-RBAC environment:

[root@controller:/home/ubuntu/rbd]$ cat rbd-provisioner.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rbd-provisioner
  labels:
    app: rbd-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: rbd-provisioner
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:latest"
        imagePullPolicy: Never
        env:
        - name: PROVISIONER_NAME
          value: ceph.com/rbd

Then the pod is running:

[root@controller:/etc/ceph]$ kubectl get po
NAME                               READY     STATUS    RESTARTS   AGE
rbd-provisioner-5b89b9bb7c-56jgz   1/1       Running   0          2h

Second I create a storage class:

[root@controller:/home/ubuntu/rbd]$ cat storageclass-rbd.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: ceph.com/rbd
parameters:
  monitors: 192.168.1.115:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: rbd
  pool: rbd
  userId: admin
  userSecretName: ceph-secret

[root@controller:/home/ubuntu/rbd]$ kubectl get sc
NAME      PROVISIONER    AGE
fast      ceph.com/rbd   2h

Third I create a PVC:

[root@controller:/home/ubuntu/rbd]$ cat ceph-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-claim-sc
  namespace: rbd
spec:
  storageClassName: fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

[root@controller:/home/ubuntu/rbd]$ kubectl get pvc -n rbd
NAME            STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-claim-sc   Pending                                       fast           2h

[root@controller:/etc/ceph]$ kubectl describe pvc ceph-claim-sc -n rbd
Name: ceph-claim-sc
Namespace: rbd
StorageClass: fast
Status: Pending
Volume:
Labels:
Annotations: control-plane.alpha.kubernetes.io/leader={"holderIdentity":"c4f87b20-2112-11e8-82cb-0242ac110005","leaseDurationSeconds":15,"acquireTime":"2018-03-06T08:13:17Z","renewTime":"2018-03-06T09:45:18Z","lea...
volume.beta.kubernetes.io/storage-provisioner=ceph.com/rbd
Finalizers: []
Capacity:
Access Modes:
Events:
Type Reason Age From Message


Warning ProvisioningFailed 55m (x6 over 1h) ceph.com/rbd rbd-provisioner-5b89b9bb7c-56jgz c4f87b20-2112-11e8-82cb-0242ac110005 (combined from similar events): Failed to provision volume with StorageClass "fast": failed to create rbd image: exit status 110, command output: 2018-03-06 09:40:33.361766 7fedad8c7d80 -1 did not load config file, using default settings.
rbd: image format 1 is deprecated
2018-03-06 09:40:33.411473 7fedad8c7d80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2018-03-06 09:45:33.411867 7fedad8c7d80 0 monclient(hunting): authenticate timed out after 300
2018-03-06 09:45:33.411971 7fedad8c7d80 0 librados: client.admin authentication error (110) Connection timed out


rbd: couldn't connect to the cluster!

Normal ExternalProvisioning 3m (x1839 over 2h) persistentvolume-controller waiting for a volume to be created, either by external provisioner "ceph.com/rbd" or manually created by system administrator

The rbd-provisioner pod log is attached:
deploy.txt
Thanks all, and I look forward to your reply!

song

unread,
Mar 8, 2018, 9:06:32 PM3/8/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@jianglingxia did you create the ceph-secret in your namespace?

jianglingxia

unread,
Mar 8, 2018, 9:25:55 PM3/8/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Yes, thanks for your reply. I found the problem: the Ceph cluster version was not consistent with the minion's ceph-common version. Thanks very much!

huangxiaoping

unread,
May 2, 2018, 2:38:43 AM5/2/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention
Warning  ProvisioningFailed  1m    ceph.com/rbd rbd-provisioner-bc956f5b4-r2vg4 01c62837-4db5-11e8-b4c7-0a580af4040f  Failed to provision volume with StorageClass "ceph-storage": failed to create rbd image: exit status 2, command output: 2018-05-02 05:58:08.149335 7f58a112ad80 -1 did not load config file, using default settings.
2018-05-02 05:58:08.213998 7f58a112ad80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
rbd: error opening pool rdb: (2) No such file or directory
  Normal   Provisioning        56s (x2 over 1m)  ceph.com/rbd rbd-provisioner-bc956f5b4-r2vg4 01c62837-4db5-11e8-b4c7-0a580af4040f  External provisioner is provisioning volume for claim "harbor/adminserver-config-harbor-harbor-adminserver-0"
  Warning  ProvisioningFailed  56s               ceph.com/rbd rbd-provisioner-bc956f5b4-r2vg4 01c62837-4db5-11e8-b4c7-0a580af4040f  Failed to provision volume with StorageClass "ceph-storage": failed to create rbd image: exit status 2, command output: 2018-05-02 05:58:15.320918 7fbd961cdd80 -1 did not load config file, using default settings.
2018-05-02 05:58:15.386747 7fbd961cdd80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
rbd: error opening pool rdb: (2) No such file or directory
  Normal  ExternalProvisioning  7s (x4 over 13s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "ceph.com/rbd" or manually created by system administrator

I have this error. How can I solve it?

huangxiaoping

unread,
May 3, 2018, 2:25:38 AM5/3/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

solved

Davanum Srinivas

unread,
May 27, 2018, 1:35:23 PM5/27/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

/close

@sbezverk please reopen if necessary

k8s-ci-robot

unread,
May 27, 2018, 1:35:27 PM5/27/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Closed #38923.

Feres BERBECHE

unread,
May 28, 2018, 9:38:19 AM5/28/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

/reopen

k8s-ci-robot

unread,
May 28, 2018, 9:38:34 AM5/28/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@feresberbeche: you can't re-open an issue/PR unless you authored it or you are assigned to it.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Alexander M.

unread,
Dec 22, 2018, 2:29:43 AM12/22/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@cofyc

hi, guys,

Now you can avoid using a customized kube-controller-manager image; the external-storage out-of-tree RBD provisioner is merged, so you can use it instead. Here is a guide:

1. Deploy the standalone rbd-provisioner controller:

OK, it works. But what about PVCs/PVs already provisioned by the kubernetes.io/rbd provisioner? How can I move these volumes under the control of the external-storage controller? What is a clear upgrade path from 1.11 to 1.12?

Yecheng Fu

unread,
Dec 22, 2018, 2:59:58 AM12/22/18
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

But what about PVCs/PVs already provisioned by the kubernetes.io/rbd provisioner? How can I move these volumes under the control of the external-storage controller?

There is no automatic way, but you can do it manually (see the sketch below):

  • Create a new claim and pre-bind the old volume to it (set spec.volumeName on creation, otherwise the provisioner will provision a new volume for it)
  • Update the old volume's spec.claimRef field to point to the new claim
  • Delete the old claim
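
A rough sketch of those steps with hypothetical names (old volume pvc-<uid>, old claim mydata-old, new claim mydata-new in namespace default); verify the claimRef patch against your cluster before deleting anything:

# 1. Create the new claim pre-bound to the existing volume.
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mydata-new
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rbd
  volumeName: pvc-<uid>        # pre-bind so the provisioner does not create a new volume
EOF
# 2. Point the old volume's claimRef at the new claim (clearing the stale uid).
kubectl patch pv pvc-<uid> -p '{"spec":{"claimRef":{"name":"mydata-new","namespace":"default","uid":null,"resourceVersion":null}}}'
# 3. Delete the old claim once the volume shows Bound to the new one.
kubectl delete pvc mydata-old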

What is a clear upgrade path from 1.11 to 1.12?

No extra action required for RBD provisioning.

Robert Jerzak

unread,
Jan 7, 2019, 1:28:55 PM1/7/19
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

The external provisioner is great for provisioning volumes, but if you want to resize those volumes, resizing is handled by the in-tree volume plugin (kube-controller-manager):

Error expanding volume "volume-name" of plugin kubernetes.io/rbd : rbd info failed, error: executable file not found in $PATH

So if you want to resize volumes you still need the rbd binary in the kube-controller-manager image.

Michael FIG

unread,
Feb 3, 2019, 10:57:04 AM2/3/19
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

Tuyen Tran

unread,
Feb 28, 2019, 3:05:34 PM2/28/19
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@gyliu513

hi, there are two parts:

* Volume Provisioning: Currently, if you want dynamic provisioning, RBD provisioner in `controller-manager` needs to access `rbd` binary to create new image in ceph cluster for your PVC.
  [external-storage](https://github.com/kubernetes-incubator/external-storage) plans to move volume provisioners from in-tree to out-of-tree, there will be a separated RBD provisioner container image with `rbd` utility included ([kubernetes-incubator/external-storage#200](https://github.com/kubernetes-incubator/external-storage/issues/200)), then `controller-manager` do not need access `rbd` binary anymore.

* Volume Attach/Detach: `kubelet` needs to access `rbd` binary to attach (`rbd map`) and detach (`rbd unmap`) RBD image on node. If `kubelet` is running on the host, host needs to install `rbd` utility (install `ceph-common` package on most Linux distributions).

These are very useful, detailed explanations.

I installed Kubernetes 1.13.2 with Kubespray

For the first part (Volume Provisioning), the Kubespray deployment uses the external-storage rbd-provisioner container, so it works fine. I checked the logs and Ceph created the volume OK.

For the second part (Volume Attach/Detach), it failed because the rbd binary was not found.

My kubelet is running on the CoreOS host (not inside docker/rkt) because, due to security, the Kubespray kubelet_deployment_type default is 'host'. I could NOT install ceph-common when setting up these nodes. However, I could invoke the 'rbd' binary mounted from the ceph/ceph container.

Source:
https://stackoverflow.com/questions/37996983/invoking-rbd-docker-from-kubernetes-on-coreos-returns-fork-exec-invalid-argume

I still got an error stating that rbd could not find ceph.conf on these nodes. Where can I get this .conf dynamically?

qct

unread,
Oct 4, 2019, 6:34:36 AM10/4/19
to kubernetes/kubernetes, k8s-mirror-storage-bugs, Team mention

@childsb @saad-ali please help the community; as @r0bj said, there's still a problem when resizing PVCs.
