Volume not detached on a node shutdown

Chakravarthy Nelluri

Oct 30, 2017, 3:17:04 PM
to Shay Berman, Cheng Xing, hek...@redhat.com, Jing Xu, jsaf...@redhat.com, Lior Tamary, Mohamed Mohamed, Saad Ali, kubernetes-sig-storage
Hi Shay,

AFAIK, TaintBasedEvictions is the only approach.

As I said, AFAIK this is true for all volume types in Kubernetes including GCE & AWS, nothing to do with Flex volume.

Copying sig storage team to see if anyone has details on other alternatives.

Thanks
~Chakri

On Oct 30, 2017, at 3:07 PM, Shay Berman <BS...@il.ibm.com> wrote:

Can you elaborate on why node-lost is not enough for k8s to trigger a detach?
I guess that by "pod is deleted" you mean the POD with the PVC that was running (as a Deployment with replica=1) on the node that Lior just shut down.
So I would expect that after this shutdown the POD will of course be down, and then Flex should detach the volume from the master node.

How can we delete a POD that was running on a shut-down minion? Is telling the customer to set TaintBasedEvictions=True the right approach to get dead (node-lost) PODs deleted?

Thanks
Shay



From:        Chakravarthy Nelluri <cha...@diamanti.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Lior Tamary <LI...@il.ibm.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, jsaf...@redhat.com, Mohamed Mohamed <mmoh...@us.ibm.com>, Saad Ali <saa...@google.com>, hek...@redhat.com
Date:        10/30/2017 08:39 PM
Subject:        Re: FlexVolume




If the old pod is in node-lost state, AFAIK the controller won’t detach. Detach is only triggered when the pod is deleted. You can configure taint based evictions to delete the pod and verify.

This has nothing to do with Flex volume. This is the regular controller workflow.
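
To illustrate the workflow described above, here is a toy model, not the actual controller code. It assumes the controller derives the set of attachments it wants to keep from the pod objects that still exist; the `Pod` class and `volumes_to_detach` function are hypothetical names, and the pod and node names are taken from the log excerpt in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    node: str
    volume: str


def volumes_to_detach(attached, pods):
    """Return the (volume, node) pairs that no existing pod object still needs.

    `attached` is the set of (volume, node) pairs that are currently attached.
    `pods` holds every pod object that still exists, including pods on a lost node.
    """
    desired = {(p.volume, p.node) for p in pods}
    return attached - desired


attached = {("bash-pvc", "liort-ub4")}
old_pod = Pod("bashx-854cfdf744-kml4k", "liort-ub4", "bash-pvc")
new_pod = Pod("bashx-854cfdf744-msmq9", "liort-ub2", "bash-pvc")

# The old pod object still exists (node lost), so nothing is a detach candidate.
before_delete = volumes_to_detach(attached, [old_pod, new_pod])

# Once the old pod object is deleted, its attachment becomes a detach candidate.
after_delete = volumes_to_detach(attached, [new_pod])
```

In this model the volume stays attached to the dead node for as long as the old pod object exists, which matches the symptom reported in the thread.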

Thanks
~Chakri


On Oct 30, 2017, at 2:33 PM, Shay Berman <BS...@il.ibm.com> wrote:

Hi Chakravarthy,

Just a clarification:

1. When Lior says "I kill one node", it is actually a shutdown of the minion.
2. We would expect to see a detach operation coming from the master to detach the volume, BUT it looks like no detach operation is ever sent to the flex driver.

Chakravarthy, please suggest some troubleshooting we can do on the node.
Is it related to the fact that we didn't implement the APIs mountdevice, unmountdevice and waitforattach? As far as we know, these APIs are not mandatory in order to get the remote detach\attach.
Since our flex driver is open source, you can even review it ->
https://github.com/IBM/ubiquity-k8s/blob/dev/controller/controller.go

Thanks for your responsiveness


Shay Berman
Product Owner & Software Engineer
Cloud Storage Solutions
IBM Systems

Phone: (+972) 3 689 7780 | E-mail: bs...@il.ibm.com
Visit the CSS community at: https://ibm.biz/BdsM8x






From:        Lior Tamary/Haifa/IBM
To:        Chakravarthy Nelluri <cha...@diamanti.com>
Cc:        Shay Berman <BS...@il.ibm.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, jsaf...@redhat.com, Mohamed Mohamed <mmoh...@us.ibm.com>, Saad Ali <saa...@google.com>
Date:        10/30/2017 07:51 PM
Subject:        Re: FlexVolume




Hi Chakravarthy,
We are running Kubernetes 1.8.1.
The old pod was in node-lost state.
We (running a plain default installation) didn't configure any TaintBasedEvictions.
Thank you.


Lior Tamary
Software Developer
Cloud Storage Solutions
IBM Systems

Phone: (+972) 3 689 7339 | E-mail: li...@il.ibm.com







From:        Chakravarthy Nelluri <cha...@diamanti.com>
To:        Lior Tamary <LI...@il.ibm.com>
Cc:        Shay Berman <BS...@il.ibm.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Mohamed Mohamed <mmoh...@us.ibm.com>, Saad Ali <saa...@google.com>, jsaf...@redhat.com
Date:        30/10/2017 18:46
Subject:        Re: FlexVolume




Hi Lior,

- What is the version of K8S you are using?
- Do you see the old Pod still lying around in terminating state, or is it completely gone?
- Did you configure any “TaintBasedEvictions”?

+ Jan

Thanks
~Chakri

On Oct 30, 2017, at 12:16 PM, Lior Tamary <LI...@il.ibm.com> wrote:

Hi Chakravarthy,
We've implemented init, mount, unmount, attach, detach and isattached (and currently left mountdevice, unmountdevice and waitforattach unimplemented).
We've also configured the master controller to do the attach and detach on behalf of the nodes.
Now in my testing env, when I kill one node, I see the pod being created on the other node, but it fails with "Volume is already exclusively attached to one node".
I see the controller call isattached with the volume name and the dead node's name, and we reply with true because the volume is attached.
But nothing else happens: no detach request.
Any idea what we are missing?



I1030 14:23:58.621184       1 operation_generator.go:278] AttachVolume.Attach succeeded for volume "bash-pvc" (UniqueName: "flexvolume-ibm/ubiquity-k8s-flex/bash-pvc") from node "liort-ub4"
W1030 14:23:58.621449       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:24:05.058479       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:24:09.640287       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:25:39.912980       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:26:25.251278       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume ibm-ubiquity-db
W1030 14:26:25.251420       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:29:25.252289       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume ibm-ubiquity-db
W1030 14:29:25.252436       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:32:25.254821       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:32:25.254981       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume ibm-ubiquity-db
I1030 14:33:15.983785       1 event.go:218] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"liort-ub4", UID:"605f84cf-bd4b-11e7-a93a-005056a46507", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeNotReady' Node liort-ub4 status is now: NodeNotReady
W1030 14:33:15.995311       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:35:25.256548       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:35:25.257843       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume ibm-ubiquity-db
I1030 14:37:36.081888       1 event.go:218] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"liort-ub4", UID:"605f84cf-bd4b-11e7-a93a-005056a46507", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'DeletingAllPods' Node liort-ub4 event: Deleting all Pods from Node liort-ub4.
W1030 14:37:36.092035       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
I1030 14:37:36.093247       1 event.go:218] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"bashx-854cfdf744-kml4k", UID:"f77039c6-bd7d-11e7-a93a-005056a46507", APIVersion:"v1", ResourceVersion:"31232", FieldPath:""}): type: 'Normal' reason: 'NodeControllerEviction' Marking for deletion Pod bashx-854cfdf744-kml4k from Node liort-ub4
W1030 14:37:36.101965       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
I1030 14:37:36.126405       1 node_controller.go:438] Pods awaiting deletion due to Controller eviction
I1030 14:37:36.126768       1 event.go:218] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"default", Name:"bashx-854cfdf744", UID:"f76d7926-bd7d-11e7-a93a-005056a46507", APIVersion:"extensions", ResourceVersion:"31234", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: bashx-854cfdf744-msmq9
W1030 14:37:36.133242       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:37:36.149822       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1030 14:37:36.202285       1 reconciler.go:267] Multi-Attach error for volume "bash-pvc" (UniqueName: "flexvolume-ibm/ubiquity-k8s-flex/bash-pvc") from node "liort-ub2" Volume is already exclusively attached to one node and can't be attached to another
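
When troubleshooting a log like the one above, it can help to pull out only the attach and Multi-Attach events per volume and node. The following is a rough sketch that assumes the line layout shown in this excerpt (severity letter, MMDD, time, pid, file:line, message); the regular expressions and the `summarize` function are illustrative and are not part of any Kubernetes tooling.

```python
import re

# Layout assumed from the excerpt: severity+MMDD, time, pid, file:line], message.
LINE_RE = re.compile(r'^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ ([\w.-]+:\d+)\] (.*)$')
ATTACH_RE = re.compile(r'AttachVolume\.Attach succeeded for volume "([^"]+)".*from node "([^"]+)"')
MULTI_RE = re.compile(r'Multi-Attach error for volume "([^"]+)".*from node "([^"]+)"')


def summarize(lines):
    """Collect attach successes and Multi-Attach errors as (time, volume, node) tuples."""
    attached, multi_attach = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        timestamp, message = m.group(3), m.group(5)
        for pattern, bucket in ((ATTACH_RE, attached), (MULTI_RE, multi_attach)):
            hit = pattern.search(message)
            if hit:
                bucket.append((timestamp, hit.group(1), hit.group(2)))
    return attached, multi_attach


sample = [
    'I1030 14:23:58.621184       1 operation_generator.go:278] AttachVolume.Attach succeeded for volume "bash-pvc" (UniqueName: "flexvolume-ibm/ubiquity-k8s-flex/bash-pvc") from node "liort-ub4"',
    'W1030 14:37:36.202285       1 reconciler.go:267] Multi-Attach error for volume "bash-pvc" (UniqueName: "flexvolume-ibm/ubiquity-k8s-flex/bash-pvc") from node "liort-ub2" Volume is already exclusively attached to one node and can\'t be attached to another',
]
attached, multi_attach = summarize(sample)
```

A volume that shows up in `attached` for one node and in `multi_attach` for another, with no detach in between, is the situation described in this message.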










From:        Chakravarthy Nelluri <cha...@diamanti.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Saad Ali <saa...@google.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Lior Tamary <LI...@il.ibm.com>, Mohamed Mohamed <mmoh...@us.ibm.com>
Date:        24/10/2017 17:38
Subject:        Re: FlexVolume




Hi Shay,

AFAIK there are various deployment models for K8S today. There is no defined standard yet, and there are still discussions on standardizing it with kubeadm. K8S has lots of features, and each deployment model may or may not choose to support some of them.

Most of them, like CoreOS and OpenShift, already support Flex volume. If you are using a particular deployment model, I would say ask the deployer for Flex volume support.

Also, what deployment model are you using right now? I can check and see if they support it.

Thanks
~Chakri

On Oct 23, 2017, at 12:52 AM, Shay Berman <BS...@il.ibm.com> wrote:

Thanks Saadi,

CSI is on our roadmap, but as far as I know the CSI story will be ready for customers only in 2Q 2018 (please correct me if I am wrong), and we need a solution for IBM storage ASAP.

The main reason we want to implement controller attach\detach is to support the crashed-minion scenario, so that all the volumes from the crashed minion are detached automatically by the master controller and then moved to another minion. This is a really basic scenario that we need to address, and controller attach\detach is the only native way we can handle it. Again, please correct me if I am wrong here. So "dropping the controller attach" is out of the question, unless you can recommend another solution to support the crashed-minion scenario?

You mentioned the case where kube-controller-manager comes as a POD; in which case does it not come as a POD? I followed the k8s installation instructions and I see it as a POD on 1.8 and 1.7.
Do you know when kube-controller-manager does not come as a POD?

Could you please send me a link to the CSI deployment story, so I can provide my feedback to the community as well.

Thanks.








From:        Saad Ali <saa...@google.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Chakravarthy Nelluri <cha...@diamanti.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Mohamed Mohamed <mmoh...@us.ibm.com>, Lior Tamary <LI...@il.ibm.com>
Date:        10/23/2017 12:40 AM
Subject:        Re: FlexVolume




Shay, as you've discovered controller attach can get tricky with Flex.

The DaemonSet that Cheng created just drops the Flex driver on to the host machine, and it assumes that the master is schedulable. If it is not, or your kube-controller-manager is deployed in a container (pod), then you need to modify your deployment system to make sure it populates the Flex driver in the correct path in the container--you can do this by creating a HostPath. If you don't control the deployment mechanism, then you'll have to depend on your users to do this. Which I agree is pretty painful.

If you can not live with that, then consider dropping the controller attach and put your attach method inside the mount step. This has some drawbacks, but it simplifies your deployment as you only need to be able to drop the Flex driver on to the host, not on to the master.

Otherwise, keep an eye out for CSI, which will have a much better deployment story.




On Sun, Oct 22, 2017 at 12:02 PM, Shay Berman <BS...@il.ibm.com> wrote:
In k8s 1.6 or 1.7 I don't see this hostPath (to the plugins) inside /etc/kubernetes/manifests/kube-controller-manager.yaml.

Chakravarthy \ Saadi, so is your recommendation to ask the customer to update kube-controller-manager.yaml with the plugin directory on each master node?
(I didn't see it in the flex design ->
https://github.com/kubernetes/kubernetes/issues/20262)

Thanks










From:        Lior Tamary/Haifa/IBM
To:        Shay Berman/Israel/IBM@IBMIL
Cc:        Chakravarthy Nelluri <cha...@diamanti.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Mohamed Mohamed <mmoh...@us.ibm.com>, Saad Ali <saa...@google.com>
Date:        10/22/2017 04:54 PM
Subject:        Re: FlexVolume



In version 1.8 it exists by default in /etc/kubernetes/manifests/kube-controller-manager.yaml:


- hostPath:
    path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
    type: DirectoryOrCreate
  name: flexvolume-dir
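
For the 1.6 and 1.7 manifests that lack this entry, the change amounts to adding one volume and one matching volumeMount to the controller-manager pod spec. Below is a minimal sketch that works on the manifest as an already-parsed dictionary; `ensure_flex_volume` is a hypothetical helper, the sample manifest is abbreviated, and the `type` field is copied from the 1.8 snippet above and may not be accepted by older versions.

```python
FLEX_DIR = "/usr/libexec/kubernetes/kubelet-plugins/volume/exec"
VOLUME_NAME = "flexvolume-dir"


def ensure_flex_volume(pod_manifest):
    """Add the Flex plugin hostPath volume and mount to a static pod manifest dict.

    Returns True when the manifest was changed, False when the volume was already there.
    """
    spec = pod_manifest.setdefault("spec", {})
    volumes = spec.setdefault("volumes", [])
    if any(v.get("name") == VOLUME_NAME for v in volumes):
        return False
    volumes.append({
        "name": VOLUME_NAME,
        "hostPath": {"path": FLEX_DIR, "type": "DirectoryOrCreate"},
    })
    # Mount the same host directory at the same path inside every container.
    for container in spec.get("containers", []):
        container.setdefault("volumeMounts", []).append(
            {"name": VOLUME_NAME, "mountPath": FLEX_DIR}
        )
    return True


# Abbreviated example manifest (assumption: only the fields relevant here).
manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "kube-controller-manager", "namespace": "kube-system"},
    "spec": {"containers": [{"name": "kube-controller-manager"}], "volumes": []},
}
changed = ensure_flex_volume(manifest)
changed_again = ensure_flex_volume(manifest)
```

The second call leaves the manifest untouched, so the helper can be run repeatedly on each master.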











From:        Shay Berman/Israel/IBM
To:        Chakravarthy Nelluri <cha...@diamanti.com>
Cc:        Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Lior Tamary <LI...@il.ibm.com>, Mohamed Mohamed <mmoh...@us.ibm.com>, Saad Ali <saa...@google.com>
Date:        22/10/2017 11:50
Subject:        Re: FlexVolume




Hi Chakravarthy,

Please let us know how to patch the "kube-controller-manager" POD with the flex directory.

Who is responsible for starting this POD, and where is its yml file, so I can update it with the flex volume?


Thanks a lot










From:        Shay Berman/Israel/IBM
To:        Chakravarthy Nelluri <cha...@diamanti.com>
Cc:        Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Lior Tamary <LI...@il.ibm.com>, Mohamed Mohamed <mmoh...@us.ibm.com>, Saad Ali <saa...@google.com>
Date:        10/19/2017 11:50 PM
Subject:        Re: FlexVolume



Currently the POD "kube-controller-manager-tzur-ubiquity-master"   has only /etc/kubernetes, /etc/ssl/certs and /etc/pki     as volumes.


How do I add the flex directory permanently to this POD?





From:        Chakravarthy Nelluri <cha...@diamanti.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Lior Tamary <LI...@il.ibm.com>, Mohamed Mohamed <mmoh...@us.ibm.com>, Saad Ali <saa...@google.com>
Date:        10/19/2017 11:42 PM
Subject:        Re: FlexVolume




- If the controller is running as a Pod, you have to expose /usr/libexec/kubernetes/kubelet-plugins/volume host path to controller manager Pod.


- Yes we have to restart kubelet after setting enable-controller-attach-detach=true. It is actually set to true by default.


- Haven’t tried it on 1.8 version, but master node can be marked schedulable. Saad might know if something changed lately.


Thanks
~Chakri

On Oct 19, 2017, at 4:20 PM, Shay Berman <BS...@il.ibm.com> wrote:

Hi Chakravarthy,

1. Yes, the kube-controller-manager is a POD, as you can see below. So should all the attach\detach operations be written to this kube-controller-manager-liort-ub1 log?
[root@liort-ub1 ~]# kubectl get pod -n kube-system
NAME                                READY     STATUS    RESTARTS   AGE
etcd-liort-ub1                      1/1       Running   0          5h
kube-apiserver-liort-ub1            1/1       Running   0          5h
kube-controller-manager-liort-ub1   1/1       Running   0          5h
kube-dns-2617979913-mkf96           3/3       Running   0          5h
kube-flannel-ds-l1zvn               1/1       Running   0          5h
kube-proxy-rhgcd                    1/1       Running   0          5h
kube-scheduler-liort-ub1            1/1       Running   0          5h

2. After we set enable-controller-attach-detach=true on the minions and on the master, do we need to restart some POD or the kubelet service?

3. How can we set the master to be a schedulable node? In 1.8 we encountered an issue where, after setting it to be schedulable, the master node was NotReady. So I wonder what the valid way is to mark the master as schedulable, and also how to return it to unschedulable.

Thanks




From:        Chakravarthy Nelluri <cha...@diamanti.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Saad Ali <saa...@google.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Lior Tamary <LI...@il.ibm.com>, Mohamed Mohamed <mmoh...@us.ibm.com>
Date:        10/19/2017 11:08 PM
Subject:        Re: FlexVolume




Hi Shay,

- Is your controller manager running as a pod? If it is, we have to expose the host path where the plugin is installed to the pod. If you are running as a pod, you can check the logs in docker logs or kubectl logs.
- Yes in some environments, it is not possible to run Daemon sets on master. The only option today AFAIK is to copy the driver manually.

Thanks
~Chakri

On Oct 19, 2017, at 3:57 PM, Shay Berman <BS...@il.ibm.com> wrote:

Hi Saad, Chakravarthy and Cheng,

During the development of the controller attach\detach API, we encountered some issues.

issueA:

We did the following things:
  1. Copied the flex CLI to the master (and of course also to the minions).
  2. Set enable-controller-attach-detach=true on all the minions and the master.
  3. Now when we create a new POD with a volume, we don't see an attach operation reaching the flex driver on the master.
Any suggestions? Of course we have logging inside the flex attach and detach functions, and we still don't see anything triggered there.
Who triggers the attach\detach on the master: is it the kubelet component or the controller-manager? Where can we find logs that show that k8s tried to trigger the attach API?
Is it a must to set the master as a schedulable node in order to make it work?

issueB:

About the deployment method of the flex CLI on the client.
As I already mentioned, we implemented your recommendation of a daemon set that just copies the flex driver to all the minions. You can see our code ->
daemonset yml  and the image setup script
But what is the recommended way to deploy flex on the masters as well? With enable-controller-attach-detach=true the masters should trigger the attach\detach, which means that our Flex driver should be installed on the masters too.
By default the daemon set does not run on the master, so do you have any recommended method to install the flex driver on all the masters so that remote attach\detach will work?
I saw that you can set the master to be a schedulable node, but I don't think I can force my customers to set all the masters as schedulable just to deploy the flex driver. Any idea?

Thanks a lot for your help

Shay













From:        Shay Berman/Israel/IBM
To:        Saad Ali <saa...@google.com>
Cc:        Chakravarthy Nelluri <cha...@diamanti.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Lior Tamary <LI...@il.ibm.com>, Mohamed Mohamed/Almaden/IBM@IBMUS
Date:        10/04/2017 08:54 AM
Subject:        Re: FlexVolume



Hi Saadi,

Thanks for your response, but some things are still unclear to me.
So I drafted a table of the APIs with questions on each one; please see below:

Order / API / IBM flex driver implementation

0. init
   Called when kubelet first starts.
   We should do volume mapping from the storage side to the node.

1. attach
   Attach the PV to the host (runs from the controller).
   Map the volume of the PV to a given host.
   Question1: What does this quote mean? "This call-out does not pass "secrets" specified in Flexvolume spec. If your driver requires secrets, do not implement this call-out and instead use "mount" call-out and implement attach and mount in that call-out"
   Question2: Can you give an example of the <json options>?

2. wait for attach
   Question1: What does this quote mean: "Called from both Kubelet & Controller manager."? How can we identify when it is called from the controller and when it is called from the kubelet? From the kubelet we can check whether the device showed up by running a rescan, but from the controller we cannot.
   Question2: Do we need to check that the device is identified on the node itself, by running the rescan command here to identify the multipath device? (Or is this the goal of "volume is attached"?)
   Question3: Can you give an example of the <json options>?

3. Volume is attached
   Same question as above.
   Question1: Is this the place where we should run a rescan on the host and identify the multipath device?
   Question2: Why is this API also "Called from both Kubelet & Controller manager"? How can we identify whether it is called from the controller?

4. mount device
   Question1: Is this the call where we should mount the multipath device to, let's say, /ubiquity/VOLUME-ID on the host? (And of course create a filesystem on the volume before mounting it, if the volume is new.)

5. mount
   Question1: "Mount then takes the global mount path and bind mounts it to individual pod paths": can you explain what it means to mount to an individual pod path?
   Since we already mounted the device to the "global path" on the host, why should we mount the same mountpoint again for an individual pod?









BTW: Is there any block storage vendor that has open-sourced its Flexvolume driver, which we could use as a reference?

+[Lior and Mohamed]

Thanks
Shay





From:        Saad Ali <saa...@google.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Chakravarthy Nelluri <cha...@diamanti.com>, Cheng Xing <cx...@google.com>, Jing Xu <ji...@google.com>, Lior Tamary <LI...@il.ibm.com>
Date:        10/03/2017 11:24 PM
Subject:        Re: FlexVolume




> Can you please explain what the mount-device and unmount-device APIs do, and whether we need to implement them? And what is the difference between them and the mount\unmount APIs?

For volumes that implement a master attach, the mount step is broken into 3 steps: Wait for attach, Mount device, and Mount:
  • Wait for attach is self explanatory.
  • MountDevice allows you to mount the attached device to a "global path" (common path for all pods that will use that volume on that node).
    • If the device is already mounted, MountDevice is a no-op.
  • Mount then takes the global mount path and bind mounts it to individual pod paths.
You don't have to use this pattern if it doesn't make sense for your implementation. You can choose to make your MountDevice operation a no-op, and put all your mounting logic directly in Mount, if you want.
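
The "MountDevice as a no-op" option can be sketched as a small dispatcher. This is an illustration only, not a reference driver: the handler names, the dispatch table, and the device and directory paths are made up for the example, and the argument order of each call-out is written from memory of the Flexvolume documentation, so check it against the version you target. The status strings "Success", "Failure" and "Not supported" are the ones that document defines.

```python
import json


def reply(status, **fields):
    """Serialize a Flex driver response as JSON."""
    return json.dumps({"status": status, **fields})


def waitforattach(device, options):
    # Sketch only: assume the device is already visible and echo it back.
    return reply("Success", device=device)


def mountdevice(mount_dir, device, options):
    # Deliberately a no-op: this sketch keeps all mounting logic in `mount`.
    return reply("Success")


def mount(mount_dir, options):
    # A real driver would mount the volume at mount_dir here.
    return reply("Success")


HANDLERS = {"waitforattach": waitforattach, "mountdevice": mountdevice, "mount": mount}


def dispatch(argv):
    """Route `<driver> <operation> <args...>`; handled operations end with JSON options."""
    if len(argv) < 3:
        return reply("Failure", message="usage: <driver> <operation> <args...>")
    handler = HANDLERS.get(argv[1])
    if handler is None:
        return reply("Not supported")
    try:
        *positional, raw_options = argv[2:]
        return handler(*positional, json.loads(raw_options))
    except (TypeError, ValueError) as err:
        return reply("Failure", message=str(err))


noop = dispatch(["driver", "mountdevice", "/mnt/global", "/dev/dm-1", "{}"])
unknown = dispatch(["driver", "unmountdevice", "/mnt/global"])
```

Operations without a handler fall through to "Not supported", which is how this sketch treats call-outs the driver chooses not to implement.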

On Tue, Oct 3, 2017 at 1:05 PM, Shay Berman <BS...@il.ibm.com> wrote:
Hi Saadi

We have some questions about how to implement the controller attach\detach.
Reading the documentation of the flexvolume ->
https://github.com/kubernetes/community/blob/master/contributors/devel/flexvolume.md
Can you please explain what the mount-device and unmount-device APIs do, and whether we need to implement them? And what is the difference between them and the mount\unmount APIs?

Thanks a lot.













From:        Saad Ali <saa...@google.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Cheng Xing <cx...@google.com>, Chakravarthy Nelluri <cha...@diamanti.com>, Jing Xu <ji...@google.com>
Date:        10/02/2017 07:02 PM
Subject:        Re: FlexVolume




Regarding Snapshots, Flex volume does not have a snapshots API yet. The snapshots feature is in prototype phase right now, if you're interested in contributing or extending it, Jing would be your go-to contact.


On Sat, Sep 30, 2017 at 12:03 PM, Shay Berman <BS...@il.ibm.com> wrote:
Hi Cheng,

Thanks for the feedback.
I got your point.
I think I will go with the alternative of installing the driver first and then the config. That makes more sense, so the new driver will also have to support the new config.

Thanks














From:        Cheng Xing <cx...@google.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Chakravarthy Nelluri <cha...@diamanti.com>, Saad Ali <saa...@google.com>, Jing Xu <ji...@google.com>
Date:        09/29/2017 09:17 PM
Subject:        Re: FlexVolume




Hi Shay,

You aren't installing all your Flex-related files in a single atomic operation, so as your Flex install script executes, there's a short period of time when your new config file is installed but your new driver is not. If a volume operation is triggered, kubelet/controller-manager is going to execute your old driver, but with new configs. Your old driver needs to handle new configs gracefully.

Alternatively you could have your new driver handle the old configs properly, and install the driver before the config file.
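
One way to narrow the window described above is to install each file with a write-then-rename, and to install the driver before the config. The sketch below shows that ordering under stated assumptions: the directory layout under `root`, the driver executable name, and the config file name `ubiquity-k8s-flex.conf` are hypothetical and chosen only for the example.

```python
import os
import tempfile


def atomic_install(content, dest, mode=0o644):
    """Write `content` to a temp file next to `dest`, then rename it into place."""
    directory = os.path.dirname(dest)
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
        os.chmod(tmp_path, mode)
        # Rename is atomic when source and destination share a filesystem.
        os.replace(tmp_path, dest)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def install(root, driver_bytes, config_bytes):
    """Install the driver before its config; paths under `root` are hypothetical."""
    driver = os.path.join(root, "exec", "ibm~ubiquity-k8s-flex", "ubiquity-k8s-flex")
    config = os.path.join(root, "etc", "ubiquity", "ubiquity-k8s-flex.conf")
    atomic_install(driver_bytes, driver, mode=0o755)
    atomic_install(config_bytes, config)
    return driver, config


with tempfile.TemporaryDirectory() as root:
    driver, config = install(root, b"#!/bin/sh\nexit 0\n", b"backend=example\n")
    driver_files = sorted(os.listdir(os.path.dirname(driver)))
    driver_mode = os.stat(driver).st_mode & 0o777
    config_exists = os.path.isfile(config)
```

The rename keeps a half-written driver or config from ever being visible under its final name. It does not remove the need for the new driver to cope with the old config during the gap between the two installs.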

Other than that, the config file approach seems OK to me.

+Jing who has more knowledge about snapshots. As far as I understand, snapshot is triggered by creating a VolumeSnapshot API object, so no it's not required to add anything extra in the Flexvolume driver.

On Fri, Sep 29, 2017 at 2:29 AM Shay Berman <BS...@il.ibm.com> wrote:
Thanks for the feedback

Cheng, could you please elaborate on "handle the config file being installed separately"? What exactly is your suggestion?
Our flex driver needs a configuration file, which is why the daemonset mounts a configmap, and setup_flex.sh just copies it to the host under the /etc/ubiquity directory.
Since the flex driver is just a CLI (out of band from the cluster), it has no access to the configmap, which is why we copy it locally so that the flex driver has access to its configuration. (I guess if flex runs as a POD in the future, there will be no need to copy the conf file to the host, because a POD has access to the configmap.)
So what kind of improvement do you expect me to make here?

In addition, I would like to ask about the new k8s 1.8 release, which also introduced snapshots for volumes. Is there anything related to flexVolume that we need to implement in order to support snapshots, or is it related only to the dynamic provisioner component?

Thanks















From:        Cheng Xing <cx...@google.com>
To:        Shay Berman <BS...@il.ibm.com>, Chakravarthy Nelluri <cha...@diamanti.com>
Cc:        Saad Ali <saa...@google.com>
Date:        09/28/2017 11:40 PM
Subject:        Re: FlexVolume




Those files look good to me. One thing to point out is your Flex driver has to be able to handle the config file being installed separately, i.e. it's possible the older Flex driver is executed using new configs.

Privileged mode is required because Kubelet executes driver commands outside the container's namespace and scope.

On Thu, Sep 28, 2017 at 1:01 PM Shay Berman <BS...@il.ibm.com> wrote:
I just pushed the daemonset for the IBM flex driver in our project. I would appreciate it if you could quickly review it (only if you can):

1. The flex daemon set ->
https://github.com/IBM/ubiquity-k8s/blob/feature/daemonset_to_deploy_flex/scripts/k8s_scbe_all_in_one/yamls/ubiquity-k8s-flex-daemonset.yml
2. The deployment script inside the Flex docker image ->
https://github.com/IBM/ubiquity-k8s/blob/feature/daemonset_to_deploy_flex/deploy/k8s_deployments/setup_flex.sh
3. The dockerfile of the flex image ->
https://github.com/IBM/ubiquity-k8s/blob/feature/daemonset_to_deploy_flex/Dockerfile.Flex

BTW, why did you mention in the recommended deployment method to specify securityContext: privileged: true in the daemonset? Is this mandatory?


Any feedback is welcome
Thanks
Shay





From:        Shay Berman/Israel/IBM
To:        Chakravarthy Nelluri <cha...@diamanti.com>
Cc:        Cheng Xing <cx...@google.com>, Saad Ali <saa...@google.com>
Date:        09/27/2017 09:50 PM
Subject:        Re: FlexVolume



thanks!





From:        Chakravarthy Nelluri <cha...@diamanti.com>
To:        Shay Berman <BS...@il.ibm.com>
Cc:        Saad Ali <saa...@google.com>, Cheng Xing <cx...@google.com>
Date:        09/27/2017 05:05 PM
Subject:        Re: FlexVolume




If there is no change in driver capabilities (a change is unlikely), you do not have to restart the kubelet when you upgrade your driver.

Thanks
~Chakri

On Sep 27, 2017, at 9:54 AM, Shay Berman <BS...@il.ibm.com> wrote:

Hi Saad and Chakravarthy,

Thanks for the input, so we will go with the daemon set method to deploy the IBM flexvolume driver.
And yes, we are testing the flex driver on all the versions I mentioned below.

Another question, please, about restarting the kubelet.
We need to restart the kubelet after we put the flex volume driver in place for the first time (e.g. /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ibm~ubiquity-k8s-flex). But do we also need to restart the kubelet when we just update the flex driver file in the same directory?
Or is it only needed for the first-time discovery of the plugin?
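
For reference, the directory in the example path follows the `<vendor>~<driver>` naming convention under the plugin directory, with the executable named after the driver. A small sketch of that convention, assuming the driver name `ibm/ubiquity-k8s-flex` that appears in the controller log elsewhere in this thread; `driver_path` is a hypothetical helper.

```python
import os

PLUGIN_DIR = "/usr/libexec/kubernetes/kubelet-plugins/volume/exec"


def driver_path(vendor, driver, plugin_dir=PLUGIN_DIR):
    """Return <plugin_dir>/<vendor>~<driver>/<driver> for a driver named vendor/driver."""
    return os.path.join(plugin_dir, f"{vendor}~{driver}", driver)


path = driver_path("ibm", "ubiquity-k8s-flex")
```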

thanks a lot for your responsiveness

Shay





From:        Saad Ali <saa...@google.com>
To:        Shay Berman <BS...@il.ibm.com>, Cheng Xing <cx...@google.com>
Cc:        Chakravarthy Nelluri <cha...@diamanti.com>
Date:        09/26/2017 10:47 PM
Subject:        Re: FlexVolume




+Cheng author of the DaemonSet Flex Deployer

Ultimately, to deploy a flex volume driver you need to place a file in a specific location on the node and master machines. You can communicate to cluster admins the manual steps to do this (copy this file to this location on each machine) or give them some automated mechanism to do it. But, yes, I would recommend DaemonSets as a nicer way to package and distribute a Flex volume driver. DaemonSets exist in all versions you listed (1.5-1.8), so that shouldn't be an issue. However, I would recommend you test your driver with all the versions of k8s you intend to support, because the Flex volume API has evolved over those versions and a driver compatible with the current interface may not be compatible with older versions (master attach/detach logic, for example, wasn't introduced in Flex until the 1.6 release). Chakri may be able to provide more info here.


On Tue, Sep 26, 2017 at 12:29 PM, Shay Berman <BS...@il.ibm.com> wrote:
Hi Chakravarthy and Saad,

A follow-up question, please, regarding flexvolume.

What is the recommended way to deploy the flexvolume CLI on minions?
Is it by using a DaemonSet that just copies the flex CLI to every minion and then runs an infinite loop, as mentioned in ->
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/flexvolume-deployment.md#recommended-driver-deployment-method?
Is this deployment method (DaemonSet) recommended for all of these k8s versions: 1.5.6, 1.6, 1.7 and 1.8?

Thanks a lot
Shay




From:        Chakravarthy Nelluri <cha...@diamanti.com>
To:        Saad Ali <saa...@google.com>
Cc:        Shay Berman <BS...@il.ibm.com>, kubernetes-sig-storage <kubernetes-...@googlegroups.com>
Date:        09/25/2017 08:44 PM
Subject:        Re: FlexVolume




Hi Shay,

For controller attach/detach support, you have to implement the following call-outs:

- Attach
- Detach
- WaitforAttach
- isAttached
- MountDevice
- UnmountDevice
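
[Editor's sketch, not part of the original message] A driver that answers the call-outs listed above could dispatch on the operation name roughly as follows. This is a minimal Python illustration, not the ubiquity-k8s implementation: the helper names, the device path, and the always-true `isattached` reply are placeholders, and the per-operation arguments are ignored here. The response fields (`status`, `capabilities`, `device`, `attached`) follow the FlexVolume JSON reply format.

```python
import json
import sys


def success(**extra):
    """Build a successful Flex reply, optionally with extra fields."""
    return {"status": "Success", **extra}


def not_supported(op):
    """Reply used for call-outs this driver does not implement."""
    return {"status": "Not supported", "message": f"{op} is not implemented"}


def handle(argv):
    """Map a Flex call-out (argv[0]) to a JSON-serializable reply."""
    if not argv:
        return {"status": "Failure", "message": "no operation given"}
    op = argv[0]

    if op == "init":
        # Advertise that the controller should drive attach/detach.
        return success(capabilities={"attach": True})
    if op in ("attach", "waitforattach"):
        # Placeholder device path; a real driver asks its storage backend.
        return success(device="/dev/hypothetical0")
    if op == "isattached":
        # Placeholder answer; a real driver queries the storage backend.
        return success(attached=True)
    if op in ("detach", "mountdevice", "unmountdevice"):
        return success()
    return not_supported(op)


if __name__ == "__main__":
    print(json.dumps(handle(sys.argv[1:])))
```

Any call-out the sketch does not recognize, such as `getvolumename`, gets a "Not supported" reply.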

Thanks
~Chakri

On Sep 25, 2017, at 1:29 PM, Saad Ali <saa...@google.com> wrote:

+Chakri (Flex volume author) and SIG Storage

> If I implement the controller attach/detach APIs and a minion crashes, will a detach operation be triggered automatically for all of the pod's PVCs that were on this minion?

If your storage system supports control plane detach, then yes, implementing attach/detach would allow the k8s master to detach a volume even if the node the volume is attached to becomes unresponsive.

> In addition, which APIs should I implement in order to get this controller attach/detach functionality?

Chakri (CC'd), author of FlexVolumes, should be able to clarify that.




On Sun, Sep 24, 2017 at 12:53 PM, Shay Berman <BS...@il.ibm.com> wrote:
Hello,

My name is Shay, and I am developing a flexvolume driver for IBM storage.

I saw your name on the "Detailed design for Volume Attach/Detach Controller" ->
https://github.com/kubernetes/kubernetes/issues/20262

I want to validate that I understand the value of this controller attach/detach.
If I implement the controller attach/detach APIs and a minion crashes, will a detach operation be triggered automatically for all of the pod's PVCs that were on this minion?

In addition, which APIs should I implement in order to get this controller attach/detach functionality?   ->
https://github.com/kubernetes/community/blob/master/contributors/devel/flexvolume.md
You can see here what we currently implement for flexvolume ->
https://github.com/IBM/ubiquity-k8s/blob/master/controller/controller.go
Currently we just skip waitforattach, isattached, getvolumename, mountdevice and unmountdevice.
We do implement mount (which currently attaches the volume and then mounts it) and unmount (which currently does the detach and the unmount), and of course we implement init.

So please advise what we should implement in order to gain this controller attach/detach.

Thanks.
Shay Berman
Product Owner & Software Engineer
Cloud Storage Solutions
IBM Systems

Phone: (+972) 3 689 7780 | E-mail: bs...@il.ibm.com
Visit the CSS community at: https://ibm.biz/BdsM8x

Lior Tamary

Oct 31, 2017, 9:38:34 AM
to Chakravarthy Nelluri, Shay Berman, Cheng Xing, hek...@redhat.com, Jing Xu, jsaf...@redhat.com, kubernetes-sig-storage, Mohamed Mohamed, Saad Ali
Thanks, Chakravarthy.
We have switched to TaintBasedEvictions=true:


I1031 13:02:53.917338       1 feature_gate.go:156] feature gates: map[TaintBasedEvictions:true]
I1031 13:02:54.123765       1 node_controller.go:303] Controller is using taint based evictions.
I1031 13:03:04.128248       1 taint_controller.go:158] Sending events to api server.
I1031 13:03:06.029338       1 taint_controller.go:181] Starting NoExecuteTaintManager
I1031 13:09:36.231776       1 taint_controller.go:85] NoExecuteTaintManager is deleting Pod: default/bashx-854cfdf744-wmmkk
I1031 13:09:36.232845       1 event.go:218] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"bashx-854cfdf744-wmmkk", UID:"", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TaintManagerEviction' Marking for deletion Pod default/bashx-854cfdf744-wmmkk


Now the pod is no longer in the node-lost state; it is just stuck in Terminating:


kubectl get pods --all-namespaces

NAMESPACE   NAME                     READY   STATUS              RESTARTS   AGE
default     bashx-854cfdf744-wmmkk   1/1     Terminating         1          5h
default     bashx-854cfdf744-zhzg8   0/1     ContainerCreating   0          18m


I can see in the controller log the following:


I1031 13:04:31.043087       1 event.go:218] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"liort-ub4", UID:"605f84cf-bd4b-11e7-a93a-005056a46507", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeNotReady' Node liort-ub4 status is now: NodeNotReady
I1031 13:09:36.231776       1 taint_controller.go:85] NoExecuteTaintManager is deleting Pod: default/bashx-854cfdf744-wmmkk
I1031 13:09:36.232845       1 event.go:218] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"bashx-854cfdf744-wmmkk", UID:"", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TaintManagerEviction' Marking for deletion Pod default/bashx-854cfdf744-wmmkk
W1031 13:09:36.241423       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
I1031 13:09:36.255217       1 event.go:218] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"default", Name:"bashx-854cfdf744", UID:"68c9a320-be14-11e7-a93a-005056a46507", APIVersion:"extensions", ResourceVersion:"136951", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: bashx-854cfdf744-zhzg8c
W1031 13:09:36.394707       1 plugin-defaults.go:32] flexVolume driver ibm/ubiquity-k8s-flex: using default GetVolumeName for volume bash-pvc
W1031 13:09:36.464425       1 reconciler.go:267] Multi-Attach error for volume "bash-pvc" (UniqueName: "flexvolume-ibm/ubiquity-k8s-flex/bash-pvc") from node "liort-ub2" Volume is already exclusively attached to one node and can't be attached to another


However, we still don't get a detach request for the volume used by that pod.
Again, all we see in our flexvolume driver are repeated isattached calls, to which we reply true (and some GetVolumeName calls, to which we reply not-supported).
Any idea what we're still missing?

 
 

Chakravarthy Nelluri

Oct 31, 2017, 10:24:44 AM
to Lior Tamary, Shay Berman, Cheng Xing, hek...@redhat.com, Jing Xu, jsaf...@redhat.com, kubernetes-sig-storage, Mohamed Mohamed, Saad Ali
The only way I know is to force delete the pod manually. Please open an issue for this; volume detach should be triggered automatically, or be configurable, in these scenarios.

$ kubectl delete pod <pod-name> --grace-period=0 --force
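
[Editor's sketch, not part of the original message] If the workaround above has to be scripted, the command line can be assembled as in the Python below. The helper name `force_delete_command` is hypothetical, and the pod name is the stuck pod reported earlier in this thread; the sketch only builds and prints the command and does not run it.

```python
import shlex


def force_delete_command(pod, namespace="default"):
    """Return the kubectl argv that force-deletes a single pod."""
    return [
        "kubectl", "delete", "pod", pod,
        "--namespace", namespace,
        "--grace-period=0", "--force",
    ]


# Build the command for the pod that was stuck in Terminating.
cmd = force_delete_command("bashx-854cfdf744-wmmkk")
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it. Note that a force delete only removes the pod object from the API server; it does not confirm that the container has stopped on the unreachable node.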

Thanks
~Chakri
