On Thu, Apr 2, 2020 at 5:56 AM Vishal Thapar <vth...@redhat.com> wrote:
> I thought Mike's comment on PR was about making it so subctl doesn't have any sequential assumptions.
>
> I think Option #1 would be a recommended way of going about it. In fact I'd say we may want to use a CR for the information currently stored in broker-info.subm too if we want to support API based deployments. The small window of conflict in Option #1 is not an issue as it will result in failure to write to the CR in which case we will just have to re-allocate. We already handle such cases in globalnet code when annotating services with GlobalIp.

Exactly - k8s compare-and-set on update provides the atomicity.

On Thu, Apr 2, 2020 at 2:26 PM Sridhar Gaddam <sga...@redhat.com> wrote:
Present status:
Currently, while deploying Globalnet via Operator/subctl, we store the globalCIDR (which can be overridden during deploy-broker command) in the broker-info.subm file.
Subsequently, when we join data-clusters to the Broker, subctl first queries the existing clusters (i.e., submariner clusterCRD) associated with Broker and allocates a new chunk to the cluster we are trying to join.
While this works fine when joining clusters in sequence, it has issues when we try to execute the "subctl join ..." commands in parallel.

There are a couple of options to address it.
1. Allocation in subctl:
Use a "new" CRD (say globalCIDRList) to store the allocations on the Broker Cluster that includes the clusterName, the allocated globalCIDR (and any other info).
Instead of reading the clusterCRD on the Broker, subctl will now read the globalCIDRList and make new allocations that do not overlap with existing allocations.
Once an allocation is done in subctl, it will update the globalCIDRList CRD on the Broker cluster. This reduces the window between making an allocation and recording it on the Broker, but there is still a small possibility of conflicts when subctl join operations are performed in parallel.
2. Allocation on Broker:
Introduce a new service on the Broker which does the allocation to the joining clusters. This might be fool-proof but has a drawback that we will now be running a service/pod on the Broker.
3. Allocation in subctl with support for joining multiple clusters:
As @mike suggested in the PR[*], modify subctl to handle parallel joins. There will still be an issue if the user runs two independent "subctl join ..." commands in parallel from different terminals.
I'm interested to hear your views/suggestions on addressing it. Please comment even if you think the current expectation of running subctl sequentially is good enough.
[*] https://github.com/submariner-io/shipyard/pull/61#discussion_r402124724
Thanks,
--Sridhar.
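To make Option 1 and the compare-and-set remark at the top of the thread concrete, here is a minimal Python sketch, not the subctl implementation. It assumes a hypothetical globalCIDRList resource modelled as an in-memory object with a version counter (standing in for a Kubernetes resourceVersion), and it assumes fixed-size /19 chunks carved out of the 169.254.0.0/16 default mentioned later in the thread. The names `FakeBrokerStore`, `ConflictError`, `next_free_cidr`, and `join_cluster` are invented for illustration.

```python
import ipaddress


class ConflictError(Exception):
    """Stale write; stands in for an update conflict reported by the API server."""


class FakeBrokerStore:
    """In-memory stand-in for a hypothetical globalCIDRList resource on the Broker."""

    def __init__(self):
        self.version = 0
        self.allocations = {}  # cluster name -> CIDR string

    def read(self):
        return self.version, dict(self.allocations)

    def write(self, expected_version, allocations):
        # Compare-and-set: accept the write only if nobody has updated
        # the resource since the caller read it.
        if expected_version != self.version:
            raise ConflictError("resource was modified, re-read and retry")
        self.allocations = dict(allocations)
        self.version += 1


def next_free_cidr(cidr_range, allocated, prefix_len):
    """Return the first prefix_len-sized chunk that overlaps no existing allocation."""
    used = [ipaddress.ip_network(c) for c in allocated]
    for candidate in ipaddress.ip_network(cidr_range).subnets(new_prefix=prefix_len):
        if not any(candidate.overlaps(u) for u in used):
            return str(candidate)
    raise RuntimeError("no free chunk left in " + cidr_range)


def join_cluster(store, cluster, cidr_range="169.254.0.0/16", prefix_len=19, max_retries=5):
    """Allocate a chunk for cluster, re-allocating if a parallel join wins the race."""
    for _ in range(max_retries):
        version, allocations = store.read()
        if cluster in allocations:
            return allocations[cluster]  # already allocated, nothing to do
        cidr = next_free_cidr(cidr_range, allocations.values(), prefix_len)
        allocations[cluster] = cidr
        try:
            store.write(version, allocations)
            return cidr
        except ConflictError:
            continue  # someone else wrote first; re-read and try again
    raise RuntimeError("gave up after repeated conflicts")
```

With this shape, two joins that read the same state cannot both succeed: the second write fails the version check and that caller simply re-reads and picks the next free chunk, which is the "failure to write, then re-allocate" behaviour described for Option 1.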
Yeah, that's another option, for CI I agree it must be fine.

In the long term, probably we should have the submariner-operator by itself handling the allocation on the global CIDR (and probably locking over the broker with the other operators while grabbing that CIDR).

We can probably implement both:
1) Forced CIDR through the cmdline (for CI speed)
--
You received this message because you are subscribed to the Google Groups "submariner-dev" group.
On Mon, Apr 6, 2020 at 11:20 AM Miguel Angel Ajo Pelayo <majo...@redhat.com> wrote:
> [...] We can probably implement both:
> 1) Forced CIDR through the cmdline (for CI speed)

It is important that the operator (the person setting up the connectivity) would be able to easily query what CIDRs are already in use if he can set it manually. Is this possible today?
On Mon, Apr 6, 2020 at 1:56 PM Livnat Peer <lp...@redhat.com> wrote:
> It is important that the operator (the person setting up the connectivity) would be able to easily query what CIDRs are already in use if he can set it manually. Is this possible today?

Yes Livnat, the globalCIDR is stored as part of Submariner Cluster CRD.
On Mon, Apr 6, 2020 at 11:30 AM Sridhar Gaddam <sga...@redhat.com> wrote:
> Yes Livnat, the globalCIDR is stored as part of Submariner Cluster CRD.

Perfect.

Is it possible to change the allocation assuming that no global-ip is already allocated in the cluster?

What do you think about enabling a disruptive operation of changing the global cidr which will re-allocate new IPs to all the Pods in the cluster (with the understanding that this operation is disruptive to any cross cluster communication that is going on)?
On Mon, Apr 6, 2020 at 11:37 AM Livnat Peer <lp...@redhat.com> wrote:
> [...] What do you think about enabling a disruptive operation of changing the global cidr which will re-allocate new IPs to all the Pods in the cluster (with the understanding that this operation is disruptive to any cross cluster communication that is going on)?
Giving this another thought, it is like disconnecting and connecting a cluster with a different CIDR..in some way....
Okay, summarizing what was discussed over submariner slack channel. The plan is to implement the following.
1. We will add a new flag (name to be decided) to "subctl join ..." allowing the user to specify the globalCIDR for the cluster. As @mike said, this will also help us to customize the globalCIDR per cluster (i.e., it can be outside of globalnet-cidr-range which currently defaults to 169.254.0.0/16).
2. When user does not specify the flag during "subctl join ...", subctl will take a lock (thanks @miguel for this suggestion) on the Broker (with an appropriate timeout - say 60 secs) and allocate a globalCIDR chunk to the joining cluster from globalnet-cidr-range. The lock will be released after the clusterCRD is created on the Broker (or after a timeout).
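The lock in item 2 could behave roughly as in the sketch below. This is only an illustration under stated assumptions: the lock is modelled as an in-memory object with a 60-second time-to-live, whereas a real lock on the Broker would need an atomic create or update on some Kubernetes object, which this code does not attempt. `BrokerLock`, `join_with_lock`, and the `allocate` callback are hypothetical names, not subctl APIs.

```python
import time


class BrokerLock:
    """In-memory stand-in for a lock object kept on the Broker (hypothetical)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, who):
        now = self.clock()
        # A lock whose holder crashed counts as free once its TTL has passed.
        if self.holder is None or now >= self.expires_at:
            self.holder = who
            self.expires_at = now + self.ttl
            return True
        return False

    def release(self, who):
        # Only the current holder may release; a late release after the
        # lock expired and was taken over by someone else is ignored.
        if self.holder == who:
            self.holder = None


def join_with_lock(lock, cluster, allocate, wait_seconds=60.0,
                   poll_seconds=0.5, sleep=time.sleep):
    """Wait for the lock, run the allocation, and always release afterwards."""
    deadline = lock.clock() + wait_seconds
    while not lock.try_acquire(cluster):
        if lock.clock() >= deadline:
            raise TimeoutError("could not lock the Broker for " + cluster)
        sleep(poll_seconds)
    try:
        # allocate() stands for "pick a chunk and create the cluster record".
        return allocate(cluster)
    finally:
        lock.release(cluster)
```

The TTL covers the "or after a timeout" case: if a join dies while holding the lock, later joins are blocked for at most the TTL rather than forever.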
On Mon, Apr 6, 2020 at 2:18 PM Sridhar Gaddam <sga...@redhat.com> wrote:
> Okay, summarizing what was discussed over submariner slack channel. The plan is to implement the following. [...]

I do think that we should prioritize getting subm contents moved to a CR in broker. We will need that to support API driven deployment. Adding locks etc. to subm are just stop-gap arrangements.

IMO, we should go with what Mike proposed. Go with Option 1 to address our KIND/CI use case for now, and document the sequential limitation of what we already have. This should be enough while we work on moving broker to operator from subctl and replace subm file with a CR on broker to support API deployments.
On Mon, Apr 6, 2020 at 3:15 PM Vishal Thapar <vth...@redhat.com> wrote:
> [...] This should be enough while we work on moving broker to operator from subctl and replace subm file with a CR on broker to support API deployments.

Using a lock on the Broker should be a simple mechanism. Anyways, I'd like to understand few things first.

Can you elaborate this sentence - "while we work on moving broker to operator from subctl and replace subm file with a CR on broker to support API deployments."?
Okay, summarizing what was discussed over submariner slack channel. The plan is to implement the following.
1. We will add a new flag (globalnet-cidr) to "subctl join ..." allowing the user to specify the globalCIDR for the cluster. As @mike said, this will also help us to customize the globalCIDR per cluster (i.e., it can be outside of globalnet-cidr-range which currently defaults to 169.254.0.0/16)