Re: Allocation of globalCIDRs to Clusters.


Miguel Angel Ajo Pelayo

Apr 6, 2020, 4:06:22 AM
to submariner-dev, Multi-cluster networking team
I'm moving this discussion to the upstream mailing list.

I'd say let's avoid adding services or new resources on the broker while we can; we already have what we need (in the cluster info + subm file).

I'd add a lock mechanism to subctl using the broker (as we do for the gateway leader election on the local broker). This still has the disadvantage of keeping the
config of the CIDR in the subm file: if it's lost, you need to provide that CIDR again.

Adding a CR on the broker to store general configuration details (for persistence beyond the subm file) is something we could do in the future, but I think
it's unrelated to fixing subctl (if we adopt the locking approach).



On Thu, Apr 2, 2020 at 2:27 PM Thomas Pantelis <tpan...@redhat.com> wrote:


On Thu, Apr 2, 2020 at 5:56 AM Vishal Thapar <vth...@redhat.com> wrote:
I thought Mike's comment on the PR was about making it so subctl doesn't have any sequential assumptions.

I think Option #1 would be the recommended way of going about it. In fact, I'd say we may want to use a CR for the information currently stored in broker-info.subm too, if we want to support API-based deployments. The small window of conflict in Option #1 is not an issue: it will result in a failure to write to the CR, in which case we just have to re-allocate. We already handle such cases in the globalnet code when annotating services with GlobalIp.

exactly - k8s compare-and-set on update provides the atomicity. 
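For readers less familiar with the mechanism Tom refers to: a Kubernetes update carries the resourceVersion the client read, and the API server rejects the write if the object changed since, so the loser of a race simply re-reads and retries. A toy sketch of that read-modify-write loop (FakeStore, ConflictError, and update_with_retry are illustrative names, not Submariner or Kubernetes client code):

```python
import copy
import itertools

class ConflictError(Exception):
    """Raised when the stored resourceVersion no longer matches the caller's copy."""

class FakeStore:
    """Minimal stand-in for the API server's optimistic concurrency:
    an update succeeds only if the caller read the latest resourceVersion."""
    def __init__(self, data):
        self.data = data
        self.version = 1

    def get(self):
        # Return a private copy plus the version it was read at.
        return copy.deepcopy(self.data), self.version

    def update(self, data, version):
        if version != self.version:
            # Another writer got in first; the caller must re-read and retry.
            raise ConflictError("resourceVersion mismatch")
        self.data = data
        self.version += 1

def update_with_retry(store, mutate, max_attempts=5):
    """Read-modify-write loop: re-read and re-apply the mutation on conflict."""
    for attempt in itertools.count(1):
        data, version = store.get()
        mutate(data)
        try:
            store.update(data, version)
            return data
        except ConflictError:
            if attempt >= max_attempts:
                raise
```

In Go client code the same pattern is typically wrapped by a retry-on-conflict helper rather than written by hand.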
 

On Thu, Apr 2, 2020 at 2:26 PM Sridhar Gaddam <sga...@redhat.com> wrote:
Present status:

Currently, while deploying Globalnet via the Operator/subctl, we store the globalCIDR (which can be overridden during the deploy-broker command) in the broker-info.subm file.
Subsequently, when we join data clusters to the Broker, subctl first queries the existing clusters (i.e., the Submariner Cluster CRD) associated with the Broker and allocates a new chunk to the cluster being joined.

While this works fine when joining clusters in sequence, it has issues when we try to execute the "subctl join ..." commands in parallel.

There are a couple of options to address it.

1. Allocation in subctl:
Use a "new" CRD (say globalCIDRList) to store the allocations on the Broker Cluster that includes the clusterName, the allocated globalCIDR (and any other info).
Instead of reading the clusterCRD on the broker, subctl will read the globalCIDRList and make new allocations that do not overlap existing ones.
Once an allocation is done, subctl will update the globalCIDRList CRD on the Broker cluster. This reduces the window between allocation and the subsequent update on the broker, but there is still a small possibility for issues to creep in when subctl join operations are performed in parallel.
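The "allocate a chunk that does not overlap existing allocations" step could look like the following sketch (the function name, chunk size, and in-memory list of allocations are illustrative assumptions; the real subctl logic is in Go and reads the allocations from the Broker):

```python
import ipaddress

def allocate_global_cidr(pool, prefix_len, allocated):
    """Return the first /prefix_len chunk of `pool` that does not overlap
    any CIDR already recorded for a joined cluster."""
    pool = ipaddress.ip_network(pool)
    taken = [ipaddress.ip_network(c) for c in allocated]
    for candidate in pool.subnets(new_prefix=prefix_len):
        if not any(candidate.overlaps(t) for t in taken):
            return str(candidate)
    raise RuntimeError(f"no free /{prefix_len} chunk left in {pool}")
```

For example, with the default 169.254.0.0/16 range and /24 chunks, the first cluster gets 169.254.0.0/24, the second 169.254.1.0/24, and so on.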

2. Allocation on Broker:
Introduce a new service on the Broker which does the allocation for joining clusters. This might be fool-proof, but has the drawback that we will now be running a service/pod on the Broker.

3. Allocation in subctl with support for joining multiple clusters:
As @mike suggested in the PR[*], modify subctl to handle parallel joins. There will still be an issue if the user runs two independent "subctl join ..." commands in parallel from different terminals.

I'm interested to hear your views/suggestions on addressing it. Please comment even if you think the current expectation of running subctl sequentially is good enough.

[*] https://github.com/submariner-io/shipyard/pull/61#discussion_r402124724

Thanks,
--Sridhar.


--
Miguel Ángel Ajo  @mangel_ajo  
OpenShift / Kubernetes / Network Federation team.
OSP / Networking DFG, OVN Squad Engineering


Mike Kolesnik

Apr 6, 2020, 4:14:06 AM
to Miguel Angel Ajo Pelayo, submariner-dev, Multi-cluster networking team
May I suggest we use Occam's razor and select the simplest approach? ;)

To me it seems reasonable to have a limitation of non-parallelised `subctl join` if we:
1. Document that this constraint exists in the command help
2. Have an option to specify a global CIDR instead of letting subctl select one itself (thus allowing parallelism)

This would be the simplest approach AFAICT, and would also have the benefit of letting the user customize the CIDR per cluster (instead of just carving up one global CIDR).

Then in the E2E tests we can specify the CIDRs and run the deploy in parallel, instead of jumping through hoops to do it :)
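If users can pick per-cluster CIDRs by hand, the join path would presumably still want a sanity check against overlap with already-joined clusters. A minimal sketch, assuming the existing allocations are available as a cluster-name-to-CIDR map (the function name is hypothetical):

```python
import ipaddress

def check_no_overlap(new_cidr, existing):
    """Reject a user-supplied per-cluster CIDR that overlaps a CIDR
    already claimed by another joined cluster."""
    new = ipaddress.ip_network(new_cidr)
    for cluster, cidr in existing.items():
        if new.overlaps(ipaddress.ip_network(cidr)):
            raise ValueError(f"{new_cidr} overlaps {cidr} (cluster {cluster})")
```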
--
Regards,
Mike

Miguel Angel Ajo Pelayo

Apr 6, 2020, 4:20:26 AM
to Mike Kolesnik, submariner-dev, Multi-cluster networking team
Yeah, that's another option; for CI I agree it should be fine.

In the long term, the submariner-operator itself should probably handle the allocation of the global CIDR (and probably lock
on the broker against the other operators while grabbing that CIDR).

We can probably implement both:

1) Forced CIDR through the cmdline (for CI speed)
2) A locking mechanism *only* during the allocation phase, and only if the CIDR is not forcefully assigned; we can always throw an error if we detect an overlap.

Or just 2...


Livnat Peer

Apr 6, 2020, 4:26:30 AM
to Miguel Angel Ajo Pelayo, Mike Kolesnik, submariner-dev, Multi-cluster networking team
On Mon, Apr 6, 2020 at 11:20 AM Miguel Angel Ajo Pelayo <majo...@redhat.com> wrote:

We can probably implement both:

1) Forced CIDR through the cmdline (for CI speed)

It is important that the operator (the person setting up the connectivity) be able to easily query which CIDRs are already in use if they can set it manually.
Is this possible today?
 

Sridhar Gaddam

Apr 6, 2020, 4:30:26 AM
to Livnat Peer, Miguel Angel Ajo Pelayo, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 1:56 PM Livnat Peer <lp...@redhat.com> wrote:


It is important that the operator (the person setting up the connectivity) be able to easily query which CIDRs are already in use if they can set it manually.
Is this possible today?
Yes Livnat, the globalCIDR is stored as part of Submariner Cluster CRD. 

Livnat Peer

Apr 6, 2020, 4:38:17 AM
to Sridhar Gaddam, Miguel Angel Ajo Pelayo, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 11:30 AM Sridhar Gaddam <sga...@redhat.com> wrote:


It is important that the operator (the person setting up the connectivity) be able to easily query which CIDRs are already in use if they can set it manually.
Is this possible today?
Yes Livnat, the globalCIDR is stored as part of Submariner Cluster CRD. 

perfect.
Is it possible to change the allocation assuming that no global-ip is already allocated in the cluster?
What do you think about enabling a disruptive operation of changing the global cidr which will re-allocate new IPs to all the Pods in the cluster (with the understanding that this operation is disruptive to any cross cluster communication that is going on)?

I'm asking to see how much flexibility we want to give to the operator.

Livnat

Livnat Peer

Apr 6, 2020, 4:39:47 AM
to Sridhar Gaddam, Miguel Angel Ajo Pelayo, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 11:37 AM Livnat Peer <lp...@redhat.com> wrote:



perfect.
Is it possible to change the allocation assuming that no global-ip is already allocated in the cluster?
What do you think about enabling a disruptive operation of changing the global cidr which will re-allocate new IPs to all the Pods in the cluster (with the understanding that this operation is disruptive to any cross cluster communication that is going on)?


Giving this another thought, it is like disconnecting and reconnecting a cluster with a different CIDR, in some way.

Sridhar Gaddam

Apr 6, 2020, 4:48:32 AM
to Miguel Angel Ajo Pelayo, Mike Kolesnik, Vishal Thapar, Thomas Pantelis, Multi-cluster networking team, submariner-dev
Okay, summarizing what was discussed on the Submariner Slack channel.

The plan is to implement the following.
1. We will add a new flag (name to be decided) to "subctl join ..." allowing the user to specify the globalCIDR for the cluster. As @mike said, this will also help us to customize the globalCIDR per cluster (i.e., it can be outside of globalnet-cidr-range which currently defaults to 169.254.0.0/16)
2. When user does not specify the flag during "subctl join ...", subctl will take a lock (thanks @miguel for this suggestion) on the Broker (with an appropriate timeout - say 60 secs) and allocate a globalCIDR chunk to the joining cluster from globalnet-cidr-range. The lock will be released after the clusterCRD is created on the Broker (or after a timeout). 

Option 1 above can be used in the KIND/CI deployments to speed up the deployment/test duration.
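The lock-with-timeout in step 2 behaves like a lease: acquisition succeeds if the lock is free or the previous holder's lease has expired, which prevents a crashed subctl from blocking joins forever. A toy model (BrokerLease is an illustrative name; a real implementation would use a Kubernetes resource on the Broker rather than in-process state):

```python
import time

class BrokerLease:
    """Toy lease: acquisition succeeds if the lock is free or the previous
    holder's lease has expired; only the current holder may release it."""
    def __init__(self, timeout=60):
        self.timeout = timeout
        self.holder = None
        self.acquired_at = 0.0

    def try_acquire(self, who, now=None):
        now = time.monotonic() if now is None else now
        expired = self.holder is not None and now - self.acquired_at > self.timeout
        if self.holder is None or expired:
            self.holder, self.acquired_at = who, now
            return True
        return self.holder == who  # re-acquire by the same holder succeeds

    def release(self, who):
        if self.holder == who:
            self.holder = None
```

With a 60-second timeout, a second cluster joining 30 seconds after the first is refused, but succeeds once the first holder's lease lapses.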


Best Regards,
--Sridhar.

Sridhar Gaddam

Apr 6, 2020, 5:20:15 AM
to Livnat Peer, Miguel Angel Ajo Pelayo, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 2:09 PM Livnat Peer <lp...@redhat.com> wrote:



perfect.
Is it possible to change the allocation assuming that no global-ip is already allocated in the cluster?
Currently, when globalnet is enabled, it has to be enabled on ALL the clusters. As you know, we do not support a mix of globalnet vs non-globalnet deployments in the clusters.
If the question is about migrating a cluster from non-globalnet deployment (i.e., non-overlapping Clusters) to a Globalnet deployment (i.e., at least some clusters Overlap), then I "think" it should be possible. The user/operator has to update the Submariner DaemonSet with globalCIDR and restart all the submariner components. 
We did some basic testing in this area and also addressed an issue [*], but this is not tested extensively. Technically, this should be possible and we have to ensure that cleanup is properly handled during the migration.
 
What do you think about enabling a disruptive operation of changing the global cidr which will re-allocate new IPs to all the Pods in the cluster (with the understanding that this operation is disruptive to any cross cluster communication that is going on)?
There are a couple of possibilities here, like increasing the globalCIDR size (i.e., 169.254.0.0/24 is changed to 169.254.0.0/19) or changing the globalCIDR itself (i.e., 169.254.0.0/24 is changed to 172.16.0.0/24).
We have implemented the globalnet controller to be idempotent (i.e., you can restart it multiple times and there will not be any disruption to the existing IPs allocated to the Pods/Services, as well as to the datapath).
This should "ideally" work when the globalCIDR size is increased. But when we change the globalCIDR itself, it "will" be a disruptive operation (as you pointed out) and we might have to handle a few corner cases in the globalnet controller. Again, this is something that needs a good amount of testing.
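The distinction between the two cases can be checked mechanically: growing the CIDR keeps the old range as a subset (existing global IPs stay valid), while moving it does not. A quick illustration using the example prefixes above (a sketch, not Submariner code):

```python
import ipaddress

old = ipaddress.ip_network("169.254.0.0/24")
bigger = ipaddress.ip_network("169.254.0.0/19")   # same base, larger range
moved = ipaddress.ip_network("172.16.0.0/24")     # entirely different range

# Growing the range keeps every already-allocated global IP valid...
assert old.subnet_of(bigger)
# ...while moving to a different range invalidates all of them.
assert not old.subnet_of(moved)
```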


Giving this another thought, it is like disconnecting and reconnecting a cluster with a different CIDR, in some way.
Indeed, yes. When we talk about modifying the globalCIDR, we will basically be updating the Submariner DaemonSet, which requires a restart of the Submariner components.

Vishal Thapar

Apr 6, 2020, 5:46:18 AM
to Sridhar Gaddam, Miguel Angel Ajo Pelayo, Mike Kolesnik, Thomas Pantelis, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 2:18 PM Sridhar Gaddam <sga...@redhat.com> wrote:
Okay, summarizing what was discussed on the Submariner Slack channel.

The plan is to implement the following.
1. We will add a new flag (name to be decided) to "subctl join ..." allowing the user to specify the globalCIDR for the cluster. As @mike said, this will also help us to customize the globalCIDR per cluster (i.e., it can be outside of globalnet-cidr-range which currently defaults to 169.254.0.0/16)
2. When user does not specify the flag during "subctl join ...", subctl will take a lock (thanks @miguel for this suggestion) on the Broker (with an appropriate timeout - say 60 secs) and allocate a globalCIDR chunk to the joining cluster from globalnet-cidr-range. The lock will be released after the clusterCRD is created on the Broker (or after a timeout). 
I do think that we should prioritize getting the subm contents moved to a CR on the broker. We will need that to support API-driven deployment. Adding locks etc. to subm is just a stop-gap arrangement.

IMO, we should go with what Mike proposed: Option 1 to address our KIND/CI use case for now, and document the sequential limitation of what we already have. This should be enough while we work on moving broker to operator from subctl and replace subm file with a CR on broker to support API deployments.

Sridhar Gaddam

Apr 6, 2020, 7:34:26 AM
to Vishal Thapar, Miguel Angel Ajo Pelayo, Mike Kolesnik, Thomas Pantelis, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 3:15 PM Vishal Thapar <vth...@redhat.com> wrote:


I do think that we should prioritize getting the subm contents moved to a CR on the broker. We will need that to support API-driven deployment. Adding locks etc. to subm is just a stop-gap arrangement.

IMO, we should go with what Mike proposed: Option 1 to address our KIND/CI use case for now, and document the sequential limitation of what we already have. This should be enough while we work on moving broker to operator from subctl and replace subm file with a CR on broker to support API deployments.
Using a lock on the Broker should be a simple mechanism. Anyway, I'd like to understand a few things first.
Can you elaborate on this sentence: "while we work on moving broker to operator from subctl and replace subm file with a CR on broker to support API deployments"?

Vishal Thapar

Apr 6, 2020, 7:41:01 AM
to Sridhar Gaddam, Miguel Angel Ajo Pelayo, Mike Kolesnik, Thomas Pantelis, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 5:04 PM Sridhar Gaddam <sga...@redhat.com> wrote:


Using a lock on the Broker should be a simple mechanism. Anyway, I'd like to understand a few things first.
Can you elaborate on this sentence: "while we work on moving broker to operator from subctl and replace subm file with a CR on broker to support API deployments"?
This refers to another discussion thread we have going with the OpenShift folks on using API-based deployments instead of a CLI like subctl. Today, broker installation is 100% through subctl; there is no operator logic for the broker. I know Stephen was looking into it way back before 0.1.1, but it was never a priority. Moving the broker to the operator would be a prerequisite to supporting API-driven, non-subctl deployments. And we need a way to replace the broker-info.subm file for use cases where there is no subctl.

Today we can bypass subctl by creating the Submariner CR directly, but:
1. It doesn't set up the broker.
2. You have to manually provide all the fields, including globalCIDR.

The ease of use provided by subctl, especially for globalnet where it auto-allocates CIDRs, can and I think should be done in the operator. subctl then becomes just a lightweight front-end for those who prefer a CLI.

Miguel Angel Ajo Pelayo

Apr 6, 2020, 8:10:59 AM
to Vishal Thapar, Sridhar Gaddam, Mike Kolesnik, Thomas Pantelis, Multi-cluster networking team, submariner-dev
There are multiple things that we may want to move to the operator (out of subctl):

* Lighthouse deployment
* Network discovery
* Global CIDR allocation
* Broker deployment

Sridhar Gaddam

Apr 6, 2020, 2:27:52 PM
to Miguel Angel Ajo Pelayo, Mike Kolesnik, Vishal Thapar, Thomas Pantelis, Multi-cluster networking team, submariner-dev
On Mon, Apr 6, 2020 at 2:17 PM Sridhar Gaddam <sga...@redhat.com> wrote:
Okay, summarizing what was discussed on the Submariner Slack channel.

The plan is to implement the following.
1. We will add a new flag (globalnet-cidr) to "subctl join ..." allowing the user to specify the globalCIDR for the cluster. As @mike said, this will also help us to customize the globalCIDR per cluster (i.e., it can be outside of globalnet-cidr-range which currently defaults to 169.254.0.0/16)
Following issue is reported to track this - https://github.com/submariner-io/submariner-operator/issues/298
2. When the globalnet-cidr flag is not specified during the subctl join operation, subctl will continue to allocate an unused chunk of globalCIDR from the globalnet-cidr-range and store the allocation in a ConfigMap on the Broker cluster.
During the join operation, subctl will always refer to the ConfigMap on the Broker before making new allocations.
Following issue is reported to track this - https://github.com/submariner-io/submariner-operator/issues/299