RFC: RevocableInfo Changes

Benjamin Mahler

Mar 11, 2016, 10:09:46 PM
to dev, Mesos Resource Allocation Working Group
Hey folks,

In the resource allocation working group we've been looking into a few projects that will make the allocator able to offer out resources as revocable. For example:

- We'll eventually want to allocate resources as revocable _by default_, only allowing non-revocable resources when there are guarantees in place (static reservations or quota).

- On the path to revocable by default, we can incrementally start to offer certain resources as revocable. Consider the case where quota is set but the role isn't using all of it. The unallocated quota can be offered to other roles, but it should be revocable because we may need to revoke it should the quota'ed role want to use the resources. Unused reservations fall into a similar category.

- Going revocable by default also allows us to enforce fairness in a dynamically changing cluster by revoking resources as weights are changed, frameworks are added or removed, etc.
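
As a rough illustration of the quota case in the list above, the following Python sketch splits a role's quota into the part the role is using and the unallocated remainder that could be offered to other roles as revocable. The names `QuotaHeadroom` and `split_quota` and the flat CPU-count model are assumptions made for this example; this is not Mesos allocator code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaHeadroom:
    """CPU accounting for one role with quota (hypothetical model)."""
    guaranteed_in_use: float    # allocated to the quota'ed role, non-revocable
    offerable_revocable: float  # unallocated quota, lendable to other roles


def split_quota(quota_cpus: float, allocated_cpus: float) -> QuotaHeadroom:
    """Split a role's quota into its used and lendable portions."""
    if quota_cpus < 0 or allocated_cpus < 0:
        raise ValueError("cpu amounts must be non-negative")
    # Only the part of the allocation that fits inside the quota is guaranteed.
    in_use = min(quota_cpus, allocated_cpus)
    return QuotaHeadroom(guaranteed_in_use=in_use,
                         offerable_revocable=quota_cpus - in_use)
```

Anything lent out of `offerable_revocable` would have to be revoked again once the quota'ed role asks for it, which is why it can only be offered as revocable.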

In this context, "revocable" means that the resources may be taken away and the container will be destroyed. The meaning of "revocable" in the context of usage oversubscription includes this, but the container may also experience throttling (e.g. lower cpu shares, less network priority, etc).

For this reason, and because we internally need to distinguish between revocable resources that are generated by usage oversubscription and those that are generated by the allocator, we're thinking of the following change to the API:



-  message RevocableInfo {}
+  message RevocableInfo {
+    message ThrottleInfo {}
+
+    // If set, indicates that the resources may be throttled at
+    // any time. Throttle-able resources can be used for tasks
+    // that do not have strict performance requirements and are
+    // capable of handling being throttled.
+    optional ThrottleInfo throttle_info = 1;
+  }

   // If this is set, the resources are revocable, i.e., any tasks or
-  // executors launched using these resources could get preempted or
-  // throttled at any time. This could be used by frameworks to run
-  // best effort tasks that do not need strict uptime or performance
+  // executors launched using these resources could be terminated at
+  // any time. This could be used by frameworks to run
+  // best effort tasks that do not need strict uptime
   // guarantees. Note that if this is set, 'disk' or 'reservation'
   // cannot be set.
   optional RevocableInfo revocable = 9;



Essentially we want to distinguish between revocable and revocable + throttle-able. This is because usage-oversubscription generates throttle-able revocable resources, whereas the allocator does not. This also solves our problem of distinguishing between these two kinds of revocable resources internally.
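
To make the distinction concrete, here is a minimal Python sketch of how the proposed message shape separates the three cases. The dataclasses only mirror the proposed proto fields; `Resource`, `Kind` and `classify` are hypothetical names introduced for this example and are not part of the Mesos API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


@dataclass(frozen=True)
class ThrottleInfo:
    """Empty marker, mirroring the proposed nested message."""


@dataclass(frozen=True)
class RevocableInfo:
    # Set only for resources generated by usage oversubscription.
    throttle_info: Optional[ThrottleInfo] = None


@dataclass(frozen=True)
class Resource:
    name: str
    value: float
    revocable: Optional[RevocableInfo] = None


class Kind(Enum):
    NON_REVOCABLE = "non-revocable"
    REVOCABLE = "revocable"                        # generated by the allocator
    REVOCABLE_THROTTLEABLE = "revocable+throttle"  # usage oversubscription


def classify(resource: Resource) -> Kind:
    """Tell the two kinds of revocable resources apart by field presence."""
    if resource.revocable is None:
        return Kind.NON_REVOCABLE
    if resource.revocable.throttle_info is None:
        return Kind.REVOCABLE
    return Kind.REVOCABLE_THROTTLEABLE
```

Because `throttle_info` lives inside `RevocableInfo`, a resource that is throttle-able but not revocable cannot be expressed in this shape.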

Feedback welcome!

Ben

Guangya Liu

Mar 11, 2016, 11:03:32 PM
to Mesos Resource Allocation Working Group, d...@mesos.apache.org, bma...@apache.org

Hi Ben,

I think that currently, and even in the near future, __ThrottleInfo__ will only be used by usage oversubscription; allocator oversubscription (both quota and reservations) will not use this value, since __RevocableInfo__ alone is enough for it.

I could even see __ThrottleInfo__ being a boolean value in optimistic offer phase 1, as it is mainly used to distinguish resources from usage oversubscription and from allocation oversubscription (quota and reservations). Comments?

Thanks,

Guangya

On Saturday, March 12, 2016 at 11:09:46 AM UTC+8, Benjamin Mahler wrote:

Klaus Ma

Mar 12, 2016, 3:05:03 AM
to Guangya Liu, Mesos Resource Allocation Working Group, dev, Benjamin Mahler
Yes, I think that's true for now; we define `ThrottleInfo` as a message so that it is more flexible. In Optimistic Offer Phase 1, we only use it to distinguish usage oversubscription from allocation oversubscription, similar to a bool :).

Regarding the resource types, two questions after the discussion:

1. Should we send different offers to the framework, so that when usage/allocation oversubscription is updated, only one type of offer will be rescinded?
2. Should we define a framework capability for `ThrottleInfo`?
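
One way to picture the first question is to group resources by how they were generated, so that an update to one kind of oversubscription only touches its own offer. The Python sketch below is an illustration under that assumption; `Res`, `offer_key`, `split_offers` and the three group names are hypothetical and do not reflect how the Mesos master builds offers.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Res(NamedTuple):
    name: str
    value: float
    revocable: bool = False
    throttleable: bool = False  # only meaningful when revocable is True


def offer_key(r: Res) -> str:
    """Name the offer group a resource would belong to."""
    if not r.revocable:
        return "regular"
    if r.throttleable:
        return "usage-oversubscribed"
    return "allocation-oversubscribed"


def split_offers(resources: List[Res]) -> Dict[str, List[Res]]:
    """Partition resources into one offer per group."""
    offers: Dict[str, List[Res]] = defaultdict(list)
    for r in resources:
        offers[offer_key(r)].append(r)
    return dict(offers)
```

With this split, rescinding the "usage-oversubscribed" offer after a new estimate would leave the other two groups untouched.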

----
Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer 
Platform OpenSource Technology, STG, IBM GCG 
+86-10-8245 4084 | klaus1...@gmail.com | http://k82.me

--
You received this message because you are subscribed to the Google Groups "Mesos Resource Allocation Working Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mesos-allocati...@googlegroups.com.
To post to this group, send email to mesos-al...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mesos-allocation/a68b9e92-f22e-4cf7-9499-b982c9c07613%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Klaus Ma

Mar 15, 2016, 10:24:14 PM
to Mesos Resource Allocation Working Group, d...@mesos.apache.org, bma...@apache.org
The patches are updated accordingly; JIRA: MESOS-3888, RR: https://reviews.apache.org/r/40375/

Thanks
klaus

Guangya Liu

Mar 16, 2016, 9:32:37 AM
to Mesos Resource Allocation Working Group, d...@mesos.apache.org, bma...@apache.org
Also, please share any comments on the name. The current name is ThrottleInfo; the Kubernetes resource QoS design document uses "scavenging" as the keyword for this behaviour, so a possible name here could be ScavengeInfo. Please comment on these two names, or propose a new one.

message RevocableInfo {
  message ThrottleInfo {}

  // If set, indicates that the resources may be throttled at
  // any time. Throttle-able resources can be used for tasks
  // that do not have strict performance requirements and are
  // capable of handling being throttled.
  optional ThrottleInfo throttle_info = 1;
}

On Wednesday, March 16, 2016 at 10:24:14 AM UTC+8, Klaus Ma wrote:

Klaus Ma

Mar 19, 2016, 7:01:11 AM
to Mesos Resource Allocation Working Group, d...@mesos.apache.org, bma...@apache.org
@team, in the latest meeting we agreed to keep the current name, ThrottleInfo.

If there are any more comments, please let me know.

Klaus Ma

Mar 20, 2016, 11:33:41 PM
to dev, Mesos Resource Allocation Working Group, Benjamin Mahler
Here's some input :).

If throttling is tolerable but preemption is not, how would that be expressed? (Is that supported?)
[Klaus]: It's not supported; only revocable resources have this attribute, being either non-throttleable or throttleable. Throttleable revocable resources are reported by the ResourceEstimator, which means the resources may be throttled by their original owner.

How does this work with the QoS controller? Will there be a new correction type to indicate throttling, or does throttling happen "behind the agent's back"?
[Klaus]: The QoSController/ResourceEstimator only manages throttleable revocable resources; the other resources (regular resources and non-throttleable revocable resources) are managed by the allocator. Here "manage" means generation and destruction/eviction. Regarding how throttling happens, good question. I think the throttling will depend on the containers; let me double-check it :).

If there are any comments, please let me know.

----
Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer 
Platform OpenSource Technology, STG, IBM GCG 
+86-10-8245 4084 | klaus1...@gmail.com | http://k82.me

On Sat, Mar 19, 2016 at 11:15 PM, <conno...@gmail.com> wrote:
Thanks for the good explanations so far Ben and Klaus.  Apologies if you guys already covered these questions in the meeting:

If throttling is tolerable but preemption is not, how would that be expressed? (Is that supported?)

How does this work with the QoS controller? Will there be a new correction type to indicate throttling, or does throttling happen "behind the agent's back"?

Thanks,
--
Connor

Joris Van Remoortere

Mar 21, 2016, 4:26:33 AM
to Klaus Ma, dev, Mesos Resource Allocation Working Group, Benjamin Mahler
@klaus:
I think @connor's question is whether we are absolutely sure we never want to support throttle-able but non-revocable resources.
It's clear from the protos that this is not supported, the question is whether we are sure that is what we want. If so, can you elaborate as to *why* we would never want that concept in Mesos.

— 
Joris Van Remoortere
Mesosphere

Benjamin Mahler

Mar 21, 2016, 4:13:10 PM
to Joris Van Remoortere, Klaus Ma, dev, Mesos Resource Allocation Working Group
Yeah that's definitely a question I've been asking myself, and we synced on that with Niklas during the last meeting. The thought currently is that we should choose a better name than ThrottleInfo. ThrottleInfo seems to carry too strong of an implication about what the resources will experience. Rather, we could pick a name like "ScavengeInfo" / "BestEffortInfo" / etc that indicates that these resources are running within the un-utilized portion of the machine and _may_ experience degradation.

Guangya Liu

Mar 21, 2016, 10:23:41 PM
to Mesos Resource Allocation Working Group, jo...@mesosphere.io, klaus1...@gmail.com, d...@mesos.apache.org, bma...@apache.org
Some of my thinking here:

1) ThrottleInfo may need to belong to "Resources" rather than being limited to "RevocableInfo". The cpus resources can be throttled even if they are not revocable.
2) There needs to be a flag to indicate whether the resources are scavenge-able or best effort. I have no preference between the two names, as both seem clear enough to describe the resource type. The Kubernetes document also uses the scavenge and best effort concepts.

message ThrottleInfo {}
message RevocableInfo {
  message ScavengeInfo {}

  // If set, indicates that the resources may be revoked at
  // any time. Scavenge-able resources can be used for tasks
  // that do not have strict performance requirements and are
  // capable of handling being revoked.
  optional ScavengeInfo scavenge_info = 1;
}

Thanks,

Guangya

On Tuesday, March 22, 2016 at 4:13:10 AM UTC+8, Benjamin Mahler wrote:

Klaus Ma

Mar 22, 2016, 2:50:43 AM
to Mesos Resource Allocation Working Group, jo...@mesosphere.io, klaus1...@gmail.com, d...@mesos.apache.org, bma...@apache.org
@benm/joris,

Here's the user scenario I have in mind:

1. The master offers resources to the framework, e.g. 2 cpus.
2. The framework launches a task (2 cpus) and marks the task/executor as throttleable.
3. The ResourceEstimator should only consider the throttleable tasks/executors:
  - keep enough resources for the tasks/executors without the throttleable flag/attribute
  - report resources that are allocated but not used by tasks/executors with the throttleable flag/attribute; for example, report 1 cpu as "Revocable.Throttleable" resources to the framework in this case
4. It's up to the framework which resources to use: "Revocable.Throttleable" means it will share compressible resources with the resource owner, while "Revocable" (without ThrottleableInfo) means it will be evicted when the resource owner reclaims the resources.
5. The QoS Controller makes sure that:
  - there are enough resources for the tasks/executors without the throttleable flag/attribute
  - if used resources exceed the resources allocated with the throttleable flag/attribute, the task/executor on revocable resources is evicted
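
Steps 3 and 5 of the scenario above could be sketched roughly as follows. This is one possible reading of the scenario, not the actual ResourceEstimator or QoS Controller module interface: the `Task` fields, the function names and the "evict guests in order until usage fits" policy are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    name: str
    allocated_cpus: float
    used_cpus: float
    throttleable: bool = False  # hypothetical flag set by the framework at launch
    on_revocable: bool = False  # True if the task runs on revocable resources


def estimate_throttleable_revocable(tasks: List[Task]) -> float:
    """Step 3: report allocated-but-unused cpus of throttleable owners only."""
    return sum(max(t.allocated_cpus - t.used_cpus, 0.0)
               for t in tasks if t.throttleable and not t.on_revocable)


def tasks_to_evict(tasks: List[Task]) -> List[str]:
    """Step 5: evict tasks on revocable resources while the combined usage
    exceeds what was allocated to the throttleable owners."""
    owners = [t for t in tasks if t.throttleable and not t.on_revocable]
    guests = [t for t in tasks if t.on_revocable]
    capacity = sum(t.allocated_cpus for t in owners)
    usage = sum(t.used_cpus for t in owners) + sum(t.used_cpus for t in guests)
    evicted: List[str] = []
    for guest in guests:
        if usage <= capacity:
            break
        usage -= guest.used_cpus
        evicted.append(guest.name)
    return evicted
```

In the 2 cpu example, an owner that uses 1 cpu yields an estimate of 1 cpu; if the owner later grows to 1.5 cpus while a guest task uses 1 cpu, the guest is selected for eviction.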

So, to @connor's question: maybe a flag/attribute on the task/executor when launching it. Regarding the name, "ScavengeInfo", "BestEffortInfo" and "ThrottleableInfo" are all OK for me; maybe "ScavengeInfo" is better.

Any comments?

For this scenario, I think there are still open questions:
1. Can a framework launch a task with the throttleable flag/attribute on revocable resources?
2. For the ResourceEstimator/QoS Controller, should the agent double-check what they report?
3. What is the behaviour between the two containers: the container on the original resources and the container on revocable resources?
4. Who handles compressible/incompressible resources? Maybe the ResourceEstimator/QoSController; it should not report incompressible resources as Revocable.Throttleable.

Thanks
Klaus
