Add notion of evictable task to RunTaskMessage


Guangya Liu

Mar 13, 2016, 5:10:27 AM
to Mesos Resource Allocation Working Group
Hi Ben and folks,

There has been some discussion about how dynamic reservation and revocable resources interact. I have a few questions I would like to get confirmed, to make sure that we are on the same page. For more detail, please refer to https://docs.google.com/document/d/1B_v52zCOFcwCpqCPhgYi9h630a0NE-QM9Br0nCOZUR4/edit#

1) Add RevocationEvictResource to RunTaskMessageTaskInfo; the master will set RevocationEvictResource in RunTaskMessageTaskInfo before launching a task.

message RunTaskMessageTaskInfo {
  ...
  // Evict Resources to launch tasks.
  message RevocationEvictResource {
    string role = 1;
    repeated Resource revocable_resources = 2;
  }
  repeated RevocationEvictResource revocationsevict_resources = 3;
}

2) The master will decide how many resources it wants to revoke.

3) When dynamic reservations change and some reservations are removed, just update the revocable resources directly.

4) When a framework launches tasks with regular resources, the master will try to evict some revocable resources if there are not enough resources available.

One question about step 4: the task is using regular resources, not reserved resources. Can regular resources also trigger eviction? I think the answer is yes; is that right?

Take the following case as an example of the above discussion:
1) One agent has resources: cpus(*):10
2) Reserve resources on this agent via the HTTP endpoint: cpus(r1):10
3) The total resources on this agent are then cpus(r1):10;cpus(*){REV}:10
4) A framework gets all the revocable resources on the agent and launches a task on all of them: cpus(*){REV}:10
5) Unreserve 6 cpus on the agent. The total resources become cpus(*):6;cpus(r1):4;cpus(*){REV}:4, but 10 REV cpus are still in use.
6) Launch a task that requests cpus(*):6, so 6 REV cpus need to be evicted.
7) This means that even a task using regular resources can still trigger eviction. Comments?
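
The accounting in the example above can be checked with a small sketch. This is only an illustration in Python, not Mesos code: the Agent class and its field and method names are hypothetical. It assumes that every reserved cpu is re-offered as cpus(*){REV} and that a new task can only start on cpus that no running task is holding.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical model of one agent's cpu accounting (not a Mesos API)."""
    total: int                 # physical cpus on the agent
    reserved: int = 0          # cpus(r1)
    revocable_in_use: int = 0  # cpus(*){REV} held by running tasks
    regular_in_use: int = 0    # cpus held by non-revocable tasks

    @property
    def unreserved(self) -> int:
        return self.total - self.reserved

    @property
    def revocable(self) -> int:
        # Assumption: every reserved cpu is re-offered as revocable.
        return self.reserved

    def cpus_to_evict(self, request: int) -> int:
        """REV cpus that must be evicted to start a task needing `request` cpus."""
        free = self.total - self.revocable_in_use - self.regular_in_use
        return max(0, request - free)


agent = Agent(total=10)        # step 1: cpus(*):10
agent.reserved = 10            # step 2: cpus(r1):10, so cpus(*){REV}:10
agent.revocable_in_use = 10    # step 4: a task takes all REV cpus
agent.reserved -= 6            # step 5: unreserve 6 cpus
need = agent.cpus_to_evict(6)  # step 6: launch a task on cpus(*):6
print(agent.unreserved, agent.reserved, agent.revocable, need)  # 6 4 4 6
```

Under these assumptions the agent has no idle cpus at step 6, so the regular task needs 6 REV cpus evicted, which leaves 4 REV cpus in use and matches the cpus(*){REV}:4 total from step 5.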

Another question is about the protobuf change:


message RunTaskMessageTaskInfo {
  ...
  // Evict Resources to launch tasks.
  message RevocationEvictResource {
    string role = 1;
    repeated Resource revocable_resources = 2;
  }
  repeated RevocationEvictResource revocationsevict_resources = 3;
}

I think that RevocationEvictResource may not need a role, only the revocable resources: revocable resources do not carry roles, and the revocable resources from different roles are all flattened into the same revocable pool, e.g. cpus(r1):1;cpus(r2):2 ==> cpus(*){REV}:3
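
As a minimal sketch of the flattening described above (illustrative Python only; the function name and the dictionary representation of resources are assumptions, not how Mesos models Resource objects):

```python
from collections import defaultdict


def flatten_to_revocable(reserved):
    """Sum per-role amounts into role-less revocable amounts.

    `reserved` maps (resource_name, role) -> amount, e.g. ("cpus", "r1"): 1.
    The result maps resource_name -> amount, read as name(*){REV}:amount.
    """
    flat = defaultdict(int)
    for (name, _role), amount in reserved.items():
        flat[name] += amount  # the role is dropped on purpose
    return dict(flat)


flat = flatten_to_revocable({("cpus", "r1"): 1, ("cpus", "r2"): 2})
print(flat)  # {'cpus': 3}
```

Once the amounts are summed this way, the role that contributed each cpu can no longer be recovered from the revocable pool, which is the reason for questioning the role field.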

I have also updated https://issues.apache.org/jira/browse/MESOS-3890 to track this.

Thanks,

Guangya

Klaus Ma

Mar 13, 2016, 8:24:00 AM
to Mesos Resource Allocation Working Group
Just back from the park :). Here are my comments:

1. Regarding "2) The master will decide how many resources it wants to revoke.": this will be true in the future, but may not hold in Optimistic Offer Phase 1:
    a. ) If we do this in the master class, the master ends up handling the allocator's job.
    b. ) If we do this in the allocator, the master class has to wait for the allocator's "Future" before launching the task; is that performance acceptable?
  So, I'd suggest keeping this calculation in the agent in Phase 1 and revisiting it after "[MESOS-4553] Manage Offer in Allocator".

2. Regarding "task using regular resources still trigger eviction": this is expected behaviour, since we will let regular resources trigger eviction under "revocable by default". I think BenM is trying to generalize the behaviour of revocable resources, not only for Optimistic Offer Phase 1. @BenM, please correct me if I have misunderstood.

3. Regarding "RunTaskMessage": the revocations say how many revocable resources should be evicted in each role. This is used to balance resources between roles. For example, "TaskInfo.revocations" may have 2 items: "role = r1, revocable_resources=cpu(*){REV}:10" and "role=r2, revocable_resources=cpu(*):5"; that means evict revocable executors in r1 to get "cpu(*):10" resources and evict revocable executors in r2 to get "cpu(*):5" resources. But in Optimistic Offer Phase 1, I'd like to leave "role" empty and set "revocable_resources" to the task's reserved resources, which means it is up to the agent to decide what to evict in order to free "revocable_resources".
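
The two uses of the revocations list described in point 3 can be sketched as follows. This is illustrative Python, not Mesos code: the Revocation tuple is a hypothetical, cpus-only stand-in for the protobuf entry, eviction_plan is an invented helper, and the 6 cpus in the Phase 1 case is an example value.

```python
from typing import NamedTuple


class Revocation(NamedTuple):
    """Hypothetical stand-in for one revocations entry (cpus only)."""
    role: str
    cpus: int


def eviction_plan(revocations):
    """Group the requested evictions by role; an empty role means the agent chooses."""
    plan = {}
    for rev in revocations:
        key = rev.role or "<agent decides>"
        plan[key] = plan.get(key, 0) + rev.cpus
    return plan


# Balancing between roles: the sender names how much each role gives up.
balanced = eviction_plan([Revocation("r1", 10), Revocation("r2", 5)])
# Phase 1 proposal: empty role, amount equal to the task's reserved resources.
phase1 = eviction_plan([Revocation("", 6)])
print(balanced)  # {'r1': 10, 'r2': 5}
print(phase1)    # {'<agent decides>': 6}
```

In the first case the per-role split is fixed by whoever fills in the message; in the second the agent only learns the total and picks the victims itself.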

Thanks
Klaus

Guangya Liu

Mar 15, 2016, 9:48:27 PM
to Mesos Resource Allocation Working Group
We have updated the RunTaskMessage protobuf as follows; the addition is the Revocation message and the repeated revocations field at the end. A patch has also been uploaded here: https://reviews.apache.org/r/40532/

message RunTaskMessage {
  // TODO(karya): Remove framework_id after MESOS-2559 has shipped.
  optional FrameworkID framework_id = 1 [deprecated = true];
  required FrameworkInfo framework = 2;
  required TaskInfo task = 4;

  // The pid of the framework. This was moved to 'optional' in
  // 0.24.0 to support schedulers using the HTTP API. For now, we
  // continue to always set pid since it was required in 0.23.x.
  // When 'pid' is unset, or set to empty string, the agent will
  // forward executor messages through the master. For schedulers
  // still using the driver, this will remain set.
  optional string pid = 3;

  // Evict Resources to launch tasks.
  message Revocation {
    optional FrameworkID framework_id = 1;
    required string role = 2;
    repeated Resource revocable_resources = 3;
  }
  repeated Revocation revocations = 5;
}


Thanks,

Guangya

On Sunday, March 13, 2016 at 8:24:00 PM UTC+8, Klaus Ma wrote: