Dear All,
I had a couple of quick questions regarding how the 5000 Autopilot/No Autopilot jobs were selected to generate the Memory Slack CDFs (Fig. 3, 4) in the "Autopilot: workload autoscaling at Google" paper.
1. How were these 5000 jobs selected exactly? Were 5000 random jobs selected from all jobs that had Autopilot enabled (with the respective algorithm), and then another 5000 selected from jobs without Autopilot? If this is the case, how is it determined whether or not a job has Autopilot enabled? Is this done randomly, or is there some logic that decides whether a job has Autopilot enabled, i.e. do certain types of workloads have Autopilot enabled while others do not?
2. Regarding the 8 different clusters in the dataset, how are jobs assigned to these clusters? Is this done randomly, or is there some system under which a specific cluster is typically used to run a certain type of job?
Thanks for the help and clarification on these questions. I appreciate the information!
Sincerely,
Willie