Spark Summit EU talk and related work


Alvaro Brandon

Oct 27, 2017, 3:44:11 AM
to dr-elephant-users
Hello everyone:

I recently attended Akshay Rai's talk about Dr. Elephant and found it very enlightening, especially because this topic is sometimes ignored despite having a huge impact on the cost, energy, and resources consumed when running Big Data jobs. A bit of analysis of your jobs' metrics, as Dr. Elephant performs, can prevent future cluster usage problems.

Along these lines, we recently published a paper on providing automatic recommendations for the level of parallelism of Spark jobs. Parallelism parameters, like spark.executor.cores or spark.executor.memory, can have a huge impact on performance, but it is sometimes difficult to know which values you need. What we did was leverage the metrics from previous executions to train a boosted decision tree and predict the impact of changing Spark's parallelism parameters. Depending on the workload, different values for spark.executor.memory and spark.executor.cores can greatly reduce execution time. The good thing is that the model learns by itself, so you don't need heuristics. I attach the paper in case some of the concepts are useful for you.
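To give a feel for the idea, here is a minimal sketch (not the paper's actual code, and with entirely synthetic data) of training a gradient-boosted decision tree on metrics from past runs to predict execution time under candidate spark.executor.cores / spark.executor.memory settings, then picking the fastest predicted configuration:

```python
# Hypothetical sketch: a boosted decision tree learns runtime from past
# (cores, memory, input size) observations; all data below is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic history of 200 past executions: each row is
# (spark.executor.cores, spark.executor.memory in GB, input size in GB).
X = rng.uniform([1, 1, 10], [8, 16, 100], size=(200, 3))
# Toy runtime model (an assumption): more cores/memory help, larger input hurts.
y = X[:, 2] * 60 / (X[:, 0] * np.sqrt(X[:, 1])) + rng.normal(0, 5, 200)

model = GradientBoostingRegressor(n_estimators=200, max_depth=3)
model.fit(X, y)

# Predict runtime for candidate (cores, memory) settings on a 50 GB input
# and recommend the configuration with the lowest predicted runtime.
candidates = np.array([[c, m, 50.0] for c in (2, 4, 8) for m in (4, 8, 16)])
preds = model.predict(candidates)
best = candidates[np.argmin(preds)]
print("recommended cores:", best[0], "memory (GB):", best[1])
```

In a real deployment the features would come from the Spark history server or Dr. Elephant's fetched metrics rather than a synthetic generator, and you would cross-validate the model before trusting its recommendations.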

Keep up the good work! :)

Alvaro
Using ML to optimise Spark.pdf

Akshay Rai

Nov 6, 2017, 2:13:51 AM
to dr-elephant-users
Hi Alvaro,

Thank you for sharing your feedback and your work.

We hope this will be of great help to us.

Thanks,
Akshay