_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
Hi Konstantin,
I didn't find the time to write new GSoC projects for 2021 yet but
if you are interested we could probably set one up in this area. I
also CC'ed Mircea who might be interested in this too, maybe as
(co-)mentor.
We could look at loop transformations such as unrolling and fusion,
similar to the inliner work. Best case, we can distill a heuristic
out of a model we learned. We could also look at pass selection and
ordering. We started last year and I was hoping to continue. You
might want to watch https://youtu.be/TSMputNvHlk?t=617 and
https://youtu.be/nxfew3hsMFM?t=1435.
In case you're interested in a runtime topic, I really would love to
have a predictor for grid/block/thread block size for (OpenMP) GPU
kernels. We are having real trouble on that end.
I also would like to look at ML use in testing and CI.
Let me know what area sounds most interesting to you and we can
take it from there.
~ Johannes
On 2/7/21 8:31 AM, Сидоров, Константин Сергеевич wrote:
> Hello Johannes,
>
> I guess working on the loop transformations is a good starting point –
> firstly, it is similar to the already existing code, and secondly, this
> problem doesn't look too hard (especially compared with the other ideas).
> To the best of my understanding, it corresponds to refactoring
> `LoopUnrollAnalyzer` and, if I understood you correctly, `MacroFusion`, in
> the similar way it has been done with the `InlineAdvisor`.
That sounds good to me. As you said, we'd have a nice "template"
to guide such exploration. Given that GSoC is short this year, that
certainly sounds better than a big standalone project.
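For a rough picture of the advisor split being discussed, here is a sketch in Python (all class and feature names are hypothetical; LLVM's actual InlineAdvisor is C++ and looks different): one query interface, with the hand-written cost model and an ML backend as interchangeable implementations.

```python
from abc import ABC, abstractmethod

class UnrollAdvisor(ABC):
    """Hypothetical advisor interface, mirroring the InlineAdvisor idea:
    one entry point, interchangeable heuristic/ML backends."""

    @abstractmethod
    def advise(self, loop_features: dict) -> int:
        """Return a suggested unroll factor (1 = do not unroll)."""

class DefaultUnrollAdvisor(UnrollAdvisor):
    # Stands in for the existing hand-written cost model.
    def advise(self, loop_features):
        if (loop_features.get("trip_count", 0) <= 4
                and loop_features.get("body_size", 0) <= 16):
            return loop_features["trip_count"]  # fully unroll small loops
        return 1

class MLUnrollAdvisor(UnrollAdvisor):
    # Would wrap a trained model; here just a stub.
    def __init__(self, model):
        self.model = model
    def advise(self, loop_features):
        return self.model.predict(loop_features)

advisor = DefaultUnrollAdvisor()
print(advisor.advise({"trip_count": 4, "body_size": 8}))  # → 4
```

The point of the split is that the pass only ever talks to the interface, so swapping the heuristic for a learned model (or back) is a one-line change.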
>
> As for the next step, I think that knowledge distillation is a promising
> idea – in fact, we can experiment with the approaches from [1], which can
> yield a nice inference speed-up in those models.
What I meant was that it would be nice if we not only come up with
an ML advisor but also an improved non-ML predictor based on the
insights we can extract from the model. I'm not sure if that came
across right.
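A toy illustration of what "distilling a heuristic out of the model" could mean (everything here is made up for illustration): probe the learned model on many inputs and fit the simplest rule that reproduces its decisions, then ship the rule instead of the model.

```python
# Toy distillation: recover a single-threshold rule from a black-box model.
def teacher(body_size):
    # Stand-in for a trained model's unroll/no-unroll decision.
    return body_size <= 20

def distill_threshold(model, candidates):
    # Pick the cutoff that agrees with the model's decisions most often.
    best, best_agree = None, -1
    for t in candidates:
        agree = sum((x <= t) == model(x) for x in range(1, 101))
        if agree > best_agree:
            best, best_agree = t, agree
    return best

threshold = distill_threshold(teacher, range(1, 101))
print(threshold)  # → 20, i.e. the rule "unroll if body_size <= 20"
```

A real distilled heuristic would be richer (e.g. a shallow decision tree over several IR features), but the workflow is the same: the model is a teacher, and the artifact we keep is the human-readable rule.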
>
> I think working on some kind of unified pipeline for pass selection and
> ordering is also an interesting idea – off the top of my head, a viable
> approach here is to consider a pass scheduling as a single-player game and
> running a Monte-Carlo tree search to maximize some objective function. For
> example, in [2] this kind of approach is used for learning to solve vertex
> cover and max-cut, while [3] employs this approach for searching for the
> molecule design with the specified properties. See also [4] for a survey of
> RL methods (including MCTS) for combinatorial problems.
The caveat is, we actually don't want to learn arbitrary pass
pipelines because they are simply not practical. That said, we
could define/learn building blocks which we then combine. I would
however start small here as well, e.g., take a canonicalization
pass such as simplifycfg, and learn when in the pipeline we should
run it, potentially based on IR features.
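A greatly simplified sketch of the MCTS framing, treating pass scheduling as a single-player game (the pass names and the objective function are purely illustrative stand-ins; real rewards would come from compiling and measuring):

```python
import math, random

PASSES = ["simplifycfg", "instcombine", "licm"]
MAX_LEN = 3

def score(seq):
    # Made-up objective standing in for "compile and measure".
    # Rewards running instcombine right after simplifycfg.
    return sum(1.0 for a, b in zip(seq, seq[1:])
               if (a, b) == ("simplifycfg", "instcombine"))

class Node:
    def __init__(self, seq):
        self.seq, self.children, self.visits, self.value = seq, {}, 0, 0.0

def mcts(root, iters=2000):
    for _ in range(iters):
        node, path = root, [root]
        # Selection: descend through fully expanded nodes via UCB1.
        while len(node.seq) < MAX_LEN and len(node.children) == len(PASSES):
            node = max(node.children.values(),
                       key=lambda c: c.value / c.visits +
                       math.sqrt(2 * math.log(node.visits) / c.visits))
            path.append(node)
        # Expansion: add one untried pass.
        if len(node.seq) < MAX_LEN:
            p = random.choice([p for p in PASSES if p not in node.children])
            child = Node(node.seq + [p])
            node.children[p] = child
            node, path = child, path + [child]
        # Rollout: finish the sequence randomly, then evaluate it.
        seq = node.seq + [random.choice(PASSES)
                          for _ in range(MAX_LEN - len(node.seq))]
        reward = score(seq)
        # Backpropagation.
        for n in path:
            n.visits += 1
            n.value += reward

root = Node([])
mcts(root)
best = max(root.children.values(), key=lambda c: c.visits)
print(best.seq)  # most-visited choice for the first pass
```

This also shows why arbitrary learned pipelines are impractical: the search space explodes with pipeline length, which is an argument for learning small building blocks, or placement decisions for a single pass, instead.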
>
>> In case you're interested in a runtime topic, I really would love to
>> have a predictor for grid/block/thread block size for (OpenMP) GPU
>> kernels. We are having real trouble on that end.
> I'm afraid I didn't quite understand this one – could you elaborate a bit
> more on this topic?
So the problem is as follows:
Given a GPU kernel, how many thread blocks and threads per thread
block should be used. This can heavily impact performance, and so
far we use semi-arbitrary numbers in the OpenMP GPU offloading
runtime. We would use IR information, PTX information, GPU
information, and kernel parameters to learn and eventually make a
decision.
Does that make more sense?
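To make the shape of the problem concrete, a stand-in sketch (the feature names and the rule below are invented for illustration, not the OpenMP runtime's actual logic): today the launch configuration is effectively a fixed default, and the goal is to replace the hand-written rule with a learned predictor over kernel and device features.

```python
# Toy stand-in for a launch-configuration predictor for GPU kernels.
# Feature names and thresholds are illustrative only.

def default_config(_features):
    # Roughly what "semi-arbitrary numbers" means today: fixed defaults.
    return {"blocks": 128, "threads_per_block": 128}

def predicted_config(features):
    # A learned model would replace this hand-written rule.  Inputs could
    # come from IR, PTX (e.g. register pressure), the GPU, and kernel args.
    tpb = 256 if features["regs_per_thread"] <= 32 else 128
    blocks = max(1, -(-features["trip_count"] // tpb))  # ceil division
    return {"blocks": blocks, "threads_per_block": tpb}

cfg = predicted_config({"regs_per_thread": 24, "trip_count": 10000})
print(cfg)  # → {'blocks': 40, 'threads_per_block': 256}
```

The interesting part is the feature set: unlike the compile-time topics above, some inputs (kernel arguments, the actual device) are only known at launch time, so the predictor has to be cheap enough to run in the runtime.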
~ Johannes
P.S. Given that we are defining a project via email I will not write
it up for the open project page. Instead, I expect you to simply
apply to GSoC with a description we come up with here.