I was just thinking the other day: now that we are getting constexpr if, what else could we allow at compile time? It occurs to me that we could have a nice compile-time means of doing loop unrolling. What I imagine is something like this:
constexpr for(int i = 0; i < 4; ++i) {
whatever();
}
would cause the compiler to simply emit:
whatever();
whatever();
whatever();
whatever();
Obviously the loop bounds would have to be trivially known at compile time for this to work, and it should be a hard error if they aren't. Many compilers already do the analysis that would enable this, so why not let the developer make use of that analysis?
I know that the conventional wisdom is to just write the loop, and if the compiler deems it optimal to unroll the loop, then it will. But sometimes, people just know better.
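For reference, something in this direction can already be approximated in library code. Below is a minimal sketch, assuming C++17 (fold expressions and constexpr lambdas). The names unroll, unroll_impl and sum_below are hypothetical helpers I made up for illustration, not standard facilities. The fold expands into N separate calls in the source, though whether the generated code ends up any different from an ordinary loop is still up to the optimizer.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper: expands to fn(0), fn(1), ..., fn(N-1) through a
// fold over the comma operator, so no run-time loop appears in the source.
template <class F, std::size_t... Is>
constexpr void unroll_impl(F& fn, std::index_sequence<Is...>) {
    (fn(Is), ...);
}

// Hypothetical entry point: N must be a compile-time constant, so a bound
// that is not known at compile time is a hard error, as proposed above.
template <std::size_t N, class F>
constexpr void unroll(F fn) {
    unroll_impl(fn, std::make_index_sequence<N>{});
}

// Small demonstration: sums 0 + 1 + ... + (N-1) using the helper.
template <std::size_t N>
constexpr std::size_t sum_below() {
    std::size_t sum = 0;
    unroll<N>([&](std::size_t i) { sum += i; });
    return sum;
}

static_assert(sum_below<4>() == 6, "0 + 1 + 2 + 3");
```

With this, unroll<4>([](std::size_t) { whatever(); }) plays the role of the four-iteration example above. It works, but it is a lot noisier than a language-level constexpr for would be, which is part of why I am asking.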
I can imagine versions of many algorithms that are functionally identical to the standard ones, but allow the user to specify how much to unroll via template parameters, like this (pardon any typos):
namespace unrolled {
    template <int N, class In, class Size, class F>
    F for_each_n(In first, Size count, F fn) {
        Size rounded = (count / N) * N; // round count down to nearest multiple of N
        Size i = 0;
        // do as much as possible in chunks of unrolled size N
        while(i < rounded) {
            constexpr for(int j = 0; j < N; ++j) {
                fn(first[i + j]); // offset by i so each chunk advances
            }
            i += N;
        }
        // handle the leftover elements one at a time
        while(i < count) {
            fn(first[i++]);
        }
        return fn;
    }
}
Which would enable code like this:
unrolled::for_each_n<3>(std::begin(arr), 30, [](auto elem) {
    // do whatever with elem
});
It would be functionally the same as a regular std::for_each_n, but unrolled into blocks of size 3 as much as possible.
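For comparison, here is a rough sketch of how the same chunking could be written in today's C++17 without the proposed syntax, using std::index_sequence and a fold expression in place of the inner constexpr for. The namespace unrolled_sketch and the names call_block and sum_first are hypothetical, chosen only for this example. It assumes a random-access iterator (it uses first[i]) and returns the function object rather than an iterator.

```cpp
#include <cstddef>
#include <utility>

namespace unrolled_sketch {

// Hypothetical helper: expands to
// fn(first[base + 0]), ..., fn(first[base + N-1]) with no inner loop.
template <class In, class Size, class F, std::size_t... Js>
constexpr void call_block(In first, Size base, F& fn,
                          std::index_sequence<Js...>) {
    (fn(first[base + static_cast<Size>(Js)]), ...);
}

// Processes count elements in blocks of N, then the remainder one by one.
template <std::size_t N, class In, class Size, class F>
constexpr F for_each_n(In first, Size count, F fn) {
    static_assert(N > 0, "unroll factor must be positive");
    const Size step = static_cast<Size>(N);
    const Size rounded = (count / step) * step;  // largest multiple of N <= count
    Size i = 0;
    for (; i < rounded; i += step) {
        call_block(first, i, fn, std::make_index_sequence<N>{});
    }
    for (; i < count; ++i) {  // leftover elements
        fn(first[i]);
    }
    return fn;
}

}  // namespace unrolled_sketch

// Demonstration: sums the first `count` values of 1..10 in blocks of 3.
constexpr int sum_first(int count) {
    int data[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int sum = 0;
    unrolled_sketch::for_each_n<3>(data, count, [&](int x) { sum += x; });
    return sum;
}

static_assert(sum_first(10) == 55, "three blocks of 3 plus one leftover");
```

The extra helper and the index_sequence plumbing are exactly the boilerplate that a constexpr for would make unnecessary.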
Thoughts? Is this just not worth the effort?
Evan