1. Source code:
...
for (int i = 0; i < M; i++)
  for (int j = 0; j < N; j++)
  {
    c[i][j] = d[i][j] + e[i][j]; // S_2
  }
2. ppcg --target=c ... generates the following code:
for (int i = 0; i <= 1023; i++)
  for (int j = 0; j <= 1023; j++)
  {
    d[i][j] = 0.0; // S_0
    for (int k = 0; k <= 31; k++)
      d[i][j] += a[i][k] * b[k][j]; // S_1
    c[i][j] = d[i][j] + e[i][j]; // S_2
  }
3. Modify one line in the input source code:
const int K = 32;
Then ppcg --target=c ... will generate the following code:
for (int i = 0; i <= 1023; i++)
  for (int j = 0; j <= 1023; j++)
    d[i][j] = 0.0; // S_0
for (int i = 0; i <= 1023; i++)
  for (int j = 0; j <= 1023; j++)
  {
    for (int k = 0; k < K; k++)
      d[i][j] += a[i][k] * b[k][j]; // S_1
    c[i][j] = d[i][j] + e[i][j]; // S_2
  }
In this version S_0 is no longer fused with S_1 and S_2; it ends up in a separate loop nest.
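To make the comparison concrete, here is a minimal sketch in plain C of the two schedules with K kept as a runtime value. It only illustrates that the fused form computes the same c and d as the split form; it says nothing about how ppcg decides. The array sizes, the input values, and the names init_inputs, run_unfused, run_fused, results_match and KMAX are assumptions made for this illustration and do not come from the original program.

```c
#include <assert.h>

/* Hypothetical sizes, kept small for a quick check (the post uses 1024 and 32). */
enum { M = 64, N = 64, KMAX = 32 };

float a[M][KMAX], b[KMAX][N], e[M][N];
float c_unfused[M][N], d_unfused[M][N];
float c_fused[M][N], d_fused[M][N];

/* Fill the inputs with arbitrary deterministic values (an assumption). */
void init_inputs(void)
{
    for (int i = 0; i < M; i++)
        for (int k = 0; k < KMAX; k++)
            a[i][k] = (float)((i + k) % 7) * 0.5f;
    for (int k = 0; k < KMAX; k++)
        for (int j = 0; j < N; j++)
            b[k][j] = (float)((k * 3 + j) % 5) - 2.0f;
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            e[i][j] = (float)(i - j) * 0.25f;
}

/* Split schedule, as generated when K is a variable: S_0 has its own nest. */
void run_unfused(int K)
{
    assert(K >= 0 && K <= KMAX); /* keep a[i][k] and b[k][j] in bounds */
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            d_unfused[i][j] = 0.0f; /* S_0 */
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++) {
            for (int k = 0; k < K; k++)
                d_unfused[i][j] += a[i][k] * b[k][j]; /* S_1 */
            c_unfused[i][j] = d_unfused[i][j] + e[i][j]; /* S_2 */
        }
}

/* Desired schedule: S_0, S_1 and S_2 share one (i, j) nest. */
void run_fused(int K)
{
    assert(K >= 0 && K <= KMAX);
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++) {
            d_fused[i][j] = 0.0f; /* S_0 */
            for (int k = 0; k < K; k++)
                d_fused[i][j] += a[i][k] * b[k][j]; /* S_1 */
            c_fused[i][j] = d_fused[i][j] + e[i][j]; /* S_2 */
        }
}

/* Returns 1 if both versions produced identical c and d, 0 otherwise. */
int results_match(void)
{
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            if (c_fused[i][j] != c_unfused[i][j] ||
                d_fused[i][j] != d_unfused[i][j])
                return 0;
    return 1;
}
```

Calling init_inputs(), then run_unfused(K) and run_fused(K) with the same K (at most KMAX), and finally results_match() should report agreement, because every element goes through the same operations in the same order in both versions.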
Is there any way to make loop fusion work even when K is declared as a variable?
Thanks in advance~
-Leo