I've been playing around with LLVM to write a backend for a rather "simple" (co-)processor. Assume that only three arithmetic instructions exist: ADD mod N, SUB mod N and MUL mod N. The modulus N is programmable and stored in a register. No ordinary arithmetic instructions are available. The word size is 256-bit.
In other words, the following function, b + c mod N, corresponds to only one instruction on my target machine, given the modulus N set by another instruction to an auxiliary register.
# this should translate to one instruction like: r3 = add r1, r2
# n is extended to 257-bit for simplicity
define i256 @add(i256 %b, i256 %c, i257 %n) {
%1 = zext i256 %b to i257
%2 = zext i256 %c to i257
%3 = add i257 %2, %1
%4 = icmp uge i257 %3, %n
%5 = select i1 %4, i257 %n, i257 0
%6 = sub i257 %3, %5
%7 = trunc i257 %6 to i256
ret i256 %7
}
The ultimate goal is to use LLVM to optimize and emit machine code for programs full of modular addition, subtraction and multiplication. For example, I might want off-the-shelf LLVM passes to optimize (a+b+c)-(a+c) mod p = b mod p for me. For a large syntax tree, say more than 10,000 nodes, I might even need common subexpression elimination. Of course, we can target the same program to ARM or X86 for performance comparison.
The questions is: We have to recognize the patterns for modular arithmetic from optimized IR and translate into corresponding instructions. After investigation, I think LLVM can not do a good job on such a simple but unordinary instruction set. Is it possible that the patterns still recognizable after optimization? On the whole, is this viable using LLVM or is it better to write a compiler for this particular DSL from scratch?