codegen is a big one, as are inference.jl, gf.c, and cgutils.cpp. But there
are optimizations sprinkled throughout (e.g., ccall.cpp).
You might be interested in this:
https://github.com/JuliaLang/julia/issues/3440
Most of the optimizations so far are low level; most of the higher-level stuff
tends to be macros in packages (@devec being a prime example; I'm working on
another now). The fact that @devec didn't work for you is evidence that this
is nontrivial (I bet that Dahua would be interested in contributions that
improve it). In the longer run, it might be interesting to experiment with
LLVM's Polly, but I'm not very clear on how far that project has gotten in
practice.
--Tim