Hi Jon,
There are so many parameters that can affect the power consumption of a particular implementation that it is not really possible to answer your question.
- On the process side, for each node, each foundry may offer different flavors, each with a different power profile...
- On the implementation side, depending on how careful you are with your CTS (clock tree synthesis), you will also observe big variations in your results.
- Finally (and not least), you will also observe huge variations depending on what kind of software you are executing for your benchmark, and on which power mode you are currently in.
So, in the end, I am afraid that there is no ready-made answer for you. The best approach is to do the exercise yourself: implement the core on your target library, simulate a piece of code that is representative of your target application, dump the VCD, and finally run a power simulation with it.
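To see how much the choice of software changes the switching activity before you run the real power simulation, you can count toggles per signal in the dumped VCD. The sketch below is only an illustration: the function name count_toggles is made up, the parser is deliberately simplified (it assumes one declaration or value change per line, ignores scopes so identically named signals in different modules are merged, and counts any change of a vector as a single toggle), and it gives you activity counts, not power numbers. Those still have to come from your power analysis tool with the actual library and parasitics.

```python
from collections import Counter


def count_toggles(vcd_text):
    """Count value changes per signal name in a VCD dump (simplified parser)."""
    names = {}            # VCD identifier code -> signal name
    last = {}             # identifier code -> last value seen
    toggles = Counter()   # signal name -> number of value changes
    in_header = True
    for line in vcd_text.splitlines():
        tok = line.split()
        if not tok:
            continue
        if in_header:
            # Declaration form: $var <type> <width> <code> <name> ... $end
            if tok[0] == "$var" and len(tok) >= 5:
                names[tok[3]] = tok[4]
            elif tok[0] == "$enddefinitions":
                in_header = False
            continue
        first = tok[0][0]
        if first in "#$":
            continue      # timestamps and $dumpvars/$end keywords
        if first in "bBrR":
            # Vector or real change: "b1010 <code>" or "r1.5 <code>"
            if len(tok) < 2:
                continue
            value, code = tok[0][1:], tok[1]
        else:
            # Scalar change: "<value><code>", e.g. "1!"
            value, code = first, tok[0][1:]
        if code not in names:
            continue
        if code in last and last[code] != value:
            toggles[names[code]] += 1
        last[code] = value
    return toggles
```

Running this on the dumps from two different benchmarks (read each file into a string and call count_toggles on it) and comparing the totals gives a quick, rough indication of how far apart their activity is.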
Hope this helps,
Oliv'