Hi Alexey,
Thanks for your question. Let me rephrase what you are trying to do: you run the forward propagation of one Ristretto conv layer, and then you want to reproduce the layer outputs with a normal convolutional layer. This should indeed work, since the simulation of dynamic fixed point is done by quantizing the weights and inputs, running the normal forward path, and then quantizing the output.
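To make that pipeline concrete, here is a minimal sketch (function names and the 1-D convolution are my own, not the Ristretto source): quantize the inputs and weights, do the ordinary floating-point forward pass, then quantize the result.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Simulate a value in dynamic fixed point: saturate to the representable
// range, then round to the nearest representable fraction. bit_width is the
// total word width, fl the fractional length.
double quantize(double v, int bit_width, int fl) {
  const double scale = std::pow(2.0, fl);
  const double max_v = (std::pow(2.0, bit_width - 1) - 1) / scale;
  const double min_v = -std::pow(2.0, bit_width - 1) / scale;
  v = std::max(std::min(v, max_v), min_v);
  return std::round(v * scale) / scale;
}

// Plain 1-D "valid" convolution, with quantization applied to the inputs
// and weights before, and to the outputs after, the float forward path.
std::vector<double> quantized_conv1d(std::vector<double> in,
                                     std::vector<double> w,
                                     int bit_width, int fl) {
  for (double& x : in) x = quantize(x, bit_width, fl);  // quantize inputs
  for (double& x : w)  x = quantize(x, bit_width, fl);  // quantize weights
  std::vector<double> out(in.size() - w.size() + 1, 0.0);
  for (size_t i = 0; i < out.size(); ++i)
    for (size_t j = 0; j < w.size(); ++j)
      out[i] += in[i + j] * w[j];                       // normal forward path
  for (double& y : out) y = quantize(y, bit_width, fl); // quantize outputs
  return out;
}
```

All three quantization points use the same scheme here for simplicity; in practice each blob can have its own bit width and fractional length.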
Now, as you saw, Trim2FixedPoint_cpu shows how Ristretto does the quantization:
- Saturate the number
- Shift the number to the left (for FL > 0), according to the fractional length
- Round (to get rid of fractional digits)
- Shift back
All this happens in floating point format.
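The four steps above can be sketched as follows (a simplified standalone version, not the exact `Trim2FixedPoint_cpu` source; Ristretto also supports other rounding schemes, while this uses round-to-nearest):

```cpp
#include <algorithm>
#include <cmath>

// Quantize a float to dynamic fixed point with the given total bit width
// and fractional length (FL), following the four steps from the email.
double trim_to_fixed_point(double v, int bit_width, int fl) {
  const double scale = std::pow(2.0, fl);
  const double max_v = (std::pow(2.0, bit_width - 1) - 1) / scale;
  const double min_v = -std::pow(2.0, bit_width - 1) / scale;
  v = std::max(std::min(v, max_v), min_v);  // 1) saturate the number
  v *= scale;                               // 2) shift left by FL
  v = std::round(v);                        // 3) round away fractional digits
  return v / scale;                         // 4) shift back
}
```

For example, with `bit_width = 8` and `fl = 4`, the value 0.30 becomes 0.3125 (the nearest multiple of 1/16), and anything above 7.9375 saturates. Note everything stays in floating point; the shifts are just multiplications by powers of two.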
I see some minor differences in your code. Why do you delay shifting the weights back? If you shift them back right away instead, then you don't need to shift the output prior to saturation. Finally, you should also quantize the layer input.
Let me know if you can reproduce the same results,
Best,
Philipp