I'll just share my answers based on my own understanding.
1、 Quantization simply converts the original floating-point value into a new floating-point value that is exactly equal to some fixed-point value. For example, 1.23 can be quantized to 1.234375 (with bw=8 and fl=6, the nearest fixed-point code is 01.001111). I'm not sure about the exact process inside Ristretto; my guess is that it finds the nearest representable value under the given quantization parameters (bw and fl).
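A minimal sketch of that round-to-nearest idea, assuming signed two's-complement codes with bw total bits and fl fractional bits; the function name and saturation behavior are my own assumptions, not Ristretto's actual code:

```python
# Hypothetical round-to-nearest fixed-point quantization (not Ristretto's API).
def quantize(x, bw, fl):
    step = 2.0 ** -fl                     # smallest representable increment
    max_val = (2 ** (bw - 1) - 1) * step  # largest two's-complement value
    min_val = -(2 ** (bw - 1)) * step     # most negative value
    q = round(x / step) * step            # snap to the nearest grid point
    return max(min_val, min(max_val, q))  # saturate to the representable range

print(quantize(1.23, bw=8, fl=6))  # 1.234375, i.e. code 01.001111
```

Values outside the representable range simply saturate, which matches the usual fixed-point behavior.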
So Ristretto doesn't generate a new caffemodel, it only generates a new .prototxt. Once you have the quantization parameters for a certain layer, you can build a lookup table containing every floating-point value that the fixed-point format can represent. For example, if the first conv layer's weights have bitwidth 8 and fl 4, the lookup table holds 256 (or 255) floating-point numbers, converted from the binary fixed-point codes 0000.0000 ~ 1111.1111. Using that lookup table, you can find the nearest representable value for each original floating-point number.
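The lookup-table approach above can be sketched like this; the function names and the unsigned interpretation of the codes are my own assumptions (signed weights would enumerate two's-complement codes instead):

```python
# Hypothetical lookup-table construction, not Ristretto's actual code.
def build_lut(bw, fl):
    """Every float representable with bw total bits and fl fractional bits,
    enumerating unsigned codes 0000.0000 ... 1111.1111."""
    step = 2.0 ** -fl
    return [code * step for code in range(2 ** bw)]  # 2**bw entries

def nearest(x, lut):
    """Round x to the closest value in the lookup table."""
    return min(lut, key=lambda v: abs(v - x))

lut = build_lut(bw=8, fl=4)   # 256 values, 0.0 up to 15.9375
print(len(lut))               # 256
print(nearest(1.23, lut))     # 1.25
```

For an 8-bit table this brute-force scan is cheap; for wider bit widths you would compute the code arithmetically instead of searching.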
2、 As mentioned above, the data you extract from a fine-tuned model is still an ordinary 32-bit float caffemodel, so you can get the corresponding fixed-point values with the method described in 1.