Handling of bias in power-of-two mode


mar...@gmail.com

Apr 16, 2018, 8:15:19 AM
to ristretto-users
Hi,
I implemented a CNN on an FPGA with dynamic fixed point quantization, and everything worked well. Now I want to go a step further and use power-of-two weights to save DSP resources.

Unfortunately, I'm having a hard time debugging. The matrix multiplications, or rather the bit shifts, should be fine, but an error always appears when the bias value is added. I tried different bit widths for the bias: a smaller bit width causes a larger error, and with Q16.16 the error almost disappears. So I don't think my design is at fault. Unfortunately, my C/CUDA skills are too limited to work through the Ristretto source code.

I guess the bias in power-of-two mode isn't quantized at all. Is this the case?


Another question: the only layer in my CNN that still needs to be handled in DFP is the average pooling layer. Do you think implementing it would be much effort for a C beginner like me, since basically only the result of the averaging division needs to be quantized?

Best regards
Martin

pmg...@ucdavis.edu

Apr 17, 2018, 2:24:16 AM
to ristretto-users
Hi Martin,

Great to hear your fixed point CNN is working on the FPGA. As for the integer-power-of-two weights: Ristretto doesn't quantize the bias in this mode; see QuantizeWeights_gpu() in base_ristretto_layer.cu for details. On the FPGA (or DSP), however, you need to handle this differently: use dynamic fixed point for the bias. You can quantize the bias the same way Ristretto quantizes activations to dynamic fixed point: use enough integer bits that no saturation occurs during quantization. 8 bits should be enough for the bias in most CNNs. If you want to know how Ristretto computes the number of integer bits required to avoid saturation, take a look at quantization.cpp, line 164.
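In rough C++ terms, the idea looks something like the sketch below. This is only my own illustration with made-up names, not the actual Ristretto code: pick the integer length from the largest absolute bias value, give the remaining bits to the fraction, then round each value to the nearest representable step.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch only: quantize a bias vector to dynamic fixed point.
std::vector<float> QuantizeBiasDFP(const std::vector<float>& bias, int bit_width) {
  float max_abs = 0.f;
  for (float b : bias) max_abs = std::max(max_abs, std::fabs(b));
  // Integer length incl. sign bit, e.g. max_abs = 3.2 -> il = 3 (covers +/-4).
  int il = (max_abs > 0.f) ? static_cast<int>(std::ceil(std::log2(max_abs) + 1)) : 1;
  int fl = bit_width - il;  // remaining bits become the fractional length

  float scale = std::pow(2.f, static_cast<float>(fl));
  float max_q = (std::pow(2.f, static_cast<float>(bit_width - 1)) - 1.f) / scale;
  float min_q = -std::pow(2.f, static_cast<float>(bit_width - 1)) / scale;

  std::vector<float> out(bias.size());
  for (std::size_t i = 0; i < bias.size(); ++i) {
    float q = std::round(bias[i] * scale) / scale;  // round to nearest step
    out[i] = std::min(std::max(q, min_q), max_q);   // clip (should not trigger here)
  }
  return out;
}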

In any case, you observe that the error for the bias nearly disappears in Q16.16. Your challenge now is to keep the error small at smaller bit widths, and for that it is important to choose good fixed point formats.

You also asked about the average pooling layer. That layer is a bit trickier than max pooling; here you should also use dynamic fixed point. The math part of the layer is pretty straightforward, so you should be able to implement it on a DSP. And yes, the average pooling layer computes the average over a kernel window, so you will need division operations.
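As a rough illustration (again just a sketch with made-up names, not Ristretto code), a global average pooling step in fixed point could look like this: accumulate the window in a wide integer, divide by the window size, and shift the quotient to the output's fractional length.

#include <cstdint>
#include <vector>

// Sketch only: global average pooling of one channel in fixed point.
// `in` holds the channel values as integers with `fl_in` fractional bits;
// the result is returned with `fl_out` fractional bits.
int32_t GlobalAvgPoolFixed(const std::vector<int32_t>& in, int fl_in, int fl_out) {
  int64_t sum = 0;                                      // wide accumulator avoids overflow
  for (int32_t v : in) sum += v;
  int64_t avg = sum / static_cast<int64_t>(in.size());  // the averaging division
  int shift = fl_out - fl_in;                           // re-quantize to the output format
  return static_cast<int32_t>(shift >= 0 ? avg << shift : avg >> -shift);
}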

Best,
Philipp

mar...@gmail.com

Apr 17, 2018, 6:49:36 AM
to ristretto-users
Hi Philipp,
the average pooling layer is already implemented on the FPGA. I just wanted a fixed point reference in software to verify the exact result, instead of always comparing fixed and floating point numbers. For my CNNs it's just a single global average pooling layer, so in the meantime I wrote a custom Python layer to do the job.

I will try my luck at implementing bias quantization in C++. Thank you for the hints and for the great work on Ristretto in general!

Best regards,
Martin