result_quantized_value = result_zero_point +
    (lhs_scale * rhs_scale / result_scale) * int32_accumulator        (7)
The difficulty here is of course that (lhs_scale * rhs_scale / result_scale)
is a positive real number, not an integer in general.
It is a constant, though. So what we have to implement here is the (approximate) scaling of an int32 value by an arbitrary positive constant multiplier.
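Concretely, one common way to implement such a constant rescaling is to decompose the real multiplier into a Q31 fixed-point mantissa in [0.5, 1) and a power-of-two right shift, then apply it using only 64-bit integer arithmetic at runtime. The sketch below is my own illustration under that assumption, not gemmlowp's actual code (gemmlowp's fixed-point routines may differ in details such as saturation and rounding behavior); the function names and the example multiplier are made up for the example.

```cpp
#include <cstdint>
#include <cmath>
#include <cstdio>

// Decompose a real multiplier in (0, 1) as  multiplier ~= M0 * 2^(-right_shift),
// where M0 is an int32 fixed-point value in [2^30, 2^31), i.e. it represents a
// real number in [0.5, 1) with 31 fractional bits.
void DecomposeMultiplier(double multiplier, std::int32_t* quantized_multiplier,
                         int* right_shift) {
  int exponent;
  // frexp returns a fraction in [0.5, 1) and the matching power-of-two exponent.
  const double fraction = std::frexp(multiplier, &exponent);
  // For a multiplier < 1, exponent <= 0, so the right shift is non-negative.
  *right_shift = -exponent;
  std::int64_t m = static_cast<std::int64_t>(std::round(fraction * (1ll << 31)));
  if (m == (1ll << 31)) {  // fraction rounded up to exactly 1.0
    m /= 2;
    --*right_shift;
  }
  *quantized_multiplier = static_cast<std::int32_t>(m);
}

// Multiply an int32 accumulator by the decomposed constant: one 64-bit product,
// then a rounding right shift that also undoes the Q31 scaling.
std::int32_t MultiplyByQuantizedMultiplier(std::int32_t acc,
                                           std::int32_t quantized_multiplier,
                                           int right_shift) {
  const std::int64_t product =
      static_cast<std::int64_t>(acc) * quantized_multiplier;
  const int total_shift = 31 + right_shift;
  const std::int64_t rounding = 1ll << (total_shift - 1);
  return static_cast<std::int32_t>((product + rounding) >> total_shift);
}

int main() {
  const double real_multiplier = 0.00731;  // e.g. lhs_scale * rhs_scale / result_scale
  std::int32_t m0;
  int shift;
  DecomposeMultiplier(real_multiplier, &m0, &shift);
  const std::int32_t acc = 123456;  // an int32 accumulator value
  std::printf("exact: %f  approx: %d\n", real_multiplier * acc,
              MultiplyByQuantizedMultiplier(acc, m0, shift));
  return 0;
}
```

The point of this shape is that the only per-element runtime operations are an int64 multiply and a rounding right shift; the floating-point work happens once, when the constant multiplier is decomposed.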
Moreover, it is safe to assume that this positive constant multiplier is smaller than one: each of the scale
values here is typically smaller than one, since we are mapping the [0..255]
quantized uint8 value range to an interval of real values that is much narrower than that, usually within [-10,10]
in most neural networks. For example, a neural network using ReLU6 activation functions will typically have real activation values in the interval [0,6].
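As a concrete, purely illustrative check of that claim: assuming gemmlowp's affine scheme real_value = scale * (quantized_value - zero_point), a real range [rmin, rmax] mapped onto the uint8 range [0..255] gets scale = (rmax - rmin) / 255. The ranges below are made-up examples, not taken from any particular network.

```cpp
#include <cstdio>

int main() {
  // scale = (rmax - rmin) / 255 for a uint8 quantized range [0..255].
  const double lhs_scale    = (6.0 - 0.0)    / 255.0;  // activations in [0, 6] (ReLU6), ~0.0235
  const double rhs_scale    = (1.0 - (-1.0)) / 255.0;  // weights in [-1, 1],            ~0.0078
  const double result_scale = (6.0 - 0.0)    / 255.0;  // output also in [0, 6],         ~0.0235

  // The constant multiplier from equation (7):
  const double multiplier = lhs_scale * rhs_scale / result_scale;
  std::printf("multiplier = %f\n", multiplier);  // ~0.0078, comfortably below 1
  return 0;
}
```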
Hello All,

Referring to gemmlowp's quantization step, there's an assumption that we typically map a narrow range of matrix values to a quantized range that is much larger, e.g. [0..255] with uint8 (see the excerpt quoted above).

Q: For deep NNs, is this assumption always true? I.e., are the matrix elements always within a narrow range that is smaller than the quantized range, say [0..255]?

E.g., let's look at weights and activations:

Weights:
- Several papers/sources on the web indicate that trained weights are typically real numbers in the range (-1, 1) (empirical results).
- However, this is empirical data from existing networks. Q: Will this always hold true?

Activations:

Intermediate layers:
- Intermediate layers of a network are constrained to a smaller range (ReLU / ReLU6 / sigmoid etc. applied on the previous layer's output).
- So it seems like we're good here?

Input layer:
- For the input layer, the activations may be large values (e.g. raw audio/video data).
- Q: The assumption that activation values are in a narrow range may not hold for this layer, right? E.g., raw pixel values can take values from 0 to 255 (see the sketch below).

Any comments/thoughts?

Thanks in advance.
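To make the input-layer concern concrete, here is a small illustrative sketch (my own numbers and helper function, not anything from gemmlowp itself) that computes uint8 affine quantization parameters, again assuming real_value = scale * (quantized_value - zero_point), for the ranges mentioned above.

```cpp
#include <cstdio>

// Quantization parameters for mapping a real range onto uint8 [0..255].
struct QuantParams {
  double scale;
  int zero_point;
};

QuantParams ParamsForRange(double rmin, double rmax) {
  QuantParams p;
  p.scale = (rmax - rmin) / 255.0;
  // zero_point is the quantized value that represents real 0.
  p.zero_point = static_cast<int>(0.5 + (0.0 - rmin) / p.scale);
  return p;
}

int main() {
  const QuantParams weights = ParamsForRange(-1.0, 1.0);   // scale ~0.0078
  const QuantParams relu6   = ParamsForRange(0.0, 6.0);    // scale ~0.0235
  const QuantParams pixels  = ParamsForRange(0.0, 255.0);  // scale = 1.0
  std::printf("weights: scale=%f zero_point=%d\n", weights.scale, weights.zero_point);
  std::printf("relu6:   scale=%f zero_point=%d\n", relu6.scale, relu6.zero_point);
  std::printf("pixels:  scale=%f zero_point=%d\n", pixels.scale, pixels.zero_point);
  return 0;
}
```

With these illustrative ranges, the weight and ReLU6 activation scales come out well below one, while raw pixels in [0, 255] get a scale of exactly 1.0, which is the boundary case behind the question above.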