I embedded the model with xxd, read its input_zero_point and input_scale, and quantized the dataset from float to int8_t. Of course, the TFLM model then does not quantize the input before invoke, because each sample is already quantized. This was done to reduce the number of bytes that must be sent to the microcontroller: from 4000 per sample if the samples are float to 1000 if they are int8_t.
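For what it's worth, a minimal host-side sketch of that quantization step, assuming the scale and zero point are read from the converted model's input details (the file name and variable names here are illustrative, not from the original message):

    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="model.tflite")  # illustrative file name
    input_details = interpreter.get_input_details()[0]
    input_scale, input_zero_point = input_details["quantization"]

    # q = round(x / scale) + zero_point, clamped to the int8 range,
    # so TFLM can consume the sample directly without quantizing on-device.
    def quantize_sample(sample_f32):
        q = np.round(sample_f32 / input_scale) + input_zero_point
        return np.clip(q, -128, 127).astype(np.int8)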
Hi Martin,
If you’re using tflite to give you reference inference results, make sure you’re using the reference kernels!
interpreter = tf.lite.Interpreter(flatbuff_name, experimental_op_resolver_type=tf.lite.experimental.OpResolverType.BUILTIN_REF)
Otherwise, differences in rounding / multiplication in the rescaling of quantized output can result in surprisingly large worst-case differences! For the same reason you shouldn't necessarily expect bit-exact matching results to the reference kernels if you are using an optimized kernel library (e.g. OPTIMIZED_KERNEL_DIR=cmsis_nn).
FP arithmetic (explicit, or implicit in shift-and-multiply integer implementations) can be a real pain ;-)
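In case it helps, a rough sketch of comparing the reference-kernel output on the host with what comes back from the board, assuming an int8-quantized model (flatbuff_name, sample_int8 and mcu_output are placeholders, not names from this thread):

    import numpy as np
    import tensorflow as tf

    # Reference kernels give the baseline to compare the on-device TFLM output against.
    ref = tf.lite.Interpreter(
        flatbuff_name,
        experimental_op_resolver_type=tf.lite.experimental.OpResolverType.BUILTIN_REF)
    ref.allocate_tensors()

    inp = ref.get_input_details()[0]
    out = ref.get_output_details()[0]

    ref.set_tensor(inp["index"], sample_int8.reshape(inp["shape"]))
    ref.invoke()
    ref_output = ref.get_tensor(out["index"])

    # With an optimized kernel library (e.g. CMSIS-NN) small per-element
    # differences are expected, so look at the worst case rather than equality.
    print(np.max(np.abs(ref_output.astype(np.int32) - mcu_output.astype(np.int32))))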
Regards,
Andrew