TFLM model predictions differ from TFLite model predictions


Martin Cavallo Percovich

Mar 9, 2022, 8:15:59 PM
to SIG Micro
Hi to all SIG team members!

I am working on a model based on the TCN architecture. I trained the model and then quantized and converted it with the TensorFlow Lite Converter, applying full-integer-only quantization, so from the input tensor to the output tensor all elements are 8-bit quantized. Finally, I converted the TFLite model to a C array with xxd.

I am using an Arduino Nano 33 BLE Sense, and I wrote firmware for running the model on this board using the Arduino IDE. The TensorFlow Lite for Microcontrollers C++ library was built from source last year using TensorFlow 2.5.

For evaluation, I was interested in running inference on the test dataset. The input tensor has a length of 1000 units, so I wrote a Python script that takes a test-dataset sample and sends it to the Arduino Nano over the serial port using PySerial. For this task, I took the following into consideration:

  • Using the TensorFlow Lite model and the TensorFlow Lite Interpreter, I loaded the model's input parameters, such as input_zero_point and input_scale, and quantized the dataset from float to int8_t. The TFLM model then does not quantize the input before invoke, because the sample is already quantized. This was done to reduce the number of bytes that must be sent to the microcontroller, from 4000 per sample as float to 1000 as int8_t.
  • The samples are sent in batches of 50 elements (50 bytes), so 20 batches are used in total to complete one input sample.
  • A simple checksum system verifies that communication succeeded for each batch: the Arduino sends a checksum of the bytes it received to the Python script, which compares it with the checksum of the bytes sent. The two must be equal.
  • After the entire tensor has been sent, on the firmware side the TFLM model's input tensor is read (using model_input->data.int8) and sent back to the Python script, where it is compared with the original sample. The two must be equal, and this is working well too.
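The host-side preparation above can be sketched as follows. This is a minimal sketch: the scale/zero-point values and the additive checksum scheme are illustrative assumptions, not taken from the actual model or firmware (in practice the quantization parameters come from interpreter.get_input_details()[0]["quantization"]).

```python
import numpy as np

# Illustrative input quantization parameters (assumed, not from the real model).
input_scale, input_zero_point = 1.0 / 255.0, -128

def quantize(sample):
    """Map a float32 sample to int8 using the model's input scale/zero-point."""
    q = np.round(sample / input_scale) + input_zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def checksum(batch):
    """Hypothetical additive checksum over one batch, modulo 256."""
    return sum(batch) % 256

# One 1000-unit input sample: 1000 bytes as int8 instead of 4000 as float32.
sample = np.linspace(0.0, 1.0, 1000, dtype=np.float32)
quantized = quantize(sample)

# Split into 20 batches of 50 bytes each for the serial link.
raw = quantized.tobytes()
batches = [raw[i:i + 50] for i in range(0, len(raw), 50)]
```

Each batch would then be written to the serial port, and its checksum compared against the one echoed back by the firmware.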
The problem is that the prediction made with the TFLM model running on the Arduino Nano 33 BLE Sense differs from the prediction of the TFLite model used with the TensorFlow Lite Interpreter in the Python environment. The output tensor is just one neuron.

Do you have any suggestions about this problem? Have you heard of anything similar? Any recommendations?

Thanks in advance!

PS: This work is part of my postgraduate thesis in Robotics and Artificial Intelligence, so I would love to fix this problem ASAP. 😅

Best regards!



Martin Cavallo
Ing. Tecnológico Electrónico

Deqiang Chen

Mar 10, 2022, 2:49:48 PM
to Martin Cavallo Percovich, SIG Micro

I would suggest creating an x86 TFLite Micro benchmark or test and using that to compare with the TFLite interpreter outcome. That can give you a bifurcation point: whether you need to focus on the data communication link or on the TFLM code path.

Best regards!


Andrew Stevens

Mar 10, 2022, 2:57:32 PM
to SIG Micro,

Hi Martin,

If you’re using TFLite to give you reference inference results, make sure you’re using the reference kernels!

     interpreter = tf.lite.Interpreter(flatbuff_name, experimental_op_resolver_type=tf.lite.experimental.OpResolverType.BUILTIN_REF)

Otherwise, differences in rounding/multiplication in the rescaling of quantized output can result in surprisingly large worst-case differences! For the same reason, you shouldn't necessarily expect bit-exact matches to the reference kernels if you are using an optimized kernel library (e.g. OPTIMIZED_KERNEL_DIR=cmsis_nn).

FP arithmetic (explicit, or implicit in shift-and-multiply integer implementations) can be a real pain ;-)
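In practice this means two correct implementations can disagree by a few least-significant steps in the int8 output, so it helps to compare in dequantized (float) space with a tolerance rather than demanding bit-exact equality. A sketch, where the output values and quantization parameters are made up for illustration:

```python
import numpy as np

# Assumed output quantization; in practice read (scale, zero_point) from
# interpreter.get_output_details()[0]["quantization"].
out_scale, out_zero_point = 1.0 / 256.0, -128

# Made-up int8 outputs from the two implementations.
tflm_out = np.array([42], dtype=np.int8)  # e.g. on-device TFLM result
ref_out = np.array([44], dtype=np.int8)   # e.g. reference-kernel result

def dequantize(q):
    """Standard affine dequantization: real = (q - zero_point) * scale."""
    return (q.astype(np.float32) - out_zero_point) * out_scale

# Difference in real units, and in quantization steps.
max_diff = np.max(np.abs(dequantize(tflm_out) - dequantize(ref_out)))
steps = max_diff / out_scale
```

A difference of one or two steps is typically rounding, not a bug; a difference of many steps points at a real mismatch (wrong input, wrong kernels, or a communication problem).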




Måns Nilsson

Mar 11, 2022, 8:14:27 AM
to Andrew Stevens, SIG Micro,
Well, actually, with OPTIMIZED_KERNEL_DIR=cmsis_nn you should expect bit-exactness to the reference kernels. If not, a GitHub issue would be very welcome.



Martin Cavallo Percovich

Mar 17, 2022, 12:38:01 AM
to Måns Nilsson, Andrew Stevens, SIG Micro
Hi everyone!

First, thank you all so much for your replies, and sorry for my delay.

I needed to create a new build of the library because the nightly build of the TFLM library available via the Arduino IDE's library manager has not implemented the AddSpaceToBatchND() op. I followed the explanation in "Understand the C++ library" and on page 466 of the TinyML book by P. Warden and D. Situnayake. So I downloaded TensorFlow 2.5.1 and executed the following command: ./tensorflow/lite/micro/tools/ci_build/

I understand that, built this way, the library does not include OPTIMIZED_KERNEL_DIR=cmsis_nn. Is that right?

About tf.lite.Interpreter: in the Python environment, I just use tf.lite.Interpreter with the model.tflite file path as its only parameter. I did not know about the experimental_op_resolver_type parameter. Its documentation says tf.lite.experimental.OpResolverType.AUTO is the default option. Do you recommend I use tf.lite.experimental.OpResolverType.BUILTIN_REF instead?

Thanks in advance!

Best regards,


Fredrik

Mar 17, 2022, 7:03:39 AM
to Martin Cavallo Percovich, Måns Nilsson, Andrew Stevens, SIG Micro
Hi Martin,

I'm happy to see your interest in TFLM and CMSIS-NN!

You can find the latest TFLM-Arduino integration code here. The Arduino integration you refer to recently moved out of the TFLM main repository. This .md describes how to build the Arduino lib, and it uses the optimized CMSIS-NN kernels. I should mention that I haven't tried this myself :) Please let me know if it works for you.


Advait Jain

Mar 17, 2022, 11:22:34 AM
to Fredrik, Martin Cavallo Percovich, Måns Nilsson, Andrew Stevens, SIG Micro
Please see the top-level README in the repo for how to use the TFLM Arduino port:

You should be able to clone the repo into your Arduino/Libraries folder and be ready to go (including using the cmsis-nn optimizations).


Martin Cavallo Percovich

Mar 18, 2022, 12:17:00 AM
to Advait Jain, Fredrik, Måns Nilsson, Andrew Stevens, SIG Micro
Hi Fredrik and Advait,

Thanks for your responses.

As you suggested, I installed the latest TFLM library for Arduino by cloning the repo into the Arduino/Libraries folder. After compiling and uploading to the Arduino Nano 33 BLE Sense, I got the same predictions as with the old version of the tflite-micro library. Nothing changed.

On the other hand, I also wrote an equivalent firmware version for an ESP32 module, using the ESP-IDF framework and following the instructions in the Espressif tflite-micro repo for installing and using the TFLM library with ESP32 and ESP-IDF. Fortunately, I got the same predictions as on the Arduino Nano 33 BLE Sense board. This is good, and everything makes sense!

But I still do not have an answer to the question that started this conversation: the difference in predictions between inference run with tf.lite.Interpreter in the Python environment and inference run on the Arduino board. The next test is to select the reference kernels in tf.lite.Interpreter, as Andrew suggested:

interpreter = tf.lite.Interpreter(flatbuff_name, experimental_op_resolver_type=tf.lite.experimental.OpResolverType.BUILTIN_REF)

I will test this tomorrow or during the weekend.

Thanks again.
