Hi everyone,
We’re excited to announce the next-generation MLIR-based TensorFlow Lite model converter. We’re still working on improving the project, but wanted to get this into everyone’s hands and hear your feedback.
Why you should use the new converter
Enables conversion of new classes of models, including Mask R-CNN, Mobile BERT, and many more
Adds support for functional control flow (enabled by default in TensorFlow 2.x)
Tracks original TensorFlow node name and Python code, and exposes them during conversion if errors occur
Leverages MLIR, Google's cutting-edge compiler technology for ML, which makes the converter easier to extend to accommodate feature requests
All existing functionality is supported
How to use the new converter
The feature is available in the tf-nightly pip package, or by building TensorFlow from head.
Enabling the experimental converter is an easy 1-line change: set the experimental_new_converter flag on the TFLiteConverter:
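For example (a minimal sketch using the TF 2.x Python API; the tiny matmul model here is only for illustration, not part of the original post):

```python
import tensorflow as tf

# A tiny model for illustration: a single matmul wrapped in a tf.function.
@tf.function(input_signature=[tf.TensorSpec(shape=[1, 4], dtype=tf.float32)])
def model(x):
    return x @ tf.ones([4, 2])

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [model.get_concrete_function()])
converter.experimental_new_converter = True  # the 1-line change
tflite_model = converter.convert()  # serialized TFLite flatbuffer (bytes)
```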
If you're using the tflite_convert command-line tool, the new converter can be enabled with the --experimental_new_converter flag:
New feature: Control flow and RNN support
The new converter supports functional control flow, which is enabled by default in TensorFlow 2.x. This means that Keras RNN layers (e.g. tf.keras.layers.RNN and tf.keras.layers.LSTM), tf.nn.dynamic_rnn, and lower-level control flow, like tf.while_loop and tf.cond, should be convertible and runnable in TensorFlow Lite.
See the Colab for an example on how to convert and run a Keras LSTM model. There are some limitations with converting RNN models, including the following:
Quantization is not yet supported with control flow
Some of the TensorList operations are not yet supported (e.g. those needed to implement a seq2seq beam search decoder)
Dynamic shapes are not yet fully supported
E.g. Keras RNN/LSTM models must have a fixed batch size (TFLiteConverter will assume batch size = 1 if unspecified)
Performance is still being optimized
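A minimal sketch of converting a Keras LSTM under the fixed-batch-size constraint above (the layer sizes here are arbitrary, chosen only for illustration):

```python
import tensorflow as tf

# A Keras LSTM model; note the fixed batch size of 1 in batch_input_shape,
# per the limitation above.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8, batch_input_shape=(1, 10, 4)),
    tf.keras.layers.Dense(2),
])

# Wrap the model in a tf.function with a fully specified input signature,
# then convert via a concrete function.
run = tf.function(lambda x: model(x))
concrete_func = run.get_concrete_function(
    tf.TensorSpec((1, 10, 4), tf.float32))
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.experimental_new_converter = True
tflite_model = converter.convert()
```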
New feature: Error tracking
The new converter automatically tracks the original TensorFlow nodes through its transformations. When the converter is invoked from the original Python code that creates the model, the Python source lines and call stacks that created each node are collected and passed to the converter.
If there is an error during conversion, for example an unsupported type/op or a validation failure, a Python call stack is displayed as part of the error message, and the developer can use it to track down the root cause(s). An example can be triggered by running the following command:
An error stack trace pointing to the exact line (stack_trace_example.py:49) that created the erroring op will be displayed as follows:
Known issues and limitations
Currently, the new converter only supports a subset of the old converter's ops and optimizations; we're working hard to broaden feature coverage and aim to fully replace the old converter soon
Functional control flow (If and While ops) is fully supported, but the supported control flow models / use cases may be limited due to other constraints:
Quantization is not yet supported with control flow
Some of the TensorList operations are not yet supported (e.g. those needed to implement a seq2seq beam search decoder)
Dynamic shapes are not yet fully supported
E.g. Keras RNN/LSTM models must have a fixed batch size (TFLiteConverter will assume batch size = 1 if unspecified)
Error tracking has the following constraints:
Currently, error stack tracing is fully supported with from_concrete_functions and from_saved_model
Since it is difficult to get sufficient node-creation information or source code from from_keras and from_frozen_graph, stack trace display isn't supported for them yet; however, the error message will contain the TF node name in all cases
Feedback
For issues, please create a GitHub issue with the component label “TFLiteConverter.” Please include:
The command used to run the converter, or the code if you're using the Python API
The output from the converter invocation
The input model to the converter
If the conversion is successful, but the generated model is wrong, state what is wrong:
Wrong results and/or a decrease in accuracy
Correct results, but the model is slower than expected (compared with the model generated by the old converter)
For feedback, please email tfl...@tensorflow.org.
Thanks,
Lawrence on behalf of TFLite and MLIR teams