TOSA Legalization from TensorFlow and TensorFlow Lite
Overview
This RFC describes the legalization passes that transform TensorFlow and TensorFlow Lite content to the TOSA dialect described in https://llvm.discourse.group/t/rfc-tosa-dialect-in-mlir/1971/18.
These passes generate validated sequences of TOSA operators corresponding to the input TensorFlow or TensorFlow Lite content. The legalizations express the required functional sequence of TOSA operators, and also handle numerical behavior such as scaling, precision, and saturation in the case of quantized integer datatypes.
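For quantized integer datatypes, the numerical handling mentioned above typically means expressing a floating-point rescale as a fixed-point multiply, shift, round, and saturate. The sketch below is a simplified Python illustration of that pattern, assuming 32-bit multiplier precision and int8 saturation bounds; the helper names and rounding details are illustrative rather than the exact TOSA-specified behavior.

```python
import math

def quantize_multiplier(scale, bits=32):
    """Decompose a float scale into (multiplier, shift) so that
    scale ~= multiplier * 2**-shift, with multiplier a signed
    fixed-point value of the given bit width. Illustrative only."""
    if scale == 0.0:
        return 0, 0
    mantissa, exponent = math.frexp(scale)   # scale = mantissa * 2**exponent
    multiplier = round(mantissa * (1 << (bits - 1)))
    shift = (bits - 1) - exponent
    if multiplier == (1 << (bits - 1)):      # rounding overflowed the mantissa
        multiplier //= 2
        shift -= 1
    return multiplier, shift

def rescale_and_saturate(acc, multiplier, shift, out_min=-128, out_max=127):
    """Apply the fixed-point multiplier with round-half-up rounding,
    then clamp to the output type's range (int8 here).
    Assumes shift >= 1, which holds for scales well below 2**30."""
    rounded = (acc * multiplier + (1 << (shift - 1))) >> shift
    return max(out_min, min(out_max, rounded))
```

For example, a scale of 0.5 decomposes to multiplier 2**30 and shift 31, so an accumulator of 101 rescales to 51 (round-half-up) and large accumulators saturate at the int8 bounds.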
Motivation
The legalization passes from TensorFlow and TensorFlow Lite enable full networks to be legalized to a single, standard TOSA form. Both fp32 and quantized 8-bit networks have been exercised. A standard form shared across multiple high-level frameworks decouples further code generation from the evolution of those frameworks, and also insulates codegen from differences between the operator specifications of different high-level frameworks.
The TOSA reference model described in the TOSA RFC enables TOSA output to be validated for bit accuracy against TensorFlow/TensorFlow Lite output.
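Note that the validation criterion here is exact, element-for-element equality of outputs rather than floating-point closeness. A minimal sketch of such a check follows; the function name and the use of NumPy are illustrative assumptions, not part of the reference model itself.

```python
import numpy as np

def validate_bit_accurate(reference_out: np.ndarray, tosa_out: np.ndarray) -> bool:
    """Bit-accuracy check: outputs must match in shape, dtype, and
    every element; tolerance-based comparisons are not used."""
    return (reference_out.shape == tosa_out.shape
            and reference_out.dtype == tosa_out.dtype
            and bool(np.array_equal(reference_out, tosa_out)))
```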
Components
This implementation includes two pieces:
* The actual MLIR lowerings from TensorFlow and TensorFlow Lite to TOSA, implemented in the form of multiple MLIR passes that may be registered with any MLIR pass manager.
* Legalization test suites validating legalizations for TensorFlow and TensorFlow Lite on a per-operator basis for each implemented legalization from a high-level framework operator to one or more TOSA operators.
Implementation Details and Further Scope
Approximately 100 TensorFlow and TensorFlow Lite operators are currently lowered to TOSA dialect form. These were chosen from occurrence frequency data generated from dozens of real-world networks in both TensorFlow and TensorFlow Lite, involving both fp32 and quantized 8-bit content.
The current set of lowerings enables the bit-accurate lowering of nearly 50 real-world networks across several domains: image processing, object detection, super-resolution, speech recognition, speech synthesis, keyword spotting, natural language processing, and more.
Further legalizations may be added as needed for operators that are not currently lowered to TOSA. The legalizations have not yet been exercised with networks that use quantized 16-bit integer, fp16, or bfloat16 datatypes. Most existing floating-point legalizations should work without alteration.
Legalizations supporting the TOSA training profile are in development and under discussion. However, lowerings for tf.FakeQuantWithMinMaxArgs/Vals and tfl.quantize/dequantize are already present.
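As background, the quantize/dequantize lowerings express the standard affine relationship between real and quantized values. A minimal Python sketch follows, assuming per-tensor int8 quantization; the scale and zero-point values in the example are illustrative.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Affine quantization: q = clamp(round(x / scale) + zero_point),
    clamped to the quantized type's range (int8 here)."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Inverse mapping back to a real value: x = (q - zero_point) * scale."""
    return (q - zero_point) * scale
```

For instance, with scale 0.05 and zero point -10, the real value 1.0 quantizes to 10, dequantizes back to 1.0, and out-of-range inputs saturate at the int8 bounds.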
Directory Setup
The legalization code for both TensorFlow and TensorFlow Lite content is intended to sit within the TensorFlow repository at tensorflow/compiler/mlir/tosa.
Build System and Dependencies
The third_party/mlir/BUILD file within the TensorFlow repository will be updated to add a build object TosaOps that constructs the TOSA MLIR dialect library from the associated llvm-project repository. This is a build dependency for the legalization passes. The build dependencies of the TOSA dialect are described in the dialect RFC at https://llvm.discourse.group/t/rfc-tosa-dialect-in-mlir/1971/18.
TensorFlow and TensorFlow Lite dialects are build dependencies of the TOSA legalization passes. No other dialect currently present in the TensorFlow repository depends upon TOSA.
Thanks for the RFC! It looks like awesome work. I think my team at Google (Edge TPU) will be interested in making use of it.
I wonder if there is going to be a TOSA compiler available that can compile the "nearly 50 real-world networks", say, on an ARM CPU.
Hey, this sounds very promising, and I think it would fit well. As others have mentioned, there are some questions with respect to MHLO (in particular, it appears there would be some savings to be had given the existing TF-to-MHLO legalizations), dynamic shapes, and reuse in general, but that is something that could be worked on together. I was thinking we could even consider using either separate builds or configurable attributes (https://docs.bazel.build/versions/master/configurable-attributes.html) to make the initial integration easier and keep it off the critical path from the start.