Hello,
I am working on an LLVM-based backend which lowers XLA operations to LLVM IR. As part of this backend I would like to lower the Gelu operation from Keras while retaining, at the XLA level, the fact that the computation is performing Gelu. Since Gelu is not an operation in XLA, it is lowered into its component operations, as shown here:
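(For anyone reading without the dump: the decomposition ends up as a handful of elementwise ops. The snippet below is only a hand-written Python sketch of the tanh-approximation form, not the actual HLO; the exact sequence depends on whether Keras emits the erf or the tanh variant.)

    import numpy as np

    # Rough sketch only: roughly the component ops Gelu decomposes into
    # (tanh-approximation form). The erf form would instead compute
    # 0.5 * x * (1 + erf(x / sqrt(2))).
    def gelu_decomposed(x):
        c = np.sqrt(2.0 / np.pi)                  # constant
        inner = c * (x + 0.044715 * x ** 3)       # power / multiply / add
        return 0.5 * x * (1.0 + np.tanh(inner))   # tanh / add / multiply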
While the individual operations exist in XLA, I would like to retain the fact that this computation is a Gelu and not have it decomposed. What would be the best approach to retaining/recovering this information? I was thinking of two possible ways:
Best Regards,
Rafae
Hello all,
I’m attaching a small snippet of code which shows the model in Keras, as well as the HLO file for the Gelu kernel. Could you share/point out where some of the more complex pattern matching is done in XLA? I can implement a pattern for the code sequence shown, but I’m concerned about whether that will capture many cases in practice.
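For anyone without the attachment, an illustrative stand-in (not the exact attached code) would be a single dense layer with a gelu activation, compiled with XLA; in recent TF versions, experimental_get_compiler_ir is one way to dump the HLO text:

    import tensorflow as tf

    # Illustrative stand-in, not the attached snippet: one Dense layer
    # with a gelu activation is enough to see the decomposed gelu
    # sequence in the HLO.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="gelu", input_shape=(64,)),
    ])

    fn = tf.function(lambda x: model(x), jit_compile=True)
    x = tf.random.normal([1, 64])
    # Dumps the (unoptimized) HLO text for inspection.
    print(fn.experimental_get_compiler_ir(x)(stage="hlo"))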
With regard to XLA’s native code-gen, it is more than sufficient; however, as part of our requirements we would like to retain some higher-level operation information (Gelu is one such example) and then do lower-level codegen in a separate phase later on. For the regular case we would decompose the operations much as XLA does, but for cases where specialized hardware support is available, having this information can be useful.
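To make the intent concrete, here is a hypothetical sketch of that flow (none of these names are real XLA or TensorFlow APIs; they only illustrate the idea): recognize the higher-level op, keep it intact when the target supports it, and otherwise fall back to the usual decomposition.

    # Hypothetical sketch -- all names below are made up for illustration.
    def matches_gelu_subgraph(instr):
        # Placeholder check; a real matcher would walk the HLO subgraph.
        return instr.get("op") == "gelu"

    def lower(instr, supported_ops):
        if matches_gelu_subgraph(instr) and "gelu" in supported_ops:
            return ("emit_high_level_gelu", instr)  # specialized hardware path
        return ("emit_decomposed", instr)           # regular path, as XLA does today

    print(lower({"op": "gelu"}, {"gelu"}))  # kept as a single high-level op
    print(lower({"op": "gelu"}, set()))     # decomposed as usual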
Regards,
Rafae
Hey Justin, I apologize if the intention is not clear. We’re working on an internal project doing low-level code generation for a number of emerging tensor ISAs, and we are trying to use the XLA graph as a frontend to that system. Some of the constructs in our system overlap with XLA operations (e.g., a dot product in XLA can map to a dot product in our system), but our operations are not identical to XLA’s (hence we are supporting only a restricted set of XLA operations). We map the XLA operations accordingly, and the code generation is done outside of TensorFlow, so we are not using the LLVM lowering currently present in XLA for these operations. Does that help give some idea of what I’m attempting to do?