Question on StableHLO Specification - dot_general constraint C15


Kahho Phong

May 31, 2024, 3:49:41 AM
to OpenXLA Discuss
Hello.

I have a question about constraint C15 of the dot_general operator in StableHLO v1.0.
The snippet is captured below:

  * (C15) `zero_points(rhs) = 0`.

Is there a reason why, for the quantized form of dot_general, the weights must be symmetrically quantized? As far as I can see, other operators like `convolution` do not have this constraint.

Thanks!
--kahho


Sandeep Dasgupta

May 31, 2024, 1:08:14 PM
to Kahho Phong, OpenXLA Discuss
Dear Kahho,
Your observation that the dot_general quantization specification in its current form supports only symmetric quantization is accurate. This design was inspired by the TensorFlow Lite quantization specification and was driven by the use cases encountered up to this point, which have been adequately addressed by symmetric quantization. However, future requirements may well necessitate accommodating asymmetric use cases. Please share your specific motivation for this change, and we will be more than happy to help address it. Also, feel free to file an issue at https://github.com/openxla/stablehlo/issues.

Regards,
Sandeep

--
You received this message because you are subscribed to the Google Groups "OpenXLA Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openxla-discu...@openxla.org.
To view this discussion on the web visit https://groups.google.com/a/openxla.org/d/msgid/openxla-discuss/c9b14bef-8730-4e07-aa25-2490b0adc141n%40openxla.org.
For more options, visit https://groups.google.com/a/openxla.org/d/optout.

Kahho Phong

Jun 2, 2024, 10:48:56 PM
to OpenXLA Discuss, Sandeep Dasgupta, OpenXLA Discuss, Kahho Phong
Thanks, Sandeep, for your reply.

I guess dot_general is the only operator in the opset with this constraint?

We are trying to pivot to StableHLO v1.0.0 as the IR for our compiler stack, but we are now stuck in the lowering because our inference engine supports asymmetric quantization in its kernels (e.g. fullyconnected/linear, which we map to dot_general in the lowering).

I guess it makes sense to remove this constraint on dot_general to stay consistent with the rest of the opset, but that will probably necessitate changes in the StableHLO interpreter (and legalization, etc.).

Is there any advice you can share on the best way for us to experiment locally, i.e. to adapt dot_general to accept asymmetric quantization?
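For concreteness, here is a small NumPy sketch (illustrative only, not StableHLO code or API) of the affine-quantized dot product we would like to lower. Expanding `real = scale * (q - zero_point)` inside the dot product produces two zero-point correction terms; the rhs correction is exactly what C15's `zero_points(rhs) = 0` eliminates.

```python
import numpy as np

def dequant_dot(q_lhs, q_rhs, s_lhs, zp_lhs, s_rhs, zp_rhs):
    # Reference: dequantize both operands to float, then multiply.
    # real = scale * (q - zero_point), per-tensor quantization assumed.
    lhs = s_lhs * (q_lhs.astype(np.float64) - zp_lhs)
    rhs = s_rhs * (q_rhs.astype(np.float64) - zp_rhs)
    return lhs @ rhs

def integer_dot(q_lhs, q_rhs, s_lhs, zp_lhs, s_rhs, zp_rhs):
    # Same result computed with an integer accumulator plus corrections:
    #   sum_k (Q-zl)(R-zr) = QR - zl*sum(R) - zr*sum(Q) + K*zl*zr
    # With zp_rhs == 0 (constraint C15) the zr terms vanish, leaving
    # only the lhs correction.
    ql = q_lhs.astype(np.int32)
    qr = q_rhs.astype(np.int32)
    K = ql.shape[-1]
    acc = (ql @ qr
           - zp_lhs * qr.sum(axis=0, keepdims=True)   # lhs zero-point term
           - zp_rhs * ql.sum(axis=1, keepdims=True)   # rhs zero-point term
           + K * zp_lhs * zp_rhs)
    return s_lhs * s_rhs * acc

rng = np.random.default_rng(0)
q_lhs = rng.integers(-128, 128, size=(2, 4), dtype=np.int8)
q_rhs = rng.integers(-128, 128, size=(4, 3), dtype=np.int8)

# Asymmetric rhs (zp_rhs = -7): the integer form still matches the reference.
ref = dequant_dot(q_lhs, q_rhs, 0.02, 3, 0.05, -7)
got = integer_dot(q_lhs, q_rhs, 0.02, 3, 0.05, -7)
assert np.allclose(ref, got)
```

The extra `zp_rhs * sum(q_lhs)` term is data-dependent (it varies per row of the lhs activation), which is presumably why symmetric weights are the cheaper case; but the correction is still computable, so an asymmetric rhs seems implementable in the interpreter.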


Regards.
--kahho
