Using gemmlowp for "pure" uint8_t and int8_t matrix multiplications

Yaman Umuroglu

May 15, 2017, 10:48:08 AM
to gemmlowp
Hi there, 

From the documentation, I understand that the public gemmlowp interfaces are focused on quantizing existing full-precision neural networks, where explicit (de)quantization steps are needed to enter and exit the low-precision GEMM domain.

However, there is some recent work on training NNs that directly use quantized weights and activations [1,2,3], where one can directly use 8-bit (signed/unsigned) math. I was wondering if it's possible to use gemmlowp without going through the (de)quantization process for both signed and unsigned 8-bit numbers, reading out the 32-bit integer accumulators directly?

For unsigned 8-bit numbers I guess this corresponds to setting *_offset = 0 and instantiating an empty output pipeline, but public.md states that only uint8_t is supported as the lhs/rhs type at the moment, and setting offset=128 seems a bit wasteful when the ISA supports signed operations directly.


Thanks in advance!

- Yaman

Benoit Jacob

May 15, 2017, 11:02:42 AM
to Yaman Umuroglu, gemmlowp
On Mon, May 15, 2017 at 10:48 AM, Yaman Umuroglu <malt...@gmail.com> wrote:
> Hi there,
>
> From the documentation, I understand that the public gemmlowp interfaces are focused on quantizing existing full-precision neural networks, where explicit (de)quantization steps are needed to enter and exit the low-precision GEMM domain.
>
> However, there is some recent work on training NNs that directly use quantized weights and activations [1,2,3], where one can directly use 8-bit (signed/unsigned) math. I was wondering if it's possible to use gemmlowp without going through the (de)quantization process for both signed and unsigned 8-bit numbers, reading out the 32-bit integer accumulators directly?
>
> For unsigned 8-bit numbers I guess this corresponds to setting *_offset = 0 and instantiating an empty output pipeline,

Yes, that's correct (with uint8, as you note below). That is exactly what this part of the test covers:
 
> but public.md states that only uint8_t is supported as the lhs/rhs type at the moment, and setting offset=128 seems a bit wasteful when the ISA supports signed operations directly.

(Right --- though note that it's offset = -128, not +128.)

Indeed, there is nonzero overhead here, and as you note, the ISA does support signed operations directly.

For NxN matrices, the overhead of handling the offsets is O(N^2) while the GEMM as a whole is O(N^3), so the overhead is negligible for all but the smallest matrix sizes. The trick that allows only O(N^2) offset-handling overhead is to factor the offsets out of the accumulation, so that their contribution reduces to row/column sums of the operands plus a constant term, all computable in O(N^2).

At the moment, gemmlowp does not offer a way to avoid the operand-offset handling overhead when the offsets are 0 --- unlike the output pipeline, which may be empty and then has no overhead.

On the other hand, if you want to hack around gemmlowp in this direction, you may find this interesting: recently, I found a way to write a much faster kernel when the operands at the kernel level are int8 instead of uint8. That might serve as inspiration for various things you may want to experiment with in this area.

Cheers
Benoit
 

