Dear Anjan,
Thank you for sharing these results.
I'm not sure whether this is relevant to you (it probably depends on the
different vector instructions supported by Intel Xeon), but I shared the
following optimisation with Rhys Weatherley two years ago:
On platforms where there is no NAND instruction but only AND and INVERT
instructions (such as AVR and RISC-V) you could rewrite the round
function to use an AND instead of a NAND, and to use an inverted version
for the key material. You could do this inversion at the start of the
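function call, or save the inverted version and work with that. This
works because the state update function "s0^s47^(~(s70&s85))^s91^ki" can
also be written as "s0^s47^(s70&s85)^s91^~ki": inverting one term of an
XOR sum inverts the whole sum, so the inversion can be moved from the
AND term onto the key bit.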
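
To make the equivalence concrete, here is a minimal C++ sketch. The
function names and the 32-bit word layout are my own assumptions for
illustration (they are not taken from any TinyJAMBU implementation);
each argument stands for a word of already-aligned state or key bits.

```cpp
#include <cstdint>

// Hypothetical helper: feedback word as in the formula above (NAND form).
constexpr std::uint32_t feedback_nand(std::uint32_t s0, std::uint32_t s47,
                                      std::uint32_t s70, std::uint32_t s85,
                                      std::uint32_t s91, std::uint32_t k) {
  return s0 ^ s47 ^ ~(s70 & s85) ^ s91 ^ k;
}

// Hypothetical helper: same feedback word, using a plain AND and a key
// word that the caller has already inverted (k_inv == ~k).
constexpr std::uint32_t feedback_and_inverted_key(
    std::uint32_t s0, std::uint32_t s47, std::uint32_t s70,
    std::uint32_t s85, std::uint32_t s91, std::uint32_t k_inv) {
  return s0 ^ s47 ^ (s70 & s85) ^ s91 ^ k_inv;
}

// Arbitrary sample words, chosen only to exercise the identity.
constexpr std::uint32_t a = 0x01234567u, b = 0x89ABCDEFu, c = 0xDEADBEEFu,
                        d = 0x0BADF00Du, e = 0xCAFEBABEu, k = 0x0F1E2D3Cu;

// Inverting the key once replaces the per-step inversion of the AND term.
static_assert(feedback_nand(a, b, c, d, e, k) ==
                  feedback_and_inverted_key(a, b, c, d, e, ~k),
              "NAND form and AND form with inverted key must agree");
```

In a real implementation the inverted key words would presumably be
computed once (or stored inverted), so that the inner loop needs one
instruction fewer per step on targets without a NAND.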
In the ARM Cortex-M3 assembly code versions of TinyJAMBU, this change
led to an implementation 'about 5% faster on average, and sometimes up
to 10% faster for some variants and packet sizes'. Might be worth
exploring for Intel Xeon as well (and/or any of the other processor
families that you have been benchmarking).
Regards,
Arne
On 2023-01-04 19:28, Anjan Roy wrote:
> Hi all,
>
> Happy New Year. I hope you all are doing good.
>
> During 2022, I decided to implement all NIST LWC finalists as
> zero-dependency, header-only, C++ libraries and few months ago I had
> informed the community about it. More here
> <https://groups.google.com/a/list.nist.gov/g/lwc-forum/c/abb6cy7jP8s/m/E6-_Kzs6AQAJ>.
>
> Recently I was revisiting my work on implementation of TinyJambu AEAD and
> came across some interesting results which I'm here to share with you all.
> During following benchmark, I was targeting Intel Xeon Platinum 8375C CPU @
> 2.90GHz, while using google-benchmark as benchmark harness.
>
> Because the library implementation is in C++, due to presence of template
> functions, I can use std::is_constant_evaluated() for compile-time branch
> evaluation. And when that's employed, it boosts bytes processing bandwidth
> of encrypt/ decrypt routines ~(17-25)x. More on compile-time branch
> evaluation here
> <https://en.cppreference.com/w/cpp/types/is_constant_evaluated>.
> <https://github.com/itzmeanjan/tinyjambu/blob/238371569855c81d47d28e99397910a84e603589/bench/README.md>.