AES Encryption Took Too Long on Arm64 Embedded Machine

Dwight Kulkarni

May 11, 2023, 4:23:57 PM
to Crypto++ Users
Hi Everyone,

I created a 5 MB message and encrypted it. The message takes 3 seconds to encrypt. I needed something around 200 ms, even if the encryption is weaker.

My code is below. Should I be setting any flags when compiling the library to make it faster?

 Got message str at: 05/11/2023 20:21:31.346
 in encrypt aes
 Encrypted at: 05/11/2023 20:21:33.027

message_bytes = encrypt_aes(message_bytes, key, iv);
cout << " Encrypted at: " << get_curr_datetime_str() << endl;


std::string encrypt_aes(std::string message, SecByteBlock key, SecByteBlock iv) {
try {
cout <<" in encrypt aes " <<endl;
AlgorithmParameters params = MakeParameters(Name::FeedbackSize(), 1/*8-bits*/)
(Name::IV(), ConstByteArrayParameter(iv));
CFB_Mode<AES>::Encryption e;
std::string cipher;
e.SetKey(key, key.size(), params);
StringSource ss(message, true, new StreamTransformationFilter(e, new StringSink(cipher)));
cout << " returning cipher " << endl;
return cipher;
}
catch (CryptoPP::Exception e) {
std::cerr << e.what() << std::endl;
return "";
}
}

Jeffrey Walton

May 11, 2023, 5:30:26 PM
to cryptop...@googlegroups.com
You should probably avoid multiple resizes of the cipher object. Add something like:

std::string cipher;
cipher.reserve(message.size()+16);

Otherwise, please run the benchmarks and report back:

cryptest.exe b2 3 <cpufreq in GHz>

Also see https://cryptopp.com/wiki/Benchmarks .

Jeff
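
The effect of the `reserve` suggestion above can be sketched with the standard library alone. The helper `count_reallocations` is a hypothetical name used only for illustration; it appends bytes one at a time, much as a sink appends ciphertext, and counts how often the string's capacity changes. The exact count without `reserve` depends on the standard library's growth policy.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: append n bytes one at a time and count how often
// the string's capacity changes. Each change means a reallocation and a
// copy of everything written so far.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::string out;
    if (reserve_first) {
        out.reserve(n);  // one up-front allocation, as suggested above
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = out.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back('x');
        if (out.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = out.capacity();
        }
    }
    return reallocations;
}
```

With `reserve_first` set, the loop never reallocates. Without it, the buffer grows several times on the way to a multi-megabyte string.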

Dwight Kulkarni

May 12, 2023, 9:09:12 AM
to Crypto++ Users
Hi Jeff,

See below benchmark test results:


root@imx8mpevk:~/p2p_sockets# ./cryptest.exe b 2 1.8
<!DOCTYPE HTML>
<HTML lang="en">
<HEAD>
<META charset="UTF-8">
<TITLE>Speed Comparison of Popular Crypto Algorithms</TITLE>
<STYLE>
  table {border-collapse: collapse;}
  table, th, td, tr {border: 1px solid black;}
</STYLE>
</HEAD>
<BODY>
<H1><A href="http://www.cryptopp.com">Crypto++</A> 8.1.0 Benchmarks</H1>
<P>Here are speed benchmarks for some commonly used cryptographic algorithms.</P>
<P>CPU frequency of the test platform is 1.8 GHz.</P>
<BR>
<TABLE>
<COLGROUP><COL style="text-align: left;"><COL style="text-align: right;"><COL style="text-align: right;">
<THEAD style="background: #F0F0F0">
<TR><TH>Algorithm<TH>Provider<TH>MiB/Second<TH>Cycles/Byte
<TBODY style="background: white;">
<TR><TD>NonblockingRng<TD>C++<TD>104<TD>16.45
<TR><TD>AutoSeededRandomPool<TD>C++<TD>41<TD>41.5
<TR><TD>AutoSeededX917RNG(AES)<TD>C++<TD>7<TD>239.3
<TR><TD>MT19937<TD>C++<TD>143<TD>11.96
<TR><TD>AES/OFB RNG<TD>C++<TD>60<TD>28.5
<TR><TD>Hash_DRBG(SHA1)<TD>C++<TD>31<TD>55.1
<TR><TD>Hash_DRBG(SHA256)<TD>C++<TD>27<TD>64.3
<TR><TD>HMAC_DRBG(SHA1)<TD>C++<TD>7<TD>231.6
<TR><TD>HMAC_DRBG(SHA256)<TD>C++<TD>7<TD>242.4
<TBODY style="background: yellow;">
<TR><TD>CRC32<TD>C++<TD>254<TD>6.77
<TR><TD>CRC32C<TD>C++<TD>254<TD>6.77
<TR><TD>Adler32<TD>C++<TD>801<TD>2.14
<TR><TD>MD5<TD>C++<TD>226<TD>7.59
<TR><TD>SHA-1<TD>C++<TD>170<TD>10.08
<TR><TD>SHA-256<TD>C++<TD>73<TD>23.42
<TR><TD>SHA-512<TD>C++<TD>115<TD>14.93
<TR><TD>SHA3-224<TD>C++<TD>103<TD>16.64
<TR><TD>SHA3-256<TD>C++<TD>98<TD>17.53
<TR><TD>SHA3-384<TD>C++<TD>76<TD>22.45
<TR><TD>SHA3-512<TD>C++<TD>54<TD>31.7
<TR><TD>Keccak-224<TD>C++<TD>103<TD>16.64
<TR><TD>Keccak-256<TD>C++<TD>98<TD>17.53
<TR><TD>Keccak-384<TD>C++<TD>76<TD>22.45
<TR><TD>Keccak-512<TD>C++<TD>54<TD>31.7
<TR><TD>Tiger<TD>C++<TD>179<TD>9.60
<TR><TD>Whirlpool<TD>C++<TD>32<TD>54.0
<TR><TD>RIPEMD-160<TD>C++<TD>117<TD>14.63
<TR><TD>RIPEMD-320<TD>C++<TD>114<TD>15.12
<TR><TD>RIPEMD-128<TD>C++<TD>227<TD>7.55
<TR><TD>RIPEMD-256<TD>C++<TD>214<TD>8.02
<TR><TD>SM3<TD>C++<TD>102<TD>16.80
<TR><TD>BLAKE2s<TD>C++<TD>150<TD>11.46
<TR><TD>BLAKE2b<TD>C++<TD>153<TD>11.25
</TABLE>

<BR>
<TABLE>
<COLGROUP><COL style="text-align: left;"><COL style="text-align: right;"><COL style="text-align: right;"><COL style="text-align: right;"><COL style="text-align: right;">
<THEAD style="background: #F0F0F0">
<TR><TH>Algorithm<TH>Provider<TH>MiB/Second<TH>Cycles/Byte<TH>Microseconds to<BR>Setup Key and IV<TH>Cycles to<BR>Setup Key and IV
<TBODY style="background: white;">
<TR><TD>GMAC(AES) (2K tables)<TD>C++<TD>206<TD>8.32<TD>2.632<TD>4738
<TR><TD>GMAC(AES) (64K tables)<TD>C++<TD>134<TD>12.79<TD>24.252<TD>43654
<TR><TD>VMAC(AES)-64 (128-bit key)<TD>C++<TD>1085<TD>1.58<TD>4.104<TD>7387
<TR><TD>VMAC(AES)-128 (128-bit key)<TD>C++<TD>506<TD>3.39<TD>4.851<TD>8731
<TR><TD>HMAC(SHA-1) (128-bit key)<TD>C++<TD>169<TD>10.13<TD>1.098<TD>1977
<TR><TD>HMAC(SHA-256) (128-bit key)<TD>C++<TD>73<TD>23.42<TD>1.102<TD>1983
<TR><TD>Two-Track-MAC (160-bit key)<TD>C++<TD>140<TD>12.23<TD>0.050<TD>90
<TR><TD>CMAC(AES) (128-bit key)<TD>C++<TD>56<TD>30.4<TD>0.746<TD>1343
<TR><TD>DMAC(AES) (128-bit key)<TD>C++<TD>56<TD>30.4<TD>1.924<TD>3464
<TR><TD>Poly1305(AES) (256-bit key)<TD>C++<TD>341<TD>5.04<TD>0.844<TD>1520
<TR><TD>Poly1305TLS (256-bit key)<TD>C++<TD>341<TD>5.04<TD>0.049<TD>88
<TR><TD>BLAKE2s (256-bit key)<TD>C++<TD>150<TD>11.46<TD>0.636<TD>1146
<TR><TD>BLAKE2b (512-bit key)<TD>C++<TD>153<TD>11.19<TD>0.664<TD>1196
<TR><TD>SipHash-2-4 (128-bit key)<TD>C++<TD>620<TD>2.77<TD>0.054<TD>98
<TR><TD>SipHash-4-8 (128-bit key)<TD>C++<TD>350<TD>4.90<TD>0.054<TD>98
<TBODY style="background: yellow;">
<TR><TD>Panama-LE (256-bit key)<TD>C++<TD>259<TD>6.62<TD>3.831<TD>6896
<TR><TD>Panama-BE (256-bit key)<TD>C++<TD>249<TD>6.90<TD>3.859<TD>6947
<TR><TD>Salsa20<TD>C++<TD>224<TD>7.66<TD>0.499<TD>898
<TR><TD>Salsa20/12<TD>C++<TD>316<TD>5.44<TD>0.614<TD>1105
<TR><TD>Salsa20/8<TD>C++<TD>392<TD>4.37<TD>0.614<TD>1105
<TR><TD>ChaCha20<TD>C++<TD>186<TD>9.22<TD>0.489<TD>879
<TR><TD>ChaCha12<TD>C++<TD>288<TD>5.96<TD>0.603<TD>1085
<TR><TD>ChaCha8<TD>C++<TD>389<TD>4.41<TD>0.603<TD>1085
<TR><TD>ChaChaTLS (256-bit key)<TD>C++<TD>186<TD>9.23<TD>0.600<TD>1080
<TR><TD>Sosemanuk (128-bit key)<TD>C++<TD>482<TD>3.56<TD>1.497<TD>2694
<TR><TD>Rabbit (128-bit key)<TD>C++<TD>122<TD>14.13<TD>0.595<TD>1072
<TR><TD>RabbitWithIV (128-bit key)<TD>C++<TD>120<TD>14.27<TD>1.295<TD>2330
<TR><TD>HC-128 (128-bit key)<TD>C++<TD>290<TD>5.91<TD>18.216<TD>32788
<TR><TD>HC-256 (256-bit key)<TD>C++<TD>130<TD>13.20<TD>119.611<TD>215300
<TR><TD>MARC4 (128-bit key)<TD>C++<TD>114<TD>15.02<TD>4.450<TD>8010
<TR><TD>SEAL-3.0-LE (160-bit key)<TD>C++<TD>290<TD>5.92<TD>63.126<TD>113627
<TR><TD>WAKE-OFB-LE (256-bit key)<TD>C++<TD>179<TD>9.57<TD>4.016<TD>7229
<TBODY style="background: white;">
<TR><TD>AES/CTR (128-bit key)<TD>C++<TD>63<TD>27.3<TD>0.886<TD>1595
<TR><TD>AES/CTR (192-bit key)<TD>C++<TD>55<TD>31.3<TD>0.882<TD>1587
<TR><TD>AES/CTR (256-bit key)<TD>C++<TD>49<TD>35.2<TD>0.910<TD>1637
<TR><TD>AES/CBC (128-bit key)<TD>C++<TD>56<TD>30.4<TD>0.732<TD>1317
<TR><TD>AES/CBC (192-bit key)<TD>C++<TD>50<TD>34.4<TD>0.728<TD>1311
<TR><TD>AES/CBC (256-bit key)<TD>C++<TD>45<TD>38.3<TD>0.756<TD>1361
<TR><TD>AES/OFB (128-bit key)<TD>C++<TD>59<TD>29.0<TD>0.933<TD>1679
<TR><TD>AES/CFB (128-bit key)<TD>C++<TD>63<TD>27.1<TD>1.135<TD>2043
<TR><TD>AES/ECB (128-bit key)<TD>C++<TD>63<TD>27.4<TD>0.329<TD>592
<TR><TD>ARIA/CTR (128-bit key)<TD>C++<TD>32<TD>53.6<TD>0.889<TD>1600
<TR><TD>ARIA/CTR (256-bit key)<TD>C++<TD>26<TD>65.4<TD>0.949<TD>1708
<TR><TD>HIGHT/CTR (128-bit key)<TD>C++<TD>17<TD>103.5<TD>1.317<TD>2371
<TR><TD>Camellia/CTR (128-bit key)<TD>C++<TD>57<TD>30.0<TD>0.709<TD>1275
<TR><TD>Camellia/CTR (256-bit key)<TD>C++<TD>45<TD>38.6<TD>0.788<TD>1419
<TR><TD>Twofish/CTR (128-bit key)<TD>C++<TD>69<TD>24.8<TD>8.854<TD>15937
<TR><TD>Threefish-256(256)/CTR (256-bit key)<TD>C++<TD>94<TD>18.33<TD>0.807<TD>1453
<TR><TD>Threefish-512(512)/CTR (512-bit key)<TD>C++<TD>107<TD>16.02<TD>0.811<TD>1460
<TR><TD>Threefish-1024(1024)/CTR (1024-bit key)<TD>C++<TD>60<TD>28.6<TD>0.865<TD>1556
<TR><TD>Serpent/CTR (128-bit key)<TD>C++<TD>41<TD>41.8<TD>1.568<TD>2822
<TR><TD>CAST-128/CTR (128-bit key)<TD>C++<TD>47<TD>36.8<TD>1.081<TD>1946
<TR><TD>CAST-256/CTR (256-bit key)<TD>C++<TD>45<TD>37.7<TD>2.434<TD>4381
<TR><TD>RC6/CTR (128-bit key)<TD>C++<TD>83<TD>20.68<TD>2.734<TD>4921
<TR><TD>MARS/CTR (128-bit key)<TD>C++<TD>50<TD>34.3<TD>4.738<TD>8529
<TR><TD>SHACAL-2/CTR (128-bit key)<TD>C++<TD>76<TD>22.64<TD>1.026<TD>1846
<TR><TD>SHACAL-2/CTR (512-bit key)<TD>C++<TD>76<TD>22.64<TD>1.071<TD>1928
<TR><TD>DES/CTR (64-bit key)<TD>C++<TD>31<TD>55.6<TD>14.649<TD>26368
<TR><TD>DES-XEX3/CTR (192-bit key)<TD>C++<TD>25<TD>67.6<TD>14.750<TD>26550
<TR><TD>DES-EDE3/CTR (192-bit key)<TD>C++<TD>12<TD>140.3<TD>43.784<TD>78812
<TR><TD>IDEA/CTR (128-bit key)<TD>C++<TD>40<TD>42.8<TD>0.941<TD>1694
<TR><TD>RC5 (r=16)<TD>C++<TD>79<TD>21.83<TD>2.307<TD>4152
<TR><TD>Blowfish/CTR (128-bit key)<TD>C++<TD>60<TD>28.7<TD>64.804<TD>116647
<TR><TD>SKIPJACK/CTR (80-bit key)<TD>C++<TD>18<TD>96.7<TD>7.797<TD>14035
<TR><TD>SEED/CTR (1/2 K table)<TD>C++<TD>33<TD>52.3<TD>0.889<TD>1600
<TR><TD>SM4/CTR (128-bit key)<TD>C++<TD>39<TD>44.4<TD>1.058<TD>1905
<TR><TD>Kalyna-128(128)/CTR (128-bit key)<TD>C++<TD>57<TD>30.1<TD>1.217<TD>2191
<TR><TD>Kalyna-128(256)/CTR (256-bit key)<TD>C++<TD>40<TD>42.6<TD>1.210<TD>2178
<TR><TD>Kalyna-256(256)/CTR (256-bit key)<TD>C++<TD>36<TD>47.3<TD>1.868<TD>3362
<TR><TD>Kalyna-256(512)/CTR (512-bit key)<TD>C++<TD>28<TD>60.4<TD>2.199<TD>3958
<TR><TD>Kalyna-512(512)/CTR (512-bit key)<TD>C++<TD>30<TD>56.6<TD>3.507<TD>6312
<TBODY style="background: yellow;">
<TR><TD>CHAM-64(128)/CTR (128-bit key)<TD>C++<TD>21<TD>80.3<TD>0.650<TD>1169
<TR><TD>CHAM-128(128)/CTR (128-bit key)<TD>C++<TD>46<TD>37.1<TD>0.629<TD>1133
<TR><TD>CHAM-128(256)/CTR (256-bit key)<TD>C++<TD>40<TD>43.0<TD>0.664<TD>1196
<TR><TD>LEA-128(128)/CTR (128-bit key)<TD>C++<TD>52<TD>32.9<TD>0.874<TD>1572
<TR><TD>LEA-128(192)/CTR (192-bit key)<TD>C++<TD>45<TD>38.3<TD>0.954<TD>1718
<TR><TD>LEA-128(256)/CTR (256-bit key)<TD>C++<TD>39<TD>43.5<TD>1.041<TD>1874
<TR><TD>SIMECK-32(64)/CTR (64-bit key)<TD>C++<TD>22<TD>76.6<TD>0.836<TD>1505
<TR><TD>SIMECK-64(128)/CTR (128-bit key)<TD>C++<TD>51<TD>33.8<TD>0.864<TD>1556
<TR><TD>SIMON-64(96)/CTR (96-bit key)<TD>C++<TD>52<TD>32.7<TD>0.819<TD>1474
<TR><TD>SIMON-64(128)/CTR (128-bit key)<TD>C++<TD>50<TD>34.1<TD>0.852<TD>1533
<TR><TD>SIMON-128(128)/CTR (128-bit key)<TD>C++<TD>70<TD>24.6<TD>0.937<TD>1687
<TR><TD>SIMON-128(192)/CTR (192-bit key)<TD>C++<TD>69<TD>24.8<TD>0.940<TD>1693
<TR><TD>SIMON-128(256)/CTR (256-bit key)<TD>C++<TD>67<TD>25.8<TD>0.987<TD>1777
<TR><TD>SPECK-64(96)/CTR (96-bit key)<TD>C++<TD>76<TD>22.60<TD>0.665<TD>1197
<TR><TD>SPECK-64(128)/CTR (128-bit key)<TD>C++<TD>73<TD>23.37<TD>0.670<TD>1206
<TR><TD>SPECK-128(128)/CTR (128-bit key)<TD>C++<TD>130<TD>13.23<TD>0.714<TD>1286
<TR><TD>SPECK-128(192)/CTR (192-bit key)<TD>C++<TD>127<TD>13.50<TD>0.692<TD>1245
<TR><TD>SPECK-128(256)/CTR (256-bit key)<TD>C++<TD>124<TD>13.80<TD>0.701<TD>1262
<TR><TD>TEA/CTR (128-bit key)<TD>C++<TD>34<TD>51.2<TD>0.759<TD>1366
<TR><TD>XTEA/CTR (128-bit key)<TD>C++<TD>26<TD>67.2<TD>0.742<TD>1335
<TBODY style="background: white;">
<TR><TD>AES/GCM (2K tables)<TD>C++<TD>48<TD>35.7<TD>2.652<TD>4774
<TR><TD>AES/GCM (64K tables)<TD>C++<TD>43<TD>40.2<TD>24.196<TD>43552
<TR><TD>AES/CCM (128-bit key)<TD>C++<TD>30<TD>57.9<TD>1.145<TD>2062
<TR><TD>AES/EAX (128-bit key)<TD>C++<TD>30<TD>57.9<TD>2.237<TD>4027
<TR><TD>ChaCha20/Poly1305 (256-bit key)<TD>C++<TD>120<TD>14.34<TD>3.407<TD>6133
<TR><TD>XChaCha20/Poly1305 (256-bit key)<TD>C++<TD>120<TD>14.33<TD>4.000<TD>7200
</TABLE>


Jeffrey Walton

May 12, 2023, 9:37:50 AM
to cryptop...@googlegroups.com
On Fri, May 12, 2023 at 9:09 AM Dwight Kulkarni <dwi...@realtime-7.com> wrote:
>
> See below benchmark test results:
>
> root@imx8mpevk:~/p2p_sockets# ./cryptest.exe b 2 1.8
> [...]
> <TR><TD>AES/CTR (128-bit key)<TD>C++<TD>63<TD>27.3<TD>0.886<TD>1595
> <TR><TD>AES/CTR (192-bit key)<TD>C++<TD>55<TD>31.3<TD>0.882<TD>1587
> <TR><TD>AES/CTR (256-bit key)<TD>C++<TD>49<TD>35.2<TD>0.910<TD>1637
> <TR><TD>AES/CBC (128-bit key)<TD>C++<TD>56<TD>30.4<TD>0.732<TD>1317
> <TR><TD>AES/CBC (192-bit key)<TD>C++<TD>50<TD>34.4<TD>0.728<TD>1311
> <TR><TD>AES/CBC (256-bit key)<TD>C++<TD>45<TD>38.3<TD>0.756<TD>1361
> <TR><TD>AES/OFB (128-bit key)<TD>C++<TD>59<TD>29.0<TD>0.933<TD>1679
> <TR><TD>AES/CFB (128-bit key)<TD>C++<TD>63<TD>27.1<TD>1.135<TD>2043
> <TR><TD>AES/ECB (128-bit key)<TD>C++<TD>63<TD>27.4<TD>0.329<TD>592
> [...]

For completeness, AES/CFB is 27.1 cycles-per-byte (cpb). Cycles per
byte is what I am interested in when comparing benchmarks. The other
number you are probably interested in is 63, which is 63
megabytes-per-second (MB/s).

These numbers are from software-only implementations. The provider is
"C++", which means software only.

Here is what an aarch64 machine looks like, from an early Pine64 board:

<TR><TD>AES/CTR (128-bit key)<TD>ARMv8<TD>428<TD>2.67<TD>1.174<TD>1408
<TR><TD>AES/CTR (192-bit key)<TD>ARMv8<TD>376<TD>3.05<TD>1.190<TD>1428
<TR><TD>AES/CTR (256-bit key)<TD>ARMv8<TD>343<TD>3.33<TD>1.230<TD>1476
<TR><TD>AES/CBC (128-bit key)<TD>ARMv8<TD>280<TD>4.08<TD>0.994<TD>1192
<TR><TD>AES/CBC (192-bit key)<TD>ARMv8<TD>245<TD>4.67<TD>1.007<TD>1208
<TR><TD>AES/CBC (256-bit key)<TD>ARMv8<TD>218<TD>5.26<TD>1.047<TD>1256
<TR><TD>AES/XTS (256-bit key)<TD>ARMv8<TD>225<TD>5.09<TD>1.728<TD>2074
<TR><TD>AES/XTS (384-bit key)<TD>ARMv8<TD>210<TD>5.46<TD>1.765<TD>2117
<TR><TD>AES/XTS (512-bit key)<TD>ARMv8<TD>199<TD>5.76<TD>1.854<TD>2225
<TR><TD>AES/OFB (128-bit key)<TD>ARMv8<TD>226<TD>5.06<TD>1.152<TD>1383
<TR><TD>AES/CFB (128-bit key)<TD>ARMv8<TD>249<TD>4.60<TD>1.414<TD>1697
<TR><TD>AES/ECB (128-bit key)<TD>ARMv8<TD>604<TD>1.90<TD>0.525<TD>630

AES/CFB is 4.6 cpb. The provider is "ARMv8." 249 is 249 MB/s.

Modern aarch64 machines can usually get down to 2.5 cpb or so for CFB
mode. This is from a MacMini M1:

<TR><TD>AES/CTR (128-bit key)<TD>ARMv8<TD>9316<TD>0.33<TD>0.109<TD>349
<TR><TD>AES/CTR (192-bit key)<TD>ARMv8<TD>8194<TD>0.37<TD>0.117<TD>376
<TR><TD>AES/CTR (256-bit key)<TD>ARMv8<TD>7303<TD>0.42<TD>0.129<TD>412
<TR><TD>AES/CBC (128-bit key)<TD>ARMv8<TD>1083<TD>2.82<TD>0.097<TD>310
<TR><TD>AES/CBC (192-bit key)<TD>ARMv8<TD>938<TD>3.25<TD>0.106<TD>339
<TR><TD>AES/CBC (256-bit key)<TD>ARMv8<TD>834<TD>3.66<TD>0.118<TD>379
<TR><TD>AES/XTS (256-bit key)<TD>ARMv8<TD>1807<TD>1.69<TD>0.181<TD>578
<TR><TD>AES/XTS (384-bit key)<TD>ARMv8<TD>1768<TD>1.73<TD>0.203<TD>650
<TR><TD>AES/XTS (512-bit key)<TD>ARMv8<TD>1712<TD>1.78<TD>0.227<TD>726
<TR><TD>AES/OFB (128-bit key)<TD>ARMv8<TD>1133<TD>2.69<TD>0.106<TD>340
<TR><TD>AES/CFB (128-bit key)<TD>ARMv8<TD>1121<TD>2.72<TD>0.117<TD>374
<TR><TD>AES/ECB (128-bit key)<TD>ARMv8<TD>10883<TD>0.28<TD>0.072<TD>232

AES/CFB is running at 2.7 cpb. 1121 is 1.12 GB/s.

There's something unusual about your setup. You will need to determine
why it is not providing ARMv8 acceleration, or why the library is not
picking it up.

Since your benchmarks are missing AES/XTS results, I know you are
using an old version of the library. Maybe you should update to
Crypto++ 8.7 for starters.

Jeff
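
The two columns discussed above, cycles per byte and MiB/second, are related through the CPU frequency passed to the benchmark. Here is a minimal sketch of that conversion, assuming the throughput column is bytes per second divided by 2^20 (the table header says MiB/Second). The function name `mib_per_second` is hypothetical.

```cpp
// Convert a cycles-per-byte figure into throughput in MiB/s for a given
// clock frequency. Assumes 1 MiB = 1024 * 1024 bytes.
double mib_per_second(double cpu_ghz, double cycles_per_byte) {
    const double bytes_per_second = cpu_ghz * 1e9 / cycles_per_byte;
    return bytes_per_second / (1024.0 * 1024.0);
}
```

For the 1.8 GHz board in this thread, `mib_per_second(1.8, 27.1)` comes out at roughly 63, which matches the AES/CFB row with the "C++" provider.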

Dwight Kulkarni

May 12, 2023, 10:11:01 AM
to Crypto++ Users
Hi Jeff,

I am using an ARM64 cross compiler. I source into the sdk using:


source /opt/fsl-imx-xwayland/5.10-hardknott/environment-setup-cortexa53-crypto-poky-linux

I compile it on x86 and then move it to the embedded machine for testing.

This is the compiler command when I issue the make command :

aarch64-poky-linux-g++  -mcpu=cortex-a53 -march=armv8-a+crc+crypto -fstack-protector-strong  -O2 -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security -Werror=format-security --sysroot=/opt/fsl-imx-xwayland/5.10-hardknott/sysroots/cortexa53-crypto-poky-linux -O2 -pipe -g -feliminate-unused-debug-types -DCRYPTOPP_ARM_ACLE_AVAILABLE=0 -DCRYPTOPP_DISABLE_ASM -fPIC -pthread -pipe -c zlib.cpp

I am right now compiling the latest library and will let you know the results.

Let me know if you see anything in the compiler flags.

Dwight Kulkarni

May 12, 2023, 10:27:52 AM
to Crypto++ Users
These are the results from CryptoPP 8.7 vs 8.1 earlier:


root@imx8mpevk:~/p2p_sockets# ./cryptest.exe b 2 1.8
<!DOCTYPE HTML>
<HTML lang="en">
<HEAD>
<META charset="UTF-8">
<TITLE>Speed Comparison of Popular Crypto Algorithms</TITLE>
<STYLE>
  table {border-collapse: collapse;}
  table, th, td, tr {border: 1px solid black;}
</STYLE>
</HEAD>
<BODY>
<H1><A href="http://www.cryptopp.com">Crypto++ 8.7.0</A> Benchmarks</H1>

<P>Here are speed benchmarks for some commonly used cryptographic algorithms.</P>
<P>CPU frequency of the test platform is 1.8 GHz.</P>
<TABLE>
<COLGROUP><COL style="text-align: left;"><COL style="text-align: right;"><COL style="text-align: right;">
<THEAD style="background: #F0F0F0">
<TR><TH>Algorithm<TH>Provider<TH>MiB/Second<TH>Cycles/Byte
<TBODY style="background: white;">
<TR><TD>NonblockingRng<TD>C++<TD>104<TD>16.44
<TR><TD>AutoSeededRandomPool<TD>C++<TD>78<TD>22.13
<TR><TD>AutoSeededX917RNG(AES)<TD>ARMv8<TD>9<TD>198.0
<TR><TD>MT19937<TD>C++<TD>143<TD>11.98
<TR><TD>AES/OFB RNG<TD>ARMv8<TD>300<TD>5.72
<TR><TD>Hash_DRBG(SHA1)<TD>ARMv8<TD>58<TD>29.4
<TR><TD>Hash_DRBG(SHA256)<TD>ARMv8<TD>82<TD>20.92
<TR><TD>HMAC_DRBG(SHA1)<TD>ARMv8<TD>16<TD>106.5
<TR><TD>HMAC_DRBG(SHA256)<TD>ARMv8<TD>23<TD>74.0
<TBODY style="background: yellow;">
<TR><TD>CRC32<TD>ARMv8<TD>3256<TD>0.53
<TR><TD>CRC32C<TD>ARMv8<TD>3256<TD>0.53

<TR><TD>Adler32<TD>C++<TD>801<TD>2.14
<TR><TD>MD5<TD>C++<TD>226<TD>7.59
<TR><TD>SHA-1<TD>ARMv8<TD>762<TD>2.25
<TR><TD>SHA-256<TD>ARMv8<TD>667<TD>2.57
<TR><TD>SHA-512<TD>C++<TD>115<TD>14.95
<TR><TD>SHA3-224<TD>C++<TD>112<TD>15.32
<TR><TD>SHA3-256<TD>C++<TD>106<TD>16.27
<TR><TD>SHA3-384<TD>C++<TD>81<TD>21.09
<TR><TD>SHA3-512<TD>C++<TD>57<TD>30.2
<TR><TD>Keccak-224<TD>C++<TD>112<TD>15.33
<TR><TD>Keccak-256<TD>C++<TD>105<TD>16.28
<TR><TD>Keccak-384<TD>C++<TD>81<TD>21.08
<TR><TD>Keccak-512<TD>C++<TD>57<TD>30.2
<TR><TD>Tiger<TD>C++<TD>179<TD>9.61

<TR><TD>Whirlpool<TD>C++<TD>32<TD>54.0
<TR><TD>RIPEMD-160<TD>C++<TD>117<TD>14.63
<TR><TD>RIPEMD-320<TD>C++<TD>114<TD>15.11

<TR><TD>RIPEMD-128<TD>C++<TD>227<TD>7.55
<TR><TD>RIPEMD-256<TD>C++<TD>214<TD>8.02
<TR><TD>SM3<TD>C++<TD>101<TD>16.94
<TR><TD>BLAKE2s<TD>C++<TD>150<TD>11.48
<TR><TD>BLAKE2b<TD>C++<TD>151<TD>11.37
<TR><TD>LSH-256<TD>C++<TD>35<TD>49.3
<TR><TD>LSH-512<TD>C++<TD>56<TD>30.7

</TABLE>

<BR>
<TABLE>
<COLGROUP><COL style="text-align: left;"><COL style="text-align: right;"><COL style="text-align: right;"><COL style="text-align: right;"><COL style="text-align: right;">
<THEAD style="background: #F0F0F0">
<TR><TH>Algorithm<TH>Provider<TH>MiB/Second<TH>Cycles/Byte<TH>Microseconds to<BR>Setup Key and IV<TH>Cycles to<BR>Setup Key and IV
<TBODY style="background: white;">
<TR><TD>GMAC(AES)<TD>ARMv8<TD>1033<TD>1.66<TD>1.556<TD>2801
<TR><TD>VMAC(AES)-64 (128-bit key)<TD>ARMv8<TD>1345<TD>1.28<TD>2.222<TD>4000
<TR><TD>VMAC(AES)-128 (128-bit key)<TD>ARMv8<TD>820<TD>2.09<TD>2.527<TD>4549
<TR><TD>HMAC(SHA-1) (128-bit key)<TD>ARMv8<TD>753<TD>2.28<TD>1.097<TD>1975
<TR><TD>HMAC(SHA-256) (128-bit key)<TD>ARMv8<TD>663<TD>2.59<TD>1.097<TD>1975

<TR><TD>Two-Track-MAC (160-bit key)<TD>C++<TD>140<TD>12.23<TD>0.050<TD>90
<TR><TD>CMAC(AES) (128-bit key)<TD>ARMv8<TD>340<TD>5.05<TD>0.723<TD>1301
<TR><TD>DMAC(AES) (128-bit key)<TD>ARMv8<TD>345<TD>4.98<TD>2.029<TD>3653
<TR><TD>Poly1305(AES) (256-bit key)<TD>ARMv8<TD>300<TD>5.72<TD>0.831<TD>1496
<TR><TD>Poly1305TLS (256-bit key)<TD>C++<TD>300<TD>5.72<TD>0.051<TD>91
<TR><TD>BLAKE2s (256-bit key)<TD>C++<TD>150<TD>11.48<TD>0.597<TD>1074
<TR><TD>BLAKE2b (512-bit key)<TD>C++<TD>151<TD>11.36<TD>0.622<TD>1120
<TR><TD>SipHash-2-4 (128-bit key)<TD>C++<TD>593<TD>2.89<TD>0.054<TD>98

<TR><TD>SipHash-4-8 (128-bit key)<TD>C++<TD>350<TD>4.90<TD>0.054<TD>98
<TBODY style="background: yellow;">
<TR><TD>Panama-LE (256-bit key)<TD>C++<TD>257<TD>6.67<TD>3.810<TD>6859
<TR><TD>Panama-BE (256-bit key)<TD>C++<TD>250<TD>6.86<TD>3.836<TD>6904
<TR><TD>Salsa20<TD>C++<TD>222<TD>7.74<TD>0.485<TD>873
<TR><TD>Salsa20/12<TD>C++<TD>311<TD>5.52<TD>0.602<TD>1084
<TR><TD>Salsa20/8<TD>C++<TD>385<TD>4.46<TD>0.608<TD>1094
<TR><TD>ChaCha20<TD>NEON<TD>253<TD>6.79<TD>0.475<TD>854
<TR><TD>ChaCha12<TD>NEON<TD>396<TD>4.34<TD>0.585<TD>1052
<TR><TD>ChaCha8<TD>NEON<TD>549<TD>3.12<TD>0.591<TD>1064
<TR><TD>ChaChaTLS (256-bit key)<TD>NEON<TD>253<TD>6.80<TD>0.564<TD>1016
<TR><TD>Sosemanuk (128-bit key)<TD>C++<TD>484<TD>3.55<TD>1.393<TD>2507
<TR><TD>Rabbit (128-bit key)<TD>C++<TD>130<TD>13.17<TD>0.599<TD>1079
<TR><TD>RabbitWithIV (128-bit key)<TD>C++<TD>130<TD>13.17<TD>1.285<TD>2313
<TR><TD>HC-128 (128-bit key)<TD>C++<TD>350<TD>4.90<TD>18.160<TD>32688
<TR><TD>HC-256 (256-bit key)<TD>C++<TD>143<TD>12.01<TD>119.603<TD>215285
<TR><TD>MARC4 (128-bit key)<TD>C++<TD>114<TD>15.02<TD>4.447<TD>8004
<TR><TD>SEAL-3.0-LE (160-bit key)<TD>C++<TD>280<TD>6.13<TD>31.427<TD>56569
<TR><TD>WAKE-OFB-LE (256-bit key)<TD>C++<TD>184<TD>9.31<TD>4.016<TD>7229
<TBODY style="background: white;">
<TR><TD>AES/CTR (128-bit key)<TD>ARMv8<TD>598<TD>2.87<TD>1.009<TD>1816
<TR><TD>AES/CTR (192-bit key)<TD>ARMv8<TD>538<TD>3.19<TD>1.018<TD>1832
<TR><TD>AES/CTR (256-bit key)<TD>ARMv8<TD>502<TD>3.42<TD>1.071<TD>1928
<TR><TD>AES/CBC (128-bit key)<TD>ARMv8<TD>341<TD>5.04<TD>0.817<TD>1470
<TR><TD>AES/CBC (192-bit key)<TD>ARMv8<TD>306<TD>5.60<TD>0.830<TD>1493
<TR><TD>AES/CBC (256-bit key)<TD>ARMv8<TD>278<TD>6.17<TD>0.888<TD>1599
<TR><TD>AES/XTS (256-bit key)<TD>ARMv8<TD>298<TD>5.76<TD>1.430<TD>2574
<TR><TD>AES/XTS (384-bit key)<TD>ARMv8<TD>281<TD>6.11<TD>1.474<TD>2654
<TR><TD>AES/XTS (512-bit key)<TD>ARMv8<TD>274<TD>6.26<TD>1.583<TD>2849
<TR><TD>AES/OFB (128-bit key)<TD>ARMv8<TD>317<TD>5.41<TD>0.963<TD>1733
<TR><TD>AES/CFB (128-bit key)<TD>ARMv8<TD>344<TD>4.99<TD>1.135<TD>2044
<TR><TD>AES/ECB (128-bit key)<TD>ARMv8<TD>731<TD>2.35<TD>0.458<TD>825
<TR><TD>ARIA/CTR (128-bit key)<TD>C++<TD>34<TD>50.1<TD>0.753<TD>1356
<TR><TD>ARIA/CTR (256-bit key)<TD>C++<TD>28<TD>62.2<TD>0.772<TD>1390
<TR><TD>HIGHT/CTR (128-bit key)<TD>C++<TD>17<TD>104.0<TD>1.302<TD>2343
<TR><TD>Camellia/CTR (128-bit key)<TD>C++<TD>57<TD>30.4<TD>0.736<TD>1325
<TR><TD>Camellia/CTR (256-bit key)<TD>C++<TD>44<TD>38.9<TD>0.774<TD>1393
<TR><TD>Twofish/CTR (128-bit key)<TD>C++<TD>68<TD>25.3<TD>8.807<TD>15853
<TR><TD>Threefish-256(256)/CTR (256-bit key)<TD>C++<TD>93<TD>18.55<TD>0.759<TD>1366
<TR><TD>Threefish-512(512)/CTR (512-bit key)<TD>C++<TD>106<TD>16.27<TD>0.768<TD>1382
<TR><TD>Threefish-1024(1024)/CTR (1024-bit key)<TD>C++<TD>58<TD>29.6<TD>0.835<TD>1503
<TR><TD>Serpent/CTR (128-bit key)<TD>C++<TD>41<TD>42.0<TD>1.561<TD>2809
<TR><TD>CAST-128/CTR (128-bit key)<TD>C++<TD>46<TD>37.3<TD>1.121<TD>2018
<TR><TD>CAST-256/CTR (256-bit key)<TD>C++<TD>45<TD>38.3<TD>2.417<TD>4351
<TR><TD>RC6/CTR (128-bit key)<TD>C++<TD>86<TD>19.86<TD>2.722<TD>4899
<TR><TD>MARS/CTR (128-bit key)<TD>C++<TD>49<TD>34.7<TD>4.714<TD>8485
<TR><TD>SHACAL-2/CTR (128-bit key)<TD>C++<TD>75<TD>22.96<TD>1.005<TD>1809
<TR><TD>SHACAL-2/CTR (512-bit key)<TD>C++<TD>75<TD>22.98<TD>1.051<TD>1892
<TR><TD>DES/CTR (64-bit key)<TD>C++<TD>31<TD>55.6<TD>14.622<TD>26319
<TR><TD>DES-XEX3/CTR (192-bit key)<TD>C++<TD>26<TD>65.6<TD>14.736<TD>26525
<TR><TD>DES-EDE3/CTR (192-bit key)<TD>C++<TD>12<TD>140.4<TD>43.754<TD>78758
<TR><TD>IDEA/CTR (128-bit key)<TD>C++<TD>32<TD>54.3<TD>0.902<TD>1624
<TR><TD>RC5 (r=16)<TD>C++<TD>79<TD>21.82<TD>2.320<TD>4176
<TR><TD>Blowfish/CTR (128-bit key)<TD>C++<TD>59<TD>29.2<TD>64.819<TD>116674
<TR><TD>SKIPJACK/CTR (80-bit key)<TD>C++<TD>17<TD>99.6<TD>7.781<TD>14006
<TR><TD>SEED/CTR (1/2 K table)<TD>C++<TD>32<TD>52.8<TD>0.842<TD>1516
<TR><TD>SM4/CTR (128-bit key)<TD>C++<TD>38<TD>44.6<TD>1.025<TD>1846
<TR><TD>Kalyna-128(128)/CTR (128-bit key)<TD>C++<TD>57<TD>30.2<TD>1.134<TD>2041
<TR><TD>Kalyna-128(256)/CTR (256-bit key)<TD>C++<TD>39<TD>44.5<TD>1.312<TD>2361
<TR><TD>Kalyna-256(256)/CTR (256-bit key)<TD>C++<TD>36<TD>47.4<TD>1.914<TD>3445
<TR><TD>Kalyna-256(512)/CTR (512-bit key)<TD>C++<TD>29<TD>59.7<TD>2.210<TD>3978
<TR><TD>Kalyna-512(512)/CTR (512-bit key)<TD>C++<TD>30<TD>56.5<TD>3.552<TD>6393
<TBODY style="background: yellow;">
<TR><TD>CHAM-64(128)/CTR (128-bit key)<TD>C++<TD>21<TD>81.4<TD>0.625<TD>1125
<TR><TD>CHAM-128(128)/CTR (128-bit key)<TD>C++<TD>46<TD>37.5<TD>0.617<TD>1110
<TR><TD>CHAM-128(256)/CTR (256-bit key)<TD>C++<TD>40<TD>43.2<TD>0.663<TD>1193
<TR><TD>LEA-128(128)/CTR (128-bit key)<TD>NEON<TD>89<TD>19.28<TD>0.848<TD>1526
<TR><TD>LEA-128(192)/CTR (192-bit key)<TD>NEON<TD>78<TD>22.03<TD>0.967<TD>1741
<TR><TD>LEA-128(256)/CTR (256-bit key)<TD>NEON<TD>69<TD>24.8<TD>1.007<TD>1812
<TR><TD>SIMECK-32(64)/CTR (64-bit key)<TD>C++<TD>20<TD>84.8<TD>0.820<TD>1476
<TR><TD>SIMECK-64(128)/CTR (128-bit key)<TD>C++<TD>44<TD>39.0<TD>0.847<TD>1525
<TR><TD>SIMON-64(96)/CTR (96-bit key)<TD>C++<TD>52<TD>33.1<TD>0.844<TD>1520
<TR><TD>SIMON-64(128)/CTR (128-bit key)<TD>C++<TD>50<TD>34.5<TD>0.882<TD>1587
<TR><TD>SIMON-128(128)/CTR (128-bit key)<TD>NEON<TD>61<TD>27.9<TD>0.916<TD>1648
<TR><TD>SIMON-128(192)/CTR (192-bit key)<TD>NEON<TD>61<TD>28.3<TD>0.916<TD>1649
<TR><TD>SIMON-128(256)/CTR (256-bit key)<TD>NEON<TD>58<TD>29.4<TD>0.963<TD>1734
<TR><TD>SPECK-64(96)/CTR (96-bit key)<TD>C++<TD>75<TD>22.97<TD>0.658<TD>1184
<TR><TD>SPECK-64(128)/CTR (128-bit key)<TD>C++<TD>72<TD>23.72<TD>0.653<TD>1175
<TR><TD>SPECK-128(128)/CTR (128-bit key)<TD>NEON<TD>166<TD>10.35<TD>0.675<TD>1216
<TR><TD>SPECK-128(192)/CTR (192-bit key)<TD>NEON<TD>162<TD>10.61<TD>0.656<TD>1180
<TR><TD>SPECK-128(256)/CTR (256-bit key)<TD>NEON<TD>158<TD>10.88<TD>0.656<TD>1182
<TR><TD>TEA/CTR (128-bit key)<TD>C++<TD>37<TD>46.2<TD>0.735<TD>1323
<TR><TD>XTEA/CTR (128-bit key)<TD>C++<TD>26<TD>65.7<TD>0.746<TD>1343
<TBODY style="background: white;">
<TR><TD>AES/GCM<TD>ARMv8<TD>379<TD>4.53<TD>1.553<TD>2795
<TR><TD>AES/CCM (128-bit key)<TD>ARMv8<TD>218<TD>7.89<TD>1.201<TD>2161
<TR><TD>AES/EAX (128-bit key)<TD>ARMv8<TD>215<TD>7.99<TD>1.844<TD>3319
<TR><TD>ChaCha20/Poly1305 (256-bit key)<TD>NEON<TD>136<TD>12.59<TD>3.307<TD>5952
<TR><TD>XChaCha20/Poly1305 (256-bit key)<TD>NEON<TD>136<TD>12.58<TD>3.945<TD>7100

</TABLE>

<BR>
<TABLE>
<COLGROUP><COL style="text-align: left;"><COL style="text-align: right;"><COL style="text-align: right;">
<THEAD style="background: #F0F0F0">
<TR><TH>Operation<TH>Milliseconds/Operation<TH>Megacycles/Operation
<TBODY style="background: white;">
Exception caught: FileStore: error opening file for reading: TestData/rsa1024.dat

Jeffrey Walton

May 12, 2023, 10:40:10 AM
to cryptop...@googlegroups.com
On Fri, May 12, 2023 at 10:27 AM Dwight Kulkarni <dwi...@realtime-7.com> wrote:
>
> These are the results from CryptoPP 8.7 vs 8.1 earlier:
> [...]
> <TR><TD>AES/CFB (128-bit key)<TD>ARMv8<TD>344<TD>4.99<TD>1.135<TD>2044
> [...]

5 MB / 344 MB/s = 0.01453 s = 14.53 ms. That is below 200 ms.

Jeff
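
The estimate above is a single division. A minimal sketch follows, using the hypothetical name `expected_ms` and the same loose treatment of units as the post (megabytes divided by megabytes per second).

```cpp
// Expected processing time in milliseconds for a message of the given
// size at the given throughput. Both arguments use the same size unit.
double expected_ms(double megabytes, double megabytes_per_second) {
    return megabytes / megabytes_per_second * 1000.0;
}
```

`expected_ms(5.0, 344.0)` gives about 14.5 ms. The same function with the earlier software-only figure of 63 gives about 79 ms.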

Dwight Kulkarni

May 12, 2023, 10:48:42 AM
to Crypto++ Users
Hi Jeff,

The new library is much faster, but I am still not getting that speed in the code: 1 second to encrypt 5 MB versus 3 seconds before.

I added the .reserve(...) code also.



 Start at: 05/12/2023 14:46:39.358
 Start Encrypted at: 05/12/2023 14:46:39.607
 in encrypt aes
 returning cipher
 Encrypted at: 05/12/2023 14:46:40.626


string bkey1 = "cX/8AascXHJz6Anr02GHZg==";
string biv2 = "WdHMzK+OrQOTjxZ8cAXQ6g==";
std::pair<SecByteBlock,SecByteBlock> akeys = load_aes_key_from_b64_str(bkey1, biv2);
cout << " Start at: " << get_curr_datetime_str() << endl;

const int num_bytes = 5000003;
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<int> dist(0,255);
std::string random_string;
random_string.reserve(num_bytes);
for(int i=0; i<num_bytes; i++){
random_string += static_cast<char>(dist(gen));
}

cout << " Start Encrypted at: " << get_curr_datetime_str() << endl;
string message_bytes = encrypt_aes(random_string, akeys.first, akeys.second);
cout << " Encrypted at: " << get_curr_datetime_str() << endl;



std::string encrypt_aes(std::string message, SecByteBlock key, SecByteBlock iv) {
try {
cout <<" in encrypt aes " <<endl;
AlgorithmParameters params = MakeParameters(Name::FeedbackSize(), 1/*8-bits*/)
(Name::IV(), ConstByteArrayParameter(iv));
CFB_Mode<AES>::Encryption e;
std::string cipher;
cipher.reserve(message.size()+16);
e.SetKey(key, key.size(), params);
StringSource ss(message, true, new StreamTransformationFilter(e, new StringSink(cipher)));
cout << " returning cipher " << endl;
return cipher;
}
catch (CryptoPP::Exception e) {
std::cerr << e.what() << std::endl;
return "";
}
}

Dwight Kulkarni

May 12, 2023, 10:50:38 AM
to Crypto++ Users
I didn't give the load_aes_key function. Here it is:



std::pair<SecByteBlock,SecByteBlock> load_aes_key_from_b64_str(string bkey, string biv){

SecByteBlock key;
SecByteBlock iv;
try {
AutoSeededRandomPool prng;
key = SecByteBlock(udp_aes_key_size_in_bytes);
iv = SecByteBlock(udp_aes_key_iv_in_bytes);

//prng.GenerateBlock(key.data(), key.size());
//prng.GenerateBlock(iv.data(), iv.size());
bkey = b64decode(bkey);
biv = b64decode(biv);

ArraySink sk(key.data(), key.size());
sk.ChannelPut(DEFAULT_CHANNEL, reinterpret_cast<const byte*>(bkey.c_str()),key.size());

ArraySink siv(iv.data(), iv.size());
siv.ChannelPut(DEFAULT_CHANNEL, reinterpret_cast<const byte*>(biv.c_str()),iv.size());

//CryptoPP::GenerateIntoBufferedTransformation(s, DEFAULT_CHANNEL, size);
// key.data() = reinterpret_cast<byte*>(const_cast<char*>(nkey.c_str()));
//iv.data() = reinterpret_cast<byte*>(const_cast<char*>(niv.c_str()));
std::cout << "Loaded blocks with key size " << int(key.size()) << " and iv " << int(iv.size()) << endl;

return std::make_pair(key,iv);

}

catch (CryptoPP::Exception e) {
std::cerr << e.what() << std::endl;
return std::make_pair(key,iv);
}

}

Dwight Kulkarni

May 12, 2023, 10:53:37 AM
to Crypto++ Users
Two others:


string get_curr_datetime_str(){
return datetime_to_str(chrono::system_clock::now());

}

string datetime_to_str(std::chrono::high_resolution_clock::time_point utctime){
std::chrono::high_resolution_clock::time_point::duration tt = utctime.time_since_epoch();
const time_t durs = std::chrono::duration_cast<std::chrono::seconds>(tt).count();
std::ostringstream ss;
string format = "%m/%d/%Y %H:%M:%S";
if (const std::tm *tm = std::gmtime(&durs)){
ss << std::put_time(tm,format.c_str());
const long long durms = std::chrono::duration_cast<std::chrono::milliseconds>(tt).count();
ss << "." << std::setw(3) << std::setfill('0') << int(durms - durs * 1000);
return ss.str();
}else{
return "";
}
}

Jeffrey Walton

May 12, 2023, 11:17:32 AM
to cryptop...@googlegroups.com
On Fri, May 12, 2023 at 10:48 AM Dwight Kulkarni <dwi...@realtime-7.com> wrote:
>
> The new library is much faster, but I am still not getting that speed in the code. 1 second to encrypt 5 mb versus 3 seconds before.
>
> I added the .reserve(...) code also.

Well, you should profile your code to find the bottlenecks.

Jeff
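
Short of a full profiler, one coarse first step is to time individual statements with a monotonic clock rather than printing wall-clock timestamps. This is only a sketch: `elapsed_ms` is a hypothetical helper, and `std::chrono::steady_clock` is used because it is not affected by system clock adjustments.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: run a callable once and return the elapsed time
// in milliseconds, measured with a monotonic clock.
template <typename F>
double elapsed_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping only the statement of interest in a lambda, for example `elapsed_ms([&] { cipher = encrypt_aes(message, key, iv); })`, keeps console output and string copies in the caller out of the measurement.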

Dwight Kulkarni

May 12, 2023, 11:46:08 AM
to Crypto++ Users
Hi Jeff,

The time is taken up all on the one line:


StringSource ss(message, true, new StreamTransformationFilter(e, new StringSink(cipher)));


Do you know what might explain the difference versus the cryptest.exe benchmark?

<TR><TD>AES/CFB (128-bit key)<TD>ARMv8<TD>344<TD>4.99<TD>1.135<TD>2044



Encrypted 1at: 05/12/2023 15:40:55.588
 Encrypted 2at: 05/12/2023 15:40:55.588
 Encrypted 3at: 05/12/2023 15:40:55.588
returning cipher
 Encrypted 4at: 05/12/2023 15:40:56.599
 Encrypted at: 05/12/2023 15:40:56.600


std::string encrypt_aes(std::string message, SecByteBlock key, SecByteBlock iv) {
try {
cout <<" in encrypt aes " <<endl;
cout << " Encrypted 1at: " << get_curr_datetime_str() << endl;
AlgorithmParameters params = MakeParameters(Name::FeedbackSize(), 1/*8-bits*/)
(Name::IV(), ConstByteArrayParameter(iv));
cout << " Encrypted 2at: " << get_curr_datetime_str() << endl;
CFB_Mode<AES>::Encryption e;
std::string cipher;
cipher.reserve(message.size()+16);
e.SetKey(key, key.size(), params);
cout << " Encrypted 3at: " << get_curr_datetime_str() << endl;
StringSource ss(message, true, new StreamTransformationFilter(e, new StringSink(cipher)));
cout << " returning cipher " << endl;
cout << " Encrypted 4at: " << get_curr_datetime_str() << endl;
return cipher;
}
catch (CryptoPP::Exception e) {
std::cerr << e.what() << std::endl;
return "";
}
}

Dwight Kulkarni

May 12, 2023, 1:09:46 PM
to Crypto++ Users
Update: It seems to be an issue with CFB mode. If I switch to ECB: 12 ms to process the workload.

 in encrypt aes
 Encrypted 1at: 05/12/2023 17:06:54.838
 Encrypted 2at: 05/12/2023 17:06:54.838
 Encrypted 3at: 05/12/2023 17:06:54.838
 returning cipher
 Encrypted 4at: 05/12/2023 17:06:54.850

std::string encrypt_aes(std::string message, SecByteBlock key, SecByteBlock iv) {
try {
cout <<" in encrypt aes " <<endl;
cout << " Encrypted 1at: " << get_curr_datetime_str() << endl;
AlgorithmParameters params = MakeParameters(Name::FeedbackSize(), 1/*8-bits*/)
(Name::IV(), ConstByteArrayParameter(iv));
cout << " Encrypted 2at: " << get_curr_datetime_str() << endl;
ECB_Mode<AES>::Encryption e;
std::string cipher;
cipher.reserve(message.size()+16);
e.SetKey(key, key.size(), params);
cout << " Encrypted 3at: " << get_curr_datetime_str() << endl;

/*
const size_t buf_sz = 320000;
byte buffer[buf_sz];
StringSink ssk(cipher);
ArraySource ars((const byte*)message.data(), message.size(), true, new StreamTransformationFilter(e, new ArraySink(buffer,buf_sz)));
while(true){
size_t bread = ars.Get(buffer, buf_sz);
if(bread==0){
break;
}
ssk.Put(buffer,bread);
}
ssk.MessageEnd();
*/

Jeffrey Walton

May 12, 2023, 2:09:38 PM
to cryptop...@googlegroups.com
On Fri, May 12, 2023 at 1:09 PM Dwight Kulkarni <dwi...@realtime-7.com> wrote:
>
> Update: It seems to be an issue with CFB mode. If I switch to ECB: 12 ms to process the workload.
>
> in encrypt aes
> Encrypted 1at: 05/12/2023 17:06:54.838
> Encrypted 2at: 05/12/2023 17:06:54.838
> Encrypted 3at: 05/12/2023 17:06:54.838
> returning cipher
> Encrypted 4at: 05/12/2023 17:06:54.850

Yeah, CFB mode is a hard mode in software. The software is effectively
providing a linear feedback shift register, and it does a lot of
[non-accelerated] bit twiddling. That's why it needs 4+ cycles to
process a byte.

I think you would have better results with GCM mode. First, GCM mode
is an authenticated encryption mode, so you get authenticity
assurances over the ciphertext. Second, you get hardware acceleration
with both the bulk cipher encryption (AES), and the MAC over the
ciphertext (GMAC).

Also see https://www.cryptopp.com/wiki/Authenticated_Encryption .

Jeff