> 1) I compiled Crypto++ on a little-endian MIPS machine under debian
> squeeze and encountered problems. First, config.h assumes anything
> that defines __mips__ is big endian. I recompiled with
> -DIS_LITTLE_ENDIAN, and all was well.
In SVN, this has already been fixed by using __MIPSEB__ instead.
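To illustrate the idea (this is not the actual config.h code), here is a
minimal sketch of keying the decision on __MIPSEB__ / __MIPSEL__, which GCC
predefines for big- and little-endian MIPS targets, and cross-checking the
result at run time. The SKETCH_BIG_ENDIAN macro and the function names are
made up for this example, and the final fallback to little endian is an
assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compile-time guess. SKETCH_BIG_ENDIAN is a hypothetical name used only
// in this sketch; it is not a Crypto++ macro.
#if defined(__MIPSEB__)
#  define SKETCH_BIG_ENDIAN 1
#elif defined(__MIPSEL__)
#  define SKETCH_BIG_ENDIAN 0
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
#  if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#    define SKETCH_BIG_ENDIAN 1
#  else
#    define SKETCH_BIG_ENDIAN 0
#  endif
#else
#  define SKETCH_BIG_ENDIAN 0  // assumption: fall back to little endian
#endif

// Run-time probe: look at the first byte of a known 32-bit value in memory.
inline bool RuntimeIsBigEndian() {
    const std::uint32_t probe = 0x01020304u;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 0x01;
}

// True when the compile-time guess agrees with what the hardware does.
inline bool EndianConfigIsConsistent() {
    return RuntimeIsBigEndian() == (SKETCH_BIG_ENDIAN != 0);
}

// Optional startup self-check for debug builds.
inline void CheckEndianConfig() {
    assert(EndianConfigIsConsistent());
}
```

A check like CheckEndianConfig() at startup would have flagged the
little-endian MIPS misdetection immediately instead of producing wrong
output later.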
> 2) the GNUmakefile didn't detect GCC42_OR_LATER. g++ --version returns
> "g++ (Debian 4.4.4-6) 4.4.4" on its first line, which doesn't match
> the grep regex. Not much depended on this macro, so I hand-changed the
I changed the regex to "\((Debian|GCC\)) (4.[2-9]|[5-9])", but I wonder if
other Linux distributions also modify the GCC version output.
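To make the concern concrete, here is a small sketch that runs the pattern
quoted above against a few first lines of "g++ --version" output, next to a
more tolerant alternative that keys on the first "major.minor" after a
closing parenthesis. The function names are hypothetical, the Ubuntu and Red
Hat style strings in the usage note are illustrative rather than copied from
real systems, and the tolerant pattern is only one possible approach, not
what the GNUmakefile does. Note that on the Debian line the quoted pattern
matches the package version inside the parentheses rather than the compiler
version after them.

```cpp
#include <cassert>
#include <regex>
#include <string>

// The pattern quoted above, unchanged (ECMAScript syntax accepts it as is).
inline bool MatchesQuotedPattern(const std::string& firstLine) {
    static const std::regex re(R"re(\((Debian|GCC\)) (4.[2-9]|[5-9]))re");
    return std::regex_search(firstLine, re);
}

// Hypothetical alternative: take the first "major.minor" that follows a
// closing parenthesis and compare it numerically against 4.2.
inline bool Gcc42OrLater(const std::string& firstLine) {
    static const std::regex re(R"re(\) (\d+)\.(\d+))re");
    std::smatch m;
    if (!std::regex_search(firstLine, m, re))
        return false;
    assert(m.size() == 3);  // whole match plus two capture groups
    const int major = std::stoi(m[1].str());
    const int minor = std::stoi(m[2].str());
    return major > 4 || (major == 4 && minor >= 2);
}
```

With these, "g++ (Debian 4.4.4-6) 4.4.4" and "g++ (GCC) 4.4.4" satisfy both
functions, while a line shaped like "g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
satisfies only Gcc42OrLater.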
> 3) My biggest problem is trying to get good performance with
> individual AES block encryptions. In the Crypto++ API I could only
> find AES::ProcessBlock() as the method for enciphering single AES
> blocks. However, this call appears to entail so much overhead that
> performance is poor. (My OCB implementation is peaking at around 23
> cycles per byte while it theoretically should be closer to 13 or 14
> cpb since CTR is around 11.) Is there a higher-performance interface
> to the raw block cipher?
Is this on x86/x64? If so, you can use AdvancedProcessBlocks (search for it
in cryptlib.h), which will give you a big performance boost, since it
encrypts multiple blocks with only one set of overhead. That function lets
you XOR either the input or the output of AES with something, but not both
(which is what OCB calls for), so you would have to do the second XOR
yourself.
But I just recalled that there is currently no assembly code for AES block
decryption, so OCB decryption in Crypto++ would be pretty slow right now.
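The XOR arrangement described above can be sketched as follows. This is a
self-contained illustration, not Crypto++ code: ToyEncryptBlock is a
placeholder that is NOT AES, and BatchEncryptXorInput is a hypothetical
stand-in for a batched call that XORs its input (in Crypto++ that role would
be played by AdvancedProcessBlocks with its XOR-input flag; check cryptlib.h
for the real signature and flags). Offset generation, the final partial
block, and the tag are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::array<std::uint8_t, 16>;

// Byte-wise XOR of two 16-byte blocks.
inline Block XorBlocks(const Block& a, const Block& b) {
    Block r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<std::uint8_t>(a[i] ^ b[i]);
    return r;
}

// Toy placeholder for the block cipher. NOT AES and not secure; it only
// gives the sketch something to call.
inline Block ToyEncryptBlock(const Block& in) {
    Block out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<std::uint8_t>(in[(i + 1) % 16] * 5u + 0x3Du + i);
    return out;
}

// Hypothetical batched call: out[i] = E(in[i] XOR xorIn[i]) for n blocks,
// paying the per-call overhead once instead of once per block.
inline void BatchEncryptXorInput(const Block* in, const Block* xorIn,
                                 Block* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = ToyEncryptBlock(XorBlocks(in[i], xorIn[i]));
}

// OCB-style body: C[i] = E(P[i] XOR Z[i]) XOR Z[i].
inline std::vector<Block> OcbStyleEncrypt(const std::vector<Block>& plain,
                                          const std::vector<Block>& offsets) {
    assert(plain.size() == offsets.size());
    std::vector<Block> cipher(plain.size());
    // The first XOR is folded into the batched call.
    BatchEncryptXorInput(plain.data(), offsets.data(), cipher.data(),
                         plain.size());
    // The second XOR is done by the caller.
    for (std::size_t i = 0; i < cipher.size(); ++i)
        cipher[i] = XorBlocks(cipher[i], offsets[i]);
    return cipher;
}
```

The point of the arrangement is that the expensive part (the cipher calls)
happens in one batched call, and the caller only adds a cheap pass of XORs
over the output.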
BTW, I just did an implementation of AES and GCM using the new AES-NI and
CLMUL instructions, and got 3.5 cpb for AES-GCM, of which 1.4 is for AES-CTR
and 2.1 is for GMAC. Looking at the OCB description, it seems like it should
be possible for AES-OCB to clock in at less than 2 cpb. (Too bad OCB is