Description:
An online forum for discussing tips and tricks of crypto optimization.

SPEED-CC registration deadline
If you intend to come to Berlin for SPEED-CC, please register ASAP.
Registration officially closes tomorrow, but I can still keep it open for
1-2 days.
[link]

SPEED-CC Call for papers
Dear all,
You might be interested in the upcoming workshop "SPEED-CC -- Software
Performance Enhancement for Encryption and Decryption and
Cryptographic Compilers" which will take place October 12 and 13, 2009 in
Berlin, Germany. The web page is at
[link]
Please find the call for papers below....

New AMD Packed Integer Rotates and Shifts instructions
Hi,
I have just read that AMD is introducing a new set of instructions for
"Packed Integer Rotates and Shifts".
That is good news for the x86 family of new processors.
In my opinion many cryptographic algorithms can benefit from those
instructions (at least several SHA-3 hash candidate functions that...

constant-time extended GCD?
I've been doing an x86/SSE2 implementation of curve25519 using ideas similar to the ones in "Fast elliptic-curve cryptography on the Cell Broadband Engine" ([link]). (It currently takes less than 260,000 cycles on Core 2 32-bit mode for a scalar multiplication using Montgomery form, not counting the final modular...

secrets of GMP 4.3
I noticed that GMP 4.3.0, released 2 days ago, contains new optimized assembly code. In mpn/x86_64/addmul_2.asm, there is this comment:
C cycles/limb
C K8,K9: 2.375
C K10: ?
C P4: ?
C P6-15: 4.45
C This code is the result of running a code generation and optimization tool
C suite written by David Harvey and Torbjorn Granlund....

Jacobi quartic form curves
I've been updating myself on the latest elliptic curve techniques, and found a March 2009 update of [link], which claims (based on theoretical calculations) that elliptic curve scalar multiplication should be fastest in Jacobi quartic form. Looking at the latest eBATS results, the fastest curves that have been implemented are gls1271 and...

SSE2 Comba multiplication
eBATS results seem to show that Crypto++ in x86 32-bit mode ([link]) is competitive in speed with GMP in 64-bit mode ([link]) for some operations. This note describes a couple of the techniques used by Crypto++...
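For readers unfamiliar with the term in the thread title, here is a minimal sketch of Comba (product-scanning) multiplication in portable C. It is not the Crypto++ code and uses no SSE2. The function name `comba_mul`, the 32-bit little-endian limb layout, and the 96-bit accumulator split are all assumptions made for illustration. The idea is to compute the result one output column at a time, summing every partial product a[i]*b[k-i] for that column before writing a single limb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from Crypto++ or GMP): Comba multiplication.
 * Computes r[0..2n-1] = a[0..n-1] * b[0..n-1], where each array holds
 * 32-bit limbs, least significant limb first. r must not alias a or b. */
void comba_mul(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint64_t acc_lo = 0; /* low 64 bits of the column accumulator */
    uint32_t acc_hi = 0; /* carries out of the low 64 bits */

    assert(n >= 1);

    for (size_t k = 0; k < 2 * n - 1; k++) {
        /* Indices i such that both i and k - i lie in [0, n-1]. */
        size_t i_min = (k < n) ? 0 : k - n + 1;
        size_t i_max = (k < n) ? k : n - 1;

        for (size_t i = i_min; i <= i_max; i++) {
            uint64_t p = (uint64_t)a[i] * b[k - i];
            acc_lo += p;
            acc_hi += (acc_lo < p); /* detect wrap-around of acc_lo */
        }

        /* Emit one result limb, then shift the accumulator right by 32 bits. */
        r[k] = (uint32_t)acc_lo;
        acc_lo = (acc_lo >> 32) | ((uint64_t)acc_hi << 32);
        acc_hi = 0;
    }

    /* What remains is the top limb of the 2n-limb product. */
    r[2 * n - 1] = (uint32_t)acc_lo;
}
```

Compared with row-by-row schoolbook multiplication, each result limb is written exactly once and the running sum stays in registers, which is the usual reason given for preferring this ordering on register-starved targets.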

AES register saving tricks
The straightforward way of computing the AES round function requires 8 32-bit registers, but the x86 architecture provides only 7 usable ones. I noticed a neat trick that Brian Gladman used in his AES x86 assembly code to avoid spilling a register, and have an improvement upon it of my own. The basic problem is, for each of the four 32-bit registers representing the...

GCC compile farm
Some of you probably know this already, but if you need to test your crypto code (and if it's free software) on different CPU architectures, you can get a shell account on the GCC Compile Farm. The URL is [link]. In the past I've used the Sourceforge Compile Farm and HP Test Drive for the...

tips for instruction scheduling?
Does anyone have tips for how to schedule a sequence of assembly language instructions, in order to maximize the instructions per cycle (IPC) executed by a CPU? What I've been doing so far is just simple trial and error. I'd move a few instructions around more or less at random, benchmark the resulting code...