From: "Wei Dai" <wei...@weidai.com>
Date: Thu, 9 Apr 2009 17:13:45 -0700
Local: Thurs, Apr 9 2009 8:13 pm
Subject: Re: Serpent optimisation
It occurs to me that if we do a bitsliced implementation of Serpent in SSE2 (computing 128 blocks in parallel), the rotates and shifts in the linear transform can be done for free by variable renaming. So instead of 1728 instructions for 4 blocks, it would take an average of only 31*7.7 + 32*15 + 33*4 = 850 instructions per 4 blocks, or around 4.43 cpb at an IPC of 3. This doesn't account for loads and stores caused by register spills (which will require some optimization to minimize), but it seems that Serpent may finally be able to overtake Rijndael in performance, at least for long messages in counter mode. In fact it seems likely that Serpent is now the fastest AES finalist for
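
To make the two claims above concrete, here is a minimal Python sketch (not the SSE2 code under discussion). It assumes 16-byte blocks and takes the three terms of the instruction count exactly as given in the post. It also models a bitsliced 32-bit word as a list of 32 slices, where slice j holds bit j of every block, so a rotate becomes a pure re-indexing of that list. The helper names (`cycles_per_byte`, `bitslice`, `unbitslice`, `rotl_renamed`) are hypothetical and exist only for this illustration.

```python
def cycles_per_byte(instructions, blocks=4, block_bytes=16, ipc=3.0):
    """Convert an instruction count for `blocks` blocks into cycles per byte."""
    return instructions / ipc / (blocks * block_bytes)

# Instruction counts per 4 blocks, as quoted in the post.
baseline = 1728
bitsliced = 31 * 7.7 + 32 * 15 + 33 * 4   # about 850

baseline_cpb = cycles_per_byte(baseline)
bitsliced_cpb = cycles_per_byte(bitsliced)

def bitslice(words, width=32):
    """Transpose words into slices: bit k of slice j is bit j of words[k]."""
    return [sum(((w >> j) & 1) << k for k, w in enumerate(words))
            for j in range(width)]

def unbitslice(slices, count):
    """Inverse of bitslice: rebuild `count` words from the slices."""
    return [sum(((s >> k) & 1) << j for j, s in enumerate(slices))
            for k in range(count)]

def rotl_renamed(slices, r):
    """Rotate every word left by r by re-indexing slices; no bit operations."""
    n = len(slices)
    return [slices[(j - r) % n] for j in range(n)]

def rotl32(w, r):
    """Ordinary 32-bit rotate left, used only as a reference."""
    return ((w << r) | (w >> (32 - r))) & 0xFFFFFFFF

words = [0x01234567, 0x89ABCDEF, 0xDEADBEEF, 0x00000001]
rotated = unbitslice(rotl_renamed(bitslice(words), 13), len(words))
assert rotated == [rotl32(w, 13) for w in words]
```

In a real bitsliced implementation the re-indexing would happen when the code is written or generated (each slice is a register or memory slot with a different name), which is why the rotates cost no instructions; the list comprehension here only models that renaming.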