Best way to replace RC4 and SHA1 in crypto/tls


John Graham-Cumming

Jan 22, 2013, 1:17:13 PM
to golan...@googlegroups.com
I have fast versions of crypto/rc4, crypto/md5 and crypto/sha1 that use cgo to interface to OpenSSL: http://blog.jgc.org/2013/01/calling-openssl-libcrypto-functions.html

I would like these to be used by crypto/tls. I can't see any way to replace them in that package without maintaining a local version of Go or making a copy of crypto/tls.

1. Would the Go team accept my modules (or similar) that rely on OpenSSL?

2. If not, am I missing some way to mess around inside crypto/tls to make the replacement at run time?

If the answer to both is no, I'll probably maintain a local version of Go, because TLS (specifically SHA1 and RC4) plus MD5 are the major CPU consumers in one of my applications:

    7217  20.0%  20.0%     7217  20.0% crypto/sha1.block
    4852  13.5%  33.5%     4852  13.5% md5._Block
    4461  12.4%  45.8%     4461  12.4% crypto/rc4.(*Cipher).XORKeyStream

John.

agl

Jan 22, 2013, 1:35:16 PM
to golan...@googlegroups.com
We don't have a way to swap out specific implementations of ciphers, although we almost do for hash functions. However, crypto/tls was written before the hash function mechanism and so doesn't use it.

We wouldn't (at least I wouldn't want to) accept patches that make Go link against OpenSSL.

There isn't a way to inject an alternative implementation into crypto/tls, although I could do that without too much fuss for the hash functions if it were useful. (crypto/tls doesn't use MD5.)

Generally we have wished to improve the implementations in Go rather than anything else. I believe the RC4 code was the first Go code that I ever wrote, and I'm sure that some amd64 asm would speed it up considerably. The SHA1 code is not quite so basic, but I can still easily believe that OpenSSL beats it. (Importing the OpenSSL asm directly has license problems.)


Cheers

AGL 

John Graham-Cumming

Jan 22, 2013, 1:48:37 PM
to golan...@googlegroups.com
On Tuesday, January 22, 2013 6:35:16 PM UTC, agl wrote:
> There isn't a way to inject alternative implementation into crypto/tls, although I could do that without too much fuss for the hash functions if it was useful. (crypto/tls doesn't use MD5.)

It would simplify my life greatly if I could inject a new RC4 and SHA1 as those are major CPU users in my application.

> Generally we have wished to improve the implementations in Go rather than anything else.

Agreed. And that would be preferable. In the interim I need to make changes to a running application so the alternative route of using OpenSSL is the fastest way of getting there. 

John.

Thomas Bushnell, BSG

Jan 22, 2013, 2:41:22 PM
to John Graham-Cumming, golang-nuts
Note that MD5 and SHA1 shouldn't be used much anymore. MD5 should be aggressively dropped, and SHA1 is nearing the problem zone.


Patrick Mylund Nielsen

Jan 22, 2013, 2:55:02 PM
to Thomas Bushnell, BSG, John Graham-Cumming, golang-nuts
Depends on what you're doing. MD5 is still pretty fast for file CRC where collisions are unlikely, and a construction like HMAC-SHA1 is still safe. If it indeed is for CRC (or anything that requires a fast hash function, really), then BLAKE2 might be an interesting alternative: https://blake2.net/ / https://github.com/dchest/blake2b. As far as I can tell, Dmitry's implementation is very fast.

Note that SHA3 (Keccak) is particularly slow in software.


John Graham-Cumming

Jan 23, 2013, 1:19:02 PM
to golan...@googlegroups.com, Thomas Bushnell, BSG, John Graham-Cumming

On Tuesday, January 22, 2013 7:55:02 PM UTC, Patrick Mylund Nielsen wrote:
> If it indeed is for CRC (or anything that requires a fast hash function, really), then BLAKE2 might be an interesting alternative: https://blake2.net/ / https://github.com/dchest/blake2b. As far as I can tell, Dmitry's implementation is very fast.

I tested that implementation of BLAKE2 using the same tests as I used for the native Go and OpenSSL functions. Looking just at native Go, my test of hashing 4.4 GB of data came in at MD5 404 MB/s, SHA1 123 MB/s, BLAKE2 201 MB/s. So it was about half the speed of MD5. I will spend some time looking at the optimized C version of BLAKE2 with a Go wrapper for comparison.

John.
 

Patrick Mylund Nielsen

Jan 23, 2013, 1:49:40 PM
to John Graham-Cumming, golang-nuts, Thomas Bushnell, BSG, John Graham-Cumming
Hmm, interesting. A cgo wrapper for the optimized version should definitely be faster.

Of course, if you don't care about anyone tampering with the data/producing identical checksums, an actual CRC function would probably be faster than any of the above.
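For reference, Go's standard library already provides CRC functions in `hash/crc32`. The sketch below computes both the IEEE and the Castagnoli variants; the helper names are hypothetical. A CRC detects accidental corruption only and offers no protection against deliberate tampering, which is the caveat made above.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// castagnoli is the table for the CRC-32C polynomial, the variant that
// x86 CPUs with SSE4.2 can compute with a dedicated instruction.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// checksumIEEE returns the common CRC-32 (IEEE polynomial) of data.
func checksumIEEE(data []byte) uint32 {
	return crc32.ChecksumIEEE(data)
}

// checksumCastagnoli returns the CRC-32C of data.
func checksumCastagnoli(data []byte) uint32 {
	return crc32.Checksum(data, castagnoli)
}

func main() {
	data := []byte("123456789")
	fmt.Printf("CRC-32:  %08x\n", checksumIEEE(data))
	fmt.Printf("CRC-32C: %08x\n", checksumCastagnoli(data))
}
```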


John Graham-Cumming

Jan 23, 2013, 2:03:16 PM
to golan...@googlegroups.com, John Graham-Cumming, Thomas Bushnell, BSG, John Graham-Cumming
On Wednesday, January 23, 2013 6:49:40 PM UTC, Patrick Mylund Nielsen wrote:
> Hmm, interesting. A cgo wrapper for the optimized version should definitely be faster.

Yes, I've tested that. It runs at about 712 MB/s. For comparison, my wrapped OpenSSL functions run at 607 MB/s for MD5 and 636 MB/s for SHA1. So the optimized BLAKE2 is about 12% faster than SHA1 and 17% faster than MD5.
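The percentages follow directly from the quoted throughput figures. A small check of the arithmetic, with `percentFaster` as a hypothetical helper name:

```go
package main

import "fmt"

// percentFaster returns how much faster, in percent, a rate of fast is
// than a rate of slow (both in the same unit, here MB/s).
func percentFaster(fast, slow float64) float64 {
	return (fast/slow - 1) * 100
}

func main() {
	// Throughput figures quoted in the message above, in MB/s.
	const blake2C, opensslSHA1, opensslMD5 = 712.0, 636.0, 607.0

	fmt.Printf("BLAKE2 vs SHA1: %.0f%%\n", percentFaster(blake2C, opensslSHA1)) // prints 12%
	fmt.Printf("BLAKE2 vs MD5:  %.0f%%\n", percentFaster(blake2C, opensslMD5))  // prints 17%
}
```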

John.
 

Dmitry Chestnykh

Feb 4, 2013, 12:42:52 PM
to golan...@googlegroups.com, Thomas Bushnell, BSG, John Graham-Cumming
On Wednesday, January 23, 2013 7:19:02 PM UTC+1, John Graham-Cumming wrote:
> I tested that implementation of Blake2 using the same tests as I used for native Go and OpenSSL functions. Looking just at native Go my test of hashing 4.4GB of data came in at MD5 404 MB/s, SHA1 123 MB/s, Blake2 201 MB/s. So, it was about half the speed of MD5. I will spend some time looking at the optimized C version of Blake2 with a Go wrapper for comparison.

It's very strange that there's a 2x difference between MD5 and BLAKE2. Did you test BLAKE2b (it's faster on 64-bit CPUs)? Here are benchmarks on my Core 2 Duo laptop:

    ~/goproj/src/github.com/dchest/blake2b $ go test -test.bench=.
    PASS
    BenchmarkWrite1K     500000      4950 ns/op   206.83 MB/s
    BenchmarkWrite8K      50000     38870 ns/op   210.75 MB/s
    BenchmarkHash64     1000000      1520 ns/op    42.10 MB/s
    BenchmarkHash128    1000000      1435 ns/op    89.15 MB/s
    BenchmarkHash1K      500000      5955 ns/op   171.93 MB/s

    ~/sources/go/src/pkg/crypto/md5 $ go test -test.bench=.
    PASS
    BenchmarkHash8Bytes             5000000       740 ns/op    10.81 MB/s
    BenchmarkHash1K                  500000      4349 ns/op   235.45 MB/s
    BenchmarkHash8K                  100000     29707 ns/op   275.75 MB/s
    BenchmarkHash8BytesUnaligned    5000000       740 ns/op    10.80 MB/s
    BenchmarkHash1KUnaligned         500000      4734 ns/op   216.26 MB/s
    BenchmarkHash8KUnaligned          50000     31874 ns/op   257.01 MB/s
    ok   crypto/md5 18.776s


Note that MD5 uses the unsafe package, while BLAKE2b doesn't.
When I modified BLAKE2b to use unsafe, it was as fast as MD5.

-Dmitry

Dmitry Chestnykh

Feb 4, 2013, 1:01:46 PM
to golan...@googlegroups.com, John Graham-Cumming, Thomas Bushnell, BSG, John Graham-Cumming
On Wednesday, January 23, 2013 7:49:40 PM UTC+1, Patrick Mylund Nielsen wrote:
> Hmm, interesting. A cgo wrapper for the optimized version should definitely be faster.
>
> Of course, if you don't care about anyone tampering with the data/producing identical checksums, an actual CRC function would probably be faster than any of the above.

CRC in software is not very fast compared to modern checksums (unless your CPU has an instruction for it, of course). For 64-bit checksums you can use SipHash-2-4 (http://github.com/dchest/siphash) with a constant key, which is at least twice as fast when comparing current Go implementations.

You can also reduce the number of rounds in SipHash: e.g. use the 1-4 version, which in the C version runs at about 2 GiB/s on a Core 2 Duo and doesn't seem to have flaws when used as a non-cryptographic checksum (SMHasher gives it the maximum score).
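To make the round-count idea concrete, here is a self-contained sketch of SipHash-c-d written from the published algorithm description, with the compression rounds `c` and finalization rounds `d` as parameters. It is not the code from the siphash package linked above, the function names are hypothetical, and the constant key in `main` is an arbitrary illustrative value. It is also unoptimized, so it says nothing about the speeds quoted in the thread.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// sipRound applies one SipRound to the four state words.
func sipRound(v0, v1, v2, v3 uint64) (uint64, uint64, uint64, uint64) {
	v0 += v1
	v1 = bits.RotateLeft64(v1, 13)
	v1 ^= v0
	v0 = bits.RotateLeft64(v0, 32)
	v2 += v3
	v3 = bits.RotateLeft64(v3, 16)
	v3 ^= v2
	v0 += v3
	v3 = bits.RotateLeft64(v3, 21)
	v3 ^= v0
	v2 += v1
	v1 = bits.RotateLeft64(v1, 17)
	v1 ^= v2
	v2 = bits.RotateLeft64(v2, 32)
	return v0, v1, v2, v3
}

// sipHash computes SipHash-c-d of msg under the 128-bit key (k0, k1).
// c=2, d=4 is the standard variant; c=1, d=4 is the reduced one.
func sipHash(c, d int, k0, k1 uint64, msg []byte) uint64 {
	v0 := k0 ^ 0x736f6d6570736575
	v1 := k1 ^ 0x646f72616e646f6d
	v2 := k0 ^ 0x6c7967656e657261
	v3 := k1 ^ 0x7465646279746573

	// compress mixes one 64-bit message word into the state.
	compress := func(m uint64) {
		v3 ^= m
		for i := 0; i < c; i++ {
			v0, v1, v2, v3 = sipRound(v0, v1, v2, v3)
		}
		v0 ^= m
	}

	n := len(msg)
	for ; len(msg) >= 8; msg = msg[8:] {
		compress(binary.LittleEndian.Uint64(msg))
	}
	// Final word: leftover bytes, with the message length in the top byte.
	last := uint64(n) << 56
	for i, b := range msg {
		last |= uint64(b) << (8 * uint(i))
	}
	compress(last)

	v2 ^= 0xff
	for i := 0; i < d; i++ {
		v0, v1, v2, v3 = sipRound(v0, v1, v2, v3)
	}
	return v0 ^ v1 ^ v2 ^ v3
}

func main() {
	// Arbitrary constant key, chosen only for illustration.
	const k0, k1 = 0x0706050403020100, 0x0f0e0d0c0b0a0908
	data := []byte("some file contents")
	fmt.Printf("SipHash-2-4: %016x\n", sipHash(2, 4, k0, k1, data))
	fmt.Printf("SipHash-1-4: %016x\n", sipHash(1, 4, k0, k1, data))
}
```

With a constant, publicly known key the result is only a checksum: it guards against accidental corruption, not against someone deliberately constructing collisions.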

-Dmitry