I'm pleased to announce software online for a demo of web browsing
taking just 166000 Haswell cycles to generate a new one-time sntrup761
public key for each TLS 1.3 session. This demo uses
(1) the Gnome web browser (client) and stunnel (server) using
(2) a patched version of OpenSSL 1.1.1f using
(3) a new OpenSSL ENGINE using
(4) a new sntrup761 library.
This is joint work. Authors in alphabetical order: Daniel J. Bernstein,
Billy Bob Brumley, Ming-Shing Chen (leader for #4), and Nicola Tuveri
(leader for #3). Email address:
authorcontac...@box.cr.yp.to.
The new speed is much faster than previously announced speeds for
sntrup761 keygen. In combination with the (recently announced) 48780
Haswell cycles for enc and 59120 Haswell cycles for dec, this new keygen
speed means a total of just 273900 cycles for sntrup761 keygen+enc+dec.
The TLS 1.3 integration here uses the same basic data flow as the CECPQ2
experiment carried out by Google and Cloudflare: the client generates a
one-time public key, the server encapsulates to that one-time key, and
the client decapsulates, obtaining a one-time session key. Beware that
this data flow is designed only to protect against attacks by future
quantum computers ("transitional" security); stopping active attacks
will also require long-term post-quantum identity keys.
CECPQ2 used (a minor variant of) ntruhrss701 for this data flow. The
state-of-the-art (March 2020) software for ntruhrss701 takes 272028
cycles for keygen, 26116 cycles for enc, and 63632 cycles for dec, for a
total of 361776 cycles.
The CECPQ2 experiments showed that ntruhrss701's CPU time consumes very
little of the overall TLS time. The new sntrup761 software here consumes
even less time. The CECPQ2 experiments showed a somewhat more noticeable
impact of network traffic on the slowest connections; sntrup761 sends
2197 bytes (one-time key+ciphertext) where ntruhrss701 sends 2276.
Here's the comparison table (all numbers are from SUPERCOP except for
the new 166000 for sntrup761 keygen):
                        sntrup761  ntruhrss701
public-key bytes             1158         1138
ciphertext bytes             1039         1138
pk+ciphertext bytes          2197         2276
keygen cycles              166000       272028
enc cycles                  48780        26116
dec cycles                  59120        63632
keygen+enc+dec cycles      273900       361776
1000*bytes+cycles         2470900      2637776
This should put an end to the idea that sntrup761 keygen is too slow for
TLS.
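The derived rows of the comparison table (pk+ciphertext bytes, keygen+enc+dec cycles, and 1000*bytes+cycles) can be rechecked from the per-operation numbers. The following is only a small arithmetic sketch; the variable names are made up for illustration, and the inputs are copied from the table.

```shell
#!/bin/sh
# Inputs copied from the comparison table; variable names are illustrative only.
sntrup_pk=1158; sntrup_ct=1039
sntrup_keygen=166000; sntrup_enc=48780; sntrup_dec=59120
hrss_pk=1138; hrss_ct=1138
hrss_keygen=272028; hrss_enc=26116; hrss_dec=63632

# Derived rows: bytes on the wire, total cycles, and the combined metric.
sntrup_bytes=$((sntrup_pk + sntrup_ct))
sntrup_cycles=$((sntrup_keygen + sntrup_enc + sntrup_dec))
sntrup_metric=$((1000 * sntrup_bytes + sntrup_cycles))

hrss_bytes=$((hrss_pk + hrss_ct))
hrss_cycles=$((hrss_keygen + hrss_enc + hrss_dec))
hrss_metric=$((1000 * hrss_bytes + hrss_cycles))

echo "sntrup761:   $sntrup_bytes bytes, $sntrup_cycles cycles, metric $sntrup_metric"
echo "ntruhrss701: $hrss_bytes bytes, $hrss_cycles cycles, metric $hrss_metric"
```

The 1000*bytes+cycles metric simply weights each byte of traffic as 1000 cycles; other weights would shift the comparison accordingly.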
Both sntrup761 and ntruhrss701 are designed for IND-CCA2 security, as
recommended in most of the NISTPQC lattice submissions and in Google's
CECPQ2 announcement: "CCA2-security is worthwhile, even though TLS can
do without. ... CPA vs CCA security is a subtle and dangerous
distinction, and if we're going to invest in a post-quantum primitive,
better it not be fragile."
Taking away IND-CCA2 security would speed up both ntruhrss701 and
sntrup761 by removing some hashing and removing (basically) a copy of
enc from dec. For comparison, Google's earlier CECPQ1 experiment used an
early non-IND-CCA-secure version of newhope1024, with approximately
200000 cycles of total computation and more than 4000 bytes of network
traffic (more than 4 million in the 1000*bytes+cycles metric), and
concluded that this "would be practical to quickly deploy".
Algorithmically, the new sntrup761 keygen speed comes from generating
32 independent keys at once, using Montgomery's trick for batch
inversion. This option has been pointed out before. The new demo shows
that this option fits into a CECPQ2-type data flow in TLS 1.3. The total
latency of generating 32 keys is around two milliseconds; even better,
keys can be generated in advance of being used, reducing the impact on
TLS latency to zero (with or without Montgomery's trick). Of course, one
still has to generate each new key at some point, but the new sntrup761
software shows that Montgomery's trick provides excellent throughput.
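As a rough check of the two-millisecond figure, here is the arithmetic as a shell sketch. The 166000 cycles per key and the batch size of 32 come from the text above; the 3000 MHz clock speed is an assumption chosen only for illustration, not a measurement of any demo machine.

```shell
#!/bin/sh
cycles_per_key=166000   # from the announcement
batch=32                # from the announcement
mhz=3000                # ASSUMPTION: illustrative clock, i.e. cycles per microsecond

batch_cycles=$((batch * cycles_per_key))
batch_us=$((batch_cycles / mhz))   # integer division, rounds down

echo "$batch_cycles cycles per batch, about $batch_us microseconds at $mhz MHz"
```

A slower or faster clock scales the result proportionally, so "around two milliseconds" should be read as an order of magnitude.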
Montgomery's trick replaces each batch of inversions with one shared
inversion and a batch of multiplications. In the context here, there is
a batch of 32 inversions mod q and a batch of 32 inversions mod 3, using
1 shared inversion mod q and 1 shared inversion mod 3. Out of the 166000
cycles per key for a batch of 32 keys, about 30000 cycles per key are
spent on the shared inversions, and simply increasing the batch size
further reduces this cost. With slightly more work it is possible to
share transforms across various multiplications. Consequently, the
current software speed is not the limit of what can be achieved.
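To make the structure of Montgomery's trick concrete, here is a toy bash sketch that inverts a batch of integers mod q = 4591 (the sntrup761 modulus) with a single shared inversion. This is an illustration only: the real keygen inverts polynomials, not integers, and the function names modpow, modinv, and batch_invert are invented for this example and are not part of libsntrup761.

```shell
#!/bin/bash
q=4591   # the sntrup761 modulus; plain integers here, not polynomials

# Modular exponentiation by squaring: prints base^exp mod q.
modpow() {
  local base=$(( $1 % q )) exp=$2 result=1
  while (( exp > 0 )); do
    if (( exp & 1 )); then result=$(( result * base % q )); fi
    base=$(( base * base % q ))
    exp=$(( exp >> 1 ))
  done
  echo "$result"
}

# Stand-in for the expensive inversion: a^(q-2) mod q for a nonzero mod q.
modinv() { modpow "$1" $(( q - 2 )); }

# Montgomery's trick: invert every element of a[] using one call to modinv.
# All elements must be nonzero mod q. Results are stored in inv[].
batch_invert() {
  local n=${#a[@]} i acc=1
  local -a prefix
  for (( i = 0; i < n; i++ )); do
    prefix[i]=$acc                      # product of a[0..i-1]
    acc=$(( acc * a[i] % q ))
  done
  acc=$(modinv "$acc")                  # the single shared inversion
  for (( i = n - 1; i >= 0; i-- )); do
    inv[i]=$(( acc * prefix[i] % q ))   # this is 1/a[i]
    acc=$(( acc * a[i] % q ))           # now 1/(a[0]*...*a[i-1])
  done
}

a=(2 3 5 7 11 4590 1234 42)
inv=()
batch_invert
for i in "${!a[@]}"; do
  echo "${a[i]}^-1 mod $q = ${inv[i]}  check: $(( a[i] * inv[i] % q ))"
done
```

Each element costs roughly three multiplications, and the one inversion is shared across the whole batch, which is why a larger batch spreads the inversion cost more thinly.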
One can also use Montgomery's trick for some other NISTPQC submissions
that rely on inversion as part of keygen, but the dramatic speedup for
sntrup761 doesn't imply a similarly dramatic (or even nonzero) speedup
for those other submissions. In particular, the current ntruhrss701
keygen already exploits the power-of-2 structure of its q for a Hensel
lift. In the Montgomery context, the Hensel speedup rapidly vanishes,
while multiplication speeds and other overheads become more important.
There's _some_ gap between ntruhrss701 and sntrup761 in multiplication
speed (sntrup761 aims for a higher security level, uses larger
polynomials, and requires a field) but this is only about 8000 cycles
per multiplication with the current software.
Demo instructions appear below.
---Dan
### Demo overview
Warning: This demo comes with no cryptographic warranties and no other
security warranties. The software here is experimental, and is built
upon other software with a long history of security problems, such as
OpenSSL. The purpose of this demo is purely to show the sntrup761
performance achievable with a CECPQ2-type data flow for TLS 1.3.
The demo has two parts: a server side and a client side. We recommend
running each side in its own VM.
The server side uses stunnel for SSL termination. It receives TLS
connections, including sntrup761 connections, and passes along the
answers provided by a preexisting back-end web server, which does not
need to support sntrup761 connections. For example, the demo site
https://test761.cr.yp.to looks just like the preexisting site
https://ntruprime.cr.yp.to, but with the extra feature of supporting
sntrup761 connections. Internally, https://test761.cr.yp.to passes
requests along through a local connection to the preexisting back-end
web server for ntruprime.cr.yp.to. You can use https://test761.cr.yp.to
as the server side of this demo, or you can set up the server side for a
web server of your choice.
The client side uses Epiphany, the Gnome web browser, with no
modifications to the Epiphany source code. The glib-networking library
used inside Epiphany already supports OpenSSL as an option for outgoing
connections, and is configured below to use this option.
Both sides use a version of OpenSSL 1.1.1f patched inside libssl to
support sntrup761 as experimental group 0xfe00 for TLS 1.3, and patched
inside libcrypto to include a reference implementation of sntrup761. Our
new engntru library then overrides this reference implementation with a
fast implementation, which in turn is built on top of our new
libsntrup761. This way of using the OpenSSL ENGINE feature allows
OpenSSL to take advantage of fast software implementations while
allowing those implementations to be developed in separate libraries;
see https://eprint.iacr.org/2018/354.
Various other applications that use OpenSSL have been verified to work
with libsntrup761 via engntru. This demo focuses on stunnel on the
server side and Epiphany on the client side.
### Server side
The following instructions for setting up the server side have been
tested in a VM running Debian 10 (Buster) on a CPU supporting AVX2.
You can skip down to the client side if you simply want to try
https://test761.cr.yp.to as the server.
As root:
apt install wget python3 build-essential clang cmake ruby pkg-config -y
adduser --disabled-password --gecos opensslntru opensslntru
As the new opensslntru user (change the first three lines for your own
demo server name, demo server address, and preexisting back-end server
address---of course, you should use your favorite VPN to protect the
connection from this SSL terminator to the back-end server):
EXTERNALNAME=test761.cr.yp.to
EXTERNALADDRESS=1.2.3.4:65024 # provide TLS service on this address
INTERNALADDRESS=5.6.7.8:80 # use existing server on this address
export PATH=$HOME/bin:$PATH
cd
wget https://www.openssl.org/source/openssl-1.1.1f.tar.gz
wget https://ntruprime.cr.yp.to/opensslntru/openssl-1.1.1f-ntru.patch
tar -xf openssl-1.1.1f.tar.gz
mv openssl-1.1.1f openssl-1.1.1f-ntru
cd openssl-1.1.1f-ntru
patch -p1 < ../openssl-1.1.1f-ntru.patch
./config shared --prefix=$HOME --openssldir=$HOME -Wl,-rpath=$HOME/lib
make -j8 # a few minutes
make test # more minutes
make install_sw
cd
wget https://ntruprime.cr.yp.to/opensslntru/libsntrup761-20200415.tar.gz
tar -xf libsntrup761-20200415.tar.gz
cd libsntrup761-20200415
env USE_RPATH=RUNPATH DESTDIR=$HOME CPATH=$HOME/include LIBRARY_PATH=$HOME/lib make all install test
cd
wget https://ntruprime.cr.yp.to/opensslntru/engntru-20200415.tar.gz
tar -xf engntru-20200415.tar.gz
cd engntru-20200415
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH="$HOME;$HOME/usr/local" ..
make
make test
make install
cd
wget https://www.stunnel.org/downloads/stunnel-5.56.tar.gz
tar -xf stunnel-5.56.tar.gz
cd stunnel-5.56
./configure --prefix=$HOME --with-ssl=$HOME LDFLAGS=-Wl,-rpath=$HOME/lib
make
make install
cd
mkdir service
cd service
openssl req -x509 -sha256 -nodes -newkey rsa:2048 -keyout "$EXTERNALNAME.key" -days 730 -out "$EXTERNALNAME.crt" -subj "/CN=$EXTERNALNAME" -config /etc/ssl/openssl.cnf
(
echo "key = $EXTERNALNAME.key"
echo "cert = $EXTERNALNAME.crt"
echo 'foreground = yes'
echo 'engine = engntru'
echo 'engineDefault = ALL'
echo '[forward]'
echo "accept = $EXTERNALADDRESS"
echo "connect = $INTERNALADDRESS"
echo 'curves = SNTRUP761:X25519:P-256'
echo 'config = MinProtocol:TLSv1.2'
echo 'ciphers = ECDHE+CHACHA20:ECDHE+AES256:ECDHE+AES128:!aNULL:!eNULL:!LOW:!EXPORT:!DES:!3DES:!RC4:!MD5:!PSK:!SRP:!DSS:!aECDSA'
) > stunnel.conf
As root:
(
echo '[Unit]'
echo 'Description=opensslntru forwarding'
echo 'DefaultDependencies=no'
echo 'After=network.target'
echo ''
echo '[Service]'
echo 'Type=simple'
echo 'User=opensslntru'
echo 'Group=opensslntru'
echo 'WorkingDirectory=/home/opensslntru/service'
echo 'ExecStart=/home/opensslntru/bin/stunnel stunnel.conf'
echo ''
echo '[Install]'
echo 'WantedBy=default.target'
) > /etc/systemd/system/opensslntru.service
systemctl restart opensslntru
At this point the server should be working. Try any browser to connect
to the server's external address. The certificate is self-signed;
signing it with Let's Encrypt is recommended but is outside the scope of
these instructions.
This stunnel configuration passes SNI along from the client to the
server, so the client is free to access any server name provided by the
server. For example, almost all *.cr.yp.to names are hosted on the same
back-end server and can now be retrieved through sntrup761, although for
the moment this is announced to the client (and signed) only for
test761.cr.yp.to. You can advertise multiple names on the same server
through the same stunnel configuration by adding those names to DNS and
creating an appropriate certificate. You can instead configure stunnel
to forward different SNI choices to different servers with different
certificates.
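For the last option, stunnel provides a service-level `sni = SERVICE:PATTERN` setting that attaches a secondary service to the service that owns the accept address. The following sketch writes such a section to a separate file in the same echo style as above. The name other.example.org, its key and certificate files, and the address 5.6.7.9:80 are placeholders invented for this example, and this fragment has not been tested with the demo; in particular, whether options such as curves need to be repeated in the new section has not been checked here.

```shell
#!/bin/sh
# ASSUMPTION: other.example.org, its key/cert files, and 5.6.7.9:80 are
# placeholders. The fragment goes to its own file so that the working
# stunnel.conf is left untouched until you decide to append it.
(
echo '[other]'
echo 'sni = forward:other.example.org'   # attach to the [forward] service
echo 'key = other.example.org.key'
echo 'cert = other.example.org.crt'
echo 'connect = 5.6.7.9:80'
) > stunnel-sni-fragment.conf
cat stunnel-sni-fragment.conf
```

The new section has no accept line of its own: connections still arrive on the address of the [forward] service, and stunnel selects the section whose pattern matches the SNI name sent by the client.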
### Client side
The following instructions for setting up the client side have been
tested in a VM running Debian 10 (Buster) on a CPU supporting AVX2.
As root:
apt install wget python3 build-essential clang cmake \
ruby pkg-config epiphany-browser meson gnome-pkg-tools \
libglib2.0-dev libproxy-dev \
gsettings-desktop-schemas-dev ca-certificates -y
adduser --disabled-password --gecos opensslntru opensslntru
As the new opensslntru user:
export PATH=$HOME/bin:$PATH
cd
wget https://www.openssl.org/source/openssl-1.1.1f.tar.gz
wget https://ntruprime.cr.yp.to/opensslntru/openssl-1.1.1f-ntru.patch
tar -xf openssl-1.1.1f.tar.gz
mv openssl-1.1.1f openssl-1.1.1f-ntru
cd openssl-1.1.1f-ntru
patch -p1 < ../openssl-1.1.1f-ntru.patch
./config shared --prefix=$HOME --openssldir=$HOME -Wl,-rpath=$HOME/lib
make -j8 # a few minutes
make test # more minutes
make install_sw
cd
wget https://ntruprime.cr.yp.to/opensslntru/libsntrup761-20200415.tar.gz
tar -xf libsntrup761-20200415.tar.gz
cd libsntrup761-20200415
env USE_RPATH=RUNPATH DESTDIR=$HOME CPATH=$HOME/include LIBRARY_PATH=$HOME/lib make all install test
cd
wget https://ntruprime.cr.yp.to/opensslntru/engntru-20200415.tar.gz
tar -xf engntru-20200415.tar.gz
cd engntru-20200415
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_PREFIX_PATH="$HOME;$HOME/usr/local" ..
make
make test
make install
cd
git clone --branch 2.60.2 https://gitlab.gnome.org/GNOME/glib-networking.git
cd glib-networking
mkdir build
cd build
env PKG_CONFIG_PATH=$HOME/lib/pkgconfig CPATH=$HOME/include LIBRARY_PATH=$HOME/lib meson --prefix=$HOME -Dopenssl=enabled -Dgnutls=disabled ..
ninja
ninja install
cd
wget https://ntruprime.cr.yp.to/opensslntru/openssl-engntru.cnf
export OPENSSL_CONF=$HOME/openssl-engntru.cnf
export LD_LIBRARY_PATH=$HOME/lib
export GIO_MODULE_DIR=$HOME/lib/x86_64-linux-gnu/gio/modules
export ENGNTRU_DEBUG=4 # to watch engntru activating
ln -s /etc/ssl/certs $HOME/certs
epiphany https://test761.cr.yp.to
You should be able to browse to this demo server (using sntrup761),
whichever other demo servers you set up above (using sntrup761), and
other sites (typically not using sntrup761 yet). The ENGNTRU_DEBUG=4 log
information in the terminal includes a note for each sntrup761 keygen, a
note for each sntrup761 dec, and a note for each computation of a batch
of 32 keys.