Re: Replication for read-scalability

Re: Replication for read-scalability Josiah Carlson 2/1/12 10:06 AM
In SSH, SSL, and OpenVPN, the expensive part is the initial connection
handshake, where all of the RSA/DSA work occurs. After the initial key
exchange (with signature checks determined by the protocol),
everything is transferred by one of a few fairly high-performance
symmetric algorithms. If you are really concerned about performance,
use RC4 or AES; the former will do 350 megs/second on a single core of
a 2.4 GHz Core 2 Duo, with AES-128 at 180 megs/second.

You can benchmark them on your platform using: 'openssl speed rc4 aes
blowfish des'.

Despite the bad press against RC4 due to the WEP cracking, RC4 itself
is solid; the insecurity came from WEP's improper implementation,
which ignored almost every single "if you want security, don't do X"
paper published before it. If you have a little spare processor and
want the added "the NSA says it's safe" assurance, go with AES-128 or
AES-256.

All of these options can be set on a per-protocol basis in each of the
systems. The SSH cipher can be set via the "-c <alg>" command-line
option. For SSL through stunnel, use the 'ciphers' configuration
option (read http://vincent.bernat.im/en/blog/2011-ssl-benchmark-round2.html
for how to optimize performance with various SSL endpoints if you need
the highest handshake throughput; its predecessor article mentions
that Google uses RC4-SHA1 for their SSL encryption). You can also use
NGINX as a TCP tunnel and have it work very well without much
configuration. OpenVPN likewise has a 'cipher' configuration option,
which you can see if you look at sample configurations (use this
script to help you create and distribute client keys:
https://gist.github.com/1323978).
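To make those concrete, here is roughly what each one looks like; a
sketch only, with placeholder hostnames and the cipher names from
above:

  # SSH: pick the cipher on the command line ('arcfour' is OpenSSH's RC4)
  ssh -c arcfour user@master-host

  # stunnel (stunnel.conf): restrict the OpenSSL cipher suites
  ciphers = RC4-SHA

  # OpenVPN (server/client config): set the data-channel cipher
  cipher AES-128-CBC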

In terms of performance, CPU, etc.: use encryption for
cross-datacenter traffic, and within the DC if you are paranoid. CPU
use is effectively low enough not to matter as long as you have a
spare CPU and 20 megs of extra memory (no EC2-small boxes, but an EC2
high-CPU medium does very well). Both SSH and OpenVPN can be set up
with transparent TUN interface support, so you just connect to a
different IP instead of a tunneled port. If you want specific port
tunneling, stunnel/NGINX/SSH can do it (if you use SSH, remember to
set up password-less logins:
http://lani78.wordpress.com/2008/08/08/generate-a-ssh-key-and-disable-password-authentication-on-ubuntu-server/
and consider autossh's -M option if you want the tunnel to reconnect
automatically).
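As a minimal sketch of the SSH port-tunneling approach for Redis
replication (this assumes key-based logins are already set up; the
hostname and local port 6380 are placeholders):

  # on the slave box: forward local port 6380 to the master's Redis,
  # with autossh watching the tunnel and reconnecting if it drops
  autossh -M 20000 -f -N -L 6380:127.0.0.1:6379 user@master-host

  # then point the slave at the local end of the tunnel
  redis-cli slaveof 127.0.0.1 6380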

If anyone has specific questions about specific software, ask here or
off-list, and I'll try to answer or point you to the right
information. Either way, I should probably blog about this stuff,
having set up and/or used all three daily for the last 2 years. Maybe
this weekend.

Regards,
 - Josiah

On Wed, Feb 1, 2012 at 7:01 AM, Jak Sprats <jaks...@gmail.com> wrote:
> Hi Pedigree & Josiah,
>
> Having an SSH-enabled WAN Redis setup is something that a lot of people
> may benefit from knowing about. If there is a very simple,
> easy-to-set-up, reliable way to do this, it would be a great way to
> have a cold backup system in, say, another EC2 region. Most people
> don't realise how simple stunnel is to configure for this type of stuff.
>
> I would be interested in knowing how much bigger the machines running
> the ssh-tunnel have to be compared to the redis-server & the slave.
> The tunnel must require a lot more CPU than the actual serving :)
> Which raises the question of what the optimal setup is for the 4 moving
> parts [redis-server, master-stunnel, slave-stunnel, slave].
>
> - jak
>
> On Jan 31, 1:38 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>> On Tue, Jan 31, 2012 at 4:57 AM, pedigree <p...@stopforumspam.com> wrote:
>> > We run a master redis server, with a localhost slave that has all the
>> > dangerous commands disabled (flush etc.). 4 nodes replicate from this
>> > slave, via stunnel, from Europe/USA and Asia. Everything works really
>> > well apart from one issue with the master. If we just copied the
>> > master rdb/aof files in order to back them up, the slaves would
>> > randomly disconnect and would require stunnel to be restarted on the
>> > master server. A slave-node stunnel restart wouldn't work, which is
>> > strange, as MySQL is replicating over the same stunnel instance. The
>> > fix we put in place was the local slave, backing up the replica
>> > instead. Since putting that in place, there hasn't been an issue with
>> > a fatal disconnect (running 5 weeks). As our memory set is 250MB, it
>> > doesn't take a huge amount of time to replicate, as stunnel is set
>> > to compress.
>>
>> If you are having trouble with stunnel, you may want to check out
>> OpenVPN. It has been very solid for me for the last couple of years, and
>> it manages to recover very well from network instability. I've not
>> done Redis replication over it, but I have used it for NFS mounts,
>> Samba mounts, and countless multi-gig sftp transfers. Point-to-point
>> links do have to go through a central router, but if you set your
>> central router to be your Redis master, then you have the same number
>> of hops as your current stunnel configuration.
>>
>> Regards,
>>  - Josiah
>
> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Re: Replication for read-scalability Jak Sprats 2/1/12 8:13 PM
Hi Josiah,

Wow, that was a wealth of information ... you could just take what you
wrote here, clean it up, and you'd have a blog post.

Thanks for the info. If you do blog on it, please post it here so we
can all read it.

- jak

Re: Replication for read-scalability Josiah Carlson 2/1/12 8:53 PM
On Wed, Feb 1, 2012 at 8:30 PM, Jak Sprats <jaks...@gmail.com> wrote:
> Hi Josiah,
>
> Looking at the benchmarks in the article
> http://vincent.bernat.im/en/blog/2011-ssl-benchmark-round2.html
> the max numbers are about 2.5K TPS with a 1K request/response
> payload, on an 8-core system effectively using all of its cores.
>
> At a 1K payload Redis can do maybe 10x that throughput (i.e. about
> 25K TPS) ... I think Didier has some numbers on Redis speeds at
> different payload sizes (I don't know how comparable the CPUs are).
>
> So there is a 10-to-1 gap, which would result in a backlog (i.e. the
> stunnel/stud/nginx tunnel could not keep up).

Here's the thing: that benchmark is creating and throwing away
connections continuously. If you are using SSH or OpenVPN, all of your
raw Redis connections are going to be funneled through a single TCP
(or UDP, in the case of OpenVPN's suggested configuration) socket,
over which the handshaking has already occurred. At that point you are
just looking at latencies on top of your requests. Those latencies may
be high (I haven't tested, because I haven't had a need for high
numbers of commands through an encrypted connection).

If you are replicating data from server A to B (or from A to B,C,...),
the initial connection latencies are going to be relatively
insignificant for the overall behavior, and your concern is primarily
the long-held connection overhead (encryption/decryption, not
handshaking).
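For reference, a minimal sketch of the replication-over-stunnel setup
being discussed (stunnel 4.x-style configs; the service name, port
6390, hostname, and cert path are all placeholders):

  # master-side stunnel.conf: accept SSL, forward to the local Redis
  cert = /etc/stunnel/redis.pem
  [redis-repl]
  accept = 6390
  connect = 127.0.0.1:6379

  # slave-side stunnel.conf: local plaintext port, SSL out to the master
  client = yes
  [redis-repl]
  accept = 127.0.0.1:6390
  connect = master-host:6390

  # on the slave: redis-cli slaveof 127.0.0.1 6390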

> I don't have the hardware to test this stuff, otherwise I just would. I
> have some use cases where I need WAN replication, but I have not
> benchmarked them yet, and these numbers scare me a little.

Don't let them. Unless your slave is reconnecting thousands of times a
second, you won't notice: the initial connection setup is roughly
200-1000x more expensive than moving the same data over an established
connection. Remember, they are doing 2048-bit arithmetic in order to
set up the key exchange. Once the exchange is complete, they
immediately switch to RC4, AES, etc. You can check the speed of
RSA/DSA operations via 'openssl speed rsa dsa'. That cost is paid only
at connection setup, which is exactly why you *don't* want to use
stunnel for general clients connecting to Redis (for replication it's
okay).

To be more specific, this is the output I get from 'openssl speed rsa':
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000118s 0.000011s   8443.6  90975.9
rsa 1024 bits 0.000594s 0.000031s   1683.2  32049.2
rsa 2048 bits 0.003674s 0.000108s    272.2   9246.8
rsa 4096 bits 0.025601s 0.000404s     39.1   2473.4

Given that 1024 bits is 128 bytes, using 1024-bit RSA encryption I can
only push roughly 215K/second. It's slow. Really slow. For the sake of
comparison, I've got a pure Python RC4 implementation that can do
>4M/second on a 1.2 GHz Core Solo. But they *have* to do that
expensive math to exchange session keys (that's where the security
comes from). Once they've exchanged session keys, they switch to
RC4/AES and push 130/180/350 megs/second.
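Spelling out the arithmetic behind those numbers:

  1683.2 signs/sec * 128 bytes/sign ~= 215 KB/sec through RSA-1024
  vs. 130-350 megs/sec for the symmetric ciphers above

a gap of two to three orders of magnitude, which is why the handshake
only matters if you are churning connections.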

> I am kind of guessing at all of these numbers, so none of this is
> exact, but I have the feeling that the number of cores it takes to
> encrypt Redis' CRUD is much higher (maybe 10x) than the number of
> cores (=1) the redis-server uses. You can throw HAProxy into the mix
> to get a 10-to-1 split, but then you kill replication
> serialisability, so that is a no-go: CRUD statements cannot arrive at
> a slave out-of-order.
>
> I get the feeling that replicating over a WAN with encryption may
> have a much lower maximum throughput than we would all want, and it
> seems like a tough problem to engineer around because of replication
> serialisability. Maybe batching commands (pipelining) would help
> somewhat, although this inherently increases replication lag :(.

Don't throw the baby out with the bathwater! Those are the costs for
creating connections, not for long-lived slave connections.

> If anyone has real-world numbers on maximum WAN replication speed,
> please chime in; needing a backup in a spare EC2 zone is a good
> idea :)

I think I'll have time to write the blog post on Saturday, which will
include some numbers from at least a couple boxes in EC2.

 - Josiah