Redis scale limits/common pitfalls

Vaishaal Shankar

Jul 18, 2018, 2:02:52 AM
to Redis DB
Hello All,

I am a graduate student and I have a big experiment planned that uses Redis for some coordination. I plan on running Redis in single-node mode on an Amazon x1.32xlarge. It is a pretty beefy machine with 2 TB of RAM and 64 physical cores (Intel Xeon E7 8880 v3). These machines have a 25 Gbps link, and nothing will be running on them other than vanilla Redis (built from source from http://download.redis.io/redis-stable.tar.gz).

So the question I have is this (I won't be able to verify it until the day I get the resources to run the experiment): if I have 80,000 simultaneous clients (from up to 80,000 independent machines) mostly reading/writing keys to Redis with little key contention, are there any special configuration parameters I should worry about? *Most* of the requests will be simple unconditional reads/incrs/writes. There will be a few light transactions that conditionally increment a key based on the value of another key, but nothing heavier than that. Again, all values written are just integers. I expect the computation to run for around 8 hours (the number of clients will stay at around 80,000, but there will be some churn of clients coming and going).
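
For concreteness, here is roughly the shape of one of those conditional increments, written as a small Lua script run through EVAL so the read-check-increment happens atomically on the server. This is just a sketch; the key names and the threshold are made up:

    import redis

    r = redis.Redis(host="redis-host", port=6379)  # hypothetical host

    # Increment KEYS[2] only if KEYS[1] has reached ARGV[1]; runs atomically in Redis.
    CONDITIONAL_INCR = """
    local gate = tonumber(redis.call('GET', KEYS[1]) or '0')
    if gate >= tonumber(ARGV[1]) then
        return redis.call('INCR', KEYS[2])
    end
    return nil
    """
    conditional_incr = r.register_script(CONDITIONAL_INCR)

    # e.g. bump the counter only once the gate key has reached 10
    new_value = conditional_incr(keys=["gate:42", "counter:42"], args=[10])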

When I ran a small test with 10,000 clients I got the occasional request timeout, but I was able to alleviate that with some exponential backoff. So is there anything else I have to worry about? I have updated the Redis conf file to allow up to 500,000 clients (maxclients) and have checked the ulimits to make sure they aren't an issue. But I've only used Redis for these experiments for around a year now, and never at this scale, so I was wondering if the community has some best practices/common pitfalls for this kind of setup (is this even a reasonable scale for Redis?).
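
For reference, these are the knobs I touched; the values are illustrative rather than a recommendation. Redis lowers maxclients at startup (and logs a warning) if the process file-descriptor limit is too low, and the kernel truncates the listen backlog to somaxconn:

    # redis.conf (illustrative values)
    maxclients 500000
    tcp-backlog 65535      # raise net.core.somaxconn as well, or this gets truncated
    timeout 0              # don't drop idle client connections
    tcp-keepalive 300

    # on the host, before starting redis-server
    ulimit -n 600000                          # fd limit must exceed maxclients
    sudo sysctl -w net.core.somaxconn=65535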

Thanks,
Vaishaal Shankar

Marc Gravell

Jul 18, 2018, 2:46:20 AM
to redi...@googlegroups.com
Side note: you probably want to shard the data (perhaps Redis Cluster, perhaps basic sharding) - otherwise most of those cores will sit idle. Since the box is just for Redis, that feels like a huge waste. Also, you can run into problems with huge numbers of clients on a single port (ephemeral port exhaustion), and using multiple ports (one per instance) would mitigate that risk very nicely.
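
As a rough illustration of the "basic sharding" option - one redis-server per port on the same box, with each client hashing the key to decide which instance owns it (the ports, instance count and host name below are arbitrary):

    import zlib
    import redis

    # one redis-server process per port, e.g. started with: redis-server --port 7000
    PORTS = [7000 + i for i in range(16)]
    shards = [redis.Redis(host="redis-host", port=p) for p in PORTS]  # hypothetical host

    def shard_for(key: str) -> redis.Redis:
        # a stable hash of the key picks the owning instance
        return shards[zlib.crc32(key.encode()) % len(shards)]

    shard_for("counter:42").incr("counter:42")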

Marc

hva...@gmail.com

Jul 18, 2018, 8:05:12 AM
to Redis DB
Trivago blogged about the journey they had with Redis in production and the lessons they learned: "Learn Redis the hard way (in production)". They posted the link to the Redis subreddit and there was some discussion there as well: https://www.reddit.com/r/redis/comments/5q5ddr/learn_redis_the_hard_way_in_production/ . Trivago made use of twemproxy because their client programs (written in PHP) would not keep connections open to Redis. In my opinion it's better to use a client that keeps its connection to Redis open than to introduce a proxy in between.
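
In a client that does keep its connection open, that just means creating the connection (or a small pool) once at startup and reusing it for every request, rather than reconnecting each time. A minimal redis-py sketch, with the host and pool size made up:

    import redis

    # created once at process start-up; connections are reused across requests
    pool = redis.ConnectionPool(host="redis-host", port=6379, max_connections=4)
    r = redis.Redis(connection_pool=pool)

    r.incr("some:counter")  # no connect/teardown cost per call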

Since your keys will tend to contain integer values, the Instagram blog post may also suggest a way to economize on the amount of RAM required, and perhaps reduce the size (i.e. cost) of the AWS server you need. The post is "Storing Hundreds of Millions of Simple Key-Value Pairs in Redis". It is pretty old (Oct 2011) and the relevant config parameters have been renamed from "zipmap" to "ziplist", but the core principle for reducing the RAM consumption of many integer values is still effective. It's something you have to test yourself, though: Instagram's use case was heavy on reads and light on writes, and you might not get the same results with yours.
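
The idea from that post, roughly: instead of one top-level key per integer, group the values into small hashes so they stay in Redis's compact ziplist encoding. A sketch of what that could look like; the bucket size and key scheme are invented, and hash-max-ziplist-entries has to be at least the bucket size for the saving to apply:

    import redis

    r = redis.Redis(host="redis-host", port=6379)  # hypothetical host

    # redis.conf: hash-max-ziplist-entries 1024, hash-max-ziplist-value 64
    BUCKET = 1000  # fields per hash; keep below hash-max-ziplist-entries

    def set_value(key_id: int, value: int):
        # e.g. key 1234567 -> hash "bucket:1234", field "567"
        r.hset(f"bucket:{key_id // BUCKET}", key_id % BUCKET, value)

    def get_value(key_id: int):
        return r.hget(f"bucket:{key_id // BUCKET}", key_id % BUCKET)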

Vaishaal Shankar

Jul 18, 2018, 9:05:23 AM
to redi...@googlegroups.com
Ah, so this was one of the things I was unsure about. Since I have a few of these conditional increments that absolutely need to be atomic, I wasn't sure how sharding would play into this. Do you mean running multiple Redis processes on one machine?


Marc Gravell

Jul 19, 2018, 2:02:38 PM
to redi...@googlegroups.com
Yes, I mean running multiple Redis processes on one machine. The core of Redis is single-threaded - it is very fast, but to make effective use of multiple cores you currently need multiple processes.
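
One caveat for those conditional increments: all keys touched by a single MULTI/EXEC or Lua script have to live in the same process for the operation to stay atomic. A common way to arrange that is to hash only a shared part of the key when picking the shard, the same idea as Redis Cluster's {hash tags}. A sketch, with the tag convention, host and ports made up:

    import zlib
    import redis

    shards = [redis.Redis(host="redis-host", port=7000 + i) for i in range(16)]  # hypothetical

    def shard_for(key: str) -> redis.Redis:
        # hash only the part inside {...} so related keys map to the same instance
        tag = key.split("{", 1)[1].split("}", 1)[0] if "{" in key else key
        return shards[zlib.crc32(tag.encode()) % len(shards)]

    # "{job42}:gate" and "{job42}:counter" always land on the same process,
    # so a transaction or script touching both remains atomic
    assert shard_for("{job42}:gate") is shard_for("{job42}:counter")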

Nick Farrell

Jul 23, 2018, 7:51:24 AM
to Redis DB
Out of the box, Redis is configured with 16 databases. By default you connect to database 0, but it's trivial to use another one, and it's easy enough to raise the "databases" setting in the config if you want 64. Though given your load will not be spread completely evenly, you probably want a slightly higher, prime number of databases, say 79.

However, as mentioned above, each instance of Redis will be listening on a single port, hence the value of running multiple instances. But then you realise that you may as well get 10 VMs with 8 CPUs each and "only" 256 GB of RAM each. Not only would the units be cheaper, but you would avoid the networking bottlenecks too.

The other thing to consider is whether you need to push everything to Redis immediately. If there is little contention, can you not aggregate the values on the 80k clients and push the results periodically, rather than on each operation?
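
Something along these lines, if a flush every few seconds is acceptable for the experiment; the interval and key names are invented:

    from collections import Counter
    import redis

    r = redis.Redis(host="redis-host", port=6379)  # hypothetical host
    pending = Counter()                            # increments not yet pushed to Redis

    def record(key: str, amount: int = 1):
        pending[key] += amount                     # local only, no network round trip

    def flush():
        pipe = r.pipeline(transaction=False)       # one round trip for the whole batch
        for key, amount in pending.items():
            pipe.incrby(key, amount)
        pipe.execute()
        pending.clear()

    # call record() from the hot path and flush() on a timer, e.g. every 5 seconds
    record("counter:42")
    record("counter:42", 3)
    flush()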
