concurrent writes/reads

zzz

Oct 26, 2011, 7:04:00 PM
to Redis DB
What's the typical number of concurrent reads and writes that Redis
can support?

I have an application that would potentially do 100 concurrent reads
and 100 concurrent writes every 5 seconds, but each read and write
essentially touches all keys in the database (about 10k). Is this
something that Redis can support? And what's the best pattern to do
this (e.g., pipelining, transactions, or some other mechanism)?

thanks.

Josiah Carlson

Oct 26, 2011, 7:09:59 PM
to redi...@googlegroups.com
What you are doing isn't 100 reads + 100 writes per 5 seconds; you are
instead doing 100 * 10,000 / 5 = 200k reads and 200k writes per second,
or 400k total operations per second. I am not aware of anyone achieving
that kind of throughput with a single Redis instance, but I have heard
of people doing roughly half that. Can you shard?

If you could explain a bit about what you are using Redis for, we may
be able to offer better ways of building your application that don't
require so many reads/writes.

Regards,
- Josiah

zzz

Oct 26, 2011, 9:10:37 PM
to Redis DB
So the scenario is that I have a bunch of clients that are updating a
central data cache and reading from it as well. Note that 10k is the
maximum number of keys; it's more likely to be around 2-3k, and not
every read/write will touch all keys, though in the worst case they
will touch them all. Sharding is not an option at the moment.

I can probably relax the update frequency to around 10-20 seconds,
and let's say the key space is around 3k, so 100 * 3000 / 20 = 15k
operations per second. I assume this is easily handled by Redis? Is
pipelining the best option for the clients? And what's the latency
like? (Data size is about 200 bytes per value.)

Josiah Carlson

Oct 27, 2011, 12:24:55 AM
to redi...@googlegroups.com
On Wed, Oct 26, 2011 at 6:10 PM, zzz <chad...@gmail.com> wrote:
> So the scenario is that I have a bunch of clients that are updating a
> central data cache and reading from it as well. Note that 10k is the
> maximum number of keys; it's more likely to be around 2-3k, and not
> every read/write will touch all keys, though in the worst case they
> will touch them all. Sharding is not an option at the moment.

You've not really explained your problem, so it's very difficult for
us to help you bring the load down.

> I can probably relax the update frequency to around 10-20 seconds,
> and let's say the key space is around 3k, so 100 * 3000 / 20 = 15k
> operations per second. I assume this is easily handled by Redis? Is
> pipelining the best option for the clients? And what's the latency
> like? (Data size is about 200 bytes per value.)

Pipelining without MULTI/EXEC will be your fastest option. Assuming
that your Redis can handle 100k operations/second (roughly what I get
on a Core 2 at 2.5 GHz), and assuming each client is writing 3k keys,
that would put you at roughly 30ms per client write and another 30ms
per client read, for a total of around 6 seconds of Redis time for
those 100 clients every 20 seconds. If you let them stagger, that's
only 5 clients per second, or about 300ms busy every second with cache
updates. That's not unreasonable.
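
A rough sketch of that pattern, assuming the Python redis-py client and
illustrative key names (both assumptions, not from this thread): each
client pushes its writes and reads everything back through a single
non-transactional pipeline.

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def flush_and_refresh(local_updates, all_keys):
        # Pipelining without MULTI/EXEC: queue every command, send them
        # in big batches, and collect all replies at once.
        pipe = r.pipeline(transaction=False)
        for key, value in local_updates.items():
            pipe.set(key, value)          # this client's writes
        for key in all_keys:
            pipe.get(key)                 # refresh the local view
        replies = pipe.execute()
        # Replies after the SET acknowledgements are the current values.
        return dict(zip(all_keys, replies[len(local_updates):]))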

Regards,
- Josiah

zzz

Oct 27, 2011, 1:36:17 AM
to Redis DB
Let's try again.

Take a simple example where you have a set of products, and the central
data cache has the available units for each product. Essentially, you
have <product_id, # of units>, and let's say there are 10k types of
products. We store these in Redis, and we have a distributed set of
servers (100) that are selling these products; each needs to update
the central data cache with the units sold every 5 seconds, and read
the remaining units left. Let's say that we can tolerate a bit of
inaccuracy (1%) in the numbers. The global data cache would serve as a
central dashboard/database.
Because each server/site might vary in terms of the number of products
and units sold, the actual write might be a lot smaller, but the read
would almost always be 10k, unless a server/site decides that it
doesn't want to sell some product.

Does this make sense?

Pedro Melo

Oct 27, 2011, 1:43:09 AM
to redi...@googlegroups.com
Hi,

On Thu, Oct 27, 2011 at 6:36 AM, zzz <chad...@gmail.com> wrote:
> Take a simple example where you have a set of products, and the central
> data cache has the available units for each product. Essentially, you
> have <product_id, # of units>, and let's say there are 10k types of
> products. [...] The global data cache would serve as a central
> dashboard/database.

I must be missing something: why can't those 100 servers just INCRBY a
counter, with a negative value from time to time? Why read/write
everything?

Bye,
--
Pedro Melo
@pedromelo
http://www.simplicidade.org/
http://about.me/melo
xmpp:me...@simplicidade.org
mailto:me...@simplicidade.org

zzz

Oct 27, 2011, 1:56:20 AM
to Redis DB
That's what I'm planning on doing for the writes, but that's still 10k
INCR calls potentially. They'd still need to read -- because others
might have changed the value that they last had.

Josiah Carlson

Oct 27, 2011, 4:06:15 AM
to redi...@googlegroups.com
I can see 2 reasons to have all of those numbers up-to-date on any given server:
1. searches on the server rely on having those numbers correct for
"item in stock"
2. people are browsing the site sufficiently often that knowing the
precise number is important

If you only had a handful of servers, I'd say everyone should just
write to the master, and you could use slaves on your local hosts to
have that data locally. But at 100 servers, you start really needing
to have 2 levels of slaving, which can get nasty very quickly.

Your servers could cache the real known number at two recent specific
times (or a dozen), which they could then use to project a linear
function; that would predict reasonably well for much longer than 20
seconds. Of course, that is under the assumption that your volume in
any particular item is relatively low, or at least somewhat
predictable. However, because your volume is actually quite high (as
the 100 servers and 3k writes per server every 20 seconds imply), if
you had a sample every minute for 5+ minutes, your prediction should
be pretty good.

So yeah. Sampling and prediction could reasonably get you down to
refreshing once per minute or less.
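
A rough sketch of that sampling idea in Python (the class name, sample
window, and guard values are illustrative assumptions, not from the
thread): keep the last few (timestamp, count) samples per product and
extrapolate a line between refreshes.

    import time

    class StockEstimator:
        """Hypothetical local estimator; names and window are illustrative."""
        def __init__(self, max_samples=5):
            self.samples = []            # list of (timestamp, observed_count)
            self.max_samples = max_samples

        def record(self, count, now=None):
            # Call this whenever a fresh value is read from Redis.
            self.samples.append((now or time.time(), count))
            self.samples = self.samples[-self.max_samples:]

        def estimate(self, now=None):
            # Linear extrapolation between the oldest and newest samples.
            if len(self.samples) < 2:
                return self.samples[-1][1] if self.samples else None
            (t0, c0), (t1, c1) = self.samples[0], self.samples[-1]
            if t1 == t0:
                return c1
            rate = (c1 - c0) / (t1 - t0)   # units sold (or added) per second
            return c1 + rate * ((now or time.time()) - t1)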

Regards,
- Josiah

catwell

Oct 27, 2011, 4:23:47 AM
to Redis DB
On Oct 27, 7:56 am, zzz <chad....@gmail.com> wrote:

> That's what I'm planning on doing for the writes, but that's still 10k
> INCR calls potentially. They'd still need to read -- because others
> might have changed the value that they last had.

You don't need to read keys you INCR since INCR returns the new value.

Otherwise, a fast way to do this would be to use scripting and
implement a command (call it XXX here) that does:

XXX set_of_keys key1 incr1 [key2 incr2 ...]

The set of keys contains all the keys you track. The script:

1) increments every key keyN by incrN
2) returns key/value pairs for all the keys in set_of_keys

catwell

Oct 27, 2011, 4:25:35 AM
to Redis DB
On Oct 27, 10:23 am, catwell <catwell-goo...@catwell.info> wrote:

> XXX set_of_keys key1 incr1 [key2 incr2 ...]

To be clear: that's the theoretical command. In practice, the
scripting invocation will look something like:

EVALSHA my_sha N+1 set_of_keys key1 ... keyN incr1 ... incrN
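
Something like the following might work as a sketch of that script,
assuming Redis 2.6+ scripting and the Python redis-py client (the set
name, key names, and layout are illustrative assumptions, not from the
thread):

    import redis

    r = redis.Redis()

    # Hypothetical version of catwell's XXX command: KEYS[1] is a set that
    # holds the names of every tracked counter, KEYS[2..] are the counters
    # this caller wants to change, ARGV[1..] are the matching increments.
    incr_and_dump = r.register_script("""
    for i = 2, #KEYS do
      redis.call('INCRBY', KEYS[i], ARGV[i - 1])
    end
    local reply = {}
    for _, k in ipairs(redis.call('SMEMBERS', KEYS[1])) do
      reply[#reply + 1] = k
      reply[#reply + 1] = redis.call('GET', k) or '0'
    end
    return reply
    """)

    # Apply this server's deltas and read back every tracked counter in one call.
    flat = incr_and_dump(keys=["tracked:products", "stock:42", "stock:99"],
                         args=[-3, -1])
    counts = dict(zip(flat[0::2], flat[1::2]))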

Salvatore Sanfilippo

Oct 27, 2011, 4:49:42 AM
to redi...@googlegroups.com
Hi zzz, yes that is easily handled, but I suspect that if you tell us
*what* you want to do instead of telling us *how* you would do it, we
can help a lot more. If you can't disclose the exact case, just invent
a similar problem that has the same needs.

Maybe you can use a single string for all the 3k entries, using
SETRANGE/GETRANGE, or maybe even a bitmap.
It depends on the exact problem you have.
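
For what it's worth, a rough sketch of the packed-string idea, assuming
Python/redis-py and a fixed slot per product (key name and field width
are illustrative assumptions): every counter lives in a fixed-width
field of one string value, so a full refresh is a single GET and a
single counter is one GETRANGE.

    import redis

    r = redis.Redis()
    FIELD = 8                   # 8-byte zero-padded decimal field per product
    KEY = "stock:packed"        # hypothetical key holding all counters

    def write_count(product_slot, count):
        # Overwrite just this product's field inside the shared string.
        r.setrange(KEY, product_slot * FIELD, str(count).zfill(FIELD))

    def read_one(product_slot):
        start = product_slot * FIELD
        raw = r.getrange(KEY, start, start + FIELD - 1)
        return int(raw or 0)

    def read_all(num_products):
        blob = r.get(KEY) or b""
        return [int(blob[i * FIELD:(i + 1) * FIELD] or 0)
                for i in range(num_products)]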

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Pedro Melo

Oct 27, 2011, 5:29:18 AM
to redi...@googlegroups.com
On Thu, Oct 27, 2011 at 6:56 AM, zzz <chad...@gmail.com> wrote:
> That's what I'm planning on doing for the writes, but that's still 10k
> INCR calls potentially. They'd still need to read -- because others
> might have changed the value that they last had.

INCR will return the value after the increment operation, so you can
combine at least that read into the INCR op. You might need to GET
only if you don't need to INCR, so it's at most one operation per key
per X seconds per server.
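
As a tiny illustrative sketch (Python/redis-py assumed; the key name is
made up), the write already hands back the fresh global value:

    import redis

    r = redis.Redis()
    # Atomic on the server: apply this server's sale and get the new
    # global count back in the same round trip, so no separate GET is
    # needed for keys this server touched.
    remaining = r.incrby("stock:42", -3)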

Josiah Carlson

Oct 27, 2011, 11:36:49 AM
to redi...@googlegroups.com
On Thu, Oct 27, 2011 at 1:23 AM, catwell <catwell...@catwell.info> wrote:
> On Oct 27, 7:56 am, zzz <chad....@gmail.com> wrote:
>
>> That's what I'm planning on doing for the writes, but that's still 10k
>> INCR calls potentially. They'd still need to read -- because others
>> might have changed the value that they last had.
>
> You don't need to read keys you INCR since INCR returns the new value.

Unless other servers alter values that are not being changed by a
given calling server...

Regards,
- Josiah

Pedro Melo

Oct 27, 2011, 11:45:41 AM
to redi...@googlegroups.com
Hi,

On Thu, Oct 27, 2011 at 4:36 PM, Josiah Carlson
<josiah....@gmail.com> wrote:
> On Thu, Oct 27, 2011 at 1:23 AM, catwell <catwell...@catwell.info> wrote:
>> On Oct 27, 7:56 am, zzz <chad....@gmail.com> wrote:
>>
>>> That's what I'm planning on doing for the writes, but that's still 10k
>>> INCR calls potentially. They'd still need to read -- because others
>>> might have changed the value that they last had.
>>
>> You don't need to read keys you INCR since INCR returns the new value.
>
> Unless other servers alter values that are not being changed by a
> given calling server...

Huhs?

INCR is atomic; if every server uses INCR you will always get back
the correct value. What am I missing?

Didier Spezia

Oct 27, 2011, 1:05:07 PM
to redi...@googlegroups.com
Hi,

I don't want to comment on the design, only complete the answers to your
original questions.

Redis does serialize all operations because of its single-thread design.
So at a given point in time, it can only handle one single operation (read or write).

Redis can easily support more than 10000 concurrent connections if you run it
on Linux or BSD (due to the epoll/kqueue event loop implementation). The exact
limit depends on your TCP configuration.

The limiting factor is the total number of operations per second (triggered by all
the connections). With an Intel CPU (>=Nehalem), Redis is able to sustain
more than 100 Kops/s (tested with Redis benchmark). Because everything is
serialized, the nature of the workload (pure read, pure write, or mixed
read/write) does not have much impact.

Now if heavy pipelining is used, and with recent powerful CPUs, more throughput
can be achieved. You can have a look at the following benchmark result: we were
able to achieve 300 KOps/s in write and 400 KOps/s in read on a single core of
an Intel X5670 (2.93 GHz) using basic hiredis features + Unix domain sockets.


Contrary to what most people think, it is actually easy to get more throughput
than redis-benchmark with a simple program, provided pipelining is used.

If you can use aggregated commands (MGET, MSET, etc.), it is even better.
With Redis 2.4, you have a good number of them. The optimal get/set throughput
is probably reached with a combination of aggregated commands plus pipelining.
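
A small sketch of the aggregated-read side, assuming Python/redis-py and
illustrative names: a full refresh of a few thousand counters becomes a
handful of MGET calls instead of thousands of GETs.

    import redis

    r = redis.Redis()

    def read_all(keys, batch=1000):
        # Aggregated reads: each MGET fetches up to `batch` counters in
        # one command; combine with pipelining for even fewer round trips.
        values = []
        for i in range(0, len(keys), batch):
            values.extend(r.mget(keys[i:i + batch]))
        return dict(zip(keys, values))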

Regards,
Didier.

zzz

Oct 27, 2011, 2:04:08 PM
to Redis DB
These are the exact two cases that we want to support -- both of them
relying on the global number being relatively accurate.

I don't think that sampling would work, as the pattern isn't quite
linear.

Salvatore, I don't understand how SETRANGE would help here.

Catwell, do you mean to implement server side scripting?

Josiah Carlson

Oct 27, 2011, 4:55:12 PM
to redi...@googlegroups.com
On Thu, Oct 27, 2011 at 8:45 AM, Pedro Melo <me...@simplicidade.org> wrote:
> Hi,
>
> On Thu, Oct 27, 2011 at 4:36 PM, Josiah Carlson
> <josiah....@gmail.com> wrote:
>> On Thu, Oct 27, 2011 at 1:23 AM, catwell <catwell...@catwell.info> wrote:
>>> On Oct 27, 7:56 am, zzz <chad....@gmail.com> wrote:
>>>
>>>> That's what I'm planning on doing for the writes, but that's still 10k
>>>> INCR calls potentially. They'd still need to read -- because others
>>>> might have changed the value that they last had.
>>>
>>> You don't need to read keys you INCR since INCR returns the new value.
>>
>> Unless other servers alter values that are not being changed by a
>> given calling server...
>
> Huhs?
>
> INCR is atomic, if every servers uses INCR you will always get back
> the correct value. What am I missing?

He has 100 servers. Just because server X does an INCR on key Y
doesn't mean that the other 99 servers will have that information.
That's why he's got to perform reads, which seems to be roughly 1/2 of
his load.

- Josiah

zzz

Oct 27, 2011, 7:58:24 PM
to Redis DB
Yes.

One thing I'm wondering is whether there is a way to optimize things on
the Redis side so that a counter that hasn't been updated since the last
read isn't returned/read. What's the best way to do that in Redis? I.e.,
something like a GET_SINCE_LAST_MODIFIED(timestamp) command.

Josiah Carlson

Oct 27, 2011, 8:08:03 PM
to redi...@googlegroups.com
On Thu, Oct 27, 2011 at 4:58 PM, zzz <chad...@gmail.com> wrote:
> Yes.
>
> One thing I'm wondering is whether there is a way to optimize things on
> the Redis side so that a counter that hasn't been updated since the last
> read isn't returned/read. What's the best way to do that in Redis? I.e.,
> something like a GET_SINCE_LAST_MODIFIED(timestamp) command.

You could put update times in a zset, then do a ZRANGEBYSCORE followed
by a GET, but you would do twice as many writes. If your writes are
significantly fewer than your reads, that may be a win.
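
A sketch of that approach, assuming a recent redis-py (3.x zadd
signature) and an extra sorted set scored by update time (the names are
illustrative): every write pays one extra ZADD, and readers only fetch
keys whose score is newer than their last sync.

    import time
    import redis

    r = redis.Redis()
    MODIFIED = "stock:last_modified"   # hypothetical zset of update times

    def write(key, delta):
        pipe = r.pipeline(transaction=False)
        pipe.incrby(key, delta)
        pipe.zadd(MODIFIED, {key: time.time()})  # the extra write per update
        return pipe.execute()[0]                 # post-increment value

    def read_changed_since(last_sync):
        changed = r.zrangebyscore(MODIFIED, last_sync, "+inf")
        return dict(zip(changed, r.mget(changed))) if changed else {}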

- Josiah

Pedro Melo

Oct 28, 2011, 1:21:18 AM
to redi...@googlegroups.com
Hi,

Ah, he wants the other servers to know that the value has changed.

In that case I would suggest that each server keep a second connection
for pub/sub and publish the new value on a channel. Is it better than
what he currently has? It depends a lot on the rate of change across
all the servers...
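
A minimal sketch of that variant, assuming Python/redis-py and a made-up
channel name: each server publishes the fresh value right after its
INCRBY, and keeps a second connection subscribed to the channel.

    import redis

    r = redis.Redis()
    CHANNEL = "stock:updates"   # hypothetical channel name

    def sell(product_key, qty):
        new_value = r.incrby(product_key, -qty)
        r.publish(CHANNEL, "%s=%d" % (product_key, new_value))  # fan out the change
        return new_value

    def listen(local_cache):
        pubsub = r.pubsub()          # second connection, dedicated to pub/sub
        pubsub.subscribe(CHANNEL)
        for message in pubsub.listen():
            if message["type"] != "message":
                continue
            key, _, value = message["data"].decode().rpartition("=")
            local_cache[key] = int(value)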

Josiah Carlson

Oct 28, 2011, 2:21:18 AM
to redi...@googlegroups.com

If he uses pubsub, he may as well run Redis slaves everywhere. Then
he's only got one remote connection from each box, moving the same
data as pubsub would, probably offering much better overall latency,
less code to write, etc. But I already talked about why that might not
be a good solution earlier.

Regards,
- Josiah

zzz

Oct 28, 2011, 2:41:27 AM
to Redis DB
A couple of questions related to this: what about the size of the
pipeline -- how many commands should one pipeline (i.e., 1000s of
commands or 100s of commands)? Or does it not make any difference?

In my scenario, does the number of concurrent clients (servers) impact
Redis performance? Typically, how does one tune such a setup?

thanks

Josiah Carlson

Oct 28, 2011, 3:48:01 AM
to redi...@googlegroups.com
On Thu, Oct 27, 2011 at 11:41 PM, zzz <chad...@gmail.com> wrote:
> A couple of questions related to this: what about the size of the
> pipeline -- how many commands should one pipeline (i.e., 1000s of
> commands or 100s of commands)? Or does it not make any difference?

If you want to maximize throughput, the more commands the better. I've
typically tried to stick with roughly 1-5k commands at a time at the
most, which seemed to work well for our choice of Redis servers and
clients. Depending on your network topology, speed, latency, etc.,
that may creep up for optimum performance (for higher-latency
networks, large ethernet frames, etc.).

> In my scenario, does the number of concurrent clients (servers) impact
> Redis performance? Typically, how does one tune such a setup?

As long as the number of clients doesn't exceed your file handle limit
on your platform and compiled Redis settings (typically by default in
the 1k-10k range), there is nothing to worry about. There are people
here running 50k connected clients that haven't reported any
significant degradation in Redis throughput.

It's really all about the total number and the size of commands. For
example, if you need to perform a quick set of bulk operations, it
turns out that sending a group of commands that fits just under the
1500 byte ethernet frame maximizes performance for very low latencies
(it was benchmarked and reported on here about 6 months back, if I
remember correctly). You go beyond 1500 bytes total, and now your
latency goes up, because you've got to send more than one ethernet
frame, but your throughput also goes up. Use jumbo frames and that
cutoff creeps up to 9000 bytes.

It depends on what you want to do. You want throughput? Perform at
least 1k operations per client-pipelined request (try benchmarking up
to 20k requests at a time). You want low latency? Keep it under 1500
(or 9000) bytes per chunk of requests.
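
A small helper along those lines (Python/redis-py assumed; the chunk
size is a tunable guess, not a measured optimum): raise it to favor
throughput, lower it to keep each request within a frame or two for
latency.

    import redis

    r = redis.Redis()

    def pipelined_get(keys, chunk=2000):
        # Flush the pipeline every `chunk` commands so a single request
        # never grows unbounded.
        values = []
        pipe = r.pipeline(transaction=False)
        for i, key in enumerate(keys, 1):
            pipe.get(key)
            if i % chunk == 0:
                values.extend(pipe.execute())
        values.extend(pipe.execute())   # flush the final partial batch
        return dict(zip(keys, values))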

Regards,
- Josiah

catwell

Oct 28, 2011, 4:57:12 AM
to Redis DB
On 27.10.2011 20:04, zzz wrote:

> Catwell, do you mean to implement server side scripting?

Yes, I mean using Redis Scripting. The only problem with
that approach is that Scripting is not in the stable version
of Redis yet, so you have to see if you can live with it.

Didier Spezia

Jul 12, 2012, 5:55:15 AM
to redi...@googlegroups.com

Hi,

Redis does not write (or read) data in parallel.
At most one write (or read) operation can be on-going at a given point in time.

Redis can process commands and serve data to its clients in a
concurrent way; it is just that the commands will be applied
sequentially. Your 1000 write operations will be serialized.

This may help you:

If your question is whether Redis can support a global throughput of 1000 write
operations per second triggered by 1000 different connections, the answer is probably
yes (it depends on the size of your data though).

You can use redis-benchmark to evaluate whether Redis fits your needs.

Redis is routinely benchmarked at more than 100K op/s on most
Intel hardware, and much more if pipelining is used.

Regards,
Didier.


On Thursday, July 12, 2012 10:06:10 AM UTC+2, Raj wrote:
Hi Salvatore,

I am looking for 1000 concurrent writes in Redis DB. Is it possible?

- Raj
