Redis warm-up - Data load performance.

Sairam M P

Mar 21, 2016, 10:32:25 AM
to Redis DB

Hi,

We have been exploring the best possible ways to load our data-set into redis in less time. The redis benchmark statistics at http://redis.io/topics/benchmarks (under 'Factors impacting Redis performance') show roughly 20,000 q/s over a unix socket with 100 KB payloads. Based on these numbers, redis should be able to process about 1.86 GB of data per second (20,000 * 100 KB).
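
(As a quick sanity check on that arithmetic, taking 100 KB as 100,000 bytes:)

$ echo "scale=2; 20000 * 100000 / 1024^3" | bc    # ~1.86, i.e. about 1.86 GB per second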

We started our redis server with AOF enabled. After capturing the whole data-set (which is 11 GB on disk, in AOF format), we restarted the redis server to load the data from the AOF file, which took about 8 minutes. Based on the aforementioned benchmark statistics from redis.io, loading 11 GB of data should have taken only a few seconds, yet it took 8 minutes.

Kindly throw some light on the above difference!

We were expecting the AOF load throughput to be comparable to client throughput, since the AOF stores the same commands a client would send. The one difference we see between the aforementioned benchmark and the AOF load is the degree of parallelism: with the benchmark utility the 20,000 queries are triggered in parallel, whereas with the AOF load a single thread processes the commands in sequence.

Is this what makes such a huge throughput difference? If so, could splitting the AOF file and loading it in parallel help us achieve the throughput mentioned in the redis.io doc?
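
(Purely as a sketch of what we have in mind, assuming the data-set could first be exported as one plain-text write command per line -- we understand a raw AOF itself cannot be split blindly, since a line-based split could cut a command in half:)

$ split -n l/4 commands.txt chunk_                            # hypothetical commands.txt, split on line boundaries
$ for f in chunk_*; do redis-cli --pipe < "$f" & done; wait   # four pipe-mode loaders in parallel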

Server details:

The redis server where we tried the AOF load is running in a VM with 32 GB of RAM and 200 GB of disk. The host is a Supermicro 4U server with 2 physical CPUs, 10 cores each, and hyper-threading enabled.

OS - CentOS-6.4.

Redis version - 2.8.17.

Kindly help us understand the redis performance in the above context.

-Sairam.

Didier Spezia

Mar 22, 2016, 3:50:42 AM
to Redis DB

I'm the author of the benchmark you mentioned.
I think you make a number of very wrong assumptions here.

1) 100 KB @ 20 Kop/s => Redis can "process" 1.8 GB/s

This extrapolation is hazardous.
The data size of your items matters, and the throughput is not necessarily linear.
In fact, the benchmark you mention proves the opposite.

2) A random guy on the Internet gives a result => we should be able to replicate it with our VM

Virtualization has an overhead. For most NoSQL engines, it is a very significant overhead.
I used physical hardware for this benchmark.

Furthermore, the hardware I used is different from your own.
It is generally useless to compare a result like this when you don't know the conditions of the original benchmark.
Benchmarks are not useful for their absolute results, but for their relative results
(i.e. on the same environment, compare different data sizes, or compare TCP loopback to Unix sockets).
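
For instance, on a single box, two runs like the ones below (standard redis-benchmark options) give you a meaningful relative comparison between TCP loopback and a unix domain socket:

$ redis-benchmark -q -t set,get -n 100000                      # TCP loopback (127.0.0.1:6379)
$ redis-benchmark -q -t set,get -n 100000 -s /tmp/redis.sock   # unix socket (if enabled in the configuration)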

3) Loading the AOF and processing online commands have the same cost

Well, they do not.
Obviously, the AOF needs to be read from disk before the operations can be applied.
11 GB in 8 minutes means about 23 MB/s of throughput.
Depending on the storage hardware, it is not so bad for sequential I/Os from a single thread on a VM.

The size of the data is another point.
11 GB on disk does not mean Redis will use 11 GB in memory. It will likely use more space.
Your VM has only 32 GB. The filesystem needs some space to cache the gigabytes of data
read from the disk, while Redis also requires memory at the same time.
Hypervisors do not like VMs suddenly requesting massive amounts of memory.
They may trigger some memory ballooning activity.
You should check the behavior of the hypervisor and if your VM swaps.
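
A few quick checks from inside the guest (the hypervisor side depends on your virtualization stack):

$ free -m                                 # is any swap in use?
$ vmstat 1 5                              # the si/so columns reveal swap activity during the load
$ redis-cli info memory | grep -E 'used_memory_human|used_memory_rss'   # Redis memory vs RSS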

Finally, the content of the AOF is different from the actual commands you would play
to create the same data. Redis is able to perform a number of transformations
to optimize the size of the file (for instance by aggregating several commands).
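
To illustrate (purely an example), three pushes issued by a client and logged as:

RPUSH mylist a
RPUSH mylist b
RPUSH mylist c

may end up in a rewritten AOF as a single variadic command:

RPUSH mylist a b c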

Best regards,
Didier.

Sairam M P

Mar 24, 2016, 12:26:41 PM
to Redis DB
Hi Didier,

Thanks a lot for the response.

We understand comparing results from two different environments / server configs is of lesser significance, but we are just trying to get a fraction of the throughput mentioned in the benchmark doc. The throughput from our VM is approximately 80-85 times lower than what is mentioned in the doc, which makes us ponder.

We just tried a physical server with the below config, but the throughput is still about 50 times lower than 1.86 GB/s: a 12 GB AOF file took 5.5 minutes to load. We are just trying to find out where we fall short of the throughput results shown in the redis.io benchmark.
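
(For reference, while the server is replaying the AOF the progress is visible via INFO, e.g.:)

$ redis-cli info persistence | grep ^loading    # loading_loaded_perc / loading_eta_seconds during the replay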

Physical server config:

126 GB RAM, 1 TB hard disk. The server is a Supermicro-FT 4U server with 2 physical CPUs, 8 cores each, and hyper-threading enabled.

OS - CentOS-6.6

Redis version - 2.8.17.

Kindly let us know if we are missing something in analyzing the benchmark stats.

-Sairam.

Sairam M P

Mar 24, 2016, 12:38:55 PM
to Redis DB
Hi Didier,

Just forgot to mention: in both cases (VM and physical server) we are loading the AOF file from tmpfs (RAM) and not from disk.

-Sairam.


Stefan Parvu

Mar 24, 2016, 1:05:47 PM
to redi...@googlegroups.com

We have been exploring the best possible ways to load our data-set into redis in less time. The redis benchmark statistics at http://redis.io/topics/benchmarks (under 'Factors impacting Redis performance') show roughly 20,000 q/s over a unix socket with 100 KB payloads. Based on these numbers, redis should be able to process about 1.86 GB of data per second (20,000 * 100 KB).


We can process between 50-90k ops per second in a Xen environment, e.g.:

"instantaneous_ops_per_sec:91088"

We don't use Linux. We have experimented with FreeBSD and ZFS and we are happy. We found
Redis on FreeBSD with ZFS more performant than on Debian with Ext4|XFS.


---
Stefan Parvu



Didier Spezia

Mar 24, 2016, 6:12:15 PM
to Redis DB

Ok, but IMO you are still trying to compare operations which are not comparable.
Reading a 12 GB AOF is not the same as running a unix domain socket throughput benchmark.

Throughput benchmarks tend to perform fewer allocations, and always access/write the same
objects in memory. It makes them cache friendly, contrary to an AOF load.
My original benchmark was to prove a point at transport level (network, unix socket).
Its purpose was never to simulate an AOF load.

Anyway, the performance actually depends on the average size of your objects.
You cannot extrapolate a result from a benchmark based on different object sizes.

Why don't you run a throughput benchmark on your hardware and see by yourself?
It is quite easy. 

Just edit a redis configuration file to remove any bgsave or aof activity, and activate a unix domain socket.
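
Something like this in the configuration file (the socket path is just an example):

# no RDB snapshots
save ""
# no AOF
appendonly no
# enable the unix domain socket
unixsocket /tmp/redis.sock
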
Once Redis has started:

$ ./redis-benchmark -q -n 100000 -d 100000 -r 100000 -s /tmp/redis.sock -t set

With these parameters, a run should not consume more than 10 GB of memory.
But it will not be cache friendly, and will perform at most 100000 allocations.
You may want to adjust the -n and -r parameters higher (but it will consume more memory).

Note the throughput, and how far/close you are from the throughput you measured
for the AOF load (12 GB / 5.5 min).

On my (recent) laptop, I'm at 14000 op/s, so 1.4 GB/s for this throughput benchmark.

You may want to experiment with different object sizes, closer to your use case
(change the -d parameter).
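
For instance, something along these lines (same options as before, only -d changes):

$ for d in 100 1000 10000 100000; do ./redis-benchmark -q -n 100000 -d $d -r 100000 -s /tmp/redis.sock -t set; done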

Best regards,
Didier.