Hi all,
I've been experimenting with improving Ohm's index handling with
pipelining. I've had particular success with
Ohm::Model#add_to_indices, the core of which I've wrapped in a
pipeline as follows:
def add_to_indices
  db.pipelined do
    indices.each do |att|
      next add_to_index(att) unless collection?(send(att))
      send(att).each { |value| add_to_index(att, value) }
    end
  end
end
I've also tried to do something similar with #delete_from_indices, but
with less success. Here are some numbers from a benchmark run on a
project I'm working on. Amongst other things, this benchmark creates
2000 instances of models subclassing Ohm::Model, hits #add_to_indices
3000 times, and #delete_from_indices 1000 times:
Rehearsal ---------------------------------------------------------------------
no pipelining                       7.860000   1.880000   9.740000 ( 13.856807)
add_to_indices pipelining           7.280000   1.620000   8.900000 ( 12.299852)
delete_from_indices pipelining      7.690000   2.140000   9.830000 ( 13.957360)
no pipelining 2nd run               7.810000   2.000000   9.810000 ( 13.946544)
add and delete indices pipelining   7.280000   1.840000   9.120000 ( 12.613177)
----------------------------------------------------------- total: 47.400000sec

                                        user     system      total        real
no pipelining                       7.470000   2.130000   9.600000 ( 13.590801)
add_to_indices pipelining           7.470000   1.630000   9.100000 ( 12.634070)
delete_from_indices pipelining      7.770000   2.060000   9.830000 ( 13.957237)
no pipelining 2nd run               7.830000   2.050000   9.880000 ( 14.006379)
add and delete indices pipelining   7.300000   1.780000   9.080000 ( 12.569478)
What these numbers seem to show is a 7-11% performance improvement
when #add_to_indices is pipelined. This is probably because
#add_to_indices spends its entire time generating a bunch of SADD
commands (via #add_to_index), which is a primary use case for
pipelining in Redis. Applying pipelining to #delete_from_indices
doesn't seem to show any benefit, and may in fact be slowing things
down. I've run the above benchmark a number of times and the results
are pretty consistent.
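The intuition behind the SADD result can be sketched with a toy cost
model: unpipelined, every command pays a full network round trip,
while a pipeline pays one round trip for the whole batch. The latency
figures below are made-up illustrative numbers, not measurements:

```ruby
# Toy cost model of pipelining (hypothetical latencies, in microseconds).
ROUND_TRIP_US = 500 # assumed client<->Redis round-trip cost
COMMAND_US    = 50  # assumed per-command processing cost

# Each unpipelined command pays its own round trip.
def unpipelined_cost(n_commands)
  n_commands * (ROUND_TRIP_US + COMMAND_US)
end

# A pipeline pays one round trip for the whole batch.
def pipelined_cost(n_commands)
  ROUND_TRIP_US + n_commands * COMMAND_US
end

# Eight SADDs, e.g. from a class with four indices:
puts unpipelined_cost(8) # => 4400
puts pipelined_cost(8)   # => 900
```

Note that in this model a "pipeline" of a single command costs exactly
the same as the unpipelined command, which is why the gain should grow
with the number of indices.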
More targeted benchmarking is needed, particularly to measure
performance against the number of indices. The project I'm working on
has a mix: one class has four indices (=> 8 SADDs), which I suspect
benefits the most. Classes with only one or two indices will only
generate a few SADDs, so the benefit of putting these in a pipeline
might be outweighed by the overhead of setting up the pipeline.
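A skeleton for that index-count benchmark, in the same Benchmark.bmbm
style as the runs above, might look like the following. The report
blocks are stubbed with no-op lambdas so the skeleton runs standalone;
in a real run they'd create instances of hypothetical one-index and
four-index Ohm model classes:

```ruby
require "benchmark"

# Stubs standing in for real Ohm model creation, e.g.
# OneIndexModel.create(...) / FourIndexModel.create(...) (hypothetical names).
create_one_index_model  = -> { } # would hit #add_to_indices with 1 index
create_four_index_model = -> { } # would hit #add_to_indices with 4 indices

Benchmark.bmbm do |b|
  b.report("one index")    { 2000.times { create_one_index_model.call } }
  b.report("four indices") { 2000.times { create_four_index_model.call } }
end
```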
Conclusion: pipelining Ohm::Model#add_to_indices looks like a win,
particularly for models with large numbers of indices, but more
testing is required to ensure single index models don't suffer.
If anyone wants to try the benchmarks I've used above, take a look at
https://github.com/LichP/Porp, check out the pipeline-bench tag, and
poke around in the benchmarks directory. WARNING: the benchmark
scripts will flush Redis DB 0 by default, so be careful where you run
them.
--
Phil Stewart