My plans for Redis (g++ compiler, GPU offload and other stuff)


Gaetano Mendola

Jan 1, 2014, 4:48:20 PM
to redi...@googlegroups.com
Hi all,
my 2014 New Year's resolution is to contribute to the Redis code.
My field of expertise is GPU computing and I would like to implement
some functionality (such as the SORT command) using a GPU as a backend.

So far I have forked Redis and created a branch named cplusplus.
That branch can be compiled with a C++ compiler (deps are
untouched): running "CXX=1 make" is enough to build it with g++.

Note that I'm not going to convert the code base to C++, nor do I want
to. I have only modified the code so that it can be compiled by a
standard C++ compiler. What I have changed so far:

  • Added explicit casts between void* and X*
  • Fixed some inconsistencies between signatures in headers and
    the actual implementations
  • Added extern "C" to external libraries' headers (where needed)
  • Renamed variables that used C++ reserved words, namely: new, class,
    throw, this.
  • Renamed variables that had the same name as the type they
    referred to (list* list;)
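
The kinds of changes listed above can be illustrated with a small, self-contained sketch. This is not code from Redis or from the cplusplus branch: the node type and the node_prepend, node_count and node_free_all helpers are hypothetical names chosen for the example. The file is plain C that a C++ compiler should also accept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Give the declarations C linkage when a C++ compiler reads them;
 * a C compiler skips the extern "C" lines entirely. */
#ifdef __cplusplus
extern "C" {
#endif

typedef struct node {
    int value;
    struct node *next;
} node;

node *node_prepend(node *head, int value);
size_t node_count(const node *head);
void node_free_all(node *head);

#ifdef __cplusplus
}
#endif

node *node_prepend(node *head, int value) {
    /* Hypothetical C-only version: node *new = malloc(sizeof(*new));
     * "new" is reserved in C++, and C++ does not convert void* implicitly. */
    node *created = (node *)malloc(sizeof(*created));
    assert(created != NULL); /* sketch only: real code should handle failure */
    created->value = value;
    created->next = head;
    return created;
}

/* The parameter is called "head", not "node", so it never shares a
 * name with its own type. */
size_t node_count(const node *head) {
    size_t count = 0;
    for (; head != NULL; head = head->next) count++;
    return count;
}

void node_free_all(node *head) {
    while (head != NULL) {
        node *next = head->next;
        free(head);
        head = next;
    }
}
```

Building the same file once with gcc and once with g++ is a quick way to check that both compilers accept it.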

So far I have quickly benchmarked Redis using gcc/g++/llvm/icc on a
dual-socket Intel Xeon E5-2690 equipped with 32GB of RAM, and my impression
(I still have to run a serious benchmark session) is that:

gcc = g++ > icc > llvm (3.3)

As for why one would adopt a C++ compiler for Redis, you can read about
the reasons that led the GCC development group to switch to one:


My next step is to implement the SORT command so that, for large
data sets, it can be offloaded to a GPU (a very cheap board can sort
around 900 million keys per second).

What do you think about it?
What would be a good set of benchmark options?
Is Redis considering a change of build system (CMake, for example)?

I realize that is a lot of questions and proposals for my first post on this
mailing list.

Regards
Gaetano Mendola

Pedro Melo

Jan 3, 2014, 1:27:10 PM
to redi...@googlegroups.com
Hi,

On Wed, Jan 1, 2014 at 9:48 PM, Gaetano Mendola <men...@gmail.com> wrote:

My next step is to implement the SORT command so that, for large
data sets, it can be offloaded to a GPU (a very cheap board can sort
around 900 million keys per second).

Out of curiosity, and please bear in mind I have very little knowledge about main memory/GPU memory interactions: is it worth sending 900M keys from main memory to GPU memory to sort them? Don't the bandwidth and copy-time costs outweigh any gains you might get from doing this?

Bye,
--
Pedro Melo
@pedromelo
http://www.simplicidade.org/
xmpp:me...@simplicidade.org
mailto:me...@simplicidade.org

Josiah Carlson

Jan 3, 2014, 2:03:31 PM
to redi...@googlegroups.com
Redis already copies all of the necessary data into an array and sorts it with quicksort (the libc standard library routine for complete sorts, a modified NetBSD libc quicksort for partial sorts), though it does use indirection when comparing string objects. The additional data copies likely won't be all that bad (they'd have to be done for sorting anyway), though I wonder whether there is a critical size below which CPU sorting makes more sense than GPU sorting. All of the graphs and literature that I'm able to read for free deal with sorts of millions of elements, which I have found to be somewhat rare in practice with Redis (though that may change with a faster sort).
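
A minimal sketch of the two ideas in this paragraph, sorting through indirection and choosing a backend by size, might look like the following. It is not Redis's actual sort code: it uses plain C strings and libc qsort, and the names choose_backend, sort_strings_cpu and the GPU_SORT_MIN_ELEMENTS cutoff are assumptions made up for illustration. The real crossover point would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cutoff, purely illustrative; a real value needs benchmarking. */
#define GPU_SORT_MIN_ELEMENTS 1000000u

typedef enum { SORT_ON_CPU, SORT_ON_GPU } sort_backend;

/* Pick a backend from the element count alone. */
sort_backend choose_backend(size_t count) {
    return count >= GPU_SORT_MIN_ELEMENTS ? SORT_ON_GPU : SORT_ON_CPU;
}

/* qsort passes pointers to array elements; the elements are themselves
 * pointers to strings, so one extra dereference reaches the data. */
static int compare_indirect(const void *a, const void *b) {
    const char *const *left = (const char *const *)a;
    const char *const *right = (const char *const *)b;
    return strcmp(*left, *right);
}

/* Sort an array of string pointers in place; the strings never move. */
void sort_strings_cpu(const char **items, size_t count) {
    assert(items != NULL);
    if (count < 2) return; /* nothing to reorder */
    qsort(items, count, sizeof(items[0]), compare_indirect);
}
```

Only the pointer array is rearranged, which is why the copy into an array is cheap compared with moving the string objects themselves.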

 - Josiah


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/groups/opt_out.

Gaetano Mendola

Jan 3, 2014, 2:31:10 PM
to redi...@googlegroups.com
On Friday, 3 January 2014 19:27:10 UTC+1, melo wrote:
Hi,

On Wed, Jan 1, 2014 at 9:48 PM, Gaetano Mendola <men...@gmail.com> wrote:

My next step is to implement the SORT command so that, for large
data sets, it can be offloaded to a GPU (a very cheap board can sort
around 900 million keys per second).

Out of curiosity, and please bear in mind I have very little knowledge about main memory/GPU memory interactions: is it worth sending 900M keys from main memory to GPU memory to sort them? Don't the bandwidth and copy-time costs outweigh any gains you might get from doing this?

Any GPU operation consists, as you said, of the following steps:

  1. Host -> GPU data transfer
  2. GPU sort
  3. GPU -> Host data transfer
What happens is that, for certain problem sizes, the 3 steps above together are quicker than a plain CPU run, especially on present-day
systems with PCIe 3.0 and RAM running at 1600 MHz (I'm talking about server-class memory). To give you an idea, on such a system you can
transfer data from host to device at around 5GB/s.
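
As a back-of-the-envelope check of the three steps, the sketch below adds up copy and sort times. The function names are hypothetical, and three assumptions go beyond what is stated here: keys are 4 bytes each, the GPU -> Host copy runs at the same rate as the Host -> GPU copy, and 1GB means 10^9 bytes.

```c
#include <assert.h>

/* Seconds needed to move `bytes` over a link sustaining `bytes_per_second`. */
double transfer_seconds(double bytes, double bytes_per_second) {
    assert(bytes >= 0.0 && bytes_per_second > 0.0);
    return bytes / bytes_per_second;
}

/* One unpipelined offload: Host -> GPU copy, GPU sort, GPU -> Host copy.
 * Assumes both copies run at the same rate (an assumption of this sketch). */
double offload_seconds(double keys, double key_bytes,
                       double link_bytes_per_second,
                       double gpu_keys_per_second) {
    assert(gpu_keys_per_second > 0.0);
    double copy = transfer_seconds(keys * key_bytes, link_bytes_per_second);
    double sort = keys / gpu_keys_per_second;
    return copy + sort + copy;
}
```

With the figures from this thread (900 million keys, a board sorting 900 million keys per second, about 5GB/s) and the assumed 4-byte keys, offload_seconds(900e6, 4.0, 5e9, 900e6) comes to about 2.44 seconds: 0.72 seconds for each copy and 1 second for the sort. That total is what a CPU sort of the same data would have to beat.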

There is another factor to keep in mind: the 3 operations above can run in parallel with other operations, so if Redis gets 2 pipelined sorts,
what you can have is basically:

  1. Host -> GPU data transfer for container 1
  2. GPU sort on container 1 + Host -> GPU data transfer for container 2
  3. GPU sort on container 2 + GPU -> Host data transfer for container 1
  4. GPU -> Host data transfer for container 2
As you can see, while you are sorting two containers, you pay the transfer overhead of only one Host -> GPU -> Host round trip.
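
The schedule above can be modelled as a three-stage pipeline that advances in lockstep, where each step lasts as long as its slowest active operation. The sketch below is only a timing model under that assumption, with hypothetical function names; it does not issue any GPU work, and it assumes every container has the same upload, sort and download times.

```c
#include <assert.h>
#include <stddef.h>

/* Every container is uploaded, sorted and downloaded before the next starts. */
double sequential_seconds(size_t containers, double upload,
                          double sort, double download) {
    assert(upload >= 0.0 && sort >= 0.0 && download >= 0.0);
    return (double)containers * (upload + sort + download);
}

/* Lockstep 3-stage pipeline: at step k, container k is uploaded,
 * container k-1 is sorted and container k-2 is downloaded. */
double pipelined_seconds(size_t containers, double upload,
                         double sort, double download) {
    assert(upload >= 0.0 && sort >= 0.0 && download >= 0.0);
    double total = 0.0;
    for (size_t step = 0; step < containers + 2; step++) {
        double longest = 0.0;
        if (step < containers) longest = upload;
        if (step >= 1 && step - 1 < containers && sort > longest) longest = sort;
        if (step >= 2 && step - 2 < containers && download > longest) longest = download;
        total += longest; /* the step ends when its slowest operation ends */
    }
    return total;
}
```

For two containers with one time unit per operation, the sequential model gives 6 units and the pipelined one gives 4: two sorts plus a single upload and a single download, which matches the four steps listed above.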

Consider also that you can "mirror" an area of host memory onto the GPU, so you don't need
an explicit Host -> GPU -> Host copy, and the cost of that copy is amortized.

BTW, I have posted to this list the results I got from just compiling with a C++ compiler, and they are quite
promising; as soon as the moderator approves the post you will see it.

Regards
Gaetano Mendola

 

NailK

Jan 4, 2014, 1:56:55 AM
to redi...@googlegroups.com
I wonder whether a GPU would speed up sorting smaller datasets (say, 10 thousand items)?
It would be cool to have such GPU-accelerated Redis ZSETs.

Regards,
Nail

Josiah Carlson

Jan 4, 2014, 2:21:58 AM
to redi...@googlegroups.com
It wouldn't help with ZSETs - ZSETs are stored with a hash table and a skiplist, so there is no explicit "sort" step involved.
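
To illustrate why there is no separate sort step, here is a minimal sketch of a container that keeps its scores ordered at insertion time. For brevity it uses a fixed-size sorted array as a stand-in for the skiplist, which is an assumption of this example and not how Redis stores ZSETs; the ordered_scores type and its insert function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ORDERED_SET_CAPACITY 64

typedef struct {
    double scores[ORDERED_SET_CAPACITY]; /* always kept in ascending order */
    size_t count;
} ordered_scores;

/* Insert a score at its ordered position.
 * Returns 0 on success, -1 if the container is full. */
int ordered_scores_insert(ordered_scores *set, double score) {
    assert(set != NULL);
    if (set->count == ORDERED_SET_CAPACITY) return -1;

    /* Binary search for the first slot holding a score greater than the new one. */
    size_t low = 0, high = set->count;
    while (low < high) {
        size_t mid = low + (high - low) / 2;
        if (set->scores[mid] <= score) low = mid + 1;
        else high = mid;
    }

    /* Shift the tail one slot to the right and place the new score. */
    memmove(&set->scores[low + 1], &set->scores[low],
            (set->count - low) * sizeof(set->scores[0]));
    set->scores[low] = score;
    set->count++;
    return 0;
}
```

Because every insert leaves the data ordered, reading it back in order never triggers a sort, so there is nothing to offload.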

 - Josiah



