There has been a lot of research on this over the years, though mostly regarding analytical workloads -- search, currently, isn't quite a fit. That said, the points you bring up -- moving data from system memory to GPU memory, etc. -- are a significant part of what people in academia and industry alike are working to address, both in software and in hardware.
For the purposes of my reply, I'm going to ignore the benchmark numbers; what you quoted is pretty much apples to oranges, and what it doesn't say is the important part (were the data structures optimized for, and was the code written to benefit from, the monster $925k setup?). Instead, I'll address the point about moving data around being relatively slow and provide a handful of resources for exploring on your own. I find the use of GPUs in databases compelling (not to mention other configurations, e.g., arrays of lower-power, lower-cost processors with solid-state storage, FPGA configurations, and, perhaps most of all, hybrids). The more I follow the developments, the more I find the subject is both many-faceted and not without nuance.
I also want to say up front that I do not have in-depth experience applying any of this, and certainly not in a production setting -- yet. I've been reading everything I can get my hands on and following a handful of companies and related subjects. In that light, I'll share some of the resources I've gotten the most from.
So, to address the bandwidth issues: NVIDIA has been working on this for some time on both the hardware and software fronts, mostly by way of what they call "GPUDirect":
NVIDIA GPUDirect Technology
Regarding 1 GB of VRAM: that's pretty outdated. The latest cards (e.g., Tesla K10, M2090, M2075, ...) come with up to 6 GB of GDDR5 on a single GPU (the Fermi-based cards), or 8 GB across two GPUs (Kepler), and up to 512 cores with a Fermi or 3,072 with a Kepler -- these things are monsters.
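To put the transfer-cost point in rough numbers (these are theoretical peak figures and the 4 GB table size is just an illustrative assumption): PCIe 2.0 x16 tops out around 8 GB/s host-to-device, while the M2090's GDDR5 is rated at roughly 177 GB/s on-board. A back-of-envelope sketch:

```cpp
// Rough, hedged figures (theoretical peaks, not measured throughput):
// PCIe 2.0 x16 ~= 8 GB/s host-to-device; Tesla M2090 GDDR5 ~= 177 GB/s.
constexpr double kPcieGBs  = 8.0;
constexpr double kGddr5GBs = 177.0;

// Seconds to copy a table of `gb` gigabytes over the bus, vs. scan it on-card.
constexpr double copy_seconds(double gb) { return gb / kPcieGBs; }
constexpr double scan_seconds(double gb) { return gb / kGddr5GBs; }

// For an assumed 4 GB partition: ~0.5 s just to ship it across the bus,
// but only ~0.023 s per full scan once it's resident -- the one-time copy
// costs as much as roughly 22 scans.
```

That ratio is why keeping data resident on the card (or streaming it, which is what GPUDirect helps with) matters so much more than raw core counts.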
Another project you will probably want to take a look at is the Virginian Database, written in the summer of 2010 at NEC Labs:
The Virginian Database
"an experimental heterogeneous SQL database written to compare data processing on the CPU and NVIDIA GPUs"
directly related papers:
Efficient Data Management for GPU Databases
Accelerating SQL Database Operations on a GPU with CUDA
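The core trick in that line of work is recasting SQL operators as data-parallel primitives. As a concrete (hedged) illustration -- a plain C++17 sketch of the map/scan/scatter pattern, not code from either paper -- here is how a `SELECT ... WHERE col > t` turns into three bulk steps, each of which becomes one kernel launch over all rows on a GPU:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// A relational selection as three data-parallel primitives:
// 1) map: evaluate the predicate into a 0/1 flag per row,
// 2) exclusive scan: turn the flags into dense output offsets,
// 3) scatter: compact qualifying rows into the result.
std::vector<int> gpu_style_filter(const std::vector<int>& col, int threshold) {
    const std::size_t n = col.size();
    std::vector<int> flags(n), offsets(n);
    for (std::size_t i = 0; i < n; ++i)       // map (independent per row)
        flags[i] = col[i] > threshold ? 1 : 0;
    std::exclusive_scan(flags.begin(), flags.end(), offsets.begin(), 0);
    const std::size_t total = n ? offsets[n - 1] + flags[n - 1] : 0;
    std::vector<int> out(total);
    for (std::size_t i = 0; i < n; ++i)       // scatter (independent per row)
        if (flags[i]) out[offsets[i]] = col[i];
    return out;
}
```

The two loops have no cross-row dependencies, and the scan in the middle is the only coordination point -- which is exactly why these operators map so well to thousands of GPU threads.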
ParStream is one GPU-specific database that I've been following for some time (though undoubtedly there are others -- let me know if you find any!):
Here are a handful of publications on GPUs in databases that I keep going back to:
Oncilla - Optimizing Accelerator Clouds for Data Warehousing Applications
Relational Query Co-Processing on Graphics Processors
High-Throughput Transaction Executions on Graphics Processors
Self-Tuning Distribution of DB-Operations on Hybrid CPU/GPU Platforms
Scaling PostgreSQL Using CUDA
Comparing CPU and GPU in OLAP Cube Creation
GPU Processors in Databases
MOLAP based on parallel scan
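On the scan-based MOLAP idea in particular: the gist is that once you precompute a prefix sum over a measure column (a scan being one of the best-parallelizing primitives on a GPU), any range aggregate collapses to a constant-time subtraction. A minimal one-dimensional sketch in plain C++17 (my own illustration of the technique, not code from the paper):

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Precompute an inclusive prefix sum over a measure column; a range
// aggregate over [lo, hi] then costs one subtraction instead of a scan.
struct PrefixCube {
    std::vector<long long> prefix;  // prefix[i] = sum of measure[0..i]
    explicit PrefixCube(const std::vector<int>& measure)
        : prefix(measure.size()) {
        std::inclusive_scan(measure.begin(), measure.end(), prefix.begin(),
                            std::plus<long long>{}, 0LL);
    }
    // Sum of measure[lo..hi], both bounds inclusive.
    long long range_sum(std::size_t lo, std::size_t hi) const {
        return prefix[hi] - (lo ? prefix[lo - 1] : 0);
    }
};
```

The multidimensional version applies the same scan along each cube dimension; the precompute parallelizes, and the queries become pure lookups.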
I would like to add that exploring other, similar types of systems -- FPGAs in particular -- provides, if nothing else, a useful contrast in the fundamental data structures, processing mechanics, and underlying operations that are the typical targets for hardware-accelerated database systems; to that end, here are a few I've found particularly interesting:
The “Chimera”: An Off-The-Shelf CPU/GPGPU/FPGA Hybrid Computing Platform
With the above paper in particular, check out section 5, and especially section 6, "Berkeley's Thirteen Dwarves". The decomposition of database operations into their fundamentals is a fun exercise in itself, and seeing how each is applied in specialized hardware will quickly give you a picture of the strengths and weaknesses of each type of acceleration (i.e., CPU vs. FPGA vs. GPU).
Other FPGA pubs (two of which directly address search):
FPGA: What’s in it for a Database?
FPGAs: A New Point in the Database Design Space
An FPGA-based Search Engine for Unstructured Database
FPGA based hardware implementation and parallel processing of database operations on streaming projections in C-Store (a column oriented database)
If you're interested in doing some hacking, here are some libraries that I've played with to varying degrees that are good candidates for tinkering:
cudpp: CUDA Data Parallel Primitives Library
thrust: a parallel algorithms library which resembles the C++ Standard Template Library (STL)
thrust graph library
Rootbeer
The Rootbeer GPU Compiler makes it easy to use Graphics Processing Units from within Java.
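One thing that makes thrust a good starting point is that it deliberately mirrors the STL, so you can prototype a pipeline against std:: algorithms on the CPU and then port it almost name-for-name by swapping std::vector for thrust::device_vector. A toy sketch (the function name and pipeline are my own illustration, not from any of the projects above), with the thrust counterparts noted in comments:

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Toy pipeline: total of squared values above a cutoff.
long long sum_of_big_squares(std::vector<int> v, int cutoff) {
    // std::remove_if + erase  ->  thrust::remove_if on a device_vector
    v.erase(std::remove_if(v.begin(), v.end(),
                           [cutoff](int x) { return x <= cutoff; }),
            v.end());
    // std::transform          ->  thrust::transform
    std::transform(v.begin(), v.end(), v.begin(),
                   [](int x) { return x * x; });
    // std::accumulate         ->  thrust::reduce
    return std::accumulate(v.begin(), v.end(), 0LL);
}
```

Getting a pipeline correct on the host first, then moving it to the device, is a much gentler path than writing raw CUDA kernels from day one.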
Lastly, if you find yourself itching to try some of this stuff out, there are several "cloud GPU" on-demand HPC commercial offerings; you're undoubtedly aware of EC2's stuff, but there are a few that are lesser known:
In short, I think it's a great idea. Given the economies of scale that have come into effect, and will likely continue, for both GPUs and FPGAs -- not to mention ARM and ARM-like processors (check out CUDA for ARM: http://www.nvidia.com/object/carma-devkit.html) -- the related hardware should remain, or become an even better, value proposition. Combined with the dropping prices of SSDs, especially eMLC-backed drives, and the more or less stable cost of the typical server hardware you'll put all this fancy gear in, it's probably a good bet for the future too -- in my opinion. The research seems to agree.
Hopefully this is helpful information! I'm interested in any other companies, open-source or commercial systems, libraries, etc. that you might come across.
Cheers,
John