Dictionary pushdown


Juan Emilio Gabito Decuadra

Jun 18, 2018, 11:11:46 AM
to cstore users
cstore_fdw can skip blocks by using the min and max value of the data stored in those blocks.

It would be nice to have dictionary pushdown as well. That would allow cstore to skip a block without needing to loop through its data.

For example, you have a column stored in a block with min value 10 and max value 15.

Maybe the data looks like this:

10
10
10
11
11
14
14
14
14
15

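If you run SELECT SUM(COL) FROM TABLE WHERE COL = 12, the value 12 falls inside the min/max range, so the block cannot be skipped, even though 12 is not stored in it. With a dictionary of the block's distinct values, cstore could tell that 12 is absent and skip the block. I believe this would give cstore a good performance boost.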
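To make the difference concrete, here is a minimal Python sketch of the two checks on the example block above. It is only an illustration of the idea, not cstore_fdw code: `BlockStats`, `build_stats`, `can_skip_min_max` and `can_skip_dictionary` are hypothetical names, and the sketch assumes the set of distinct values is small enough to keep next to the min/max metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockStats:
    """Hypothetical per-block metadata: min, max and the distinct values."""
    min_value: int
    max_value: int
    distinct_values: frozenset


def build_stats(values):
    # Computed once, when the block is written.
    return BlockStats(min(values), max(values), frozenset(values))


def can_skip_min_max(stats, target):
    # Min/max check: skip only if the target lies outside the range.
    return target < stats.min_value or target > stats.max_value


def can_skip_dictionary(stats, target):
    # Dictionary check: skip whenever the target is not one of the
    # values actually stored in the block.
    return target not in stats.distinct_values


block = [10, 10, 10, 11, 11, 14, 14, 14, 14, 15]
stats = build_stats(block)

# WHERE COL = 12: min/max has to read the block, the dictionary can skip it.
print(can_skip_min_max(stats, 12))
print(can_skip_dictionary(stats, 12))
```

The dictionary check never skips a block that holds the value, so it can only remove work. The cost is the extra metadata, which grows with the number of distinct values per block.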

Netflix did something similar for the Parquet columnar format: https://youtu.be/A4OU6i4AQsI?t=23m46s

Regards!

Murat Tuncer

Jun 25, 2018, 3:59:27 AM
to Juan Emilio Gabito Decuadra, cstore users
Hi Juan Emilio,

What you describe can be achieved by adding a Bloom filter or a hash-table-like structure to the block metadata. I am not quite convinced about how well it would perform.
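For reference, a Bloom filter kept per block could look roughly like the Python sketch below. This is an assumption-laden illustration, not a proposed cstore_fdw implementation: the `BloomFilter` class, its sizes (256 bits, 3 hashes) and the use of SHA-256 as the hash are all made up for the example. It answers "definitely not in this block" or "maybe in this block", so a negative answer allows the block to be skipped, while a positive answer still requires reading it.

```python
import hashlib


class BloomFilter:
    """Hypothetical fixed-size Bloom filter for one block's values."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored as a Python integer

    def _positions(self, value):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means the value was never added; True means "maybe".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


block = [10, 10, 10, 11, 11, 14, 14, 14, 14, 15]
bloom = BloomFilter()
for v in block:
    bloom.add(v)

# A block can be skipped only when might_contain() returns False.
print(bloom.might_contain(14))
print(bloom.might_contain(12))
```

Compared with storing the exact distinct values, the filter has a fixed size regardless of cardinality, at the price of occasional false positives that cause a block to be read unnecessarily.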

Could you open an issue for this, with a detailed use case if possible?
Thanks


--
Murat Tuncer
Software Engineer | Citus Data
mtu...@citusdata.com