Data modeling

164 views
Skip to first unread message

Juan Emilio Gabito Decuadra

unread,
Jan 27, 2016, 3:27:22 PM1/27/16
to cstore users
Hi everyone.

So, I have this doubt about data modeling with cstore_fdw.

I have a classical Kimball model, fact table, dimensions, the usual stuff. I have_ dim tables for the dimensions and fact_ for the fact table.

But, is this the right approach, thinking in performance terms, to modeling?

I could have just the fact table and query that. But if I have the dimensions tables then my fact table would be smaller, right? Instead of having a dimension label for example, it would have the dimension FK. The query in the fact table would be faster because numbers are better compressed than strings, but then I would have to join with the dimension table. Would that make a performance penalty?

Regards!

Murat Tuncer

unread,
Feb 1, 2016, 5:11:11 AM2/1/16
to Juan Emilio Gabito Decuadra, cstore users
Hello

cstore_fdw works better if your table contains the whole dataset as it can filter out some data block before reading the actual data.   Having a foreign key may require you to run a join at each query. 

Postgresql needs to read all rows from cstore table to perform join (due to cstore_fdw being a foreign data wrapper). This may cause a significant performance issue depending on the joined tables, join expression, and join plan picked by postgresql.

I suggest modeling the data such that your query returns less number of rows coming from dimension tables. This would prevent multiple passes on the columnar table.  

Running 'analyze' on the columnar table would help postgresql planner to pick a better strategy. This will have positive effect on query performance when running joins and subqueries.

I hope this answers your questions. 

Murat


--
You received this message because you are subscribed to the Google Groups "cstore users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cstore-users...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Murat Tuncer
Software Engineer | Citus Data
mtu...@citusdata.com

Juan Emilio Gabito Decuadra

unread,
Feb 3, 2016, 2:58:46 PM2/3/16
to cstore users, gabi...@gmail.com
Understood, so the best thing is just to have one fact table with all the data that wee need.

Thanks!

Ivan Voras

unread,
Aug 30, 2016, 1:50:40 PM8/30/16
to cstore users, gabi...@gmail.com
Hello!

Continuing this old thread... are there some guidelines on how to do implement some common structures? For example, I have a master-detail relationship, and the only way this will fit in a single row is to use array, but then I would probably need to create my own data type for the "details" elements. In addition to that, I need to query by specific fields in the detail elements.

What would be a good cstore structure for this?

Juan Emilio Gabito Decuadra

unread,
Aug 30, 2016, 1:54:26 PM8/30/16
to Ivan Voras, cstore users
Maybe you can use the json data type.

Other way might be to convert the detail array into several tuples.

Ivan Voras

unread,
Aug 30, 2016, 1:56:41 PM8/30/16
to cstore users, ivo...@gmail.com
If using JSON(B), what will happen with queries of JSON(B) inner elements, which is a requirements?
I.e. how would they get indexed?

Murat Tuncer

unread,
Aug 31, 2016, 5:00:36 AM8/31/16
to Ivan Voras, cstore users
Filtering on jsonb column woud require returning of whole jsonb field and also add decompress cost if the data is compressed.  This may cancel out benefits of using columnar store since it would be returning whole data set anyway.

If you are going to filter only on some elements, you better put them as columns and filter on that. although cstore_fdw does not have indexing support, it can still do some smart filtering to reduce io.



To unsubscribe from this group and stop receiving emails from it, send an email to cstore-users+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Murat Tuncer
Software Engineer | Citus Data
Reply all
Reply to author
Forward
0 new messages