what's next at cstore_fdw


Murat Tuncer

Feb 5, 2016, 10:31:28 AM
to cstore users
Hi folks

We are in the process of deciding on the next cstore_fdw features. There are a few candidate items under consideration; however, we need to hear from you.

So what would you like to see in the next cstore_fdw? What would make you like it better? What would make you use cstore_fdw to a greater extent?

I would appreciate it if you could share your feedback. You can post it here for discussion or send me a private email if you prefer.

We need to hear from you.

-- 
Murat Tuncer
Software Engineer | Citus Data
mtu...@citusdata.com

Murat Tuncer

Feb 9, 2016, 4:02:16 AM
to cstore users
Hi everybody

Here are some candidate items that have come up so far:
- backup / restore facility
- delete rows 
- update rows
- in-place aggregation (aggregation pushdown)
- join pushdown
- indexing
or
- your candidate feature

What would be your top three picks?


thanks
Murat

Juan Emilio Gabito Decuadra

Feb 11, 2016, 9:32:29 AM
to cstore users
I would add some sort of caching to that list. I don't know if you already have something like it, but it would be nice if, when I run the same query twice against the same data set, the second run returned its results from the cache instead of querying the table again.

Can you describe how in-place aggregation would work?

Thanks!

Murat Tuncer

Feb 12, 2016, 8:35:36 AM
to cstore users
Hi Juan

I am not sure that supporting caching would provide enough benefit. Because it is a foreign data wrapper, cstore_fdw does not know much about the query other than the basic filtering clauses. So it does not run the query itself; it returns all matching rows to PostgreSQL for further processing (grouping etc.). Here, caching would mean keeping everything in the data file in memory.

I will keep this on the to-do list, but I do not foresee any immediate action.

I haven't fully thought through how in-place aggregation would work. We could probably execute min/max/count/sum/avg queries that do not contain grouping inside cstore_fdw and return only the results to PostgreSQL. This would reduce the amount of data we transfer. However, this is possible only if we are running on PostgreSQL 9.5 or higher, which means we would lose compatibility with older PostgreSQL releases.
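
As a rough illustration of the idea only (this is a toy Python model, not cstore_fdw code, and every name in it is hypothetical), the difference between returning all matching values and returning a single aggregate summary might look like this:

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class AggregateResult:
    """Summary that would cross the wrapper boundary instead of the rows."""
    count: int
    total: float
    minimum: Optional[float]
    maximum: Optional[float]

    @property
    def average(self) -> Optional[float]:
        return self.total / self.count if self.count else None


def scan_rows(blocks: List[List[float]],
              keep: Callable[[float], bool]) -> Iterator[float]:
    """No pushdown: apply the basic filter and hand back every match."""
    for block in blocks:
        for value in block:
            if keep(value):
                yield value


def pushdown_aggregate(blocks: List[List[float]],
                       keep: Callable[[float], bool]) -> AggregateResult:
    """Pushdown: fold the matches inside the storage layer."""
    count, total = 0, 0.0
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    for value in scan_rows(blocks, keep):
        count += 1
        total += value
        minimum = value if minimum is None else min(minimum, value)
        maximum = value if maximum is None else max(maximum, value)
    return AggregateResult(count, total, minimum, maximum)


# Hypothetical column data split into blocks.
blocks = [[1.0, 5.0, 9.0], [2.0, 8.0], [4.0]]
matching = list(scan_rows(blocks, lambda v: v > 3))    # every match is returned
summary = pushdown_aggregate(blocks, lambda v: v > 3)  # one summary is returned
```

In this sketch the caller receives one small summary object instead of every matching value, which is where the reduction in transferred data would come from.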

Since you asked about it, do you have other ideas on how it should work?

thanks
Murat

Karri Niemelä

Feb 21, 2016, 3:00:58 PM
to cstore users
Hi!

SIMD/vector instructions. I guess you have already done some nice experiments: https://github.com/citusdata/postgres_vectorization_test

Kun Yang

Mar 30, 2017, 11:56:38 PM
to cstore users
indexing, join pushdown, backup/restore facility

On Tuesday, February 9, 2016 at 5:02:16 PM UTC+8, Murat Tuncer wrote:

Luqman Syauqi Hidayat

Apr 6, 2017, 7:42:02 AM
to cstore users
Delete/update rows, and indexing.

Thank you.

Balazs Gunics

Apr 6, 2017, 7:48:42 AM
to Luqman Syauqi Hidayat, cstore users
If we create a third file that stores the invalid records, then update/delete can be implemented easily.
Delete would mean adding the internal ID (does it use CTID?) to this invalid-records file.
Update would be converted to a delete + insert operation.

On read, all matching CTIDs need to be filtered out, as they are invalid.

This would also require a mechanism to "reorganize" or "rebuild" the files where we would physically delete the invalid records from the main file and truncate the invalid records file.
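
A minimal sketch of this scheme, as a toy Python model rather than anything based on the actual cstore_fdw file format (the class, its methods, and the use of a list position as the row ID are all assumptions for illustration):

```python
from typing import Iterator, List, Set, Tuple

Row = Tuple[str, int]


class AppendOnlyTable:
    """Toy model: an append-only main file plus a set of invalid row IDs."""

    def __init__(self) -> None:
        self.rows: List[Row] = []       # stands in for the main data file
        self.invalid: Set[int] = set()  # stands in for the invalid-records file

    def insert(self, row: Row) -> int:
        self.rows.append(row)
        return len(self.rows) - 1       # row ID = position in the main file

    def delete(self, row_id: int) -> None:
        self.invalid.add(row_id)        # the main file is not rewritten

    def update(self, row_id: int, new_row: Row) -> int:
        self.delete(row_id)             # update = delete + insert
        return self.insert(new_row)

    def scan(self) -> Iterator[Row]:
        for row_id, row in enumerate(self.rows):
            if row_id not in self.invalid:  # filter invalid IDs on read
                yield row

    def rebuild(self) -> None:
        """Physically drop invalid rows and truncate the invalid set."""
        self.rows = list(self.scan())
        self.invalid.clear()


table = AppendOnlyTable()
a = table.insert(("a", 1))
b = table.insert(("b", 2))
table.delete(a)
table.update(b, ("b", 20))
visible = list(table.scan())  # only the live rows are visible
table.rebuild()               # compacts the main file
```

Note that in this toy model row IDs change after `rebuild()`, so anything that stored an old ID would have to be refreshed; a real implementation would need to decide how to handle that.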

This wouldn't mean you can use cstore as a regular table, but it would at least let you delete some records.


However, based on how this works, and given that single-record inserts are not supported, I feel the architecture of cstore may not be ready for this.


--
You received this message because you are subscribed to the Google Groups "cstore users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cstore-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Balázs Gunics
Product Manager - ETL specialist - Starschema Ltd.
Mobile: +36 30 657 4744
Email:  gun...@starschema.net    Web: www.starschema.net

Murat Tuncer

Apr 7, 2017, 5:04:57 AM
to cstore users, luqm...@gmail.com
So far we have always positioned cstore_fdw for bulk-loading and append-only use cases.

Because of how the data is laid out, implementing delete/update and single-row inserts would cause serious data fragmentation and read performance issues at the moment.
Implementing a 'vacuum' feature for cstore_fdw would eliminate or reduce these problems.



spi...@takeit.se

Apr 11, 2017, 12:50:40 PM
to cstore users
Backup/restore and streaming replication support (WAL-stored data)

It's our blocker right now.

Balazs Gunics

Apr 12, 2017, 6:44:25 AM
to spi...@takeit.se, cstore users
I have a query that crashes our Postgres, so I wanted to get cstore_fdw onto our Windows box, but it seems it is not supported.

Would it be possible to add support for it?


Murat Tuncer

Apr 12, 2017, 9:12:35 AM
to Balazs Gunics, spi...@takeit.se, cstore users
Hey Balazs,

Unfortunately, we are not thinking about adding Windows support at this time. I haven't checked whether it is possible at all.

Murat

pr...@growthintel.com

Apr 20, 2017, 6:03:07 AM
to cstore users, gun...@starschema.net, spi...@takeit.se
Hi Murat,

I am specifically interested in the ability to add or update an entire column of data in an existing table (i.e. updating all the rows for a particular column).

At the moment this is a bit of a blocker for us adopting cstore_fdw.

Any thoughts on when the next release might be and whether it will include these features?

Thanks




Murat Tuncer

Apr 20, 2017, 9:14:04 AM
to pr...@growthintel.com, cstore users, Balazs Gunics, spi...@takeit.se
Hey Prash

Could you explain your use case a bit more?
Is it "you have an existing table, you want to add a new column and populate its values from an external source", or do you want a newly added column to have a single value in all rows?

The first one is currently not planned and has not been considered. However, you can achieve the latter by setting a default value on the newly added column.

Updating a whole column's values is currently not planned and is very unlikely for the upcoming release. I think that should be evaluated together with the arbitrary row update/delete item.

We are in the final stages of roadmap preparation. I will share it here soon for another round of feedback.

Murat



Adam Scott

Apr 20, 2017, 12:53:38 PM
to cstore users
Having just one function-based index would help performance for us.

The skip indexes have been working wonderfully though.

Just some feedback for our purposes.

Great work!

Thank you,
Adam

Prashant Majmudar

Apr 21, 2017, 9:02:53 AM
to Murat Tuncer, cstore users, Balazs Gunics, spi...@takeit.se
Hi Murat,

Thanks for getting back to me.

Yes, my use case is "I have an existing table, I want to add a new column and populate its values from an external source".

It is also:

"I have an existing table, I want to update all the values in one column with new values from an external source"

At the moment, I can achieve this by creating a brand new table and using an insert-select from the old table and a temporary staging table, but I don't think that is scalable.
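
The shape of that workaround can be sketched as follows. This is only an illustration: it uses an in-memory SQLite database from Python as a stand-in rather than cstore_fdw, and the table and column names (`old_table`, `staging`, `new_table`, `score`) are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the real database; this is not cstore_fdw.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (id INTEGER PRIMARY KEY, name TEXT, score INTEGER);
    CREATE TABLE staging   (id INTEGER PRIMARY KEY, score INTEGER);
    CREATE TABLE new_table (id INTEGER PRIMARY KEY, name TEXT, score INTEGER);

    INSERT INTO old_table VALUES (1, 'a', 10), (2, 'b', 20), (3, 'c', 30);
    -- New values for one column, loaded from the external source.
    INSERT INTO staging VALUES (1, 11), (3, 33);

    -- Rebuild the whole table: take the staged value where one exists,
    -- otherwise keep the old value.
    INSERT INTO new_table (id, name, score)
    SELECT o.id, o.name, COALESCE(s.score, o.score)
    FROM old_table AS o
    LEFT JOIN staging AS s ON s.id = o.id;
""")

rows = conn.execute(
    "SELECT id, name, score FROM new_table ORDER BY id"
).fetchall()
```

Every row and every column is rewritten even though only one column changes, which is why this approach gets expensive as the table grows.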

If I can achieve the above use cases by updating all the rows, that is OK, as long as it is not prohibitively slow. For example, I will have ~3 million rows with hundreds of columns, and I want to update a subset of the columns.

It'd be great to see some ability to update in the next release.

Thanks,

Prash


--
Prash Majmudar / CTO
DDI: 0203 668 3666  M: 07789 073 478

Net.Works, 25-27 Horsell Road, Highbury, London, N5 1XL