Hello:
Apologies in advanced for the n00b questions. I am looking for general discussion and guidance on this topic.
I have about 15 years of daily close prices for 10,000+ products, and 14 columns wide. The schema looks somewhat like:
date
numeric identifier
close price
other prices...
volume...
etc...
I am looking for general guidance on moving this table from standard PostgreSQL (actually running in AWS RDS) to a locally run (on an AWS EC2 machine) PosgreSQL 14 with the Citus extension.
Data is typically queried as "select date, close price from table where numeric_identifier=X order by date"
Additionally, new daily data is appended every day to reflect new market close prices.
Some questions I have:
1. Some documentation indicates that the data should be sorted when loaded. How exactly does this work? I'm a long-time user of KDB - which has the concept of a "sort" attribute on a column. There seems to be no equivalent in PostgreSQL or cstore.
2. If there are incremental updates to the data, daily appends for 1 day and 10k+ identifiers, any considerations should be made w.r.t sorting data?
3. The "order by" clause seems to take an order of magnitude longer on a COLUMNAR store test table than on my standard row-oriented table, controlling for as many variables as I can. EXPLAIN on the same query on both databases concurs with this observation. I suppose the standard indices are not being used on the columnar table. Is that right?
thanks