Suggestions for fast retrieval of financial time-series, daily close prices

38 views
Skip to first unread message

Emil Tarazi

unread,
Nov 7, 2021, 10:08:49 AM11/7/21
to cstore users
Hello: 

Apologies in advanced for the n00b questions. I am looking for general discussion and guidance on this topic. 

I have about 15 years of daily close prices for 10,000+ products, and 14 columns wide. The schema looks somewhat like:

date
numeric identifier
close price
other prices...
volume...
etc...

I am looking for general guidance on moving this table from standard PostgreSQL (actually running in AWS RDS) to a locally run (on an AWS EC2 machine) PosgreSQL 14 with the Citus extension. 

Data is typically queried as "select date, close price from table where numeric_identifier=X order by date"

Additionally, new daily data is appended every day to reflect new market close prices. 

Some questions I have: 
1. Some documentation indicates that the data should be sorted when loaded. How exactly does this work? I'm a long-time user of KDB - which has the concept of a "sort" attribute on a column. There seems to be no equivalent in PostgreSQL or cstore.
2. If there are incremental updates to the data, daily appends for 1 day and 10k+ identifiers, any considerations should be made w.r.t sorting data? 
3. The "order by" clause seems to take an order of magnitude longer on a COLUMNAR store test table than on my standard row-oriented table, controlling for as many variables as I can. EXPLAIN on the same query on both databases concurs with this observation. I suppose the standard indices are not being used on the columnar table. Is that right? 

thanks

Reply all
Reply to author
Forward
0 new messages