How to calculate correlation every few rows

74 views
Skip to first unread message

Emily Wilson

unread,
Apr 6, 2021, 8:51:35 AM4/6/21
to Kdb+ Personal Developers
Hi All,

I have a table which looks like this:
num  sym1       sym2    sym3.    sym4
-----------------------------------------------------------------
1         1.5           -0.1.       9.4.            3.1
2         2.5          - 0.9        3.1             0.5
3         5.4            2.9.       -0.3.           1.0
4         0.5            0.1.       -0.4.           1.4
5         0.3            0.0         3.6.            0.4

How can I calculate the correlation between sym1 and sym2, sym1 and sym3, sym1 and sym4 for every 3 rows? 
(i.e, the correlation between sym1 row 1-3 and sym2/sym3/sym4 row 1-3, 
the correlation between sym1 row 2-4 and sym2/3/4 row 2-4,
the correlation between sym1 row 3-5 and sym2/3/4 row 3-5)

Any help will be very much appreciated!! 

András Dőtsch

unread,
Apr 6, 2021, 10:15:55 AM4/6/21
to personal...@googlegroups.com
One way to do it:

select num, s1 cor' s2, s1 cor' s3, s1 cor' s4 from -2 _ select num, s1:sym1 i+\:0 1 2 ,s2:sym2 i+\:0 1 2 ,s3:sym3 i+\:0 1 2 ,s4:sym4 i+\:0 1 2 from t

Regards,
Andras


--
You received this message because you are subscribed to the Google Groups "Kdb+ Personal Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to personal-kdbpl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/personal-kdbplus/d40f838e-f003-40a5-9efd-28978b5aa645n%40googlegroups.com.

Matthew Clark

unread,
Apr 6, 2021, 10:16:01 AM4/6/21
to Kdb+ Personal Developers
Hi Emily, 

I can refer you to this link here

Programming idioms – Knowledge Base – kdb+ and q documentation - Kdb+ and q documentation (kx.com)

which talks about sliding windows to aggregate over (primarily for monadic functions). I had a quick play around and have got a code that works based on some of this. However, I am sure you are likely to find some improvements to my code. Nevertheless, I am sure that this is enough to get you started.

// function to print out sliding window values
swin:{[w;s]{ 1_x,y }\[w#0;s]}

// function to calculate correlation between two columns from table
cf:{[w;table;syms]cor'[(w-1)_swin[w;(flip table)[syms[0]]];(w-1)_swin[w;(flip table)[syms[1]]]]}

You may wish to choose a better way to format the output, but here is some examples of how the output looks:

// Rolling correlation between sym1 and sym2 (for rows 1-3, 2-4 and 3-5)
q)cf[3;table;(`sym1`sym2)]
0.9002607 0.7809604 0.999991

// Rolling correlation for sym1 with each of sym2, sym3 and sym4 (for rows 1-3, 2-4 and 3-5)
q)cf[3;table]'[`sym1,'`sym2`sym3`sym4]
0.9002607  0.7809604   0.999991
-0.9025613 -0.08046509 -0.5109631
-0.5570187 -0.346547   0.1490308

Hope this gives you a good place to start.

All the best,
Matthew

Emily Wilson

unread,
Apr 11, 2021, 10:46:28 AM4/11/21
to Kdb+ Personal Developers
Thank you Andras!

Emily Wilson

unread,
Apr 11, 2021, 10:46:34 AM4/11/21
to Kdb+ Personal Developers
Thank you so much Matthew! I will try it out!
Reply all
Reply to author
Forward
0 new messages