Are there linear regression algorithms written in the q language?

afancy

Dec 17, 2013, 8:47:52 AM
to personal...@googlegroups.com
I am looking for linear regression and k-means implementations for kdb+, but I cannot find any.
I guess someone must have implemented them already. I would appreciate it if someone could share them with me. Thanks a lot.

Paweł Tryfon

Dec 17, 2013, 9:06:54 AM
to personal...@googlegroups.com
You can try using the lsq function (http://code.kx.com/wiki/Reference/lsq) to implement linear regression. It should work well as long as the number of variables in your problem is not too large. For high-dimensional problems, it is usually better to implement linear regression with gradient descent or another function-minimization algorithm.
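As a rough sketch of both suggestions, assuming X is passed as a list of regressor vectors (one vector per variable, with a vector of ones for the intercept). The names ols, gdstep and gd are hypothetical and chosen only for this illustration; the step size and iteration count are assumptions that would need tuning on real data.

```q
/ ols[y;X]: least-squares coefficients; X is a list of regressor vectors
ols:{[y;X] first enlist["f"$y] lsq "f"$X}

/ gdstep[a;y;X;b]: one gradient-descent step on mean squared error with step size a
gdstep:{[a;y;X;b] b+a*(X mmu y-b mmu X)%count y}
/ gd[a;k;y;X]: k steps starting from all-zero coefficients
gd:{[a;k;y;X] gdstep[a;y;X]/[k;count[X]#0f]}

x:1 2 3 4 5f
y:1+2*x                     / exact line, intercept 1 and slope 2
X:(count[x]#1f;x)           / constant regressor first
ols[y;X]                    / closed-form answer via lsq
gd[0.05;5000;y;X]           / should approach the same coefficients
```

The lsq version solves the problem in one call, while the gradient-descent version only needs matrix-vector products, which is why it can be preferable when the number of variables is large. If the step size is too large for the data, the iteration diverges instead of converging.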

As for the k-means algorithm, I don't have an implementation in q, but I recently wrote one in Octave. I will translate it and post it somewhere once I have some free time, probably during the New Year's break.

Thanks,
Pawel


--
You received this message because you are subscribed to the Google Groups "Kdb+ Personal Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to personal-kdbpl...@googlegroups.com.
To post to this group, send email to personal...@googlegroups.com.
Visit this group at http://groups.google.com/group/personal-kdbplus.
For more options, visit https://groups.google.com/groups/opt_out.

Kim Kuen Tang

Dec 17, 2013, 9:20:05 AM
to personal...@googlegroups.com

Yes, there is one available from Andrey Zholos in qml:

 

/ linreg[y;X]: Performs linear regression of y (vector) on X (list of vectors).
/   This function computes the least squares estimates of parameters and
/   covariance matrix and then calls linregtests to compute test statistics.
/.
/   e.g. exec linreg[price;(1.;sign;quantity)] from trades  / 1. for constant
/.
/   Returns dictionary:
/     `X     = X (list of row vectors)
/     `y     = y (vector)
/     `S     = covariance matrix
/     `b     = parameter estimates
/     `e     = residuals
/     `n     = number of observations
/     `m     = number of parameters
/     `df    = degrees of freedom

linreg1:{[y;X]
    if[any[null y:"f"$y]|any{any null x}'[X:"f"$X];'`nulls];
    if[$[0=m:count X;1;m>n:count X:flip X];'`length];
    Z:.qml.minv[flip[X]mmu X];
    ZZ:X mmu Z mmu flip[X];
    e:y- yhat:X mmu beta:Z mmu flip[X] mmu y;
    linregtests1 ``X`y`S`beta`e`n`m`df`ZZ`Z`yhat!(::;X;y;Z*mmu[e;e]%n-m;beta;e;n;m;n-m;ZZ;Z;yhat)};

 

If you don't want to rely on qml, replace .qml.minv with the built-in inv, or compute the coefficients directly with lsq.
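A minimal sketch of that substitution, assuming X holds one row per observation as it does inside linreg1 after the flip. The sample data and the name beta2 are made up for illustration.

```q
/ X: one row per observation (constant; x), as inside linreg1 after the flip
X:flip(4#1f;1 2 3 4f)
y:3 5 7 9f
Z:inv flip[X] mmu X               / built-in matrix inverse in place of .qml.minv
beta:Z mmu flip[X] mmu y          / same expression as in linreg1
beta2:first enlist[y] lsq flip X  / alternative: let lsq solve for the coefficients
```

Both beta and beta2 should agree up to floating-point error. Note that lsq takes the regressors as a list of vectors, hence the flip back.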

 

The linregtests1 function computes diagnostic statistics. Its implementation depends heavily on qml (.qml.stcdf and .qml.fcdf are used for the p-values).

 

/ linregtests[R]: Perform linear regression tests on a set of estimation
/   results. This function is called automatically by linreg, but can be called
/   again, for example, if the covariance matrix is adjusted. None of the values
/   returned by linreg are recalculated, in particular, if b is adjusted, e
/   needs to be recalculated.
/.
/   Updates R dictionary with:
/     `se    = standard error of estimates vector
/     `tstat = vector of t-statistics
/     `tpval = vector of p-values for t-test
/     `rss   = sum of squared residuals
/     `tss   = total sum of squares
/     `r2    = R-squared statistic
/     `r2adj = adjusted R-squared
/     `fstat = f-statistic
/     `fpval = p-value for f-test

linregtests1:{[R]
    tstat:R[`beta]%se:sqrt R[`S]@'til count R`S;
    fstat:(R[`df]*rss-tss:{x mmu x}R[`y]-+/[R`y]%R`n)%(1-R`m)*rss:e mmu e:R`e;
    R,m:`se`tstat`tpval`rss`tss`r2`r2adj`fstat`fpval!(se;tstat;
        2*1-R[`df] .qml.stcdf/:abs tstat;rss;tss;1-rss%tss;
        1-(rss*-1+R`n)%tss*R`df;fstat;1-.qml.fcdf[-1+R`m;R`df;fstat])};

 

If you don't need these diagnostics, remove the linregtests1 call from linreg1 and return the dictionary directly.
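For readers without qml, the statistics that do not require a t or F distribution can still be computed with built-in q. The following is only a sketch: linregstats is a hypothetical name, the sample data is made up, and the p-values are left out because they need the CDFs that qml provides.

```q
/ linregstats[R]: qml-free subset of the diagnostics; expects `S`beta`e`y in R
linregstats:{[R]
    se:sqrt R[`S]@'til count R`S;       / square roots of the covariance diagonal
    rss:e mmu e:R`e;                    / residual sum of squares
    tss:{x mmu x}R[`y]-avg R`y;         / total sum of squares about the mean
    R,`se`tstat`rss`tss`r2!(se;R[`beta]%se;rss;tss;1-rss%tss)};

X:flip(5#1f;1 2 3 4 5f)                 / one row per observation: constant and x
y:3.1 4.9 7.2 8.8 11.1
Z:inv flip[X] mmu X
e:y-X mmu beta:Z mmu flip[X] mmu y
R:`S`beta`e`y!(Z*(e mmu e)%count[y]-count beta;beta;e;y)
linregstats R
```

The dictionary R is built here by hand with the same key names that linreg1 uses, so the function could also be applied to a linreg1-style result.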

 

Kim


Zak Oudrhiri

Feb 14, 2014, 12:12:46 PM
to personal...@googlegroups.com
I might have a k-means implementation I've been fooling around with if you're still interested.

dmitrijs...@gmail.com

Feb 12, 2017, 1:40:31 PM
to Kdb+ Personal Developers

A polynomial fit with lsq, using y = -3 - 2x + x^2 as the test case:

q)x:1.0 2.0 3.0 4.0
q)y:-4.0 -3.0 0.0 5.0
q)fit:{(enlist y) lsq x xexp/: til 1+z}
q)fit[x;y;2]
-3 -2 1
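One way to sanity-check such a fit is to evaluate the fitted polynomial at the original x values and look at the residuals. A short sketch, where c and yhat are names introduced only for this example:

```q
x:1 2 3 4f
y:-4 -3 0 5f
fit:{(enlist y) lsq x xexp/: til 1+z}   / as posted: z is the polynomial degree
c:first fit[x;y;2]                      / coefficients, lowest power first
yhat:c mmu x xexp/: til count c         / evaluate the polynomial at each x
max abs y-yhat                          / largest residual; should be close to 0
```

Since the sample points lie exactly on the quadratic, the residuals should vanish up to floating-point error. With noisy data they would not, and a lower or higher degree z can be compared the same way.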

Krishna Kumar

Feb 12, 2017, 10:40:20 PM
to personal...@googlegroups.com
Linear regression - https://github.com/krish240574/kaggle-deandecock - linear regression applied to Kaggle's House Prices competition (predicting housing prices from Dean De Cock's Ames housing dataset - https://www.kaggle.com/c/house-prices-advanced-regression-techniques )
K-means - https://github.com/jlas/ml.q/blob/master/ml.q - a collection of useful algorithms implemented in q.

Kumar

