kalman_loglik, fast recursion for ARMA

Robert

Jun 7, 2011, 1:18:16 PM
to pystatsmodels
Hey All,

I've been trying to use the scikits ARMA, and I was hoping to use the
"fast recursion" speed-up in version 0.3 that was discussed in:

http://old.nabble.com/Re%3A-fast-small-matrix-multiplication-with-cython--p30421098.html

However, it doesn't seem like this was released.

I was wondering if this code is available somewhere to be used.

Thanks,
Robert

Skipper Seabold

Jun 7, 2011, 1:27:46 PM
to pystat...@googlegroups.com

Hi,

This should be in the most recent (.3) release (candidate), but it's
not the default for the build. You must install from source, and do

python setup.py build --with-cython

Then

python setup.py install

Please let us know if you don't see a significant speed-up. If the
model is correctly specified, you should see a much faster
convergence. In future releases, this with-cython flag won't be
necessary.
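
A minimal sketch for checking, after the build, whether the compiled extension can actually be imported. The module path below is an assumption derived from the extension name in the setup.py snippet quoted later in this thread, and has_extension is a hypothetical helper, not part of statsmodels.

```python
import importlib


def has_extension(module_name):
    """Return True if module_name can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Assumed module path; adjust it if your install lays things out differently.
EXT = "scikits.statsmodels.tsa.kalmanf.kalman_loglike"
print("Cython Kalman filter available:", has_extension(EXT))
```

If this prints False after building with --with-cython, the extension was probably not compiled or not installed.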

Skipper

josef...@gmail.com

Jun 7, 2011, 1:32:30 PM
to pystat...@googlegroups.com
On Tue, Jun 7, 2011 at 1:18 PM, Robert <phu...@gmail.com> wrote:

The code is (supposed to be) included in the distribution, but
statsmodels needs to be built with --with-cython as a command-line
argument. (Cython needs to be installed and a C compiler needs to be
available.)

I need to find the documentation for this since I haven't used it
myself. And I think we will need to add it more prominently.

Josef


Robert

Jun 7, 2011, 1:42:25 PM
to pystatsmodels
Hi!

Thanks for the quick responses.

The issue is that when you build with --with-cython, setup.py looks for
kalman_loglike.c, but that file isn't included:

    if compile_cython:
        config.add_extension('tsa/kalmanf/kalman_loglike',
                sources=['scikits/statsmodels/tsa/kalmanf/kalman_loglike.c'],
                include_dirs=[numpy.get_include()])

This is from the setup.py in:

http://pypi.python.org/packages/source/s/scikits.statsmodels/scikits.statsmodels-0.3.0rc1.zip#md5=6c6e0ddcc404bbf9518b9ee670cda54c
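
A small sketch for confirming which source files a setup.py expects but an unpacked source tree lacks. missing_sources is a hypothetical helper; the path is the one from the snippet above, and the check assumes it is run from the top of the unpacked archive.

```python
import os


def missing_sources(sources, root="."):
    """Return the entries of `sources` that do not exist under `root`."""
    return [src for src in sources
            if not os.path.isfile(os.path.join(root, src))]


# Path copied from the setup.py snippet above.
sources = ["scikits/statsmodels/tsa/kalmanf/kalman_loglike.c"]
for path in missing_sources(sources):
    print("missing from source distribution:", path)
```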




josef...@gmail.com

Jun 7, 2011, 1:44:02 PM
to pystat...@googlegroups.com
On Tue, Jun 7, 2011 at 1:27 PM, Skipper Seabold <jsse...@gmail.com> wrote:
> On Tue, Jun 7, 2011 at 1:18 PM, Robert <phu...@gmail.com> wrote:
>> Hey All,
>>
>> I've been trying to use the scikits ARMA but I was hoping to use the
>> "fast recursion" speed up in version .3. talked about in:
>>
>> http://old.nabble.com/Re%3A-fast-small-matrix-multiplication-with-cython--p30421098.html
>>
>> However it doesn't seem like this was released.
>>
>> I was wondering if this code is available somewhere to be used.
>>
>
> Hi,
>
> This should be in the most recent (.3) release (candidate), but it's
> not the default for the build. You must install from source, and do
>
> python setup.py build --with-cython
>
> Then
>
> python setup.py install

I don't find it anywhere in the documentation. We should add these
instructions to the front page.

Is there a specific Cython version required, or should we include the
generated C code?

Josef

Skipper Seabold

Jun 7, 2011, 1:46:48 PM
to pystat...@googlegroups.com
On Tue, Jun 7, 2011 at 1:42 PM, Robert <phu...@gmail.com> wrote:
> hi!
>
> Thanks for the quick responses,
>
> the issue is that when you do --with-cython it looks for kalman_loglik
> but it isn't included
>
>    if compile_cython:
>        config.add_extension('tsa/kalmanf/kalman_loglike',
>                sources=['scikits/statsmodels/tsa/kalmanf/kalman_loglike.c'],
>                include_dirs=[numpy.get_include()])
>
> from the file in:
>
> http://pypi.python.org/packages/source/s/scikits.statsmodels/scikits.statsmodels-0.3.0rc1.zip#md5=6c6e0ddcc404bbf9518b9ee670cda54c
>

Ah, ok. Thanks for the report.

Skipper

Robert

Jun 7, 2011, 1:47:30 PM
to pystatsmodels
I'm not 100% sure; I was just looking at the setup.py included in that
zip file, and it tries to get the .c file if you use the Cython flag.

I figure you need to include the Cython file.

I do have Cython etc. installed.


Skipper Seabold

Jun 7, 2011, 1:50:04 PM
to pystat...@googlegroups.com
On Tue, Jun 7, 2011 at 1:44 PM, <josef...@gmail.com> wrote:
> On Tue, Jun 7, 2011 at 1:27 PM, Skipper Seabold <jsse...@gmail.com> wrote:
>> On Tue, Jun 7, 2011 at 1:18 PM, Robert <phu...@gmail.com> wrote:
>>> Hey All,
>>>
>>> I've been trying to use the scikits ARMA but I was hoping to use the
>>> "fast recursion" speed up in version .3. talked about in:
>>>
>>> http://old.nabble.com/Re%3A-fast-small-matrix-multiplication-with-cython--p30421098.html
>>>
>>> However it doesn't seem like this was released.
>>>
>>> I was wondering if this code is available somewhere to be used.
>>>
>>
>> Hi,
>>
>> This should be in the most recent (.3) release (candidate), but it's
>> not the default for the build. You must install from source, and do
>>
>> python setup.py build --with-cython
>>
>> Then
>>
>> python setup.py install
>
> I don't find it anywhere in the documentation. We should add these
> instructions to the front page.
>
> Is there a specific cython version required, or should we include the
> generated c code ?
>

I think we should include both, though strictly speaking I don't think
the .pyx file is needed.

Skipper

josef...@gmail.com

Jun 7, 2011, 1:51:00 PM
to pystat...@googlegroups.com

Miscommunication here.
Since I never used it and didn't look carefully enough, I didn't
realize that setup.py requires that we ship the .c file.

Either we need to generate the C file for the source distribution, or
require Cython to be installed and use the .pyx file in setup.py.
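
The two options can be combined into a fallback. This is only a sketch under assumptions: pick_source and cython_installed are hypothetical helpers, and how the chosen file is then passed to the extension build is left out.

```python
import os


def cython_installed():
    """Return True if the Cython package can be imported."""
    try:
        import Cython  # noqa: F401
    except ImportError:
        return False
    return True


def pick_source(base, cython_available):
    """Choose which file to hand to the extension build.

    Prefer a shipped .c file; fall back to the .pyx only if Cython is
    available; return None if neither option works.
    """
    c_file = base + ".c"
    pyx_file = base + ".pyx"
    if os.path.isfile(c_file):
        return c_file
    if cython_available and os.path.isfile(pyx_file):
        return pyx_file
    return None


base = "scikits/statsmodels/tsa/kalmanf/kalman_loglike"
print("source to build from:", pick_source(base, cython_installed()))
```

Preferring the shipped .c file means users without Cython can still build, as long as they have a C compiler.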

Josef


Skipper Seabold

Jun 7, 2011, 1:52:00 PM
to pystat...@googlegroups.com
> On Jun 7, 12:44 pm, josef.p...@gmail.com wrote:
>> On Tue, Jun 7, 2011 at 1:27 PM, Skipper Seabold <jsseab...@gmail.com> wrote:
>> > Hi,
>>
>> > This should be in the most recent (.3) release (candidate), but it's
>> > not the default for the build. You must install from source, and do
>>
>> > python setup.py build --with-cython
>>
>> > Then
>>
>> > python setup.py install
>>
>> I don't find it anywhere in the documentation. We should add these
>> instructions to the front page.
>>

It's mentioned in CHANGES, but should probably be in the INSTALL notes as well.

Skipper

josef...@gmail.com

Jun 7, 2011, 2:00:29 PM
to pystat...@googlegroups.com

Sorry, I'm talking too fast; the C file is just missing from the manifest.

The file is in the source repository:
http://bazaar.launchpad.net/~scipystats/statsmodels/devel/files/head:/scikits/statsmodels/tsa/kalmanf/

I think you should be able to just download the C file and put it into
your statsmodels.

Thanks for reporting. We will get a new release out, and I will check
more carefully that the source distribution has all the files.

Josef



kalman_loglike.c

josef...@gmail.com

Jun 7, 2011, 2:07:44 PM
to pystat...@googlegroups.com

Yes, I think we just need to add *.pyx and *.c to the global-include
line of MANIFEST.in.

Josef


josef...@gmail.com

Jun 7, 2011, 2:28:28 PM
to pystat...@googlegroups.com

Adding *.py, *.pyx, and *.c to MANIFEST.in takes care of this and also the
missing examples directory that Wes reported:

global-include *.csv *.py *.txt *.pyx *.c
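
One way to catch this kind of omission before a release is to inspect the built archive. A minimal sketch, assuming the source distribution is a zip file; patterns_missing_from_zip and the archive name in the usage note are hypothetical.

```python
import fnmatch
import zipfile


def patterns_missing_from_zip(archive_path, patterns):
    """Return the patterns that match no file name inside the zip archive."""
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()
    return [pat for pat in patterns
            if not any(fnmatch.fnmatch(name, pat) for name in names)]
```

For example, patterns_missing_from_zip("dist/some-sdist.zip", ["*.pyx", "*.c"]) should return an empty list once the manifest includes both file types.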

Josef


josef...@gmail.com

Jun 7, 2011, 2:34:34 PM
to pystat...@googlegroups.com

A question for Cython experts:

The same .c file should be good on all Python 2.5, 2.6, and 2.7 versions,
but for Python 3.2 we have to rebuild the C source file from the .pyx.
Is that correct?

Josef

Robert

Jun 7, 2011, 3:33:12 PM
to pystatsmodels
Thank you all for the quick responses.

Investigating the Cython code that I found in your repository, it seems
that this is just rewriting the likelihood in Cython.

http://old.nabble.com/Re%3A-fast-small-matrix-multiplication-with-cython--p30421098.html

How did you manage to get the likelihood function called less
frequently (in the last example you have here, 18 calls of loglikelihood
vs. 90 calls)? Was this a result of choosing the optimizer more
effectively, or mainly just of moving the likelihood into Cython?

(Just trying to understand which optimizations you ended up using that
made a difference, so I can consider potential areas where I can also
work on improving the speed :))
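
To compare how often different optimizer settings evaluate the likelihood, the function can be wrapped in a counter. A minimal sketch; count_calls and the toy objective are hypothetical and not statsmodels code.

```python
import functools


def count_calls(func):
    """Wrap func so that the number of calls is kept in wrapper.calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def toy_negative_loglike(x):
    # Stand-in objective; a real likelihood would take the ARMA params.
    return (x - 2.0) ** 2


for candidate in (0.0, 1.0, 2.0):
    toy_negative_loglike(candidate)
print("likelihood evaluations:", toy_negative_loglike.calls)
```

Passing the wrapped function to an optimizer and reading .calls afterwards gives the evaluation count without changing the optimizer.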

Also, would you happen to know why R's implementation of arima is so
dang fast? Are they just not using the Kalman filter approach?
Unfortunately, their code isn't even remotely documented.

-Rob


Ralf Gommers

Jun 7, 2011, 3:45:48 PM
to pystat...@googlegroups.com
Not that I'm an expert, but that's certainly not necessary. Cython-generated C files are compatible with at least Python 2.4 - 3.x.

Cheers,
Ralf

josef...@gmail.com

Jun 7, 2011, 3:59:04 PM
to pystat...@googlegroups.com

Thanks Ralf, one worry less

Josef

Skipper Seabold

Jun 7, 2011, 4:18:58 PM
to pystat...@googlegroups.com
On Tue, Jun 7, 2011 at 3:33 PM, Robert <phu...@gmail.com> wrote:
> Thank you all for the quick responses,
>
> Investigating the cython code that I found in your repository it seems
> that this is just rewriting the likelihood to be in cython.
>
> http://old.nabble.com/Re%3A-fast-small-matrix-multiplication-with-cython--p30421098.html
>
> how did you manage to get the likelihood function called less
> frequently (in the last example you have here 18 calls of loglikelihood
> vs 90 calls). Was this a result of more effectively choosing the
> optimizer? Or mainly just moving to putting the likelihood in cython.
>
> (Just trying to understand what optimizations you ended up using that
> made a difference so I can consider potential areas where I can also
> work on improving the speed:))
>

Briefly (in the middle of studying), the biggest savings come from
taking advantage of the steady state in the (time-invariant) Kalman
filter. Once P has converged, we can stop updating it, which means we
can skip calling numpy.dot (a lot). I found convergence to the steady
state to happen in ~10-15 loops on average, rather than re-estimating P
nobs times at each candidate for params.
See Durbin and Koopman's book for the details.
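
The steady-state idea can be sketched on a scalar model. This is not the statsmodels Cython code: it assumes an AR(1) state observed with noise, uses the prediction-error recursions in Durbin and Koopman's notation (v, F, K, P), and the function name, parameters, and tolerance are all made up for illustration.

```python
import math


def loglike_ar1_noise(y, phi, sigma2_eta, sigma2_eps,
                      tol=1e-12, use_steady_state=True):
    """Gaussian log-likelihood of y_t = a_t + eps_t, a_{t+1} = phi*a_t + eta_t.

    Returns (loglike, n_updates), where n_updates counts how many times
    the variance recursion for P was actually evaluated.
    """
    a = 0.0                                  # predicted state mean
    P = sigma2_eta / (1.0 - phi * phi)       # stationary state variance
    F = K = None
    steady = False
    n_updates = 0
    ll = 0.0
    for obs in y:
        if not steady:
            F = P + sigma2_eps               # prediction error variance
            K = phi * P / F                  # Kalman gain
            P_next = phi * P * (phi - K) + sigma2_eta
            n_updates += 1
            # Once P stops changing, F and K are constant too: freeze them.
            if use_steady_state and abs(P_next - P) < tol:
                steady = True
            P = P_next
        v = obs - a                          # one-step prediction error
        ll += -0.5 * (math.log(2.0 * math.pi) + math.log(F) + v * v / F)
        a = phi * a + K * v
    return ll, n_updates


y = [math.sin(0.3 * t) + 0.5 * math.cos(1.7 * t) for t in range(200)]
ll_fast, n_fast = loglike_ar1_noise(y, 0.5, 1.0, 1.0)
ll_full, n_full = loglike_ar1_noise(y, 0.5, 1.0, 1.0, use_steady_state=False)
print(n_fast, "variance updates with the shortcut vs.", n_full, "without")
```

In the matrix case each skipped update saves several matrix products per observation, which is where the numpy.dot calls go away.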

> Also would you happen to know why R's implementation of arima is so
> dang fast? Are they just not using the kalman filter approach?
> Unfortunately their code isn't even remotely documented.
>

Not in detail, no. I would have to look at the docs again or browse the
code (arima vs. arima0, whether it's actually exact likelihood, how
they determine starting parameters, etc.), but it shouldn't be *that
much* faster than ours with the optimizations (and if you set disp < 0
and pick a good value for m in the optimizer). My guess is that it's all
done in C: setting up the system matrix, and then all calls to the
optimizer and the resulting loops. There also might be savings in the
approximation to the gradient/Hessian, and it probably uses a
different optimization algorithm (maybe, but it's probably full BFGS).
There are definitely a lot more optimizations that could be done to
speed this up (Fernando has provided a C drop-in for dot that I
haven't tried out yet, making sure that all copies are avoided...),
but I just haven't found the time, and I am reasonably happy with the
speed here vs., say, Stata. If you see any improvements, please let us
know. The X-11-arima code would be interesting to go through. It's
public domain, written in Fortran I believe, and very fast (though
limited in the number of observations it can handle).

Skipper

Robert

Jun 7, 2011, 4:53:28 PM
to pystatsmodels
That is a very helpful response.

I appreciate the quick responses, and I'm thinking that maybe I just
missed the fact that you used this steady-state stopping procedure. I
was able to find where you use it, and I'll find a copy of that book to
investigate the algorithm :)

Still new to using Kalman filters with ARMA.

Thanks,
Rob
