Statsmodels has been at version 0.4 for ages, and MANY very helpful additions have been introduced since then. Is there any plan to bring it all together, and get a version 0.5 out?
On Sat, Mar 23, 2013 at 11:40 AM, Skipper Seabold <jsse...@gmail.com> wrote:
Can you give me a better sense of what you have in mind here?
> On Sat, Mar 23, 2013 at 9:34 AM, Thomas Haslwanter
> <thomas.h...@gmail.com> wrote:
>>
>> Statsmodels has been at version 0.4 for ages, and MANY very helpful
>> additions have been introduced since then. Is there any plan to bring it all
>> together, and get a version 0.5 out?
>
>
> I was hoping to be able to find some time at the beginning of the month to
> close a few remaining warts with the 0.5 milestone tag (patches welcome),
> but that didn't happen. Hopefully, ASAP, though I'm pretty well buried at
> the moment. From my end I'll try to close up the things I want to do this
> week. No new features, but a couple of bothersome warts/bugs. Josef,
> thoughts?
>
> We could use a dedicated release manager to pitch in with bug fixes and help
> keep us on schedule. Volunteers welcome.
I clicked on the 0.5 Milestone tag on the issue tracker but got a list
of 84 issues back. Surely we can't fix all of those before release.
At a more general level, do you feel happy with the current release
cycle?
I mean, independently of the amount of code written. I'm asking
because I saw a couple of people mention (on blog posts and
stackoverflow) that the pace of development of SM is really slow.
That's partially a function of manpower, but almost certainly also an
impression that people get because of release structure. For example,
it makes sense to have a 0.5 with the formula framework, but a quick
0.5.1 release with NBin and QuantReg and MosaicPlot would both make
features available quickly and convey a sense that this is an active
project.
Of course, I really don't know what kind of work is involved in doing
releases...
and the online docs are not updating correctly.
(aside: I looked recently into adding cython based lowess in the same
way as the other cython extensions.)
I still don't like fast and sloppy.
>
> Skipper
>
>
>>
>> At a more general level, do you feel happy with the current release
>> cycle.
>
>
> No. I would like to switch to a time-based release structure, though life is
> still often too busy for me to commit to this. It's clear that once a year
> does not cut it. We need to get over our slow and perfect is better than
> fast and sloppy. Sometimes fast and sloppy gets the job done. Since we've
> merged in new features that have been on the TODO list, we've gotten several
> corner case bug reports, which we've been able to fix. This is a good thing.
> We're still nowhere near 1.0.
As a user, I don't like a library where you have to gamble on whether the
numbers are correct (with more than a small probability).
As a developer, it often makes life more difficult down the road.
(I haven't changed my opinion about stats libraries since my early scipy days.)
I'd rather have people complain about the slow pace than have users
complain about buggy results and a library that is not trustworthy.
I haven't read those comments in some time.
It would be easier if we had a fast response and maintenance team,
that consists of more than ... developers.
I've responded to these objections numerous times, I've shown you how to build on Windows, and I've written instructions on how to do this. You do NOT have to rebuild every time you change something. You only have to rebuild if you change things that need compiling.

On the other hand, right now I DO have to rebuild every time, because the way we have the compiler checks set up is broken. If you have a compiler, the Cython extensions get built twice on build and install.

Let's get past this. Performance is going to become more critical as we become better and actually competitive with alternatives. None of my students now want to learn Python. They all ask me about Julia...
If you don't install for development then you don't need to rebuild.
I don't see that the cython extension gets built twice (unless you
tell setuptools/distutils to build it twice.)
All the previous issues that we had turned out to come from other
packages and were unrelated to our way of conditional building.
About performance: we still have some slack, and I don't think going
through pandas and formulas in loops is very "performant" but nobody
complains.
And I don't argue about the fashionable programming language of the day.
(my last and only comment about julia was: no classes and no namespaces,
and I'm only just starting to figure out the dispatch system of R)
On Sat, Mar 23, 2013 at 12:10 PM, <josef...@gmail.com> wrote:
> and the online docs are not updating correctly.
Low priority for me. I have again two presentations on my work on Monday (and not all results) and numpy/scipy currently broken. Haven't forgotten.
On Tue, Mar 26, 2013 at 10:48 AM, Skipper Seabold <jsse...@gmail.com> wrote:
> On Sat, Mar 23, 2013 at 3:32 PM, Skipper Seabold <jsse...@gmail.com>
> wrote:
>>
>> On Sat, Mar 23, 2013 at 12:10 PM, <josef...@gmail.com> wrote:
>>>
>>>
> <snip>
>>>
>>> and the online docs are not updating correctly.
>>
>>
>> Low priority for me. I have again two presentations on my work on Monday
>> (and not all results) and numpy/scipy currently broken. Haven't forgotten.
>>
>
>
> Fixed. Wasn't thinking in how I was using virtualenv in a subprocess. Am
> thinking now.

Thanks, it's nice to see the updated documentation for current master.

>
> http://statsmodels.sourceforge.net/devel/stats.html#basic-statistics-and-t-tests-with-frequency-weights
>
> The power stuff probably needs a new section header?

Initially I wanted to keep it together with the basic statistics and
(parametric) tests, since I started to write the power functions for
those tests. When the power part gets larger, it will need a new
section.