Functional Boxplot

roy.pa...@gmail.com

Aug 11, 2017, 7:46:32 AM
to pystatsmodels
Hi everyone,

I just put a version of the functional Highest Density Region (HDR) boxplot on GitHub:


From what I saw in your doc, this could interest you as it is the only method missing:


For now, it uses scikit-learn for PCA and kernel density estimation; there is also a branch using OpenTURNS.
But you can easily replace this dependency with your own methods.
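
A rough sketch of the idea described above, not the code from the GitHub repository: reduce each curve to a few PCA scores, estimate a density on those scores, take the highest-density curve as the median, and flag the lowest-density curves as outliers. It uses plain NumPy rather than scikit-learn or statsmodels. The function name `hdr_sketch`, the `alpha` cutoff, and the Scott-type bandwidth are assumptions made for illustration only.

```python
import numpy as np


def hdr_sketch(curves, ncomp=2, alpha=0.1):
    """Rank curves by an estimated density in a reduced PCA space.

    Hypothetical helper for illustration; not the API of any library.
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    centered = curves - curves.mean(axis=0)

    # PCA via SVD: the scores are the coordinates on the first ncomp components.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :ncomp] * s[:ncomp]

    # Gaussian product kernel with a Scott-type bandwidth per component.
    # The normalizing constant is omitted because only the ranking is used.
    bw = scores.std(axis=0, ddof=1) * n ** (-1.0 / (ncomp + 4))
    diff = (scores[:, None, :] - scores[None, :, :]) / bw
    density = np.exp(-0.5 * (diff ** 2).sum(axis=2)).mean(axis=1)

    median_idx = int(np.argmax(density))        # curve in the densest region
    threshold = np.quantile(density, alpha)     # lowest alpha share is flagged
    outliers = np.flatnonzero(density < threshold)
    return median_idx, outliers, density


# Synthetic data: 30 noisy sine curves plus one curve shifted far upward.
rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 50)
amps = rng.normal(1.0, 0.2, size=30)
curves = amps[:, None] * np.sin(x) + rng.normal(scale=0.05, size=(30, 50))
curves = np.vstack([curves, np.sin(x) + 5.0])

median_idx, outliers, density = hdr_sketch(curves)
```

The shifted curve (index 30) sits far from the others in score space, so it receives the lowest density estimate. Swapping the SVD or the kernel estimate for another library's PCA or KDE changes only the two marked steps.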

Feel free to take this :) (MIT licence)

josef...@gmail.com

Aug 11, 2017, 10:38:09 AM
to pystatsmodels
Thanks, I opened an issue for it.
PRs are welcome; I don't know when I will get around to looking at the details.

We don't have much support for functional data besides the graphics function that Ralph had added.

Related asides:

I recently opened an issue for the wishlist
This is currently mainly a getting-organized issue, and is related to supporting additional types of response variables.
I would like to push for and organize these kinds of models and data analysis into new subpackages like

- compositional
- circular
- functional

and whatever else shows up. Only compositional currently has a discussion and a PR to get started.
Structure and content of those subdirectories are still an open question.

Less related aside:
I still have no clear idea how to organize models like selection models, Tobit, Heckman, two-part models and so on.
We don't have a `limdep` (limited dependent variables) folder, and those models might end up next to the main model, although I don't think that is completely appropriate.

Josef

roy.pa...@gmail.com

Aug 17, 2017, 6:51:59 PM
to pystatsmodels
Thanks for the reply. I will try to find some time to open a PR for this.

Pamphile

roy.pa...@gmail.com

Aug 18, 2017, 9:26:09 AM
to pystatsmodels
Before submitting a PR, I have cleaned up the code to use only statsmodels instead of sklearn.


I have to fix some tests but it is working as expected.

For now, I only have a few concerns:
  1. How to return the plots: Is it okay to return a single object containing all figures?
  2. I did not find a way to simply define the variance level in PCA, so I use the flag ncomp. Is it okay? We could compute everything and truncate the output (this is what sklearn does).
  3. The projection part is a bit hackish: I am temporarily replacing PCA.factors and then calling PCA.projection(). Okay?
Thanks for the help on this.
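
For points 2 and 3, a minimal NumPy sketch of what "compute everything and truncate" and projecting new data could look like. It is only an illustration under stated assumptions: `fit_pca`, `project`, `reconstruct`, and `var_level` are hypothetical names, and none of this is the statsmodels `PCA` API.

```python
import numpy as np


def fit_pca(data, var_level=0.95):
    """Full PCA via SVD, truncated to the components needed for var_level."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    _, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    # Smallest number of components whose cumulative share reaches var_level.
    ncomp = int(np.searchsorted(np.cumsum(explained), var_level)) + 1
    ncomp = min(ncomp, len(s))  # guard against rounding when var_level is ~1
    return mean, vt[:ncomp], explained[:ncomp]


def project(new_data, mean, components):
    """Scores of new observations in the already fitted component space."""
    return (np.atleast_2d(new_data) - mean) @ components.T


def reconstruct(scores, mean, components):
    """Map scores back to curves in the original space."""
    return scores @ components + mean


# Synthetic curves that live (up to small noise) in a two-dimensional space.
rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 50)
a = rng.normal(size=(40, 1))
b = 0.5 * rng.normal(size=(40, 1))
data = a * np.sin(x) + b * np.cos(x) + rng.normal(scale=1e-3, size=(40, 50))

mean, components, explained = fit_pca(data, var_level=0.99)

# Project a curve that was not part of the fit, then map it back.
new_curve = 0.3 * np.sin(x) - 0.2 * np.cos(x)
scores = project(new_curve, mean, components)
approx = reconstruct(scores, mean, components)
```

Keeping the fitted mean and components as plain arrays means new points can be projected without modifying any state on the fitted object.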

Pamphile


P.S. Is this the right place to talk about this, or should I create a PR now and discuss it there, or should we discuss it in the issue?

josef...@gmail.com

Aug 18, 2017, 9:52:32 AM
to pystatsmodels
I replied on the issue, but not yet on specifics. The best place for discussing details and implementation is on GitHub. Often we have a preliminary discussion in an issue and then discuss details in the PR.

Josef