Re: GSOC 2020 Stats module and Rough Timeline


Shekhar Rajak

Mar 5, 2020, 11:19:36 PM
to Sympy, Smit Lunagariya 5-Yr IDD. Math. Sciences., IIT(BHU), Varanas
Hi Smit,

Happy to see your interest and the rough timeline you have prepared. Please go through every link given in the idea list, and start working on easy-to-fix issues and/or enhancements for the project so that you understand the codebase. It will help you understand the bottlenecks of the project and come up with good ideas to solve the issues.

You can also open a discussion issue and ping us to discuss each project task separately. Having rough/WIP PRs that showcase what you want to do and how you want to achieve the project goal is a plus.

Looking forward to seeing your complete proposal. Keep committing!

Regards,
Contact : +918142478937
Skype: shekhar.rajak1



On Tuesday, 3 March 2020, 7:13:14 pm GMT+5:30, Smit Lunagariya 5-Yr IDD. Math. Sciences., IIT(BHU), Varanas <smitlunaga...@itbhu.ac.in> wrote:


Hi,
I am Smit Lunagariya, an undergraduate student in Mathematics and Computing Engineering at the Indian Institute of Technology (BHU). I have been programming in Python for one year. I am interested in mathematics and its symbolic computation, specifically in statistics. I have experience in probabilistic machine learning and deep learning.
I have taken several relevant courses, such as Probability and Statistics, Abstract Algebra, Engineering Mathematics, NPTEL Stochastic Processes by Dr. S. Dharmaraja (IIT Delhi), Data Structures, and Information Technology Workshop (on Python). Currently, I am enrolled in several institute courses, such as Algorithms, Numerical Techniques, Operating Systems, and Mathematical Methods.
I have been contributing to SymPy since December 2019 and have become quite familiar with the contributing guidelines and workflow.
I would like to discuss the GSoC 2020 idea for the stats module. I have prepared a rough timeline for this summer project.
Community Bonding Period:
Many distributions can still be added to the stats module under discrete and continuous random variables. I would like to add the following, since some of them might be useful later when implementing joint multivariate distributions:

  1. Borel (Discrete)
  2. Conway-Maxwell-Poisson (Discrete)
  3. Gauss-Kuzmin (Discrete)
  4. Lomax (Continuous)
  5. Feller-Pareto (Continuous)
  6. Bounded Pareto (Continuous)
  7. Symmetric Pareto (Continuous)
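
As a rough illustration of one item on this list, the Lomax (Pareto type II) distribution can be sketched in plain Python. This is not SymPy code and not a proposed API: the function names and the parameter names `alpha` (shape) and `lam` (scale) are assumptions made only for this example. It shows the density, the CDF, and inverse-transform sampling.

```python
import random


def lomax_pdf(x, alpha, lam):
    """Lomax density on x >= 0 with shape alpha > 0 and scale lam > 0."""
    if x < 0:
        return 0.0
    return (alpha / lam) * (1.0 + x / lam) ** (-(alpha + 1.0))


def lomax_cdf(x, alpha, lam):
    """Lomax cumulative distribution function."""
    if x < 0:
        return 0.0
    return 1.0 - (1.0 + x / lam) ** (-alpha)


def lomax_sample(alpha, lam, rng):
    """Inverse-transform sampling: solve u = CDF(x) for x."""
    u = rng.random()  # uniform on [0, 1)
    return lam * ((1.0 - u) ** (-1.0 / alpha) - 1.0)


# Hypothetical usage with a seeded generator for reproducibility.
rng = random.Random(0)
samples = [lomax_sample(3.0, 2.0, rng) for _ in range(5)]
```

A symbolic version in the stats module would presumably expose the same density and support, but its class name and signature would be settled in review.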

I would also add a .doit() method to the Probability class.
While adding these distributions, I would also work on increasing code coverage by adding tests, including tests for the uncovered lines in crv.py, drv.py, frv.py, drv_types.py, crv_types.py, and frv_types.py.
Phase 1:
Currently, the stats module supports Markov chains and the Bernoulli process as stochastic processes. I would like to add more stochastic processes, including:

  1. Poisson Process
  2. Birth-Death Process
  3. Wiener Process
  4. Levy Process
  5. Random Walks
  6. Gamma Process
  7. Queueing Process

While adding the above processes, I would also work on adding their tests and increasing the code coverage of stochastic_process_types.py.
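
To make the first item concrete, a homogeneous Poisson process can be simulated by summing exponential inter-arrival times. The sketch below uses only the Python standard library; the function name and its parameters are hypothetical and are not part of SymPy.

```python
import random


def poisson_process_arrivals(rate, horizon, rng):
    """Return the arrival times of a rate-`rate` Poisson process on [0, horizon]."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    arrivals = []
    # Inter-arrival times are independent Exponential(rate) variables.
    t = rng.expovariate(rate)
    while t <= horizon:
        arrivals.append(t)
        t += rng.expovariate(rate)
    return arrivals


# Hypothetical usage: the number of arrivals on [0, horizon] is Poisson
# distributed with mean rate * horizon.
rng = random.Random(1)
arrivals = poisson_process_arrivals(rate=2.0, horizon=1000.0, rng=rng)
```

A symbolic implementation would instead answer queries such as the probability of a given count in an interval, but the same rate parameter would drive both views.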
Phase 2:
At the beginning of this phase, I would finish any remaining work from Phase 1 and then start implementing the following:

  1. Add assumptions about the dependence of random variables.
  2. Add support for compound distributions, along with more examples of them.
  3. Joint distributions currently lack a well-defined framework. I would rework specific portions to make the framework more general, so that more distributions can be built on top of it.
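
For readers unfamiliar with item 2, a compound distribution is one whose parameter is itself a random variable. A minimal numerical sketch, assuming a normal variable whose mean is also normally distributed, is given below. It uses only the standard library, and the function name is hypothetical. In this case the marginal is again normal, with variance sigma0**2 + sigma**2.

```python
import random
import statistics


def sample_compound_normal(mu0, sigma0, sigma, rng):
    """Draw X ~ Normal(M, sigma) where the mean M ~ Normal(mu0, sigma0)."""
    latent_mean = rng.gauss(mu0, sigma0)
    return rng.gauss(latent_mean, sigma)


# Hypothetical usage: with sigma0 = 2 and sigma = 1 the marginal variance
# should be close to 2**2 + 1**2 = 5.
rng = random.Random(42)
draws = [sample_compound_normal(0.0, 2.0, 1.0, rng) for _ in range(20000)]
sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
```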

While discussing and implementing the API, I would make sure to add the necessary tests and work on increasing code coverage.
Phase 3:
At the beginning of this phase, I would finish any remaining work from Phase 2 and then start implementing the following:

  1. Adding more multivariate distributions which include:
    1. Wishart
    2. Matrix Gamma
    3. Normal Inverse Gamma
    4. Inverse Wishart
    5. Normal Wishart
    6. Normal Inverse Wishart
    7. Inverse Matrix Gamma
  2. Adding densities of circular ensembles in random matrices.
  3. Adding sampling methods for continuous random variables using external libraries such as PyMC3, NumPy, and SciPy.

While adding these distributions, I would also work on increasing code coverage by adding tests, including tests for the uncovered lines in joint_rv.py and joint_rv_types.py.

Finally, I would complete the remaining work before the final evaluation.
This is the rough timeline I would like to follow during the project. Suggestions, changes, and additional ideas are welcome.
Thank you.
Please share your views on this.

Regards,

Smit Lunagariya
