GSoC 2018


Shilpa Sangappa

Jan 22, 2018, 12:19:51 PM
to sympy
Hello,

My name is Shilpa Sangappa and I intend to contribute to SymPy through GSoC 2018.

I have 6+ years of Python programming experience.

I am currently pursuing my master's in Machine Learning and Intelligent Systems at Ramaiah University, Bangalore.

I wish to contribute to the probability section of mathematics in SymPy.

Probability and statistics is one of the core subjects of machine learning. I wish to enhance my understanding and develop expertise by contributing to the stats module of SymPy.

With Regards,
Shilpa.

Leonid Kovalev

Jan 22, 2018, 1:34:50 PM
to sympy
Thanks for your interest. An issue that recently came up in the Probability module is sampling from a Poisson distribution. It used to not work at all; now it works, but the algorithm is inefficient when the parameter lambda is large. For example:

from sympy.stats import Poisson, sample
sample(Poisson('x', 1000))

can take a while to return. 

Usually one can sample by generating a Uniform(0, 1) random number u and then applying the inverse of the cumulative distribution function (CDF) to u. But there is no closed-form formula for the inverse CDF of a Poisson random variable. The current algorithm simply goes over all integers n = 0, 1, 2, ... looking for the first one where CDF(n) >= u. There ought to be a better way of doing this.
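For illustration, the linear scan described above can be sketched in plain Python floats. This is not SymPy's actual implementation: the function name `poisson_sample_linear` and the cap on lambda are assumptions made here so the sketch stays runnable.

```python
import math
import random

def poisson_sample_linear(lam, u=None):
    """Return the smallest n with CDF(n) >= u for a Poisson(lam) variable.

    Hypothetical helper, not SymPy code. Works in floats, so lam is
    capped at 700: beyond that exp(-lam) underflows towards zero.
    """
    if not 0 < lam <= 700:
        raise ValueError("this float-based sketch assumes 0 < lam <= 700")
    if u is None:
        u = random.random()        # Uniform[0, 1) draw
    n = 0
    pmf = math.exp(-lam)           # P(X = 0)
    cdf = pmf                      # P(X <= 0)
    # Walk n = 0, 1, 2, ... until the CDF reaches u. The pmf > 0 guard
    # stops the loop if rounding keeps the running sum just below u.
    while cdf < u and pmf > 0.0:
        n += 1
        pmf *= lam / n             # P(X = n) from P(X = n - 1)
        cdf += pmf
    return n
```

The loop takes roughly lambda steps on average, which is why a scan of this kind becomes slow for large lambda.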

The first idea that comes to mind is to make giant steps (in powers of 2) until CDF(n) >= u is reached, and then refine by bisection. But perhaps it's better to do research first; there is probably an existing algorithm that we can use. Maybe R has it? https://www.r-project.org/
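A minimal sketch of the giant-steps-then-bisection idea, again in plain Python floats rather than SymPy code. The names `poisson_cdf`, `inverse_cdf_search`, and `poisson_sample`, and the clamp on u, are assumptions made for this example.

```python
import math
import random

def poisson_cdf(n, lam):
    """P(X <= n) for X ~ Poisson(lam); terms are formed in log space
    so that large lam does not underflow."""
    if n < 0:
        return 0.0
    log_lam = math.log(lam)
    return math.fsum(
        math.exp(k * log_lam - lam - math.lgamma(k + 1))
        for k in range(n + 1)
    )

def inverse_cdf_search(cdf, u):
    """Smallest integer n >= 0 with cdf(n) >= u, for a nondecreasing cdf."""
    if cdf(0) >= u:
        return 0
    lo, hi = 0, 1                  # invariant: cdf(lo) < u
    while cdf(hi) < u:             # giant steps: hi = 1, 2, 4, 8, ...
        lo, hi = hi, 2 * hi
    while hi - lo > 1:             # bisection on cdf(lo) < u <= cdf(hi)
        mid = (lo + hi) // 2
        if cdf(mid) >= u:
            hi = mid
        else:
            lo = mid
    return hi

def poisson_sample(lam):
    # Clamp u slightly below 1 (an assumption of this sketch) so that
    # rounding in the CDF cannot make the doubling loop run forever.
    u = min(random.random(), 1.0 - 1e-9)
    return inverse_cdf_search(lambda n: poisson_cdf(n, lam), u)
```

This needs only on the order of log(n) CDF evaluations instead of n. The summed `poisson_cdf` here still costs O(n) per call, so the saving is real only when each CDF evaluation is roughly constant-cost, for example through a closed form in terms of the incomplete gamma function.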


 

Shilpa Sangappa

Jan 23, 2018, 5:27:34 AM
to sympy
Thanks Leonid, for pointing me to this issue. I will start looking into it.
What is the issue id? 

With Regards,
Shilpa.

P.S.: I didn't get a notification of your reply to my e-mail. Are there any settings that I need to change?

Leonid Kovalev

Jan 23, 2018, 6:03:33 AM
to sy...@googlegroups.com
There is a discussion at https://github.com/sympy/sympy/pull/13943 but no separate issue (yet). 

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+unsubscribe@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at https://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/0cb65d20-5126-4387-8aa4-f2e5ec1b82cc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Francesco Bonazzi

Jan 24, 2018, 3:54:48 AM
to sympy
Some ideas for the probability module:

  • hyperparameters
  • random matrices
  • stochastic processes (through indexed random variables)

Shilpa Sangappa

Jan 25, 2018, 1:26:33 AM
to sympy
Thanks for the pointer, Francesco.

I am in the process of setting up my environment. 
Hope to start soon.

With Regards,
Shilpa.

Nirvan Anjirbag

Jan 27, 2018, 2:37:41 PM
to sympy
Hi,
I'm Nirvan, a sophomore from BITS Pilani, Goa. I'd like to do a project in SymPy as part of GSoC this year. Can someone please tell me where I can start contributing? Please let me know if there's a specific issue or idea I should look into.

Thank you