GSoC 2018

Shilpa Sangappa

Jan 22, 2018, 12:19:51 PM

to sympy

Hello,

My name is Shilpa Sangappa and I intend to contribute to SymPy through GSoC 2018.

I have 6+ years of Python programming experience.

I am currently pursuing my master's in Machine Learning and Intelligent Systems at Ramaiah University, Bangalore.

I wish to contribute to the probability section of mathematics in SymPy.

Probability and statistics is one of the core subjects of machine learning. I wish to enhance my understanding and develop expertise by contributing to the stats module of SymPy.

With Regards,

Shilpa.

Leonid Kovalev

Jan 22, 2018, 1:34:50 PM

to sympy

Thanks for your interest. An issue that recently came up in the Probability module is sampling from a Poisson distribution. It used to not work at all, and now it does, but the algorithm is not efficient when the parameter lambda is large. For example:

from sympy.stats import Poisson, sample
sample(Poisson('x', 1000))

can take a while to return.

Usually one can sample by generating a Uniform(0, 1) random number u and then applying the inverse of the cumulative distribution function (CDF) to u. But there isn't a formula for the inverse of the CDF of a Poisson random variable. The current algorithm simply goes over all integers looking for the first one where CDF(n) >= u. There ought to be a better way of doing this.
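
For illustration, the linear scan described above can be sketched in plain Python. This is not SymPy's actual implementation; the function name poisson_sample_linear and the optional u argument are hypothetical, and floating-point arithmetic is assumed, which limits the sketch to moderate lambda because exp(-lambda) underflows for very large values.

```python
import math
import random

def poisson_sample_linear(lam, u=None):
    """Inverse-CDF sampling by linear scan (hypothetical helper, not SymPy's code).

    Returns the smallest integer n >= 0 with CDF(n) >= u.
    """
    # exp(-lam) underflows to 0.0 for large lam, so restrict the sketch.
    if not 0 < lam <= 700:
        raise ValueError("this float-based sketch assumes 0 < lam <= 700")
    if u is None:
        u = random.random()        # Uniform(0, 1) draw
    n = 0
    pmf = math.exp(-lam)           # P(X = 0)
    cdf = pmf                      # CDF(0)
    # Walk n = 0, 1, 2, ... until the CDF reaches u. The pmf > 0 guard
    # stops the loop if rounding keeps the running sum just below u.
    while cdf < u and pmf > 0.0:
        n += 1
        pmf *= lam / n             # P(X = n) from P(X = n - 1)
        cdf += pmf
    return n

print(poisson_sample_linear(2.0, u=0.5))
```

The loop body runs about lambda times on average, which is why the cost grows with the parameter.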

The first idea that comes to mind is to make giant steps (in powers of 2) until CDF(n) >= u is reached, and then refine by bisection. But perhaps it's better to do research first; there is probably an algorithm out there that we can use. Maybe R has it? https://www.r-project.org/
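
A minimal sketch of the giant-steps-then-bisection idea, under stated assumptions: the names poisson_cdf and inverse_cdf_search and the max_n cap are hypothetical, and the CDF here is summed term by term only to keep the example self-contained. The search pays off only when each CDF evaluation is cheap, for example through an incomplete gamma function routine.

```python
import math

def poisson_cdf(n, lam):
    """P(X <= n) for X ~ Poisson(lam); terms computed in log space to avoid underflow."""
    if n < 0:
        return 0.0
    log_lam = math.log(lam)
    return math.fsum(
        math.exp(k * log_lam - lam - math.lgamma(k + 1)) for k in range(n + 1)
    )

def inverse_cdf_search(cdf, u, max_n=1 << 30):
    """Smallest integer n >= 0 with cdf(n) >= u, assuming cdf is nondecreasing."""
    if cdf(0) >= u:
        return 0
    lo, hi = 0, 1
    # Giant steps: hi = 1, 2, 4, 8, ... until cdf(hi) >= u.
    while cdf(hi) < u:
        lo, hi = hi, 2 * hi
        if hi > max_n:
            raise OverflowError("u is too close to 1 for this sketch")
    # Invariant: cdf(lo) < u <= cdf(hi). Bisect until the bracket has width 1.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cdf(mid) >= u:
            hi = mid
        else:
            lo = mid
    return hi

print(inverse_cdf_search(lambda n: poisson_cdf(n, 1000.0), 0.5))
```

Both phases need a number of CDF evaluations that is logarithmic in the returned value, compared with the linear number of steps taken by a scan over all integers.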

Shilpa Sangappa

Jan 23, 2018, 5:27:34 AM

to sympy

Thanks Leonid, for pointing me to this issue. I will start looking into it.

What is the issue id?

With Regards,

Shilpa.

P.S.: I didn't get a notification of your reply to my e-mail. Are there any settings that I need to change?

Leonid Kovalev

Jan 23, 2018, 6:03:33 AM

to sy...@googlegroups.com

There is a discussion at https://github.com/sympy/sympy/pull/13943 but no separate issue (yet).

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+unsubscribe@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at https://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/0cb65d20-5126-4387-8aa4-f2e5ec1b82cc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Francesco Bonazzi

Jan 24, 2018, 3:54:48 AM

to sympy

Some ideas for the probability module:

hyperparameters
random matrices
stochastic processes (through indexed random variables)


Shilpa Sangappa

Jan 25, 2018, 1:26:33 AM

to sympy

Thanks for the pointer, Francesco.

I am in the process of setting up my environment.

Hope to start soon.

With Regards,

Shilpa.

Nirvan Anjirbag

Jan 27, 2018, 2:37:41 PM

to sympy

Hi,

I'm Nirvan, a sophomore from BITS Pilani, Goa. I'd like to do a project in SymPy as part of GSoC this year. Can someone please tell me where I can start contributing? Please tell me if there's a specific issue or idea I should look into.

Thank you
