Symbolic probability expression


Francesco Bonazzi

Dec 15, 2015, 1:05:20 PM
to sympy
I recently started some work on symbolic expressions for variance and covariance in the sympy.stats module:

https://github.com/sympy/sympy/pull/10247

The current stats module defines functions such as P (probability), E (expectation), variance, covariance, moment, and so on, which evaluate the underlying integral for the given random variables or conditions on random variables.

I think it would also be convenient to be able to work at a higher level of abstraction, by keeping symbolic expressions unevaluated and manipulating them through their algebraic properties.

The idea is to define a class with the same name as each function, but with a capitalized first letter:
  • Expectation( ) vs expectation( )
  • Probability( ) vs probability( )
  • Variance( ) vs variance( )
  • ... and so on ...

The lowercase names are the existing functions, whereas the capitalized names are classes that create the unevaluated expression.


The method .doit() calls the corresponding function to perform the integral.
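
A minimal sketch of this pattern, for illustration only. The class name LazyExpectation is hypothetical and is not the class from the pull request; it only assumes that sympy.stats.E and sympy.stats.Die behave as in current SymPy releases.

```python
from sympy import Expr, Rational, sympify
from sympy.stats import Die, E


class LazyExpectation(Expr):
    """Hypothetical unevaluated expectation (not the class from the PR)."""

    def __new__(cls, expr):
        # Store the argument without computing anything.
        return Expr.__new__(cls, sympify(expr))

    def doit(self, **hints):
        # Delegate the actual computation to the existing function.
        return E(self.args[0])


D = Die("D", 6)                  # a fair six-sided die
unevaluated = LazyExpectation(D)  # stays symbolic
value = unevaluated.doit()        # evaluated through E(D)
```

The object keeps its argument in .args until .doit() is called, so it can be passed around and rewritten like any other SymPy expression.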


A similar relationship already exists in SymPy:

  • Integral( ) vs integrate( )
  • Derivative( ) vs diff( )
  • Sum( ) vs summation( )

.doit() calls the function.
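
The existing pairs can be checked directly. The snippet below is only a small illustration with arbitrarily chosen integrands; note that the evaluating counterpart of Sum in SymPy is summation, not Python's built-in sum.

```python
from sympy import (Derivative, Integral, Sum, diff, integrate,
                   summation, symbols)

x, k, n = symbols("x k n")

# Each tuple pairs an unevaluated object with the result of the
# corresponding evaluating function.
pairs = [
    (Integral(x**2, x), integrate(x**2, x)),
    (Derivative(x**3, x), diff(x**3, x)),
    (Sum(k, (k, 1, n)), summation(k, (k, 1, n))),
]

# Calling .doit() on the unevaluated object gives the same result.
matches = [unevaluated.doit() == evaluated for unevaluated, evaluated in pairs]
```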


I'd like to have some feedback before going on. Do you think this is a good idea? Would you merge this once it's finished?

Jonathan Crall

Jan 6, 2016, 3:52:50 PM
to sympy
I think it's a pretty good idea. I'm not a sympy dev, but I stumbled on this thread because I was looking for some way to do this.

Aaron S. Meurer

Jan 6, 2016, 4:19:16 PM
to sy...@googlegroups.com
It seems like a good idea to me. There are some identities that hold for variance and covariance independent of the probability distribution and it would be cool to be able to apply those identities symbolically. 
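
For illustration, standard identities of this kind, which hold for any random variables X and Y with finite second moments and any constants a and b, include:

```latex
\begin{align}
\operatorname{Var}(X) &= \operatorname{E}[X^2] - \operatorname{E}[X]^2 \\
\operatorname{Var}(aX + b) &= a^2 \operatorname{Var}(X) \\
\operatorname{Cov}(X, Y) &= \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y] \\
\operatorname{Var}(X + Y) &= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y)
\end{align}
```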

Aaron Meurer
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at https://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/d0dff984-3457-4e53-89ea-bb2b120beb1c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Qcode

Jan 10, 2016, 9:38:15 AM
to sympy
Hello,

It's a good idea.
I have time to help you.

Kind regards
Kevin

Francesco Bonazzi

Jan 10, 2016, 2:49:26 PM
to sympy


On Sunday, 10 January 2016 15:38:15 UTC+1, Qcode wrote:
Hello,

It's a good idea.
I have time to help you.

Great! You're welcome to join the discussion on GitHub:

https://github.com/sympy/sympy/pull/10247

 

QCode

Jan 10, 2016, 7:57:21 PM
to sy...@googlegroups.com
Hi,

To add to the forum, here are my comments to Francesco.
There are two types of computation: algebraic manipulation of expectations, and integral computation.

Variance and covariance are algebraic forms of expectation, so this is mainly work on simplifying expectation expressions.
The main issue occurs when X and Y are not independent...

As for the integrals: even though SymPy has integration, some integrals do not evaluate well
without semi-manual simplification, especially multiple integrals.
Additionally, the integration boundaries can be complicated...


Just to understand: what is the current implementation?
And what is the current target?

I think this is a good idea, but we need to think about and agree on the details first.

Thanks
Kind regards
Kevin



Qcode

Jan 11, 2016, 5:16:47 AM
to sympy
Hi Francesco,

Could you clarify these three points?


1) Do you expect to handle algebraic forms of expectation as the base structure?

2) How would you expect to handle the case where X and Y are not independent?

3) What is the current status of the functionality/architecture?

Kind regards
Kevin




Francesco Bonazzi

Jan 11, 2016, 12:37:34 PM
to sympy


On Monday, 11 January 2016 11:16:47 UTC+1, Qcode wrote:
Hi Francesco,

Could you clarify these three points?


1) Do you expect to handle algebraic forms of expectation as the base structure?

My intention is to add unevaluated expression classes. Once you have a class for Variance and a class for Expectation, you can easily write a function to expand the variance expression into expectations, and (maybe less easily) vice versa.
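
As a hedged illustration of that expansion: SymPy releases that postdate this thread ship a sympy.stats.symbolic_probability module with Expectation and Variance classes. The sketch below assumes such a release is installed and does not describe the state of the pull request at the time of writing.

```python
from sympy import symbols
from sympy.stats import Die
# Assumption: a SymPy release that provides this module.
from sympy.stats.symbolic_probability import Expectation, Variance

a = symbols("a")
X = Die("X", 6)

# Var(X) rewritten in terms of expectations; nothing is evaluated yet.
as_expectations = Variance(X).rewrite(Expectation)

# Pull a constant factor out of the variance: Var(a*X) -> a**2 * Var(X).
scaled = Variance(a * X).expand()
```

Both results remain symbolic, so distribution-independent identities can be applied before any sum or integral is computed.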


2) How would you expect to handle the case where X and Y are not independent?

Well, that's a hard problem. I am not aware of any handling of dependent RVs in the current code. RVs are created by RandomSymbol or something similar, and they are meant to represent a single-variable distribution.

Adding a concept like independence may require some review of the existing code, and needs careful thought.
 

3) What is the current status of the functionality/architecture?

Francesco Bonazzi

Jan 11, 2016, 12:42:55 PM
to sympy


On Monday, 11 January 2016 01:57:21 UTC+1, Qcode wrote:

As for the integrals: even though SymPy has integration, some integrals do not evaluate well
without semi-manual simplification, especially multiple integrals.
Additionally, the integration boundaries can be complicated...


Integrals are limited by SymPy's current integration abilities, while boundary conditions are limited by the solvers.

Maybe one could add pre-computed cumulative distribution functions to overcome the limitations of current integration algorithms?
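
A rough sketch of what such a lookup could look like. Everything here is hypothetical: the PRECOMPUTED_CDF table, the cdf helper and the distribution names are made up for illustration and are not part of SymPy.

```python
from sympy import Symbol, exp, integrate

x = Symbol("x", positive=True)
t = Symbol("t", positive=True)       # integration variable
lam = Symbol("lambda", positive=True)

# Hypothetical table: distribution name -> closed-form CDF for x > 0.
PRECOMPUTED_CDF = {"exponential": 1 - exp(-lam * x)}


def cdf(name, pdf):
    """Return the stored closed form if there is one, else integrate the density."""
    if name in PRECOMPUTED_CDF:
        return PRECOMPUTED_CDF[name]
    # Fallback: integrate the density from 0 to x (support assumed to be x > 0).
    return integrate(pdf.subs(x, t), (t, 0, x))


pdf = lam * exp(-lam * x)
from_table = cdf("exponential", pdf)   # no integration needed
by_integration = cdf("unlisted", pdf)  # falls back to integrate()
```

For this simple density both routes agree; the table would only pay off for distributions whose densities the integrator cannot handle.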

And what is the current target?

I don't know if there is a development target. Contributors are free to join in and add new features to any part of SymPy.