Integrate over Union of Intervals


Francesco Bonazzi

Mar 25, 2015, 7:13:50 AM
to sy...@googlegroups.com
I opened this issue a few days ago:
https://github.com/sympy/sympy/issues/9189

I bumped into this problem because it was raised by the stats module. Apparently the integration algorithm is unable to cope with unions of intervals.

I was wondering: is it correct to pass a union of intervals to the integrator? The stats module generates such integrals. Are there any plans to support them?

Aaron Meurer

Mar 25, 2015, 12:29:18 PM
to sy...@googlegroups.com
Wow, I didn't know integrate supported Intervals. If we are going to
support sets, we might as well support all of them (at least
unevaluated). A union of intervals is easy if the values are
computable (i.e., if you can determine if the union is disjoint).
Otherwise, it's complicated. integrate(1, (x, Union(Interval(a, b),
Interval(c, d)))) depends on the relative order of a, b, c, and d.
Even the current integrate(1, (x, Interval(a, b))) is wrong if a > b.
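
To illustrate the ordering problem with concrete numbers, here is a minimal sketch. The endpoints are chosen only for illustration. An Interval whose start exceeds its end is the empty set, and overlapping intervals merge inside a Union, so naively adding `end - start` for each piece goes wrong in both cases. With symbolic endpoints SymPy cannot decide the ordering, so the set stays as written.

```python
from sympy import Interval, S, Union, symbols

# With numeric endpoints SymPy can decide the ordering itself.
empty = Interval(2, 1)                      # start > end: the empty set

# Overlapping pieces are merged into a single interval.
merged = Union(Interval(0, 2), Interval(1, 3))
naive_length = (2 - 0) + (3 - 1)            # counts the overlap [1, 2] twice

# With symbolic endpoints the ordering is unknown and nothing is simplified.
a, b = symbols('a b', real=True)
symbolic = Interval(a, b)

print(empty)
print(merged, merged.measure, naive_length)
print(symbolic, symbolic.measure)           # measure is formed as end - start
```

The last line shows why symbolic endpoints need care: the measure is built as `end - start` without knowing whether that difference is positive.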

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/80cd6586-749a-404f-97d4-0b1bafcf30ab%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Francesco Bonazzi

Mar 25, 2015, 2:21:03 PM
to sy...@googlegroups.com


On Wednesday, March 25, 2015 at 5:29:18 PM UTC+1, Aaron Meurer wrote:
Wow, I didn't know integrate supported Intervals.

I had a look at the code; I think that behaviour is unintended.

The point is, the stats module apparently calls integrate with intervals to determine probabilities.

If we are going to
support sets, we might as well support all of them (at least
unevaluated). A union of intervals is easy if the values are
computable (i.e., if you can determine if the union is disjoint).
Otherwise, it's complicated. integrate(1, (x, Union(Interval(a, b),
Interval(c, d)))) depends on the relative order of a, b, c, and d.
Even the current integrate(1, (x, Interval(a, b))) is wrong if a > b.

I guess that if the relative order cannot be determined, the integral should remain unevaluated.
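
A rough sketch of that rule, assuming a hypothetical helper named `integrate_if_ordered` (it is not part of SymPy). It uses SymPy's three-valued assumption queries: the integral is evaluated only when `a <= b` can be proven, is zero when `a > b` can be proven, and is otherwise left as an unevaluated `Integral` standing in for the set integral.

```python
from sympy import Integral, Symbol, integrate

def integrate_if_ordered(f, x, a, b):
    """Hypothetical helper: integrate f over Interval(a, b) only if a <= b is known."""
    ordered = (b - a).is_nonnegative      # True, False, or None (undetermined)
    if ordered is True:
        return integrate(f, (x, a, b))
    if ordered is False:
        return 0                          # a > b: Interval(a, b) is empty
    # Ordering unknown: keep a placeholder for the integral over the set.
    return Integral(f, (x, a, b))

x = Symbol('x', real=True)
a = Symbol('a', real=True)
c = Symbol('c', real=True)
p = Symbol('p', positive=True)

known = integrate_if_ordered(1, x, a, a + p)    # a <= a + p is provable
empty = integrate_if_ordered(1, x, a, a - p)    # a > a - p is provable
unknown = integrate_if_ordered(1, x, a, c)      # nothing known about a vs c

print(known, empty, unknown)
```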

Would it be worthwhile for me to create a PR to accept unions of intervals?

Aaron Meurer

Mar 25, 2015, 6:33:00 PM
to sy...@googlegroups.com
If the code is really unintended I suppose we should first sort out if
we really want to do this, and with this syntax. But otherwise, +1 for
a PR fixing these issues.

Aaron Meurer


Francesco Bonazzi

Mar 26, 2015, 8:42:40 AM
to sy...@googlegroups.com

Well, I was a bit surprised too, but the stats module apparently does so, as shown in this example:

In [1]: from sympy.stats import *

In [2]: var('sigma', positive=True)
Out[2]: σ

In [3]: N = Normal('X', mu, sigma)

In [6]: P(N**2>1, evaluate=False)
Out[6]:
(-∞, -1) ∪ (1, ∞)
  ⌠
  ⎮                    2
  ⎮            -(z - μ)
  ⎮            ──────────
  ⎮                  2
  ⎮       ___     2⋅σ
  ⎮     ╲╱ 2 ⋅ℯ
  ⎮     ───────────────── dz
  ⎮             ___
  ⎮         2⋅╲╱ π ⋅σ
  ⌡


In [7]: srepr(P(N**2>1, evaluate=False))
Out[7]: "Integral(Mul(Rational(1, 2), Pow(Integer(2), Rational(1, 2)), Pow(pi, Rational(-1, 2)), Pow(Symbol('sigma'), Integer(-1)), exp(Mul(Integer(-1), Rational(1, 2), Pow(Symbol('sigma'), Integer(-2)), Pow(Add(Dummy('z'), Mul(Integer(-1), Symbol('mu'))), Integer(2))))), Tuple(Dummy('z'), Union(Interval(-oo, Integer(-1), S.true, S.true), Interval(Integer(1), oo, S.true, S.true))))"


Apart from the fact that such an integral looks wrong to me (there seems to be no account of the random variable being squared, or am I missing something?), it looks like SymPy is OK with intervals, but not with unions of intervals:

https://github.com/sympy/sympy/blob/9242d31f6d31a1d9c3464264a5a6e61eab8acfb8/sympy/concrete/expr_with_limits.py#L37

That's the point where an Interval gets parsed by the integration algorithm.

I think it's an easy fix to add the processing for unions of intervals.
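
As a sketch of what such processing might look like outside SymPy's internals, the hypothetical helpers below (`limits_from_set` and `integrate_over_set` are illustrative names, not existing SymPy functions) turn an Interval or a Union of Intervals into ordinary `(x, a, b)` limits and sum the pieces. The sketch deliberately refuses symbolic endpoints, because disjointness cannot be checked for them.

```python
from sympy import Interval, Symbol, Union, integrate

def limits_from_set(x, domain):
    """Hypothetical helper: convert an Interval or Union of Intervals to (x, a, b) limits."""
    pieces = domain.args if isinstance(domain, Union) else (domain,)
    limits = []
    for piece in pieces:
        if not isinstance(piece, Interval):
            raise NotImplementedError("only intervals and unions of intervals are handled")
        if not (piece.start.is_number and piece.end.is_number):
            raise ValueError("symbolic endpoints: disjointness cannot be checked")
        # Open or closed endpoints do not change the value of the integral.
        limits.append((x, piece.start, piece.end))
    return limits

def integrate_over_set(f, x, domain):
    """Hypothetical helper: sum ordinary integrals over the pieces of the domain."""
    return sum(integrate(f, lim) for lim in limits_from_set(x, domain))

x = Symbol('x', real=True)
disjoint = Union(Interval(0, 1), Interval(2, 4))
overlapping = Union(Interval(0, 2), Interval(1, 3))   # SymPy merges this to [0, 3]

print(integrate_over_set(x, x, disjoint))
print(limits_from_set(x, overlapping))
```

Because Union already merges numeric intervals that overlap, the remaining pieces are disjoint and can simply be added.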

Matthew Rocklin

Mar 26, 2015, 4:12:37 PM
to sy...@googlegroups.com
You don't need to square the random variable to compute the result.  You just need to integrate the pdf over x < -1 and x > 1
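
A small check of this point, assuming a standard normal (mu = 0, sigma = 1) purely for illustration: the condition z**2 > 1 is solved for the region first, and the pdf of X itself is then integrated over each piece of that region. The result is compared against the two tail probabilities computed by sympy.stats.

```python
from sympy import S, Symbol, integrate, solveset
from sympy.stats import Normal, P, density

z = Symbol('z', real=True)
X = Normal('X', 0, 1)             # standard normal, chosen only for illustration
pdf = density(X)(z)               # density of X, not of X**2

# The squaring only affects the region of integration.
region = solveset(z**2 > 1, z, S.Reals)

# Integrate the pdf over each interval of the region and add the results.
tail_sum = sum(integrate(pdf, (z, piece.start, piece.end)) for piece in region.args)

# Reference value from the two one-sided probabilities.
reference = P(X > 1) + P(X < -1)

print(region)
print(float(tail_sum), float(reference))
```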


Francesco Bonazzi

Mar 26, 2015, 7:54:46 PM
to sy...@googlegroups.com
Matthew... what do you think of the union of intervals as an alternative to the usual ranges in integrate/Integral?

I suppose that you wrote the code outputting that integral, which currently does not work, and I want to make it work.

I am undecided on whether to edit sympy.stats in order to give Integral( ... , (x, -oo, -1)) + Integral( ... , (x, 1, oo)) instead of Integral( ... , (x, Union(Interval(-oo, -1), Interval(1, oo)))).

On the other hand, this alternative notation may be useful. Unfortunately it would require some algorithmic changes and I am a bit wary about a substantial edit of the integration algorithm.
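
For reference, the first option can be written with existing SymPy calls only. This is a sketch, with the pdf typed out by hand from the srepr output earlier in the thread, and the symbol assumptions (`real=True`, `positive=True`) are my own choices. The sum of two ordinary Integrals stays unevaluated until `doit()` is called.

```python
from sympy import Integral, Symbol, exp, oo, pi, sqrt

z = Symbol('z', real=True)
mu = Symbol('mu', real=True)
sigma = Symbol('sigma', positive=True)

# Normal pdf with mean mu and standard deviation sigma.
pdf = sqrt(2) * exp(-(z - mu)**2 / (2 * sigma**2)) / (2 * sqrt(pi) * sigma)

# One ordinary Integral per piece of the union (-oo, -1) U (1, oo).
split_form = Integral(pdf, (z, -oo, -1)) + Integral(pdf, (z, 1, oo))

# Evaluate for the standard normal as a sanity check.
value = split_form.subs({mu: 0, sigma: 1}).doit()

print(split_form)
print(float(value))
```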

Aaron Meurer

Mar 26, 2015, 9:50:56 PM
to sy...@googlegroups.com
I think integrating over sets is a useful thing to allow, but we
definitely need to be more careful about symbolic intervals. And given
that, it probably means that sympy.stats should just do normal
integrals, since it can make assumptions about symbolic intervals that
won't be present when passed to Integral.

Aaron Meurer

Matthew Rocklin

Mar 27, 2015, 9:57:38 PM
to sy...@googlegroups.com
>  Matthew... what do you think of the union of intervals as an alternative to the usual ranges in integrate/Integral?

Seems like a decent plan.  I haven't been actively working on stats in a while so I don't have strong opinions here.

