Fractions of sympy.Product of sympy.tensor.Indexed variables seem to not cancel correctly.

Jonathan Crall

Feb 9, 2016, 3:45:22 PM
to sympy
I'm having an issue where an expression with sympy.tensor.Indexed variables does not seem to simplify correctly.
It is likely that I'm just doing something wrong, so I was wondering if anyone could help me figure this out.

I'm using sympy just to generate some simple equations based on Bayes' rule.

N is an event and I'm given a set of observations \set{X} = \{\ldots d_i \ldots\}.

I'm using an indexed base to represent the set of observations as an array.
I'm then using sympy.Product to multiply the probabilities of these observations together
(I'm assuming independence), so I create an Idx variable ``i`` and several sets of variables
that are indexed by ``i``. However, at the end of this script, it looks like ``P(di)[i]`` should be canceled out by a
simplification step, but it is not.

Here is the script: 


import sympy
from sympy.tensor import IndexedBase, Idx  # NOQA
from sympy import tensor
from sympy import *  # NOQA
cardX = sympy.symbols('|X|', integer=True, positive=True, finite=True)
start, stop = 1, cardX
i = Idx(sympy.symbols('i', integer=True, finite=True))


def psym(expr):
    s = symbols(expr, real=True, finite=True, negative=False)
    return s


def IdxBase(expr):
    #s = tensor.IndexedBase(expr, shape=(cardX,))[i]
    s = tensor.IndexedBase(expr, shape=(1,))[i]
    return s


if 1:
    def Prod(s):
        return sympy.Product(s, (i, start, stop))
else:
    def Prod(s):
        return sympy.prod([s.subs(i, i_) for i_ in range(1, 4)])

P_N          = psym('P(N)')
P_X          = psym('P(X)')
P_X_given_N  = psym('P(X|N)')
P_N_given_X  = psym('P(N|X)')
P_di         = IdxBase('P(di)')
P_N_given_di = IdxBase('P(N|di)')
P_di_given_N = IdxBase('P(di|N)')

pprint = sympy.pretty_print
print('''
-----------------------
OUTPUT OF SVM: P(N | di)
-----------------------
''')
P_N_given_di_ = (P_di_given_N * P_N) / P_di
pprint(Eq(P_N_given_di, P_N_given_di_))

print('''
-----------------------
REARANGE USING BAYES P(di | N)
-----------------------
''')
P_di_given_N_ = (P_N_given_di * P_di) / P_N
pprint(Eq(P_di_given_N, P_di_given_N_))

print('''
-----------------------
AGGREGATE USING INDEPENDENCE
-----------------------
''')
prod_P_di_given_N  = Prod(P_di_given_N)
prod_P_di_given_N_ = Prod(P_di_given_N_)
P_X_given_N__ = prod_P_di_given_N
P_X_given_N_ = prod_P_di_given_N_
pprint(Eq(P_X_given_N, P_X_given_N__))
pprint(Eq(P_X_given_N, P_X_given_N_))

print('''
   === ALSO ===
      ''')
prod_P_di = Prod(P_di)
P_X_      = prod_P_di
pprint(Eq(P_X, P_X_))

print('''
-----------------------
REARANGE TO LIKELIHOOD USING BAYES AGAIN
-----------------------
''')
P_N_given_X__ = (P_X_given_N * P_N) / (P_X)
P_N_given_X_ = (P_X_given_N_ * P_N) / (P_X_)
pprint(Eq(P_N_given_X, P_N_given_X__))
print('---')
pprint(Eq(P_N_given_X, P_N_given_X_))
print('--- simplify --- ')
P_N_given_X_done = P_N_given_X_.doit(deep=True)
pprint(Eq(P_N_given_X, P_N_given_X_done))

# Does not seem to cancel out the P(di)[i] variable
#pprint(Eq(P_N_given_X, sympy.simplify(P_N_given_X_done)))


The output of this script is: 

-----------------------
OUTPUT OF SVM: P(N | di)
-----------------------

             P(N)⋅P(di|N)[i]
P(N|di)[i] = ───────────────
                 P(di)[i]   

-----------------------
REARANGE USING BAYES P(di | N)
-----------------------

             P(N|di)[i]⋅P(di)[i]
P(di|N)[i] = ───────────────────
                     P(N)       

-----------------------
AGGREGATE USING INDEPENDENCE
-----------------------

          |X|            
         ┬───┬           
P(X|N) = │   │ P(di|N)[i]
         │   │           
         i = 1           
           |X|                       
         ┬──────┬                    
         │      │ P(N|di)[i]⋅P(di)[i]
P(X|N) = │      │ ───────────────────
         │      │         P(N)       
         │      │                    
          i = 1                      

   === ALSO ===
      
        |X|          
       ┬───┬         
P(X) = │   │ P(di)[i]
       │   │         
       i = 1         

-----------------------
REARANGE TO LIKELIHOOD USING BAYES AGAIN
-----------------------

         P(N)⋅P(X|N)
P(N|X) = ───────────
             P(X)   
---
                |X|                       
              ┬──────┬                    
              │      │ P(N|di)[i]⋅P(di)[i]
         P(N)⋅│      │ ───────────────────
              │      │         P(N)       
              │      │                    
               i = 1                      
P(N|X) = ─────────────────────────────────
                    |X|                   
                   ┬───┬                  
                   │   │ P(di)[i]         
                   │   │                  
                   i = 1                  
--- simplify --- 
                        |X|                     
                  -|X| ┬───┬                    
         P(N)⋅P(N)    ⋅│   │ P(N|di)[i]⋅P(di)[i]
                       │   │                    
                       i = 1                    
P(N|X) = ───────────────────────────────────────
                       |X|                      
                      ┬───┬                     
                      │   │ P(di)[i]            
                      │   │                     
                      i = 1                     

There are no additions in this formula, so the denominator should completely cancel. 

Any ideas why the bottom term is not canceled by the top term? 
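
As a sanity check that the cancellation is at least expected mathematically, here is a minimal sketch that replaces the symbolic Product with an explicit product over three observations (the same idea as the ``else`` branch of ``Prod`` in the script). The names ``p_n``, ``p_di`` and ``p_n_given_di`` and the fixed range 1..3 are assumptions made for this illustration only; they are not part of the script above.

```python
import sympy
from sympy import IndexedBase, symbols

# Hypothetical stand-ins for P(N), P(di)[i] and P(N|di)[i].
p_n = symbols('p_n', positive=True)
p_di = IndexedBase('p_di')
p_n_given_di = IndexedBase('p_n_given_di')

# Assumed fixed number of observations, so the product can be written out.
indices = range(1, 4)

# P(X|N) via Bayes on each factor, and P(X) via independence.
p_x_given_n = sympy.prod([p_n_given_di[k] * p_di[k] / p_n for k in indices])
p_x = sympy.prod([p_di[k] for k in indices])

# P(N|X) = P(N) * P(X|N) / P(X); with explicit factors the p_di[k] terms
# appear in both numerator and denominator of an ordinary Mul.
posterior = p_n * p_x_given_n / p_x

# What is left if every p_di[k] cancels.
expected = sympy.prod([p_n_given_di[k] for k in indices]) / p_n**2

print(sympy.simplify(posterior - expected))
```

With the factors written out, the ``p_di[k]`` terms are ordinary factors of a single multiplication, so they can cancel. The problem therefore seems specific to the unevaluated Product objects rather than to the Indexed variables.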

Aaron Meurer

Feb 9, 2016, 4:14:57 PM
to sy...@googlegroups.com
Support for Product in SymPy isn't as great as it could be. simplify() doesn't know how to combine two products that are multiplied or divided by each other. 

It looks like it works for Sum, so the first step would be to see how it works there and then see if a similar thing can be done for Product.

In [10]: simplify(Sum(f(x) + g(x), (x, 0, n)) - Sum(f(x), (x, 0, n)))
Out[10]:
  n
 ___
 ╲
  ╲   g(x)
  ╱
 ╱
 ‾‾‾
x = 0
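
Until simplify() can do this on its own, one possible workaround is to merge the products by hand before calling doit(). The following is only a rough sketch: ``combine_products`` is a hypothetical helper written for this message, not a SymPy function, and it assumes the expression is a plain multiplication whose Product factors carry integer exponents and literally identical limits. The symbols ``a``, ``b``, ``p_n``, ``n`` and a plain integer symbol ``i`` (instead of an Idx) are also assumptions for the example.

```python
from sympy import IndexedBase, Mul, Product, S, symbols


def combine_products(expr):
    """Merge Product factors of a Mul that share exactly the same limits.

    Hypothetical helper, not part of SymPy. It relies on the identity
    Product(f, lim)**k == Product(f**k, lim) for integer k.
    """
    merged = {}   # limits -> combined term of all Products with those limits
    others = []   # factors that are not (powers of) Products
    for factor in Mul.make_args(expr):
        base, exp = factor.as_base_exp()
        if isinstance(base, Product) and exp.is_integer:
            term = base.function**exp
            merged[base.limits] = merged.get(base.limits, S.One) * term
        else:
            others.append(factor)
    products = [Product(term, *limits) for limits, term in merged.items()]
    return Mul(*others) * Mul(*products)


n = symbols('n', integer=True, positive=True)
i = symbols('i', integer=True)
p_n = symbols('p_n', positive=True)
a = IndexedBase('a')
b = IndexedBase('b')

# Same shape as the expression in the question: a product over a*b/p_n
# divided by a product over b, with identical limits.
ratio = p_n * Product(a[i] * b[i] / p_n, (i, 1, n)) / Product(b[i], (i, 1, n))
combined = combine_products(ratio)
print(combined)
```

Once both terms sit inside one Product, the ordinary automatic cancellation of ``b[i]`` against ``1/b[i]`` applies. Products with different limits are deliberately left alone, since merging them would need more care.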

Aaron Meurer

