# Are distributions like Normal and Bernoulli independent by default?


### Gurkirat Singh

Oct 27, 2022, 10:29:54 PM
to TensorFlow Probability
Let's say I have the following Bernoulli distribution:

p_random_var1 = 0.5
p_random_var2 = 0.6

batch_bernoulli_dist = tfd.Bernoulli(probs=[
p_random_var1,
p_random_var2,
])

How is it different from

batch_bernoulli_dist_ind = tfd.Independent(batch_bernoulli_dist, reinterpreted_batch_ndims=1)

Also, I am having trouble understanding the difference between batch shape and event shape in terms of independent/dependent and simultaneous/non-simultaneous events.

### rif

Oct 28, 2022, 6:02:48 AM
to Gurkirat Singh, TensorFlow Probability
Hi Gurkirat.

You may find this tutorial helpful, since it explains exactly this in great (perhaps excruciating) detail. An immediate answer to your question is below:

Event shape is the shape of a single sample (a single event) drawn from the underlying distribution.
Batch shape describes a set of independent, possibly non-identical draws from the underlying distribution.

Your first distribution object, `batch_bernoulli_dist`, has `batch_shape=[2], event_shape=[]`: it is two independent one-dimensional Bernoulli distributions. `tfd.Independent` turns batch dimensions into event dimensions. So `batch_bernoulli_dist_ind` has `batch_shape=[], event_shape=[2]`. It is a single two-dimensional Bernoulli distribution (independent across the dimensions).

It is possible that you're wondering what the difference is. If all you're doing is sampling, there is no difference: if you generate n samples from the same seed, the two objects will give you identical tensors with shape `[n, 2]`. The difference shows up when computing log probabilities: in the simplest case, `batch_bernoulli_dist.log_prob([0, 1]).shape == (2,)`, but `batch_bernoulli_dist_ind.log_prob([0, 1]).shape == ()`. The first object computes the log probs of two independent one-dimensional distributions, while the second computes the log prob of a single two-dimensional distribution.
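To make this concrete without needing TFP at all, here is a plain-Python sketch of the arithmetic, using the probs from the example above and an assumed observation of `[0, 1]`:

```python
import math

def bernoulli_log_prob(p, x):
    """Log-probability of outcome x (0 or 1) under Bernoulli(p)."""
    return math.log(p if x == 1 else 1.0 - p)

probs = [0.5, 0.6]
obs = [0, 1]

# batch_shape=[2], event_shape=[]: one log-prob per batch member.
batch_log_probs = [bernoulli_log_prob(p, x) for p, x in zip(probs, obs)]

# tfd.Independent sums over the reinterpreted dimension, yielding a scalar.
independent_log_prob = sum(batch_log_probs)
```

The batch object returns two numbers (shape `(2,)`), one per Bernoulli; the `Independent`-wrapped object sums them into one scalar (shape `()`), which is the log-probability of the joint event.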

Hope this helps.

Best,

rif

### Gurkirat Singh

Oct 28, 2022, 8:52:46 AM
to TensorFlow Probability, r...@google.com, TensorFlow Probability, Gurkirat Singh
Based on what you have said, most things are clear now. Thank you for sharing the notebook as well.

However, to completely understand these topics and how to work with the library, I would like a few more questions answered.

Q1. You said "it is two independent one-dimensional Bernoulli distributions." If I use the prob method on the Independent version, do I get the joint probability instead, is that correct? This is what I took from your last sentence, "while the second is computing the log prob of a single two-dimensional distribution."
Q2. In this example I have used the Bernoulli distribution. If I use a Normal distribution wrapped in the Independent class, how is it different from MultivariateNormalDiag?

### rif

Oct 28, 2022, 11:27:27 AM
to Gurkirat Singh, TensorFlow Probability

On Fri, Oct 28, 2022 at 5:52 AM Gurkirat Singh <tbh...@gmail.com> wrote:

> Based on what you have said, most things are clear now. Thank you for sharing the notebook as well.
>
> However, to completely understand these topics and how to work with the library, I would like a few more questions answered.
>
> Q1. You said "it is two independent one-dimensional Bernoulli distributions." If I use the prob method on the Independent version, do I get the joint probability instead, is that correct? This is what I took from your last sentence, "while the second is computing the log prob of a single two-dimensional distribution."
> Q2. In this example I have used the Bernoulli distribution. If I use

`log_prob` and `prob` behave the same way with respect to these issues.
So `batch_bernoulli_dist.prob([0, 1]).shape == (2,)`, with each entry of the vector representing the probability under one of the two Bernoullis, and `batch_bernoulli_dist_ind.prob([0, 1]).shape == ()`, returning the probability under the newly created two-dimensional "joint distribution."

> a Normal distribution wrapped in the Independent class, how is it different from MultivariateNormalDiag?

It is semantically identical. (The actual implementation doesn't work this way.) Asking this question generally indicates you "get it."
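One way to convince yourself of that semantic equivalence is to check the densities by hand. Below is a plain-Python sketch (the locs, scales, and evaluation point are made up for illustration) showing that summing independent univariate Normal log-densities is exactly the factorized log-density a diagonal-covariance multivariate Normal computes:

```python
import math

def normal_log_pdf(x, loc, scale):
    """Log-density of a univariate Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)

locs, scales = [0.0, 1.0], [1.0, 2.0]   # hypothetical parameters
x = [0.5, -1.0]                          # hypothetical evaluation point

# Independent(Normal(locs, scales), reinterpreted_batch_ndims=1) sums the
# per-dimension log-densities into one scalar; a MultivariateNormalDiag
# with the same locs and scale_diag factorizes into exactly this same sum.
independent_sum = sum(
    normal_log_pdf(xi, m, s) for xi, m, s in zip(x, locs, scales)
)
```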

Best,

rif

### Gurkirat Singh

Oct 30, 2022, 3:20:46 PM
to TensorFlow Probability, r...@google.com, TensorFlow Probability, Gurkirat Singh
Well, now I have more questions; please answer them sequentially.

1. If a distribution has a non-trivial event shape, without the Independent class, will prob and log_prob treat the event dimensions as dependent or independent? For example, see https://i.imgur.com/lT7QAad.png
2. Also, as the name suggests, does the Independent class make each experiment of event_shape size such that each sub-experiment can be done simultaneously (independent in nature, like tossing a coin 10 times)?
3. How is joint probability related to the Independent class? In my probability class I learnt that joint probability is the probability of two or more independent events happening simultaneously.

### rif

Oct 30, 2022, 3:54:29 PM
to Gurkirat Singh, TensorFlow Probability
On Sun, Oct 30, 2022 at 12:20 PM Gurkirat Singh <tbh...@gmail.com> wrote:
> Well, now I have more questions; please answer them sequentially.
>
> 1. If a distribution has a non-trivial event shape, without the Independent class, will prob and log_prob treat the event dimensions as dependent or independent? For example, see https://i.imgur.com/lT7QAad.png
The event shape represents a single draw from a distribution. If the event shape is not `()` (the distribution isn't scalar-valued), the dimensions can be either dependent or independent. In the case of `MultivariateNormalDiag`, the dimensions happen to be independent, in the case of a generic multivariate normal they're dependent. (If you know that the distribution was created using `tfd.Independent`, then of course you know the dimensions are independent, but in general you don't know.)
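To see the dependent case numerically, here is a small plain-Python sketch (standard marginals, an arbitrary evaluation point, and a hypothetical correlation of 0.8): with zero correlation the bivariate Normal density factorizes into the product of its marginals, which is the `MultivariateNormalDiag` situation, while with nonzero correlation it does not:

```python
import math

def bivariate_normal_log_pdf(x, y, rho):
    """Log-density of a standard bivariate Normal with correlation rho."""
    det = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / det
    return -0.5 * quad - math.log(2.0 * math.pi) - 0.5 * math.log(det)

def std_normal_log_pdf(x):
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

x, y = 0.3, -0.7                     # arbitrary evaluation point
product = std_normal_log_pdf(x) + std_normal_log_pdf(y)

# rho = 0: the joint density factorizes into the product of marginals...
uncorrelated = bivariate_normal_log_pdf(x, y, 0.0)
# ...rho != 0: the dimensions are dependent and it does not.
correlated = bivariate_normal_log_pdf(x, y, 0.8)
```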
> 2. Also, as the name suggests, does the Independent class make each experiment of event_shape size such that each sub-experiment can be done simultaneously (independent in nature, like tossing a coin 10 times)?
Yes.
> 3. How is joint probability related to the Independent class? In my probability class I learnt that joint probability is the probability of two or more independent events happening simultaneously.
Quoting the first sentence of the Wikipedia article, "Given two random variables that are defined on the same probability space, the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs." Note that this definition says nothing about independence (contra your stated definition), and also that it can of course be trivially extended to more than two random variables.

TensorFlow Probability offers a powerful "JointDistribution" abstraction (JD) for working with distributions over multiple random variables (that may or may not be independent); see our documentation. In the specific case of `tfd.Independent`, what you are doing is essentially creating a (mathematical) joint distribution out of a number of independent distributions of the same family. For example, a general joint distribution could be the joint probability of a Normal, a Poisson, and a Bernoulli random variable; in TFP you can do that with the JD abstraction but not with `tfd.Independent`. `tfd.Independent` covers the special case of multiple independent Normals, or multiple independent Poissons, or multiple independent Bernoullis. (Under the hood, the reason it works this way is that this abstraction plays well with the kinds of easily available vectorized parallelization in TensorFlow and JAX.)
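As a plain-Python sketch of that mixed-family case (all parameters are made up for illustration): when the components are independent, the joint log-probability a JD-style factorization computes is just the sum of the component log-probabilities, even when the components come from different families:

```python
import math

def normal_log_pdf(x, loc, scale):
    """Log-density of a univariate Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)

def bernoulli_log_prob(p, k):
    """Log-probability of outcome k (0 or 1) under Bernoulli(p)."""
    return math.log(p if k == 1 else 1.0 - p)

# A mixed-family joint: one Normal and one Bernoulli, assumed independent.
# tfd.Independent cannot express this; a JointDistribution can.
joint_log_prob = normal_log_pdf(0.2, 0.0, 1.0) + bernoulli_log_prob(0.7, 1)
```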

### Gurkirat Singh

Oct 30, 2022, 6:04:28 PM
to TensorFlow Probability, r...@google.com, TensorFlow Probability, Gurkirat Singh
Based on your explanations and the Wikipedia article, I have fixed my definition of joint probability.

Previously I thought P(A and B) is a joint probability iff A and B are independent, which I now see is wrong.

For independent events: P(A and B) = P(A) x P(B)
For dependent events: P(A and B) = P(A) x P(B | A)

Please correct me if I am wrong.

### rif

Oct 30, 2022, 6:15:55 PM
to Gurkirat Singh, TensorFlow Probability
What you wrote is correct because for independent random variables, P(B|A) = P(B), and P(A|B) = P(A).

In general, whether variables are dependent or independent, P(A, B) = P(A) * P(B | A) = P(B) * P(A | B) is always true. In the particular case where they're independent, P(A, B) = P(A) * P(B).
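A tiny numeric check of that identity, using a made-up 2x2 joint table for two binary variables:

```python
# Hypothetical joint distribution over two binary variables A and B.
# Keys are (a, b) pairs; the four entries sum to 1.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

p_a1 = joint[(1, 0)] + joint[(1, 1)]        # marginal P(A=1)
p_b1_given_a1 = joint[(1, 1)] / p_a1        # conditional P(B=1 | A=1)

# Chain rule: P(A=1, B=1) = P(A=1) * P(B=1 | A=1), holds whether or not
# A and B are independent (here they are dependent: 0.4 != 0.6 * 0.7).
chain = p_a1 * p_b1_given_a1
```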

rif

### Gurkirat Singh

Oct 30, 2022, 6:19:20 PM
to rif, TensorFlow Probability
Thank you, Rif, you have helped me a lot. Now I understand probability better.

Regards,
Gurkirat Singh

### Gurkirat Singh

Oct 30, 2022, 8:13:51 PM
to TensorFlow Probability, Gurkirat Singh, TensorFlow Probability, r...@google.com
In the tutorial https://www.tensorflow.org/probability/examples/Understanding_TensorFlow_Distributions_Shapes

What exactly is meant by "identically distributed draws" and "non-identically distributed draws"? Would you provide an example as well?

### rif

Oct 30, 2022, 10:23:05 PM
to Gurkirat Singh, TensorFlow Probability
So take the example `tfd.Poisson(rate=[1., 10., 100.], name='Three Poissons')`, representing three Poisson distributions. The `batch_shape` is `[3]` -- the Python object represents three non-identical (because they have different rate parameters) Poisson distributions. So asking for a single sample from this distribution draws from three non-identical Poissons. On the other hand, asking for more than one sample will replicate the entire batch, so you can think of that as multiple identical draws of the entire batch of non-identical distributions.
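As a plain-Python sketch of those batch semantics (the observed counts are made up): evaluating one observation per batch member yields three log-probabilities, one per (non-identical) Poisson rate:

```python
import math

def poisson_log_prob(rate, k):
    """Log-pmf of Poisson(rate) at the count k."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

rates = [1.0, 10.0, 100.0]   # the three non-identical Poissons
obs = [1, 10, 100]           # one hypothetical count per batch member

# batch_shape=[3]: log_prob returns one value per batch member, not a
# single joint value. Drawing n samples replicates the whole batch,
# i.e. n identical draws of the batch of non-identical distributions.
per_batch = [poisson_log_prob(r, k) for r, k in zip(rates, obs)]
```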

The tutorial covers this pretty extensively.

rif