Comparing 2 beta distributions


vkara...@ucdavis.edu

Sep 7, 2016, 11:22:45 PM
to Davis R Users' Group
Hi! Is there a way to estimate the probability that two random variables (RVs) come from the same beta distribution (short of something generic like the Kolmogorov-Smirnov test)?
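
For reference, the "generic" baseline mentioned here, the two-sample Kolmogorov-Smirnov test, is a one-liner in Python/SciPy; the samples, sizes, and seed below are purely illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.beta(2, 5, size=200)  # illustrative: both samples drawn from Beta(2, 5)
y = rng.beta(2, 5, size=200)

# Two-sample KS test: H0 is that both samples share one (unspecified) distribution.
stat, pvalue = stats.ks_2samp(x, y)
print(stat, pvalue)  # a large p-value gives no evidence against H0
```

Note the KS test is distribution-free: it tests "same distribution", not "same beta distribution" specifically.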
Many Thanks!

Matt Espe

Sep 8, 2016, 1:09:17 AM
to Davis R Users' Group
Sure - Bayes' theorem will get you there, but the denominator (the marginal likelihood) is going to be a bit tricky. It is much easier to compute the posterior up to a constant of proportionality.

However, I am guessing you want to weigh the evidence between two models, Model1: RV1, RV2 ~ Beta(A,B) vs. Model2: RV1 ~ Beta(A1, B1) and RV2 ~ Beta(A2, B2)?

If it is the second, you can fit the two models and then compare them using your choice of information criterion. That would not get you an exact probability, but it would give you the relative weight of evidence in favor of each alternative. Someone else will have to help with fitting beta models in R - I default to other tools for my modeling.
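
That model-comparison idea can be sketched with maximum-likelihood beta fits and AIC. A minimal Python/SciPy example, where the shape parameters, sample sizes, and seed are all assumptions for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
rv1 = rng.beta(2, 5, size=300)  # illustrative: two genuinely different shapes
rv2 = rng.beta(4, 4, size=300)

def beta_aic(data):
    # Fit only the two shape parameters; pin loc/scale to the unit interval.
    a, b, loc, scale = stats.beta.fit(data, floc=0, fscale=1)
    loglik = stats.beta.logpdf(data, a, b).sum()
    return 2 * 2 - 2 * loglik  # AIC with k = 2 free parameters

aic_pooled = beta_aic(np.concatenate([rv1, rv2]))  # Model1: one shared Beta(A, B)
aic_separate = beta_aic(rv1) + beta_aic(rv2)       # Model2: separate betas
# Lower AIC wins; with clearly different shapes, the separate model should win.
```

The separate model carries four parameters to the pooled model's two, so AIC's penalty term keeps the comparison honest.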

Matt

vkara...@ucdavis.edu

Sep 10, 2016, 9:09:02 PM
to Davis R Users' Group
The second option is a great idea; naively, though, the first one appears simpler in concept and implementation, so I'd like to learn a bit more about how to do that.
For example, if for a set of model-predicted (Y) and observed (X) values we have P(Y|X) = P(X|Y)*P(Y)/P(X), would P(X|Y) be the probability of each X given a PDF fitted to all Y (summed over all X), and would P(Y) and P(X) just be the probability of each element given a PDF fitted to its own set (again summed over all known values)? A more correct form, I think, is something like P(Y|X) = sum( P(X|Y_i)*P(Y_i)/P(X) ), though in that case I'm not sure how to get P(X|Y_i).
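
The "first option" - an actual evidence comparison between "one shared beta" and "two separate betas" - amounts to a Bayes factor, i.e. a ratio of marginal likelihoods. A rough numerical sketch in Python/SciPy, where the grid resolution, the uniform prior over the shape parameters, and the data are all assumptions for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.beta(2, 5, size=150)  # illustrative: both samples share Beta(2, 5)
y = rng.beta(2, 5, size=150)

# Assumed uniform prior over (a, b) on [0.5, 10]^2, approximated on a grid.
a_grid = np.linspace(0.5, 10, 50)
b_grid = np.linspace(0.5, 10, 50)
da, db = a_grid[1] - a_grid[0], b_grid[1] - b_grid[0]
prior = 1.0 / ((a_grid[-1] - a_grid[0]) * (b_grid[-1] - b_grid[0]))

def log_marginal(data):
    # log p(data) ~= log of the grid sum of likelihood * prior * cell area
    ll = np.array([[stats.beta.logpdf(data, a, b).sum() for b in b_grid]
                   for a in a_grid])
    m = ll.max()  # log-sum-exp for numerical stability
    return m + np.log(np.exp(ll - m).sum() * prior * da * db)

# log Bayes factor of "one shared beta" over "two separate betas"
log_bf = log_marginal(np.concatenate([x, y])) - (log_marginal(x) + log_marginal(y))
print(log_bf)  # positive values favor the shared-distribution model
```

The marginal likelihood is exactly the tricky denominator Matt mentioned; the grid approximation only works here because the parameter space is two-dimensional.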

Thanks!!
Vadim

Matt Espe

Sep 12, 2016, 1:23:31 AM
to Davis R Users' Group
Hi Vadim,

The full explanation is likely beyond the scope of this forum (being more statistical than R-related). If you are interested in learning this, you should explore classes on probability theory or Bayesian statistics to get the basics (e.g., joint probabilities are products, not sums; predictions are conditioned on data (X_pred) and a model/parameters, which are in turn conditioned on the assumed model structure and the observed data).
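
To illustrate one of those basics - that the joint probability of independent observations is a product of densities (equivalently, a sum of log-densities) - here is a tiny Python/SciPy check with arbitrary numbers:

```python
import numpy as np
from scipy import stats

x = np.array([0.2, 0.5, 0.7])  # arbitrary observations on (0, 1)

# Joint density of independent draws: the product of per-point densities...
joint = np.prod(stats.beta.pdf(x, 2, 5))
# ...which becomes a sum on the log scale, the numerically stable form.
log_joint = stats.beta.logpdf(x, 2, 5).sum()

assert np.isclose(np.log(joint), log_joint)
```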

More related to R, there are some excellent tools for fitting probability-based models that have R interfaces. I tend to default to Stan (mc-stan.org), but others include JAGS, NIMBLE, rethinking, arm, etc. I would caution against jumping in before doing your homework, though. I can say from experience that these models/analyses are very attractive because of their flexibility, but you can really get into trouble without a firm foundation.

If you are interested in the probability stuff, feel free to PM me to avoid taking things too off-topic here.

Matt