Goodness-of-fit for Gaussian distribution of values X,Y

Fred

Apr 13, 2016, 9:12:30 AM

to julia-stats

Hi,

I read in the Distributions.jl documentation that it is possible to fit a distribution to a given set of samples using `d = fit(D, sample)`.

I have a set of (X, Y) values and I would like to determine whether they follow a Gaussian distribution, with a goodness-of-fit measure or p-value to accept or reject the hypothesis that the distribution is Gaussian.

1- First of all, in `d = fit(Normal, sample)`, it is unclear to me how `sample` should be organised: can I concatenate the X and Y vectors like this: `sample = [X Y]`?

2- `params(d)` does not give a goodness-of-fit; how can I obtain this information?
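
The second question is not answered in this thread. One possible approach for a single variable, sketched here under assumptions, is a one-sample Kolmogorov-Smirnov test from the separate HypothesisTests.jl package. The data below are synthetic and purely illustrative, and a current Julia 1.x release is assumed. Because the parameters are estimated from the same data being tested, the resulting p-value tends to be conservative, and testing X and Y separately says nothing about their joint distribution.

```julia
using Distributions, HypothesisTests, Random

Random.seed!(1)                  # synthetic data, for illustration only
x = randn(200) .* 2.0 .+ 5.0     # hypothetical univariate sample

d = fit(Normal, x)               # maximum-likelihood estimate of μ and σ
t = ExactOneSampleKSTest(x, d)   # H0: x is drawn from the fitted Normal
p = pvalue(t)                    # small p suggests rejecting normality

println("fitted parameters: ", params(d))
println("KS p-value: ", p)
```

HypothesisTests.jl also offers other tests of this kind, such as `OneSampleADTest` and `JarqueBeraTest`; which one suits the data is a judgment call.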

Many thanks for your comments!

Dan

Apr 13, 2016, 10:54:10 AM

to julia-stats

The `D` in the code snippet could be replaced with `MultivariateNormal`. `sample` should have a row for `X` and a row for `Y`, so that each column is one observation. This could be arranged using `sample = [X'; Y']`.

The resulting fit is a 2D Gaussian, with a mean for each of X and Y and a covariance matrix describing their variances and correlation.

If this is not the desired fit, perhaps an ordinary least squares regression would do the job.
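
A minimal sketch of that layout, assuming a current Julia 1.x release with Distributions.jl installed and reusing the six values posted later in this thread (written as `Float64`). It only illustrates the row-per-variable arrangement and how to read the fitted mean and covariance.

```julia
using Distributions, Statistics

X = [1.0, 2.0, 3.0, 4.0, 5.0, 2.0]
Y = [2.0, 3.0, 3.0, 5.0, 4.0, 2.0]

sample = [X'; Y']            # 2×6 matrix: one row per variable, one column per observation
d = fit(MvNormal, sample)    # maximum-likelihood fit of a 2D Gaussian

println("mean:       ", mean(d))   # fitted means of X and Y
println("covariance: ", cov(d))    # 2×2 covariance matrix (normalised by n, not n-1)
```

The fitted covariance is the maximum-likelihood estimate, so it matches `cov(sample; dims=2, corrected=false)` rather than the default sample covariance.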

Hope this helps.

Fred

Apr 13, 2016, 11:04:58 AM

to julia-stats

Thank you very much, Dan! I tried with a big array and got an error:

julia> typeof(mi)
Array{Float64,2}


julia> size(mi)
(1913,2)


julia> d = fit_mle(MultivariateNormal, mi)
ERROR: Base.LinAlg.PosDefException(2)
 in chol! at linalg/cholesky.jl:28
 in cholfact at linalg/cholesky.jl:126 (repeats 2 times)
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:261
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:251

So I created a simple example:



julia> x
6-element Array{Any,1}:
 1
 2
 3
 4
 5
 2


julia> y
6-element Array{Any,1}:
 2
 3
 3
 5
 4
 2


julia> s = [x y]
6x2 Array{Any,2}:
 1  2
 2  3
 3  3
 4  5
 5  4
 2  2

julia> d = fit_mle(MultivariateNormal, s)
ERROR: suffstats is not implemented for (Distributions.MvNormal{Cov<:PDMats.AbstractPDMat{T<:Real},Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}},Array{Any,2}).
 in error at ./error.jl:21
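
Both errors are consistent with how Distributions.jl reads a sample matrix: each column is one observation, and `fit_mle` for `MvNormal` expects `Float64` entries. A 1913×2 matrix is then treated as two observations of a 1913-dimensional variable, whose estimated covariance cannot be positive definite, and an `Array{Any,2}` has no matching method. The thread does not confirm this diagnosis. The sketch below assumes a current Julia 1.x release (the broadcast syntax `Float64.(s)` is not available in v0.4) and reuses the six-point example.

```julia
using Distributions

x = Any[1, 2, 3, 4, 5, 2]
y = Any[2, 3, 3, 5, 4, 2]
s = [x y]                          # 6×2 Matrix{Any}, as in the post

# Convert the elements to Float64, then transpose so that each
# column is one (x, y) observation: the result is a 2×6 matrix.
data = permutedims(Float64.(s))

d = fit_mle(MvNormal, data)
println(mean(d))                   # fitted means of x and y
```

For the larger array, the analogous call would be `fit_mle(MvNormal, permutedims(mi))`, assuming `mi` holds one observation per row.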

Fred

Apr 13, 2016, 11:22:09 AM

to julia-stats

julia> d = fit_mle(MvNormal, mi)
