Goodness-of-fit for Gaussian distribution of values X,Y

Fred

Apr 13, 2016, 9:12:30 AM
to julia-stats
Hi,

I read in the Distributions.jl documentation that it is possible to fit a distribution to a given set of samples using: d = fit(D, sample)

I have a set of (X,Y) values and I would like to determine whether the distribution of these values is Gaussian, with a goodness-of-fit measure or p-value to accept or reject the hypothesis that the distribution is Gaussian.

1- First of all, in the expression d = fit(Normal, sample), it is unclear to me how sample should be organised: can I concatenate the X and Y vectors like this: sample = [X Y]?

2- params(d) does not give a goodness-of-fit measure; how can I obtain this information?

Many thanks for your comments!

Dan

Apr 13, 2016, 10:54:10 AM
to julia-stats
The `D` in the code snippet could be replaced with `MultivariateNormal`. `sample` should have a row for `X` and a row for `Y`. This could be arranged using `sample = [X'; Y']`.

The resulting fit is a 2D Gaussian, with a mean for X and Y and the correlations.

If this is not the desired fit, perhaps an ordinary least squares regression would do the job.
Hope this helps.
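
The layout described above can be sketched as follows. This is a minimal illustration, not code from the thread: the X and Y values are synthetic stand-ins, and the sketch is written for a current Julia release with Distributions.jl installed, using the `MvNormal` name that appears later in the thread.

```julia
using Distributions, Random

Random.seed!(1)                  # fixed seed so the run is repeatable
X = randn(200)                   # hypothetical X values (assumption)
Y = 0.5 .* X .+ randn(200)       # hypothetical Y values, correlated with X

sample = [X'; Y']                # 2x200 matrix: row 1 holds X, row 2 holds Y
d = fit_mle(MvNormal, sample)    # maximum-likelihood fit of a 2D Gaussian

mu = mean(d)                     # fitted means of X and Y
sigma = cov(d)                   # fitted 2x2 covariance matrix
rho = sigma[1, 2] / sqrt(sigma[1, 1] * sigma[2, 2])  # correlation implied by the fit
```

Each column of `sample` is treated as one observation, which is why the vectors are stacked as rows rather than placed side by side.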

Fred

Apr 13, 2016, 11:04:58 AM
to julia-stats
Thank you very much, Dan! I tried with a big array and got an error:

julia> typeof(mi)
Array{Float64,2}

julia> size(mi)
(1913,2)

julia> d = fit_mle(MultivariateNormal, mi)
ERROR: Base.LinAlg.PosDefException(2)
 in chol! at linalg/cholesky.jl:28
 in cholfact at linalg/cholesky.jl:126 (repeats 2 times)
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:261
 in fit_mle at /home/fred/.julia/v0.4/Distributions/src/multivariate/mvnormal.jl:251


So I created a simple example:

julia> x
6-element Array{Any,1}:
 1
 2
 3
 4
 5
 2

julia> y
6-element Array{Any,1}:
 2
 3
 3
 5
 4
 2

julia> s = [x y]
6x2 Array{Any,2}:
 1  2
 2  3
 3  3
 4  5
 5  4
 2  2

julia> d = fit_mle(MultivariateNormal, s)
ERROR: suffstats is not implemented for (Distributions.MvNormal{Cov<:PDMats.AbstractPDMat{T<:Real},Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}},Array{Any,2}).
 in error at ./error.jl:21
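
A likely reading of the two errors, offered as a sketch rather than a confirmed diagnosis: the second error comes from the element type (`Array{Any,2}` where floating-point data is expected), and the first probably comes from the orientation. A 1913x2 matrix is read as 2 observations of a 1913-dimensional variable, so the estimated covariance cannot be positive definite and the Cholesky factorisation fails. The snippet below rebuilds the toy example with both points addressed; it assumes a current Julia release with Distributions.jl, and the names `s64` and `obs` are introduced here for illustration.

```julia
using Distributions

x = Any[1, 2, 3, 4, 5, 2]        # toy data from the post, still typed as Any
y = Any[2, 3, 3, 5, 4, 2]
s = [x y]                        # 6x2 matrix with element type Any

s64 = Matrix{Float64}(s)         # convert the element type from Any to Float64
obs = permutedims(s64)           # 2x6: variables in rows, observations in columns

d = fit_mle(MvNormal, obs)       # fits a 2D Gaussian to the six (x, y) pairs
```

For the large array, the same transposition would give a 2x1913 matrix, which matches the layout `sample = [X'; Y']` suggested earlier in the thread.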





Fred

Apr 13, 2016, 11:22:09 AM
to julia-stats
julia> d = fit_mle(MvNormal, mi)
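
The goodness-of-fit part of the original question can be approached, for one variable at a time, with a Kolmogorov-Smirnov test from the HypothesisTests.jl package. The following is a hedged sketch under stated assumptions: the data are synthetic, HypothesisTests.jl is assumed to be installed, and the test is applied to a single margin only.

```julia
using Distributions, HypothesisTests, Random

Random.seed!(2)                  # fixed seed so the run is repeatable
X = randn(300)                   # hypothetical values standing in for one column of the data

dX = fit_mle(Normal, X)          # univariate Normal fitted to X
t = ApproximateOneSampleKSTest(X, dX)  # KS test of X against the fitted Normal
p = pvalue(t)                    # small p suggests rejecting normality of this margin
```

Two caveats apply. Because the Normal parameters are estimated from the same data, the p-value tends to be conservative. Also, normal margins do not by themselves establish that (X,Y) is jointly Gaussian, so this checks a necessary condition only.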