Autocorrelation and Effective sample size


PasanS

Sep 20, 2012, 10:48:18 AM
to migrate...@googlegroups.com
Hi,
 
I am trying to understand the relationship between autocorrelation and effective sample size (ESS). I thought that the two were inversely related (i.e., lower autocorrelation means higher effective sample size and vice versa). I analyzed the same data file twice with different seed numbers and compared the results. For a particular parameter, both autocorrelation and effective sample size were low in one analysis and both were higher in the other, which is counterintuitive to me. Can someone clarify?
 
Thanks
 
Pasan

Peter Beerli

Sep 20, 2012, 10:50:54 AM
to migrate...@googlegroups.com
Perhaps a few numbers would help?
Peter



Pasan Samarasin

Sep 20, 2012, 11:10:52 AM
to migrate...@googlegroups.com
Here are a few parameters I noticed:
 
              Analysis 1            Analysis 2
param         Autocorr   ESS        Autocorr   ESS
T1            0.9825     27831      0.7710     21677
T3            0.8088     33061      0.8188     41502
M 1->3        0.975      9753       0.9625     7011

 

Thanks!

Pasan

 


Peter Beerli

Sep 20, 2012, 4:42:25 PM
to migrate...@googlegroups.com
It is not clear what is going on. It looks like the two runs were not run for the same number of samples, because the parameter updates are picked randomly.
There is a table with "Acceptance ratios for all parameters and the genealogies". Could you give the numbers under "Accepted changes" [accepted/attempted] for the parameters below?

Peter
P.S. In any case, ESS and autocorrelation are very rough estimators of convergence; to me, interrogating the posterior histograms is more powerful. Usually an ESS of several thousand is OK, but I have seen cases that needed much more than that.
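
To make the point about unequal sample counts concrete, here is a minimal Python sketch. It assumes the textbook AR(1) approximation ESS = N(1 - r)/(1 + r), where N is the number of recorded samples and r is the lag-1 autocorrelation. This is an illustrative assumption, not necessarily the formula MIGRATE uses, and the function name `ess_ar1` and the sample counts are made up for the example.

```python
def ess_ar1(n_samples: int, r: float) -> float:
    """Approximate ESS for a chain whose lag-1 autocorrelation is r.

    Assumes the chain behaves like an AR(1) process (an assumption made
    for illustration; MIGRATE's internal calculation may differ).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return n_samples * (1.0 - r) / (1.0 + r)


# Same number of samples: lower autocorrelation gives the higher ESS.
same_n_low_r = ess_ar1(100_000, 0.77)
same_n_high_r = ess_ar1(100_000, 0.98)

# Different numbers of samples (hypothetical counts): the more strongly
# autocorrelated chain can still end up with the larger ESS.
short_low_r = ess_ar1(100_000, 0.77)
long_high_r = ess_ar1(2_000_000, 0.98)

print(f"equal N:   r=0.77 -> {same_n_low_r:.0f}, r=0.98 -> {same_n_high_r:.0f}")
print(f"unequal N: r=0.77 -> {short_low_r:.0f}, r=0.98 -> {long_high_r:.0f}")
```

Under this approximation the inverse relationship holds only when N is fixed, so a parameter that was updated more often in one run can show both a higher autocorrelation and a higher ESS.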

Pasan Samarasin

Sep 20, 2012, 5:06:05 PM
to migrate...@googlegroups.com
Thanks Peter,
 
I used the same data file, so there shouldn't be any difference in the data. The acceptance ratios are all 1, I think because it is slice sampling. I have attached the two output files here.
 
Thank you for your assistance!
 
Pasan

m=0.01-4980gen&0.005-20gen_006_1.pdf
m=0.01-4980gen&0.005-20gen_006_2.pdf

Peter Beerli

Sep 20, 2012, 5:29:38 PM
to migrate...@googlegroups.com

Pasan:

The run parameters in your two PDF files are slightly different: the temperature of the hottest chain differs between the two files.

<….1.pdf> 10.00 3.00 1.50 1.00

<….2.pdf> 12.00 3.00 1.50 1.00 

I am surprised that a difference in the hottest chain can have such an effect on the ESS, but that is what it looks like. [I do not claim that there is absolutely no issue with the ESS calculations, but at the moment the smoking gun is the difference in heating.] Be aware that with your heating scheme you should ignore the marginal likelihood table, because a hottest temperature of 10 or 12 is too cold to sample from the prior (that would need a temperature > 100000).

BTW both of your runs look good enough for me.

Peter
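
The remark about the hottest chain can be illustrated with a short Python sketch. It assumes the usual heated-chain convention in which a chain at temperature T samples from prior x likelihood^(1/T), so that 1/T close to 0 means sampling from the prior. That convention, and the helper name `inverse_temperatures`, are assumptions made for illustration; only the temperatures come from the two output files.

```python
def inverse_temperatures(temperatures):
    """Convert chain temperatures T into likelihood exponents beta = 1/T."""
    return [1.0 / t for t in temperatures]


run1_betas = inverse_temperatures([10.00, 3.00, 1.50, 1.00])
run2_betas = inverse_temperatures([12.00, 3.00, 1.50, 1.00])

# Temperature quoted above as hot enough to behave like the prior.
prior_like_beta = 1.0 / 100000

for name, betas in (("run 1", run1_betas), ("run 2", run2_betas)):
    hottest = min(betas)  # the hottest chain has the smallest exponent
    print(f"{name}: hottest chain has beta = {hottest:.4f}; "
          f"prior-like sampling needs beta near {prior_like_beta:.5f}")
```

With a hottest exponent of about 0.1 or 0.08, neither run comes close to the prior, which is consistent with the advice to ignore the marginal likelihood table for these runs.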




Pasan Samarasin

Sep 24, 2012, 11:56:02 AM
to migrate...@googlegroups.com
Hi Peter,
 
Here are two analyses with identical parameters (including heating) that show the discrepancy between autocorrelation and ESS. Take a look at Theta 1 and Theta 3. Maybe there is something wrong with the calculations.
 
Thank you,
 
Pasan 

m=0.01-4990gen&0.005-10gen_007 (1).pdf
m=0.01-4990gen&0.005-10gen_007 (2).pdf

buckland steeves

Sep 24, 2012, 11:57:28 AM
to migrate...@googlegroups.com
wrong email dude!




Peter Beerli

Oct 7, 2012, 12:56:47 PM
to migrate...@googlegroups.com
Pasan,
I don't think there is anything wrong. After looking at the code, the two numbers (autocorrelation and ESS) are generated as follows during a run. The autocorrelation is an average over the whole run, so different runs may have slightly different values. The ESS is built up during a run and depends on loci, replicates, and autocorrelation. It seems that the resulting discrepancies are just a result of this build-up process. I would also think that these are only rough guides anyway, and +/- 10 or 20% of the value would not really astonish me.

Peter
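
The build-up described above can be mimicked with a small, hedged Python sketch. It is not MIGRATE's code: it assumes that the reported autocorrelation is a plain average of per-replicate lag-1 autocorrelations, that the reported ESS is a sum of per-replicate values of N(1 - r)/(1 + r), and that each replicate behaves like a simulated AR(1) chain. All function names, chain lengths, and seeds are hypothetical.

```python
import random


def simulate_ar1(n, rho, seed):
    """Simulate an AR(1) chain of length n with autocorrelation rho."""
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def lag1_autocorr(chain):
    """Sample lag-1 autocorrelation of a chain."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((v - mean) ** 2 for v in chain)
    cov = sum((chain[i] - mean) * (chain[i + 1] - mean) for i in range(n - 1))
    return cov / var


def summarize(replicates):
    """Return (mean autocorrelation, summed ESS) over a list of chains."""
    rs = [lag1_autocorr(c) for c in replicates]
    mean_r = sum(rs) / len(rs)
    # ESS is accumulated replicate by replicate (assumed build-up rule).
    total_ess = sum(len(c) * (1.0 - r) / (1.0 + r) for c, r in zip(replicates, rs))
    return mean_r, total_ess


# Two runs with identical settings that differ only in their random seeds.
run_a = [simulate_ar1(5000, 0.9, seed) for seed in (1, 2, 3)]
run_b = [simulate_ar1(5000, 0.9, seed) for seed in (4, 5, 6)]

for name, run in (("run A", run_a), ("run B", run_b)):
    mean_r, total_ess = summarize(run)
    print(f"{name}: mean autocorrelation = {mean_r:.4f}, summed ESS = {total_ess:.0f}")
```

Because each replicate's ESS is a nonlinear function of its own autocorrelation, the averaged autocorrelation and the summed ESS do not have to rank two runs in opposite order, and both numbers fluctuate from seed to seed even when the settings are identical.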


