Spikes in Bayesian blocks?

Matteo Bachetti

Mar 5, 2013, 9:15:07 AM
to astroml...@googlegroups.com

Hi, thanks for sharing this nice library. I'm playing with it right now to analyze light curves of X-ray binaries.

I noticed that for event data there are quite a few intervals where the Bayesian blocks algorithm produces "spikes" in the histogram. In one of your examples HERE there is a spike around t=4; in my plots there are many more. Do you know where they come from? Can you suggest a way to avoid them (e.g., a minimum bin size)? They seem unphysical.
Thanks again!

Matteo

Jake Vanderplas

Mar 5, 2013, 11:51:17 AM
to astroml...@googlegroups.com

Hi Matteo,

Duplicating my response to the same question on the Pythonic Perambulations comment thread...

Simply put, there are spikes because the piecewise-constant likelihood model says that spikes are favored. By calling the spikes unphysical, you are effectively adding a prior on the model based on your intuition of what it should look like. You can play with that by adjusting the Bayesian prior in the code, though that will take digging a bit deeper than just calling astroML's hist() function.

astroML includes several prior forms for Bayesian blocks, which you can see here: https://github.com/astroML/astroML/blob/master/astroML/density_estimation/bayesian_blocks.py
That file also references the Scargle paper, which discusses the priors in more detail.
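For example, something along these lines lets you tighten the prior (a minimal sketch: the event times here are made up, and p0, roughly the false-positive rate per change-point, is a keyword of the Events fitness class in that file):

import numpy as np
from astroML.density_estimation import bayesian_blocks

# made-up event data: steady Poisson-like arrival times
np.random.seed(0)
t = np.cumsum(np.random.exponential(1.0, 1000))

# default prior (p0 ~ 0.05)
edges = bayesian_blocks(t, fitness='events', p0=0.05)

# stricter prior: a smaller p0 penalizes extra change-points,
# which suppresses marginal spikes
edges_strict = bayesian_blocks(t, fitness='events', p0=0.001)

print(len(edges) - 1, "blocks vs", len(edges_strict) - 1, "blocks")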

I haven't yet put together any documented examples of adjusting priors or creating custom priors, but that's on my (rather long) to-do list! Hope that helps.

   Jake


Jeff Scargle

Oct 10, 2014, 7:55:25 PM
to astroml...@googlegroups.com

Hi … Jake's comment is exactly correct.

I am in the habit of automatically including the "penalty" against large numbers of blocks/bins that the prior provides. I would recommend either experimenting with this parameter or adopting the default values (based on achieving a given false-positive rate) given in our paper.

But also beware of getting rid of spikes too easily. A number of times I have found spikes that looked out of place but on further investigation turned out to be "real". Remember that one of the main features of the algorithm is that it can find structure on any scale, as long as that structure is supported by the data. When you fix the bins in advance, you automatically limit the resolution.
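To see concretely what fixed bins give up, compare fixed-width bins with Bayesian blocks on the same events (a sketch with made-up data; astroML's hist() does the block fitting when bins='blocks'):

import numpy as np
import matplotlib.pyplot as plt
from astroML.plotting import hist

# made-up events: uniform background plus one very narrow burst
np.random.seed(42)
t = np.concatenate([np.random.uniform(0, 10, 500),
                    np.random.normal(4.0, 0.01, 50)])

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
hist(t, bins=20, ax=ax1, histtype='stepfilled')        # fixed bins smear the burst
hist(t, bins='blocks', ax=ax2, histtype='stepfilled')  # blocks resolve it
plt.show()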