EMCEE gives wrong parameter values, and they do not change when we change the prior function


Ehsan Sadri

Oct 5, 2018, 12:30:34 PM
to lmfit-py

Dear Friends,
I have to ask my question here. PLEASE PLEASE help me.
I wrote the EMCEE code as you can see in the ohd.py file.
It runs through to the plot, which is fine, BUT the values are wrong!!! Also, when I change the parameter range in def log_prior(theta), the final answer does not change.
I can only say that the BAO, SGL and OHD functions are correct and are used in other programs. For debugging, we can use just one of these functions in the code to find the problem faster. I expect to get answers around 0.3 and 0.7 for the parameters used in the code.

I do really need it. It made me mad. I appreciate your help.
I would be very very very grateful if you could help me.

Thank you.
ohd.py
BAO.csv
OHD.txt
SGL.txt

Matt Newville

Oct 5, 2018, 2:24:22 PM
to lmfit-py
On Fri, Oct 5, 2018 at 11:30 AM Ehsan Sadri <esad...@gmail.com> wrote:

Dear Friends,
I have to ask my question here.

I could be wrong, but I do not think that is actually a requirement.  The code you attached does not use lmfit.  I would assume that Emcee has its own methods for helping its users.  You might consult https://github.com/dfm/emcee for more information.
I don't actually understand what your question is.  When I try to run your code, it does actually run and shows a plot.   It prints out no information.  I have no idea what your code is meant to do -- there is no explanation and the code is a mess. 

The solution from `scipy.optimize.minimize` gives parameter values of ~ 400 and -500.  Starting from those, your `log_prior` will always return -Inf, and so will your `log_probability`.   I don't know why your initial fit doesn't match your prior knowledge, but you *are* telling the MC sampler that `log_probability` is always -Inf.
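A quick way to confirm this is to evaluate the functions at both starting points before sampling. The sketch below is only an illustration: the flat prior on [0, 1] follows the description in this thread, while the `log_likelihood` body is a made-up placeholder (the real one in ohd.py uses the BAO, SGL and OHD data), and the values standing in for `soln.x` are rounded.

```python
import math

def log_prior(theta):
    # Assumed flat prior on [0, 1] for both parameters, as described in the thread.
    a, b = theta
    if 0.0 < a < 1.0 and 0.0 < b < 1.0:
        return 0.0
    return -math.inf

def log_likelihood(theta):
    # Placeholder likelihood for illustration only; not the one from ohd.py.
    a, b = theta
    return -0.5 * ((a - 0.3) ** 2 + (b - 0.7) ** 2) / 0.01

def log_probability(theta):
    # Posterior = prior + likelihood, short-circuiting when the prior is -inf.
    lp = log_prior(theta)
    if not math.isfinite(lp):
        return -math.inf
    return lp + log_likelihood(theta)

initial_guess = (0.3, 0.7)          # values set at the top of the script
minimize_result = (405.0, -496.0)   # roughly what soln.x contained

for name, theta in [("initial guess", initial_guess), ("soln.x", minimize_result)]:
    print(name, theta,
          "log_prior =", log_prior(theta),
          "log_probability =", log_probability(theta))
```

If the second line reports -inf for both functions, the sampler is being started where the posterior is undefined.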

I would assume the problem is actually in your use of `minimize()` and your `log_likelihood()` function, but I have no idea what that problem might be.

Hope that helps. But also, if you want further help here, please ask actual questions that can be answered.

--Matt

Ehsan Sadri

Oct 5, 2018, 4:02:52 PM
to lmfit-py
 The question is:

The PLOT shows the outputs in the graph. The values are wrong. I set the initial values as 0.3 and 0.7, but in the plot you see a range of numbers around 0.0003 or -0.0085.
We should have a plot with values around 0.3 and 0.7 on its axes.
Also, if you change the range of values in the log_prior function, the output does not change.

Matt Newville

Oct 5, 2018, 5:28:38 PM
to lmfit-py
Hi Ehsan,


On Fri, Oct 5, 2018 at 3:02 PM Ehsan Sadri <esad...@gmail.com> wrote:
 The question is:


I don't want to put too fine a point on this, but this really isn't a question. You are making statements, not asking questions.

The PLOT shows the outputs in the graph. The values are wrong. I set the initial values as 0.3 and 0.7, but in the plot you see a range of numbers around 0.0003 or -0.0085.

Err, no.  That's a misreading of the plot.  In fairness, `corner` (or perhaps it is matplotlib) is showing a very confusing plot with axes that are far too easy to misread.   The values are more like 405 and -496.   You can probably see a '+405e2' and '-496e2' somewhere on the plot.  The idea appears to be that the range of values is something like 405 +/- 0.005 and so on.   It's a very unfortunate display of data and I suggest you send this to the `corner` maintainers and ask them why their library that claims to display quantitative data shows such a misleading plot.
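To make the axis reading concrete: with offset notation, each tick label is a small deviation that has to be added to the offset printed at the end of the axis. A minimal sketch of that arithmetic follows; the pairing of tick labels with offsets is an assumption based on the numbers quoted in this thread, and `true_axis_value` is a hypothetical helper name.

```python
def true_axis_value(tick_label, offset):
    """Combine a tick label with the axis offset text to recover the real value."""
    return offset + tick_label

# Offsets as they would appear next to the axes (4.05e2 = 405, -4.96e2 = -496).
offset_first = 4.05e2
offset_second = -4.96e2

# Tick labels that were read as if they were the parameter values themselves.
print(true_axis_value(0.0003, offset_first))
print(true_axis_value(-0.0085, offset_second))
```

If the figure is drawn with matplotlib, calling `ax.ticklabel_format(useOffset=False)` on each axes should switch the offset off so the full values appear on the ticks, though I have not checked how that interacts with `corner`.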

But, as you would see from printing out the results, the values are not 0.3 to 0.7, and are not 0.0003, but are rather 405 and -496.  That is what your fit is returning.   Unfortunately, you are relying on your ability to read an image displaying the fit results rather than reading the text of fit results.
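Printing a numeric summary of the chain takes only a few lines. The sketch below uses synthetic samples as a stand-in for the flattened chain (they are generated here purely for illustration and are not results from ohd.py); `summarize` and the parameter names are hypothetical.

```python
import random
import statistics

random.seed(0)
# Synthetic stand-in for the flattened chain: 1000 samples of 2 parameters.
samples = [(random.gauss(405.0, 0.005), random.gauss(-496.0, 0.005))
           for _ in range(1000)]

def summarize(values):
    # statistics.quantiles(n=100) returns 99 cut points; index k-1 is the
    # k-th percentile, so these are the 16th, 50th and 84th percentiles.
    q = statistics.quantiles(values, n=100)
    return q[15], q[49], q[83]

for i, name in enumerate(["param_1", "param_2"]):
    lo, med, hi = summarize([s[i] for s in samples])
    print(f"{name}: {med:.4f} (+{hi - med:.4f} / -{med - lo:.4f})")
```

Reading numbers like these leaves no room to confuse 0.0003 with 405.0003.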

We should have a plot with values around 0.3 and 0.7 in its axes.

Well, if those were the best fit values, that's probably what you would get.  But those are not the fit results you get.

Also, if you change the range of values in the log_prior function, the output does not change.

I do not know what you mean.  But if the parameter values are outside [0, 1],  log_prior() returns -Inf, and then log_posterior() also returns -Inf.  That's what you are getting.  That's what your functions do.  I don't have the slightest idea why your initial fit with log_likelihood() returns values (in your soln.x) around 400, but with those results as your starting value, emcee isn't going to do much to move those values toward your initial expectation.

So, why are you starting your sample with values ~400 when you expect values of ~0.5?  With values so far out of range to return a finite log probability,  emcee is not going to find a good solution.

Hope that helps,

--Matt

Ehsan Sadri

Oct 5, 2018, 5:50:51 PM
to lmfi...@googlegroups.com
Yes, '+405e2' and '-496e2', or '+4.5e2' and '-4.96e2'. But what do you mean, why do I start with ~4.00? I choose the initial values at the top of the code, and the rest should be done by the algorithm used by the code. I really do not understand what the problem is. I normally use my own MCMC code, but this one has driven me crazy, btw. I really don't know where I should ask my question. I know it may have a simple solution, but I am stuck on it.

Matt Newville

Oct 5, 2018, 7:52:01 PM
to lmfit-py
On Fri, Oct 5, 2018 at 4:50 PM Ehsan Sadri <esad...@gmail.com> wrote:


Those are not '+405e2' and '-496e2'; they are '+4.5e2' and '-4.96e2'. And what do you mean, why do I start with ~4.00? I choose the initial values at the top of the code, and the rest should be done by the algorithm used by the code.

Well, I think you are asking for advice about how to change the code.

You do not start the emcee sampler with your values of ~0.5.  You send those to minimize() and then use the results of that to start the emcee sampler.   Those values from minimize() (in your `soln.x`) are on the scale of 400, not 0.5.   And those values you are choosing to use to start the emcee sampler are generating -Inf.  That is what your model chooses to do.   
 
I really do not understand what the problem is. I normally use my own MCMC code, but this one has driven me crazy, btw. I really don't know where I should ask my question. I know it may have a simple solution, but I am stuck on it.

Well, why are you starting the emcee sampler with values that are well outside the acceptable range of values in your log_prior() function?
That kind of makes no sense.

Why do you assert that values outside [0, 1] are infinitely unlikely?  I have no idea why that is -- your model is meaningless to me.  But for sure, values of ~400 are far outside the values acceptable to your log_prior() function.
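One way to act on this, sketched under assumptions: check the `minimize()` result against the prior before using it, fall back to the original guess if it is rejected, and build the walker starting positions only from points with a finite log prior. The flat [0, 1] prior matches what is described above; `initial_walkers`, the walker count, and the scatter scale are hypothetical choices, not taken from ohd.py.

```python
import math
import random

def log_prior(theta):
    # Assumed flat prior on [0, 1] for both parameters.
    a, b = theta
    if 0.0 < a < 1.0 and 0.0 < b < 1.0:
        return 0.0
    return -math.inf

def initial_walkers(center, nwalkers, scale=1e-3, seed=42):
    """Scatter walkers in a small Gaussian ball around `center`,
    keeping only positions where the prior is finite."""
    rng = random.Random(seed)
    walkers = []
    while len(walkers) < nwalkers:
        candidate = tuple(c + scale * rng.gauss(0.0, 1.0) for c in center)
        if math.isfinite(log_prior(candidate)):
            walkers.append(candidate)
    return walkers

initial_guess = (0.3, 0.7)
soln_x = (405.0, -496.0)  # rounded stand-in for the minimize() result

# Only trust the optimizer's answer if the prior allows it.
start = soln_x if math.isfinite(log_prior(soln_x)) else initial_guess
pos = initial_walkers(start, nwalkers=32)
print("starting from", start, "with", len(pos), "walkers")
```

The resulting list could then be converted to an array and passed to the sampler as its initial state. This only fixes the starting point; it does not explain why `minimize()` wandered off to ~400 in the first place, which still points at `log_likelihood()`.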

Good luck!

--Matt 