optimization & integration error

250 views
Skip to first unread message

Rebecca Harris

unread,
Aug 2, 2016, 7:34:24 PM8/2/16
to dadi-user
Hi,

I'm trying to fit the 1D models to each of my populations before moving onto the 2D models. I've incorporated the code for misidentified SNPs (p_misid) into my code and I've tried optimizing each model with 5 independent runs. For the bottlegrowth and 3epoch models, I find the same likelihood score but get very different parameter values. I read in other posts that this could be due to a lack of rare alleles, but it looks like the model fits my data pretty well there. I then used the hessian method to calculate the uncertainty of my parameters. It's unclear to me whether I need to do more work with my 1D models or if I can be content with calling the two epoch model the best model for both populations. 

I tried moving on to the standard 2D bottlegrowth models. However, I am getting the following error when I try to run the bottlegrowth_split model:
"ValueError: Final integration time T (-2.453609) is less than intial_time (0.000000). Integration cannot be run backwards."
I searched around to figure out what this might be and saw mention of make_extrap_log_func and dadi.Integration.timescale_factor. I am using make_extrap_log_func and have increased the value of dadi.Integration.timescale_factor, but continue to get this integration error. I don't get any errors using the 2D snm or bottlegrowth functions. 

Any help would be greatly appreciated. 

Thanks!
Rebecca
trOUT_P1_22_stats.txt
trOUT_P2_16_stats.txt
trOUT_P2_16_results.pdf

Gutenkunst, Ryan N - (rgutenk)

unread,
Aug 4, 2016, 3:07:57 PM8/4/16
to dadi...@googlegroups.com
Hello Rebecca,

On Aug 2, 2016, at 4:34 PM, Rebecca Harris <reb...@gmail.com> wrote:
I'm trying to fit the 1D models to each of my populations before moving onto the 2D models. I've incorporated the code for misidentified SNPs (p_misid) into my code and I've tried optimizing each model with 5 independent runs. For the bottlegrowth and 3epoch models, I find the same likelihood score but get very different parameter values. I read in other posts that this could be due to a lack of rare alleles, but it looks like the model fits my data pretty well there. I then used the hessian method to calculate the uncertainty of my parameters. It's unclear to me whether I need to do more work with my 1D models or if I can be content with calling the two epoch model the best model for both populations.

I’m not sure what your LRT comparisons are between, but the two-epoch model fits the data almost as well as the bottlegrowth (and as well as three epoch). So I would stick with that. 

I tried moving on to the standard 2D bottlegrowth models. However, I am getting the following error when I try to run the bottlegrowth_split model:
"ValueError: Final integration time T (-2.453609) is less than intial_time (0.000000). Integration cannot be run backwards."
I searched around to figure out what this might be and saw mention of make_extrap_log_func and dadi.Integration.timescale_factor. I am using make_extrap_log_func and have increased the value of dadi.Integration.timescale_factor, but continue to get this integration error. I don't get any errors using the 2D snm or bottlegrowth functions.

My guess would be that you’re getting negative parameter values passed into the demographic model function somehow. Check that you’ve bounded the parameter search so no parameters can be negative. You can also set verbose=1 in the optimization functions to see each evaluation, to determine exactly what set of parameter values is causing the problem.

Best,
Ryan

Any help would be greatly appreciated. 

Thanks!
Rebecca

--
You received this message because you are subscribed to the Google Groups "dadi-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dadi-user+...@googlegroups.com.
To post to this group, send email to dadi...@googlegroups.com.
Visit this group at https://groups.google.com/group/dadi-user.
For more options, visit https://groups.google.com/d/optout.
<trOUT_P1_22_stats.txt><trOUT_P2_16_stats.txt><trOUT_P2_16_results.pdf>

--
Ryan Gutenkunst
Assistant Professor of Molecular and Cellular Biology, University of Arizona
phone: (520) 626-0569, office: LSS 325, web: http://gutengroup.mcb.arizona.edu

Latest papers: 
“Selection on network dynamics drives differential rates of protein domain evolution”
PLoS Genetics; http://dx.doi.org/10.1371/journal.pgen.1006132
"Triallelic population genomics for inferring correlated fitness effects of same site nonsynonymous mutations"
Genetics; http://dx.doi.org/10.1534/genetics.115.184812
"Whole genome sequence analyses of Western Central African Pygmy hunter-gatherers reveal a complex demographic history and identify candidate genes under positive natural selection"
Genome Research; http://dx.doi.org/10.1101/gr.192971.115

Rebecca Harris

unread,
Aug 4, 2016, 3:31:08 PM8/4/16
to dadi-user
Hi -

I retried the 2D models and it seems like the model will run 1/10 of the time but spit an error the rest of the time. When I do get an integration error, I get the message below. It gives this error when I run optimize_log, after I run make_extrap_log_func and perturb_params. Changing my upper/lower bounds don't seem to help. Is this just a starting seed problem and something I can ignore or does this reflect something about my data that I should try to troubleshoot?

Thanks for your help!
Rebecca



    popt_bgs = dadi.Inference.optimize_log(p0_bgs, fs, extrap_bgs, pts_range, lower_bound=lower, upper_bound=upper, verbose=1, maxiter=100)
  File "/home/haekel/anaconda/lib/python2.7/site-packages/dadi-1.7.0-py2.7-linux-x86_64.egg/dadi/Inference.py", line 165, in optimize_log
  File "/home/haekel/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 782, in fmin_bfgs
    res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
  File "/home/haekel/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 855, in _minimize_bfgs
    old_fval, old_old_fval)
  File "/home/haekel/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 690, in _line_search_wolfe12
    **kwargs)
  File "/home/haekel/anaconda/lib/python2.7/site-packages/scipy/optimize/linesearch.py", line 96, in line_search_wolfe1
    c1=c1, c2=c2, amax=amax, amin=amin, xtol=xtol)
  File "/home/haekel/anaconda/lib/python2.7/site-packages/scipy/optimize/linesearch.py", line 167, in scalar_search_wolfe1
    phi1 = phi(stp)
  File "/home/haekel/anaconda/lib/python2.7/site-packages/scipy/optimize/linesearch.py", line 82, in phi
    return f(xk + s*pk, *args)
  File "/home/haekel/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 282, in function_wrapper
    return function(*(wrapper_args + args))
  File "/home/haekel/anaconda/lib/python2.7/site-packages/dadi-1.7.0-py2.7-linux-x86_64.egg/dadi/Inference.py", line 78, in _object_func_log
  File "/home/haekel/anaconda/lib/python2.7/site-packages/dadi-1.7.0-py2.7-linux-x86_64.egg/dadi/Inference.py", line 52, in _object_func
  File "/home/haekel/anaconda/lib/python2.7/site-packages/dadi-1.7.0-py2.7-linux-x86_64.egg/dadi/Numerics.py", line 261, in extrap_func
  File "/home/haekel/anaconda/lib/python2.7/site-packages/dadi-1.7.0-py2.7-linux-x86_64.egg/dadi/Demographics2D.py", line 57, in bottlegrowth_split
  File "/home/haekel/anaconda/lib/python2.7/site-packages/dadi-1.7.0-py2.7-linux-x86_64.egg/dadi/Demographics2D.py", line 83, in bottlegrowth_split_mig
  File "/home/haekel/anaconda/lib/python2.7/site-packages/dadi-1.7.0-py2.7-linux-x86_64.egg/dadi/Integration.py", line 166, in one_pop
ValueError: Final integration time T (-0.833182) is less than intial_time (0.000000). Integration cannot be run backwards.

Rebecca Harris

unread,
Aug 4, 2016, 3:33:02 PM8/4/16
to dadi-user
I should add that changing verbose=1 doesn't tell me anything because the error occurs before anything gets printed out. Thanks!

Gutenkunst, Ryan N - (rgutenk)

unread,
Aug 4, 2016, 3:43:56 PM8/4/16
to dadi...@googlegroups.com
Print out p0_bgs, does it have negative values?

Best,
Ryan

Rebecca Harris

unread,
Aug 4, 2016, 3:52:23 PM8/4/16
to dadi-user
Nope - but I'm guessing this is because T < Ts. Would an appropriate work around be to check the perturb_params output prior to running optimize_log to ensure that T < Ts and re-run perturb_params if this is the case? 

Again - thanks for your help.

p0_bgs
[  9.07646263  18.76323601   7.42963619   8.41311915]
[Omitting full error message but it's the same as before]
ValueError: Final integration time T (-0.983483) is less than intial_time (0.000000). Integration cannot be run backwards.

Gutenkunst, Ryan N - (rgutenk)

unread,
Aug 4, 2016, 4:15:50 PM8/4/16
to dadi...@googlegroups.com
Hi Rebecca,

Yes, you found a bug. I’ve committed a fix to the dadi git repository. This model hasn’t been heavily used. Sorry for the inconvenience!

Best,
Ryan

mountain...@gmail.com

unread,
Apr 18, 2017, 2:18:48 PM4/18/17
to dadi-user
Hi Dr. Gutenkunst,

I'm getting the same error as mentioned by Rebecca, but I compiled my version of dadi just a few weeks ago. What are the other reasons to get the 'Final integration time T (-1.276407) is less than intial_time (0.000000). Integration cannot be run backwards.' error? My specified upper and lower bounds confine the search to all positive values. Could it be due to specifying an overlap in bounds of when two populations diverge and demographic functions (like gene flow) that happen between these two populations? That is, if the program is trying to fit part of the model for one population and part of the model is parameterizing processes for two populations?

Thanks,
Jared

Gutenkunst, Ryan N - (rgutenk)

unread,
Apr 18, 2017, 3:50:33 PM4/18/17
to dadi...@googlegroups.com
Hi Jared,

Did you pull dadi from the git repository? Which model is giving you this error?

Best,
Ryan

mountain...@gmail.com

unread,
Apr 18, 2017, 4:34:56 PM4/18/17
to dadi-user, rgu...@email.arizona.edu

Hi Ryan,

I did, I cloned it from the git repository. My model is somewhat complex, and I'm not quite sure if it's doing what I'd like. This is my conceptual model, with the code for it below:



def exp_iso_exp_mig(params, ns, pts):
    '''nuB: Ratio of population size after instantanous change to ancestral
    population size
    nuF: Ratio of contempoary to ancient population size
    m: Migration rate between the two populations (2*Na*m).
    T: Time in the past at which instantaneous change happened and growth began
    (in units of 2*Na generations)
    Ts: Time in the past at which the two populations split.      
    nu1: Ratio of population 1 size after split to ancestral pop. size
    nu2: Ratio of population size after split to ancestral pop. size
    s: Fraction going into pop 1
    T1: when pops. 1 and 2 split from each other
    n1,n2: Sample sizes of resulting Spectrum
    pts: Number of grid points to use in integration
    '''
    nuB,nuF,m,T,Ts,nu1,nu2,s,T1 = params
   
    xx = Numerics.default_grid(pts)
    phi = PhiManip.phi_1D(xx)

    nu_func = lambda t: nuB*numpy.exp(numpy.log(nuF/nuB) * t/T)
    phi = Integration.one_pop(phi, xx, T-Ts, nu_func)

    phi = PhiManip.phi_1D_to_2D(xx, phi)
    nu1_func = nu1*s
    nu2_func = lambda t: (1-s) * (nu2/(1-s))**(t/T1)
    phi = Integration.two_pops(phi, xx, T1, nu1, nu2_func, m12=m, m21=m)
    fs = Spectrum.from_phi(phi, ns, (xx,xx))
    return fs

Since my earlier post, I made my T and Ts argument bounds non-overlapping (bounded based on the analyses that were working before), and the error seems to not be happening (out of ~5 replicates).


On Tuesday, April 18, 2017 at 12:50:33 PM UTC-7, Gutenkunst, Ryan N - (rgutenk) wrote:
Hi Jared,

Did you pull dadi from the git repository? Which model is giving you this error?

Best,
Ryan
Auto Generated Inline Image 1

Gutenkunst, Ryan N - (rgutenk)

unread,
Apr 18, 2017, 5:20:53 PM4/18/17
to mountain...@gmail.com, dadi-user
You’re model has T-Ts as a time, which breaks if Ts > T. I suggest rewriting your model to define T as the time interval between the expansion and the split, so that all times are always positive.

On Apr 18, 2017, at 1:34 PM, mountain...@gmail.com wrote:


Hi Ryan,

I did, I cloned it from the git repository. My model is somewhat complex, and I'm not quite sure if it's doing what I'd like. This is my conceptual model, with the code for it below:

<Auto Generated Inline Image 1.png>
<Auto Generated Inline Image 1.png>

mountain...@gmail.com

unread,
Apr 18, 2017, 6:15:39 PM4/18/17
to dadi-user, mountain...@gmail.com, rgu...@email.arizona.edu
Hi Ryan,

I was thinking that was the problem, and like I said, it seemed to go away when I set non-overlapping bounds to avoid the issue you mentioned. I'll see if I can re-code it as you suggested.

Thank you!
Jared


On Tuesday, April 18, 2017 at 2:20:53 PM UTC-7, Gutenkunst, Ryan N - (rgutenk) wrote:
You’re model has T-Ts as a time, which breaks if Ts > T. I suggest rewriting your model to define T as the time interval between the expansion and the split, so that all times are always positive.

Reply all
Reply to author
Forward
0 new messages