WARNING:Inference:Model is nan in some entries where data is not masked.


Giordano

Apr 28, 2015, 1:18:07 PM
to dadi...@googlegroups.com
Hi Ryan,

I'm trying to model very simple demographic scenarios between two populations (e.g. split, split with gene flow, split with gene flow plus changes in Ne) from genomic data. Unfortunately I keep getting this:
WARNING:Inference:Model is nan in some entries where data is not masked.
WARNING:Inference:Number of affected entries is 6888. Sum of data in those entries is 2.22286e+06:

I've tried what you suggested in a previous post: increasing the number of grid points and tightening the range of the upper/lower parameter bounds, but with no success.
I've also tried models of increased complexity with more parameters, and reducing the number of variants.
As long as only one population at a time was modeled, everything worked fine.

Please find below the script I used; the SFS plot is attached.

Thanks a lot for your support! I'm happy to provide any further details that could help resolve the issue.

Best regards,
Giordano   

import dadi
from dadi import Numerics, PhiManip, Integration, Spectrum
from numpy import array
import pylab

def split_mig(params, ns, pts):
    nu1, nu2, T, m = params
    xx = Numerics.default_grid(pts)
    phi = PhiManip.phi_1D(xx)
    phi = PhiManip.phi_1D_to_2D(xx, phi)
    phi = Integration.two_pops(phi, xx, T, nu1, nu2, m12=m, m21=m)
    fs = Spectrum.from_phi(phi, ns, (xx, xx))
    return fs

fs = fs_3L + fs_3R
fig = pylab.figure(1)
dadi.Plotting.plot_single_2d_sfs(fs, vmin=1)
ns = fs.sample_sizes

pts = [300, 300]

params = array([1, 1, 1, 1])
lower_bound = [0.01, 0.01, 0, 0]
upper_bound = [100, 100, 5, 10]

func_ex = dadi.Numerics.make_extrap_log_func(split_mig)

results = []
for i in range(50):
    p0 = dadi.Misc.perturb_params(params, fold=1.5, lower_bound=lower_bound, upper_bound=upper_bound)
    popt = dadi.Inference.optimize_log(p0, fs, func_ex, pts, lower_bound=lower_bound, upper_bound=upper_bound,
                                       verbose=len(params), maxiter=300, multinom=True)
    results.append(popt)
[Attachment: Screen Shot 2015-04-28 at 18.17.03.png]

Gutenkunst, Ryan N - (rgutenk)

May 7, 2015, 1:20:49 PM
to dadi...@googlegroups.com
Hello Giordano,

Apologies for my slow reply. I've been traveling recently, and this slipped through the cracks.

The issue is in how you define your pts list for extrapolation. It should be an increasing sequence of grid sizes, whereas you just have [300, 300]. We typically recommend quadratic extrapolation, so a list of 3 pts settings. Something like [200, 220, 240] should work well for your problem. You could probably get by with coarser grids like [150, 160, 170] if you need faster speed.
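To illustrate why three distinct grid sizes are needed, here is a pure-Python sketch (illustrative only, not dadi's actual code) of the quadratic-extrapolation idea behind make_extrap_log_func, assuming the numerical result behaves roughly like a quadratic in 1/pts:

```python
def extrapolate_quadratic(pts_list, values):
    """Fit a quadratic in 1/pts through three (grid size, value) pairs
    and evaluate it at 1/pts = 0, i.e. the infinite-grid limit."""
    x0, x1, x2 = [1.0 / p for p in pts_list]
    y0, y1, y2 = values
    # Lagrange interpolation evaluated at x = 0
    return (y0 * x1 * x2 / ((x0 - x1) * (x0 - x2))
            + y1 * x0 * x2 / ((x1 - x0) * (x1 - x2))
            + y2 * x0 * x1 / ((x2 - x0) * (x2 - x1)))

# Suppose a solver's output behaves like L + a/pts + b/pts**2:
L_true, a, b = 2.5, 1.3, -0.7
f = lambda dx: L_true + a * dx + b * dx ** 2

pts = [200, 220, 240]          # three *distinct* grid sizes
est = extrapolate_quadratic(pts, [f(1.0 / p) for p in pts])
# est recovers L_true; with [300, 300] the fit is singular
# (division by x0 - x1 = 0), so extrapolation cannot work.
```

This is why a repeated grid size like [300, 300] breaks the extrapolation step even though each individual solve is fine.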

Best,
Ryan

--
You received this message because you are subscribed to the Google Groups "dadi-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dadi-user+...@googlegroups.com.
To post to this group, send email to dadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/dadi-user.
For more options, visit https://groups.google.com/d/optout.

--
Ryan Gutenkunst
Assistant Professor
Molecular and Cellular Biology
University of Arizona
phone: (520) 626-0569, office LSS 325

Giordano

May 11, 2015, 1:18:50 PM
to dadi...@googlegroups.com
Hi Ryan,

thanks a lot for your time and support! 

Unfortunately the issue is still present even when defining the pts list with an increasing number of points, e.g. [200, 220, 240]. But if I project the spectra down to a smaller number of samples, say 20 chromosomes per population, and use the grid [40, 50, 60], the optimisation procedure works properly. The issue appears again at 60 chromosomes per population with the grid [100, 110, 120].
Should I consider using a smaller number of samples, or is there a way to work around this?

Thanks a lot again.

Cheers,
Giordano 

Gutenkunst, Ryan N - (rgutenk)

May 12, 2015, 6:06:00 AM
to dadi...@googlegroups.com
Hi Giordano,

Is this happening for all parameter values during the optimization, or just a few? Are your fits going to extreme parameter values (such as migration rates > 10)? That can cause numerical problems even with relatively large numbers of grid points. In any case, going to even larger grids should fix it, at the cost of speed.
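One quick way to check for extreme fits is to flag optimized parameters that land at or near their bounds. A small hypothetical helper (not part of dadi) might look like:

```python
def near_bounds(popt, lower_bound, upper_bound, rtol=0.01):
    """Return one flag per parameter: True if the optimized value sits
    within rtol (relative) of its lower or upper bound."""
    return [p <= lo * (1 + rtol) or p >= hi * (1 - rtol)
            for p, lo, hi in zip(popt, lower_bound, upper_bound)]

# e.g. a migration rate pinned near its upper bound gets flagged:
flags = near_bounds([1.2, 9.96], [0.01, 0], [100, 10])
# flags -> [False, True]
```

Parameters flagged this way are the ones worth inspecting for numerical trouble or model misspecification.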

Best,
Ryan

Giordano

May 14, 2015, 2:50:30 PM
to dadi...@googlegroups.com
Hi Ryan,

Is this happening for all parameter values during the optimization, or just a few?

- Sorry, could you give me a clue on how to figure out which parameter values are causing the issue during the optimization?

Are your fits going to extreme parameter values? (Such as migration rates > 10.)

- No, they are not going to extreme values; migration rates, for example, stay between 0.1 and 2.

I'll try using a much larger pts grid.

Thanks a lot again!  

Cheers,
Giordano

Gutenkunst, Ryan N - (rgutenk)

May 14, 2015, 3:54:56 PM
to dadi...@googlegroups.com
Hi Giordano,

On May 14, 2015, at 11:50 AM, Giordano <botta.g...@gmail.com> wrote:
Is this happening for all parameter values during the optimization, or just a few?

- Sorry, could you give me a clue on how to figure out which parameter values are causing the issue during the optimization?

When calling the optimization function, add the argument "verbose=1". This will print out every parameter set that is tried. Then the question is: Are you getting warnings for all or most of your parameter sets, or is it only occasionally?

Best,
Ryan

Giordano

May 15, 2015, 7:52:17 AM
to dadi...@googlegroups.com
Hi Ryan,

please find attached a notebook with an example analysis showing the warnings, together with the spectra used; maybe that's the best way to figure out what is happening.

Thanks a lot for your support!

Best,
Giordano
[Attachments: Screen Shot 2015-05-15 at 13.46.58.png, 20150513_dadi_polarized_example.html]

Gutenkunst, Ryan N - (rgutenk)

May 25, 2015, 5:40:21 PM
to dadi...@googlegroups.com
Hi Giordano,

It's not clear to me what's causing the problem, because the model is very complex. Have you seen similar problems with simpler models (without so many parameters)?

Best,
Ryan


Giordano

Jun 16, 2015, 7:04:02 PM
to dadi...@googlegroups.com
Hi Ryan,

sorry for the very slow reply; I finally found the time to run simple models like split and asymmetric migration. There are still occasional warnings, but they are very sporadic, and the optimisation procedures were always successful.

Thanks a lot!

Cheers,
Giordano

daniqu...@gmail.com

Sep 15, 2016, 11:09:08 AM
to dadi-user
Hello, Ryan!
My problem is exactly like Giordano's. However, my model is not so complex: it is the IM model, and I have also tried versions with no migration or symmetric migration. It is just not working! Even with a bigger grid, I get a lot of warning messages. What could it be?
Thanks!

Gutenkunst, Ryan N - (rgutenk)

Sep 16, 2016, 7:32:18 PM
to dadi...@googlegroups.com
Hello Dani,

It’s not obvious to me. If you have a small working example that exhibits the problem, I could look into it.

Best,
Ryan



daniqu...@gmail.com

Sep 19, 2016, 2:04:03 PM
to dadi-user
Here is my script:

import dadi
import numpy

dd = dadi.Misc.make_data_dict(r"c:\Python27\input_dadi.txt")
fs2 = dadi.Spectrum.from_data_dict(dd, pop_ids=['ama1', 'ama2'],
                                   projections=[16, 14], polarized=False)

def IM(params, ns, pts):
    s, nu1, nu2, T, m = params

    xx = dadi.Numerics.default_grid(pts)

    phi = dadi.PhiManip.phi_1D(xx)
    phi = dadi.PhiManip.phi_1D_to_2D(xx, phi)

    # Exponential size changes starting from the split fractions s and 1-s
    nu1_func = lambda t: s * (nu1/s)**(t/T)
    nu2_func = lambda t: (1-s) * (nu2/(1-s))**(t/T)
    phi = dadi.Integration.two_pops(phi, xx, T, nu1_func, nu2_func,
                                    m12=m, m21=m)

    fs = dadi.Spectrum.from_phi(phi, ns, (xx, xx))
    return fs

def IM_nomig(params, ns, pts):
    s, nu1, nu2, T = params
    return IM((s, nu1, nu2, T, 0), ns, pts)

my_extrap_func = dadi.Numerics.make_extrap_log_func(IM_nomig)

# Parameters: s, nu1, nu2, T
ns = [16, 14]
pts_1 = [30, 40, 50]
upper_bound = [100, 100, 100, 5]
lower_bound = [1e-2, 1e-2, 1e-2, 0]
p0 = [1, 1, 2, 0.3]
p0 = dadi.Misc.perturb_params(p0, fold=1, upper_bound=upper_bound,
                              lower_bound=lower_bound)
popt = dadi.Inference.optimize_log(p0, fs2, my_extrap_func, pts_1,
                                   lower_bound=lower_bound,
                                   upper_bound=upper_bound,
                                   verbose=len(p0), maxiter=20)
print('Best-fit parameters: {0}'.format(popt))
model = my_extrap_func(popt, ns, pts_1)
ll_model = dadi.Inference.ll_multinom(model, fs2)
print('Maximum log composite likelihood: {0}'.format(ll_model))
theta = dadi.Inference.optimal_sfs_scaling(model, fs2)
print('Optimal value of theta: {0}'.format(theta))

And here is the end of my result:

WARNING:Inference:Model is nan in some entries where data is not masked.
WARNING:Inference:Number of affected entries is 134. Sum of data in those entries is 1048.37:

[the same pair of warnings repeats several more times]
196     , nan         , array([ 1.32184    ,  1.89026    ,  1.19633    ,  0.172601   ])
Best-fit parameters: [ 1.3218422   1.890255    1.19633365  0.17242872]
WARNING:Inference:Model is nan in some entries where data is not masked.
WARNING:Inference:Number of affected entries is 134. Sum of data in those entries is 1048.37:

Maximum log composite likelihood: --
Optimal value of theta: nan

I have tried different bounds and pts values and I always get this. I'm not seeing my mistake here.
Thank you!! :)

Dani





Gutenkunst, Ryan N - (rgutenk)

Sep 19, 2016, 6:05:30 PM
to dadi...@googlegroups.com
Note that the first parameter “s” is a fraction of a population size, so it must be between zero and one. That’s not how you have set your bounds.
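As a sketch, bounds consistent with s in (0, 1) could look like the following (the particular values are illustrative, not a recommendation):

```python
# Hypothetical corrected bounds for the IM model above. The first
# parameter s is the fraction of the ancestral population that founds
# population 1, so its bounds must stay strictly inside (0, 1).
#               s,    nu1,  nu2,  T
lower_bound = [0.01, 1e-2, 1e-2, 0]
upper_bound = [0.99, 100,  100,  5]
p0          = [0.5,  1,    2,    0.3]

# Sanity check: every starting parameter lies inside its bounds
in_bounds = all(lo <= p <= hi
                for p, lo, hi in zip(p0, lower_bound, upper_bound))
```

The same p0, lower_bound, and upper_bound would then be passed to perturb_params and optimize_log as before.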

daniqu...@gmail.com

Sep 20, 2016, 7:20:34 PM
to dadi-user
Unfortunately, it didn't solve the problem. :(( I still get the warning messages, and I got this weird optimization output:

7190    , -68.4258    , array([ 0.0586315  ,  6.45198    ,  8.86061    ,  3.48912    ,  2.92685    ])

712.353290862 1558.53641517
712.353290862 1558.53641517
712.353290862 1558.53641517
712.353290862 1558.53641517
712.353290862 1558.53641517
7195    , -68.4322    , array([ 0.0586315  ,  6.45198    ,  8.86061    ,  3.49261    ,  2.92685    ])

712.353290862 1558.53641517
712.353290862 1558.53641517
712.353290862 1558.53641517
712.353290862 1558.53641517
712.353290862 1558.53641517
7200    , -68.4076    , array([ 0.0586338  ,  6.45866    ,  8.84925    ,  3.48913    ,  2.92524    ])

712.353290862 1558.53641517
712.353290862 1558.53641517
712.353290862 1558.53641517
Best-fit parameters: [ 0.05857313  6.44624362  9.16115863  3.48900291  2.96894307]
712.353290862 1558.53641517
Maximum log composite likelihood: -69.0765847091
Is there anything else I can do?
And what if just one parameter reaches its upper bound? This is for another model, the bottleneck: only Nuf always reaches the upper bound, even if I set it to 200, while the other parameters remain low. How should I proceed?
Thanks!
Dani

Gutenkunst, Ryan N - (rgutenk)

Sep 28, 2016, 3:56:15 PM
to dadi...@googlegroups.com
Hi Dani,

It’s not clear to me what you find to be weird about this optimization.

If a parameter is reaching upper bounds, it may be a sign that that model isn’t appropriate for your data. Looking at the residuals is often informative.

Best,
Ryan

