1D frequency spectrum model data too low

94 views
Skip to first unread message

Mario Ernst

unread,
Oct 16, 2023, 4:56:41 AM10/16/23
to dadi-user

Dear all,

I have been trying to run different 1D models but get results that are a bit strange. When comparing the allele frequency curves of the modelled and the observed data the shape itself looks alike. However, the modelled curve is always much lower... No matter what population or model I am running, this seems to always be the case. I have attached the results for one of the populations as an example. Does anybody know what may be causing this?

Thanks very much in advance!

Find below my code (I am using Portik's dadi pipeline for optimization and fitting):

#########################

## Required modules

#########################
import numpy as np
import pandas as pd
import dadi

import Optimize_Functions
import Plotting_Functions

 

#########################
##Import data
#########################

dd = dadi.Misc.make_data_dict_vcf(vcf_path, pop_file_path)

fs = dadi.Spectrum.from_data_dict(dd, [pop], polarized = False,projections =[18]) ## fs folded because the ancestral state of SNPs is unknown

ns = fs.sample_sizes

pts = [max(ns)+300, max(ns)+350, max(ns)+400]

 

#########################

#optimization

#########################

reps = [50,50,100]

maxiters = [100,100,100]

folds = [3,2,1]

prefix="alleni_r70_pop1_reps_" + str(reps[0]) + "_" + str(reps[1]) + "_" + str(reps[2]) + "_maxiters_" + str(maxiters[0]) + "_" + str(maxiters[1]) + "_" + str(maxiters[2]) + "_folds_" + str(folds[0]) + "_" + str(folds[1]) + "_" + str(folds[2])

Optimize_Functions.Optimize_Routine(fs, pts, prefix, ‘two_epoch’,  dadi.Demographics1D.two_epoch , 3, 2, fs_folded=True, param_labels = "ne, T", reps = reps, folds = folds, maxiters = maxiters)

Optimize_Functions.Optimize_Routine(fs, pts, prefix, ‘three_epoch’,  dadi.Demographics1D.three_epoch  , 3, 4, fs_folded=True, param_labels = " neB, neF, TB, TF", reps = reps, folds = folds, maxiters = maxiters)

 

#########################

#fitting

#########################

Best_Params_two_epoch= [3.2363, 0.1086] #results from optimization

Best_Params_three_epoch= [0.2222, 1.4899, 0.1299, 0.262] #results from optimization

fitted_fs = "fitted_" + prefix

model_fit_two_epoch = Plotting_Functions.Fit_Empirical(fs, pts, fitted_fs, 'two_epoch', two_epoch, Best_Params_two_epoch, fs_folded=True)")

Plotting_Functions.Plot_1D(fs, model_fit_two_epoch,  fitted_fs, 'two_epoch')

model_fit_three_epoch = Plotting_Functions.Fit_Empirical(fs, pts, fitted_fs, 'two_epoch', two_epoch, Best_Params_two_epoch, fs_folded=True)")

Plotting_Functions.Plot_1D(fs, model_fit_three_epoch,  fitted_fs, 'three_epoch')

 

fitted_alleni_r70_pop1_reps_50_50_100_maxiters_100_100_100_folds_3_2_1.fs_three_epoch.pdf
fitted_alleni_r70_pop1_reps_50_50_100_maxiters_100_100_100_folds_3_2_1.fs_two_epoch.pdf

Ryan Gutenkunst

unread,
Oct 16, 2023, 2:28:28 PM10/16/23
to dadi-user
Hello Mario,

I’m not sure what’s happening under the hood in the Portik plotting routines. The most likely case is that theta is not being applied to properly scale the model sfs before plotting.

Best,
Ryan
> --
> You received this message because you are subscribed to the Google Groups "dadi-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to dadi-user+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/dadi-user/f8b771fd-07d4-4bc8-97d0-b082bbcda344n%40googlegroups.com.
> <fitted_alleni_r70_pop1_reps_50_50_100_maxiters_100_100_100_folds_3_2_1.fs_three_epoch.pdf><fitted_alleni_r70_pop1_reps_50_50_100_maxiters_100_100_100_folds_3_2_1.fs_two_epoch.pdf>

Message has been deleted

Mario

unread,
Mar 29, 2024, 7:44:52 PMMar 29
to dadi-user
If anyone is facing a similar issue, I wanted to share that I discovered the discrepancy was due to the Plotting_Functions.py in the Portik pipeline utilizing the Poisson approach to plot the model data (dadi.Plotting.plot_1d_comp_Poisson(model_fit_two_epoch, fs)). By using the composite polynomial approach instead, specifically dadi.Plotting.plot_1d_comp_multinom(model_fit, fs), the observed and modeled curves aligned correctly.
Reply all
Reply to author
Forward
0 new messages