Sorry if this is a simple question, but I haven't found a clear answer so far.
I'm running dadi on a population dataset for which I have a SNP-only VCF. I could redo the variant calling, but at present I don't have access to a VCF containing both variant and invariant sites. I'm wondering whether I can use dadi with this input alone, whether I need to pad the dataset with monomorphic sites, or whether it's not an issue at all. In my runs so far, the population size estimates look far too low when I set L = length of the callable input sequence, but when I set L = number of sites in the VCF (again, mostly polymorphic), the estimates are a reasonable order of magnitude compared with nucleotide diversity.
I think I understand that monomorphic sites would not make it into the allele frequency spectrum, but I'm concerned that leaving them out would affect the underlying model or the estimation of theta.
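To make the dependence on L concrete, here is a minimal sketch of the conversion as I understand it, theta = 4 * N_ref * mu * L, solved for N_ref. The function name is hypothetical (it is not part of dadi), and theta, mu, and both L values are placeholders rather than my actual data.

```python
def n_ref_from_theta(theta, mu, L):
    """Reference population size implied by theta = 4 * N_ref * mu * L.

    theta: population-scaled mutation rate for the whole spectrum
    mu:    per-site, per-generation mutation rate (assumed known)
    L:     number of sites from which the SNPs were ascertained
    """
    if mu <= 0 or L <= 0:
        raise ValueError("mu and L must be positive")
    return theta / (4.0 * mu * L)


# Placeholder values, purely illustrative (not from my dataset).
theta = 2000.0
mu = 1e-8
L_callable = 50_000_000   # length of callable sequence
L_vcf_sites = 500_000     # number of records in the SNP-only VCF

n_callable = n_ref_from_theta(theta, mu, L_callable)
n_vcf = n_ref_from_theta(theta, mu, L_vcf_sites)

# For a fixed theta, N_ref scales as 1/L, so the two choices of L
# differ by exactly the ratio L_callable / L_vcf_sites.
print(f"N_ref with callable length: {n_callable:.0f}")
print(f"N_ref with VCF site count:  {n_vcf:.0f}")
print(f"ratio: {n_vcf / n_callable:.1f}")
```

Since theta is held fixed here, the only thing that changes between the two estimates is L, which seems to be exactly the difference I'm seeing.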
Thanks,
Isaac