Hi David,
Using the kmer filtering is akin to randomly sampling a subset of reads.
So, you are halving the amount of data that you are pushing into the
pipeline -- this will result in less robust SNP calls down the line
(since your ustacks output shows a mean read depth of ~9x, you are going
to push this down to 4-5x, too low to call most SNPs).
I would recommend increasing the number of threads or, better, running
ustacks for each sample as an independent job, assuming you are
submitting jobs to a cluster job scheduler. You could also disable
gapped alignments, which will speed things up, but again, it will cause
you to miss assembling some alleles into their respective loci.
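
As a rough sketch of the per-sample approach, something like the following Python could generate one ustacks command line per sample, which you would then wrap in whatever submission syntax your scheduler uses. The sample paths, output directory, and thread count are placeholders I made up, and the flags shown (-f, -o, -i, -p, --disable-gapped) should be checked against `ustacks -h` for your installed Stacks version.

```python
import shlex


def ustacks_commands(sample_files, out_dir, threads=8, disable_gapped=False):
    """Build one ustacks command line per sample.

    Each sample gets a unique integer ID starting at 1, so the runs can be
    submitted as independent jobs. Flag names are assumptions to verify
    against `ustacks -h` for the installed Stacks version.
    """
    commands = []
    for sample_id, path in enumerate(sample_files, start=1):
        args = [
            "ustacks",
            "-f", str(path),       # input reads for this sample
            "-o", str(out_dir),    # shared output directory
            "-i", str(sample_id),  # unique ID per sample
            "-p", str(threads),    # threads for this one job
        ]
        if disable_gapped:
            # Faster, but some alleles will not be assembled into their loci.
            args.append("--disable-gapped")
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Hypothetical sample files; replace with your own.
    samples = ["samples/ind_01.fq.gz", "samples/ind_02.fq.gz"]
    for cmd in ustacks_commands(samples, "stacks_out"):
        print(cmd)
```

Each printed line can go into its own job script, so the samples run in parallel across nodes rather than one after another.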
Best,
julian
DAVID VENDRAMI wrote on 6/16/21 3:26 AM: