RADseq data should be much less susceptible to batch effects than,
say, RNAseq data (where individual read counts matter). For RAD, the
read counts only matter in terms of how many copies of each allele were
sequenced on the platform at each RAD locus, as these allele counts will
be fed into the SNP caller and considered collectively. If you have very
low coverage, you might see some alleles occur on one sequencing
platform more than the other, but if you have good coverage, I wouldn't
worry about it. Very rare alleles in the data raise a similar concern --
they are likely to be errors, so it is good to apply a modest minor
allele frequency filter after you process the data.
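
To make the filter concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy loci, the 0/1/2 genotype coding (copies of the alternate allele per diploid sample, None for missing), the function names, and the 0.05 cutoff. In practice you would use the filter built into your pipeline rather than rolling your own.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency at one biallelic locus.

    genotypes: alternate-allele counts per diploid sample (0, 1 or 2),
    with None marking a missing genotype (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def filter_by_maf(loci, min_maf=0.05):
    """Keep only loci whose minor allele frequency is at least min_maf."""
    return {
        name: genotypes
        for name, genotypes in loci.items()
        if minor_allele_frequency(genotypes) >= min_maf
    }


# Hypothetical toy data: 20 samples per locus.
loci = {
    "locus_1": [0, 1, 2, 1] * 5,       # common variant
    "locus_2": [0] * 19 + [1],         # singleton, likely an error
    "locus_3": [2] * 18 + [1, None],   # rare reference allele, one missing call
}

kept = filter_by_maf(loci, min_maf=0.05)
print(sorted(kept))
```

With the toy data, only the common variant survives; the singleton and the near-fixed locus are dropped.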
I think your library construction effects would likely outweigh any
effects from the sequencing platform. The flowcell and chemistry are
different on the 4000, but again, I think with good coverage these
effects should not be a problem.
Anyway, you can always run a PCA on your genotypes after processing and
check whether the samples split according to which sequencing platform
was used.
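
A rough sketch of that check, in plain Python so it runs anywhere. The genotype matrix, the platform labels "A" and "B", and the function name are all made up for illustration, and the power iteration below is just a stand-in for whatever PCA routine you normally use. It returns values proportional to each sample's score on the first principal component.

```python
import random


def first_pc_scores(genotypes, iterations=200, seed=0):
    """Approximate first-PC scores (up to scale and sign) by power iteration.

    genotypes: one row per sample, one column per SNP, coded 0/1/2.
    """
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])

    # Center each SNP column on its mean across samples.
    means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_snps)]
    centered = [[row[j] - means[j] for j in range(n_snps)] for row in genotypes]

    # Sample-by-sample Gram matrix; its top eigenvector is proportional
    # to the samples' scores on the first principal component.
    gram = [
        [sum(a * b for a, b in zip(centered[i], centered[k])) for k in range(n_samples)]
        for i in range(n_samples)
    ]

    rng = random.Random(seed)
    vec = [rng.uniform(-1, 1) for _ in range(n_samples)]
    for _ in range(iterations):
        new = [sum(gram[i][k] * vec[k] for k in range(n_samples)) for i in range(n_samples)]
        norm = sum(x * x for x in new) ** 0.5
        if norm == 0:
            return [0.0] * n_samples
        vec = [x / norm for x in new]
    return vec


# Hypothetical toy data: three samples per platform, with an artificial
# platform-associated difference at the first six SNPs.
platforms = ["A", "A", "A", "B", "B", "B"]
genotypes = [
    [0, 0, 0, 2, 2, 2, 1, 0],
    [0, 1, 0, 2, 2, 1, 1, 0],
    [0, 0, 1, 2, 1, 2, 0, 1],
    [2, 2, 2, 0, 0, 0, 1, 0],
    [2, 1, 2, 0, 0, 1, 1, 0],
    [2, 2, 1, 0, 1, 0, 0, 1],
]

scores = first_pc_scores(genotypes)
for platform, score in zip(platforms, scores):
    print(platform, round(score, 3))
```

If the samples from one platform all land on one side of PC1 and the other platform's samples on the other side, as they do in this contrived example, that is the kind of split to look for. With real data you would plot PC1 against PC2 and color the points by platform.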
julian
nicolef...@gmail.com wrote on 11/26/18 1:00 PM: