Hello all,
I have read the few threads that mention this option for sstacks, but I could find no explanation of what it actually does. I am hoping it allows haplotypes from individuals that were not used to build the catalog to be retained in the sstacks output. As others have said, a catalog built from a subset of individuals may not include all alleles, and any allele missing from that catalog is discarded when samples are matched against it in sstacks. This seems, at best, to throw away data and, at worst, to bias the results.
If someone could explain just what this option does, and its implications, that would be very much appreciated.
For context, I have 1035 samples, which is too many to build a catalog from efficiently. I would like to build the catalog from a subset of them without excluding any alleles found in the remaining individuals when running sstacks.
Thanks!!
Brian D.