Hi Maja,
If you want to add to an existing catalog, make sure the path to the existing catalog (--catalog) is different from the output path you give cstacks (--outpath). As long as the two differ, cstacks will write the new catalog to the location you specified and leave the old catalog files untouched. You can’t run multiple copies of cstacks at the same time and expect them to merge into a single catalog; independent runs know nothing about any other cstacks runs executing at the same time. Instead, run cstacks once, with as many additional samples as you want, and that one run will write a new catalog containing the old and new samples together.
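As a rough sketch of that single run, the snippet below only assembles and prints a cstacks command; it does not execute cstacks. The directory names (stacks_old, stacks_new) and the sample names are hypothetical placeholders, and the exact form of the --catalog argument (here a path ending in "catalog") is an assumption you should check against the cstacks help on your installation.

```shell
# Dry run: build one cstacks command that extends an existing catalog.
# All paths and sample names are hypothetical placeholders.
old_catalog="stacks_old/catalog"   # assumed form of the --catalog argument
new_outdir="stacks_new"            # must differ from the old catalog location
sample_dir="stacks_old"            # where the per-sample files are assumed to live
new_samples="sampleA_01 sampleA_02 sampleB_01"

# Refuse to continue if the new catalog would land on top of the old one.
if [ "$(dirname "$old_catalog")" = "$new_outdir" ]; then
    echo "error: output path must differ from the existing catalog path" >&2
    exit 1
fi

# One -s option per additional sample, all in the same run.
cstacks_cmd="cstacks --catalog $old_catalog -o $new_outdir"
for s in $new_samples; do
    cstacks_cmd="$cstacks_cmd -s $sample_dir/$s"
done

echo "$cstacks_cmd"
```

Once the printed command looks right for your layout, you can run it as is; the old catalog in stacks_old stays untouched because the output goes to stacks_new.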
That said, you don’t need to put all your samples in the catalog. It is fine to build the catalog from a representative subset of samples, large enough that you are likely to see all the SNPs in your various populations at least once. For example, you could load 10 individuals from each population into the catalog and then match the remaining samples to that catalog using sstacks.
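One way to pick such a subset, sketched under the assumption that you keep a tab-separated population map (sample name, then population), is to take the first few samples of each population into a second, smaller map. The file names, sample names, and populations below are made up for illustration, and the demo keeps 2 per population only because the toy map is small; you would set per_pop=10 to follow the suggestion above.

```shell
# Build a smaller popmap for the catalog: the first N samples per population.
# File names, sample names and populations are hypothetical.
workdir=$(mktemp -d)
printf 'Fb141_1\tpopA\nFb141_2\tpopA\nFb141_3\tpopA\nFb200_1\tpopB\nFb200_2\tpopB\n' \
    > "$workdir/popmap_all"

per_pop=2   # use 10 to follow the suggestion in the message above

# Keep a line only while its population (column 2) has been seen <= per_pop times.
awk -F'\t' -v n="$per_pop" '++seen[$2] <= n' "$workdir/popmap_all" \
    > "$workdir/popmap_catalog"

catalog_samples=$(cut -f1 "$workdir/popmap_catalog" | paste -sd' ' -)
echo "catalog samples: $catalog_samples"

# Dry run: the catalog is built from the subset, matching uses the full map.
echo "cstacks -P stacks_dir -M $workdir/popmap_catalog"
echo "sstacks -P stacks_dir -M $workdir/popmap_all"
```

Taking the first N lines per population is the simplest choice; if your map is sorted in some meaningful way (for example by coverage), you may prefer to shuffle it first.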
Best,
Julian
Hi Maja,
To simplify running things by hand, you should use a population map (popmap); you could already have used one to simplify running sstacks. Regardless, you will need it for tsv2bam, which sorts the data in each sample and incorporates the paired-end reads. You run it like this:
tsv2bam -P stacks_dir -M popmap -R paired_reads_dir
The popmap will list the name of each sample, e.g. Fb141_1. You then provide the path to the directory that contains the raw, paired-end reads. Those read files should have been output by process_radtags and be named with the same prefix as the sample, e.g. Fb141_1.
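Before running tsv2bam it can help to confirm that every sample in the popmap has reverse reads under the same prefix. The sketch below does that on a throwaway directory with empty stand-in files. It assumes a tab-separated popmap and assumes the reverse reads are named <sample>.2.fq.gz, which is how I would expect process_radtags to name them; adjust the suffix if your files differ. The second sample and the population name are invented for the demo.

```shell
# Check that each popmap sample has a reverse-read file with the same prefix.
# The <sample>.2.fq.gz naming is an assumption; the demo files are empty stand-ins.
workdir=$(mktemp -d)
reads_dir="$workdir/paired_reads_dir"
mkdir -p "$reads_dir"
printf 'Fb141_1\tpopA\nFb141_2\tpopA\n' > "$workdir/popmap"

# Fb141_2 deliberately has no reverse reads, so the check has something to find.
touch "$reads_dir/Fb141_1.1.fq.gz" "$reads_dir/Fb141_1.2.fq.gz" \
      "$reads_dir/Fb141_2.1.fq.gz"

missing=""
for sample in $(cut -f1 "$workdir/popmap"); do
    if [ ! -e "$reads_dir/$sample.2.fq.gz" ]; then
        missing="${missing:+$missing }$sample"
    fi
done

echo "samples without paired-end reads: ${missing:-none}"
```

On your own data, point reads_dir and the popmap path at the real locations and drop the printf and touch lines.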
You will then use the same popmap to run gstacks and finally populations.
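Put together, the remaining steps look roughly like the dry run below, which only prints the three commands so you can check the paths first. The directory and file names are the placeholders from the tsv2bam command above; gstacks and populations are shown with just the -P and -M options, and any further options you need are up to your analysis.

```shell
# Dry run: print the remaining pipeline steps, all sharing one popmap.
# stacks_dir, popmap and paired_reads_dir are placeholder names.
stacks_dir="stacks_dir"
popmap="popmap"
reads_dir="paired_reads_dir"

tsv2bam_cmd="tsv2bam -P $stacks_dir -M $popmap -R $reads_dir"
gstacks_cmd="gstacks -P $stacks_dir -M $popmap"
populations_cmd="populations -P $stacks_dir -M $popmap"

# Print in the order the steps should be run.
printf '%s\n' "$tsv2bam_cmd" "$gstacks_cmd" "$populations_cmd"
```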