Hi Mattia,
Stacks has no knowledge of the collection of sequences you align against. If you want to combine components of different references into a single FASTA file, you can. Stacks will take the output of the aligner (e.g. BWA) and assemble the RAD loci based on the alignment positions the aligner reports. I'm not sure why you specify different 'genes' in your example, unless you are trying to align only against the protein-coding genes. However, if you combine a bunch of similar sequences from different related species or assemblies, you will not necessarily get a better result, since the aligner will likely find multiple alignments for many of the reads and the software has no way to sort this out.
Typically, if you want to do a multi-species comparison, you need to pick one reference to use as the base and adjust your alignment parameters so that a sufficient number of reads from the non-native species align properly. Alternatively, you can do a de novo assembly first and align the assembled loci to the different references afterwards. It depends on what your analysis goals are.
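As a rough way to judge whether "a sufficient number of reads" are aligning, something like the following Python sketch could tally primary alignments from SAM text (for example, the output of `samtools view`). The function name `alignment_summary` and the `min_mapq=20` cutoff are illustrative assumptions, not anything Stacks or BWA prescribes; only the SAM flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM format itself.

```python
def alignment_summary(sam_lines, min_mapq=20):
    """Count primary SAM records that are mapped and confidently mapped.

    min_mapq is an arbitrary illustrative threshold (an assumption).
    """
    total = mapped = confident = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & 0x100 or flag & 0x800:
            continue  # ignore secondary and supplementary records
        total += 1
        if flag & 0x4:
            continue  # read is unmapped
        mapped += 1
        if mapq >= min_mapq:
            confident += 1
    return {
        "records": total,
        "mapped": mapped,
        "confident": confident,
        "mapped_fraction": mapped / total if total else 0.0,
        "confident_fraction": confident / total if total else 0.0,
    }
```

Comparing these fractions between samples of the reference species and samples of the other species gives a quick indication of whether the alignment parameters are too strict for the non-native reads.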
Best,
Julian
Hi Mattia,
I would place all the GenBank sequences in a single FASTA file, build a reference index from it, and then align my ddRAD data to that with BWA. Your ddRAD reads will likely align to multiple loci in the 'reference', but the aligner should mark the best alignment as 'primary' and the other alignments as 'secondary'. Stacks would then proceed with the primary alignments to assemble your raw ddRAD data into loci. After that, you could feed the results into STRUCTURE, a PCA, or similar. If genetic variation is very low, the aligner may not be able to distinguish between mt genes of different but closely related species. You might play around with samtools after the alignment is done to check, for the other mt genes (those based on the RNAseq data), whether the ddRAD reads for a given gene go to more than one species (indicating the aligner can't distinguish them) or whether you always get a good species delineation from the ddRAD data.
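One possible way to run that check is sketched below in Python, working on SAM text such as the output of `samtools view`. It assumes a hypothetical naming convention in which every sequence in the combined FASTA is named `<species>|<gene>`; adapt `split_ref` to whatever names you actually use. Depending on how BWA was run, alternative hits may appear as separate secondary records or inside the `XA` tag of the primary record, so the sketch reads both.

```python
from collections import defaultdict


def split_ref(ref_name):
    # Hypothetical naming convention (an assumption): "<species>|<gene>"
    species, gene = ref_name.split("|", 1)
    return species, gene


def species_ambiguity_by_gene(sam_lines):
    """For each gene, count reads hitting one species vs. several species."""
    species_hit = defaultdict(set)  # (read name, gene) -> set of species
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        read, flag, ref = fields[0], int(fields[1]), fields[2]
        if flag & 0x4 or ref == "*":
            continue  # unmapped read
        refs = [ref]
        # Alternative hits in the XA tag: "ref,pos,CIGAR,NM;" repeated
        for tag in fields[11:]:
            if tag.startswith("XA:Z:"):
                refs += [alt.split(",")[0] for alt in tag[5:].split(";") if alt]
        for name in refs:
            species, gene = split_ref(name)
            species_hit[(read, gene)].add(species)

    summary = defaultdict(lambda: {"unique": 0, "ambiguous": 0})
    for (_read, gene), species in species_hit.items():
        key = "unique" if len(species) == 1 else "ambiguous"
        summary[gene][key] += 1
    return dict(summary)
```

A gene where most reads land in the "ambiguous" count is one the aligner probably cannot assign to a single species, while a gene dominated by "unique" reads suggests a clean delineation.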
Cheers,
Julian
--
Stacks website: http://catchenlab.life.illinois.edu/stacks/