Hi Julian,
Many thanks for the reply, very much appreciated.
Very interesting to hear your thoughts on phred score filtering, thank you, and I'm pleased you agree this shouldn't be done too harshly. I'll leave it to process_radtags rather than do any phred-based trimming in an external programme beforehand.
Regarding adapter contamination, it's not an issue with all samples, just a small minority that I assume size selection didn't deal with and that thus slipped through into being sequenced. The majority of reads are fine. The question was just whether this small minority of contaminated reads should be removed before process_radtags/STACKS, as I assume they could impact results, or whether they can be left in. The vast majority of read 1 and read 2 will be OK (no contamination) and will be kept as the 125 bp PE reads that STACKS requires (i.e. all the same length), and we won't trim them at all. Any reads trimmed because of barcode/adapter presence would have to be excluded before process_radtags, as they would be shorter than the 125 bp read length of the majority and thus would not run in STACKS.
This leaves the options as:
1. Trim no reads, run them all through process_radtags, and assume that those containing barcode/adapter sequence (not the barcode used for demultiplexing, which will of course remain in read 1) will subsequently either not align to a reference sequence or, if de novo, not be present in sufficient quantity to form stacks, so that such contaminated reads are removed automatically.
2. Use a programme such as cutadapt or Trimmomatic prior to process_radtags to remove some of the contamination. Naturally, some partial-match contamination that does not meet the minimum nucleotide match threshold will remain in the reads, and in all likelihood some true sequence will be removed through chance matching of true sequence to nucleotides of the barcode/adapter.
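
To make the trade-off in option 2 concrete, here is a rough Python sketch of the kind of check an adapter trimmer performs. It is not how cutadapt or Trimmomatic actually work internally (they also allow mismatches). The adapter string is an assumption (the common Illumina adapter prefix), MIN_OVERLAP is a hypothetical threshold, and the function names are made up for illustration. A pair is kept only if neither read shows detectable adapter, i.e. only if both reads would stay at the full 125 bp.

```python
READ_LEN = 125             # uniform read length mentioned above
ADAPTER = "AGATCGGAAGAGC"  # assumption: common Illumina adapter prefix
MIN_OVERLAP = 5            # assumption: hypothetical minimum match length


def adapter_start(seq, adapter=ADAPTER, min_overlap=MIN_OVERLAP):
    """Return the index where adapter sequence begins in seq, or None.

    Looks for a full exact copy of the adapter first, then for a partial
    copy at the 3' end of the read (read suffix == adapter prefix) that is
    at least min_overlap bases long. Exact matching only, no mismatches.
    """
    pos = seq.find(adapter)
    if pos != -1:
        return pos
    longest = min(len(adapter) - 1, len(seq))
    for k in range(longest, min_overlap - 1, -1):
        if seq.endswith(adapter[:k]):
            return len(seq) - k
    return None


def keep_pair(read1, read2):
    """Keep a pair only if both reads are full length with no adapter hit."""
    return (
        len(read1) == READ_LEN
        and len(read2) == READ_LEN
        and adapter_start(read1) is None
        and adapter_start(read2) is None
    )


clean = "C" * READ_LEN
full_hit = "C" * 100 + ADAPTER + "C" * 12   # adapter inside the read
short_tail = "C" * 121 + ADAPTER[:4]        # 4 adapter bases: below threshold
chance_tail = "C" * 120 + ADAPTER[:5]       # 5 bases: trimmed, maybe real sequence

print(adapter_start(full_hit), adapter_start(short_tail), adapter_start(chance_tail))
print(keep_pair(clean, clean), keep_pair(clean, full_hit))
```

The short_tail read shows the first caveat in option 2 (a partial adapter below the threshold goes undetected and stays in the data), and chance_tail shows the second (a 5-base match at the read end is flagged even though it could be genuine sequence).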