I have some multiplexed 454 data that I want to run through dada2. For that, the data needs to be in fastq format and the primers need to be trimmed off. I have a pipeline that seems to do the former, but I can't tell whether the latter happens, or, if it doesn't, what I need to run to remove the primers.
First I convert the qual and fna files into a fastq file with
convert_fastaqual_fastq.py -f test.fna -q test.qual -o test.fastq
Then I separate the barcodes from everything else
extract_barcodes.py -f test.fastq -o test_sep -l 6
Then I demultiplex the fastq file
split_libraries_fastq.py -i test_sep/reads.fastq -b test_sep/barcodes.fastq -m test_map.csv --barcode_type 6 -o test_demult --phred_offset 33 --start_seq_id 100000 --store_demultiplexed_fastq
It seems like this pipeline should give me demultiplexed reads with quality scores, but with the primers still in place. If I were using split_libraries.py on regular fasta data, I would have specified the -z truncate_remove option to get rid of the reverse primers (presumably split_libraries.py already gets rid of the forward primers). Am I missing something?
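For what it's worth, here is a minimal sketch of how I could check whether the forward primer is still at the start of the demultiplexed reads. It is not part of QIIME: the function names are made up for illustration, the primer shown in the usage note is only a placeholder for whatever primer is listed in the mapping file, and the code assumes plain four-line fastq records (no wrapped sequence lines) and exact matching apart from IUPAC degenerate bases.

```python
from itertools import islice

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_matches(primer, seq):
    """Return True if seq starts with primer, honouring degenerate bases."""
    primer, seq = primer.upper(), seq.upper()
    if len(seq) < len(primer):
        return False
    return all(base in IUPAC.get(code, code) for code, base in zip(primer, seq))

def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record fastq."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            break
        header, seq, _, qual = (line.rstrip("\n") for line in record)
        yield header, seq, qual

def primer_fraction(handle, primer):
    """Fraction of reads whose sequence begins with the primer."""
    total = hits = 0
    for _, seq, _ in read_fastq(handle):
        total += 1
        if primer_matches(primer, seq):
            hits += 1
    return hits / total if total else 0.0
```

Assuming the demultiplexed reads end up in test_demult/seqs.fastq, I would open that file and call primer_fraction(handle, "GTGCCAGCMGCCGCGGTAA") with my real forward primer in place of the placeholder. A fraction near 1 would suggest the primers are still in place, and a fraction near 0 would suggest they have already been removed (or that I have the wrong primer or orientation).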
I can include test files and my QIIME version on request, but I think this question is general enough not to require them.