Hi!
I'm analyzing 16S datasets from several published studies. The authors of a study carried out in Kenya amplified the V4 hypervariable region of the 16S rRNA gene to obtain their taxonomic profiles (
https://bmcmicrobiol.biomedcentral.com/articles/10.1186/s12866-016-0748-x). I downloaded the .fastq file they published (a single .fastq for all the samples), and I'm having some trouble analyzing the data.
In the supplementary material they provide the barcodes and primers used in the study. With this information I first converted the .fastq into .fasta + .qual files, since I was not able to obtain a separate .fastq for the barcodes. Then I demultiplexed with split_libraries.py as follows:
split_libraries.py -f fasta_qual/Kenia.fna -m Kenia_map.txt -q fasta_qual/Kenia.qual -l 200 -b 8 -o demultiplexed
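(For completeness, the earlier .fastq to .fasta + .qual conversion was done with QIIME 1's convert_fastaqual_fastq.py, roughly like this; `Kenia.fastq` is just what I named the downloaded file:)

```shell
# Split the combined .fastq into a .fasta plus a .qual file,
# which is the input format split_libraries.py expects
convert_fastaqual_fastq.py \
    -f Kenia.fastq \
    -c fastq_to_fastaqual \
    -o fasta_qual/
```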
To check the demultiplexing I ran validate_demultiplexed_fasta.py, and the results show that some barcodes (0.191 of sequences) and linker primers (0.002 of sequences) are still present. After that I ran
pick_open_reference_otus.py, and all of the taxonomy comes back as Unassigned. I got the same result with pick_de_novo_otus.py.
I have some questions about my steps:
The authors sequenced with MiSeq 2x300 bp technology, but the amplicons after demultiplexing are around 500 bp long. This seems strange to me, since I expected MiSeq sequences of roughly 200 bp. What does 2x300 bp refer to? Could it be the reason the OTU-picking step is failing?
The demultiplexing step is failing to completely remove the primers and barcodes. Could these leftover sequences be interfering with OTU picking and taxonomy assignment? Any suggestions for removing them completely?
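One thing I was considering is trimming the leftover 5' primer from the demultiplexed reads with cutadapt before OTU picking; a rough sketch (the primer sequence below is the standard 515F V4 primer, used here only as a placeholder for the actual primer listed in the paper's supplement):

```shell
# Remove the 5' (linker) primer from the demultiplexed fasta;
# reads where the primer is not found are dropped entirely
cutadapt -g GTGYCAGCMGCCGCGGTAA \
    --discard-untrimmed \
    -o demultiplexed/seqs_trimmed.fna \
    demultiplexed/seqs.fna
```

Would this be a reasonable approach, or is there a QIIME-native way to do it?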
Additionally, I have another dataset where the authors amplified two hypervariable regions (V5-V8). For the OTU-picking step in that case, should I use pick_closed_reference_otus.py? Am I right?