Convert RDP database (FASTA file, about 3,000,000 items) to QIIME database format

Jun 20, 2017, 8:52:43 PM

to qiime...@googlegroups.com

Hello,

I am doing metagenomics data analysis with QIIME and would like to use the RDP database (https://rdp.cme.msu.edu/misc/resources.jsp) as the 16S reference database.

I found this post, so I used cd-hit-est to cluster the sequences at 97% similarity and extracted the taxonomy information (7 levels).

That left me with about 400,000 items (grep ">" -c reports roughly 400,000) in the FASTA file and in the id_to_taxonomy file.
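
For anyone following along, the taxonomy extraction step could look roughly like the sketch below. It is only an illustration under stated assumptions, not necessarily what was done here: it assumes each FASTA header carries an RDP-style `Lineage=` field with alternating name;rank pairs, it fills missing ranks with "unclassified", and the function names and the seven-rank list are hypothetical. The output is one line per sequence, with the sequence ID, a tab, and a semicolon-separated taxonomy string, which is the layout QIIME 1 expects for an id_to_taxonomy file.

```python
# Sketch only. Assumed header layout (hypothetical example, not from this thread):
# >S000000001 some description<TAB>Lineage=Root;rootrank;Bacteria;domain;"Proteobacteria";phylum;...
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def parse_header(header):
    """Return (seq_id, taxonomy_string) for one FASTA header line."""
    seq_id = header[1:].split()[0]
    # Everything after "Lineage=" is assumed to be name;rank;name;rank;...
    lineage = header.split("Lineage=", 1)[1] if "Lineage=" in header else ""
    fields = [f.strip().strip('"') for f in lineage.split(";") if f.strip()]
    by_rank = {rank: name for name, rank in zip(fields[0::2], fields[1::2])}
    # Keep exactly seven levels; ranks absent from the header become "unclassified".
    levels = [by_rank.get(rank, "unclassified") for rank in RANKS]
    return seq_id, "; ".join(levels)


def write_id_to_taxonomy(fasta_path, out_path):
    """Write 'seq_id<TAB>taxonomy' for every header line in the FASTA file."""
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for line in fasta:
            if line.startswith(">"):
                seq_id, taxonomy = parse_header(line.rstrip("\n"))
                out.write("{}\t{}\n".format(seq_id, taxonomy))
```

If the headers in your download are laid out differently, only `parse_header` should need to change.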

I was just wondering whether this is the right way to convert the RDP FASTA file to the QIIME database format.

Am I missing something or doing it the wrong way?

Any piece of advice would be really appreciated.

Thank you

-jk Kim-

Sorry for my English.

Jun 21, 2017, 5:18:57 PM

to Qiime 1 Forum

Sounds correct to me. Note that the default Greengenes reference is only about 100,000 items [1], so yours will take a little longer to run.

[1] grep -c '>' $(print_qiime_config.py | grep assign_taxonomy_reference | cut -f 2)

Jun 21, 2017, 8:18:28 PM

to Qiime 1 Forum

Thank you! Also, that's a nice one-liner!

-jk-

Jun 27, 2017, 1:02:05 PM

to Qiime 1 Forum

Hello, jk Kim

Were you able to convert the RDP database for use in QIIME?
How did you do it?

Thanks,
Daniela

Jul 25, 2017, 3:25:39 AM

to Qiime 1 Forum

Hi DV,

I did, but I found out that I hadn't made the PyNAST template files.

What I did was open otu_97.fasta and the taxonomy files and parse them; it was a bit dirty but not difficult.
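
One way to picture that kind of parsing is the sketch below. It is a guess at the general shape, not the actual script: it assumes the taxonomy file is tab-separated with the sequence ID in the first column, and the function names and any file names other than otu_97.fasta are hypothetical. It keeps only the taxonomy lines whose IDs survive in the clustered FASTA and reports any representative sequence that has no taxonomy line.

```python
def fasta_ids(fasta_path):
    """Return the sequence IDs (first word of each header) in file order."""
    ids = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                ids.append(line[1:].split()[0])
    return ids


def subset_taxonomy(fasta_path, taxonomy_path, out_path):
    """Write taxonomy lines for IDs present in the FASTA file.

    Returns a sorted list of FASTA IDs that had no taxonomy line.
    """
    wanted = set(fasta_ids(fasta_path))
    seen = set()
    with open(taxonomy_path) as tax, open(out_path, "w") as out:
        for line in tax:
            # Assumption: ID is the first tab-separated column.
            seq_id = line.split("\t", 1)[0].strip()
            if seq_id in wanted and seq_id not in seen:
                out.write(line if line.endswith("\n") else line + "\n")
                seen.add(seq_id)
    return sorted(wanted - seen)
```

For example, `subset_taxonomy("otu_97.fasta", "all_taxonomy.txt", "id_to_taxonomy.txt")` would write the reduced mapping; an empty return value means every sequence in the FASTA has a taxonomy entry.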

I am still figuring out how to generate the PyNAST template files, which are used for MSA.

Hope this helps.

Take care,

Jk Kim
