generate OTU table including fasta sequences

Muhammad Arslan

Aug 21, 2019, 6:06:59 PM
to Qiime 1 Forum
Hi there, 
I want to generate an OTU table that includes the fasta sequences, so that I can view it in Excel. I have tried a few ways to combine the two files, but none of them worked for me. Could you guide me on how to generate such a table (using the biom file, etc.)? I need the fasta sequence in addition to the taxonomy in the final output file.
Best regards 
Arslan

TonyWalters

Aug 22, 2019, 3:35:26 AM
to Qiime 1 Forum
Hello Arslan,

There is not an easy way to directly add fasta sequences to the biom file. You'd have to make a custom script first to parse the fasta sequences into a format that could be read in as metadata for the OTUs (e.g., a tab delimited file of sequence label<tab>sequence).
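A custom script of that kind can be quite short. The following is a minimal Python sketch, not part of QIIME: it assumes the first whitespace-delimited word of each fasta label is the OTU ID, and the function names, the `#OTU ID` / `sequence` header, and the file names in the comments are all hypothetical choices made for illustration.

```python
def fasta_to_rows(handle):
    """Yield (label, sequence) pairs from an open fasta file handle.

    Sequences that span several lines are joined into one string.
    """
    label, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            # Assumption: the first word of the label is the OTU ID.
            parts = line[1:].split()
            label, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def write_tsv(fasta_handle, out_handle):
    """Write one 'label<tab>sequence' line per fasta record."""
    out_handle.write("#OTU ID\tsequence\n")  # hypothetical header names
    for label, seq in fasta_to_rows(fasta_handle):
        out_handle.write(f"{label}\t{seq}\n")


# Example use (file names are placeholders):
# with open("rep_set.fna") as fin, open("rep_set_sequences.tsv", "w") as fout:
#     write_tsv(fin, fout)
```

The resulting tab-delimited file opens directly in Excel, and each row can be matched to the OTU table by its ID.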

May I ask why you need to have this format? There might be an alternative approach to use depending upon what you're trying to accomplish.

Muhammad Arslan

Aug 22, 2019, 11:40:30 AM
to Qiime 1 Forum
Hi Tony,
 
Maybe it was unclear in my earlier post, but I want to make an OTU table for Excel that includes both the taxonomy and the fasta sequences. I am not saying that I want to make a biom file with the fasta sequences inside. So, there might be a script available that extracts the information from the two separate files and makes one OTU table (as Mothur does).
Do you know of any such thing?

Thank you very much
Arslan

Muhammad Arslan

Aug 22, 2019, 11:44:05 AM
to Qiime 1 Forum
And I need to do this only because I want to check which sequence corresponds to which OTU, even though they were clustered at the 97% cutoff.

TonyWalters

Aug 22, 2019, 1:59:30 PM
to Qiime 1 Forum
Well, there's not an easy way (without writing a script) to produce either a .biom file or a tab-delimited, Excel-style file that includes the fasta sequences.

However, the rep_set.fna file has the representative sequence for each OTU, and the labels in that file match the OTU identifiers in the OTU table. So if you need to look up sequences for a few OTUs, you could just search for their ID in the rep_set.fna file, or do it on the command line with:
grep -A 1 xxx bbb
where xxx is the OTU ID
and bbb is the rep_set.fna file.
This prints the label line and the line after it, i.e. the sequence for that particular OTU, so you don't have to open the fasta file in a text editor, if that's inconvenient.
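If the lookup is needed for every OTU rather than a few, the same matching of IDs can be done in one pass. The following is a minimal Python sketch, not a QIIME script. It assumes the OTU table has already been exported to tab-delimited text (for example with `biom convert --to-tsv --header-key taxonomy`), that its header row starts with `#OTU ID`, and that the first word of each label in rep_set.fna is the OTU ID. The function names, the `Sequence` column name, the `NA` placeholder, and the file names in the comments are hypothetical.

```python
def read_fasta(handle):
    """Return {otu_id: sequence} from an open fasta handle."""
    seqs, otu_id = {}, None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # Assumption: the first word of the label is the OTU ID.
            parts = line[1:].split()
            otu_id = parts[0] if parts else ""
            seqs[otu_id] = ""
        elif line and otu_id is not None:
            seqs[otu_id] += line  # join multi-line sequences
    return seqs


def add_sequences(table_handle, seqs, out_handle, missing="NA"):
    """Copy a tab-delimited OTU table, appending a sequence column."""
    for line in table_handle:
        line = line.rstrip("\r\n")
        if line.startswith("#OTU ID"):
            # Header row: add a name for the new column.
            out_handle.write(line + "\tSequence\n")
        elif line.startswith("#") or not line:
            # Other comment lines and blank lines pass through unchanged.
            out_handle.write(line + "\n")
        else:
            otu_id = line.split("\t", 1)[0]
            out_handle.write(line + "\t" + seqs.get(otu_id, missing) + "\n")


# Example use (file names are placeholders):
# with open("rep_set.fna") as fa:
#     seqs = read_fasta(fa)
# with open("otu_table.txt") as tab, open("otu_table_with_seqs.txt", "w") as out:
#     add_sequences(tab, seqs, out)
```

The output keeps the counts and taxonomy columns as they were and adds the representative sequence as the last column, so it can be opened in Excel like the original table. OTUs with no matching label in the fasta file get the placeholder value instead of a sequence.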

Muhammad Arslan

Aug 22, 2019, 2:08:58 PM
to Qiime 1 Forum
Dear Tony,

Thank you for the detailed reply. I think it's really helpful.

Best regards
Arslan