Clustering different samples: I need the same OTU IDs across samples

Karen Daiana Schoninger

Dec 16, 2024, 3:26:18 AM

to VSEARCH Forum

Hello everyone :)
I am doing a metabarcoding project.
I have 50 samples, which I demultiplexed and preprocessed separately.
At first I clustered each sample separately, which gave me one FASTA file per sample, each with its own set of OTUs. The problem is that I need the OTUs to have the same identifiers across samples. When clustering separately, OTU_1 of sample A is not the same as OTU_1 of sample B (they even differ in taxonomic classification).
I would like to get a list of OTUs and know which OTUs are shared between samples and which are not.
I have decided to concatenate the 50 FASTA files and cluster them all together with the following command:

vsearch --cluster_fast all_samples.fasta \
--id 0.97 \
--centroids centroids.fasta \
--uc clusters.uc \
--relabel OTU_ \
--sizeout

Then I compare each original sample against the resulting centroids with:

vsearch --usearch_global sample1.fasta \
--db centroids.fasta \
--id 0.97 \
--otutabout sample1_otutable.txt

This gives me an OTU table for each sample, which I then combine in R to get a single table.
As I mentioned before, my goal is to know which OTUs are shared between samples and which are not. Do you think this workflow is correct?
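
For readers who would rather stay in the shell than switch to R, the merge step could be sketched roughly as below. This is only an illustration under assumptions: it expects each per-sample table to be tab-separated with a "#OTU ID" header line and a single count column, and the demo files, counts, and output names (merged_otutable.txt, otu_occurrence.txt) are made up for the example.

```shell
#!/bin/sh
# Demo input: two tiny per-sample tables (made-up counts, illustration only).
workdir=$(mktemp -d)
printf '#OTU ID\tsample1\nOTU_1\t10\nOTU_2\t3\n' > "$workdir/sample1_otutable.txt"
printf '#OTU ID\tsample2\nOTU_2\t7\nOTU_3\t1\n' > "$workdir/sample2_otutable.txt"

# Merge all per-sample tables into one OTU x sample matrix.
# An OTU that is absent from a sample gets a count of 0.
awk -F'\t' -v OFS='\t' '
    FNR == 1 { sample[++n] = $2; next }      # header line: remember the sample name
    { count[$1, n] = $2; otus[$1] = 1 }      # data line: store the count for this OTU
    END {
        header = "#OTU ID"
        for (i = 1; i <= n; i++) header = header OFS sample[i]
        print header
        for (otu in otus) {
            row = otu
            for (i = 1; i <= n; i++)
                row = row OFS ((otu, i) in count ? count[otu, i] : 0)
            print row
        }
    }
' "$workdir"/sample*_otutable.txt | LC_ALL=C sort > "$workdir/merged_otutable.txt"

# For each OTU, count the number of samples in which it occurs (count > 0).
awk -F'\t' 'NR > 1 {
    k = 0
    for (i = 2; i <= NF; i++) if ($i > 0) k++
    print $1 "\t" k
}' "$workdir/merged_otutable.txt" > "$workdir/otu_occurrence.txt"

cat "$workdir/merged_otutable.txt"
cat "$workdir/otu_occurrence.txt"
```

In the second file, an OTU with a value of 1 is specific to one sample, and anything above 1 is shared between samples. The C-locale sort is there only to keep the "#OTU ID" header on the first line.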

Thank you very much in advance :)

Frédéric Mahé

Dec 17, 2024, 3:44:05 AM

to VSEARCH Forum

Hello Karen, if I am not mistaken, you've posted a similar question on GitHub (https://github.com/torognes/vsearch/issues/585). As described in my answer there, you can combine sample tagging (--sample) and OTU tracking (--otutabout) to obtain an OTU table. Please let me know if that works for you.
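
A minimal sketch of what combining --sample and --otutabout could look like is given below. It is not copied from the GitHub answer and rests on assumptions: the function name build_otu_table and the intermediate file names (tagged_*.fasta, otutable.txt) are hypothetical, --fastx_filter is used here only as one convenient command for applying --sample, and the 0.97 threshold is simply carried over from the question.

```shell
#!/bin/sh
# Sketch only: tag each sample, pool the reads, cluster once,
# then map the pooled reads back to the centroids.
# Usage (hypothetical file names): build_otu_table sampleA.fasta sampleB.fasta ...
build_otu_table() {
    : > all_samples.fasta
    for fasta in "$@"; do
        name=$(basename "$fasta" .fasta)
        # --sample appends ";sample=<name>" to every sequence header
        vsearch --fastx_filter "$fasta" --sample "$name" \
            --fastaout "tagged_${name}.fasta" || return 1
        cat "tagged_${name}.fasta" >> all_samples.fasta
    done

    # A single clustering run over all samples, so OTU identifiers are shared
    vsearch --cluster_fast all_samples.fasta --id 0.97 \
        --centroids centroids.fasta --relabel OTU_ || return 1

    # Assign reads to OTUs; sample names are read from the ";sample=" tags,
    # so one call produces a table with one column per sample
    vsearch --usearch_global all_samples.fasta --db centroids.fasta \
        --id 0.97 --otutabout otutable.txt
}
```

Called as `build_otu_table sample*.fasta`, this would write a single otutable.txt covering all samples, which removes the need to merge 50 per-sample tables afterwards. If the input files carry abundance annotations (";size="), adding --sizein to the last two commands would probably be needed for the counts to reflect read abundances.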
