Clustering different samples: I need the same OTU IDs


Karen Daiana Schoninger

Dec 16, 2024, 3:26:18 AM
to VSEARCH Forum
Hello everyone :)
I am doing a metabarcoding project.
I have 50 samples, which I demultiplexed and preprocessed separately.
At first I clustered each sample separately, which gave me one FASTA file per sample, each with its own OTUs. The problem is that I need the OTUs to have the same identifier across samples. When clustering separately, OTU_1 of sample A is not the same as OTU_1 of sample B (they even differ in their taxonomic classification).
I would like to get a list of OTUs and know which OTUs are shared between samples and which are not.
So I decided to concatenate the 50 FASTA files and cluster them all together with the following command:
vsearch --cluster_fast all_samples.fasta \
        --id 0.97 \
        --centroids centroids.fasta \
        --uc clusters.uc \
        --relabel OTU_ \
        --sizeout

Then I map each original sample against the centroids obtained above with:
vsearch --usearch_global sample1.fasta \
        --db centroids.fasta \
        --id 0.97 \
        --otutabout sample1_otutable.txt

This gives me one OTU table per sample, which I would then combine in R to get a single table.
As I mentioned before, my goal is to know which OTUs are shared between samples and which are not. Do you think this workflow is correct?

Thank you very much in advance :)

Frédéric Mahé

Dec 17, 2024, 3:44:05 AM
to VSEARCH Forum
Hello Karen, if I am not mistaken, you've posted a similar question on GitHub (https://github.com/torognes/vsearch/issues/585). As described in my answer on GitHub, you can combine sample tagging (--sample) and OTU tracking (--otutabout) to obtain an OTU table. Please let me know if that works for you.
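
For readers of this thread: the GitHub answer is not reproduced here, so the following is only a rough sketch of how sample tagging and a single OTU table could fit together. It is not the exact commands from that answer. The function names, the file names, and the use of --derep_fulllength for the tagging step are assumptions for illustration, and it assumes a vsearch version whose dereplication command accepts --sample (check the manual of your version). The last helper only uses awk and counts in how many samples each OTU occurs.

```shell
#!/bin/sh
# Sketch only: function names, file names and the 0.97 threshold are
# illustrative assumptions, not commands taken from the GitHub answer.

# Dereplicate one sample and tag every sequence with ";sample=NAME".
# Assumption: this vsearch version accepts --sample with --derep_fulllength.
tag_sample() {
    # $1 = input FASTA of one sample, $2 = sample name, $3 = output FASTA
    vsearch --derep_fulllength "$1" \
            --sample "$2" \
            --sizeout \
            --output "$3"
}

# Cluster the pooled, tagged sequences once, so that OTU_1 refers to
# the same centroid for every sample.
cluster_pooled() {
    # $1 = pooled FASTA, $2 = output centroids FASTA
    vsearch --cluster_fast "$1" \
            --id 0.97 \
            --sizein \
            --sizeout \
            --relabel OTU_ \
            --centroids "$2"
}

# Map the pooled, tagged sequences back to the centroids in one run.
# The ";sample=" tags become the columns of a single OTU table.
build_otutab() {
    # $1 = pooled FASTA, $2 = centroids FASTA, $3 = output OTU table
    vsearch --usearch_global "$1" \
            --db "$2" \
            --id 0.97 \
            --sizein \
            --otutabout "$3"
}

# For each OTU, print "OTU<TAB>number of samples with abundance > 0".
# Expects a tab-separated OTU table whose header line starts with "#".
otu_prevalence() {
    awk -F '\t' '
        /^#/ { next }                         # skip the header row
        {
            n = 0
            for (i = 2; i <= NF; i++)
                if (($i + 0) > 0) n++         # sample contains this OTU
            print $1 "\t" n
        }' "$1"
}
```

A possible way to use it: call tag_sample once per sample (for example tag_sample sample1.fasta sample1 sample1.tagged.fasta), concatenate the tagged files with cat into all_samples.fasta, then run cluster_pooled all_samples.fasta centroids.fasta, build_otutab all_samples.fasta centroids.fasta otutab.txt, and finally otu_prevalence otutab.txt. An OTU with a count of 1 occurs in a single sample, and with 50 samples a count of 50 means it occurs in all of them. This would replace the per-sample mapping loop and the merge step in R.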