--minsize while using --cluster_fast


redan...@ucdavis.edu

Jun 20, 2019, 8:13:48 PM
to VSEARCH Forum
Hello,

From reading another post, it seems you are aware of issues with the --minsize option while using --cluster_fast.
I have been referring to the manual for version 2.13.4 (the version I have installed) and to a published sequence-processing pipeline that uses vsearch and specifies --minsize 2 when clustering. Since this option is no longer accepted in the newest version (it produces an error and a list of accepted options), can you recommend how I can replicate the pipeline described at https://github.com/lavanyarishishwar/taxadiva ?
I thought at first it was a typo and should be --mintsize, which is listed as an option in the error output, but that does not seem correct based on its description in the manual (note: the manual does not associate this option with cluster_fast), and it produces an outrageous number of clusters from singletons (98%).
I should also note that the help output printed with the error at the command line does not correspond to the PDF manual I obtained from your GitHub page.

Thanks for your help!

Rachel

Torbjørn Rognes

Jun 21, 2019, 3:47:12 AM
to VSEARCH Forum
Hi Rachel

In vsearch, the cluster_fast command has no option to exclude singletons from its output.

To remove singletons you may, for example, run the fastx_filter command with the minsize option after clustering, like this:

vsearch --fastx_filter input.fasta --minsize 2 --fastaout output.fasta
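
For the whole workflow, a minimal sketch of the two steps might look like the following. The function name, the intermediate file name, and the 0.97 identity threshold are assumptions for illustration only and are not taken from the taxadiva pipeline. It also assumes that --sizeout is passed when clustering so that the centroids carry size annotations, and that --sizein is passed when filtering so those annotations are read.

```shell
# Hypothetical helper: cluster reads, then drop singleton clusters.
# Usage: cluster_without_singletons INPUT.fasta OUTPUT.fasta [IDENTITY]
cluster_without_singletons() {
    input="$1"                 # FASTA file of sequences to cluster
    output="$2"                # centroids of clusters with at least 2 members
    identity="${3:-0.97}"      # assumed threshold; use your pipeline's value
    centroids="${output}.all_centroids.fasta"   # assumed intermediate file name

    # Step 1: cluster; --sizeout writes ";size=N" annotations on each centroid.
    vsearch --cluster_fast "$input" --id "$identity" \
            --centroids "$centroids" --sizeout || return 1

    # Step 2: keep only centroids whose annotated size is 2 or more.
    vsearch --fastx_filter "$centroids" --sizein --minsize 2 \
            --fastaout "$output"
}
```

It could then be called as `cluster_without_singletons input.fasta output.fasta`. If the input is already dereplicated and carries size annotations, adding --sizein to the clustering step as well would probably be needed for the cluster sizes to reflect the original read counts.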

We have recently introduced a stricter checking of options in vsearch to inform users of options that have no effect on the selected command.

I hope this helps!

If there are any inconsistencies between the manual and other information provided by vsearch, any details would be appreciated.

- Torbjørn