I am new to HUMAnN2 and was hoping to ask: what is the rationale for using the UniRef50 versus the UniRef90 database?
Part of my confusion perhaps stems from the recommendation to use the UniRef90 database on the Huttenhower Lab page (http://huttenhower.sph.harvard.edu/humann2), while the UniRef50 database is used in the user manual (https://bitbucket.org/biobakery/humann2/src/tip/doc/UserManual.md?fileviewer=file-view-default) and on the wiki/tutorial pages (https://bitbucket.org/biobakery/biobakery/wiki/humann2).
Are there clear advantages or drawbacks to using sequences clustered at 90% versus 50% identity?
Thanks in advance for any assistance in this matter and for developing this fantastic program and tutorials.