Hello Kang,
This vsearch function, --allpairs_global, will align every read with every other read, then produce a uc output. For example, I have my OTUs labeled as OTU_<number>, and get this result from allpairs.
H 3 253 89.3 + 0 0 253M OTU_1 OTU_4
H 7 253 81.8 + 0 0 253M OTU_1 OTU_8
H 1 253 75.5 + 0 0 253M OTU_1 OTU_2
H 5 253 74.7 + 0 0 253M OTU_1 OTU_6
H 9 253 73.9 + 0 0 253M OTU_1 OTU_10
H 6 253 72.7 + 0 0 253M OTU_1 OTU_7
H 4 253 71.1 + 0 0 253M OTU_1 OTU_5
H 8 253 70.8 + 0 0 253M OTU_1 OTU_9
H 2 253 68.0 + 0 0 253M OTU_1 OTU_3
This would be the first row of the similarity table for OTU_1.
VSEARCH makes several notable improvements over USEARCH, one of which is exact alignments (same as Needleman-Wunsch). It also supports the --iddef flag, so you can pick out what number it returns from this list.
--iddef 0|1|2|3|4
Change the pairwise identity definition used in --id. Values accepted are:
0. CD-HIT definition: (matching columns) / (shortest sequence length).
1. edit distance: (matching columns) / (alignment length).
2. edit distance excluding terminal gaps (same as --id).
3. Marine Biological Lab definition counting each extended gap (internal or
terminal) as a single difference: 1.0 - [(mismatches + gaps)/(longest
sequence length)]
4. BLAST definition, equivalent to --iddef 2 in a context of global pairwise
alignment.
Does any of those 4 sound like the metric you are looking for?
Colin