Dear Doug,
Thanks for your query. I realize we never properly documented the
graphic rendering of the score. It is explained and validated in
the recent TCS paper
(https://academic.oup.com/mbe/article/31/6/1625/2925802) as well
as in the original M-Coffee paper
(https://academic.oup.com/nar/article/34/6/1692/2401531), but to
be fair, I am not sure we have ever posted a proper tutorial. I am
putting this on the to-do list.
Let me start with the first discrepancy: the scores are normalized
to 1-100, but owing to the compressed nature of this score on large
alignments, we have extended the scale to 1-1000 for the score
related to the entire alignment, hence the 652, which should be read
as 65.2. We could have made it a float, but we were somewhat limited
by several third-party packages, including ours, that assume an
integer and would have required substantial re-engineering to deal
with a float. Unfortunately, this was a very early format decision
taken more than 20 years ago.
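
To make the scale conversion concrete, here is a minimal sketch in Python. The function name is hypothetical and is not part of T-Coffee; it only assumes, as described above, that the whole-alignment score is an integer on the extended 1-1000 scale.

```python
def total_score_to_percent(raw_score: int) -> float:
    """Convert an integer whole-alignment score (extended 1-1000 scale)
    to the equivalent value on the 1-100 scale.

    Hypothetical helper for illustration only; not a T-Coffee function.
    """
    return raw_score / 10.0


# The score reported in your output, read back on the 1-100 scale.
print(total_score_to_percent(652))  # 65.2
```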
Now with respect to your example, you have aligned two sequences, and the consistency between the alternative MSAs of these sequences (these alternative alignments ARE the library) is 65.2%. This consistency is a combination of the sequence identity and the actual agreement between these alternative alignments.
In T-Coffee, the library can be whatever one considers relevant
(all pairwise alignments in T-Coffee, alternative MSAs in
M-Coffee, etc.).
This score reflects the overall stability of the alignments
across the various alternative alignments contained in the
library. Unaligned regions have a high score when they are
consistently unaligned across the library.
The cons line gives you an estimate of MSA stability across
columns, while the global numbers near the top give an
indication of the stability of a given sequence (including cons).
These numbers can be normalized in various ways. In the current
scheme, the normalization takes place against the sequence length,
hence the asymmetry.
To be honest, this scheme was never developed for pairwise sequence alignments, but it may give you a (weak) clue as to the most reliable parts of your alignment.
Hope this helps and thanks for using T-Coffee. Do not hesitate to ask any more questions.
Cheers,
Cedric
--
Dr Cedric Notredame, PhD
Group Leader
Notredame's lab - Comparative Bioinformatics Group
Bioinformatics and Genomics Programme
Room 440.03
Centre de Regulació Genòmica (CRG)
Dr. Aiguader, 88
08003 Barcelona
Spain
Ph# + 34 93 316 02 71
Fax# + 34 93 316 00 99
Mobile# + 34 66 250 47 82
email cedric.n...@crg.eu
url www.tcoffee.org
blog cedricnotredame.blogspot.com
ORC-ID: 0000-0003-1461-0988