Scoring inapplicable states in the new Nanotyrannus paper

Mickey Mortimer

Nov 1, 2025, 8:26:16 PM
to Dinosaur Mailing Group
This is a long one, but tl;dr I don't think Zanno and Napoli's method of scoring inapplicable codings is appropriate for their analysis.

So one part of Zanno and Napoli's new Nanotyrannus paper states "One particularly important point was facilitating contingent (or reductive) coding of inapplicable characters, which recent work suggests is the superior method for coding transformations incorporating superordinate and subordinate characters that define presence or absence of a primary structure, followed by descriptors of that structure that are inapplicable for taxa that lack it. Phylogenetic algorithms have generally considered inapplicable scoring (“-”) as equivalent to missing data (“?”), but several recent approaches allow inapplicability to be interpreted differently increasing the value of contingent coding."

This is based on Goloboff et al. (2021), who argue that coding inapplicable as missing (which is what TNT does by default, and what almost all papers end up doing) is misleading because the program still assigns a hypothetical state to unknown values for each calculation. As an example, in their Figure 2d...

https://onlinelibrary.wiley.com/cms/asset/414d4348-a3f4-46de-879f-ce94d196375a/cla12456-fig-0002-m.jpg

... the authors treat it as a problem that if red-tailed ancestors lose the tail and a descendant later regains it, the regained tail is still most parsimoniously red. They consider it misleading that the regain costs only one step (re-adding the tail) rather than two (re-adding the tail AND making it red), since in a basic sense it's a new tail, so its color would have to be new as well. It couldn't have inherited its redness from the tailless forms it directly evolved from.
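To make the step counting above concrete, here is a rough toy sketch in Python. It is not TNT and not the step-matrix recoding of Goloboff et al.; it is just a plain Fitch downpass on a made-up six-taxon tree. The taxon names A-F, the tree shape, and the "composite" three-state character used for comparison are all my own assumptions for illustration. Under reductive coding the inapplicable colour scores are treated as missing, so a regained red tail adds no colour step, while a regained blue tail adds one.

```python
from typing import Dict, Tuple, Union

# Hypothetical pectinate tree: A and B have tails, C and D are tailless,
# E and F have regained tails. A nested tuple is an internal node.
Tree = Union[str, Tuple["Tree", "Tree"]]
TREE: Tree = ("A", ("B", ("C", ("D", ("E", "F")))))


def fitch_steps(tree: Tree, scores: Dict[str, str], states: Tuple[str, ...]) -> int:
    """Minimum steps for one unordered character (Fitch downpass).

    '?' and '-' are both treated as missing, i.e. as 'any state'.
    """
    steps = 0

    def downpass(node: Tree) -> set:
        nonlocal steps
        if isinstance(node, str):
            score = scores[node]
            return set(states) if score in ("?", "-") else {score}
        left, right = node
        a, b = downpass(left), downpass(right)
        if a & b:
            return a & b
        steps += 1  # no shared state, so one change is needed here
        return a | b

    downpass(tree)
    return steps


def scores(a_to_f: str) -> Dict[str, str]:
    """Map a space-separated list of six scores onto taxa A-F."""
    return dict(zip("ABCDEF", a_to_f.split()))


# Reductive coding: tail presence plus a colour character scored '-' when tailless.
tail = scores("present present absent absent present present")
colour_red = scores("red red - - red red")
colour_blue = scores("red red - - blue blue")

tail_steps = fitch_steps(TREE, tail, ("absent", "present"))
reductive_red = tail_steps + fitch_steps(TREE, colour_red, ("red", "blue"))
reductive_blue = tail_steps + fitch_steps(TREE, colour_blue, ("red", "blue"))

# Assumed stand-in for comparison: one unordered composite character
# in which 'absent' is a state of its own.
composite_states = ("absent", "red", "blue")
composite_red = fitch_steps(TREE, scores("red red absent absent red red"), composite_states)
composite_blue = fitch_steps(TREE, scores("red red absent absent blue blue"), composite_states)

print(reductive_red, reductive_blue)  # 2 3
print(composite_red, composite_blue)  # 2 2
```

In this toy case the reductive coding makes a blue regained tail one step more expensive than a red one, which is the "long-distance" pull toward the pre-loss colour. The composite stand-in is indifferent to which colour comes back.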

But I didn't think that made sense genetically, a point the authors do address:

"For some cases, there may be evidence that, when the tail itself gets lost, the genes that determine its colour can remain unchanged. In that case, the “long-distance” effect between groups separated by tail absences observed in reductive coding (Figs. 2c,d), as well as the assignment of some colour to nodes reconstructed as tailless, are entirely accurate. In that situation, even if the tail is absent, the (unexpressed) genes may still code for a red or a blue colour; the missing entry used for colour in reductive coding thus appropriately represents our ignorance of what the genes in question code for, but the genes do code for a definite colour (a colour which could in principle be ascertained, e.g. sequencing the unexpressed gene). Should a tail be lost and subsequently regained, the colour would be expected to continue being the same as it was before the loss of the tail. The case where genes continue coding for the secondary character (e.g. colour) even in the absence of the primary feature (e.g. tail) can be called the “Hen’s teeth hypothesis” or HTH (i.e. modern birds show that the genetic and molecular machinery for a character can be retained, even if the character—be it teeth or colours—remains unexpressed). Whether the HTH applies to a particular set of characters is to be determined by the researcher; it may well apply to some of the characters in the matrix, but not the others."
"The examples of Figs 1 and 2 assume that the secondary character is indeed inapplicable (not just unobserved) when the primary structure is absent—that is, the examples assume that the HTH does not apply."

And I think the HTH should at least be presumed to apply in the theropod cases I can think of (missing or transformed structures). If each character is supposed to be a single change, then any factors that affect the features of a once-missing or transformed structure (other genes, developmental processes, topological consequences from adjacent structures) should parsimoniously be assumed to stay the same barring evidence otherwise. After all, these programs work on the assumption of the fewest changes necessary, and that characters code for homologous structures.

Their example of how their step-matrix method behaves is "in the deletion of a single base in a DNA sequence, followed several nodes down the tree by a subsequent reinsertion of another base at the same position. As soon as the base is deleted, a subsequent reinsertion will introduce a new base (A, G, C or T) independently of the base before the deletion; the intermediate gapped sequence contains no record of the base that existed before."

But in such an example, the bases (or tails, or teeth, etc.) aren't truly homologous. It would be like scoring the angle of dorsal fins in fish and dolphins. Sure, you could have a more excusable example, like scoring the depth of a fossa that happens to be caused by different mechanisms (pneumatic vs. muscular) in different taxa. But then you've really made a mistake and should not have been scoring those for the same character in the first place, because again they are not homologous, and TNT assumes homology in its workings.

So I don't think this scoring method is appropriate for most, if any, dinosaur analyses, let alone for a clade as small as Tyrannosauroidea, where genetics, development and topology were probably pretty consistent. Thoughts?

Reference- Goloboff, P. A., De Laet, J., Ríos-Tamayo, D. & Szumik, C. A., 2021. A reconsideration of inapplicable characters, and an approximation of step-matrix recoding. Cladistics. 37, 569-629.
https://onlinelibrary.wiley.com/doi/full/10.1111/cla.12456