A comprehensive review. Nice to see our 2014 paper cited, although it is not really metagenomics. But regarding "...instead such data can be used to attempt an outgroup rooting of an existing tree, using already classified sequence", our Drosanthemum paper may be an even better example, since we used the likelihood weight ratios (LWRs) of the alternatives.
Liede-Schumann et al. 2020. Phylogenetic relationships in the southern African genus Drosanthemum (Ruschioideae, Aizoaceae). PeerJ 8:e8999. https://doi.org/10.7717/peerj.8999
Check out figure 3: an ML bootstrap consensus network with "Black arrows indicat[ing] potential root positions inferred by outgroup-EPA, with arrow size proportional to the probability estimate p (Supplemental Information 4, Table S4)." It's CC-BY, so feel free to include the figure :)
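For readers less familiar with LWRs, here is a minimal sketch of how such weights can be derived from the log-likelihoods of competing placements: each candidate's likelihood divided by the sum over all candidates. The branch names and log-likelihood values are hypothetical, the function name is my own, and this is not the code used in the paper or in any EPA implementation.

```python
import math


def likelihood_weight_ratios(log_likelihoods):
    """Turn {branch: lnL} into {branch: LWR}; the LWRs sum to 1."""
    # Subtract the best lnL before exponentiating to avoid underflow.
    max_lnl = max(log_likelihoods.values())
    weights = {b: math.exp(lnl - max_lnl) for b, lnl in log_likelihoods.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


# Hypothetical log-likelihoods for three candidate root branches.
candidate_roots = {
    "branch_A": -10234.1,
    "branch_B": -10235.3,
    "branch_C": -10240.8,
}

lwr = likelihood_weight_ratios(candidate_roots)
for branch, p in sorted(lwr.items(), key=lambda kv: -kv[1]):
    print(f"{branch}\t{p:.3f}")
```

Scaling an arrow by such a value is then just a matter of mapping the ratio to a line width.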
And pondering the type of query sequences/NGS applications (same chapter): 18S and ITS are not really single-copy markers, but multi-copy. In plants, this not rarely involves paralogy: 3 out of 4 genomes mixed in polyploid wheats, up to four loci detected in beeches as NORs, which include the tandem copies of the 18S-ITS-containing 35S [or 45S] rDNA cistrons ;)
Regarding ML placement of queries, that'd be no issue for the 18S, which is a highly conserved gene with no intra-individual polymorphism and essentially no intra-specific/-generic divergence, but ITS can be extremely polymorphic within a genome (even in diploids). Which brings one back to the "...often overlooked source of query sequences are high-quality reference sequence database". EPA is of substantial use for these multi-copy, highly variable markers, especially in the context of HTS target sequencing. "However, this approach produces very short (150-400 nucleotide) reads, that typically only cover fragments of a reference gene. This limits their applicability to phylogenetics due to the lower information content": that may be true for the mentioned universal single-copy markers, but not for cloning classics like ITS1/ITS2 or the (5'-) ETS of the 35S rDNA cistron, or for the 5S rDNA intergenic spacer (also known as 5S-NTS).
EPA was the main reason we could recently re-cherish them:
Piredda et al. 2020. High-throughput sequencing of 5S-IGS in oaks: Exploring intragenomic variation and algorithms to recognize target species in pure and mixed samples. Molecular Ecology Resources 21:495–510.
The combination of EPA with (Piredda et al. 2020) or without (Cardon et al. 2021) well-annotated cloned reference data makes these short but highly divergent markers (too divergent for classic direct sequencing or barcoding) interesting again (the next EPA-ed samples will include hybrid populations).
PS "Furthermore, phylogenetic placement has been used for placement of fossils (165) using morphological data": much too rarely, considering that this was actually one of the original applications of EPA!
Berger SA, Stamatakis A. 2010. Accuracy of morphology-based phylogenetic fossil placement under Maximum Likelihood. IEEE/ACS International Conference on Computer Systems and Applications (AICCSA). Hammamet: IEEE. p 1-9.
Using probabilistic methods is pure heresy (there's only one truth in palaeontology), but it works:
Bomfleur B, Grimm GW, McLoughlin S. 2015. Osmunda pulchella sp. nov. from the Jurassic of Sweden—reconciling molecular and fossil evidence in the phylogeny of modern royal ferns (Osmundaceae). BMC Evolutionary Biology 15:126.