Hi Érico, I personally don't know of any prior study doing this. That's an interesting question!
Your question assumes that the input to SNaQ is a set of inferred gene trees, 1 tree per gene (or locus). SNaQ can take this as input, but does not *require* this type of input.
A better input to SNaQ is a table of quartet concordance factors estimated with a method that accounts for gene tree error. The downside of giving 1 tree per gene as input is that gene tree error is ignored.
In the SNaQ paper, quartet concordance factors are estimated with BUCKy, to account for gene tree error.
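To make "a table of quartet concordance factors" concrete, here is a minimal Python sketch of the naive version: for one 4-taxon set, it counts the proportion of genes whose tree displays each of the 3 possible splits. This is not SNaQ or BUCKy code. The function name, the input format (one split label per gene) and the data are made up for illustration. It treats every estimated gene tree as if it were correct, which is exactly the limitation described above.

```python
from collections import Counter

# The three possible unrooted topologies for a 4-taxon set {a, b, c, d}.
SPLITS = ("ab|cd", "ac|bd", "ad|bc")

def naive_quartet_cfs(gene_quartets):
    """Return the proportion of genes displaying each of the 3 splits.

    gene_quartets: one split label per gene (hypothetical input format),
    or None when the gene tree does not resolve this 4-taxon set.
    """
    counts = Counter(q for q in gene_quartets if q is not None)
    n = sum(counts[s] for s in SPLITS)
    if n == 0:
        raise ValueError("no gene resolves this 4-taxon set")
    # Raw proportions: gene tree estimation error is ignored here.
    return {s: counts[s] / n for s in SPLITS}

# Made-up data for illustration only: 10 genes, one of them unresolved.
genes = ["ab|cd"] * 6 + ["ac|bd"] * 2 + ["ad|bc"] * 1 + [None]
cfs = naive_quartet_cfs(genes)
print(cfs)
```

A method like BUCKy replaces these raw proportions with estimates that account for uncertainty in each gene tree, but the resulting table has the same shape: 3 concordance factors per 4-taxon set, summing to 1.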
There is prior work looking at the impact of gene tree estimation error on BUCKy.
I mostly know of Chapter 3, p. 35-52, in the book "Estimating species trees: Practical and theoretical aspects", edited by Knowles & Kubatko. In section 3.3, figure 3.2, there is an example in which filtering has no impact: using all loci, many of which have poorly resolved trees, gives the same result as using only 1/3 of all loci, namely those whose estimated tree has 95% posterior credibility (and that don't reject a clock). Sampling 100 loci (out of ~30,000) doesn't make a difference either in terms of the estimated concordance factors (uncertainty increases with fewer loci, of course).
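As a rough back-of-the-envelope illustration of why uncertainty grows with fewer loci, here is a small Python sketch using a simple binomial approximation. This is an assumption made for illustration, not how BUCKy computes its credibility intervals, and the concordance factor of 0.6 is made up.

```python
import math

def binomial_se(cf, n_loci):
    """Standard error of a proportion cf estimated from n_loci independent loci."""
    return math.sqrt(cf * (1.0 - cf) / n_loci)

cf = 0.6  # made-up concordance factor, for illustration only
for n in (100, 30_000):
    print(n, round(binomial_se(cf, n), 4))
```

Under this approximation the standard error is about 0.05 with 100 loci versus about 0.003 with 30,000 loci: the point estimate is expected to stay put, but the interval around it widens.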
The take-home message is that BUCKy does a good job accounting for gene tree error when estimating concordance factors, at least in this example.
So if the input to SNaQ is a table of quartet concordance factors estimated with BUCKy, or with some other method that accounts for gene tree error, then I would guess that filtering low-information loci has a small impact. But again, a proper study on this would be interesting, I think.