Hi Guido,
> I naturally used the bootstopping criterion (I always do).
> The ASC run fulfilled it after 900 replicates (WRF average of 1.43),
> which would make it the more decisive one (?)
> The uncorrected one used up the max. 1,000 without converging (WRF
> average of 2.33), a value in the usual range for (non-molecular)
> matrices that are not overly tree-like.
I see.
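Just to confirm I'm picturing the right setup: with a binary matrix the
two runs would look roughly like this in RAxML v8 (alignment, seeds and
run names below are placeholders on my part):

  # ASC-corrected run: rapid bootstrapping with autoMRE bootstopping
  raxmlHPC -f a -m ASC_BINGAMMA --asc-corr=lewis \
           -s data.phy -p 12345 -x 12345 -# autoMRE -n asc

  # uncorrected run: same seeds, plain binary model
  raxmlHPC -f a -m BINGAMMA \
           -s data.phy -p 12345 -x 12345 -# autoMRE -n plain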
> I think the first question to ask is how stable the ML tree is.
>
>
> It's languages :) The signal from the matrix is not overly tree-like,
> and trees are usually as unstable as for morphological data.
That's what I guessed, and I did notice it's about languages :-)
> In this particular case some terminal relationships are trivial,
> signal-wise (relatively young splits), while the deeper relationships
> are chronically difficult to resolve (usually two or three equally
> valid alternatives). Even Bayesian analysis doesn't manage to converge
> to PP ~ 1 for most branches. Hence my focus on the bootstrap results.
>
>
> So, if you did, say, 20 ML tree searches on random and parsimony
> starting trees, how many distinct trees do you get, and how different
> are they from each other (i) topologically and (ii) statistically?
>
>
> I'll have to verify for this one, but usually you get (i) 20 somewhat
> different trees, topology-wise, that (ii) are not significantly
> different according to RAxML's built-in SH-test.
Okay, that's also more or less what I'd expect.
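In case it helps, that comparison can be scripted along these lines (a
sketch; file and run names are placeholders, and for the corrected
setup you'd use the ASC_BINGAMMA/--asc-corr variant instead):

  # 10 searches from parsimony starts, 10 from random starts (-d)
  raxmlHPC -m BINGAMMA -s data.phy -p 12345 -# 10 -n pars
  raxmlHPC -m BINGAMMA -s data.phy -p 54321 -d -# 10 -n rand

  # (i) pairwise RF distances among the 20 resulting trees
  cat RAxML_result.pars.RUN.* RAxML_result.rand.RUN.* > all20.tre
  raxmlHPC -f r -m BINGAMMA -z all20.tre -n rf

  # (ii) SH-test of the best-scoring tree against the others
  raxmlHPC -f h -m BINGAMMA -s data.phy -t best.tre -z all20.tre -n sh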
> It might well be that the signal itself is rather weak and, as a
> consequence, what you are observing might be random effects.
>
>
> I compared the result with the original analysis, which used more or
> less complex linguistic substitution models in a Bayesian framework.
> The splits that received higher support from the ASC run are the ones
> preferred by Bayes.
Okay.
> If it's randomness, would it make sense to re-run full analyses (tree
> inference + bootstrapping) from different seeds (for -x and -p), and
> compare the frequencies across the bootstrap runs?
Yes, that's a good idea.
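Something along these lines, i.e. simply varying both seeds per run
(again a sketch; seeds and names are placeholders):

  for SEED in 101 202 303 404 505; do
    raxmlHPC -f a -m ASC_BINGAMMA --asc-corr=lewis \
             -s data.phy -p $SEED -x $SEED -# autoMRE -n asc_$SEED
    raxmlHPC -f a -m BINGAMMA \
             -s data.phy -p $SEED -x $SEED -# autoMRE -n plain_$SEED
  done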
> The hypothesis being that, if it's random, the variation among, say,
> 10 standard analyses and 10 analyses corrected for the ascertainment
> bias should be as large within each group as between the two groups.
> Would 10 each be enough to discern between stochasticity and genuine
> signal?
I'd do at least 100.
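To make the per-branch frequencies directly comparable within and
between the two groups, you can map every run's bootstrap trees onto
one and the same reference topology with -f b; the support values then
line up branch by branch (sketch, placeholder names):

  for RUN in asc_101 asc_202 plain_101 plain_202; do
    raxmlHPC -f b -m BINGAMMA -p 12345 \
             -t reference.tre -z RAxML_bootstrap.$RUN -n supp_$RUN
  done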
> And should I change to standard bootstrapping rather than using the
> fast implementation?
Yes, that would have been my next suggestion.
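Standard bootstrapping just replaces -x with -b; the supports are then
drawn onto the best ML tree in a second step (sketch, placeholders):

  # standard (non-rapid) bootstrap replicates
  raxmlHPC -b 12345 -p 12345 -m ASC_BINGAMMA --asc-corr=lewis \
           -s data.phy -# 1000 -n std

  # map them onto the previously inferred best-scoring ML tree
  raxmlHPC -f b -m BINGAMMA -p 12345 \
           -t RAxML_bestTree.asc -z RAxML_bootstrap.std -n std_supp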
> I remember having read somewhere that even for molecular data we'd
> expect a fuzziness of about +/-5 in the BS support values.
Never heard of that, but it sounds plausible.
We did some experiments with bootstraps for that ascertainment bias
paper, but as far as I remember they were somewhat inconclusive.
Alexis
>
> /G.