Hi Ioana,
since the BS trees are based on resamples of your data matrix, they cannot serve as replacement for the optimised ML tree. Note, that while often addressed in literature as such, a, e.g. majority rule consensus tree based on the BS pseudoreplicate sample (or any other sample of trees), is not a phylogenetic tree or a good representation of a phylogeny (being the product of repeated dichotomies in a genetic lineage). It's always a summary of trees showing fully compatible (BS = 100) or incompatible (BS << 100) topological aspects.
On the other hand, for genetic-taxonomic purposes and phylogenetic considerations, we may be interested on why a BS support for a branch in the optimised ML is low. Essentially, there's 3 possibilities:
- Lack of discriminate signal at a certain or various levels.
- Incompatible signal (i.e. part of your SNPs support a different phylogeny than the rest, e.g. if you look at fungi with high potential to recombine genomes)
- Branching artefacts, pending the signal in the SNP matrix and the outgroup used, the optimised tree may be affected e.g. by ingroup-outgroup LBA attraction, which typically is less pronounced in BS pseudoreplicates (don't ask me why)
Because of these three point, non-trivial datasets may lead to aspect-wise incongruent ML trees and BS consensus trees. Which alternative one should choose in those cases, cannot be said in general but depends on the data's primary signal and structure as well as the biological connection/relativity of the tips (population-scale data, environmental samples, phenotypically sorted or not); not to forget the incongruent topological aspects and their alternatives.
Given your tip set of 1000 tips, that you get 25 unique topologies with parsimony is not surprising (the random trees must be different, given the number of maximum possible trees accomodating 1000 tips). Typically, when we accumulate data for a single species ("clarify the phylogeny of a fungus species"?!), we have a lot of so-called "flat" terminal subtrees comprising near-identical or just randomly varying tips, for which any topological permutation will have the same parsimony score and very similar likelihoods. PS If the parsimony starting trees show substantial incongruences across the entire tree, the signal in your data matrix may not be very treelike at all but the product of reticulate evolution rather than dichotomous (we only model the latter).
For the analysis itself, those incongruences in the starting trees are often irrelevant for running the ML tree inference and bootstrapping towards a phylogeny, because RAxML will find the best-known tree to accomodate the signal in the data within the tree space it searches. For non-trivial data, a higher number of starting trees can assure RAxML roams more of the tree space in order to find its preferred ML tree, i.e. we may get a resultant ML tree with a higher likelihood, the more starting trees we use.
Cheers, Guido