Hi Muriel,
> Let me describe my goal more clearly!
> I have these 600.000 SNPs but I thought that calculating a ML tree on them
> might be problematic because of linkage disequilibrium (this is RNA-seq
> data BTW).
Why should linkage disequilibrium affect this? Can you explain in more
detail?
> So I thought about downsampling my vcf to have 1 SNP per contig that is
> ~25.000 SNPs.
>
> But then, I thought that maybe the tree would be different depending on
> the 25.000 SNPs sampled...
Yes, that would rather be expected.
> Maybe what I could do instead is 1.000 downsamplings, calculate a ML
> tree for each, and then use these 1.000 trees as if they were BS replicates?
Well no, technically this is not a bootstrap but a sort of jackknife,
since you subsample the sites without replacement.
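A minimal sketch of such repeated draws, assuming an uncompressed VCF
whose first column (CHROM) holds the contig name. The function name
one_snp_per_contig and the file names are made up for illustration. It
uses awk with a per-contig reservoir of size one, so every SNP on a
contig has the same chance of being kept, and a different seed gives a
different subset:

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): keep one randomly chosen SNP per contig.
# Usage: one_snp_per_contig <in.vcf> <seed>   (writes a VCF to stdout)
one_snp_per_contig() {
  local vcf=$1 seed=$2
  awk -F'\t' -v seed="$seed" '
    BEGIN { srand(seed) }
    /^#/  { print; next }                 # pass header lines through unchanged
    {
      contig = $1
      if (!(contig in seen)) order[++n] = contig   # remember first-seen order
      seen[contig]++
      # Reservoir of size one: the k-th record of a contig replaces the
      # currently kept record with probability 1/k.
      if (rand() * seen[contig] < 1) keep[contig] = $0
    }
    END { for (i = 1; i <= n; i++) print keep[order[i]] }
  ' "$vcf"
}
```

Called in a loop over seeds, e.g.
`for s in $(seq 1 1000); do one_snp_per_contig all.vcf "$s" > sub_"$s".vcf; done`,
this would give one subsampled file per replicate.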
> Can you tell me exactly what the difference is between a consensus tree
> and a tree with bootstrap values? It is not clear to me.
Consensus tree: given n BS trees, you calculate the consensus tree from
those trees alone.
Tree with bootstrap support: given n BS trees and the best-known ML tree
on the original alignment, you count how frequently each
bipartition/split of the ML tree appears in the BS replicates.
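As a rough sketch, the two procedures above map to two separate RAxML
calls. The wrapper function names and the RAXML variable are made up
here for illustration; the options themselves (-J MR, -f b, -t, -z) are
the ones described in the RAxML manual:

```shell
#!/usr/bin/env bash
RAXML=${RAXML:-raxmlHPC}   # binary name; adjust to your installation

# Majority-rule consensus computed from the replicate trees alone.
# Usage: consensus_tree <trees_file> <run_name>
consensus_tree() {
  "$RAXML" -J MR -m GTRGAMMA -z "$1" -n "$2"
}

# Support values: how often each bipartition of a fixed reference tree
# (e.g. the best ML tree) occurs among the replicate trees.
# Usage: support_on_tree <reference_tree> <trees_file> <run_name>
support_on_tree() {
  "$RAXML" -f b -m GTRGAMMA -p 12345 -t "$1" -z "$2" -n "$3"
}
```

For example, `consensus_tree TREES consensus` uses only the replicate
trees, whereas `support_on_tree best.tre TREES support` needs a
reference tree as well (best.tre is a placeholder name).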
>
> I just did what I said earlier: downsampling 1000 times, calculating 1000
> trees and combining those trees in a single file called TREES.
>
> Running RAxML to obtain a consensus:
> raxmlHPC \
> -f b \
> -m GTRGAMMA \
> -z TREES \
> -J MR \
> -n consensus
>
> I obtained:
> (OC9,(DA1,(Oeu039,(OS3,(Oeu080,(OC16,Oeu067,(Oc417,(Oeu066,((Oeu077,((Oeu007,(Oeu041,Oeu089,(Ost52,Ost45,OC11,Ost25,OS10,OS9,OS5,Ost30,OS4,Ost27,Ost23,OS1,Ost43,Ost37,Ost22,(Ost46,(Oeu084,(Oc605,(Oc582,Oeu088):1.0[55]):1.0[100]):1.0[57]):1.0[51],(Oc593,(OC6,(OC3,Oeu085):1.0[98]):1.0[100]):1.0[65],((Ost35,Ost33):1.0[100],(Ost40,Ost39):1.0[100]):1.0[80],(Ost58,(Ost51,Ost54):1.0[100]):1.0[83]):1.0[96],(Oeu056,(Oc624,(Oeu038,Oeu076,(Oeu042,Oeu054):1.0[93],(Oeu095,OC1):1.0[100]):1.0[51]):1.0[93]):1.0[61],(Oeu008,(Oeu017,OC14):1.0[71]):1.0[92],(OC13,OC15):1.0[53]):1.0[90]):1.0[57],(Oeu097,OC12):1.0[97]):1.0[57]):1.0[64],(Oeu037,(Oeu069,(Oeu073,(Oc586,Oeu070):1.0[100]):1.0[62]):1.0[93]):1.0[98]):1.0[69]):1.0[74]):1.0[66],(Ocu27,OB4):1.0[100]):1.0[100]):1.0[52]):1.0[93]):1.0[100]):1.0[99],(DA3,(MZ4,MZ1):1.0[100]):1.0[97]);
>
> But I can't plot it using Trex
> (http://www.trex.uqam.ca/index.php?action=newick). It says that it's not
> a valid Newick format, why?
You probably need to use Dendroscope for it to display correctly.
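If you want to keep using a viewer that rejects the file, one thing to
try is moving the support values out of the square brackets: the tree
above stores them as `:1.0[55]`, while some parsers expect a node label
before the colon, as in `55:1.0`. That this is the cause of the error is
an assumption, not something checked against T-REX. A small sed sketch,
with a made-up function name:

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): rewrite ":<length>[<support>]" as
# "<support>:<length>". Reads the named files, or stdin if none are given.
brackets_to_node_labels() {
  sed -E 's/:([0-9.]+)\[([0-9]+)\]/\2:\1/g' "$@"
}
```

For instance, `printf '((A,B):1.0[100],C);\n' | brackets_to_node_labels`
prints `((A,B)100:1.0,C);`.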
> I thought I could calculate bootstrap values rather than make a consensus.
> But then, which tree should I put the bootstrap values on? I can't use
> the consensus one since there are multifurcations...
The ML tree on the original alignment. Please read the RAxML manual; it
is all outlined in great detail there.
alexis
>
> Thanks again for your help,
>
> Muriel
>
>
> On Thursday, September 24, 2015 at 10:03:28 AM UTC+2, Muriel
> Gros-Balthazard wrote:
>
> Hi everybody !
>
> I want to make a tree using SNPs in vcf files.
> The thing is that I downsampled my vcf file containing 600.000 SNPs
> to obtain 25.000 SNPs and made 10 replicates so that I have 10 vcf
> files, each containing 25.000 randomly sampled SNPs.
> My goal is to check that the choice of the 25.000 downsampled SNPs
> does not affect the tree obtained.
>
> For each of these 10 vcf files, I created 10 bootstrap replicates.
> raxmlHPC -# 10 -b 12345 -f j -m GTRGAMMA -s $phy.phylip -n $i\.REPS
> So that I have 100 replicates in total.
>
> I then calculated a starting tree for each of the 100 bootstrap
> replicates.
> raxmlHPC-PTHREADS -T 2 -y -s ../2_BSrep/$phy.phylip.BS$r -m GTRCAT -n vc_$i\_BS$r -p 12$i$r