gene tree in raxml!


Mergi Daba

Jun 26, 2021, 5:12:41 AM
to ra...@googlegroups.com
Dear Alexis

I want to generate per-locus (gene) trees as input for ASTRAL-III. My sequences are in a single concatenated MSA file (stacks_M12n12S4L1.sel.phy), not split per locus or gene.
I recently learned about ParGenes, but ParGenes needs the individual gene alignments as input in one directory. However, I have only one unpartitioned file with 34 individual sequences concatenated. I tried to find out how different authors used RAxML to generate gene trees, but I could not find the command-line parameters they used.
How can I generate locus/gene trees from my concatenated alignment to use as input for ASTRAL-III?

Best 
Mergi Dinka

               biru...@yahoo.com

Alexandros Stamatakis

Jun 27, 2021, 8:05:57 AM
to ra...@googlegroups.com
Dear Mergi,

If you have a partition file in RAxML format, that is easy: you can just
use the respective command in standard RAxML (I don't remember it, but
it is listed in the manual) to read in a partitioned multi-gene
alignment and split it up into individual alignments.
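
To illustrate what such a split does, here is a minimal Python sketch. It is not RAxML's own implementation. As far as I know, the RAxML option in question is the `-f s` algorithm used together with `-q <partition file>`, but please confirm that in the manual. The function names (`parse_partitions`, `split_alignment`, `to_phylip`) and the toy sequences are hypothetical, and only simple partition lines of the form `DNA, name=start-end` are handled.

```python
import re


def parse_partitions(text):
    """Parse simple RAxML-style lines such as 'DNA, p1=1-642'.

    Returns a list of (name, start, end) with 1-based inclusive coordinates.
    Multi-range or codon-stride partitions are NOT supported in this sketch.
    """
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = re.match(r"^[^,]+,\s*(\S+?)\s*=\s*(\d+)\s*-\s*(\d+)$", line)
        if m is None:
            raise ValueError(f"unsupported partition line: {line!r}")
        parts.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return parts


def split_alignment(seqs, parts):
    """Cut every sequence into one sub-alignment per partition."""
    return {
        name: {taxon: seq[start - 1:end] for taxon, seq in seqs.items()}
        for name, start, end in parts
    }


def to_phylip(seqs):
    """Format a {taxon: sequence} dict as relaxed sequential PHYLIP text."""
    n_sites = len(next(iter(seqs.values())))
    lines = [f"{len(seqs)} {n_sites}"]
    lines += [f"{taxon} {seq}" for taxon, seq in seqs.items()]
    return "\n".join(lines) + "\n"


# Toy data (made up for illustration only).
toy = {"taxA": "ACGTACGTAC", "taxB": "ACGTTCGTAA"}
loci = split_alignment(toy, parse_partitions("DNA, p1=1-4\nDNA, p2=5-10"))
for name, sub in loci.items():
    print(name)
    print(to_phylip(sub), end="")
```

Each resulting per-locus alignment could then be written to its own file in one directory, which is the layout ParGenes expects according to the first message.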

Alexis

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Affiliated Scientist, Evolutionary Genetics and Paleogenomics (EGP) lab,
Institute of Molecular Biology and Biotechnology, Foundation for
Research and Technology Hellas

www.exelixis-lab.org

Mergi Daba

Jun 27, 2021, 8:24:21 AM
to ra...@googlegroups.com
Dear Alexis

Thank you for your reply. 
However, I don't have a partition file. I have one unpartitioned file with 34 individual sequences concatenated, generated from RADseq data by the RADIS pipeline.
Is there a pipeline that generates the partition file, or anything else you can recommend?

Best regards,
Mergi Dinka

               biru...@yahoo.com



Benoit Morel

Jun 27, 2021, 4:19:01 PM
to raxml
Dear Mergi,

I don't know the RADIS pipeline but I had a quick look at their paper. I think that they build the concatenated sequences from a set of loci. So instead of using a tool to infer the loci from the concatenated sequence, I encourage you to look at their documentation or to ask them how to get the per-locus sequences that were used to build the final concatenated alignment.

Best,
Benoit

Mergi Daba

Jun 28, 2021, 6:24:32 AM
to ra...@googlegroups.com
Dear Benoit
Thank you for your email. I went through the documentation of the RADIS pipeline and asked the authors how to get the per-locus sequences, but I got no response. That is why I am looking for an alternative way to get the per-locus sequences, or to generate a partition file from the concatenated alignment if possible.
Best
Mergi Dinka

               biru...@yahoo.com


Benoit Morel

Jun 29, 2021, 3:16:35 PM
to raxml
Dear Mergi,

This goes beyond my knowledge. Maybe someone else in this group knows a tool that could be adapted... 
I have already seen people just cut the alignment into chunks of equal size (to then run ASTRAL), but I can't tell whether this is good practice.
Otherwise, I suggest asking the ASTRAL team, since you are probably not the first person who needs to partition a long alignment to produce gene trees for ASTRAL.
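
For reference, the equal-size chunking mentioned above could be sketched in Python as below. This is only an illustration under stated assumptions: the function name `equal_chunks`, the `chunkN` partition names, and the window length of 1000 sites are all made up, and the windows are arbitrary, so they will generally not line up with real locus boundaries.

```python
def equal_chunks(n_sites, chunk_len):
    """Return RAxML-style partition lines covering sites 1..n_sites
    in consecutive windows of chunk_len sites (the last may be shorter)."""
    lines = []
    for i, start in enumerate(range(1, n_sites + 1, chunk_len), start=1):
        end = min(start + chunk_len - 1, n_sites)
        lines.append(f"DNA, chunk{i}={start}-{end}")
    return lines


# Hypothetical example: an alignment of 5370 sites cut into 1000-site windows.
print("\n".join(equal_chunks(5370, 1000)))
```

The printed lines have the same shape as a RAxML partition file, so they could be fed to whatever tool is used to split the alignment.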

Best,
Benoit

gabriele...@gmail.com

Jun 29, 2021, 4:14:12 PM
to raxml
Dear Mergi,

I had a quick look at the RADIS pipeline. It relies on STACKS, which means that you should have intermediate files corresponding to those produced by STACKS.
If I'm not wrong, only the most recent versions of STACKS can produce concatenated PHYLIP files. Along with the concatenated file, STACKS outputs a partition file that you can use in RAxML to split the concatenated alignment.

I do not know which version of STACKS RADIS uses.

However, there might be a workaround in case it is not able to produce the partition file (as seems to be the case).
STACKS usually produces output files specifying the number of loci included in your final dataset, plus the position and length of each locus. STACKS uses this information to create the partition file; in fact, each partition corresponds to a separate catalog locus (equivalent to a reference locus).

I'm showing some of my results.

This is the log file generated by STACKS when creating the concatenated PHYLIP file (populations.all.phylip.log). It lists the loci included in the concatenated file, along with each locus's start position and length. This information is used to create the partition file (see below).

---------------
# Stacks v2.5;  Phylip interleaved; June 23, 2021
# Locus ID      LocusCnt        Sequence position       Length
155     1       1       642
159     2       643     697
172     3       1340    765
179     4       2105    830
189     5       2935    822
195     6       3757    738
273     7       4495    150
311     8       4645    726
---------------

This is the PARTITION file generated by STACKS when creating the phylip concatenated file (populations.all.partitions.phylip):

DNA, p1=1-642
DNA, p2=643-1339
DNA, p3=1340-2104
DNA, p4=2105-2934
DNA, p5=2935-3756
DNA, p6=3757-4494
DNA, p7=4495-4644
DNA, p8=4645-5370


You can see that the information in populations.all.phylip.log corresponds to the partitioned loci generated by STACKS.

If you have these files, then you should be able to create a partition file using some bash commands.
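
The conversion itself is simple arithmetic: each partition ends at start + length - 1. I mentioned bash, but here is a minimal Python sketch of the same idea. The function name `log_to_partitions` is hypothetical, the `DNA` prefix and `pN` names simply imitate the partition file shown above, and the column layout is assumed to match my populations.all.phylip.log (Stacks v2.5); other versions may differ.

```python
def log_to_partitions(log_text):
    """Turn rows of 'LocusID LocusCnt Start Length' into RAxML-style
    partition lines. Blank lines and '#' comment lines are skipped."""
    lines = []
    for row in log_text.splitlines():
        row = row.strip()
        if not row or row.startswith("#"):
            continue
        locus_id, locus_cnt, start, length = (int(x) for x in row.split()[:4])
        end = start + length - 1  # 1-based inclusive end position
        lines.append(f"DNA, p{locus_cnt}={start}-{end}")
    return lines


# The log rows shown above, copied verbatim.
example_log = """\
# Stacks v2.5;  Phylip interleaved; June 23, 2021
# Locus ID      LocusCnt        Sequence position       Length
155     1       1       642
159     2       643     697
172     3       1340    765
179     4       2105    830
189     5       2935    822
195     6       3757    738
273     7       4495    150
311     8       4645    726
"""

print("\n".join(log_to_partitions(example_log)))
```

Run on the log rows above, this reproduces the eight lines of the partition file, which is a quick way to check that the columns are being read correctly before applying it to your own log.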

Ultimately (without wanting to be annoying), I suggest that you use STACKS directly. It gives you more freedom to explore your data and to decide in which format to output it.

I hope it helps,

Bests,

Gabriele

Mergi Daba

Jun 30, 2021, 5:36:16 AM
to ra...@googlegroups.com
Dear Gabriele
Thank you for your nice and valuable explanation. RADIS uses STACKS version 1.34. I tried to use STACKS 2.54, but I couldn't manage to generate the concatenated PHYLIP file and the partition file using the populations program. I feel lost when using STACKS, as I have not worked with it much.
Here is the command line I used for populations:
 "populations -P /home/my/other_stack/ustack_out/ -M /home/my/Documents/my_file/stacks/popmaps/test.popmap -O /home/my/other_stack/new_par/ -R 0.20 --phylip"
I want to generate the concatenated PHYLIP file and the partition file for my data. Could you please send me the steps I should follow and the commands I should use to generate these files?

Best,

Mergi Dinka

               biru...@yahoo.com

gabriele...@gmail.com

Jun 30, 2021, 6:38:36 AM
to raxml

Hi Mergi,

Can you send me the log files that you obtained using the populations program of STACKS 2.54? I'm afraid that the input files you obtained with the version of STACKS implemented in RADIS (1.34) may conflict with STACKS v2.54.

Bests,

Gabriele

Mergi Daba

Jun 30, 2021, 7:41:57 AM
to ra...@googlegroups.com
Dear Gabriele
I only mentioned the STACKS version that the RADIS pipeline uses. I ran STACKS version 2.54 separately, from process_radtags through to populations. I have attached the log files for denovo_map.pl and for populations. Please take a look at them and let me know what you think.
Thanks a lot.

Best, 
Mergi Dinka
               biru...@yahoo.com

denovo_mappl.log
populations.log

gabriele...@gmail.com

Jun 30, 2021, 8:10:41 AM
to raxml

Hi Mergi,

Yes, you had already mentioned the version of STACKS implemented in RADIS, but that is not what I asked in my previous message.
You had not mentioned that you had run the whole STACKS pipeline, and not only the populations program.
Anyway, this means that it should be quite straightforward to obtain the partition file.
Here is the command I use to generate the PHYLIP file with the full sequences and the associated partition file.
Everything should be covered in the STACKS manual; alternatively, typing 'populations' in your console should print all the information you need (for instance, the types of output files that STACKS can produce).

The command I use:

populations -P $stacks_dir -M $popmap -O $output_dir -p $min_pop -r $perc_ind --min-maf $min_maf --max-obs-het $max_obs_het -t $threads -B $black_list_path --phylip-var-all --fasta-loci --verbose 

bests,

Gabriele

Mergi Daba

Jun 30, 2021, 8:19:21 AM
to ra...@googlegroups.com
Dear Gabriele

Thank you for your reply. I will follow your suggestion and try it out.
Best,
