Running PhyML with mpirun

Runhua Lei

Sep 26, 2013, 10:26:11 AM
to phyml...@googlegroups.com
Dear All:
I ran 211 complete primate mitochondrial sequences through PhyML under mpirun with the following command line: mpirun -n 16 ./phyml-mpi -i lgroteincut.dat -m GTR -v 0.292 -a 0.6761 -b 100. After about 20 hours, the run was terminated. The error message is as follows:

mpirun has exited due to process rank 9 with PID 13928 on node basestation exiting without calling "finalize". This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).

Could you please help me solve this issue?

Thank you very much,

All the best,

Runhua

Stephane Guindon

Sep 26, 2013, 4:23:54 PM
to phyml...@googlegroups.com
Dear Runhua,

Can you please tell me which version of PhyML you are using here?

Regards,

-Stephane-

Runhua Lei

Sep 27, 2013, 9:48:58 AM
to phyml...@googlegroups.com
Dear Stephane:

I used the phyml-20130708.tar.gz package, which I recently downloaded as source.

Thanks,

Runhua

Stephane Guindon

Sep 27, 2013, 8:09:04 PM
to phyml...@googlegroups.com
Dear Runhua,

Thanks. I have just uploaded a new (development) version (20130927) with
a bug fix for the bootstrap under GTR or custom models. Please give it a go
and let us know whether or not this fixes your issues.

Regards,

-Stephane-


Runhua Lei

Sep 30, 2013, 2:56:37 PM
to phyml...@googlegroups.com
Dear Stephane:
I installed the new version of the software and ran the file again. The same error occurred, as shown below. What do you think?
galaxy@basestation:~/software/Phylogenydivergence/phyml-20130927/src$ mpirun -n 10 ./phyml-mpi -i lgproteincut.dat -m GTR -v 0.292 -a 0.6761 -b 1000

. command-line: ./phyml-mpi -i lgproteincut.dat -m GTR -v 0.292 -a 0.6761 -b 1000




                                 ..........................                                     
 ooooooooooooooooooooooooooooo        CURRENT SETTINGS        ooooooooooooooooooooooooooooooooooo
                                 ..........................                                     

                . Sequence filename:                 lgproteincut.dat
                . Data type:                     dna
                . Alphabet size:                 4
                . Sequence format:                 interleaved
                . Number of data sets:                 1
                . Nb of bootstrapped data sets:             1000
                . Compute approximate likelihood ratio test:     no
                . Model name:                     GTR
                . Proportion of invariable sites:         0.292000
                . Number of subst. rate categs:             4
                . Gamma distribution parameter:             0.676100
                . 'Middle' of each rate class:             mean
                . Nucleotide equilibrium frequencies:         empirical
                . Optimise tree topology:             yes
                . Tree topology search:                 NNIs
                . Starting tree:                 BioNJ
                . Add random input tree:             no
                . Optimise branch lengths:             yes
                . Optimise substitution model parameters:     yes
                . Run ID:                     none
                . Random seed:                     1380551830
                . Subtree patterns aliasing:             no
                . Version:                     20130927

 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo



. 7114 patterns found (out of a total of 10781 sites).

. 3484 sites without polymorphism (32.32%).

. Computing pairwise distances...

. Building BioNJ tree...

. WARNING: this analysis requires at least 1145 MB of memory space.

. Do you really want to proceed? [Y/n] y


. Refining the tree...

. (  731 sec) [   -533599.2805] [GTR parameters     ]
. (  755 sec) [   -533592.2033] [117978] [depth=    1]
. (  756 sec) [   -533592.0139] [117979] [depth=    1]
. (  772 sec) [   -533585.7264] [118011] [depth=    1]
. (  777 sec) [   -533572.5506] [118046] [depth=    1]
. (  779 sec) [   -533516.6078] [118011] [depth=    1]
. (  800 sec) [   -533480.8507] [118011] [depth=    1]
. (  803 sec) [   -533417.2033] [118011] [depth=    1]
. (  805 sec) [   -533255.8640] [118016] [depth=    1]
. (  820 sec) [   -533189.4883] [118041] [depth=    1]
. (  822 sec) [   -533134.5800] [118016] [depth=    1]
. (  827 sec) [   -533047.1938] [117977] [depth=    1]
. (  865 sec) [   -532975.9847] [118027] [depth=    1]
. (  866 sec) [   -532930.2900] [117977] [depth=    1]
. (  869 sec) [   -532386.8448] [118011] [depth=    1]
. (  873 sec) [   -532349.8205] [118066] [depth=    1]
. (  875 sec) [   -532339.9348] [118106] [depth=    1]
. (  882 sec) [   -532200.5295] [118100] [depth=    1]
. (  885 sec) [   -532157.3361] [118105] [depth=    1]
. (  887 sec) [   -532135.7889] [118210] [depth=    1]
. (  890 sec) [   -531942.9937] [118105] [depth=    1]
. (  894 sec) [   -531939.2901] [118104] [depth=    1]
. (  901 sec) [   -531872.5209] [118174] [depth=    1]
. (  903 sec) [   -531838.8949] [118167] [depth=    1]
. (  906 sec) [   -531638.7346] [118160] [depth=    1]
. (  908 sec) [   -531565.0048] [118155] [depth=    1]
. (  912 sec) [   -531552.9464] [118164] [depth=    1]
. (  921 sec) [   -531484.3492] [118087] [depth=    1]
. (  924 sec) [   -531440.9030] [118152] [depth=    1]
. (  933 sec) [   -531114.0734] [118134] [depth=    3]
. (  940 sec) [   -531018.0958] [118073] [depth=    1]
. (  944 sec) [   -530904.8870] [118023] [depth=    1]
. (  952 sec) [   -530645.2802] [118006] [depth=    1]
. (  953 sec) [   -530600.0161] [118023] [depth=    1]
. (  956 sec) [   -530591.0904] [118006] [depth=    1]
. (  961 sec) [   -530461.8743] [117988] [depth=    2]
. (  966 sec) [   -530120.3889] [118003] [depth=    1]
. (  969 sec) [   -530097.4290] [117991] [depth=    1]
. (  978 sec) [   -530096.8696] [118003] [depth=    1]
. (  980 sec) [   -530090.9599] [117991] [depth=    1]
. (  983 sec) [   -529989.5569] [117976] [depth=    1]
. (  987 sec) [   -529965.4881] [117954] [depth=    1]
. (  993 sec) [   -529898.9727] [118103] [depth=    1]
. ( 1014 sec) [   -529897.6971] [118103] [depth=    1]
. ( 1028 sec) [   -529897.5688] [118103] [depth=    1]
. ( 1061 sec) [   -529826.7309] [118216] [depth=    1]
. ( 1157 sec) [   -529820.7982] [118221] [depth=    1]
. ( 1159 sec) [   -529806.3678] [118211] [depth=    1]
. ( 1167 sec) [   -529805.8622] [118211] [depth=    3]
. ( 1170 sec) [   -529800.1084] [118210] [depth=    3]
. ( 1173 sec) [   -529800.0726] [118210] [depth=    1]
. ( 1174 sec) [   -529799.0881] [118210] [depth=    1]
. ( 1177 sec) [   -529798.4982] [118210] [depth=    1]
. ( 1193 sec) [   -529753.7744] [118247] [depth=    1]
. ( 1194 sec) [   -529724.2866] [118210] [depth=    1]
. ( 1197 sec) [   -529510.9565] [118241] [depth=    1]
. ( 1206 sec) [   -529501.7532] [118241] [depth=    1]
. ( 1208 sec) [   -529498.1751] [118240] [depth=    1]
. ( 1220 sec) [   -529495.4248] [118241] [depth=    1]
. ( 1226 sec) [   -529491.2542] [118239] [depth=    1]
. ( 1254 sec) [   -529470.0962] [118246] [depth=    1]
. ( 1257 sec) [   -529464.6141] [118239] [depth=    1]
. ( 1276 sec) [   -529438.2015] [118235] [depth=    1]
. ( 1278 sec) [   -529423.2403] [118239] [depth=    1]
. ( 1291 sec) [   -529423.1094] [118239] [depth=    1]
. ( 1298 sec) [   -529423.0576] [118239] [depth=    1]
. ( 1300 sec) [   -529422.8734] [118239] [depth=    1]
. ( 1301 sec) [   -529422.5512] [118239] [depth=    1]
. ( 1320 sec) [   -529364.1866] [118240] [depth=    1]
. ( 1322 sec) [   -529355.1747] [118239] [depth=    1]
. ( 1324 sec) [   -529146.8035] [118348] [depth=    1]
. ( 1334 sec) [   -529144.6457] [118355] [depth=    1]
. ( 1336 sec) [   -529136.1389] [118362] [depth=    1]
. ( 1342 sec) [   -529132.8901] [118364] [depth=    1]
. ( 1344 sec) [   -529106.5198] [118383] [depth=    1]
. ( 1347 sec) [   -529031.5129] [118364] [depth=    1]
. ( 1352 sec) [   -528992.4845] [118404] [depth=    1]
. ( 1354 sec) [   -528982.2176] [118395] [depth=    1]
. ( 1357 sec) [   -528956.9985] [118395] [depth=    1]
. ( 1362 sec) [   -528865.4288] [118388] [depth=    1]
. ( 1363 sec) [   -528814.5903] [118386] [depth=    1]
. ( 1364 sec) [   -528776.9747] [118388] [depth=    1]
. ( 1368 sec) [   -528759.8937] [118369] [depth=    2]
. ( 1370 sec) [   -528745.6494] [118446] [depth=    1]
. ( 1373 sec) [   -528727.1265] [118463] [depth=    1]
. ( 1374 sec) [   -528576.5865] [118446] [depth=    1]
. ( 1393 sec) [   -528570.6438] [118444] [depth=    1]
. ( 1399 sec) [   -528555.5886] [118445] [depth=    1]
. ( 1400 sec) [   -528553.8886] [118444] [depth=    1]
. ( 1402 sec) [   -528514.2802] [118455] [depth=    1]
. ( 1410 sec) [   -528448.8003] [118540] [depth=    1]
. ( 1411 sec) [   -528159.8763] [118455] [depth=    1]
. ( 1413 sec) [   -528020.8760] [118378] [depth=    1]
. ( 1415 sec) [   -527881.6246] [118416] [depth=    1]
. ( 1426 sec) [   -527301.9990] [118266] [depth=    1]
. ( 1429 sec) [   -527291.5222] [118270] [depth=    1]
. ( 1431 sec) [   -527140.4195] [118246] [depth=    1]
. ( 1438 sec) [   -527139.7210] [118246] [depth=    1]
. ( 1440 sec) [   -527139.1222] [118246] [depth=    1]
. ( 1441 sec) [   -527138.8398] [118246] [depth=    1]
. ( 1444 sec) [   -527138.7818] [118246] [depth=    1]
. ( 1446 sec) [   -527137.4465] [118246] [depth=    1]
. ( 1500 sec) [   -527102.3950] [118241] [depth=    1]
. ( 1504 sec) [   -527088.7161] [118246] [depth=    1]
. ( 1507 sec) [   -527016.6639] [118223] [depth=    1]
. ( 1514 sec) [   -526883.5488] [118120] [depth=    1]
. ( 1523 sec) [   -526081.7515] [118108] [depth=    1]
. ( 1525 sec) [   -525990.8934] [118120] [depth=    1]
. ( 1527 sec) [   -525958.4568] [118109] [depth=    1]
. ( 1536 sec) [   -525814.2791] [118160] [depth=    1]
. ( 1542 sec) [   -525755.6257] [118190] [depth=    1]
. ( 1545 sec) [   -525539.6109] [118160] [depth=    1]
. ( 1549 sec) [   -525520.3460] [118109] [depth=    1]
. ( 1554 sec) [   -525371.1200] [118069] [depth=    1]
. ( 1558 sec) [   -525199.8303] [118050] [depth=    1]
. ( 1576 sec) [   -525198.5123] [118079] [depth=    1]
. ( 1586 sec) [   -525004.9257] [118102] [depth=    1]
. ( 1588 sec) [   -524922.0162] [118079] [depth=    1]
. ( 1590 sec) [   -524884.6752] [118050] [depth=    1]
. ( 1599 sec) [   -524753.4726] [118032] [depth=    1]
. ( 1610 sec) [   -524752.9035] [118032] [depth=    1]
. ( 1623 sec) [   -524726.5266] [118027] [depth=    1]
. ( 1682 sec) [   -524702.9722] [118032] [depth=    1]
. ( 1683 sec) [   -524698.8006] [118033] [depth=    1]
. ( 1729 sec) [   -524698.7871] [118033] [depth=    1]
. ( 1787 sec) [   -523369.5619] [Branch lengths     ]

. End of refining stage...
. The log-likelihood might now decrease and then increase again...


. Maximizing likelihood (using NNI moves)...

== Determinant becomes zero at   2!   
== Failed to invert the matrix.
== Trying Q<-Q*scalar and then Root<-Root/scalar to fix this...

== Determinant becomes zero at   3!   
== Failed to invert the matrix.
== Trying Q<-Q*scalar and then Root<-Root/scalar to fix this...

. ( 1879 sec) [   -497302.9074] [GTR parameters     ]
. ( 1885 sec) [   -497302.9074] [Topology           ][# nnis= 12]
. ( 2138 sec) [   -495831.0237] [Topology           ][# nnis=  4]
. ( 2386 sec) [   -495713.8879] [Topology           ][# nnis=  2]
. ( 2604 sec) [   -495671.0172] [Topology           ][# nnis=  0]
. ( 2821 sec) [   -495648.2931] [Topology           ][# nnis=  1]
. ( 3067 sec) [   -495632.7540] [Topology           ][# nnis=  0]
. ( 3252 sec) [   -495621.8469] [Topology           ][# nnis=  0]
. ( 3431 sec) [   -495615.6311] [Topology           ]
. ( 3504 sec) [   -495563.0628] [GTR parameters     ]
. ( 3513 sec) [   -495563.0628] [Topology           ][# nnis=  0]
. ( 3759 sec) [   -495557.6224] [Topology           ][# nnis=  0]
. ( 3971 sec) [   -495556.9607] [Topology           ]
. ( 4098 sec) [   -495556.4898] [Branch lengths     ]
. ( 4258 sec) [   -495556.2367] [GTR parameters     ]
. ( 4389 sec) [   -495556.0882] [Branch lengths     ]
. ( 4421 sec) [   -495556.0871] [GTR parameters     ]
. ( 4549 sec) [   -495556.0410] [Branch lengths     ]
. ( 4579 sec) [   -495556.0404] [GTR parameters     ]
. ( 4699 sec) [   -495556.0236] [Branch lengths     ]
. ( 4729 sec) [   -495556.0234] [GTR parameters     ]
. ( 4849 sec) [   -495556.0190] [Branch lengths     ]
. ( 4882 sec) [   -495556.0189] [GTR parameters     ]
. ( 5043 sec) [   -495556.0172] [Branch lengths     ]
. ( 5074 sec) [   -495556.0171] [GTR parameters     ]
. ( 5194 sec) [   -495556.0167] [Branch lengths     ]
. ( 5224 sec) [   -495556.0167] [GTR parameters     ]
. ( 5333 sec) [   -495556.0165] [Branch lengths     ]
. ( 5363 sec) [   -495556.0165] [GTR parameters     ]
. ( 5473 sec) [   -495556.0164] [Branch lengths     ]
. ( 5503 sec) [   -495556.0164] [GTR parameters     ]
. ( 5531 sec) [   -495556.0164] [GTR parameters     ]

. Checking for NNIs, optimizing five branches...

. ( 6951 sec) [   -495553.9936] [Topology           ]
. ( 7038 sec) [   -495553.9914] [Branch lengths     ]
. ( 7055 sec) [   -495553.9913] [GTR parameters     ]
. ( 7137 sec) [   -495553.9907] [Branch lengths     ]
. ( 7154 sec) [   -495553.9907] [GTR parameters     ]
. ( 7234 sec) [   -495553.9906] [Branch lengths     ]
. ( 7250 sec) [   -495553.9905] [GTR parameters     ]
. ( 7332 sec) [   -495553.9905] [Branch lengths     ]
. ( 7349 sec) [   -495553.9905] [GTR parameters     ]
. ( 7367 sec) [   -495553.9904] [GTR parameters     ]

. Checking for NNIs, optimizing five branches...

. ( 9266 sec) [   -495552.8356] [Topology           ]
. ( 9421 sec) [   -495552.8003] [Branch lengths     ]
. ( 9451 sec) [   -495552.8002] [GTR parameters     ]
. ( 9639 sec) [   -495552.7992] [Branch lengths     ]
. ( 9688 sec) [   -495552.7991] [GTR parameters     ]
. ( 9868 sec) [   -495552.7990] [Branch lengths     ]
. ( 9906 sec) [   -495552.7989] [GTR parameters     ]
. (10102 sec) [   -495552.7989] [Branch lengths     ]
. (10137 sec) [   -495552.7988] [GTR parameters     ]
. (10177 sec) [   -495552.7988] [GTR parameters     ]

. Checking for NNIs, optimizing five branches...



. Log likelihood of the current tree: -495552.798760.

. Launch bootstrap analysis on the most likely tree...

. The bootstrap analysis will use 10 CPUs.
. Non parametric bootstrap analysis

  [--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 6206 on
node basestation exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

Thanks,

All the best,

Runhua

Stephane Guindon

Sep 30, 2013, 3:57:08 PM
to phyml...@googlegroups.com

Dear Runhua,

Sorry to see that it still crashes (it is quite frustrating to wait this long before it does...).
Can you please try without the '-m GTR' option and let me know what the outcome is?
Also, I cannot reproduce this bug on my side on smaller data sets, so could you please
send me your alignment file to see if I manage to see where the problem comes from?

Regards,

-Stephane-



Runhua Lei

Oct 2, 2013, 10:06:21 AM
to phyml...@googlegroups.com
Dear Stephane:
Please see the attached data set.

Thanks,
All the best,

Runhua

primatetestlg.dat

Stephane Guindon

Oct 4, 2013, 3:56:57 PM
to phyml...@googlegroups.com
Dear Runhua,

It took me some time, but I found a nasty bug in the bootstrap function that was
recently introduced and most likely explains the error you encountered with your
data. I have uploaded a new Development version on the Google Code web site:
http://code.google.com/p/phyml/downloads/list
Please feel free to give it a go and let us know whether it worked or not.

Regards,

-Stephane-



Runhua Lei

Oct 8, 2013, 10:47:18 AM
to phyml...@googlegroups.com

Dear Stephane:
I re-ran the analysis with a small data set (100 samples of 1140 bp), which works fine with 1000 bootstrap replicates. However, when I ran the big data set, I got the following error:
 Log likelihood of the current tree: -495585.709212.


. Launch bootstrap analysis on the most likely tree...

. The bootstrap analysis will use 8 CPUs.

. Non parametric bootstrap analysis

  [

--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 1511 on
node basestation exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

What do you think?

Thanks,

Runhua

Stephane Guindon

Oct 8, 2013, 3:10:29 PM
to phyml...@googlegroups.com
Dear Runhua,

You might need to check whether there is enough memory space available
on your computer as the amount required for your bootstrap analysis here
is 8x that for a normal (i.e., without bootstrap) one. That said, I think there
is still a problem with the GTR model. I am working on that right now and
will keep you updated. In the meantime, I suggest you try HKY85.
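
As a rough sketch of the memory check suggested here: assuming each MPI process needs about as much memory as a single non-bootstrap analysis, the total is the per-process figure from PhyML's warning earlier in this thread (1145 MB) times the number of processes (8 in the failing run). The function name and the per-process assumption are illustrative only and are not taken from PhyML itself.

```python
def bootstrap_memory_mb(per_process_mb: float, n_processes: int) -> float:
    """Estimate total memory for an MPI bootstrap run.

    Assumption (not from PhyML's code): every MPI process needs roughly
    the same amount of memory as one non-bootstrap analysis.
    """
    if per_process_mb <= 0 or n_processes <= 0:
        raise ValueError("both arguments must be positive")
    return per_process_mb * n_processes


# 1145 MB is the figure from PhyML's warning; 8 is the CPU count in the run.
total_mb = bootstrap_memory_mb(1145, 8)
print(f"Estimated total: {total_mb:.0f} MB (~{total_mb / 1024:.1f} GiB)")
```

On Linux, the result can be compared with the available memory reported by `free -m` before launching the run.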

Regards,

-Stephane-

Runhua Lei

Oct 10, 2013, 10:42:38 AM
to phyml...@googlegroups.com

Dear Stephane:
Thank you very much for your time and effort in helping me solve this issue.
I re-ran the same data set with the HKY85 model, and it is going well. Here is some of the run output:

. (17242 sec) [   -497638.0938] [Topology           ]


. Log likelihood of the current tree: -497638.093808.


. Launch bootstrap analysis on the most likely tree...

. The bootstrap analysis will use 8 CPUs.
. Non parametric bootstrap analysis

  [............

As you said, there may be some issues with the GTR model.


Thanks,

All the best,

Runhua

Stephane Guindon

Oct 15, 2013, 3:37:35 PM
to phyml...@googlegroups.com
Dear Runhua,

It is likely that the bug with GTR+bootstrap is now fixed. Because your data set contains
a lot of indels and distantly related sequences, initial estimates of the GTR parameters were
a bit extreme, which caused numerical issues. I have now set upper and lower bounds for
these parameters and was able to complete the analysis. I am still running some tests to
check that everything is ok.
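
To illustrate the general idea of bounding parameters, here is a minimal sketch. It is not PhyML's actual code, and the bounds 0.01 and 100.0 as well as the example rates are hypothetical values chosen for illustration. It clamps each relative rate estimate to a fixed interval so that extreme starting values cannot propagate into the rate matrix.

```python
from typing import List, Sequence

# Hypothetical bounds, chosen only for illustration (not PhyML's values).
RATE_MIN = 0.01
RATE_MAX = 100.0


def clamp_rates(rates: Sequence[float],
                lower: float = RATE_MIN,
                upper: float = RATE_MAX) -> List[float]:
    """Return a copy of `rates` with each value forced into [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return [min(max(r, lower), upper) for r in rates]


# Made-up "extreme" initial estimates for the six GTR exchangeabilities.
initial = [1e-9, 2.5, 4000.0, 0.8, 1.0, 350.0]
print(clamp_rates(initial))
```

Values already inside the interval pass through unchanged; only the outliers are pulled back to the nearest bound.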

Please feel free to download the latest Development version (20131016) from http://code.google.com/p/phyml/.

Regards,

-Stephane-

Runhua Lei

Oct 21, 2013, 10:12:46 AM
to phyml...@googlegroups.com
Dear Stephane:
I re-ran the data set. It looks fine now, but it is slow. I have used eight cores for more than three days, and so far only 40 bootstrap replicates are done.
For such a big data set, how many replicates do we normally need for the bootstrap test? I set 1000, which will take months to complete!
Is there a GPU version of PhyML available?
Thank you very much; I appreciate your time.
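
A back-of-the-envelope check of the "months" estimate, assuming replicates finish at a constant rate (an assumption; the time per replicate may vary). The function name is made up for this sketch, and since the run had taken "more than three days", the results are lower bounds.

```python
def days_for_replicates(done: int, elapsed_days: float, target: int) -> float:
    """Linearly extrapolate total run time from the progress so far."""
    if done <= 0 or elapsed_days <= 0:
        raise ValueError("need positive progress to extrapolate")
    return elapsed_days * target / done


# Figures quoted above: 40 replicates after about three days on eight cores.
for target in (100, 1000):
    print(target, "replicates:", days_for_replicates(40, 3.0, target), "days")
```

At this rate, 100 replicates would need about a week, while 1000 would need roughly two and a half months.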

All the best,

Runhua

Stephane Guindon

Oct 21, 2013, 11:21:13 PM
to phyml...@googlegroups.com

Dear Runhua,

A version of PhyML that relies on the BEAGLE phylogenetics library (https://code.google.com/p/beagle-lib/) is currently
in development. At the moment, this experimental version does not support bootstrap analysis... I will keep you updated
on our progress on this.

Regards,

-Stephane-