Dear MetaPhlAn2/StrainPhlAn users and developers,
Thanks a lot for developing and maintaining these very useful tools. I am working on oral microbiome samples using metagenomics (2x250 bp). I have followed your detailed tutorial and have some questions regarding the MetaPhlAn2/StrainPhlAn pipeline.
I ran MetaPhlAn2 v2.7.6 with both the default settings and with --min_alignment_len 100 --bt2_ps sensitive-local to get the taxonomic profile of my samples. I then used StrainPhlAn (https://bitbucket.org/biobakery/biobakery/wiki/strainphlan) to extract markers and explored different parameters to build trees.
I then used R to plot the trees produced by StrainPhlAn, as well as MSA plots and a PCoA based on Kimura’s distance, computed from the alignment FASTA files produced by StrainPhlAn.
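For reference, the distance mentioned above can be sketched as follows, assuming that "Kimura's distance" means the Kimura two-parameter (K80) model. This is a minimal Python illustration, not the R code used for the analysis (in R, `dist.dna(..., model = "K80")` from the ape package would be a typical choice). The function name `k2p_distance` and the decision to skip any position where either sequence has a gap or an ambiguous base are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Assumption: positions where either sequence is not A/C/G/T
    (gaps, N, other ambiguity codes) are ignored for this pair.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        return float("nan")       # no comparable positions

    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        return float("inf")       # too divergent for the model
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Applying this to every pair of rows in a marker alignment gives the symmetric distance matrix that a PCoA would take as input.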
1- My understanding is that we get a ‘population profile’ of the strains, and that based on marker SNPs we can measure the distance among samples with respect to that particular species’ ‘population profile’. Is that correct? Is there a way to estimate strain richness (i.e., the number of different strains)?
2- I am working with quality-trimmed 2x250 nt reads. Would you recommend tuning the options --min_alignment_len and --bt2_ps for StrainPhlAn, as I have seen in some MetaPhlAn2 tutorials (e.g., http://bioinformatics-ca.github.io/analysis_of_metagenomic_data_mod3_lab_2015/)?
3- Do you have a recommendation regarding statistical tests to confirm the discrimination observed with StrainPhlAn? I was thinking of PERMANOVA on Kimura’s distances.
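To make question 3 concrete, here is a rough sketch of the kind of test meant: a one-way PERMANOVA computed directly from a pairwise distance matrix. It is a simplified Python illustration with hypothetical function names (`pseudo_f`, `permanova`), assuming a single grouping factor and unrestricted permutations. In R, an existing implementation such as `adonis2` from the vegan package would be the more usual route.

```python
import random


def pseudo_f(dist, groups):
    """Pseudo-F statistic from a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in labels:
        idx = [i for i, lab in enumerate(groups) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx)
                         for j in idx[k + 1:]) / len(idx)
    ss_between = ss_total - ss_within
    a = len(labels)
    # Note: undefined if ss_within is zero (all within-group distances 0).
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # permute labels, keep distances fixed
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    # The +1 counts the observed labelling as one of the permutations.
    return f_obs, (hits + 1) / (n_perm + 1)
```

Here `dist` would be the matrix of Kimura distances between samples for one species and `groups` the sample metadata (for example, case versus control). With few samples per group the number of distinct permutations is small, which limits how low the p-value can go.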
Thanks a ton,
Florentin