Export all taxa's pvalues and coefficients from MaAsLin

clair...@gmail.com

Mar 25, 2019, 8:55:47 PM
to MaAsLin-users

Dear MaAsLin team,


I’m using MaAsLin for gut microbiota analysis. Instead of filtering the output by FDR, I’d like to get the p-values of all taxa. Therefore, I set the FDR threshold to 1.


However, the output always seems to omit several taxa. For example, I import 20 taxa for the analysis, but only 15 taxa have results in the output. I’ve tried several times with different taxa tables, but I never get the complete result.


Would you please kindly tell me how I can get p-values for all taxa?


Thanks a lot in advance!


Claire

clair...@gmail.com

Mar 25, 2019, 8:56:33 PM
to MaAsLin-users
PS: No taxa are filtered out by the prevalence and abundance thresholds.
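
A quick way to pin down exactly which taxa went missing is to compare the input feature names against the rows of the results table. The following is a minimal sketch, not part of MaAsLin: the helper `missing_features`, the column name `Feature`, and the toy taxa are all assumptions made for illustration, so adjust them to match your actual output file.

```python
import csv
import io


def missing_features(input_features, results_tsv, feature_column="Feature"):
    """Return the input features that have no row in a tab-separated results table.

    `feature_column` is an assumed column name; change it to match your file.
    """
    reader = csv.DictReader(results_tsv, delimiter="\t")
    reported = {row[feature_column] for row in reader}
    # Preserve the input order so the report is easy to compare by eye.
    return [name for name in input_features if name not in reported]


# Toy data standing in for a real run (hypothetical taxa and values).
input_features = ["Bacteroides", "Prevotella", "Roseburia", "Akkermansia"]
results = io.StringIO(
    "Variable\tFeature\tCoefficient\tP.value\n"
    "Group\tBacteroides\t0.12\t0.40\n"
    "Group\tRoseburia\t-0.30\t0.03\n"
)

dropped = missing_features(input_features, results)
print(dropped)
```

With a real run, replace the `io.StringIO` object with `open("your_results_file.tsv", newline="")`. Knowing the exact names of the dropped taxa makes it easier to check whether they share something, such as many zeros or samples with missing metadata.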

Vadim Dubinsky

Apr 2, 2019, 10:34:50 AM
to MaAsLin-users
Hello Claire,
I ran into the same problem as you, with fewer features/taxa appearing in the result files than expected.

Then I removed all the default filters (dMinSamp=0, dMinAbd=0). This time the results file had more features than in the first run, but still fewer than there should be.
My metadata had 2 predictors: one categorical with 5 levels (different groups) and one continuous variable, for which some samples have missing data (NA).
Once I removed this variable and kept only the grouping predictor (with the filters still removed), all the features appeared in the results file.
I think it has something to do with how the model handles NAs, although I am not sure at the moment.

Hope this helps
Vadim
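
To test the NA hypothesis above, it helps to know how many samples have missing values in each metadata column before running the model. This is a minimal sketch under stated assumptions: the helper `count_missing`, the column names `Sample`, `Group`, and `BMI`, and the set of strings treated as missing are all hypothetical. It says nothing about how MaAsLin itself treats NAs; it only reports where they are.

```python
import csv
import io

# Strings treated as missing; an assumption, extend as needed for your files.
NA_TOKENS = {"", "NA", "NaN", "nan"}


def count_missing(metadata_tsv, sample_column="Sample"):
    """Count missing values per metadata column and list the incomplete samples."""
    reader = csv.DictReader(metadata_tsv, delimiter="\t")
    missing_per_column = {}
    incomplete_samples = []
    for row in reader:
        holes = [
            col
            for col, val in row.items()
            if col is not None
            and col != sample_column
            and (val or "").strip() in NA_TOKENS
        ]
        for col in holes:
            missing_per_column[col] = missing_per_column.get(col, 0) + 1
        if holes:
            incomplete_samples.append(row[sample_column])
    return missing_per_column, incomplete_samples


# Toy metadata: one grouping variable and one continuous variable with gaps.
metadata = io.StringIO(
    "Sample\tGroup\tBMI\n"
    "S1\tA\t22.5\n"
    "S2\tB\tNA\n"
    "S3\tC\t\n"
    "S4\tA\t30.1\n"
)

missing_per_column, incomplete_samples = count_missing(metadata)
print(missing_per_column)
print(incomplete_samples)
```

If the taxa that disappear are mostly non-zero only in the incomplete samples, that would support the idea that missing metadata is involved.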

Jeremy Wilkinson

Apr 2, 2019, 10:43:15 AM
to MaAsLin-users
Turning the feature boosting off along with the significance threshold set to 1 and the feature filters set to 0 should return all features. Those arguments to the Maaslin function in R are:

strModelSelection="none",
dSignificanceLevel=1,
dMinAbd=0,
dMinSamp=0



-Jeremy

Dani Dubinsky

Apr 2, 2019, 11:23:44 AM
to Jeremy Wilkinson, MaAsLin-users
Thanks Jeremy!

Do you know how, or if, turning the feature boosting off will affect the results?


--
You received this message because you are subscribed to the Google Groups "MaAsLin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to maaslin-user...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jeremy Wilkinson

Apr 3, 2019, 9:21:08 AM
to MaAsLin-users
No problem. It is my understanding that the boosting reduces the number of features by selecting those that have a potential association with the metadata before performing the linear model. It's not mandatory, but it is the suggested default. However, if you are interested in "forcing" all features into the model, then in most cases you will need to turn it off, along with setting the feature filters to 0 and the FDR significance level to 1. MaAsLin2, which is available for use now but has not been submitted to a journal yet (it is near submission), does not have a feature boosting step.

-Jeremy
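
To illustrate why a pre-selection step can leave some features without any output row, here is a toy sketch of the two-stage idea described above. It is not MaAsLin's algorithm: a simple Pearson correlation threshold stands in for the boosting step, and the function `screen_then_model`, the threshold, and the toy data are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_then_model(features, metadata, threshold, use_screen=True):
    """Map each feature to the metadata variables that would reach the model.

    With screening on, a feature with no metadata above the threshold is
    omitted entirely, so it would have no row in the results.
    """
    tested = {}
    for name, abundances in features.items():
        if use_screen:
            kept = [
                var
                for var, values in metadata.items()
                if abs(pearson(abundances, values)) >= threshold
            ]
        else:
            kept = list(metadata)  # no screening: every variable is passed on
        if kept:
            tested[name] = kept
    return tested


# Toy data: taxon_A tracks age, taxon_B does not.
metadata = {"age": [20, 30, 40, 50, 60]}
features = {
    "taxon_A": [0.1, 0.2, 0.3, 0.4, 0.5],
    "taxon_B": [0.3, 0.1, 0.3, 0.1, 0.3],
}

with_screen = screen_then_model(features, metadata, threshold=0.3)
without_screen = screen_then_model(features, metadata, threshold=0.3, use_screen=False)
print(sorted(with_screen))
print(sorted(without_screen))
```

In this toy setup only taxon_A survives the screen, while both taxa are kept once screening is off, which mirrors the effect of strModelSelection="none" as described in this thread.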

Vadim Dubinsky

Apr 5, 2019, 1:54:03 PM
to MaAsLin-users
Thanks again Jeremy for the explanation.

For now I prefer to work with MaAsLin1 rather than MaAsLin2, which is more of a "black box" at the moment. But I will definitely try it once the paper is released. I also got several errors when I tried MaAsLin2 on the same data.

Claire

Sep 30, 2019, 2:08:44 AM
to MaAsLin-users
When I remove all the covariates and use just one metadata variable, I get p-values for all the taxa. However, when I include the covariates, some taxa are always missing. I think it might be due to the boosting of metadata for these taxa.

Will the MaAsLin team improve this? Sometimes we really want to show p-values for all taxa rather than just the significant ones.

Claire

Claire

Sep 30, 2019, 2:55:37 AM
to MaAsLin-users
To add on, what is quite ridiculous is that if I re-run just the taxa that had no output, some of them give a result but others still don't. I also checked the prevalence: the taxa that give no result are not rare. It's really strange. It seems like a black box.

Claire
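
For anyone who wants to repeat the prevalence check mentioned above, this is a minimal sketch. The helper `prevalence`, the `min_abundance` cutoff, and the toy abundance table are assumptions for illustration; the cutoff is only analogous in spirit to dMinAbd and dMinSamp, and the exact comparison MaAsLin applies is not taken from its source.

```python
def prevalence(abundances, min_abundance=0.0):
    """Fraction of samples in which the abundance exceeds `min_abundance`."""
    if not abundances:
        return 0.0
    present = sum(1 for value in abundances if value > min_abundance)
    return present / len(abundances)


# Toy relative-abundance table: one list of per-sample values per taxon.
table = {
    "taxon_A": [0.0, 0.2, 0.1, 0.0],
    "taxon_B": [0.0, 0.0, 0.0, 0.05],
}

report = {name: prevalence(values) for name, values in table.items()}
for name, fraction in report.items():
    print(f"{name}\t{fraction:.2f}")
```

Comparing this report for the taxa with and without output shows whether low prevalence can be ruled out as the cause.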