Maaslin2 normalization method and the use of covariates


Bruno Gabriel Andrade

Apr 8, 2019, 12:17:36 PM
to MaAsLin-users
Dear Maaslin developers, I know this tool has not been published yet, so the details are not available, but I have several questions regarding Maaslin2.

First, about the normalization methods:

The default method is total sum scaling (TSS), which uses the total read count of each sample as the size factor. This method is not robust against outliers, as Chen et al. (2018) pointed out in "GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data". Have you heard of the normalization method they propose, geometric mean of pairwise ratios (GMPR)? It is designed to deal with zero-inflated sequencing data, and I tested it with Maaslin2. Basically, I normalized my data with GMPR, used the normalized table as input to Maaslin2 with the option -n NONE and your default analysis method (LM), and got the following results:

With TSS:
Metadata        Feature Value   Coefficient     N       N.not.0 P.value Q.value
Diet    ASV_5   Subproduto      0.0245427133803725      50      46      1.86993574395395e-07    0.00040689801788438
Diet    ASV_2   Subproduto      -0.0466961451264282     50      40      1.39957458857055e-05    0.0152273715236476
Diet    ASV_23  Subproduto      -0.00877410172948447    50      39      6.53707283282884e-05    0.0474155682807852
Diet    ASV_8   Subproduto      -0.0262306949247488     50      50      0.000275182526079829    0.132001412230832
Diet    ASV_33  Subproduto      0.00244552188690525     50      24      0.000303312068545111    0.132001412230832

With GMPR:
Metadata        Feature Value   Coefficient     N       N.not.0 P.value Q.value
Diet    ASV_2   Subproduto      -3.62223543575035       50      40      1.93409873859054e-07    0.000420859885517302
Diet    ASV_23  Subproduto      -2.72329703799366       50      39      7.67810658667614e-07    0.000556918664420243
Diet    ASV_11  Subproduto      -2.61778756568838       50      43      5.69570052337066e-07    0.000556918664420243
Diet    ASV_4   Subproduto      3.72618889047551        50      30      3.06715998779641e-05    0.0166853503336125
Diet    ASV_1   Subproduto      -1.92475097859038       50      48      5.02527043572405e-05    0.0218699769362711
Initial_Group   ASV_46  pesado  -2.24491333385346       50      24      0.00027654090198373     0.0935719787594336
Diet    ASV_9   Subproduto      2.09798511350263        50      46      0.000301012799318031    0.0935719787594336
Diet    ASV_8   Subproduto      -0.820235417444514      50      50      0.000443179927597837    0.120544940306612
Diet    ASV_36  Subproduto      -2.4037258014847        50      21      0.000838418952622944    0.183856351445261
Diet    ASV_13  Subproduto      -2.41834253132568       50      23      0.000844928085685942    0.183856351445261
Diet    ASV_32  Subproduto      -2.34312634658199       50      16      0.00100566152536025     0.198938134471263
Diet    ASV_66  Subproduto      -1.44999612282456       50      16      0.00122897041530646     0.211252374027572
Diet    ASV_47  Subproduto      -1.5664764484848        50      9       0.00126207760218678     0.211252374027572
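
For reference, the difference between the two normalizations can be sketched in a few lines of Python. This is only a rough illustration, not the code used by Maaslin2 or by the GMPR package: the function names `tss` and `gmpr_size_factors` are made up for this example, the input is assumed to be a list of samples (each a list of raw feature counts), and every sample is assumed to have at least one non-zero count. The GMPR part follows the description in Chen et al. (2018): for each pair of samples, take the median of the count ratios over features that are non-zero in both, then take the geometric mean of those medians.

```python
import math
from statistics import median


def tss(counts):
    """Total sum scaling: divide each count by its sample's total read count."""
    return [[c / sum(sample) for c in sample] for sample in counts]


def gmpr_size_factors(counts):
    """Sketch of GMPR size factors (one per sample).

    Assumes each sample has at least one non-zero count, so the
    comparison of a sample with itself always contributes a ratio of 1.
    """
    factors = []
    for sample_i in counts:
        log_medians = []
        for sample_j in counts:
            # Use only features observed (non-zero) in both samples.
            ratios = [a / b for a, b in zip(sample_i, sample_j) if a > 0 and b > 0]
            if ratios:
                log_medians.append(math.log(median(ratios)))
        # Geometric mean of the pairwise median ratios.
        factors.append(math.exp(sum(log_medians) / len(log_medians)))
    return factors


def gmpr_normalize(counts):
    """Divide each sample's counts by its GMPR size factor."""
    factors = gmpr_size_factors(counts)
    return [[c / f for c in sample] for sample, f in zip(counts, factors)]
```

With TSS every sample sums to 1, so the coefficients are on a proportion scale, while GMPR-normalized values stay on a count-like scale. That scale difference alone can change the magnitude of the coefficients between the two runs.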

At first glance, the data normalized with GMPR gave more significant results than the data normalized with TSS; however, two results from the TSS run (ASV_5 and ASV_33) do not appear in the GMPR results. The coefficients were also larger in magnitude with GMPR, but for the features reported in both tables they have the same sign (positive or negative) as with TSS. My first question is: is it okay to use a different normalization method with the models available in your tool? Do you intend to add this specific normalization method in a future version of Maaslin2?
Another question: how do I add covariates to the model?

Thank you.