Dear Maaslin developers, I know that this tool is yet to be published so we don't know the details, but I have several questions regarding Maaslin2.
First, about the normalization methods:
The default method is total sum scaling (TSS), which uses the total read count of each sample as the size factor. As Chen et al. (2018) pointed out, this method is not robust against outliers. Have you heard of a normalization method called geometric mean of pairwise ratios (GMPR)? It's a method designed to deal with zero-inflated sequencing data ("GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data"), and I tested it with Maaslin2. Basically, I normalized my data with GMPR, used the normalized table as input to Maaslin2 with the option -n NONE and your default analysis method (LM), and got the following results:
With TSS:
Metadata Feature Value Coefficient N N.not.0 P.value Q.value
Diet ASV_5 Subproduto 0.0245427133803725 50 46 1.86993574395395e-07 0.00040689801788438
Diet ASV_2 Subproduto -0.0466961451264282 50 40 1.39957458857055e-05 0.0152273715236476
Diet ASV_23 Subproduto -0.00877410172948447 50 39 6.53707283282884e-05 0.0474155682807852
Diet ASV_8 Subproduto -0.0262306949247488 50 50 0.000275182526079829 0.132001412230832
Diet ASV_33 Subproduto 0.00244552188690525 50 24 0.000303312068545111 0.132001412230832
With GMPR:
Metadata Feature Value Coefficient N N.not.0 P.value Q.value
Diet ASV_2 Subproduto -3.62223543575035 50 40 1.93409873859054e-07 0.000420859885517302
Diet ASV_23 Subproduto -2.72329703799366 50 39 7.67810658667614e-07 0.000556918664420243
Diet ASV_11 Subproduto -2.61778756568838 50 43 5.69570052337066e-07 0.000556918664420243
Diet ASV_4 Subproduto 3.72618889047551 50 30 3.06715998779641e-05 0.0166853503336125
Diet ASV_1 Subproduto -1.92475097859038 50 48 5.02527043572405e-05 0.0218699769362711
Initial_Group ASV_46 pesado -2.24491333385346 50 24 0.00027654090198373 0.0935719787594336
Diet ASV_9 Subproduto 2.09798511350263 50 46 0.000301012799318031 0.0935719787594336
Diet ASV_8 Subproduto -0.820235417444514 50 50 0.000443179927597837 0.120544940306612
Diet ASV_36 Subproduto -2.4037258014847 50 21 0.000838418952622944 0.183856351445261
Diet ASV_13 Subproduto -2.41834253132568 50 23 0.000844928085685942 0.183856351445261
Diet ASV_32 Subproduto -2.34312634658199 50 16 0.00100566152536025 0.198938134471263
Diet ASV_66 Subproduto -1.44999612282456 50 16 0.00122897041530646 0.211252374027572
Diet ASV_47 Subproduto -1.5664764484848 50 9 0.00126207760218678 0.211252374027572
At first glance, the data normalized with GMPR gave more significant results than with TSS; however, 2 results found with TSS (ASV_5 and ASV_33) do not appear in the GMPR results. The coefficients were also larger in magnitude with GMPR, but in the same direction (positive or negative) as with TSS. My first question is: is it okay to use a different normalization method with the models available within your tool? Do you intend to add this specific normalization method in a future version of Maaslin2?
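In case it helps to see what I mean by the two normalizations, here is a minimal Python sketch of TSS and of GMPR size factors as I understand them from the paper. It is not the code from the GMPR package: the function names are my own, the input is assumed to be a samples x features count matrix, and the package's extra options (such as a minimum number of shared features per pair) are left out.

```python
import numpy as np

def tss_normalize(counts):
    """Total sum scaling: divide each sample (row) by its total read count."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def gmpr_size_factors(counts):
    """Simplified GMPR size factors (hypothetical helper, not the package API).

    For each pair of samples (i, j), take the median of the count ratios
    over the features that are nonzero in both samples; the size factor of
    sample i is the geometric mean of these medians over all j.
    """
    counts = np.asarray(counts, dtype=float)
    n_samples = counts.shape[0]
    factors = np.full(n_samples, np.nan)
    for i in range(n_samples):
        log_medians = []
        for j in range(n_samples):
            # Only features observed in both samples contribute to the ratio.
            shared = (counts[i] > 0) & (counts[j] > 0)
            if not shared.any():
                continue  # no shared features: skip this pair
            ratios = counts[i, shared] / counts[j, shared]
            log_medians.append(np.log(np.median(ratios)))
        if log_medians:
            # Geometric mean computed on the log scale.
            factors[i] = np.exp(np.mean(log_medians))
    return factors

def gmpr_normalize(counts):
    """Divide each sample by its GMPR size factor."""
    counts = np.asarray(counts, dtype=float)
    return counts / gmpr_size_factors(counts)[:, None]
```

The table returned by `gmpr_normalize` is the kind of table I passed to Maaslin2 with -n NONE, so that no second normalization is applied on top of it.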
Another question: how do I add covariates to the model?
Thank you.