all-against-all vs one-against-all


Bettina Halwachs

15 Jul 2015, 07:52:47
to lefse...@googlegroups.com

Hi,


I wonder whether I have correctly understood the difference between the stricter all-against-all and the one-against-all feature detection modes of LEfSe. Imagine 3 groups, no subgroups: A, B, and C. For all-against-all, I would say a feature has to be differentially abundant between all 3 groups. E.g., feature X is detected if it is differentially abundant in A compared to B, in A compared to C, AND in B compared to C.

In contrast, for one-against-all: feature X is detected if it is differentially abundant in A compared to B, and in A compared to C, BUT NOT in B compared to C.

 

I'm not sure whether this is really correct. Maybe somebody can help me clarify the difference.

 

Thank you in advance for your help.

 

Kind regards,

Bettina

Nicola Segata

23 Jul 2015, 15:45:36
to Bettina Halwachs, lefse...@googlegroups.com
Hi Bettina,
Sorry for the late reply. Yes, your interpretation of the two cases is correct. The only thing I would change is "BUT NOT" to "but it is not necessary that".
cheers
Nicola
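
The two rules confirmed above can be written as a small decision sketch. This only illustrates the logic as described in this thread (including the "not necessary" correction); it is not LEfSe's code. The function names and the `significant` lookup table are hypothetical.

```python
from itertools import combinations

def pair_key(a, b):
    """Order-independent key for a pair of class labels."""
    return frozenset((a, b))

def detected_all_against_all(classes, significant):
    """True if every pair of classes differs significantly."""
    return all(significant[pair_key(a, b)] for a, b in combinations(classes, 2))

def detected_one_against_all(classes, significant):
    """True if at least one class differs significantly from every other class.

    Differences among the remaining classes are not required.
    """
    return any(
        all(significant[pair_key(c, other)] for other in classes if other != c)
        for c in classes
    )

# Hypothetical outcome for one feature:
# A differs from B and from C, but B and C do not differ.
classes = ["A", "B", "C"]
significant = {
    pair_key("A", "B"): True,
    pair_key("A", "C"): True,
    pair_key("B", "C"): False,
}

print(detected_all_against_all(classes, significant))  # False
print(detected_one_against_all(classes, significant))  # True
```

With this hypothetical input the feature passes the one-against-all rule but fails the all-against-all rule, because the B vs C comparison is not significant.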

Bettina Halwachs

24 Jul 2015, 08:08:49
to LEfSe-users, nicola...@unitn.it
Hi Nicola,

Thanks a lot for your reply and for confirming my understanding of the difference between the two LEfSe modes. However, this now puts me in a bit of a bind, because I have the following scenario, which does not match the described behavior.

Briefly: I compared the features of three different groups (no subgroups), WT, KO, and BD, using the all-against-all mode (analysis performed with LEfSe on the Huttenhower Galaxy server). Firmicutes are reported as significantly different (p-value < 0.05), with an LDA score > 2.0, for WT compared to KO and BD. According to all-against-all, this should mean that Firmicutes are also significantly different between KO and BD, with the highest abundance in WT. BUT this is not supported by the pairwise LEfSe analysis of KO vs BD: Firmicutes are not detected as significantly different between KO and BD in the pairwise comparison. In addition, LEfSe's "plot one feature" analysis for Firmicutes showed very similar means/medians for KO and BD, which does not indicate a difference between these two groups. (I hope this was not too confusing.) I've attached the LDA feature analysis result for the comparison of all 3 groups, the pairwise comparison of KO and BD, and the result of "plot one feature" for Firmicutes.

I would appreciate any ideas that help us explain/understand these results. Thanks a lot.

Don’t hesitate to contact me in case you need more details on the analysis parameters or any other additional information which may help you.


Kind regards,

Bettina

firmicutes_abundance.gif
ldaResult_BDvsKO.txt
ldaResult_BDvsKOvsWT.txt

jfg

24 Jul 2015, 10:46:25
to LEfSe-users, nicola...@unitn.it, b.hal...@gmail.com
Bettina,

For (e.g.) 4 groups, One-Against-All could look like:
   - compare X --> A 
   - compare X --> B
   - compare X  --> C
In the first, One-Against-All comparison, X is significantly different from A (and so is in red!). X is not significantly different from B, or from C. You cannot say anything about the relationships of A:B, B:C or, C:A, because they have not been tested in the One-Against-All setting: Only X has been compared with other groups.


For the same group of samples, the All-Against-All might look like:
   - compare X --> A
   - compare X --> B
   - compare X --> C
   - compare A --> B
   - compare A --> C
   - compare B --> C
In this All-Against-All test, we see the same thing: X and A are significantly different (in red!), and because we are doing more thorough testing (more tests!) of all combinations, we now also see, in a totally unrelated effect (or taxon, in our case), that C is different from A and B (in blue!).
However, this does not mean that X and C are different, even though X and A are different, and A and C are different.
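
The two sets of comparisons listed in this post can be enumerated in a few lines. This is a minimal sketch of the description given here, not of LEfSe's internals. The group labels and the choice of X as the reference are taken from the example and are otherwise hypothetical.

```python
from itertools import combinations

groups = ["X", "A", "B", "C"]  # labels from the example in this post
reference = "X"                # hypothetical choice of reference group

# One-against-all as described here: the reference versus each other group.
one_vs_all = [(reference, g) for g in groups if g != reference]

# All-against-all: every unordered pair of groups.
all_vs_all = list(combinations(groups, 2))

print(len(one_vs_all), one_vs_all)
print(len(all_vs_all), all_vs_all)
```

For 4 groups this gives 3 comparisons in the first case and 6 in the second. Pairs such as A vs B appear only in the second list.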


In your test,
   - WT is different from KO (compare WT --> KO)
   - WT is different from BD (compare WT --> BD)

this does not mean that KO and BD differ from each other (i.e. compare KO --> BD shows no significant difference). 

This is what your results show you: in particular, the means and medians in KO and BD are very similar, while the mean and median for WT are different from them both.
Does this help?


jfg

Bettina Halwachs

28 Jul 2015, 07:01:53
to LEfSe-users, nicola...@unitn.it, fitzg...@gmail.com
Hi jfg,

Thank you very much for this very detailed explanation and for the very helpful examples. This helps me a lot.

Kind regards,
Bettina

Lokesh J

27 Oct 2015, 12:03:43
to LEfSe-users, nicola...@unitn.it, b.hal...@gmail.com
Hi

Thank you for the nice explanation. I have a few questions, though:

1. How do I select a particular group to be compared to the rest? In the example you gave, how do you select X as a sort of reference group when using LEfSe on Galaxy?

2. Why is the second scenario considered strict when it gives the same result as the first one (i.e. compare X --> A)? I was expecting the second one to be strict because a feature would be significant only if it is significantly different in all the comparisons involving X, like:
   - compare X --> A
   - compare X --> B
   - compare X --> C

3. In the second scenario, how could I know that a feature is significantly abundant in A only when compared to C and not when compared to B?

jfg

30 Oct 2015, 10:38:01
to LEfSe-users
Hey Lokesh, 

   It's been a while since I used LEfSe, but from what I can remember:

1. Formatting your dataframe is very important: it allows you to compare the classes / groups you are interested in, which can then be selected as your class or subject in LEfSe. Selecting a 'particular group' as a reference to compare all other groups against is one-against-all testing: go to B) LDA Effect Size (LEfSe) > Set the strategy for multi-class analysis: > One-Against-All. This will allow you to compare your reference to all the other 'classes' or groups of samples. In the first example above, 'X' is your reference, and you do 3 comparisons (X --> A, X --> B, X --> C).

2. All-against-all is 'more strict' (I think!) because we need to be more careful when doing multiple comparisons. Instead of the 3 comparisons in One-Against-All above, All-Against-All means: X-->A, X-->B, X-->C, A-->B, A-->C, B-->C (6 comparisons!). When performing multiple comparisons it is necessary to be 'more strict', as we increase our risk of 'Type-1' errors (false positives: thinking we see an effect when in fact it is just a coincidence, simply because we are looking so many times). 'More strict' can mean a stricter cutoff (e.g. dividing the significance threshold by the number of comparisons being made) or using a more complicated method. I'm not sure what method LEfSe uses, but for you it means you will get fewer biomarkers, and your results will be more statistically strict/sound.

3. In the second scenario above, C is simply different from A and B! If you are only interested in comparing A and C:
  • Only compare your A and C! Use a One-Against-All test, organising the subjects and classes in your original dataframe so that they appear in A) Format Data for LEfSe > Select which row to use as class: / Select which row to use as subject.
  • Or do an All-Against-All test, automatically comparing A-->B, A-->C, B-->C, etc.
If your samples are different, you should see it when compared either way.
Sorry my explanation is not so shiny; I have no data to add to LEfSe as an example. If this hasn't explained things, ask another question!

jfg
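
The "divide the significance threshold by the number of comparisons" idea mentioned in point 2 is the Bonferroni correction. Below is a minimal sketch of that arithmetic only. As the post says, it is not known here which method LEfSe actually uses, so this is not a description of LEfSe. The value of `alpha` and the helper name are illustrative assumptions.

```python
from math import comb

alpha = 0.05   # illustrative overall significance threshold (assumption)
n_groups = 4   # X, A, B, C from the example in this thread

n_one_vs_all = n_groups - 1       # reference versus each other group
n_all_vs_all = comb(n_groups, 2)  # every unordered pair of groups

def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison threshold under a Bonferroni correction."""
    return alpha / n_comparisons

print(n_one_vs_all, bonferroni_threshold(alpha, n_one_vs_all))
print(n_all_vs_all, bonferroni_threshold(alpha, n_all_vs_all))
```

With 6 comparisons instead of 3, each individual test has to clear a threshold half as large, which is one way a procedure with more comparisons can end up reporting fewer biomarkers.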

Samantha

28 Nov 2018, 15:34:45
to LEfSe-users
Hello, 

I know it's been a while since this was posted, but I have a follow-up. I understand the difference between the one-against-all and one-against-one options, but I have two questions:

1. Which class/group is being chosen as the "reference" you mention in one-against-all? I usually use LEfSe on the command line, but I went into the Galaxy portal to try your method above, and it didn't give me the option to choose the reference class. Is it therefore using the first class in the table (i.e. whatever class is in cell B1 of an Excel file)?

2. For the output of the one-against-one method, when a feature is marked as enriched in group A, how do you know which comparison group was used? For example, if the LDA plot shows feature y is enriched, which class(es) is y enriched in? Does it mean that y is enriched in one group only? Or will it show features enriched in multiple comparisons? For example, if feature y is enriched in A vs B and B vs C, will that come up as a result, or is it not considered a true result because it appears in two comparisons?

Hopefully these questions make sense!

Thanks,
Samantha