Fwd: LefSe Analysis


Curtis Huttenhower

Dec 3, 2014, 1:27:34 PM
to lefse...@googlegroups.com, do79h...@gmail.com
Thanks for getting in touch with us, and apologies that the response has taken so long. I'm forwarding this along to the LEfSe support list so folks can take a look.

Many thanks -
Curtis

---------- Forwarded message ----------
From: Harris Onywera <do79h...@gmail.com>
Date: Tue, Nov 18, 2014 at 5:32 AM
Subject: LefSe Analysis
To: "Huttenhower, Curtis" <chut...@hsph.harvard.edu>


Dear Curtis Huttenhower,

I am a postgraduate student at the University of Cape Town in South Africa. I am working in a field that I am not very familiar with, but I am trying my best to understand it. I recently came across one of your papers, titled "Metagenomic biomarker discovery and explanation". I'd like to ask three questions:

1) Is the LDA score indirectly indicative of statistical significance? That is, does a higher score mean we can be more confident that a differential feature is real?

2) I worked through a tutorial using LEfSe in Galaxy and got all the expected files. What program is required to view the file(s) produced in the "Plot Differential Features" step? This is step (F) in Galaxy (LEfSe).

3) I went ahead and tried a dataset of my own. I got an LDA histogram (see the first attachment). I understand that the bacteria shown in the LDA histogram can be used as biomarkers for the disease under test (either +ve or -ve). I then generated the "One Feature" plot (in this case for Aerococcus) using Galaxy LEfSe. The subclass was whether the condition under test was high-risk or not (none). The histograms generated were in two colours, red and green (see the second attachment), and I don't understand why. I have searched for an explanation without success.

I would be pleased to hear from you.

-- 
D. O. Harris

Plot_LEfSe_Results_LDA_Pilot.png
Plot_One_Feature_Aerococcaceae_Pilot.png

Nicola Segata

Dec 5, 2014, 3:18:51 AM
to lefse...@googlegroups.com, do79h...@gmail.com
Hi Harris,
thanks for getting in touch.

1. The LDA scores estimate the size of the effect, assuming that the effect is statistically significant (the features tested with LDA have already passed the statistical test). They do not have the meaning of a p-value.
2. The "Plot Differential Features" step produces a zipped archive, because it usually contains many images. You need to download and decompress the archive; the images inside can then be opened with any image viewer.
3. The colors in the barplot and the histogram (the two figures you attached) are unrelated. So the red in one plot does not link to the red in the other plot.
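
For point 2, any unzip tool will do. If it is more convenient to script it, here is a minimal sketch in Python using only the standard library. The file name `lefse_features.zip`, the output folder name, and the helper `extract_plots` are placeholders made up for illustration, not names that Galaxy or LEfSe produce.

```python
import zipfile
from pathlib import Path


def extract_plots(archive_path, out_dir):
    """Unpack every entry of a zip archive into out_dir and return the extracted paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(out)
        names = zf.namelist()
    return [out / name for name in names]


if __name__ == "__main__":
    # Placeholder file name: use whatever name the Galaxy download was saved under.
    archive = Path("lefse_features.zip")
    if archive.exists():
        for path in extract_plots(archive, "lefse_features"):
            print(path)
```

Each printed path should point to one of the per-feature images, which can then be opened in an ordinary image viewer.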

I hope this helps
thanks
Nicola