Dear Curtis Huttenhower,
I am a postgraduate student at the University of Cape Town in South Africa. I am working in a field that I am not very familiar with, but I am doing my best to understand it. I recently came across
one of your papers, titled "Metagenomic biomarker discovery and explanation". I would like to ask three questions:
1) Is the LDA score indirectly indicative of statistical significance? That is, does a higher score mean we can be more confident that a feature is truly differential?
2) I worked through a tutorial using LEfSe in Galaxy and got all the expected files. What program is required to view the file(s) produced in the "Plot Differential Features" step? This is
step (F) of LEfSe in Galaxy.
3) I then tried a dataset of my own and got an LDA histogram (see the first attachment). I understand that the bacteria shown in the LDA histogram can be used as
biomarkers for the disease under test (either +ve or -ve). I generated the "One Feature" plot
(in this case for Aerococcus) using LEfSe in Galaxy. The subclass indicated whether the condition under test was high-risk or not (none). The
histograms generated were in two colours, red and green (see the second attachment). I don't quite understand why, and I would like to know what the colours mean. I have searched for
a possible explanation without success.
I would be pleased to hear from you.
--