Relative abundance plots and normalization methods


lina....@gmail.com

May 20, 2016, 12:06:11 PM
to LEfSe-users
Hi all,

I started from raw OTU counts and used the -o 1000000 option of format_input.py to normalize the data.
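
For readers following along, here is a minimal sketch of what that kind of normalization does, assuming -o rescales each sample so that its counts sum to the given value. The function name and the toy counts are made up for illustration; this is not the code from format_input.py.

```python
def normalize_sample(counts, target=1_000_000):
    """Rescale one sample's raw counts so they sum to `target`.

    Assumption: this mirrors the effect of a per-sample sum normalization;
    it is an illustration, not LEfSe's own implementation.
    """
    total = sum(counts)
    if total == 0:
        # An empty sample cannot be rescaled; keep it at zero.
        return [0.0 for _ in counts]
    return [c * target / total for c in counts]


# Toy raw OTU counts (made-up numbers), one list per sample.
raw = {"sample_A": [120, 30, 50], "sample_B": [10, 5, 85]}
normalized = {name: normalize_sample(c) for name, c in raw.items()}

for name, values in normalized.items():
    print(name, values, "sum =", sum(values))
```

After this step every sample has the same total, so values are comparable across samples even when sequencing depth differs.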

I then ran lefse and plotted bar charts using the plot_features.py script.

For one of my organisms, I got the following barchart:


Note that there are four samples in the "class: omdp" section that seem to be equally highly abundant.

However, when I plot the same organism using the normalized data produced by format_input.py with the --output_table option, these samples look a bit different:


(Screen shot from Excel)


I realize there must be another normalization step somewhere in there (because the y-axis values are changing), but is that enough to account for the differences in abundance in the samples on the right?


In general, is it better to normalize first and then not normalize again in format_input.py?


Thanks for any insight you might have!

~Lina


Nicola Segata

May 20, 2016, 3:45:59 PM
to lina....@gmail.com, LEfSe-users
Hi Lina,
In the LEfSe bar plot, we set the limits of the y-axis to highlight the differences between samples. In some cases a single very high value can make the other bars almost invisible, so we cap the y-axis at a value that is sometimes below the maximum. So in your case we do not modify the feature values used in the computation; the tallest bars are just cut off visually...
cheers
Nicola
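
A small sketch of the effect described above: when the y-axis stops below the largest values, every bar above the limit is drawn at the same height, while the underlying numbers stay unchanged. The values and the axis limit here are hypothetical, and the thread does not say how plot_features.py picks its limit, so this only illustrates the visual clipping.

```python
def clipped_heights(values, y_max):
    """Return the bar heights as they appear when the y-axis ends at y_max."""
    return [min(v, y_max) for v in values]


# Made-up abundances for six samples; the first four differ a lot.
values = [900.0, 1500.0, 4000.0, 12000.0, 50.0, 80.0]
y_max = 800.0  # hypothetical axis limit, chosen below the maximum value

drawn = clipped_heights(values, y_max)

print("actual values:", values)
print("drawn heights:", drawn)
```

In this toy case the first four bars all reach the top of the plot and look equally abundant, even though their actual values range from 900 to 12000.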

lina....@gmail.com

May 21, 2016, 7:47:35 AM
to LEfSe-users, lina....@gmail.com, nicola...@unitn.it
That makes sense, thanks so much for clarifying!

Best,
~Lina