Format Data for LEfSe


Wennie Zhou

Jul 15, 2013, 1:28:27 PM
to lefse...@googlegroups.com
Dear Nicola or Prof. Huttenhower,

Sorry for a silly question. When I run the "A) Format Data for LEfSe" step on the LEfSe Galaxy server with my uploaded data, I get the following error message:

Traceback (most recent call last):
  File "/home/usr/local/galaxy-dist/tools/lefse/format_input.py", line 268, in <module>
    feats = numerical_values(feats,params['norm_v'])
  File "/home/usr/local/galaxy-dist/tools/lefse/format_input.py", line 127, in numerical_values
    for i in range(len(feats.values()[0])): 
IndexError: list index out of range
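
For readers decoding the traceback: in Python 2, `dict.values()` returns a list, so an `IndexError` on `feats.values()[0]` means `feats` held no feature rows at all when `numerical_values` was called. Below is a minimal sketch that reproduces the same failure in Python 3 and shows one possible guard. The helper `first_feature_length` is hypothetical and is not part of `format_input.py`; the assumed structure of `feats` (feature name mapped to a list of per-sample values) is also a guess based on the traceback.

```python
def first_feature_length(feats):
    """Return the number of samples in the first feature row.

    `feats` is assumed to map feature name -> list of per-sample values.
    """
    rows = list(feats.values())  # Python 2's dict.values() was already a list
    if not rows:
        # Fail with a readable message instead of an IndexError.
        raise ValueError("no feature rows were parsed from the input table")
    return len(rows[0])


# Indexing the values of an empty dictionary triggers the error in the traceback.
try:
    list({}.values())[0]
except IndexError as err:
    print("IndexError:", err)
```

A guard of this kind would turn the crash into a message that points at the input table rather than at the normalization code.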

This error seems to be caused by the "Per-sample normalization" setting used when formatting the data. In my case, I had already normalized each sample so that its values sum to 1M, so I chose "Yes" for this setting. Here are the first few lines of my input file:

NAME     S1       S2       Stool1   Stool2   S3       S4
ko00564  4364.73  5393.18  3783.45  3963.85  4921.46  5966.87
ko00680  4542.29  3870.01  5717.12  5803.42  5357.55  4521.2
ko00563  0        17.6     0        0        0        260.132

Again, I normalized those values so that all values per column (therefore per sample) added up to 1,000,000.
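
The per-column normalization described here can be sketched roughly as follows. This is only an illustration of the arithmetic, not LEfSe's own code: `normalize_per_sample` is a hypothetical helper, and the small table is made-up data rather than the poster's file.

```python
def normalize_per_sample(table, target=1_000_000.0):
    """Scale every column (sample) of `table` so that it sums to `target`.

    `table` maps feature name -> list of values, one per sample.
    Columns that sum to zero are left unchanged to avoid division by zero.
    """
    n_samples = len(next(iter(table.values())))
    col_sums = [sum(row[j] for row in table.values()) for j in range(n_samples)]
    return {
        name: [v * target / s if s else v for v, s in zip(row, col_sums)]
        for name, row in table.items()
    }


# Hypothetical two-sample, three-feature table (not the poster's data).
raw = {"ko00564": [2.0, 1.0], "ko00680": [6.0, 3.0], "ko00563": [0.0, 4.0]}
scaled = normalize_per_sample(raw)
column_sums = [sum(row[j] for row in scaled.values()) for j in range(2)]
```

After scaling, every entry of `column_sums` should equal the target, which is a quick way to confirm that a table really is normalized per sample before uploading it.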

Could you please give me a hint about what I might have done wrong here?

Thanks a lot,

Wennie 

Nicola Segata

Jul 15, 2013, 5:01:01 PM
to lefse...@googlegroups.com
Hi Wennie,
 thanks for getting in touch.

The problem might be related to the choice of the class, subclass and subject. What are you specifying for these three options in step "A)"?

If you selected your first line ("NAME") as class, I'm afraid LEfSe cannot be useful because you would have only one sample for each class, and no statistical significance can be achieved.

Let us know...
thanks
Nicola

Wennie Zhou

Jul 15, 2013, 6:44:12 PM
to lefse...@googlegroups.com
Hi Nicola,

Thanks for getting back so quickly. Yes, as you imagined, I did select the first line ("NAME") as class, and left the subclass and subject choices as "no subclass" and "no subject", respectively.

Why can't LEfSe be useful in this case if only the class is specified? It should work, according to a previous post where Prof. Huttenhower noted: "without subclasses, there's no Wilcoxon test involved, and only the K-W test is used to compare the top-level classes and build the LDA". Or did I misunderstand something here?

Thanks for your advice,

Nicola Segata

Jul 16, 2013, 2:02:46 AM
to lefse...@googlegroups.com
Hi Wennie,
 it is perfectly fine to specify the class only (i.e. no subclasses and no subjects). What is far from being optimal is having only one sample per class. The statistical tests cannot have enough power to declare a feature to be significant in this setting.
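
To make the point about one sample per class concrete, here is a small sketch that counts how many samples fall under each class label when a given row is used as the class row. The helper `samples_per_class` is hypothetical and not part of LEfSe; the row is the header line from the original post.

```python
from collections import Counter


def samples_per_class(class_row):
    """Count samples per class label.

    `class_row` is the whitespace-separated row chosen as the class row;
    its first field is the row name and the rest are per-sample labels.
    """
    labels = class_row.split()[1:]
    return Counter(labels)


# The header row from the original post, used as the class row.
counts = samples_per_class("NAME S1 S2 Stool1 Stool2 S3 S4")
singletons = sorted(label for label, n in counts.items() if n < 2)
print(singletons)
```

Because every sample name is unique, each of the six labels ends up as its own class with a single sample, which is the situation described above. A dedicated class row with repeated labels, one per group, would give each class several samples.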

This does not excuse us for generating an error rather than just reporting no biomarkers. If you can send me your dataset (in private) I can take a look at the formatting and dig a bit more into this technical issue. Even a small fraction of the dataset that generates the error is fine.

thanks
Nicola 

auberi courchay

May 16, 2016, 3:10:11 PM
to LEfSe-users
Hi

Can someone please help me? I am having the same issue: the same error pops up as soon as I press the button to format data. It doesn't even let me get to the page where I can pick the class and subclass; it goes straight to that error. I have been using LEfSe this whole time and it has worked fine, and now all of a sudden this error pops up. All I do is upload my data table, which works fine, and then press format data, and it gives me the error. I have reloaded, re-uploaded and changed my datasheet, and nothing helps. Please can someone help me? I really need the results to write up as soon as possible.

George Weingart

May 16, 2016, 9:21:27 PM
to auberi courchay, LEfSe-users
Hi Wennie,

Can you send me your whole input file so I'll look at it?
Thanks!

George Weingart PhD
Huttenhower Lab

Bo Sun

Nov 29, 2016, 4:37:20 PM
to LEfSe-users
Dear all,

I have the same issue as well. I have attached the file. May I know how I should fill in the three options "class", "subclass" and "subject" in my case? I intend to compare the WT and Homogeneous groups.
I look forward to hearing from you soon.

Thank you.
Regards,
Bo

On Monday, July 15, 2013 at 10:28:27 AM UTC-7, Wennie Zhou wrote:
Mapping_File_Female_LF_L3.txt