Tool to group abundances of UniRef50 gene families (HUMAnN2) into Gene Ontology (GO) slim terms


beb...@gmail.com

Feb 23, 2016, 8:14:33 AM
to HUMAnN Users
Hello,

I recently developed a tool to group abundances of UniRef50 gene families generated by HUMAnN2 into Gene Ontology (GO) slim terms.

HUMAnN2 can already regroup UniRef50 gene family abundances into GO term abundances. However, these GO terms are still too specific to give a good overview of metabolic processes.

The Gene Ontology Consortium provides GO slims (http://geneontology.org/page/go-slim-and-subset-guide), which are cut-down versions of the GO ontologies that give a broad overview of the ontology content. Here, we can use the metagenomics GO slim, developed by Jane Lomax and the InterPro group.
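To make the idea of a slim concrete, here is a minimal sketch of mapping a specific term onto a slim by walking up the ontology graph. The term ids (`GO:A` to `GO:F`), the `PARENTS` table and the `SLIM_TERMS` set are made up for illustration and are not real GO identifiers; the function keeps every slim term found among a term's ancestors, which is a simplification.

```python
# Toy ontology: each term maps to the set of its direct parents.
# All ids below are hypothetical placeholders, not real GO identifiers.
PARENTS = {
    "GO:A": set(),              # root
    "GO:B": {"GO:A"},
    "GO:C": {"GO:A"},
    "GO:D": {"GO:B"},
    "GO:E": {"GO:B", "GO:C"},
    "GO:F": {"GO:E"},
}

# Hypothetical slim: a small subset of broad terms.
SLIM_TERMS = {"GO:B", "GO:C"}


def ancestors(term, parents):
    """Return every ancestor of `term`, excluding the term itself."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def map_to_slim(term, parents, slim_terms):
    """Return the slim terms that are `term` itself or one of its ancestors."""
    candidates = ancestors(term, parents) | {term}
    return candidates & slim_terms
```

With this toy graph, `GO:F` maps to both `GO:B` and `GO:C`, `GO:D` maps only to `GO:B`, and the root `GO:A` maps to nothing because no slim term sits at or above it.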

With group_humann2_uniref_abundances_to_go (https://github.com/ASaiM/group_humann2_uniref_abundances_to_GO), I propose a tool to regroup HUMAnN2 output containing UniRef50 gene family abundances into abundances of metagenomic GO slim terms. This tool uses GoaTools (https://github.com/tanghaibao/goatools) to map GO terms to GO slim terms, HUMAnN2 to regroup abundances of UniRef50 gene families into abundances of metagenomic GO slim terms, and custom Python scripts.

This tool can be used by itself (on the command line), but it also comes with a wrapper for Galaxy. This wrapper is currently available on the Galaxy Test ToolShed under the name group_humann2_uniref_abundances_to_go, with owner bebatut. It can then be installed on any Galaxy instance.


I would appreciate your feedback on this tool.

Thanks a lot

Bérénice

Eric Franzosa

Feb 24, 2016, 2:13:55 PM
to humann...@googlegroups.com
Hi Bérénice,

This is a really interesting idea! When we built the "Informative GO" terms for HUMAnN2 we tried to strike a balance between the specificity and number of terms, but it's entirely possible that they'll be too specific for some applications. I had looked at the metagenome GO slim previously but worried it might not be specific enough. That said, both options might be of use to HUMAnN2 users.

I think the best way for us to do this would be to bundle an additional UniRef50 to GO Slim mapping file with the existing regroup_table script. If you've built such a mapping file, could you send it to me (fran...@hsph.harvard.edu)? If so, we might be able to bundle it with a future HUMAnN2 release.

Thanks,
Eric


Yang song

Feb 25, 2016, 1:35:44 PM
to HUMAnN Users
I have a problem using this script.

I have humann2 and python2.7 installed properly, but when I try to run install_dependen,

I get this error:

Downloading/unpacking fisher (from -r ../requirements.txt (line 2))

  Downloading fisher-0.1.4.tar.gz (45kB): 45kB downloaded

  Running setup.py (path:/local/projects-t3/PGCET/MT-PGCET/500_humann2/group_humann2_uniref_abundances_to_GO-master/.venv/build/fisher/setup.py) egg_info for package fisher

    Traceback (most recent call last):

      File "<string>", line 17, in <module>

      File "/local/projects-t3/PGCET/MT-PGCET/500_humann2/group_humann2_uniref_abundances_to_GO-master/.venv/build/fisher/setup.py", line 2, in <module>

        import numpy as np

    ImportError: No module named numpy

    Complete output from command python setup.py egg_info:

    Traceback (most recent call last):


  File "<string>", line 17, in <module>


  File "/local/projects-t3/PGCET/MT-PGCET/500_humann2/group_humann2_uniref_abundances_to_GO-master/.venv/build/fisher/setup.py", line 2, in <module>


    import numpy as np


ImportError: No module named numpy


----------------------------------------

Cleaning up...

Command python setup.py egg_info failed with error code 1 in /local/projects-t3/PGCET/MT-PGCET/500_humann2/group_humann2_uniref_abundances_to_GO-master/.venv/build/fisher

Storing debug log for failure in /home/ysong/.pip/pip.log


Install goatools



I already have fisher in my Python 2.7 installation. What is the problem?

Thank you

yang

beb...@gmail.com

Feb 25, 2016, 3:09:32 PM
to HUMAnN Users
Hi Eric,

Thanks for the interest.

Actually, the tool does not use a direct mapping between UniRef50 and GO slim.
I download the UniRef50 to GO mapping used in HUMAnN2 and format it to get simple GO ids. Then I use the HUMAnN2 regroup method with the formatted mapping.
After that, I use Goatools to get a mapping between the HUMAnN2 GO terms and the metagenomic GO slim terms. To extract GO slim term abundances, I reapply the HUMAnN2 regroup method
to the previously generated GO abundances with the new mapping.
So there is no direct mapping between UniRef50 and GO slim at any point.
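The two regrouping steps described above can be sketched in plain Python. This is only an illustration under stated assumptions: the `regroup` helper, the `UNGROUPED` bucket, the UniRef50 ids, the GO ids and all abundance values are made up, and the summing rule (a feature adds its full abundance to every group it maps to) is an assumption of this sketch, not a statement about how HUMAnN2 regroups.

```python
from collections import defaultdict


def regroup(abundances, mapping):
    """Sum feature abundances into the groups each feature maps to.

    Assumptions of this sketch: a feature mapped to several groups adds its
    full abundance to each of them, and features without any mapping are
    collected under the hypothetical key "UNGROUPED".
    """
    grouped = defaultdict(float)
    for feature, value in abundances.items():
        groups = mapping.get(feature)
        if not groups:
            grouped["UNGROUPED"] += value
            continue
        for group in groups:
            grouped[group] += value
    return dict(grouped)


# Hypothetical input: UniRef50 gene family abundances.
uniref_abundances = {"UniRef50_X1": 10.0, "UniRef50_X2": 5.0, "UniRef50_X3": 2.0}

# Hypothetical mappings (placeholder ids, not real GO identifiers).
uniref_to_go = {"UniRef50_X1": ["GO:D"], "UniRef50_X2": ["GO:D", "GO:F"]}
go_to_slim = {"GO:D": ["GO:B"], "GO:F": ["GO:B", "GO:C"]}

# Step 1: UniRef50 -> GO.  Step 2: GO -> GO slim, applied to the step 1 output.
go_abundances = regroup(uniref_abundances, uniref_to_go)
slim_abundances = regroup(go_abundances, go_to_slim)
```

In this toy data, `UniRef50_X2` reaches `GO:B` through both `GO:D` and `GO:F`, so it is counted twice in `GO:B` (20.0 in total), whereas a single direct UniRef50 to GO slim mapping would count it once (15.0). Whether the real tool shows the same effect is not stated in this thread.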

I'm not sure if it is clear :)
I could post a diagram to sum it up if needed.

Bérénice

beb...@gmail.com

Feb 25, 2016, 3:23:25 PM
to HUMAnN Users
Hi Yang,

Thanks for your interest.

The issue seems to come from a missing numpy.
The `install_dependencies.sh` script creates a clean virtual environment (i.e. an environment without any dependencies, which does not interact with the installations on
your computer). This virtual environment must then be filled with the needed packages (fisher,
goatools and numpy).
In your case, it seems that numpy is not correctly installed in it, and fisher needs it.
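A quick way to see which packages the current interpreter can actually import, and from where, is a small check like the one below. It is only a sketch: the `check_imports` helper is hypothetical and not part of the tool. Run it with the Python of the virtual environment; a path outside `.venv` means the package comes from another installation.

```python
import importlib


def check_imports(module_names):
    """Return {module name: file it was loaded from, or None if not importable}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Built-in modules have no __file__ attribute.
            report[name] = getattr(module, "__file__", None) or "<built-in>"
    return report


if __name__ == "__main__":
    results = check_imports(["numpy", "fisher", "goatools"])
    for name in sorted(results):
        print("%-10s %s" % (name, results[name] or "NOT IMPORTABLE"))
```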

You can force installation of dependencies:

```
$ source .venv/bin/activate #to launch virtual environment
$ pip install numpy
$ pip install fisher
$ pip install goatools
$ git clone https://github.com/tanghaibao/goatools.git
$ hg clone https://bitbucket.org/biobakery/humann2
$ cd humann2
$ python setup.py install --bypass-dependencies-install --install-scripts=./humann2_scripts
$ cd ../
```

I hope it will work

Yang song

Feb 25, 2016, 4:45:39 PM
to HUMAnN Users, beb...@gmail.com
I have everything already.

(.venv)ysong@group_humann2_uniref_abundances_to_GO $ use python2.7

(.venv)ysong@group_humann2_uniref_abundances_to_GO $ pip install numpy 

Requirement already satisfied (use --upgrade to upgrade): numpy in /usr/local/packages/Python-2.7/lib/python2.7/site-packages

Cleaning up...

(.venv)ysong@group_humann2_uniref_abundances_to_GO $ pip install fisher 

Requirement already satisfied (use --upgrade to upgrade): fisher in /usr/local/packages/Python-2.7/lib/python2.7/site-packages

Cleaning up...

(.venv)ysong@group_humann2_uniref_abundances_to_GO $  pip install goatools 

Requirement already satisfied (use --upgrade to upgrade): goatools in /usr/local/packages/Python-2.7/lib/python2.7/site-packages/goatools-0.5.9-py2.7.egg

Requirement already satisfied (use --upgrade to upgrade): fisher in /usr/local/packages/Python-2.7/lib/python2.7/site-packages (from goatools)

Cleaning up...




However, when I run it as follows, I get these errors:

group_to_GO_abundances.sh -i ../../../../204_humann2_stigent_result_new/001_6_10_Stool1.humann2.strigent.new.outpu/001_6_10_Stool1.merge_genefamilies.tsv -m ../../../../001_6_10.MF -b ../../../../001_6_10.BP -c  ../../../../001_6_10.CC 

Format HUMAnN2 UniRef50 GO mapping

==================================


Map to slim GO

==============

Traceback (most recent call last):

  File "goatools/scripts//map_to_slim.py", line 9, in <module>

    from goatools.obo_parser import GODag

  File "goatools/scripts/../goatools/__init__.py", line 5, in <module>

    from goatools.go_enrichment import *

  File "goatools/scripts/../goatools/go_enrichment.py", line 18, in <module>

    import fisher

ImportError: No module named fisher


Format slim GO

==============


Regroup UniRef50 to GO

======================

./group_to_GO_abundances.sh: line 151: /usr/bin//humann2_regroup_table: No such file or directory


Regroup GO to slim GO

=====================

./group_to_GO_abundances.sh: line 160: /usr/bin//humann2_regroup_table: No such file or directory


Format slim GO abundance

========================

Traceback (most recent call last):

  File "./src/format_humann2_output.py", line 90, in <module>

    format_humann2_output(args, go_annotations)

  File "./src/format_humann2_output.py", line 51, in format_humann2_output

    with open(args.humann2_output, "r") as humann2_output:

IOError: [Errno 2] No such file or directory: 'tmp_data/humann2_slim_go_abundances.txt'




What is the problem?

Thank you

Yang

beb...@gmail.com

Feb 26, 2016, 5:01:25 AM
to HUMAnN Users, beb...@gmail.com
Hi,

Are you still in the virtualenv when you launch `group_to_GO_abundances.sh`?
You can activate it with

```
$ source .venv/bin/activate
```

Did you also install humann2 in the virtualenv?
Can you run

```
$ which humann2
```

and tell me what is printed on your terminal?
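Both questions can also be checked from Python. This is a minimal sketch, assuming the scripts are looked up through `PATH`; the helpers `in_virtualenv` and `find_executable` are hypothetical and not part of the tool. The `PATH` search is written by hand so that it also runs on Python 2.7.

```python
import os
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # virtualenv sets sys.real_prefix; the venv module makes
    # sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def find_executable(name, path=None):
    """Return the first executable called `name` found on `path`, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print("inside a virtualenv: %s" % in_virtualenv())
    for tool in ("humann2", "humann2_regroup_table"):
        print("%s -> %s" % (tool, find_executable(tool) or "not found on PATH"))
```

If `humann2_regroup_table` is reported as not found, or is found outside the virtualenv, that would be consistent with the `No such file or directory` errors in the previous message.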

Berenice