Mapping functions to genes


shahrokh....@gmail.com

Aug 30, 2017, 11:34:13 AM
to Anvi'o
Hi Anvi'o team!

I've exported my gene coverage and detection values using the anvi-export-gene-coverage-and-detection script. I am trying to map these genes to COG functions, but I am not sure whether the "key" column in this file corresponds to "entry_id" or "gene_caller_id" in my gene_functions file exported from the contigs database.

Thanks,
Sharok

A. Murat Eren

Aug 30, 2017, 12:30:41 PM
to Anvi'o
​Hey Sharok,

The key column in the output of anvi-export-gene-coverage-and-detection corresponds to the `gene_caller_id` column in your gene_functions file.
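
For anyone who does want to join the two exports by hand, here is a minimal sketch using awk. It assumes both files are TAB-delimited with a header row, that the first column of the coverage file is the "key" column, and that the functions file has columns named `gene_caller_id`, `source`, and `function`. The function name `add_functions_to_coverage` and the source label `COG_FUNCTION` used in the usage line are assumptions for illustration, so check the headers and source names in your own files first.

```shell
# Append a "function" column to a gene coverage table by matching its first
# ("key") column against the gene caller id column of a functions table.
# Usage: add_functions_to_coverage COVERAGE.txt FUNCTIONS.txt SOURCE [ID_COLUMN]
add_functions_to_coverage() {
    coverage_file=$1
    functions_file=$2
    source_name=$3
    id_column=${4:-gene_caller_id}   # assumed header name; adjust to your file

    awk -F '\t' -v OFS='\t' -v src="$source_name" -v idcol="$id_column" '
        # First file (functions): locate columns by header name, then remember
        # the function text of every gene annotated by the requested source.
        # If a gene has several hits from that source, the last one wins.
        NR == FNR {
            if (FNR == 1) {
                for (i = 1; i <= NF; i++) col[$i] = i
                if (!(idcol in col) || !("source" in col) || !("function" in col)) {
                    print "missing expected column in functions file" > "/dev/stderr"
                    exit 1
                }
                next
            }
            if ($(col["source"]) == src) func_of[$(col[idcol])] = $(col["function"])
            next
        }
        # Second file (coverage): copy each row and append the function, or
        # "NA" when the gene has no annotation from that source.
        FNR == 1 { print $0, "function"; next }
        { print $0, (($1 in func_of) ? func_of[$1] : "NA") }
    ' "$functions_file" "$coverage_file"
}

# Example (file names and source label are placeholders):
#   add_functions_to_coverage GENE-COVERAGES.txt gene_functions.txt COG_FUNCTION > merged.txt
```

Looking the columns up by header name rather than by position means the same sketch keeps working if the functions file has an extra leading column such as entry_id.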

But I wonder why you don't do it this way: you could run anvi-run-ncbi-cogs on your contigs database, and then run anvi-summarize on a collection with the --init-gene-coverages flag. In your results you would find a file for gene calls with everything together (gene caller id, function, coverage across samples, gene sequence, etc).


Best, 

--

A. Murat Eren (meren)
http://merenlab.org :: twitter :: gpg

--
Anvi'o Paper: https://peerj.com/articles/1319/
Project Page: http://merenlab.org/projects/anvio/
Code Repository: https://github.com/meren/anvio
---
You received this message because you are subscribed to the Google Groups "Anvi'o" group.
To unsubscribe from this group and stop receiving emails from it, send an email to anvio+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/anvio/2050cef0-738b-4e79-91e9-4d392adddf91%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

shahrokh....@gmail.com

Aug 30, 2017, 2:19:37 PM
to Anvi'o

Hi Meren,
Thank you for your quick response.
Yes! I looked into that approach and it's great for looking at gene calls per bin. I am also interested in looking at bacterial engraftment for my FMT study at the gene level, independent of binning approaches (given all the concerns with the binning process). So the only way I found to look at all the genes (independent of bins) across all samples was to export those files from anvi'o separately and then map them together. Am I missing something, or does that make sense?

Thanks,
Sharok 




Mike Lee

Aug 30, 2017, 2:28:24 PM
to Anvi'o
Hi there, Sharok, 

I find myself doing the same thing with every dataset I run through, and I've been meaning to ask about (or maybe help add to that script) the scenario where, if you *don't* provide a bin or collection, it just gives you everything. In the meantime, you can make a single 'bin' that contains all of your contigs and then run anvi-summarize :)

hope that helps!
-mike 


shahrokh....@gmail.com

Aug 30, 2017, 2:52:56 PM
to Anvi'o
Hi Mike,
Nice! That makes sense and is probably faster than writing code to do so! :)
Thanks,
Sharok

A. Murat Eren

Aug 30, 2017, 2:59:03 PM
to Anvi'o
To summarize data in a profile database you need a collection. You could generate a collection called DEFAULT, with a single bin that describes all your contigs, and do your summary by running anvi-summarize. That would give you all genes and their coverages across all samples :)

The easiest way to do it is to use the program anvi-import-collection. Assuming you have a merged profile database, here is how you can do it in multiple steps.

Get a collection of all your splits:

for split in `sqlite3 PATH/TO/PROFILE.db 'select contig from mean_coverage_contigs;'`; do echo $split | awk '{print $1 "\tALL"}'; done > default-collection.txt
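
Before importing, it can be worth checking that the file has the shape the command above is meant to produce: one split name per line, a TAB, then the bin name, with no split listed twice. The helper below is a small sketch of such a check; `check_collection_file` is a hypothetical name, not an anvi'o program.

```shell
# Check that a collection file has exactly two TAB-separated, non-empty
# fields per line and that no split name is listed twice.
# Prints a short report; returns non-zero if any line looks wrong.
check_collection_file() {
    awk -F '\t' '
        NF != 2 || $1 == "" || $2 == "" { printf "line %d: expected SPLIT<TAB>BIN\n", FNR; bad++ }
        seen[$1]++                       { printf "line %d: duplicate split %s\n", FNR, $1; bad++ }
        END {
            printf "%d lines checked, %d problems\n", NR, bad
            exit (bad > 0)
        }
    ' "$1"
}

# Example:
#   check_collection_file default-collection.txt
```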

Then you can import this into your profile database as a collection:

anvi-import-collection default-collection.txt -p PATH/TO/PROFILE.db -c PATH/TO/CONTIGS.db -C DEFAULT

Summarize it to get sweet sweet data:

anvi-summarize -c PATH/TO/CONTIGS.db -p PATH/TO/PROFILE.db -C DEFAULT -o summary --init-gene-coverages

We can probably put together a script for the first step to make it easier to add a default collection with everything.
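
As a rough idea of what such a helper could look like, the sketch below turns a list of split names on standard input into the two-column collection file in a single awk pass, instead of looping over the names one at a time. `make_default_collection` is a hypothetical name, and the sqlite3 query in the usage comment is the same one used in the first step, so it carries the same assumption about the `mean_coverage_contigs` table.

```shell
# Read split names from standard input (one per line) and print
# "SPLIT<TAB>BIN" for each, skipping blank lines.
# Usage: make_default_collection [BIN_NAME] < split-names.txt
make_default_collection() {
    bin_name=${1:-ALL}
    awk -v bin="$bin_name" 'NF > 0 { print $1 "\t" bin }'
}

# Example, reusing the query from the first step:
#   sqlite3 PATH/TO/PROFILE.db 'select contig from mean_coverage_contigs;' \
#       | make_default_collection ALL > default-collection.txt
```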


​Best,​
--

A. Murat Eren (meren)
http://merenlab.org :: twitter :: gpg


shahrokh....@gmail.com

Aug 30, 2017, 3:54:03 PM
to Anvi'o
Thanks Meren, this is very helpful!
Sharok