working with pangenomes and getting cluster summaries

kurtyak...@gmail.com

Mar 22, 2021, 5:13:42 PM
to Anvi'o
Hello, 

I am working with anvi'o v7. I went through the pangenome tutorial, and am now trying to address a couple of specific questions. One general question is how similar the gene content of my MAGs is. The "anvi-compute-functional-enrichment" program, in addition to its primary function, gives a nice matrix of the annotated genes across each MAG, which I have found useful for an ordination. I am also curious to see the presence/absence (or counts?) of gene clusters across all of my MAGs, since that is not limited to metabolic genes described in a database. 

Q1) I tried to use anvi-summarize on my pangenome db to see if it outputs some summary data on the gene clusters. However, it demands a collection name, but my pangenome has no collections. What is the easiest way around this? How do I create a collection encompassing all of the genomes in the pan db? 

Also, will anvi-summarize provide me with a matrix of presence/absence of gene clusters across all the MAGs?

Q2) For anvi-display-pan, is there a way to toggle which layers are visualized? I have many annotation sources, but don't really want to see them all at the same time, or all the time. I played around in the interactive view, checking and unchecking the layers and then redrawing, but that didn't seem to make a difference. I feel like I am missing something. 

Best regards, and thanks in advance for any help. 
Kurt




Iva Veseli

Mar 22, 2021, 5:38:48 PM
to an...@googlegroups.com
Hi Kurt, 

1) The easiest way around this is to add a default collection with everything in it using this program: https://merenlab.org/software/anvio/help/main/programs/anvi-script-add-default-collection/ Then you can run anvi-summarize with the default collection. 

1b) anvi-summarize will not give you the gene cluster matrix you are looking for (but you would be able to generate one if you wrote some code to parse its output files). HOWEVER, there is an easier way to get a gene cluster matrix directly :)  
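For the "write some code" route, here is a minimal Python sketch of what such parsing could look like. It assumes the summary output contains a long-format, tab-delimited table with one row per gene and columns named `gene_cluster_id` and `genome_name`; those column names and the tiny example data are assumptions for illustration, so check them against the header of your own file.

```python
import csv
import io
from collections import defaultdict


def gene_cluster_counts(handle):
    """Count genes per (gene cluster, genome) from a long-format, tab-delimited table.

    Assumes columns named 'gene_cluster_id' and 'genome_name' (check your file's header).
    """
    counts = defaultdict(lambda: defaultdict(int))
    genomes = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        counts[row["gene_cluster_id"]][row["genome_name"]] += 1
        genomes.add(row["genome_name"])
    genomes = sorted(genomes)
    # One row per gene cluster, one column per genome; 0 means the cluster is absent.
    matrix = {
        gc: [per_genome.get(g, 0) for g in genomes]
        for gc, per_genome in sorted(counts.items())
    }
    return genomes, matrix


# Tiny made-up table standing in for the real summary file.
example = io.StringIO(
    "gene_cluster_id\tgenome_name\tgene_callers_id\n"
    "GC_0001\tMAG_A\t10\n"
    "GC_0001\tMAG_A\t11\n"
    "GC_0001\tMAG_B\t20\n"
    "GC_0002\tMAG_B\t21\n"
)
genomes, matrix = gene_cluster_counts(example)
print("gene_cluster", *genomes, sep="\t")
for gc, row in matrix.items():
    print(gc, *row, sep="\t")
```

For a real, gzip-compressed summary table you would pass `gzip.open(path, "rt", newline="")` instead of the in-memory example. Turning the counts into presence/absence is then just a matter of testing each value for `> 0`.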

I am assuming that what you already tried (to get the gene matrix you referred to in your email) was the --functional-occurrence-table-output parameter of anvi-compute-functional-enrichment (pangenome mode, aka input option 1)? If so, try running the same program with the flags `--include-gc-identity-as-function --annotation-source IDENTITY --functional-occurrence-table-output gc_occurrence.txt`. That will give you a file, gc_occurrence.txt, containing a matrix of gene cluster occurrence in every MAG in your pangenome. 
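Since the goal is to compare gene content between MAGs, here is a rough Python sketch of how a matrix like gc_occurrence.txt could be turned into presence/absence sets and pairwise Jaccard distances, which can then feed an ordination. It assumes a tab-delimited file with gene clusters as rows and MAGs as columns, with the first column holding the gene cluster name; that orientation and the example values are assumptions, so transpose the file first if yours is laid out the other way.

```python
import csv
import io
from itertools import combinations


def read_presence_absence(handle):
    """Read a wide tab-delimited matrix; return {MAG name: set of gene clusters present}.

    Assumes rows are gene clusters and columns are MAGs (first column = cluster name).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    mags = header[1:]
    present = {mag: set() for mag in mags}
    for row in reader:
        if not row:  # skip blank lines
            continue
        gene_cluster, values = row[0], row[1:]
        for mag, value in zip(mags, values):
            if float(value) > 0:  # any non-zero count is treated as "present"
                present[mag].add(gene_cluster)
    return present


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Tiny made-up matrix standing in for gc_occurrence.txt.
example = io.StringIO(
    "key\tMAG_A\tMAG_B\tMAG_C\n"
    "GC_0001\t1\t1\t0\n"
    "GC_0002\t2\t0\t0\n"
    "GC_0003\t0\t1\t1\n"
)
present = read_presence_absence(example)
distances = {
    (x, y): jaccard_distance(present[x], present[y])
    for x, y in combinations(sorted(present), 2)
}
for (x, y), d in distances.items():
    print(f"{x}\t{y}\t{d:.3f}")
```

With a real file, replace the in-memory example with `open("gc_occurrence.txt", newline="")`. Jaccard is only one reasonable choice for presence/absence data; if you want to keep the counts, a count-based dissimilarity would be the analogous option.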

2) Yes. First check all the layers that you want to hide, then set the height of those layers to 0. You can use the “edit attributes for multiple layers” section at the bottom to set the height for all the layers you checked at once. See screenshot: [image not included]

Iva

-------------------------------------------------
Iva Veseli (she/her)
Graduate Student, Meren and Jabri Labs
Biophysical Sciences Program
University of Chicago


--
Anvi'o Paper: https://peerj.com/articles/1319/
Project Page: http://merenlab.org/projects/anvio/
Code Repository: https://github.com/meren/anvio
