parent/child GO process terms


Tim Martin

Dec 19, 2011, 3:32:27 PM
to UCSF EGAN
I use EGAN to test a visible set of genes of interest for enrichment
in GO Processes. However, I would like some way (automatic or manual)
to easily arrange the highly enriched GO Processes (added to the
network) and organize the visible gene-GO process network into groups
based on common GO parents.

For example, a few GO Processes that were highly enriched and added to
my visible network are:
regulation of cell proliferation
negative regulation of cell proliferation
regulation of cell cycle
cell cycle arrest

If I browse the http://www.geneontology.org/ website, I find that these
processes belong to a common process: biological regulation
(GO:0065007). A similar effect is observed for other groups of GO
processes added to the network. I want these "rings of interest"
around the genes/GO processes to convey the general biological
mechanism at play while still showing the details of the gene-gene-GO
process interactions. Is this possible? What about rings-in-rings or a
Venn diagram style? There doesn't even have to be a physically drawn
circle around the region, so long as hitting the auto-arrange button
makes it happen.
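
To make the grouping idea concrete, here is a minimal sketch in Python of what I mean, done outside EGAN. The parent map below is a toy hierarchy written by hand for illustration only (it is not read from the GO files and skips intermediate terms), and the function names are made up for this example:

```python
from collections import defaultdict

# Toy child -> parents map, hand-written for illustration only.
# The real GO graph has many more terms and intermediate levels.
PARENTS = {
    "negative regulation of cell proliferation": ["regulation of cell proliferation"],
    "regulation of cell proliferation": ["biological regulation"],
    "cell cycle arrest": ["regulation of cell cycle"],
    "regulation of cell cycle": ["biological regulation"],
    "biological regulation": [],
}


def ancestors(term, parents):
    """Return every ancestor of `term`, excluding the term itself."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def group_by_ancestor(terms, parents):
    """Map each ancestor to the set of given terms that descend from it."""
    groups = defaultdict(set)
    for term in terms:
        for ancestor in ancestors(term, parents):
            groups[ancestor].add(term)
    return dict(groups)


enriched = [
    "regulation of cell proliferation",
    "negative regulation of cell proliferation",
    "regulation of cell cycle",
    "cell cycle arrest",
]

groups = group_by_ancestor(enriched, PARENTS)
for ancestor in sorted(groups):
    print(ancestor, "->", sorted(groups[ancestor]))
```

Each key in `groups` would correspond to one "ring", and an ancestor whose members are a subset of another ancestor's members would give the rings-in-rings nesting.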

Also, would it be possible to show the GO Process number in addition
to the name? Perhaps when I right-click the GO Process, or by swapping
the name for the number in the network view to conserve space?

Thanks,
Tim Martin

ucsf egan

Dec 19, 2011, 5:07:18 PM
to UCSF EGAN
Hi Tim,

These are interesting considerations. The ontological relationships
between GO Terms are under-utilized in EGAN, just as the semantic
qualifiers on pathway interactions (e.g. activation, inhibition) are
under-utilized for KEGG and NCI-Nature data sets.

I think I understand your request, and I'm going to brainstorm a bit
about what you could do to accomplish it in present-day EGAN (future
EGAN will likely be able to do what you ask - or at least something
very close to it).

One important thing to note is that the GO gene set data files
provided in EGAN have been modified in two ways (before loading in
EGAN):

1) Gene-term associations are propagated up the GO tree so that if a
gene is associated with a child term, it automatically gets associated
with parent terms.
2) Gene sets with more than n genes are removed from the list (I
believe n is 1000 for the data available in 1.4).

If either of these is a problem, then you might want to consider
loading your own GO data sets into EGAN.
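
To illustrate the two modifications, here is a minimal Python sketch of propagating gene-term associations to parent terms and then dropping gene sets above a size cutoff. It is not the actual EGAN preprocessing code; the term names, gene names, and function names are placeholders, and the cutoff is tiny so the effect is visible on toy data:

```python
from collections import defaultdict

# Hypothetical three-level hierarchy: C is a child of B, B is a child of A.
PARENTS = {"C": ["B"], "B": ["A"], "A": []}

# Hypothetical direct annotations (gene -> terms it is annotated with).
DIRECT = {"g1": ["C"], "g2": ["B"], "g3": ["A"]}


def ancestors(term, parents):
    """Return every ancestor of `term`, excluding the term itself."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate(gene_to_terms, parents):
    """Step 1: a gene annotated to a child term is also added to all parents."""
    term_to_genes = defaultdict(set)
    for gene, terms in gene_to_terms.items():
        for term in terms:
            term_to_genes[term].add(gene)
            for ancestor in ancestors(term, parents):
                term_to_genes[ancestor].add(gene)
    return dict(term_to_genes)


def drop_large_sets(term_to_genes, max_genes):
    """Step 2: remove gene sets with more than `max_genes` genes."""
    return {t: g for t, g in term_to_genes.items() if len(g) <= max_genes}


propagated = propagate(DIRECT, PARENTS)
filtered = drop_large_sets(propagated, max_genes=2)
print(sorted(filtered))
```

After propagation the top term "A" collects every gene, so it is the first to exceed the cutoff and disappear. Very general GO terms behave the same way under a cutoff of around 1000 genes.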

Now, moving on to visualization. The likely reason that "biological
regulation" is not included in the enrichment analysis is that it has
been removed from the data (too many genes). If it were available
then it would likely also have been found to be enriched in your data.

The current paradigm for showing enriched association nodes in EGAN is
to let the user decide based on their semantic understanding of the
terms. To show 10 enriched terms that are all basically the same
thing would produce a lot of unnecessary clutter in the graph. So the
user is given the opportunity to decide which terms make the most
sense to show (the user may or may not know about the experiment
performed, and that may or may not have influence on their decision).

Venn Diagram visualization might be valuable for small numbers of sets
and for sets that are largely mutually exclusive (like GO Slim), but as
the number of sets and the overlap between them grow, Venn Diagrams (or
their sophisticated present-day option, BubbleSets
http://faculty.uoit.ca/collins/research/bubblesets/index.html) become
problematic. Still, they are worth considering for future development.

If you would like to know which ID is associated with each GO set you
can click the link button in the Node Table (or right-click on the
node and choose link-out from the menu) to navigate to that term's
page at AmiGO. Unfortunately I don't believe there is a way to
display the node ID as a prefix for the text label - only node type.
If you create your own GO-gene association data files, then you can
include the GO ID in the term name. Also, I believe the MSigDB gene
set files (http://www.broadinstitute.org/gsea/msigdb/index.jsp)
contain the GO ID in their term names. You can download their data
in .GMT format and specify it in the Launch EGAN Wizard.
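
If you go the route of building your own files, here is a minimal Python sketch of writing a gene set line in the tab-delimited .GMT layout (set name, description, then one gene per column) with the GO ID prefixed to the term name. The gene symbols are placeholders, the function names are made up for this example, and I am assuming that EGAN uses the first column as the node label:

```python
def gmt_line(go_id, name, genes, description="na"):
    """Build one .GMT line: name <TAB> description <TAB> gene1 <TAB> gene2 ..."""
    label = f"{go_id} {name}"  # GO ID becomes part of the displayed term name
    return "\t".join([label, description, *sorted(genes)])


def parse_gmt_line(line):
    """Split one .GMT line back into (name, description, genes)."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], fields[1], fields[2:]


# Placeholder gene symbols; a real file would list the annotated genes.
line = gmt_line("GO:0065007", "biological regulation", {"GENE_B", "GENE_A"})
print(line)

name, description, genes = parse_gmt_line(line)
print(name, genes)
```

Writing one such line per GO term to a text file should give a file that can be specified in the Launch EGAN Wizard in the same way as the MSigDB downloads.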

If you prefer to have the association nodes not influence the layout
of gene nodes (say you prefer to connect them by protein-protein
interactions and/or literature co-occurrence only), then I suggest
experimenting with some of the different layout options for
association nodes (see the layout * button to the left of the Network
View).

Thanks for posting your ideas!

Regards,

Jesse
