an urgent inquiry

67 views
Skip to first unread message

Farah GG

unread,
Dec 8, 2020, 4:45:30 PM12/8/20
to cytoscape...@googlegroups.com
Hello,

I have a network of 100 proteins (without any quantitative or expression data) which I want to do clustering based on their biological process inside the network. I found the attached figure from an online paper and I would like to create something similar to this figure from my network. This paper has used the ClusterOne plugin of Cytoscape. I am also trying to use ClusterOne, but I could not find where I can set Biological Process (BP) of the proteins and also p-value for this clustering. I would highly appreciate if you could help me how to do that?

I was wondering if you could also introduce me any other Cytoscape plugins that can create clustering based on biological process inside a network (e.g. attached figure).

Thanks a lot for your support, in advance.

Best regards,
Farah
ClusterOne.png

Alex Pico

unread,
Dec 8, 2020, 7:28:16 PM12/8/20
to cytoscape-helpdesk, Tamas Nepusz
Cc’ing the app author…

Pro-tip: you can get the app author contact info from the ‘Email’ link on each app page in the Cytoscape App Store.

 - Alex



--
You received this message because you are subscribed to the Google Groups "cytoscape-helpdesk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cytoscape-helpd...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cytoscape-helpdesk/CACvvUxePVTQg_CG1QYWt8nOYLAXhoiE%2B0dc30xsC3AXVzSQNnQ%40mail.gmail.com.
<ClusterOne.png>

Farah GG

unread,
Dec 9, 2020, 12:14:44 PM12/9/20
to Tamas Nepusz, cytoscape...@googlegroups.com

Thank you very much for your explanation dear Tamas. May I ask ''clustering networks based on structural information'', you mean clustering based on edges and their weights?

I would highly appreciate if anybody can introduce me some Cytoscape plugins that can perform my requirements as below (I searched for it, but could not find any plugins):

I need to do clustering based on their Gene Ontology biological process inside my network to create some subnetworks of my initial network. In such a way that nodes to be coloured based on their cluster, and each cluster has its own colour. Also, after clustering, I need the layout of the initial network to be kept as before (not merging the nodes in a bigger node), and just each node to be coloured based on their cluster that they belong to. Also, it shows the enriched Gene Ontology biological process of each coloured cluster. 

Thanks a lot for your help and advice.

Best regards, 

Farah


On Wed, Dec 9, 2020 at 8:15 AM Tamas Nepusz <nta...@gmail.com> wrote:
Dear Farah,

It's been a while since I've developed the ClusterONE plugin -- I'm glad to see that it still works with Cytoscape and that people are using it for research!

ClusterONE is a plugin for clustering networks based on structural information only; the author of your figure probably used some external data source (Gene Ontology maybe?) to get information about the biological processes the proteins belong to, and then assigned meanings to the clusters based on the majority biological process of that cluster; but this is pure guesswork, you need to read the original paper about how the biological process annotations were assigned to the clusters. But one thing is for sure: ClusterONE is not concerned with annotations, it works only with the connections and their weights, so you can use the Gene Ontology annotations and overrepresentation analysis to confirm that a particular cluster found by ClusterONE has biological significance. There are probably other plugins for that in the Cytoscape ecosystem; unfortunately I am not in academia any more and I haven't been following the recent developments so I cannot help you with that.

As for the p-value, the ClusterONE plugin provides them in the Results panel of ClusterONE, for each individual cluster; you might need to switch the Results panel to detailed view first; see the documentation:


All the best,
T.

Tamas Nepusz

unread,
Dec 10, 2020, 11:58:39 AM12/10/20
to fsgol...@gmail.com, cytoscape-helpdesk
Dear Farah,

It's been a while since I've developed the ClusterONE plugin -- I'm glad to see that it still works with Cytoscape and that people are using it for research!

ClusterONE is a plugin for clustering networks based on structural information only; the author of your figure probably used some external data source (Gene Ontology maybe?) to get information about the biological processes the proteins belong to, and then assigned meanings to the clusters based on the majority biological process of that cluster; but this is pure guesswork, you need to read the original paper about how the biological process annotations were assigned to the clusters. But one thing is for sure: ClusterONE is not concerned with annotations, it works only with the connections and their weights, so you can use the Gene Ontology annotations and overrepresentation analysis to confirm that a particular cluster found by ClusterONE has biological significance. There are probably other plugins for that in the Cytoscape ecosystem; unfortunately I am not in academia any more and I haven't been following the recent developments so I cannot help you with that.

As for the p-value, the ClusterONE plugin provides them in the Results panel of ClusterONE, for each individual cluster; you might need to switch the Results panel to detailed view first; see the documentation:


All the best,
T.


On Wed, 9 Dec 2020 at 01:28, Alex Pico <alex...@gladstone.ucsf.edu> wrote:

Farah GG

unread,
Dec 10, 2020, 1:45:30 PM12/10/20
to cytoscape...@googlegroups.com
Hi Scooter,

Thanks for your response. Still I could not find any Cytoscape plugin to cluster proteins of a network based on their GO Biological Process and so create significant sub-networks. I would highly appreciate if you could introduce any plugins with this function.

Thank you so much for your support.

Best regards,
Farah

ann

unread,
Dec 10, 2020, 5:45:52 PM12/10/20
to cytoscape...@googlegroups.com

Farah GG

unread,
Dec 10, 2020, 6:27:23 PM12/10/20
to cytoscape...@googlegroups.com
Thanks Ann,

Yes I initially used ClueGo, but its presentation of the clustered network is different from what I need. ClueGo merges nodes of a cluster in one big node, and also one protein may be present in different clusters. But I need to show each node of each cluster distinctly. Also, all nodes of a cluster to be in the same colour. Clusters to be distinguished from each other by colour. Also each cluster should have its own biological process. An example of what I need to create is attached in my previous emails.
 I would highly appreciate any help and suggestions.

Best regards,

Farah


Dexter Pratt

unread,
Dec 11, 2020, 12:16:43 PM12/11/20
to cytoscape-helpdesk
Hi Farah,

If you want clustering based only on the GO, annotations of genes, not the nodes and edges of your network, then it seems to me that you want to separately run a gene set functional classification algorithm (which assigns genes to one of N groups based on GO annotations), select the top GO match for each group, and only then assign each node in the network to a color/new category attribute based on those groups.

However, note that because you want an algorithm that does not take network structure into consideration, there is no guarantee that nodes sharing a color are all in a connected subset of your network.  (regardless of your method for assigning the best GO classification to the gene: you simply didn't use that network's information.)

In a little googling, I found one function of DAVID that does something like this although I don't know the form in which it might export that table and after you got the table you would need to select the top GO assignment for each group in your own script.


Dexter

Farah GG

unread,
Dec 13, 2020, 7:12:43 PM12/13/20
to dexterp...@gmail.com, cytoscape...@googlegroups.com
Hi Dexter,

Thank you so much for your advice and guide. I read ''Gene Functional Classification'' of DAVID. However, I actually need to map biological process relationships within an interaction network, and I need clustering based on the GO annotations of genes within a network to also show connections and interconnections among nodes. Also, in terms of visualization of the clustered network, all nodes of each cluster should not be merged in a bigger node. Instead, each node of each cluster will stay the same size and location, but only coloured. I could not find any tool to do it. So, I am thinking of using ClueGO to do clustering based on enrichment analysis of proteins (based on GO biological processes). Then, based on each gene corresponding to which enriched function/cluster, I manually colour nodes. I was wondering if you could let me know your opinion about this way? Does this way make sense?

Also, I would highly appreciate if you could advise me on the following questions:
- As one node/gene may be overlapped in several clusters, how can I find the top/main function for each gene? as I have to assign only one cluster for each gene. Also, in terms of visualization, how can I show one node corresponds to several clusters?

- From the ClueGO result files, the file entitled '' ClueGOResults_GO-BP Genes With Corresponding Functions.txt '' is empty for me, while I need this information. How can I have the information of this file?

- I was wondering if you could let me know which clustering algorithm ClueGO uses to cluster nodes based on their annotations?

Thank you so much for your great support, in advance.

Best regards,
Farah





Scooter Morris

unread,
Dec 17, 2020, 11:52:52 AM12/17/20
to cytoscape-helpdesk
Hi Farah,
   OK, let's take a step back a bit.  Most genes are assigned multiple GO annotations, even if you use something like GO-slim to reduce the overlap.   So, in order to produce that figure, you need to somehow annotate each protein (node) with a single BP GO Term, then there are a number of ways to separate the nodes based on that annotation.  Alternatively, you could compute some sort of similarity score based on the Jaccard overlap of GO terms and then cluster based on that.  The result should be that proteins with similar terms group together, however, that could certainly result in a cluster of proteins that are annotated with two or more BP terms.  Finally, as Dexter suggested, you could compute some sort of enrichment, to find out which proteins are enriched in your particular data set compared to either the entire genome or some subset (as appropriate for your experiment).  Then you could assign the top enriched term to each protein.  Once you know the top term for each protein, you could manually annotate the proteins.  Be aware that the results of the enrichment might result in several proteins not being enriched in any given term...

-- scooter

Farah GG

unread,
Dec 17, 2020, 6:55:04 PM12/17/20
to cytoscape...@googlegroups.com
Hi Scooter,

I see. Ok. Thank you very much for your guide. Also, may I ask for the following questions:

- From the ClueGO result files, the file entitled '' ClueGOResults_GO-BP Genes With Corresponding Functions.txt '' is empty for me, while I need this information. How can I have the information of this file?

- Which clustering algorithm ClueGO uses to cluster nodes based on their annotations?

Many thanks.
Best regards,
Farah

Scooter Morris

unread,
Jan 14, 2021, 11:29:19 AMJan 14
to cytoscape-helpdesk
Hi Farah,
    I think for those questions, you probably need to contact the ClueGO authors.  However, I don't think that ClueGO actually calculates clusters -- rather, it will compare clusters based on GO annotations.  You can read about ClueGO here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2666812/

-- scooter
Reply all
Reply to author
Forward
0 new messages