Show genes after ClueGO analysis


Elvira Inglese

Oct 8, 2014, 8:42:10 AM
to cytoscape...@googlegroups.com

Hi, I have a problem with ClueGO.

I performed a GO enrichment, and afterwards I want to see all genes involved in a pathway...

Usually I use "Select > Show all nodes and edges", but now this command shows only the genes of a GO term that are in my initial gene list, not all the genes involved...

How can I fix this problem?

Elvira

Bernhard

Oct 9, 2014, 1:06:10 PM
to cytoscape...@googlegroups.com
Hi Elvira, in the new ClueGO version the number of genes that do not come from the initial list is set to zero by default, to speed up the ClueGO analysis (it no longer adds all the genes from the terms initially). If you want to use CluePedia afterwards, set the number of genes to at least 500 with the new option in red, 'CluePedia Properties', and re-run your ClueGO analysis (we also added a help button to explain this). It will be a bit slower, but you will see the genes from the terms, up to the limit you set; raise the maximum above 500 if you want to see even more genes.
Best
Bernhard

Jennifer Goldman

Sep 13, 2015, 2:46:06 PM
to cytoscape-helpdesk
Hello Bernhard,

Thank you for this response. Very helpful. Beyond this, I would like to extract a portion of the pathways (which I have found to be common with other gene clusters), and show the genes associated with this small portion of the network. If I select the pathways I'd like to show and then use the 'New Network from Selection' button, is it possible to show the genes associated with this new (smaller) network? 

I have run an initial analysis with thousands of genes with the CluePedia 'Genes per Term visualization threshold' set to 1000 genes, but am currently unable to show genes associated with the smaller group I extract...

Thank you so much for any suggestions,

Jennifer

Bernhard

Sep 14, 2015, 2:48:02 PM
to cytoscape-helpdesk
Hi Jennifer,
there is a way to save selected pathways/GO terms into a custom node group that can then serve as a new ontology collection:
1. Select the pathways/GO terms (genes will not be considered), then right-click somewhere on the Cytoscape canvas (not on a node or edge). You should see a menu with several options.
2. Click on Apps->CluePedia->Create Node Group from Selected Nodes and give the new group a name. If you select too many pathways, the next steps can hang or take a long time, so select no more than about 50.
3. This should create a new custom ontology in the left panel with the selected terms inside.
4. Create a new empty network, or a network with the genes you want to use (you can also select them from your current network), with CluePedia.
5. Go back to ClueGO, check only your custom selection, and right-click on the custom ontology name. In the menu that appears, select "Add Selected Ontologies/Groups to Network". This should add the selected pathways to your new/current network.
It is not really intuitive yet, but I hope it does what you want.
Best
Bernhard

Jennifer Goldman

Sep 16, 2015, 7:32:16 PM
to cytoscape-helpdesk
Hello Bernhard,
Is it equally valid to select 'Associated genes found' for each pathway and the pathways using 'GOTerm', then to make a 'New network from selection'? This appears to work to make a network showing the pathways and the genes found in those pathways (thankfully, I only have three pathways, because selecting the genes turns out to be rather tedious!) I am having some trouble with the series of operations outlined in the suggested method, but if that is a more valid way to proceed, I will troubleshoot, certainly.
Thanks a million,
Jennifer

Jennifer Goldman

Sep 17, 2015, 12:28:13 AM
to cytoscape-helpdesk
To clarify further: for each pathway of interest, I copied out the 'Associated genes found', made a non-redundant, alphabetized list of those genes, then manually selected each gene found in the pathways, as well as the pathways themselves, and made a 'New network from selection'.
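The list-building step described here (merging the 'Associated genes found' entries of several pathways into one non-redundant, alphabetized list) could be scripted rather than done by hand. The following is only a minimal Python sketch: it assumes the cell contents were copied out as bracketed, comma-separated strings, and the gene names and the function name `merge_gene_lists` are illustrative, not taken from a real ClueGO export.

```python
def merge_gene_lists(cells):
    """Merge several 'Associated genes found' cells into one sorted, unique list."""
    genes = set()
    for cell in cells:
        # Strip optional surrounding brackets, then split on commas.
        for name in cell.strip().strip("[]").split(","):
            name = name.strip()
            if name:
                genes.add(name)
    # Alphabetical order, ignoring case.
    return sorted(genes, key=str.upper)


# Hypothetical cell contents for three pathways (assumed format).
cells = [
    "[GRIN1, GABRA1, CHRNA4]",
    "[RELN, DCX, GRIN1]",
    "[DCX, NEUROD1]",
]
print(merge_gene_lists(cells))
```

The resulting list can then be used to select the gene nodes in Cytoscape instead of picking them one at a time.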


Bernhard

Sep 17, 2015, 6:32:00 AM
to cytoscape-helpdesk
Hi Jennifer,
simply making a 'New network from selection' is not the same, because that way you lose the ClueGO/CluePedia functionality. But it depends on what you want to do. If you just want a final visualization of your network, it is OK, but if you want to do further analyses with CluePedia, it will not work.

To select the genes from a pathway you can press Ctrl-6 or use the menu "Select"->"Nodes"->"First Neighbors of Selected Nodes" to make the selection easier.

Best
Bernhard

Jennifer Goldman

Sep 22, 2015, 12:40:54 PM
to cytoscape-helpdesk
Thank you very much Bernhard!
If I look in the 'Associated genes found' for each pathway with ClueGO+CluePedia and count the genes, there are far fewer than if I use 'Ctrl-6' or "Select"->"Nodes"->"First Neighbors of Selected Nodes" on the selected pathways. I therefore chose to use the 'Associated genes found' from the three pathways in common between three gene clusters and then use 'New network'. I think this makes the point that the three gene clusters have (3) overlapping pathways containing different genes. I attach a figure if you have a moment for critique.

The data are shown with degree sorted circle. When 'show genes' is on, can you help me understand what is the meaning of edge thickness?
The kappa score, I understand, contains information about the number of genes connecting pathways when the genes are not shown, but when the edge connects genes with the pathway, I wonder if this is some indication of the gene's association with that term ... or something else? 

Thanks so much for your help,
Jennifer
All_networks_landscape_morespace.2.png

Bernhard

Sep 23, 2015, 10:46:06 AM
to cytoscape-helpdesk
Hi Jennifer,
your figure looks very nice. I forgot to tell you that selecting first neighbors selects all neighboring genes/pathways. Because CluePedia hides genes that are not from your input list (they are only shown when you select 'Show all Genes from Pathways/Terms'), the selection includes all genes associated with the pathway, including the hidden ones. So the way you did it is the right one.
Concerning the Kappa Score between genes and Terms/Pathways: there is no Kappa Score between them, but to apply a visual style to the edge thickness I need to define a column that maps continuous values to the edges. In this case I used the Kappa Score column. The 'Kappa Score' between a Term/Pathway and a gene in fact means the evidence strength of the gene annotation to the Term. So when you see a thick line (theoretical Kappa Score 2), it means that the annotated gene was experimentally verified to be part of that Term/Pathway. All experimental evidence codes (EXP, IDA, IPI, IMP, IGI, IEP) give a thick edge (stronger evidence) and all others a thin one (most gene annotations are IEA, i.e. inferred electronically, for example by abstract mining, and are sometimes withdrawn afterwards by GO curators). You can also visualize the evidence code directly on the edge by clicking on the 'Show Evidence' button in ClueGO or in the left panel next to the Evidence Codes. The label is written quite small, so you will have to zoom in on the edges to see it (the size can be increased using the Cytoscape Edge Label Font options).

Best
Bernhard

For Info:
Experimental Evidence Codes from GO:
EXP (Inferred from Experiment)
IDA (Inferred from Direct Assay)
IPI (Inferred from Physical Interaction)
IMP (Inferred from Mutant Phenotype)
IGI (Inferred from Genetic Interaction)
IEP (Inferred from Expression Pattern)

Other Evidence Codes from GO
IC  (Inferred by Curator)
IGC (Inferred from Genomic Context)
RCA (Inferred from Reviewed Computational Analysis)
ISA (Inferred from Sequence Alignment)
ISM (Inferred from Sequence Model)
ISO (Inferred from Sequence Orthology)
ISS (Inferred from Sequence or Structural Similarity)
ND  (No biological Data available)
NAS (Non-traceable Author Statement)
NR  (Not Recorded)
TAS (Traceable Author Statement)
IEA (Inferred from Electronic Annotation)
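The thick/thin rule described above can be written down as a small lookup. This is only an illustrative Python sketch of the rule as stated in this post (experimental codes give a thick edge, everything else a thin one); the names `EXPERIMENTAL_CODES` and `edge_style` are made up here and this is not ClueGO's actual implementation.

```python
# Experimental evidence codes, as listed above.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def edge_style(evidence_code):
    """Return 'thick' for experimental evidence codes and 'thin' for all others."""
    code = evidence_code.strip().upper()
    return "thick" if code in EXPERIMENTAL_CODES else "thin"


for code in ("IDA", "IEA", "TAS"):
    print(code, edge_style(code))
```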

Jennifer Goldman

Oct 23, 2015, 8:02:41 PM
to cytoscape-helpdesk
Dear Bernhard, 
Thank you in advance for any advice. 
I have another (perhaps simpler) question. The pathways shown in the figure in dark blue (mRNA catabolic process and epidermal growth factor receptor) were both originally grey (I manually changed the color of both, keeping them matched). Is it arbitrary that they came out the same color (grey), or is this due to some relationship I have overlooked? They appear to be related only by sharing RPS27A, but otherwise seem to represent quite different biological processes. I would like to change the color of one for clarity, but I notice that the node (RPS27A) is divided into thirds, one green (p75), one turquoise (HH ligand biogenesis), and one grey (well, blue now for aesthetics), so I am not certain why it is not divided into quarters if this gene contributes to four different pathways.

Can you help me interpret why both mRNA catabolic process and epidermal growth factor receptor have the same color (grey in original)? 

Thanks a lot,
Kind regards,
Jennifer
gene network.png

Bernhard

Oct 26, 2015, 11:05:05 AM
to cytoscape-helpdesk
Dear Jennifer,
the initial gray color comes from the fact that the default number of groups is set to two. So all nodes (Terms/Pathways) that are not grouped with at least one other term are shown in gray. That doesn't mean that they are not important!!! It just means that they are not grouped. So you could have terms that stand alone but are very interesting. To change this and also give colors to single terms by default, just change the Initial Group Size to 1 in "Grouping Options".
See below.
Best


Auto Generated Inline Image 1
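The coloring rule explained in this reply can be illustrated with a toy model. This is a minimal Python sketch, assuming each term belongs to exactly one group and each group has one color; the group labels, colors, and the function `assign_colors` are hypothetical, and real ClueGO grouping is more involved.

```python
from collections import Counter


def assign_colors(term_groups, group_colors, initial_group_size=2):
    """Color each term by its group; groups below the size threshold stay gray."""
    sizes = Counter(term_groups.values())
    return {
        term: group_colors[group] if sizes[group] >= initial_group_size else "gray"
        for term, group in term_groups.items()
    }


# Hypothetical terms and group assignments.
term_groups = {
    "neuron migration": "G1",
    "CNS differentiation": "G1",
    "mRNA catabolic process": "G2",
    "epidermal growth factor receptor": "G3",
}
group_colors = {"G1": "green", "G2": "red", "G3": "orange"}

# Threshold of 2: the two ungrouped terms both come out gray.
print(assign_colors(term_groups, group_colors))
# Threshold of 1: single terms get their own colors.
print(assign_colors(term_groups, group_colors, initial_group_size=1))
```

In this model two unrelated terms share gray only because neither is grouped, which matches the explanation above.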

Jennifer Goldman

Nov 7, 2015, 11:16:44 AM
to cytoscape-helpdesk
Thanks so much Bernhard! I really appreciate your generously clear answers - extremely helpful. 

If I may ask another question- the networks I am producing are often very complex and the node titles are often impossible to read without post-hoc adjustment.
I have looked but cannot find a way to change the node titles to numbers that refer to the ClueGO results table, for example (such that the table could be organized 1:n as a reference for the nodes in the network labeled 1:n).

I can do this manually, but it is kind of difficult and time consuming. I would be happy to know any tips! 
Best,

Bernhard

Nov 9, 2015, 4:44:31 AM
to cytoscape-helpdesk
Hi Jennifer,
here is a quick response. Something you can try is to add an additional column to the node table. To do this you have two options:
1. Click (see the attached image) on "+" to add an integer column ("Add new single column"), or whatever kind of column you want (text also works).
or, if you have a large number of nodes,
2. Export the node table, add the node number/name you want in an xls file, then re-import the new number/name column from the xls via File->Import->Table->File.

After this you should have a new column, e.g. "Number". Then go to "Style" -> "Node" -> "Label", select "Number" and "Passthrough". You should now see your new labels on the nodes. You can also change the font type or size if you want.

Try this and see if it works.
Best


Auto Generated Inline Image 1

Jennifer Goldman

Nov 10, 2015, 9:41:46 AM
to cytoscape-helpdesk
Thank you! It is working up to the point of re-importing the table file. I have tried both .csv and .xls imports.
I have also tried just re-importing the unmodified table and in all conditions, I get the following message:
"Loading table data
 - Types of keys selected for tables are not matching"

If you have any insight, I would be thrilled to hear! I am trying to manually enter the numbers into the new column, but this seems kind of buggy (the program seems to keep changing my numbers ...?).

Kindest regards, Jen

Jennifer Goldman

Nov 10, 2015, 11:03:42 AM
to cytoscape-helpdesk
Actually, maybe my typing was buggy. I am able to enter numbers manually, but I would certainly be happy to troubleshoot the 'upload table' option.

Bernhard

Nov 10, 2015, 11:53:07 AM
to cytoscape-helpdesk
Hi Jennifer, the .csv or .xls file should have only 2 columns: GOID and (e.g.) MyNumber.
Then import it as a table. You should now see a window like this:



Make sure you have selected the network you want to import it into beforehand. Now you have to set GOID as the key! When you import columns you always need unique keys to match them to the table.
In fact that's it: if you now click "OK", you should have the additional "MyNumber" column in the table. The rest you know, I guess.
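The two-column file described here could also be generated from the exported node table instead of being edited by hand. The following is a minimal Python sketch, assuming the export is a CSV with a header row containing a GOID column; the function name `build_number_table`, the placeholder IDs, and the sample rows are illustrative only.

```python
import csv
import io


def build_number_table(exported_csv_text, key_column="GOID", number_column="MyNumber"):
    """Return CSV text with two columns: the key and a running number per unique key."""
    reader = csv.DictReader(io.StringIO(exported_csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([key_column, number_column])
    seen = set()
    for row in reader:
        key = (row.get(key_column) or "").strip()
        # Skip rows without a key (e.g. gene nodes) and repeated keys,
        # because the import needs unique keys.
        if not key or key in seen:
            continue
        seen.add(key)
        writer.writerow([key, len(seen)])
    return out.getvalue()


# Hypothetical export with placeholder IDs; real exports have many more columns.
exported = """GOID,name
GO:0000001,term A
GO:0000002,term B
,RPS27A
GO:0000001,term A
"""
print(build_number_table(exported))
```

The output can be saved as a .csv file and imported with GOID set as the key, as described above.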
Best Regards
Auto Generated Inline Image 1

Jennifer Goldman

Nov 10, 2015, 12:26:59 PM
to cytoscape-helpdesk
Thank you thank you thank you!!! 

Jennifer Goldman

Apr 3, 2016, 1:55:51 AM
to cytoscape-helpdesk
Hello Bernhard, 

I wonder if you would QC a legend for the attached figure, specifically with respect to validity/clarity of node and edge attribute descriptions.

Thank you in advance, 

Jennifer

Figure 4: Enrichment Analysis of Gene Networks Related to Adult Neural Connectivity

Three gene modules with highly significant relationships to neuroanatomical connectivity in adult brain (Turquoise, Green, and Blue) were analyzed for evidence of functional pathway enrichment. Examining the intersection of pathways significantly enriched in all three gene modules following Bonferroni correction identified three: CNS differentiation, neuron migration, and ligand-gated ion channels. Module-specific details of pathways enriched in all connectivity-correlated gene clusters are shown in a (Turquoise), b (Green), and c (Blue). Size of coloured nodes indicates the number of genes participating in that pathway. Edges between gene and pathway terms indicate archived evidence for participation of the gene in the connected pathway. Line weights indicate the strength of empirical evidence supporting participation of the annotated gene in the functional pathway. In the Turquoise module (a), all pathways share genes, whereas genes participating in ligand-gated ion channel pathways are distinct from those comprising CNS differentiation and neuron migration pathways in Green and Blue modules (b-c).

All_networks_landscape_Oct28.png

Bernhard

Apr 4, 2016, 6:51:21 AM
to cytoscape-helpdesk
Hi Jennifer,
your figure legend sounds fine! But I am not sure about the sentence: "Size of coloured nodes indicates the number of genes participating in that pathway."
Do you show all the genes from the terms/pathways, or just the ones you mapped from your input list? Judging from the size of the nodes in your figure, it is a bit misleading, because e.g. the red node with the thick black border in the middle of the figure is bigger than other nodes although it has fewer genes linked. By default ClueGO maps the (Bonferroni-corrected) enrichment significance to the node size. So if you did not select something else, the sentence should probably be: "Size of coloured nodes indicates the enrichment significance." See the attached figure.
Best
Bernhard

Auto Generated Inline Image 1

Jennifer Goldman

Apr 4, 2016, 12:24:16 PM
to cytoscape-helpdesk
Thanks so much Bernhard. 

Azeem Butt

Jul 29, 2016, 3:53:53 AM
to cytoscape-helpdesk
Hi Bernhard
I have recently been trying to run a GO analysis with ClueGO, but, potentially due to the large number of genes, I am getting an error, so I was wondering if you could please explain it. My list contains 10475 genes and I am running the analysis with the following parameters:

GO Biological Process (Date: 25.07.2016, Evidence: ALL)
GO Term Fusion selected
Show only pathways P<0.05 selected
Under advanced term/pathway settings:
Cluster#1:
Minimum gene = 3, % Genes = 4
Enrichment right-sided test with Bonferroni step-down
Default Kappa score: 0.4%

The error I am getting is "Algorithm didn't converge". I chose to continue the analysis, and once it finished and I exported the results as an Excel sheet, there were several GO categories for which the "Associated Genes Found" column was simply blank. The issue is the same with every type of GO annotation. My understanding is that increasing the Kappa score and the term clustering options may help (I haven't tried yet), but will that not also lead to skipping some of the annotation categories? I am able to obtain GO annotation for the same set of genes with other commonly used servers, but I am interested in using ClueGO along with Cytoscape for further analysis and interaction plots.

PS: Sorry for Hijacking this post Jennifer and Bernhard. I tried to email you but your email ID seems to be non-accessible. So if you could please send me your email ID, I can send you genes list as well

Bernhard

Jul 29, 2016, 6:01:25 AM
to cytoscape-helpdesk
Hi Azeem,

"Algorithm didn't converge" is not necessarily an error. It just means that the grouping didn't work out because you have too many GO terms that are too similar, as a result of an input gene list that is too large. In this case you should use smaller gene lists, be more restrictive (e.g. % Genes = 50% or more mapped and Minimum gene = 10 genes/term), or skip the grouping of the terms. Skipping the grouping will not influence the enrichment; it just skips the coloring of groups of GO terms. The Excel sheet export could still have bugs; better use the save project option just next to the export xls button and see if the problem persists.

Best

Azeem Butt

Jul 29, 2016, 6:36:34 AM
to cytoscape-helpdesk
Thank you for your prompt reply. So if I uncheck the GO Term Grouping option, will this not lead to a large number of GO hits in the output (I am currently running the analysis)? I have tried the project saving option, but the system freezes while saving the kappa matrix files and then there is a low memory error. I am currently using 12GB RAM, so I am not sure if this step requires more memory. One more question: the output Excel file and the ClueGO results tab contain several GO annotation entries as duplicates, with exactly the same name, P values, and associated gene count. What could be the reason for this, and is there any fix?

Thank you...

Bernhard

Jul 29, 2016, 7:25:56 AM
to cytoscape-helpdesk
Hi, you will still get the same large number of GO terms if you are not more restrictive. So if you have 10000 genes, you have to require a very high percentage of genes mapped per term and/or a high number of genes mapped. What you want, I guess, is a visualization of the most prominent GO terms hit by your gene list. About 50-100 GO terms is the maximum one can visualize in a sensible way. So there are 3 options:
- reduce your input list to the most significant genes (~500 genes) and/or
- increase % mapped genes and/or  -> (this will also keep smaller and more specific terms)
- increase total number of genes per term -> (this will mainly result in more/very general terms)
Try this and you will get a reasonable network of terms.
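The effect of the two thresholds can be seen with a toy filter. This is a minimal Python sketch, assuming a term is kept only when both the minimum number of mapped genes and the minimum percentage of mapped genes are met; the term names, counts, and the function `filter_terms` are made up for illustration and are not ClueGO code.

```python
def filter_terms(terms, min_genes=3, min_percent=4.0):
    """Keep terms with enough mapped genes, in absolute number and as a percentage."""
    kept = []
    for term in terms:
        percent = 100.0 * term["mapped"] / term["total"]
        if term["mapped"] >= min_genes and percent >= min_percent:
            kept.append(term["name"])
    return kept


# Hypothetical terms: genes mapped from the input list vs. all genes in the term.
terms = [
    {"name": "general term", "mapped": 120, "total": 2000},  # 6% mapped
    {"name": "specific term", "mapped": 12, "total": 20},    # 60% mapped
    {"name": "tiny term", "mapped": 3, "total": 5},          # 60% mapped
]

# Loose settings from earlier in the thread (3 genes, 4%).
print(filter_terms(terms))
# Stricter settings (10 genes, 50%).
print(filter_terms(terms, min_genes=10, min_percent=50.0))
```

With the loose settings all three terms pass; with the stricter ones only the specific, well-covered term remains, which is how a large input list is reduced to a readable network.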

If you use grouping, you can have duplicate terms in the xls file, because this file lists all terms per group. So it is better to use the save all data option to get unique info.
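If only the per-group export is at hand, the repeated rows could be collapsed afterwards. This is a minimal Python sketch, assuming the export has been saved as CSV with a GOID column identifying each term; the column names, placeholder IDs, and the function `unique_terms` are assumptions, and the save all data option mentioned above remains the suggested route.

```python
import csv
import io


def unique_terms(csv_text, key_column="GOID"):
    """Return the rows of an exported table, keeping the first row per term ID."""
    seen = set()
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row[key_column]
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return rows


# Hypothetical export in which one term is listed under two groups.
exported = """GOID,Term,Group
GO:0000001,term A,Group1
GO:0000002,term B,Group1
GO:0000001,term A,Group2
"""
for row in unique_terms(exported):
    print(row["GOID"], row["Term"])
```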

Best

Ragavendrasamy Balakrishnan

Oct 1, 2017, 12:04:05 AM
to cytoscape-helpdesk
Dear Bernhard,
Greetings

I am using ClueGO for visualising pathways and networks from upregulated and downregulated genes as two clusters represented by different colours.

I have used GO Term Fusion and have set the Kappa score to 0.6 to increase the specificity of the network. I am running into the same problem of nodes overlapping one another, the issue that you answered earlier in this thread. However, I am unable to perform those steps in the present ClueGO version. I have attached a screenshot for your reference. Kindly suggest how to proceed.

Thanks

Regards,
Ragavendrasamy
Auto Generated Inline Image 1

Bernhard

Oct 4, 2017, 11:36:57 AM
to cytoscape-helpdesk
Dear Ragavendrasamy,

you could first try to be more restrictive to get fewer nodes, and then you could select only the terms that are interesting to you and hide all the other names here:



Try this first. If this is not enough, you will have to export your current table data, add a column with numbers in Excel, and re-import it into Cytoscape with Import Table under File. The steps should be the same as I described before; they didn't change with the Cytoscape versions.
Let me know if you can manage.
Best
Auto Generated Inline Image 1

Elena MacFarlane

Nov 24, 2021, 8:05:31 PM
to cytoscape-helpdesk
Hi, 
I was looking for answers to the same problem stated by Elvira, and found this answer from Bernhard. I tried to do as directed: I was able to create a node group (I can see the system processing), but it does not show up anywhere in the left panel, so I cannot select it as a custom ontology. Any help? Where does this file go?

Thank you!

Elena

Scooter Morris

Dec 2, 2021, 11:30:21 AM
to cytoscape-helpdesk
Hi Elena,

You'll need to be a bit more specific. This was a long thread, and in Bernhard's answer to Elvira, I don't see any discussion of node groups, just a setting to increase before running CluePedia.

-- scooter

Elena MacFarlane

Dec 2, 2021, 11:33:32 AM
to cytoscape...@googlegroups.com
Hi!

Thank you for your reply. I was actually able to see where it goes eventually: the file is created in the ClueGO Configuration folder, so I am all set!
Thank you so much,

Elena

