Cytoscape - ClueGO Limitations

acz...@gmail.com

Jan 22, 2019, 3:18:20 PM
to cytoscape-helpdesk
Dear All,

I am trying to run ClueGO with the most global parameters and minimum kappa score settings on a list of about 350 proteins. At some point in the procedure, a window appears informing me that the network contains more than a certain number of nodes. When I choose to continue, the analysis sometimes runs for more than a day (24 hours) with no sign that the program has frozen. I would like to know whether there are limits on the number of nodes the application can process in a single run and, if so, what they are. Do these limits depend on the hardware configuration that runs the application? If so, are there minimum requirements for a PC to run Cytoscape to its full potential?

I thank you in advance for your time.

Athanassios

Bernhard

Jan 23, 2019, 4:22:16 AM
to cytoscape...@googlegroups.com
Dear Athanassios,
When you get the message that there will be too many nodes, it means that you will get a huge network that will be difficult to interpret and that grouping will take a long time. If you really want to go on with this setting, you should have at least 16GB RAM, preferably with an i7 or Xeon 3.8GHz processor, and you should skip the grouping to speed up the process. You can also find a description in "Comprehensive functional analysis of large lists of genes and proteins". The idea of ClueGO is to get the most interesting terms for your gene/protein selection, so you should try to be more restrictive with the number and percentage of genes per term (this also depends on the number of genes you want to map). You can set this under "Advanced Term/Pathway Selection Options". Try different settings and you will see that you end up with a reasonable number of terms for your gene list.
Best

acz...@gmail.com

Jan 23, 2019, 5:20:34 AM
to cytoscape-helpdesk
Dear Bernhard,

I am much obliged for your prompt reply. I have been using ClueGO for quite a while now and I can appreciate the philosophy behind its design. However, I am currently trying to associate specific GO terms with members of a list of approximately 350 proteins. I know from a different analysis that these terms are related to certain proteins on that list, and through this process I am aiming to find out which specific proteins are related to the GO terms of interest. Since the parameters I usually use do not assign any proteins to these GO terms, I am using more global settings so that these terms are also included in the analysis results. If you can point me to a different way to do this that consumes less time and fewer resources, I would very much appreciate it. At this very moment I have had a ClueGO session running since last night (over 12 hours now) for a reported number of nodes just under 6,000. Is this normal? Should I terminate the procedure, or must I keep waiting, and for how much longer, if such an assessment can be made? Are there any hardware components other than CPU and RAM (graphics card properties, HDD speed, internet speed) that could affect ClueGO's performance?
Thank you so much for your time.

Best regards,

Bernhard

Jan 23, 2019, 7:58:45 AM
to cytoscape-helpdesk
Dear Athanassios,
I think no analysis should run longer than one hour, even on a slow computer. Since ClueGO and Cytoscape are quite RAM intensive, RAM is the limiting factor. We ran an example again with 3600 terms to display (without grouping) and it took ~10 min on an i7 at ~4GHz with 32GB RAM (~13GB used by Cytoscape and the OS itself after the network was finished). So I think you would probably need more than 16GB RAM for 6000 terms, and it would run for a while. Your system is therefore probably out of RAM, or Cytoscape froze at some point. First try to run no more than 1600 terms and check whether it terminates, then increase the number of terms to see when the limit is reached. Don't forget to switch off the grouping when you run lots of terms.
Let us know if it works.
Best
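
The step-up procedure suggested above (start at about 1600 terms, confirm the run terminates, then raise the term count until it no longer does) can be sketched as a small Python loop. This is only an illustration, not part of ClueGO or Cytoscape: `run_analysis` is a hypothetical callable that you supply yourself (for example, a function that records whether a manual run finished within your time budget), and the step size of 400 terms is an arbitrary assumption. Only the 1600 and 6000 figures come from the thread.

```python
from typing import Callable, Optional


def find_term_limit(
    run_analysis: Callable[[int], bool],
    start: int = 1600,      # starting term count suggested in the thread
    step: int = 400,        # assumed increment; choose whatever suits you
    max_terms: int = 6000,  # roughly the node count reported in the thread
) -> Optional[int]:
    """Return the largest tested term count for which run_analysis succeeded.

    run_analysis(n) is a hypothetical callable: it should return True if an
    analysis restricted to n terms finished within your time budget, and
    False if it froze or ran out of memory. Returns None if even the
    starting term count failed.
    """
    last_ok: Optional[int] = None
    n = start
    while n <= max_terms:
        if not run_analysis(n):
            break  # first failure: the limit lies between last_ok and n
        last_ok = n
        n += step
    return last_ok
```

For example, with a stand-in such as `find_term_limit(lambda n: n <= 3600)`, the loop tests 1600, 2000, and so on, and stops at the first count that fails. In practice each call to `run_analysis` would correspond to one ClueGO run with grouping switched off.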