ClueGO for rainbow trout results


mortega-...@umh.es

Nov 2, 2017, 8:27:51 AM
to cytoscape-helpdesk
Dear Cytoscape-helpdesk,
We would like to carry out a ClueGO analysis with RNA-seq and proteomics results from Oncorhynchus mykiss samples. Could you please let us know the best way to perform a ClueGO analysis for this organism?
With best regards,
Maria

Bernhard

Nov 2, 2017, 9:17:13 AM
to cytoscape-helpdesk
Dear Maria, we will add Oncorhynchus mykiss to the downloadable organisms as soon as possible and let you know.
Best

Bernhard

Nov 6, 2017, 12:48:39 PM
to cytoscape-helpdesk
Dear Maria, we have added Oncorhynchus mykiss (tax:8022) to the organism list. Let us know if it works.
Best

joseph....@noaa.gov

Dec 6, 2017, 7:00:36 PM
to cytoscape-helpdesk
Hello Bernhard,
Following up on Maria's question: I see that O. mykiss has been added, but which steps of the 'Walkthrough' in the ClueGO documentation must be added or modified for a newly added organism?
I have loaded O. mykiss under 'Load Marker List'.
I have selected "UniProtKB_AC" as the identifier type.
I have selected a list of identifiers generated from a Blastx alignment against the rainbow trout UniProt database.

However, when I attempt to run an enrichment analysis I repeatedly get the error message: "There were no GO terms found for this selection! Please choose less restrictive parameters under 'Advanced Settings' and re-run ClueGO!"
My settings include:
GO Tree Interval of 0 to 20
GO Term/Pathway connectivity of 0.3

What step am I missing and what could be the source of the error?
Thank you,
Joe

joseph....@noaa.gov

Dec 6, 2017, 7:25:39 PM
to cytoscape-helpdesk
Bernhard,
You may ignore the specific issue with my rainbow trout file. I opened the gene2accession and gene2uniprot configuration files that were downloaded to my computer and found no matches for my identifiers, which indicates that my identifier file is incorrect.
However, my question about additional steps or modifications to the walkthrough for non-model organisms still stands.
Thank you.
joe
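
The check described above (looking for query identifiers in the downloaded mapping files) could be scripted roughly as follows. This is only a minimal sketch: the tab-delimited layout, the column position of the accession, and all identifiers shown are assumptions for illustration, not the actual layout of ClueGO's gene2uniprot file.

```python
import csv
import io


def load_mapping_ids(handle, id_column):
    """Collect the identifiers found in one column of a tab-delimited file."""
    ids = set()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip empty lines and comment/header lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        if len(row) > id_column:
            ids.add(row[id_column].strip())
    return ids


def match_report(query_ids, known_ids):
    """Return (number matched, number of queries, sorted unmatched queries)."""
    queries = {q.strip() for q in query_ids if q.strip()}
    matched = queries & known_ids
    return len(matched), len(queries), sorted(queries - known_ids)


# Toy stand-in for a mapping file; layout and IDs are hypothetical.
mapping_text = "#GeneID\tUniProtKB_AC\n1001\tACC1\n1002\tACC2\n"
known = load_mapping_ids(io.StringIO(mapping_text), id_column=1)

# 'sp|ACC2|NAME' mimics a full FASTA-style header kept from BLAST output,
# which would not match a bare accession.
n_matched, n_total, missing = match_report(["ACC1", "sp|ACC2|NAME", "ZZZ9"], known)
print(f"{n_matched} of {n_total} identifiers matched; unmatched: {missing}")
```

Listing the unmatched identifiers makes it easier to see whether the mismatch is a formatting issue (for example, extra prefixes around the accession) or a genuine absence from the mapping.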

Bernhard

Dec 7, 2017, 8:13:04 AM
to cytoscape-helpdesk
Hi Joe,
If no terms are found, the best approach is to select all ontology levels and all genes in the 'Advanced Settings'; that gives you the maximum possible number of hits. It may be that some (or, in some cases, many) of the UniProt IDs have no NCBI (Entrez) gene ID associated. We use the information provided by UniProt, and unfortunately for rainbow trout only ~1300 out of 50000 proteins have an NCBI gene association so far (see attached xls file). So only those of the 1300 that also have a GO annotation would be considered for the enrichment. The GO annotations were taken from QuickGO (https://www.ebi.ac.uk/QuickGO-Old/). If you know of better annotation sources, let us know and we can make a custom file for you. To get more hits, one could also use the UniProt IDs as unique IDs, but since one gene can have several protein associations this would also alter the enrichment, because you would get many more hits for the same gene. That is why we prefer gene IDs as unique IDs.
Best
uniprot-oncorhynchus+mykiss.xls.zip
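
To illustrate the point above about gene IDs versus UniProt IDs as the unique unit, here is a small sketch that counts hits per term both ways. All identifiers, term names, and mappings are made-up toy data, and the counting is a simplification for illustration, not ClueGO's actual enrichment procedure.

```python
from collections import defaultdict

# Hypothetical toy data: three proteins belong to the same gene.
protein_to_gene = {"ACC1": "gene1", "ACC2": "gene1", "ACC3": "gene1", "ACC4": "gene2"}
protein_to_terms = {
    "ACC1": {"TERM_A"},
    "ACC2": {"TERM_A"},
    "ACC3": {"TERM_A"},
    "ACC4": {"TERM_B"},
}


def hits_per_term(protein_ids, unit_of):
    """Count distinct units (proteins or genes) annotated to each term."""
    units = defaultdict(set)
    for acc in protein_ids:
        for term in protein_to_terms.get(acc, ()):
            units[term].add(unit_of(acc))
    return {term: len(members) for term, members in units.items()}


query = ["ACC1", "ACC2", "ACC3", "ACC4"]
by_protein = hits_per_term(query, lambda acc: acc)
by_gene = hits_per_term(query, lambda acc: protein_to_gene[acc])

print("hits counted per protein:", by_protein)
print("hits counted per gene:   ", by_gene)
```

In this toy case TERM_A gets three hits when proteins are the unit but only one when genes are the unit, which is the kind of inflation the reply warns about.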

joseph....@noaa.gov

Jan 3, 2018, 6:19:50 PM
to cytoscape-helpdesk
Hello Bernhard,
If you have a moment, I would like to circle back to this discussion. I have been working with ClueGO for a few weeks now. I took a break from the rainbow trout annotation and used the GO and KEGG annotations for Atlantic salmon (Salmo salar) instead. Although this species is less similar to my organism than rainbow trout, it has a greater number of NCBI genes associated with it and therefore more GO annotations as well as KEGG pathways. I have been very happy with the results thus far. However, for completeness, I would like to return to rainbow trout for comparison. I have a couple of questions aimed at increasing the GO annotations for rainbow trout with my specific dataset:
1. I have constructed a file of QuickGO annotations for my rainbow trout genes with UniProt IDs, compiled for all of my clusters (attached). Is there a way to turn this into custom configuration and identifier files? If so, how?
2. If that is not possible, can you elaborate on how to make the UniProt IDs the unique IDs (as you mentioned in your previous post), recognizing the risk of modified enrichment due to multiple protein associations?

Bernhard

Jan 4, 2018, 8:19:25 AM
to cytoscape-helpdesk
Hi Joseph,
I guess the problem is the link between UniProt and Entrez/NCBI gene IDs. If you send us your custom UniProt-to-GO ID file, we can make you a custom annotation file based on UniProt IDs. But as you said, you have to keep in mind that the enrichment will be slightly biased due to the 1:n associations of genes to proteins. The customization of GO annotation files is not yet automated, so we can do it for you.
Best
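
A UniProt-to-GO ID file of the kind requested above could be assembled from a tab-delimited annotation export along these lines. This is a hedged sketch: the column names "GENE PRODUCT ID" and "GO TERM" are assumptions about the export header and should be adjusted to the actual file, the IDs are placeholders, and the two-column output is only a generic intermediate table, since the thread does not describe the format ClueGO itself requires.

```python
import csv
import io
from collections import defaultdict


def collect_annotations(handle, id_col="GENE PRODUCT ID", go_col="GO TERM"):
    """Build {accession: set of GO IDs} from a tab-delimited export with a header row."""
    table = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        acc = (row.get(id_col) or "").strip()
        term = (row.get(go_col) or "").strip()
        if acc and term:
            table[acc].add(term)  # the set drops duplicate annotation rows
    return table


def write_table(table, handle):
    """Write one 'accession<TAB>GO ID' line per unique pair, sorted."""
    for acc in sorted(table):
        for term in sorted(table[acc]):
            handle.write(f"{acc}\t{term}\n")


# Toy export; accessions and GO IDs are placeholders, not real annotations.
export = (
    "GENE PRODUCT ID\tGO TERM\n"
    "ACC1\tGO:0000001\n"
    "ACC1\tGO:0000001\n"
    "ACC2\tGO:0000002\n"
)
table = collect_annotations(io.StringIO(export))
out = io.StringIO()
write_table(table, out)
print(out.getvalue(), end="")
```

Deduplicating the pairs before sending the file keeps annotations merged from several clusters from being counted more than once.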