We verified the ClueGO results once again and found that there is a
bug in the log. Thank you for reporting this problem.
To get all annotations for your mutation list, I ran a ClueGO
analysis using GO Biological Process and Molecular Function. I think
these sources cover most (if not all) annotations, including
predictions (IEA evidence code). I mapped the identifiers across all
GO levels, requiring at least 1 gene per term and applying no minimum
percentage (there are terms with less than 1% found genes). I also
included the genes without annotations in the network. Please see the
attached example.
Your list of mutant-specific identifiers contained 1047 NM_
transcript ids, some of which were included more than once. The list
contains 972 unique NM_ transcript identifiers, all of which were
recognized in ClueGO (an EntrezGeneID was found for each of these
ids).
The 972 transcript ids corresponded to 725 unique EntrezGeneIDs (the
main id type used in ClueGO).
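
To illustrate why the three counts differ, here is a minimal sketch of the deduplication and mapping steps. The ids and the `transcript_to_gene` dictionary are hypothetical toy data, and `summarize` is an illustrative helper, not ClueGO's own code or conversion files.

```python
# Hypothetical toy data: illustrative only, not taken from the uploaded
# list or from ClueGO's conversion files.
uploaded_ids = ["NM_toy1", "NM_toy2", "NM_toy1", "NM_toy3", "NM_toy4"]

# Several transcripts can belong to the same gene, so the number of
# unique genes can be smaller than the number of unique transcript ids.
transcript_to_gene = {
    "NM_toy1": 101,
    "NM_toy2": 101,  # second transcript of the same (toy) gene
    "NM_toy3": 102,
    "NM_toy4": 103,
}


def summarize(ids, mapping):
    """Return (uploaded, unique ids, recognized ids, unique genes)."""
    unique_ids = set(ids)                               # drop duplicates
    recognized = {i for i in unique_ids if i in mapping}  # ids with a gene
    genes = {mapping[i] for i in recognized}            # collapse to genes
    return len(ids), len(unique_ids), len(recognized), len(genes)


print(summarize(uploaded_ids, transcript_to_gene))
```

In this toy case, 5 uploaded ids reduce to 4 unique ids and then to 3 unique genes, the same pattern as 1047, 972 and 725 for your list.
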
Of the 725 genes, 615 were annotated in BP and/or MF, which
represents 84.83% of the found genes. 110 genes (15.17%) have no
annotation.
So the results obtained with ClueGO are correct; you can see the
network and the corresponding table. The bug affected only the
calculation of the number and percentage of found genes reported in
the log. The initial calculation, which reported the number and
percentage of found genes relative to the unique uploaded identifiers
(here 972 ids), has now been changed to use the unique genes (725
genes). In fact, 15% of the 725 genes from your list have no
annotations in GO BP and MF. Among these are RIKEN cDNA ids,
predicted genes and miRNAs. The 36% indicated in the log was not
correct.
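
The arithmetic can be checked with a short sketch using the counts given above. The reconstruction of the old log value is an assumption (annotated genes compared against the 972 unique uploaded ids), not the actual ClueGO code, and the variable names and the `pct` helper are hypothetical.

```python
# Counts taken from this message.
unique_uploaded_ids = 972   # unique NM_ transcript ids
unique_genes = 725          # unique EntrezGeneIDs
annotated_genes = 615       # genes annotated in GO BP and/or MF


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Corrected calculation: the denominator is the number of unique genes.
missing_genes = unique_genes - annotated_genes
found_pct = pct(annotated_genes, unique_genes)
missing_pct = pct(missing_genes, unique_genes)

# Assumed reconstruction of the old log value: the denominator was the
# number of unique uploaded ids instead of the number of unique genes.
old_missing_pct = pct(unique_uploaded_ids - annotated_genes,
                      unique_uploaded_ids)

print(missing_genes, found_pct, missing_pct, old_missing_pct)
```

Under this assumption the old denominator gives about 36.7% missing, in line with the 36% shown in the log, while the corrected denominator gives about 15%.
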
Functional enrichment results rely on correctly annotated
genes/proteins and on recent ontologies/pathways. For this reason,
ClueGO provides automatic updates of annotation and ontology sources.
Users can thus analyze their genes/proteins in the context of the
latest NCBI gene info, together with up-to-date GO, KEGG, Reactome
and WikiPathways. In general, most of the genes are recognized. We
also provide additional conversion files, e.g. EntrezGeneID to
UniProt or Affymetrix, which are included in the organism archive or
are available to download within ClueGO.
Note also that ontologies are more detailed in certain areas:
well-known genes with established functions have many functional
annotations in many sources, whereas other genes have scarce
functional information. Ontologies are continuously improved; see for
example the GO project.
Best