No enriched GO terms in goseq analysis

Rafaela Perez

Sep 28, 2021, 7:30:53 PM

to trinityrnaseq-users

Hello all!

I followed Trinity's pipeline for transcriptome assembly and downstream analysis, as well as Trinotate's pipeline for annotation. Then I ran a GO term enrichment analysis on the differentially expressed genes, also following Trinity's pipeline (https://github.com/trinityrnaseq/trinityrnaseq/wiki/Running-GOSeq).

However, when I performed the enrichment with GOseq, no GO terms were enriched. I double-checked the steps to make sure this result was not caused by my providing the wrong files or the wrong arguments, but I could not find anything (the code is attached to this email). So now I'm trying to understand why I'm getting this result.

The dataset I'm working with has 2221 DEGs (DESeq2, p-value <= 0.05), of which 275 are annotated. When I ran goseq, all the adjusted p-values were 0.99999.... (EC.david and EC.genelist attached)

GOseq's first step is to fit a Probability Weighting Function (PWF) to obtain a weighting for each gene. After that, the enrichment is done. The function that fits the PWF generates a plot so you can verify the goodness of fit before continuing the analysis (this info can be found in the goseq package vignette). The image below is the graph I obtained.

My data do not fit the model, and I think this is why I'm not getting enriched GO terms. I'm not sure how to interpret this... it looks like there are a bunch of short genes that are very highly expressed and a few very long genes with almost no expression. But I'm confused about how that happened. Maybe this is simply a reflection of my de novo transcriptome assembly.
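One way to look at the same relationship numerically, outside of the PWF plot, is to bin genes by length and compute the fraction of DE genes in each bin. The snippet below is only a rough sketch of that idea in plain Python, not goseq's own fitting code; the function name, the number of bins, and the toy data are all made up for illustration.

```python
def de_fraction_by_length_bin(lengths, is_de, n_bins=5):
    """Sort genes by length, split them into n_bins roughly equal bins,
    and return (upper-median length, fraction of DE genes) for each bin."""
    pairs = sorted(zip(lengths, is_de))
    n = len(pairs)
    result = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # can happen when there are fewer genes than bins
        bin_lengths = [length for length, _ in chunk]
        median_length = bin_lengths[len(bin_lengths) // 2]
        fraction_de = sum(1 for _, de in chunk if de) / len(chunk)
        result.append((median_length, fraction_de))
    return result


# Toy data (hypothetical): short genes are mostly DE, long genes are not.
lengths = [200, 300, 400, 500, 1000, 1500, 2000, 3000, 5000, 8000]
is_de = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

bins = de_fraction_by_length_bin(lengths, is_de, n_bins=5)
for median_length, fraction_de in bins:
    print(median_length, fraction_de)
```

If the DE fraction falls steadily as length increases, as in this toy data, that would match the pattern described above; a flat profile would suggest little length bias.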

I really need the GO term enrichment analysis, so I'm trying to understand what is happening here and also looking for other R packages/platforms that perform this type of analysis (so far I've found GOglm, topGO and clusterProfiler, which are the only ones I've seen that do not require a reference organism).
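For comparison, the simplest form of this kind of analysis that needs no reference organism is a plain hypergeometric over-representation test on a custom gene-to-GO mapping, followed by Benjamini-Hochberg adjustment. The sketch below is a minimal Python illustration of that test, with no length-bias correction, so it is not what goseq computes by default. It assumes the background is the set of annotated genes only, and the gene IDs and GO labels are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are in the category."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(gene2go, de_genes):
    """Map each GO term to (p-value, adjusted p-value)."""
    background = set(gene2go)        # assumption: annotated genes only
    de = set(de_genes) & background  # unannotated DE genes are dropped
    N, n = len(background), len(de)
    terms = sorted({t for ts in gene2go.values() for t in ts})
    pvals = []
    for term in terms:
        members = {g for g, ts in gene2go.items() if term in ts}
        pvals.append(hypergeom_sf(len(members & de), N, len(members), n))
    return dict(zip(terms, zip(pvals, benjamini_hochberg(pvals))))


# Toy mapping (hypothetical IDs): g99 is DE but has no annotation.
gene2go = {
    "g1": ["GO:A"], "g2": ["GO:A"], "g3": ["GO:A"],
    "g4": ["GO:B"], "g5": ["GO:B"], "g6": ["GO:B"],
    "g7": ["GO:B"], "g8": ["GO:B"],
}
res = enrich(gene2go, ["g1", "g2", "g3", "g99"])
for term, (p, p_adj) in res.items():
    print(term, p, p_adj)
```

Because only annotated genes enter the test, the number of annotated DEGs rather than the total number of DEGs determines how much evidence is available for any one term.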

Has anyone had this type of problem before? Have you ever seen anything like this? Do you have suggestions for other R packages/platforms that perform GO term enrichment analysis without requiring a genome annotation file from a reference organism? Or suggestions on how to make GOseq work?

Thanks a lot in advance for your input, guys!

EC.genelist

EC.david

GO_enrichment_commands.txt