CRISPR Screen hit analysis GSEA preranked


korbini...@googlemail.com

Oct 11, 2023, 7:32:20 PM
to gsea-help
Hi,

I would appreciate feedback on whether I'm using the right approach. I have a CRISPR screen hit file that I want to test with GSEA preranked. These are my steps:

1) Start with the log fold change per gene, averaged across all guides (excluding non-targeting controls)
2) Convert this file into a tab-delimited .rnk file (save as tab-delimited text and change the extension to .rnk)
3) Load it into GSEA (I can open it in the data viewer and it looks fine)
4) Select the databases I'm interested in (GO-BP, KEGG, REACTOME)
5) Set "Collapse/Remap to gene symbols" to "No_Collapse"
6) Run GSEA preranked
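
Steps 1 and 2 above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the "Non-Targeting" control label, and the toy guide values are hypothetical, and the output is assumed to be the plain two-column (gene, score) tab-delimited layout of a .rnk file with no header row.

```python
import csv
import io

def average_lfc_per_gene(guide_rows, control_label="Non-Targeting"):
    """Average guide-level log fold changes per gene, skipping control guides.

    guide_rows is an iterable of (gene, lfc) pairs; control_label is a
    hypothetical name for the non-targeting control entries.
    """
    sums, counts = {}, {}
    for gene, lfc in guide_rows:
        if gene == control_label:
            continue  # controls are excluded from the ranking
        sums[gene] = sums.get(gene, 0.0) + lfc
        counts[gene] = counts.get(gene, 0) + 1
    return {gene: sums[gene] / counts[gene] for gene in sums}

def write_rnk(gene_scores, handle):
    """Write gene<TAB>score lines, sorted from highest to lowest score."""
    ranked = sorted(gene_scores.items(), key=lambda kv: kv[1], reverse=True)
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for gene, score in ranked:
        writer.writerow([gene, f"{score:.6g}"])
    return ranked

# Toy data, invented purely for illustration.
guides = [
    ("TP53", 1.2), ("TP53", 0.8),
    ("MYC", -2.0), ("MYC", -1.0),
    ("Non-Targeting", 0.1),
]
scores = average_lfc_per_gene(guides)
buffer = io.StringIO()  # swap for open("screen.rnk", "w", newline="") to write a file
write_rnk(scores, buffer)
rnk_text = buffer.getvalue()
```

Writing through a file handle rather than renaming a spreadsheet export avoids stray quoting or extra columns that a "save as" step can introduce.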

The spreadsheet looks fine and the pathways make sense. Is this the correct approach? Also, how do you deal with FDR = 0? I've arbitrarily set it to 1/(number of permutations), with 1000 permutations.
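
The FDR floor described above can be sketched as a small Python helper. This mirrors only the convention stated in this post (replace a reported FDR of exactly 0 with 1/number of permutations); it is not an official GSEA recommendation, and the function name is hypothetical.

```python
def floor_zero_fdr(fdr_values, n_permutations=1000):
    """Replace FDR values of exactly 0 with 1 / n_permutations.

    A reported 0 only means no permuted score was as extreme as the
    observed one, so the smallest resolvable value is used instead.
    """
    floor = 1.0 / n_permutations
    return [floor if q == 0 else q for q in fdr_values]

# Example values, invented for illustration.
adjusted = floor_zero_fdr([0.0, 0.05, 0.2], n_permutations=1000)
```

Values floored this way are probably best reported as "FDR < 0.001" rather than as an exact estimate.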

Thank you!
Korbinian

Anthony Castanza

Oct 12, 2023, 6:24:43 PM
to gsea-help
Hi Korbinian,

This isn't a scenario we have specific recommendations for, so I'd need a little more information to determine whether the experiment is a reasonable use of GSEA.
Are your guides targeting many different genes, or is this a screen of many guides for the same gene?
Did you perform genome-wide RNA-seq on each sample?

In general, as long as you have information for all expressed genes, it should be reasonable to perform GSEA.
If you have many samples where you would expect the targeting to result in the same phenotype (e.g., knockdown of a specific gene), as well as non-targeted controls, I might recommend that you perform standard two-phenotype GSEA instead, but it would depend on the specifics of your CRISPR screen.

I also probably wouldn't recommend using the "No_Collapse" option. You generally want to use Collapse with an appropriate chip file, even if your data is already in gene symbols, as it helps ensure that the specific versions of the gene symbols match those in MSigDB as precisely as possible.

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

korbini...@googlemail.com

Oct 13, 2023, 12:23:09 PM
to gsea-help
Hi Anthony,

Thank you so much for your reply and for following up. I did not perform RNA-seq on my samples.

I ran a whole-genome CRISPR knockout screen and ended up with a list of 18,797 genes, each with a fold change value assigned (some genes targeted by the library are not included because they are essential and I don't have reads for them). The fold change was calculated with the MAGeCK algorithm by comparing the guide reads of the replicates of two conditions. My idea was to use this list with the GSEA preranked module to see whether gene sets are enriched at either end of the ranking. Is this a reasonable use of GSEA preranked?

Thanks also for highlighting the importance of not using the "No_Collapse" option.

Best and thank you for your response,
Korbinian.