Salmon and DESeq2


Viktor

Dec 29, 2021, 4:09:37 PM
to GenePattern Help Forum
Hello,

Can I use the NumReads column from the *quant.genes.sf file generated by Salmon as input for DESeq2?

Thank you,
Viktor

Anthony Castanza

Dec 29, 2021, 4:16:22 PM
to genepatt...@googlegroups.com
Hi Viktor,

The general recommendation from both the Salmon and DESeq2 developers is not to do this. The recommended pipeline is to provide the transcript-level quant.sf files to tximport and then to provide the tximport object, which contains both estimated counts and transcript length information, to DESeq2. We've implemented that pipeline in the tximport.DESeq2 module available here: https://cloud.genepattern.org/gp/pages/index.jsf?lsid=urn:lsid:8080.gpserver.ip-172-31-26-71.ip-172-31-26-71.ec2.internal:genepatternmodules:179:3.2.3

That module has a warning about it being specifically for a workshop, but it should work well enough for general use.

If you only have the gene-level quant.genes.sf files, then you could probably use the counts with DESeq2, but again, that isn't really recommended. I can't remember whether our module automatically rounds the counts in that case or errors when the counts are not integers (erroring on non-integer counts is the default DESeq2 behavior).
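
To illustrate why the transcript-level files carry more information than NumReads alone, here is a rough Python sketch of a gene-level summary that keeps both summed counts and an abundance-weighted average length. It is not the tximport code or the module's code. The column names (Name, EffectiveLength, TPM, NumReads) are the ones Salmon writes to quant.sf; the function name, the transcript-to-gene map, and all of the values are made up for illustration.

```python
from collections import defaultdict

def summarize_to_gene(rows, tx2gene):
    """Sum transcript counts per gene and compute a TPM-weighted average length.

    Hypothetical helper for illustration only; not tximport's implementation.
    """
    counts = defaultdict(float)
    tpm = defaultdict(float)
    weighted_len = defaultdict(float)
    for r in rows:
        gene = tx2gene[r["Name"]]
        counts[gene] += r["NumReads"]
        tpm[gene] += r["TPM"]
        weighted_len[gene] += r["TPM"] * r["EffectiveLength"]
    result = {}
    for gene in counts:
        # No abundance means no meaningful weighted length in this sketch.
        avg_len = weighted_len[gene] / tpm[gene] if tpm[gene] > 0 else None
        result[gene] = {"count": counts[gene], "avg_length": avg_len}
    return result

# Made-up transcript-level rows for one sample.
rows = [
    {"Name": "tx1", "EffectiveLength": 1000.0, "TPM": 30.0, "NumReads": 300.5},
    {"Name": "tx2", "EffectiveLength": 2000.0, "TPM": 10.0, "NumReads": 200.25},
    {"Name": "tx3", "EffectiveLength": 500.0, "TPM": 0.0, "NumReads": 0.0},
]
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}  # hypothetical mapping

gene_summary = summarize_to_gene(rows, tx2gene)
print(gene_summary)
```

The gene-level NumReads column only gives you the summed count; the average length, which shifts when a gene's isoform usage changes between samples, has to come from the transcript-level data.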

-Anthony

Anthony S. Castanza, PhD
Mesirov Lab, Department of Medicine
University of California, San Diego


Viktor

Dec 29, 2021, 4:37:17 PM
to GenePattern Help Forum
Hi Anthony,

Thank you for the quick reply. I will take a look at the module you suggested.

I can report that the GenePattern DESeq2 module accepts NumReads from Salmon as they are, without my having to round them to the nearest integer and without generating an error message or warning. The DESeq2 results with and without pre-rounding are not identical but are very similar; feel free to look at jobs 401567 and 401570.

Best,
Viktor

Anthony Castanza

Dec 29, 2021, 5:01:01 PM
to genepatt...@googlegroups.com
Hi Viktor,

I took a look at the code. The DESeq2 module first keeps all genes with >1 count across all samples and then rounds the count data to integers. Rounding first and supplying the pre-rounded counts reverses that order and retains genes that would not normally pass the >1 count filter. I'd still recommend using the transcript-level quant.sf file with the tximport.DESeq2 module as the preferred pipeline, but if you really must use the gene-level file, I'd recommend using the unrounded counts and letting the module handle the filtering and rounding in that order.
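
A toy Python sketch of why the order of the two steps matters. This is not the module's code: the filter is assumed here to be "total count across samples > 1", the function names are hypothetical, and the counts are made up.

```python
def filter_then_round(counts, min_total=1):
    """Filter on the raw (fractional) totals, then round what survives."""
    kept = {g: v for g, v in counts.items() if sum(v) > min_total}
    return {g: [round(x) for x in v] for g, v in kept.items()}

def round_then_filter(counts, min_total=1):
    """Round every value first, then filter on the rounded totals."""
    rounded = {g: [round(x) for x in v] for g, v in counts.items()}
    return {g: v for g, v in rounded.items() if sum(v) > min_total}

# Made-up fractional counts for three genes across three samples.
counts = {
    "geneA": [10.3, 12.7, 9.0],
    "geneB": [0.4, 0.4, 0.4],
    "geneC": [0.2, 0.0, 0.1],
}

filtered_first = filter_then_round(counts)
rounded_first = round_then_filter(counts)

print(sorted(filtered_first))
print(sorted(rounded_first))
```

With these values the two orders disagree only on the low-count geneB, which is why results from rounded and unrounded input come out similar but not identical. Which genes are gained or lost in a real dataset depends on the module's actual filter, which is not reproduced here.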

-Anthony

Anthony S. Castanza, PhD
Mesirov Lab, Department of Medicine
University of California, San Diego

Viktor

Dec 29, 2021, 5:09:55 PM
to GenePattern Help Forum
Hi Anthony,

Sounds good, thank you for your help.

Viktor