Add reporting on control sequences in RNA-seq

Gautam Naishadham

Sep 13, 2022, 4:13:34 PM
to QualiMap

Hi Konstantin,

I’d like to know if you would be open to adding reporting specifically on control spike-in sequences used in RNA-seq, such as ERCCs[1] and SIRVs[2]. In addition to providing information on absolute and relative transcript quantification, these provide a known ground truth for metrics such as correct strandedness and 5'-3' coverage evenness across transcripts.

Such a module might take as input a BED file containing spike-in reference information along with a column with the expected concentration of each synthetic transcript. Relevant quality metrics could then be compiled specifically for these transcripts from the input BAM, and an observed-vs-expected correlation plot could be generated using observed counts and expected concentrations.
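
To make the observed-vs-expected idea concrete, here is a minimal Python sketch. It is not QualiMap code, and everything about the layout is an assumption: the BED file is taken to have five tab-separated columns (chrom, start, end, name, expected concentration), the observed counts are assumed to have already been tallied per spike-in from the BAM, and the function names, spike-in names, and numbers are hypothetical.

```python
import csv
import io
import math


def read_spikein_bed(handle):
    """Parse an assumed 5-column BED: chrom, start, end, name, expected concentration."""
    expected = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and BED header/comment lines.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        expected[row[3]] = float(row[4])
    return expected


def log_log_pearson(expected, observed, pseudocount=1.0):
    """Pearson r between log2(expected concentration) and log2(observed count + pseudocount).

    Only spike-ins present in both dictionaries are used. Expected
    concentrations must be strictly positive.
    """
    names = [name for name in expected if name in observed]
    if len(names) < 2:
        raise ValueError("need at least two shared spike-ins")
    xs = [math.log2(expected[name]) for name in names]
    ys = [math.log2(observed[name] + pseudocount) for name in names]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant values")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical spike-ins and made-up numbers, for illustration only.
    bed = io.StringIO(
        "spike_A\t0\t1000\tspike_A\t1.0\n"
        "spike_B\t0\t800\tspike_B\t4.0\n"
        "spike_C\t0\t1200\tspike_C\t16.0\n"
    )
    counts = {"spike_A": 12, "spike_B": 40, "spike_C": 170}
    print(log_log_pearson(read_spikein_bed(bed), counts))
```

The correlation is computed on log2 values because spike-in concentrations typically span several orders of magnitude; the pseudocount (an arbitrary choice here) keeps undetected spike-ins from producing log(0).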

I would be happy to contribute to the development of this enhancement if this is something you think might be a good fit in QualiMap.

-Gautam

  1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3166838/

  2. https://www.biorxiv.org/content/10.1101/080747v1

Konstantin Okonechnikov

Sep 15, 2022, 5:41:17 AM
to qual...@googlegroups.com
Hi Gautam,

thanks a lot for the nice suggestion for a Qualimap extension! I fully agree that it would be a useful improvement for the tool; for example, it could be included as an option for the RNA-seq QC mode. Unfortunately, I would lack the time for development on my side, with too much other research work ongoing... You noted that you could contribute: are you interested in direct development? The project is fully open-source, so you could create your own development branch from the repo, and then we could easily merge it into the main version. In that case, please let me know what information from my side would be useful.

There might be other opportunities: for example, I could offer this as a project for a master's student, and I should have such an opportunity quite soon (end of September / start of October). But in that case help from your side would also be useful. For example, do you already have clear test datasets that could serve as a control for such an option? Or any plots / stats that could serve as examples for the report?

Best regards,
   Konstantin


 


Gautam Naishadham

Sep 16, 2022, 3:25:16 PM
to QualiMap
Great to hear your support for this! I am definitely interested in direct development, but I will look through the codebase and let you know to what extent (if any) it might be worth engaging a student. Either way, I'll begin putting together some test datasets that include reads from ERCCs and SIRVs in a human background.

Best,
Gautam
