Advice on dealing with edited genome

Matt K

Aug 29, 2023, 12:55:06 PM
to rna-star
Hi,

First, big thanks to Alex and the group for helping me over the years!

I have a question on how to deal with an edited genome. I have a mouse gene in which one exon has been replaced with the human version. The human exon is slightly shorter than the mouse exon, so if I replace the endogenous genomic sequence with the edited sequence, the gtf coordinates of everything downstream on that chromosome will be wrong.

My approach was:

1) Add a new chromosome to the fasta that represents the edited gene locus sequence and add a corresponding entry to the gtf for this new chromosome/gene
2) Remove the endogenous gene from the gtf to avoid multiple mapping issues
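
Step 2 above could be sketched roughly as follows. This is only an illustration, assuming a standard GTF in which every feature line carries a gene_id "..." attribute in column 9; the function name drop_gene and the gene IDs in the usage lines are hypothetical.

```python
import re

# Matches the gene_id attribute in GTF column 9, e.g. gene_id "GENE_A";
GENE_ID_PATTERN = re.compile(r'gene_id "([^"]+)"')


def drop_gene(gtf_lines, gene_id):
    """Yield GTF lines, skipping every feature annotated with `gene_id`.

    Comment lines (starting with '#') and lines without a gene_id
    attribute are passed through unchanged.
    """
    for line in gtf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        match = GENE_ID_PATTERN.search(fields[8]) if len(fields) >= 9 else None
        if match and match.group(1) == gene_id:
            continue  # drop gene, transcript, exon, CDS ... lines of this gene
        yield line


# Usage with two made-up records (gene IDs are placeholders)
example = [
    "#!genome-build example\n",
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "T1";\n',
    'chr1\tsrc\texon\t500\t600\t.\t+\t.\tgene_id "GENE_B"; transcript_id "T2";\n',
]
kept = list(drop_gene(example, "GENE_A"))
```

Note that this only removes the annotation; the endogenous sequence itself stays in the fasta.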

My questions are:

1) The endogenous sequence is still present, but not associated with a gtf entry. Will this still be flagged as a multiple mapper and not added to the count table, or would this work correctly?
2) More generally, is there an established way to create references from edited genomes? I imagine this is fairly prevalent now, but couldn't find any advice online.

Thanks for your time!

Best,
Matt

Alexander Dobin

Sep 5, 2023, 3:59:25 PM
to rna-star
Hi Matt,


1) The endogenous sequence is still present, but not associated with a gtf entry. Will this still be flagged as a multiple mapper and not added to the count table, or would this work correctly?

With the --quantMode GeneCounts option, a read that maps equally well to the endogenous sequence and to the new chromosome will be considered a multimapper and not counted.

2) More generally, is there an established way to create references from edited genomes? I imagine this is fairly prevalent now, but couldn't find any advice online.

I am not aware of an established way to do it. I would recommend masking the entire endogenous gene locus (exons and introns) with Ns and adding the edited gene locus. You will also need to transfer (and edit) all the annotations (exon lines) for the edited locus.
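
The masking suggestion could look roughly like the sketch below. It is a minimal illustration, not an established tool: the genome is assumed to be held in memory as a dict of sequence name to string, coordinates are 1-based and inclusive as in a gtf, and all function names, contig names and sequences are hypothetical.

```python
def mask_region(seq, start, end):
    """Return `seq` with the 1-based inclusive interval [start, end] replaced by Ns."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("interval lies outside the sequence")
    return seq[:start - 1] + "N" * (end - start + 1) + seq[end:]


def build_edited_reference(genome, chrom, start, end, new_name, new_seq):
    """Mask the endogenous locus on `chrom` and add the edited locus as a new contig.

    `genome` maps sequence names to sequences; a new dict is returned and
    the input is left untouched.
    """
    if new_name in genome:
        raise ValueError(f"sequence name {new_name!r} already exists")
    edited = dict(genome)
    edited[chrom] = mask_region(genome[chrom], start, end)
    edited[new_name] = new_seq
    return edited


def format_fasta(genome, width=60):
    """Render the genome dict as FASTA text with fixed-width sequence lines."""
    lines = []
    for name, seq in genome.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy usage: mask positions 5-8 of a 12 bp "chromosome" and add an edited contig
toy = {"chr1": "ACGTACGTACGT"}
edited = build_edited_reference(toy, "chr1", 5, 8, "edited_locus", "ACGTTT")
fasta_text = format_fasta(edited, width=6)
```

Because masking replaces bases one-for-one with Ns, the chromosome length is unchanged, so the gtf coordinates of all other genes on it stay valid. The exon lines for the edited gene still have to be rewritten by hand (or by a separate script) against the new contig.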