Hi,
First, big thanks to Alex and the group for helping me over the years!
I have a question about how to handle an edited genome. I have a mouse gene in which one exon has been replaced with the human version. The human exon is slightly shorter than the mouse exon, so if I replace the endogenous genomic sequence with the edited sequence, the gtf coordinates of every downstream feature on that chromosome will be wrong.
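To make the coordinate problem concrete, here is a minimal Python sketch. The exon lengths and positions are made up for illustration (not my real data), and `shift_downstream` is a hypothetical helper, not part of any tool.

```python
def shift_downstream(features, edit_end, delta):
    """Shift (start, end) pairs that lie after the edited exon by delta bp.

    features: list of 1-based (start, end) tuples on the edited chromosome
    edit_end: last base of the original (mouse) exon
    delta:    new exon length minus old exon length (negative if shorter)
    """
    shifted = []
    for start, end in features:
        if start > edit_end:
            # Everything downstream of the edit moves by the length difference
            shifted.append((start + delta, end + delta))
        else:
            shifted.append((start, end))
    return shifted


# Hypothetical numbers: mouse exon is 201 bp (1000-1200), human exon is 190 bp
mouse_len, human_len = 201, 190
delta = human_len - mouse_len  # negative: downstream features move left

features = [(100, 200), (5000, 6000)]
print(shift_downstream(features, edit_end=1200, delta=delta))
```

So an in-place replacement would mean rewriting every annotation after the edit on that chromosome, which is what I was trying to avoid.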
My approach was:
1) Add a new chromosome to the fasta that contains the edited gene locus sequence, and add a corresponding entry to the gtf for this new chromosome/gene
2) Remove the endogenous gene from the gtf to avoid multiple mapping issues
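For reference, the two steps above could be sketched roughly like this in Python. The contig name `chrEdited`, the gene IDs `GeneX` and `GeneY`, and the toy sequences are hypothetical placeholders, and the helper functions are a simplified illustration rather than the exact commands I ran.

```python
import re


def add_contig(fasta_text, name, sequence, width=60):
    """Append a new FASTA record (the edited locus) to existing FASTA text."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    if fasta_text and not fasta_text.endswith("\n"):
        fasta_text += "\n"
    return fasta_text + f">{name}\n" + "\n".join(lines) + "\n"


def drop_gene(gtf_lines, gene_id):
    """Return GTF lines with every record of the given gene_id removed."""
    pattern = re.compile(r'gene_id "([^"]+)"')
    kept = []
    for line in gtf_lines:
        match = pattern.search(line)
        # Comment lines and records for other genes are kept unchanged
        if line.startswith("#") or not match or match.group(1) != gene_id:
            kept.append(line)
    return kept


def gtf_record(contig, feature, start, end, strand, gene_id, transcript_id):
    """Build one tab-separated GTF line (source column set to 'edited')."""
    attrs = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'
    return "\t".join(
        [contig, "edited", feature, str(start), str(end), ".", strand, ".", attrs]
    )


# Toy inputs standing in for the real genome and annotation
fasta = ">chr1\nACGTACGT\n"
gtf = [
    'chr1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "GeneX"; transcript_id "GeneX-201";',
    'chr1\tsrc\texon\t6\t8\t.\t+\t.\tgene_id "GeneY"; transcript_id "GeneY-201";',
]

# Step 1: add the edited locus as its own contig, with a matching annotation
new_fasta = add_contig(fasta, "chrEdited", "ACGTTT")
# Step 2: remove the endogenous gene from the annotation
new_gtf = drop_gene(gtf, "GeneX")
new_gtf.append(
    gtf_record("chrEdited", "exon", 1, 6, "+", "GeneX_edited", "GeneX_edited-201")
)
```

Note that in this sketch the endogenous sequence itself stays in the fasta; only its annotation is removed.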
My questions are:
1) The endogenous sequence is still present in the fasta, but no longer has a gtf entry. Will reads from this region still be flagged as multi-mappers and left out of the count table, or will this work correctly?
2) More generally, is there an established way to create references from edited genomes? I imagine this is fairly common now, but I couldn't find any advice online.
Thanks for your time!
Best,
Matt