Strand information in GTF


samha...@gmail.com

Jun 15, 2016, 10:07:53 AM
to mitranscriptome
Hi Yashar,

We are currently trying to use the GTF file for a differential expression analysis project. However, I notice that a small but significant number of features lack strand information (i.e. column 7 should be + or -, but for some features it is .).
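For anyone wanting to check how many features are affected, here is a minimal sketch in Python that tallies the strand column. It assumes a standard tab-delimited, 9-column GTF; the function name `count_strands` and the three example records are made up for illustration and are not taken from the MiTranscriptome file.

```python
from collections import Counter


def count_strands(gtf_lines):
    """Tally the values found in the strand column (column 7) of GTF records."""
    counts = Counter()
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A well-formed GTF record has 9 tab-separated columns.
        if len(fields) < 9:
            continue
        counts[fields[6]] += 1  # column 7 is index 6
    return counts


# Hypothetical records, for illustration only.
example_records = [
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tdemo\texon\t300\t400\t.\t-\t.\tgene_id "G2"; transcript_id "T2";\n',
    'chr1\tdemo\texon\t500\t600\t.\t.\t.\tgene_id "G3"; transcript_id "T3";\n',
]

strand_counts = count_strands(example_records)
print(dict(strand_counts))
```

To run it on a real file, the same function can be given an open file handle, e.g. `count_strands(open(path))`, where `path` is wherever the GTF lives.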

As a result, we are having trouble aligning our reads with TopHat2, as it requires a GTF with strand information.

Do you have a version which has the complete strand information for all lncRNAs, please?

Look forward to your reply and best wishes,

Sam

Yashar Niknafs

Jun 15, 2016, 4:24:47 PM
to mitranscriptome
Hello,

An unstranded transcript means it was built from underlying RNA-seq data with no strand information, so there is no way to impute its strand.

It is a tricky thing. Such a transcript could be a real, previously unidentified monoexonic gene, but that kind of transcription can be hard to investigate. If we had more strand-specific data, we could get more clarity on these types of transcripts, but with what we have at our disposal now, it's a bit tough.

-Yashar 

samha...@gmail.com

Jun 16, 2016, 5:33:51 AM
to mitranscriptome
Hi Yashar,

Thank you very much for your quick response and for your explanation. I think for now we will either omit these targets or perhaps treat them as 'unstranded' and accept hits on either strand.
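In case it helps others, the "omit these targets" option could look roughly like the sketch below. It assumes a standard tab-delimited GTF in which column 7 is +, - or .; the name `split_by_strand` and the sample records are hypothetical, and this is only one way of doing it, not anything provided with the annotation.

```python
def split_by_strand(gtf_lines):
    """Separate GTF records into stranded (+/-) and unstranded (anything else)."""
    stranded, unstranded = [], []
    for line in gtf_lines:
        # Ignore blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a 9-column GTF record
        if fields[6] in ("+", "-"):
            stranded.append(line)
        else:
            unstranded.append(line)
    return stranded, unstranded


# Hypothetical records, for illustration only.
sample_records = [
    'chr2\tdemo\texon\t10\t90\t.\t+\t.\tgene_id "A"; transcript_id "A.1";\n',
    'chr2\tdemo\texon\t150\t240\t.\t.\t.\tgene_id "B"; transcript_id "B.1";\n',
    'chr2\tdemo\texon\t300\t380\t.\t-\t.\tgene_id "C"; transcript_id "C.1";\n',
]

kept, dropped = split_by_strand(sample_records)
print(len(kept), "stranded records kept;", len(dropped), "unstranded records set aside")
```

The stranded list can then be written out as a new GTF for the aligner, while the unstranded list is kept separately in case we want to revisit those features later.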

Thanks again and best wishes,

Sam