rn6 rat gtf question


Christa-Lynn Blenck

Apr 21, 2015, 16:43:56
to gen...@soe.ucsc.edu
Hello,

I recently used your genePredToGtf program on the rat rn6 genome to create a GTF file in which the gene ID and transcript ID differ. I got this to work and have been using the output GTF for my subsequent RNA-seq analysis. When I took a closer look at the GTF file today, I noticed something that looks a little strange. At the end of each gene, after the last exon or CDS is listed, I get the coordinates and info for the start and stop codons. The coordinates appear to be correct, but the exon number referenced in both the start and stop codon records is always (at least from what I have seen) exon number 1. Shouldn’t it be in an exon after number 1? I have attached an example in which it looks like the stop codon should be in exon 2, not exon 1. Is this just a bug in the genePredToGtf program, and is there a way to fix it? Will it affect my RNA-seq analysis so far?
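The check described above can also be done programmatically. The following is only a rough sketch, assuming Python, a standard nine-column tab-separated GTF file, and that exon records carry an `exon_number` attribute just as the codon records do. The function names and the demo records are hypothetical and are not taken from the attached file.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def parse_attrs(text: str) -> Dict[str, str]:
    """Parse the GTF attribute column into a dict, dropping quotes."""
    attrs = {}
    for part in text.strip().split(";"):
        part = part.strip()
        if part:
            key, _, value = part.partition(" ")
            attrs[key] = value.strip().strip('"')
    return attrs


def check_codon_exons(
    lines: Iterable[str], transcript_id: str
) -> List[Tuple[str, Optional[str], List[Optional[str]]]]:
    """For each start/stop codon of one transcript, return
    (feature, exon_number listed on the codon record,
     exon_numbers of the exons whose coordinates overlap the codon)."""
    exons, codons = [], []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        f = line.rstrip("\n").split("\t")
        feature, start, end = f[2], int(f[3]), int(f[4])
        attrs = parse_attrs(f[8])
        if attrs.get("transcript_id") != transcript_id:
            continue
        if feature == "exon":
            exons.append((start, end, attrs.get("exon_number")))
        elif feature in ("start_codon", "stop_codon"):
            codons.append((feature, start, end, attrs.get("exon_number")))
    report = []
    for feature, c_start, c_end, listed in codons:
        # A codon can span a splice junction, so collect every overlapping exon.
        hits = [num for e_start, e_end, num in exons
                if e_start <= c_end and c_start <= e_end]
        report.append((feature, listed, hits))
    return report


# Hypothetical records for illustration only
demo_lines = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "GeneA"; transcript_id "TxA"; exon_number "1";',
    'chr1\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "GeneA"; transcript_id "TxA"; exon_number "2";',
    'chr1\tsrc\tstop_codon\t398\t400\t.\t+\t0\tgene_id "GeneA"; transcript_id "TxA"; exon_number "1";',
]
print(check_codon_exons(demo_lines, "TxA"))
```

Any record where the listed exon number is not among the overlapping exons would be a case like the one in my attachment.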


I also noticed that there aren’t any values for the “score” attribute; I only have “.” in that column instead. Is this normal as well?
Thanks for the help,

Christa Blenck

140420_GTF_codonexample.rtf

Jonathan Casper

Apr 21, 2015, 18:06:10
to Christa-Lynn Blenck, gen...@soe.ucsc.edu

Hello Christa,

Thank you for your question about our genePredToGtf utility. Regarding the first part of your question, which concerns the exon numbers in start and stop codon records, we are still formulating a response. As for the score values, the GTF 2 format specification linked from http://genome.ucsc.edu/FAQ/FAQformat.html#format4 states that the score field is purely optional and may be replaced with a dot ("."). genePredToGtf, along with the UCSC Table Browser, always fills in the score field with a dot.

Note that when used, the score parameter is intended to reflect the level of confidence in the feature's existence and location.
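For scripts that read the GTF downstream, a dot in the score column can simply be treated as a missing value. Here is a minimal sketch, assuming Python and a standard nine-column tab-separated GTF line. The function name `parse_score` and the example record are hypothetical and are not part of any UCSC tool.

```python
from typing import Optional


def parse_score(gtf_line: str) -> Optional[float]:
    """Return the score (6th column) of a GTF line, or None when it is '.'."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    score = fields[5]
    return None if score == "." else float(score)


# Hypothetical record for illustration only, not taken from real rn6 output
example = "\t".join([
    "chr1", "src", "exon", "100", "200", ".", "+", ".",
    'gene_id "GeneA"; transcript_id "TxA";',
])
print(parse_score(example))  # None
```

Returning `None` rather than 0 keeps "no score given" distinct from an actual score of zero.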

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu or genome...@soe.ucsc.edu. Questions sent to those addresses will be archived in publicly-accessible forums for the benefit of other users. If your question contains sensitive data, you may send it instead to genom...@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Group


