Hi Phalchandra,
Thank you for your question about obtaining a valid GTF file from
the knownGene table. To get a GTF file in which the gene_id and
transcript_id fields differ, you will need to install MySQL and query
our public MySQL server for a genePred that carries the gene symbol
alongside each transcript name. After installing MySQL, the following
command produces a genePred that you can use with genePredToGtf:
$ mysql --host=genome-mysql.soe.ucsc.edu --user=genome -Ne "select a.name, a.chrom, a.strand, a.txStart, a.txEnd,\
a.cdsStart, a.cdsEnd, a.exonCount, a.exonStarts, a.exonEnds, 0 as score, b.geneSymbol from knownGene a join \
kgXref b on a.name=b.kgID" hg19 > hg19.genePred
uc001aaa.3 chr1 + 11873 14409 11873 11873 3 11873,12612,13220, 12227,12721,14409, 0 DDX11L1
uc010nxr.1 chr1 + 11873 14409 11873 11873 3 11873,12645,13220, 12227,12697,14409, 0 DDX11L1
uc010nxq.1 chr1 + 11873 14409 12189 13639 3 11873,12594,13402, 12227,12721,14409, 0 DDX11L1
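Before converting, it can be worth confirming that each row is well formed. The following is only a minimal sketch: it assumes the 12-column layout produced by the query above, that no field contains embedded whitespace, and it uses a hypothetical file name (sample.genePred) filled with the three rows shown here. It counts rows whose exonCount disagrees with the number of entries in exonStarts or exonEnds.

```shell
# Three genePred rows copied from the output above (whitespace-separated).
# sample.genePred is a hypothetical name used only for this illustration.
cat > sample.genePred <<'EOF'
uc001aaa.3 chr1 + 11873 14409 11873 11873 3 11873,12612,13220, 12227,12721,14409, 0 DDX11L1
uc010nxr.1 chr1 + 11873 14409 11873 11873 3 11873,12645,13220, 12227,12697,14409, 0 DDX11L1
uc010nxq.1 chr1 + 11873 14409 12189 13639 3 11873,12594,13402, 12227,12721,14409, 0 DDX11L1
EOF

# Count rows that do not have 12 columns, or whose exonCount (column 8)
# disagrees with the number of entries in exonStarts (9) or exonEnds (10).
bad=$(awk '{
  s = $9;  sub(/,$/, "", s)   # drop the trailing comma before splitting
  e = $10; sub(/,$/, "", e)
  ns = split(s, a, ",")
  ne = split(e, b, ",")
  if (NF != 12 || ns != $8 || ne != $8) n++
} END { print n + 0 }' sample.genePred)
echo "malformed rows: $bad"
```

To run the same check on the real file, replace sample.genePred with hg19.genePred.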
$ genePredToGtf file hg19.genePred hg19.knownGene.gtf
chr1 hg19.genePred transcript 11874 14409 . + . gene_id "DDX11L1"; transcript_id "uc001aaa.3"; gene_name "DDX11L1";
chr1 hg19.genePred exon 11874 12227 . + . gene_id "DDX11L1"; transcript_id "uc001aaa.3"; exon_number "1"; exon_id "uc001aaa.3.1"; gene_name "DDX11L1";
chr1 hg19.genePred exon 12613 12721 . + . gene_id "DDX11L1"; transcript_id "uc001aaa.3"; exon_number "2"; exon_id "uc001aaa.3.2"; gene_name "DDX11L1";
chr1 hg19.genePred exon 13221 14409 . + . gene_id "DDX11L1"; transcript_id "uc001aaa.3"; exon_number "3"; exon_id "uc001aaa.3.3"; gene_name "DDX11L1";
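If you want to confirm that gene_id and transcript_id really differ in the resulting GTF, something along these lines may help. It is a rough sketch rather than an official tool: sample.gtf is a hypothetical file name holding two of the lines shown above, and the awk program simply pulls the two quoted attribute values out of each line and counts the lines where they are identical.

```shell
# Two GTF lines copied from the output above; sample.gtf is a hypothetical
# name used only for this illustration.
cat > sample.gtf <<'EOF'
chr1 hg19.genePred transcript 11874 14409 . + . gene_id "DDX11L1"; transcript_id "uc001aaa.3"; gene_name "DDX11L1";
chr1 hg19.genePred exon 11874 12227 . + . gene_id "DDX11L1"; transcript_id "uc001aaa.3"; exon_number "1"; exon_id "uc001aaa.3.1"; gene_name "DDX11L1";
EOF

# Count lines where the gene_id value equals the transcript_id value.
same=$(awk '{
  g = ""; t = ""
  # The offsets skip the attribute name, the space and the opening quote.
  if (match($0, /gene_id "[^"]*"/))       g = substr($0, RSTART + 9,  RLENGTH - 10)
  if (match($0, /transcript_id "[^"]*"/)) t = substr($0, RSTART + 15, RLENGTH - 16)
  if (g == t) n++
} END { print n + 0 }' sample.gtf)
echo "lines with identical gene_id and transcript_id: $same"
```

Pointing the awk command at hg19.knownGene.gtf instead of sample.gtf applies the same check to the full file.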
Please let us know if you have any further questions!
Thank you again for your inquiry and for using the UCSC Genome Browser. If
you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a
publicly-accessible forum. If your question includes sensitive data,
you may send it instead to genom...@soe.ucsc.edu.
Christopher Lee
UCSC Genomics Institute