Table Browser RefSeq Genes refGene GTF Output


Andy Rampersaud

Jul 15, 2013, 1:28:16 PM
to gen...@soe.ucsc.edu
Hello,

I was downloading the RefSeq Genes refGene table (mouse mm9 assembly) in GTF format from the Table Browser. I noticed that the name2 column of the refGene table is not carried over to the GTF output. Instead, the GTF output has transcript_id and gene_id both set to the same value (e.g., NM_001277487). I was hoping I could get some help with producing a GTF file for this refGene table in which gene_id holds the gene name.
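The symptom described above can be checked with a short script. This is a minimal sketch in Python, assuming the usual GTF layout of nine tab-separated columns with `key "value";` pairs in the last column. The sample line is hypothetical: its coordinates and source name are made up for illustration, and only the ID comes from this message.

```python
import re

# Matches key "value"; pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(gtf_line):
    """Return the attribute column of one GTF line as a dict."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GTF columns")
    return dict(ATTR_RE.findall(fields[8]))


def gene_id_is_transcript_id(gtf_line):
    """True when gene_id merely repeats transcript_id on this line."""
    attrs = parse_attributes(gtf_line)
    return attrs.get("gene_id") == attrs.get("transcript_id")


# Hypothetical example line (coordinates and source name are made up).
sample = "\t".join([
    "chr1", "mm9_refGene", "exon", "1000", "2000", "0.000000", "+", ".",
    'gene_id "NM_001277487"; transcript_id "NM_001277487";',
])
print(gene_id_is_transcript_id(sample))
```

Running the check over every line of a downloaded file and counting the `True` results would show whether any gene names made it into `gene_id`.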

Thanks,
Andy


--
Andy Rampersaud
Graduate Student, Bioinformatics
Waxman Lab, Boston University

Jonathan Casper

Jul 16, 2013, 7:21:17 PM
to Andy Rampersaud, gen...@soe.ucsc.edu

Hello Andy,

Thank you for your question about GTF output from the Table Browser. Unfortunately, due to the way the Table Browser processes data, it's not possible to get the gene name (the name2 column) included as the gene_id in its GTF output. There is, however, a different way to get a GTF file with those names included. If you set up the genePredToGtf utility on your computer, you can generate your own (more complete) GTF files by querying our public MySQL server directly. For more information on this option, see the following wiki page: http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format. Please note that the wiki page gives directions for obtaining the knownGene table; for your request you should replace all instances of "knownGene" with "refGene" in the commands on that page.
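For orientation, the approach described above might look roughly like the pipeline that the following Python sketch assembles. It only builds and prints the command line; it does not contact the server. Several details are assumptions not stated in this thread and should be checked against the wiki page: the host name `genome-mysql.soe.ucsc.edu`, the `genome` user, the `cut -f2-` step (intended to drop the leading `bin` column of refGene), and the `genePredToGtf file stdin` form. The function name and output file name are hypothetical.

```python
import shlex


def build_refgene_gtf_pipeline(db="mm9", table="refGene",
                               out_path="refGene.gtf",
                               host="genome-mysql.soe.ucsc.edu",  # assumed host
                               user="genome"):                    # assumed user
    """Build a shell pipeline string: mysql query | cut | genePredToGtf."""
    query = f"select * from {table};"
    # -N suppresses the header row; -A skips table-name autocompletion setup.
    mysql_cmd = ["mysql", f"--user={user}", f"--host={host}",
                 "-A", "-N", "-e", query, db]
    # Assumption: the first column (bin) is not part of the genePred format.
    strip_bin = ["cut", "-f2-"]
    # Assumption: "file stdin" makes genePredToGtf read genePred rows from stdin.
    to_gtf = ["genePredToGtf", "file", "stdin", out_path]
    return " | ".join(shlex.join(cmd) for cmd in (mysql_cmd, strip_bin, to_gtf))


print(build_refgene_gtf_pipeline())
```

Printing the command instead of running it keeps the sketch safe to try; the printed line can be pasted into a shell once the host, user, and utility options have been confirmed.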

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
--
Jonathan Casper
UCSC Genome Bioinformatics Staff




Andy Rampersaud

Jul 22, 2013, 9:50:54 AM
to gen...@soe.ucsc.edu
Hi Jonathan,

Thank you for your helpful solution. I was successful in getting the GTF file I was seeking. Minor question: Are the Kent command-line utilities available for 32-bit Linux operating systems? I have access to a 64-bit server and was able to get the tool working there, but I was just curious.

2nd Question:

I was hoping to find the UCSC source for a GTF file I obtained from a fellow student. The path to the GTF gene file:

genome/mm9bowtie2/Mus_musculus/UCSC/mm9/Annotation/Archives/archive-2012-03-09-05-07-56/Genes/genes.gtf

I have also attached a file listing of this directory (genome_file_list.txt). 

I would like to know where and how one would download this folder from UCSC. I basically want to make sure I'm using the most up-to-date genes.gtf file.

Thanks,
Andy
 

Andy Rampersaud

Jul 22, 2013, 9:52:30 AM
to gen...@soe.ucsc.edu
(Attachment added)

genome_file_list.txt

Jonathan Casper

Jul 23, 2013, 7:17:37 PM
to Andy Rampersaud, gen...@soe.ucsc.edu

Hello Andy,

We make binaries of the command line utilities available for several systems, but 32-bit Linux is not one of them. If you go to our downloads page at http://hgdownload.soe.ucsc.edu and scroll down to the "Source Downloads" section, you'll find a link to instructions for downloading the source code for these utilities and building them on your own system.

Unfortunately, I can't tell where your colleague obtained that GTF file. The file listing you provided doesn't look like anything on our downloads server. In any case, the most up-to-date version of a genes list in GTF format you're likely to find is the file you just generated with genePredToGtf.

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Staff
