Hi Jackie,
The file you are getting from our downloads page is in genePred format.
You can see the format of the table by hitting the "describe table
schema" button in the Table Browser when you have the refGene table
selected.
If you choose the output format "all fields from selected table" in the
Table Browser, you should get results in genePred format, just like you
see it on the downloads page. (Note however that we update the table
available via the Table Browser daily, while we update the file
available via the downloads page only on the weekends, so it is possible
to see a small number of differences between the two files.)
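
If you would like to check how many rows actually differ between the two genePred downloads, a small script along these lines may help. It is only a sketch: the function name and the file names in the usage note are hypothetical, and it assumes that header or comment lines begin with "#".

```python
def diff_rows(lines_a, lines_b):
    """Return (rows only in A, rows only in B) for two genePred-style dumps."""
    def rows(lines):
        # Drop blank lines and header/comment lines (assumed to start with "#").
        return {
            line.rstrip("\n")
            for line in lines
            if line.strip() and not line.startswith("#")
        }

    a, b = rows(lines_a), rows(lines_b)
    return sorted(a - b), sorted(b - a)


# Tiny made-up example: one row is shared, one row differs in each file.
only_a, only_b = diff_rows(
    ["#header\n", "row1\tchr1\t100\n", "row2\tchr2\t200\n"],
    ["row1\tchr1\t100\n", "row3\tchr3\t300\n"],
)
```

With real files you would pass two open file handles instead, for example open("refGene_downloads.txt") and open("refGene_tablebrowser.txt") (hypothetical names).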
It sounds like you were using the BED output format option in the Table
Browser. Many table types can be converted to BED format in the Table
Browser, including genePred. BED format is described here:
http://genome.ucsc.edu/FAQ/FAQformat.html#format1
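The "blockSizes" and "blockStarts" fields in BED format look similar to
the "exonStarts" and "exonEnds" fields in genePred format, but the
fields in BED format are not chromosomal positions, as they are in
genePred format. In BED, "blockSizes" holds the length of each block,
and "blockStarts" holds each block's start relative to chromStart.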
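
To make the relationship concrete, here is a minimal Python sketch of how the BED block fields could be derived from genePred exon coordinates. It assumes that the BED chromStart equals the genePred txStart, which is what you would expect for whole-gene output. The function name and the example coordinates are made up for illustration; this is not the Table Browser's own code.

```python
def genepred_to_bed_blocks(tx_start, exon_starts, exon_ends):
    """Derive BED blockSizes and blockStarts from genePred exon fields.

    exon_starts and exon_ends are the comma-separated strings found in a
    genePred row (a trailing comma is allowed).
    """
    starts = [int(s) for s in exon_starts.rstrip(",").split(",")]
    ends = [int(e) for e in exon_ends.rstrip(",").split(",")]
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds must have the same length")

    # blockSizes: length of each exon.
    block_sizes = [end - start for start, end in zip(starts, ends)]
    # blockStarts: exon start relative to chromStart (assumed == txStart).
    block_starts = [start - tx_start for start in starts]
    return block_sizes, block_starts


# Hypothetical three-exon transcript starting at position 1000.
sizes, rel_starts = genepred_to_bed_blocks(
    1000, "1000,1500,2000,", "1200,1700,2300,"
)
```

In this example the exon starts 1000, 1500 and 2000 are chromosomal positions, while the derived block starts are offsets from 1000, so the same gene yields different-looking numbers in the two formats.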
I hope this information explains what you are seeing in the various
downloads of the refGene table. If you have further questions, please
contact us again at gen...@soe.ucsc.edu.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
On 5/15/13 10:33 AM, Jackie Jia Zhou wrote:
> Hi,
>
> I am trying to download gene annotation files for mm9 from
> genome.ucsc.edu <http://genome.ucsc.edu>.
>
> I thought there might be two ways of getting the annotation files: (1) go
> to the 'downloads' page for mm9, and download 'refGene.txt' from there;
> (2) go to the 'Table' page, select the correct 'genome' and 'assembly',
> select 'Gene and Gene prediction Tracks' --> 'RefSeq Genes' -->
> 'refGene', then click on 'get output', and then select 'whole
> Gene' to get the .bed file for genes.
>
> However, the files I get from these two different ways are very
> different. The total number of entries is the same, but the starting
> and ending coordinates of each entry are so different in these files.
>
> I wonder which file I can trust more? And why is there such a difference
> in the starting and ending coordinates?
>
> Thank you,
>
> Jackie Zhou
> /PhD Candidate /
> /Division of Biology & Biological Medical Sciences/
> /Washington University in St. Louis, School of Medicine/