genome browser inquiry


Kathleen Larson

Aug 18, 2015, 2:50:04 PM
to gen...@soe.ucsc.edu
Hello,

I am an undergraduate student at the University of Rochester conducting a bioinformatics project in a lab at Beth Israel Deaconess Medical Center regarding genes related to autism.

For my project, I need a list of the genes on every chromosome, along with the range of base pairs that each gene spans.  Do you have this available for download in your database?  I can only find full assembly sequences, whereas I need a list of the actual genes.  I am using hg17, hg18, and hg19.

Thank you and I look forward to hearing from you,

Kathleen Larson
University of Rochester '17
Biomedical Engineering

Steve Heitner

Aug 21, 2015, 4:04:54 PM
to Kathleen Larson, gen...@soe.ucsc.edu

Hello, Kathleen.

You can obtain the information you are looking for by downloading the contents of one of our gene tracks.  The RefSeq Genes table already contains the gene symbol (in the “name2” field), so this would probably be your best bet.  To obtain this table in its entirety, download refGene.txt.gz from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ (hg17 and hg18 have similar files in their respective directories on the download server).  Note that many genes have multiple transcript variants, so you will see multiple entries for many of the gene symbols.

If you’re not certain what the items in the table represent, you can view the table schema by going to the track description page (in this case, for hg19 RefSeq Genes, http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=refGene) and clicking the “View table schema” link at the top of the page.  This will provide you with the schema along with some sample lines from the table.
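As a rough illustration of turning the downloaded table into one range per gene, here is a minimal Python sketch.  It assumes refGene.txt.gz has already been saved locally, that the file is tab-separated with no header line, and that the columns sit at the positions used below (chrom, strand, txStart, txEnd, and name2 at 0-based indexes 2, 3, 4, 5, and 12).  Those positions are an assumption and should be checked against the table schema page for each assembly, since older assemblies may lay the table out differently.  The function names and the file path are made up for this example.

```python
import gzip

# Assumed 0-based column positions in the refGene.txt.gz dump.
# Verify these against "View table schema" for the assembly in use.
CHROM, STRAND, TX_START, TX_END, NAME2 = 2, 3, 4, 5, 12


def gene_ranges(lines):
    """Collapse transcript rows into one (start, end) span per
    (gene symbol, chromosome, strand) combination."""
    spans = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= NAME2:
            continue  # skip blank or truncated rows
        key = (fields[NAME2], fields[CHROM], fields[STRAND])
        start, end = int(fields[TX_START]), int(fields[TX_END])
        if key in spans:
            old_start, old_end = spans[key]
            # Widen the span so it covers every transcript variant seen so far.
            spans[key] = (min(old_start, start), max(old_end, end))
        else:
            spans[key] = (start, end)
    return spans


def load_gene_ranges(path="refGene.txt.gz"):
    """Read a locally downloaded copy of the table (the path is a placeholder)."""
    with gzip.open(path, "rt") as handle:
        return gene_ranges(handle)
```

Calling load_gene_ranges() returns a dictionary that can be sorted and written out as a gene list.  Grouping by chromosome and strand as well as by symbol keeps a symbol that appears in more than one place from being merged into a single oversized range.  Start coordinates in our database tables are 0-based, so add 1 to the start if you need 1-based positions.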

If you ever need to download only part of a table or cross-reference the contents of multiple tables, you can also use our Data Integrator.  I provided a Data Integrator example to another user just yesterday: https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/oW_2v8K0a00/GAvxCxnLFAAJ

Please contact us again at gen...@soe.ucsc.edu if you have any further questions. 
All messages sent to that address are archived on a publicly-accessible Google Groups forum.  If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

---
Steve Heitner
UCSC Genome Bioinformatics Group
