hg19 refseq genes standard bed file


Sarah Halawa

May 24, 2017, 8:33:54 PM
to gen...@soe.ucsc.edu
Dear Sir/Madam,
I followed these steps to get the hg19 annotation file:
  • I went to the UCSC Genome Browser
  • Selected "Genes and Gene prediction tracks" from the "group" drop-down menu.
  • Selected "RefSeq Genes" from the "track" drop-down menu.
  • Selected the "refGene" table
  • Selected "BED - browser extensible data" for the "output format"
  • Clicked "get output" and on the following page clicked "get BED" without changing any options.
  • Finally, I saved the output as a text file

Unfortunately, it didn't give the expected result.

Is there a standard refseq.hg19.bed.txt file?

Thank you so much!

All the best,

Sarah


Cath Tyner

May 26, 2017, 8:14:31 PM
to Sarah Halawa, UCSC Genome Browser Public Help Forum
Hi Sarah,

Thank you for contacting the UCSC Genome Browser support team and inquiring about finding the nearest genes based on a reference point, including the distance from the reference point to the transcript, as well as the strand.

Please see this recently revised wiki page:
http://genomewiki.ucsc.edu/index.php/Finding_nearby_genes

If you are comfortable with scripting, you can copy the script from this section and save it as your own:
http://genomewiki.ucsc.edu/index.php/Finding_nearby_genes#Script_for_refGene_on_hg19
This script uses your example as the reference point (chr1 991973 991973)
and finds the 10 nearest transcripts in refGene, hg19, that are upstream (and the 10 nearest downstream).

The output includes the transcript name and coords, the gene name/alias, the strand, and the distance (in bp) from the reference point to each transcript.
Please note that the "refGene/hg19" section includes a URL to a Genome Browser session that visualizes the example output.
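
For readers who already have a local BED file and want to see the idea behind that kind of search, here is a minimal shell sketch. It is not the wiki's script: the function name nearest_transcripts, its arguments, and the "left"/"right" labels (lower/higher genomic coordinates, ignoring strand) are assumptions made for illustration. It also skips transcripts that overlap the reference point and does not correct for 0-based versus 1-based coordinates.

```shell
# Hypothetical helper (not the wiki script):
#   nearest_transcripts CHROM POS N BEDFILE
# Prints up to N transcripts on each side of POS as tab-separated columns:
#   name, chrom, start, end, strand, side, distance
nearest_transcripts() {
  chrom=$1; pos=$2; n=$3; bed=$4
  for side in left right; do
    awk -v c="$chrom" -v p="$pos" -v s="$side" 'BEGIN { FS = OFS = "\t" }
      $1 != c { next }                       # other chromosomes are ignored
      s == "left"  && $3 < p { print $4, $1, $2, $3, $6, s, p - $3 }
      s == "right" && $2 > p { print $4, $1, $2, $3, $6, s, $2 - p }
    ' "$bed" | sort -t "$(printf '\t')" -k7,7n | head -n "$n"
  done
}
```

For example, nearest_transcripts chr1 991973 10 hg19.refGene.bed would list up to 10 transcripts on each side of the position; the file name here is only a placeholder.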

Please also note the "Alternatives" section, which points to other options to accomplish your goal.
http://genomewiki.ucsc.edu/index.php/Finding_nearby_genes#Alternatives

Please respond to this list if you have further questions!

Thank you for contacting the UCSC Genome Browser support team.
Please send new and follow-up questions to one of our UCSC Genome Browser mailing lists below:

  * Post to the Public Help Forum: Email gen...@soe.ucsc.edu or search the Public Archives
  * Post to the Mirror Help Forum: Email genome...@soe.ucsc.edu or search the Mirror Archives
  * Confidential/private help: Email genom...@soe.ucsc.edu

UCSC Genome Browser Announcements List (email alerts for new data & software):
  * Subscribe: Email genome-announce+subscribe@soe.ucsc.edu 
  * Unsubscribe: Email genome-announce+unsubscribe@soe.ucsc.edu

Join us on Social Media! Facebook, Twitter, WordPress Blog, YouTube

Enjoy,
Cath
. . .
Cath Tyner
UCSC Genome Browser, Software QA & User Support
UC Santa Cruz Genomics Institute




Sarah Halawa

Jun 7, 2017, 11:38:58 AM
to Cath Tyner, UCSC Genome Browser Public Help Forum
Hey Cathy,
Hope you're doing great!
I've been trying to follow the steps you provided me with, but they're too complicated for me.
Is there an easier way?

On another note:
Is there a way to use the UCSC Table Browser to download an hg19 BED file that contains official gene symbols? For example:
chr20    33506636    33563217    MYH7    0    +    33513606    33562869    0    35    350,71,123,289,57,83,166,107,252,97,102,159,179,183,145,147,231,194,128,170,168,135,144,134,139,349,122,121,122,2616,114,156,102,57,612,    0,2902,4580,6867,8177,10345,10641,11568,11906,14492,16652,17275,17976,18472,20104,21051,22473,23833,24248,24578,35240,35815,38033,39128,41183,42325,43201,44655,46449,46995,52344,52641,53549,54604,55969,

Thank you very much,
All the best,
Sarah


Christopher Lee

Jun 13, 2017, 1:00:25 PM
to Sarah Halawa, Cath Tyner, UCSC Genome Browser Public Help Forum
Hi Sarah,

Thank you for your questions about obtaining a bed file of transcripts
nearest to a particular position, and obtaining a bed file of
transcripts with an official gene symbol in the 4th column.

As for your second question, there is no way to get a bed file with
the gene symbol as the 4th field from the Table Browser. However, you
can get that information from our public MySQL server. You will need to
install MySQL and have access to a command line, where you can run the
following command:

$ mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -Ne \
    "select chrom, txStart, txEnd, name2, score, strand, cdsStart, cdsEnd, \
    0 as itemRgb, exonCount, exonStarts, exonEnds \
    from refGene" hg19 > hg19.refGeneTranscripts.bed

This will result in a bed file of ALL hg19 refGene transcripts with
the name2 field (the gene symbol) as the 4th column.
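
One detail to check before loading this file into other tools: in refGene, exonStarts and exonEnds are absolute chromosome coordinates, while columns 11 and 12 of a 12-column BED file are expected to hold block sizes and block starts relative to column 2 (as in the example line earlier in this thread). The sketch below shows one way to convert them with awk. The function name refgene_to_bed12 and the output file name are made up for illustration, and it assumes tab-separated input with the 12 columns in the order selected by the query above.

```shell
# Hypothetical helper: rewrite columns 11-12 of the query output from
# absolute exonStarts/exonEnds into BED12 blockSizes/blockStarts.
# Reads the named files, or standard input if no file is given.
refgene_to_bed12() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    split($11, starts, ",")      # absolute exon starts
    split($12, ends, ",")        # absolute exon ends
    sizes = ""; rel = ""
    for (i = 1; i <= $10; i++) { # $10 is exonCount
      sizes = sprintf("%s%d,", sizes, ends[i] - starts[i])
      rel   = sprintf("%s%d,", rel, starts[i] - $2)
    }
    $11 = sizes                  # blockSizes
    $12 = rel                    # blockStarts, relative to column 2
    print
  }' "$@"
}
```

For example: refgene_to_bed12 hg19.refGeneTranscripts.bed > hg19.refGene.bed12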

As for finding the transcript, distance, etc., nearest to a single
position, this also cannot be accomplished via the Table Browser, but
it can be accomplished via the script from the wiki page in Cath's
previous answer. Where are you struggling in trying to run the script?
Perhaps we can help you get it going so you can keep using it in the
future.

Thank you again for your inquiry and using the UCSC Genome Browser. If
you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a
publicly-accessible forum. If your question includes sensitive data,
you may send it instead to genom...@soe.ucsc.edu.

Christopher Lee
UCSC Genomics Institute

Sarah Halawa

Jun 19, 2017, 11:41:31 AM
to Christopher Lee, Cath Tyner, UCSC Genome Browser Public Help Forum
Hey Christopher,
Can't thank you enough! This was of tremendous help:)
Thank you very much
Best wishes,
Sarah