Getting RefSeq GeneNames for a BED file

vijai2007

Aug 22, 2012, 7:59:30 PM
to gen...@soe.ucsc.edu
Hi, 
I have a bed file in this format.
1       6520057 6520208 +       target_783
1       6521492 6521824 +       target_784
1       6522052 6522236 +       target_785
1       6522687 6522728 +       target_786
1       6522921 6523032 +       target_787
1       6523123 6523209 +       target_788
1       6524433 6524515 +       target_789
1       6524610 6524781 +       target_790
1       6525146 6525284 +       target_791
1       6525498 6525622 +       target_792
1       6526127 6526169 +       target_793
1       6527621 6527634 +       target_794
1       6527883 6528648 +       target_795
1       6529100 6529303 +       target_796
1       6529393 6529512 +       target_797
1       6529602 6529738 +       target_798
1       6530294 6530417 +       target_799
1       6530564 6530705 +       target_800

I would like to substitute the target_names for their actual RefSeq names.
How can I do this using tools available at genome.ucsc.edu?
Thanks
~GeneHunter

Pauline Fujita

Aug 24, 2012, 2:49:36 PM
to vijai2007, gen...@soe.ucsc.edu
Hello GeneHunter,

The easiest way to do this would be to make a custom track of your
regions and then intersect this with the RefSeq track using our Table
Browser tool (http://www.genome.ucsc.edu/cgi-bin/hgTables). Before you
can make a custom track you will need to correct the format of your
input:

1 6520057 6520208

should be:

chr1 6520057 6520208

Also note that item names belong in the fourth field, and the strand
belongs in the sixth field (the fifth field is the score). For more
detail on BED format please see this FAQ:

http://www.genome.ucsc.edu/FAQ/FAQformat.html#format1
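As a rough illustration of the format fix described above, here is a minimal Python sketch. It assumes every input line has exactly five whitespace-separated fields in the order shown in the question (chromosome, start, end, strand, name). The function name to_bed6 is made up for this example, and the score of 0 is only a placeholder because the original data has no score. Other chromosome naming differences are not handled.

```python
def to_bed6(line):
    """Convert 'chrom start end strand name' into a six-field BED line."""
    chrom, start, end, strand, name = line.split()
    # UCSC chromosome names carry a "chr" prefix.
    if not chrom.startswith("chr"):
        chrom = "chr" + chrom
    # BED order: chrom, start, end, name, score, strand.
    # Assumption: the input has no score, so 0 is used as a placeholder.
    return "\t".join([chrom, start, end, name, "0", strand])


# Two lines copied from the question, used here as sample input.
sample = [
    "1       6520057 6520208 +       target_783",
    "1       6521492 6521824 +       target_784",
]

for row in sample:
    print(to_bed6(row))
```

The printed lines can be pasted into the custom track box as they are.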

To input your custom track, click the "add custom tracks" button
below the main browser display or select "custom tracks" from the "My
Data" tab. Paste your input in the main box, click submit, then
click "go to table browser".

In the Table Browser select:

group: Genes and Gene Prediction Tracks
track: RefSeq Genes
table: refGene

then click the "create" button next to "intersection". In the intersection menu select:

group: Custom Tracks
track: (your custom track)

then select a criterion for how closely you expect your coordinates to
overlap the corresponding RefSeq items and hit submit. In the main
menu select BED output format and then click "get output".

This method will retrieve the RefSeq records that overlap your
regions and will output the coordinates of the RefSeq items, not your
input coordinates. If you want to retain your input coordinates, you
may be able to use the tools over at Galaxy (http://g2.bx.psu.edu/).
In this case click "Send output to Galaxy" before hitting
"get output" to see your data on the Galaxy site. The "Join, Subtract
and Group" or "Operate on Genomic Intervals" menus might be of use for
this. For questions regarding Galaxy you'll have to consult their
support section and the mailing lists referenced therein:

http://wiki.g2.bx.psu.edu/Support
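If you would rather do the last step locally, the idea of keeping your input coordinates while borrowing the RefSeq names can be sketched in a few lines of Python. This is only an illustration, not a UCSC or Galaxy tool: it assumes both inputs are BED lines with at least four fields, that any overlap at all counts as a match, and that a simple pairwise comparison is fast enough for your file sizes. The names parse_bed and relabel, and the accession NM_EXAMPLE1, are made up for the example.

```python
def parse_bed(lines):
    """Parse BED lines into (chrom, start, end, name, remaining_fields)."""
    rows = []
    for line in lines:
        # Skip blank lines and custom track header/comment lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        f = line.split()
        rows.append((f[0], int(f[1]), int(f[2]), f[3], f[4:]))
    return rows


def relabel(targets, refseq):
    """Keep each target's coordinates; swap its name for overlapping RefSeq names."""
    out = []
    for chrom, start, end, name, rest in targets:
        # BED intervals are 0-based and half-open, so two intervals
        # overlap when each one starts before the other ends.
        hits = sorted({r_name for r_chrom, r_start, r_end, r_name, _ in refseq
                       if r_chrom == chrom and r_start < end and start < r_end})
        # Assumption: targets with no overlap keep their original name.
        label = ",".join(hits) if hits else name
        out.append("\t".join([chrom, str(start), str(end), label] + rest))
    return out


# Hypothetical data: the RefSeq line below is invented for illustration.
refseq = parse_bed(["chr1\t6500000\t6600000\tNM_EXAMPLE1\t0\t+"])
targets = parse_bed(["chr1\t6520057\t6520208\ttarget_783\t0\t+",
                     "chr2\t100\t200\ttarget_x\t0\t+"])

for row in relabel(targets, refseq):
    print(row)
```

Here the RefSeq lines would come from the Table Browser BED output and the target lines from your corrected custom track. Note that whole-gene BED coordinates include introns, so a region inside an intron would still pick up the gene's name.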

Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu