Dear Ron,
Thank you for using the UCSC Genome Browser and for your question about pulling SNPs given genomic coordinates.
In that mailing list question, the user is doing the reverse, searching for coordinates given a long list of rs#### identifiers. How many coordinates are you looking to query? You could do the same thing with the Table Browser's "define regions" button; however, it is limited to 1,000 regions per query.
To use the Table Browser, you would select the "Variation and Repeats" group to find the "All SNPs(137)" track and select the "snp137" table. Then click the "define regions" button and add locations by either clicking the "Choose File" button or pasting coordinates in the box. You will have to widen any region that does not span at least one base; that is, chr6:31324144-31324144 will have to become chr6:31324144-31324145. Then you would change the output format to "selected fields from selected tables" and click "get output". On the next page you could select just the fields you would like, such as "chrom", "chromStart", "chromEnd" and "name", and click "get output" to see results such as:
chr6 31324143 31324144 rs4997052
chr6 31324144 31324145 rs9266150
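If you have many single-base coordinates to adjust, the widening step described above could be scripted rather than done by hand. The following is only a rough sketch: the function name widen_regions is made up for illustration, and it assumes one region per line in chrom:start-end form.

```shell
# Hypothetical helper: make sure every region's end is greater than its start,
# e.g. chr6:31324144-31324144 becomes chr6:31324144-31324145.
# Assumes one region per line, written as chrom:start-end.
widen_regions() {
  awk -F'[:-]' '{
    start = $2; end = $3
    if (end <= start) end = start + 1   # widen zero-length regions by one base
    printf "%s:%d-%d\n", $1, start, end
  }' "$@"
}

# Example: the first region is widened, the second is left alone.
printf 'chr6:31324144-31324144\nchr6:31324144-31324145\n' | widen_regions
```

Because "define regions" accepts at most 1,000 regions per query, a longer list could afterwards be broken into chunks with something like split -l 1000.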
Another option, if you have many thousands of coordinates, would be to follow the suggestion of downloading the snp137.txt.gz file and then running a command like:
zcat snp137.txt.gz | grep -Fwf myCoordinates.txt > mySnps.txt
where myCoordinates.txt is a list of **tab**-delimited coordinates like:
chr19 7143144 7143144
chr19 7143562 7143563
chr19 7143574 7143575
You would then have a line for each matching coordinate in the mySnps.txt file. You could further select out just the rs identifiers and coordinates with a command like:
awk '{print $2,$3,$4,$5}' mySnps.txt
To get output like:
chr19 7143574 7143575 rs191708249
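To try the pipeline before downloading the full table, you could run it on a toy file first. The sketch below is only an illustration: the toy table holds a single row with just the first five columns (bin, chrom, chromStart, chromEnd, name), the bin value 0 is a placeholder, and the to_tabs helper is a made-up name for converting whitespace-separated coordinates into the tab-delimited form that grep -F needs. It uses gzip -dc in place of zcat, which should behave the same way here.

```shell
# Work in a scratch directory so nothing is left in the current one.
tmp_demo=$(mktemp -d)

# Toy stand-in for snp137.txt.gz (first five columns only; bin 0 is a placeholder).
printf '0\tchr19\t7143574\t7143575\trs191708249\n' | gzip -c > "$tmp_demo/toy_snp137.txt.gz"

# Hypothetical helper: rewrite whitespace-separated fields as tab-delimited.
to_tabs() {
  awk -v OFS='\t' '{ $1 = $1; print }' "$@"
}

# Two query coordinates; only the second appears in the toy table.
printf 'chr19 7143144 7143144\nchr19 7143574 7143575\n' | to_tabs > "$tmp_demo/toy_coords.txt"

# Same grep and awk steps as above, applied to the toy files.
gzip -dc "$tmp_demo/toy_snp137.txt.gz" |
  grep -Fwf "$tmp_demo/toy_coords.txt" |
  awk '{print $2,$3,$4,$5}'
```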
This strategy will only work if your exact coordinates are listed in the snp137 table. For example, "chr6 31324144 31324144" will not find any match, whereas a shortened entry with only the first coordinate, "chr6 31324144", will find the match "chr6 31324144 31324145 rs9266150" yet miss "chr6 31324143 31324144 rs4997052".
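One possible way around the exact-match limitation is to compare fields in awk instead of matching whole strings with grep. The sketch below is an assumption on my part rather than an established recipe: the function name snps_touching is made up, the positions file is assumed to hold one tab-delimited "chrom position" pair per line, and a row is reported when the position equals either its chromStart or its chromEnd. Which of those two is the match you actually want depends on whether your coordinates are 0-based or 1-based, so please check a few results by hand.

```shell
# Hypothetical helper: print chrom, chromStart, chromEnd and name for every
# snp137-style row (read from stdin) whose chromStart or chromEnd equals a
# position listed in the file given as $1 (format: chrom<TAB>position).
snps_touching() {
  awk -F'\t' -v posfile="$1" '
    BEGIN {
      # Load the wanted positions into a lookup table keyed by "chrom<TAB>pos".
      while ((getline line < posfile) > 0) {
        split(line, f, "\t")
        want[f[1] "\t" f[2]] = 1
      }
      close(posfile)
    }
    (($2 "\t" $3) in want) || (($2 "\t" $4) in want) { print $2, $3, $4, $5 }
  '
}

# Toy run using the two rows quoted above (bin value 0 is a placeholder).
tmp_pos=$(mktemp -d)
printf 'chr6\t31324144\n' > "$tmp_pos/positions.txt"
printf '0\tchr6\t31324143\t31324144\trs4997052\n0\tchr6\t31324144\t31324145\trs9266150\n' |
  snps_touching "$tmp_pos/positions.txt"

# With the real table this might look like:
#   zcat snp137.txt.gz | snps_touching positions.txt > mySnps.txt
```

On the toy rows this reports both rs4997052 and rs9266150 for the single position chr6 31324144.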
Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to
gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to
genom...@soe.ucsc.edu.
All the best,
Brian Lee
UCSC Genome Bioinformatics Group